[jira] [Commented] (SPARK-6703) Provide a way to discover existing SparkContext's
[ https://issues.apache.org/jira/browse/SPARK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495777#comment-14495777 ]

Evan Chan commented on SPARK-6703:

I should note: Having the jobserver support generic apps that don't implement an interface is an interesting idea (well, more like an implicit Trait { def main(args: Array[String]) }, I suppose). The only way I could think of to have them share a context would be to have the job server load the job jars and start / call the main method. So what you describe might be useful.

We also do have users working with multiple contexts in the same JVM. However, we are working on support for one JVM per context.

Provide a way to discover existing SparkContext's

Key: SPARK-6703
URL: https://issues.apache.org/jira/browse/SPARK-6703
Project: Spark
Issue Type: New Feature
Components: Spark Core
Affects Versions: 1.3.0
Reporter: Patrick Wendell
Assignee: Ilya Ganelin
Priority: Critical

Right now it is difficult to write a Spark application in a way that can be run independently and also be composed with other Spark applications in an environment such as the JobServer, notebook servers, etc., where there is a shared SparkContext.

It would be nice to provide a rendez-vous point so that applications can learn whether a SparkContext already exists before creating one.

The most simple/surgical way I see to do this is to have an optional static SparkContext singleton that people can retrieve as follows:

{code}
val sc = SparkContext.getOrCreate(conf = new SparkConf())
{code}

And you could also have a setter where some outer framework/server can set it for use by multiple downstream applications.

A more advanced version of this would have some named registry or something, but since we only support a single SparkContext in one JVM at this point anyways, this seems sufficient and much simpler. Another advanced option would be to allow plugging in some other notion of configuration you'd pass when retrieving an existing context.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
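As a rough illustration of the "optional static singleton plus setter" idea in the issue description, here is a minimal Spark-free sketch. Conf, Context and SharedContext are hypothetical stand-ins invented for this example (they are not Spark classes), and the locking strategy is an assumption, not the design of the eventual patch.

```scala
// Hypothetical stand-ins for SparkConf / SparkContext so the sketch runs without Spark.
final case class Conf(appName: String)
final class Context(val conf: Conf)

// Hypothetical rendez-vous point: at most one shared context per JVM.
object SharedContext {
  // The currently registered context, if any.
  private var active: Option[Context] = None

  // Return the registered context if one exists; otherwise create and register one.
  // The supplied conf is only used when a new context has to be created.
  def getOrCreate(conf: Conf): Context = synchronized {
    active.getOrElse {
      val created = new Context(conf)
      active = Some(created)
      created
    }
  }

  // Setter for an outer framework/server that manages the context itself.
  def set(ctx: Context): Unit = synchronized {
    active = Some(ctx)
  }
}
```

An application written against getOrCreate behaves the same whether it runs standalone (it creates the context) or inside a server that has already called set (it reuses the server's context). Note that in this sketch a conf passed by a later caller is silently ignored once a context exists.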
[jira] [Commented] (SPARK-6703) Provide a way to discover existing SparkContext's
[ https://issues.apache.org/jira/browse/SPARK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492888#comment-14492888 ]

Patrick Wendell commented on SPARK-6703:

Hey [~ilganeli] - sure thing. I've pinged a couple of people to provide feedback on the design. Overall I think it won't be a complicated feature to implement. I've added you as the assignee. One note, if it gets very close to the 1.4 code freeze I may need to help take it across the finish line. But for now why don't you go ahead, I think we'll be fine.
[jira] [Commented] (SPARK-6703) Provide a way to discover existing SparkContext's
[ https://issues.apache.org/jira/browse/SPARK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492898#comment-14492898 ]

Patrick Wendell commented on SPARK-6703:

/cc [~velvia]
[jira] [Commented] (SPARK-6703) Provide a way to discover existing SparkContext's
[ https://issues.apache.org/jira/browse/SPARK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492909#comment-14492909 ]

Ilya Ganelin commented on SPARK-6703:

Patrick - what's the timeline for the 1.4 release? Just want to have a sense for it so I can schedule accordingly.

Thank you,
Ilya Ganelin
[jira] [Commented] (SPARK-6703) Provide a way to discover existing SparkContext's
[ https://issues.apache.org/jira/browse/SPARK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493179#comment-14493179 ]

Patrick Wendell commented on SPARK-6703:

Yes, ideally we get it into 1.4 - though I think the ultimate solution here could be a very small patch.
[jira] [Commented] (SPARK-6703) Provide a way to discover existing SparkContext's
[ https://issues.apache.org/jira/browse/SPARK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493303#comment-14493303 ]

Evan Chan commented on SPARK-6703:

Hey folks,

Thought I would just put in my 2 cents as the author of the Spark Jobserver.

What is the envisioned way for multiple applications to share the same SparkContext? Code has to be running in the same JVM, and for most applications there already must exist some shared knowledge of the framework or environment. This will affect whether this feature is useful or not.

For example, the Spark Jobserver requires jobs to implement an interface, and also manages creation of the SparkContext. That way, jobs get the SparkContext through a method call, and we can have other method calls to do things like input validation.

What I'm saying is that this feature would have little value to existing job server users, as jobs in job server already have a way to discover the existing context, and a way to implement a good RESTful API, for example.

Another thing to think about is SQLContext and HiveContext. I realize there is the JDBC server, but in job server we have a way to pass in alternative forms of the contexts. I suppose you could then add this method to a static SQLContext singleton as well.
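To make the contrast concrete, the interface-based approach described in this comment could look roughly like the sketch below. Context, Job, SumJob and JobServer are hypothetical names chosen for illustration and are not the actual Spark Jobserver API. The server owns the context and hands it to each job through a method call, with a separate call for input validation.

```scala
// Hypothetical stand-in for SparkContext so the sketch runs without Spark.
final class Context(val name: String)

// Jobs implement this trait instead of creating or discovering a context themselves.
trait Job {
  // Check the input before the job is run; Left carries an error message.
  def validate(ctx: Context, input: Seq[Int]): Either[String, Unit]
  // Do the actual work using the context supplied by the server.
  def run(ctx: Context, input: Seq[Int]): Int
}

// A toy job that sums its input.
object SumJob extends Job {
  def validate(ctx: Context, input: Seq[Int]): Either[String, Unit] =
    if (input.isEmpty) Left("input must not be empty") else Right(())

  def run(ctx: Context, input: Seq[Int]): Int = input.sum
}

// The server creates and owns the single context and passes it to every job.
final class JobServer(ctx: Context) {
  // Validate first; only run the job when validation succeeds.
  def submit(job: Job, input: Seq[Int]): Either[String, Int] =
    job.validate(ctx, input).map(_ => job.run(ctx, input))
}
```

Under this design a job never needs a discovery mechanism, which is why a static singleton adds little for such users. The trade-off is that every job must be written against the server's interface.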
[jira] [Commented] (SPARK-6703) Provide a way to discover existing SparkContext's
[ https://issues.apache.org/jira/browse/SPARK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493306#comment-14493306 ]

Apache Spark commented on SPARK-6703:

User 'ilganeli' has created a pull request for this issue: https://github.com/apache/spark/pull/5501
[jira] [Commented] (SPARK-6703) Provide a way to discover existing SparkContext's
[ https://issues.apache.org/jira/browse/SPARK-6703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491877#comment-14491877 ]

Ilya Ganelin commented on SPARK-6703:

Patrick - I can look into this. Thank you.