The Spark job-server project may help (https://github.com/spark-jobserver/spark-jobserver). -- Ali
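As a concrete illustration of that route: spark-jobserver keeps a long-running SparkContext behind a REST interface, so a client uploads a jar once and then triggers jobs over HTTP, without ever creating a SparkContext of its own. Below is a rough Scala sketch; the port (8090), the app name "demo", the jar path, and the job class com.example.WordCountJob are all assumptions taken from the project's README, not a tested recipe.

    import java.net.{HttpURLConnection, URL}
    import java.nio.file.{Files, Paths}

    object JobServerSketch {
      // POST a request body to the job server and return the response text.
      def post(url: String, body: Array[Byte]): String = {
        val conn = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
        conn.setRequestMethod("POST")
        conn.setDoOutput(true)
        conn.getOutputStream.write(body)
        conn.getOutputStream.close()
        try scala.io.Source.fromInputStream(conn.getInputStream).mkString
        finally conn.disconnect()
      }

      def main(args: Array[String]): Unit = {
        val base = "http://localhost:8090" // default job-server port, per its docs
        // Upload the application jar under an app name of our choosing.
        post(s"$base/jars/demo", Files.readAllBytes(Paths.get("target/demo.jar")))
        // Ask the server to run a job class from that jar in its shared SparkContext.
        println(post(s"$base/jobs?appName=demo&classPath=com.example.WordCountJob",
          Array.emptyByteArray))
      }
    }

The job runs inside the server's long-lived SparkContext, so the client process is free of the "one SparkContext per JVM" limit and can submit as many jobs as it likes.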
On Oct 21, 2015, at 11:43 PM, Yuhang Chen <yuhang.c...@foxmail.com> wrote:

> Hi developers, I've run into a problem with Spark, and before opening an issue, I'd like to hear your thoughts.
>
> Currently, if you want to submit a Spark job, you need to write the code, build a jar, and then submit it with spark-submit or org.apache.spark.launcher.SparkLauncher.
>
> But sometimes the RDD operation chain is built dynamically in code, from SQL, or even from a GUI, so it is either inconvenient or impossible to build a separate jar. So I tried something like the code below:
>
> import org.apache.spark.{SparkConf, SparkContext}
>
> val conf = new SparkConf().setAppName("Demo").setMaster("yarn-client")
> val sc = new SparkContext(conf)
> // A simple word count
> sc.textFile("README.md").flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).foreach(println)
>
> When these lines are executed, a Spark job is submitted. However, some problems remain:
>
> 1. It doesn't support all deploy modes, such as yarn-cluster.
> 2. Because of the "only 1 SparkContext per JVM" limit, I cannot run this twice.
> 3. It runs in the same process as my code; no child process is created.
>
> What I wish for is that these problems could be handled by Spark itself, and my request can be summed up as "add a submit() method to SparkContext / StreamingContext / SQLContext". That is, I would like to add one line after the code above:
>
> sc.submit()
>
> and have Spark take care of all the background submission work for me.
>
> I opened an issue for this before, but I couldn't make myself clear back then, so I am writing this email to discuss it with you. Please reply if you need further details, and I'll open an issue for this if you understand my request and believe it is worth doing.
>
> Thanks a lot.
>
> Yuhang Chen.
> yuhang.c...@foxmail.com
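For the child-process and multiple-run concerns (points 2 and 3), it is worth noting that the org.apache.spark.launcher.SparkLauncher mentioned above already spawns spark-submit as a separate child process and supports yarn-cluster, whenever the job can be packaged as a jar. A minimal sketch, where the jar path and main class are placeholders:

    import org.apache.spark.launcher.SparkLauncher

    object LaunchDemo {
      def main(args: Array[String]): Unit = {
        // spark-submit runs in a child process, so this JVM holds no SparkContext
        // and the launch can be repeated, including in yarn-cluster mode.
        val process = new SparkLauncher()
          .setAppResource("/path/to/demo-assembly.jar") // placeholder jar path
          .setMainClass("com.example.WordCount")        // placeholder main class
          .setMaster("yarn-cluster")
          .setConf(SparkLauncher.DRIVER_MEMORY, "1g")
          .launch()                                     // returns a java.lang.Process
        process.waitFor()
      }
    }

Of course, this still requires building a jar first, which is exactly the step the proposal wants to avoid for dynamically built operation chains.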