I think the original idea is that the life of the driver is the life of the SparkContext: the context is stopped when the driver finishes. Or: if for some reason the "context" dies or there's an unrecoverable error, that's it for the driver.
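For concreteness, here's a minimal sketch of that single-context-per-driver pattern (Spark 1.x APIs; the object name and app name are just illustrative, not an official template):

```
// Minimal sketch: one SparkContext for the whole life of the driver,
// stop() only on the way out. Spark 1.x APIs assumed; names are illustrative.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SingleContextApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("single-context-example")
    val sc = new SparkContext(conf)
    val sqlContext = SQLContext.getOrCreate(sc)
    try {
      // ... submit all jobs against the same sc / sqlContext here ...
    } finally {
      sc.stop() // clean shutdown when the driver finishes
    }
  }
}
```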
(There's nothing wrong with stop(), right? You have to call that when the driver ends to shut down Spark cleanly. It's restarting another context that's at issue.)

This makes the most sense in the context of a resource manager, which can conceivably restart a driver if you like, but can't reach into your program. That's probably still the best way to think of it. Still, it would be nice if SparkContext were friendlier to a restart, just as a matter of design. AFAIK it is; not sure about SQLContext though. If it's not a priority, it's just because this isn't a usual usage pattern, which doesn't mean it's crazy, just not the primary pattern.

On Tue, Dec 22, 2015 at 5:57 PM, Jerry Lam <chiling...@gmail.com> wrote:
> Hi Sean,
>
> What if the spark context stops for involuntary reasons (misbehavior of some
> connections)? Then we need to programmatically handle the failure by
> recreating the spark context. Is there something I don't understand/know
> about the assumptions on how to use a spark context? I tend to think of it
> as a resource manager/scheduler for spark jobs. Are you guys planning to
> deprecate the stop method in Spark?
>
> Best Regards,
>
> Jerry
>
> Sent from my iPhone
>
>> On 22 Dec, 2015, at 3:57 am, Sean Owen <so...@cloudera.com> wrote:
>>
>> Although in many cases it does work to stop and then start a second
>> context, it wasn't how Spark was originally designed, and I still see
>> gotchas. I'd avoid it. I don't think you should have to release some
>> resources; just keep the same context alive.
>>
>>> On Tue, Dec 22, 2015 at 5:13 AM, Jerry Lam <chiling...@gmail.com> wrote:
>>> Hi Zhan,
>>>
>>> I'm illustrating the issue via a simple example. However, it is not
>>> difficult to imagine use cases that need this behaviour. For example, you
>>> want to release all of Spark's resources when it has not been used for
>>> longer than an hour in a job server, like web services. Unless you can
>>> prevent people from stopping the spark context, it is reasonable to assume
>>> that people can stop it and start it again at a later time.
>>>
>>> Best Regards,
>>>
>>> Jerry
>>>
>>>
>>>> On Mon, Dec 21, 2015 at 7:20 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:
>>>>
>>>> This looks to me like a very unusual use case. You stop the SparkContext,
>>>> and start another one. I don't think it is well supported. As the
>>>> SparkContext is stopped, all the resources are supposed to be released.
>>>>
>>>> Is there any mandatory reason you have to stop and restart another
>>>> SparkContext?
>>>>
>>>> Thanks.
>>>>
>>>> Zhan Zhang
>>>>
>>>> Note that when sc is stopped, all resources are released (for example in
>>>> yarn).
>>>>
>>>>> On Dec 20, 2015, at 2:59 PM, Jerry Lam <chiling...@gmail.com> wrote:
>>>>>
>>>>> Hi Spark developers,
>>>>>
>>>>> I found that SQLContext.getOrCreate(sc: SparkContext) does not behave
>>>>> correctly when a different spark context is provided.
>>>>>
>>>>> ```
>>>>> val sc = new SparkContext
>>>>> val sqlContext = SQLContext.getOrCreate(sc)
>>>>> sc.stop
>>>>> ...
>>>>>
>>>>> val sc2 = new SparkContext
>>>>> val sqlContext2 = SQLContext.getOrCreate(sc2)
>>>>> sc2.stop
>>>>> ```
>>>>>
>>>>> The sqlContext2 will reference sc instead of sc2 and, therefore, the
>>>>> program will not work because sc has been stopped.
>>>>>
>>>>> Best Regards,
>>>>>
>>>>> Jerry
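A possible workaround sketch for the getOrCreate behaviour reported in the quoted thread, under the assumption you're on a Spark 1.x release where the cached instance isn't invalidated when its SparkContext stops: construct the SQLContext directly against the new SparkContext so a stale cached instance is never picked up (the app name below is illustrative).

```
// Workaround sketch only (Spark 1.x assumed): build the SQLContext directly
// from the new SparkContext instead of relying on SQLContext.getOrCreate,
// which may hand back an instance bound to the previously stopped context.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc2 = new SparkContext(new SparkConf().setAppName("restarted-context"))
val sqlContext2 = new SQLContext(sc2) // never reuses the old, stopped sc
// ... run queries against sqlContext2 ...
sc2.stop()
```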