Re: [Spark SQL] SQLContext getOrCreate incorrect behaviour

2015-12-23 Thread Jerry Lam
Hi Kostas, Thank you for the references to the 2 tickets. They help me understand some of the odd behaviour I have seen lately. Best Regards, Jerry On Wed, Dec 23, 2015 at 2:32 AM, kostas papageorgopoylos wrote: > Hi > > Fyi > The following 2 tickets are blocking currently

Re: [Spark SQL] SQLContext getOrCreate incorrect behaviour

2015-12-22 Thread Sean Owen
I think the original idea is that the life of the driver is the life of the SparkContext: the context is stopped when the driver finishes. Or: if for some reason the context dies or there's an unrecoverable error, that's it for the driver. (There's nothing wrong with stop(), right? You have to

Re: [Spark SQL] SQLContext getOrCreate incorrect behaviour

2015-12-22 Thread Jerry Lam
Hi Sean, What if the Spark context stops for involuntary reasons (e.g. misbehaving connections)? Then we need to handle the failure programmatically by recreating the Spark context. Is there something I don't understand about the assumptions on how a Spark context should be used? I tend to think of

Re: [Spark SQL] SQLContext getOrCreate incorrect behaviour

2015-12-22 Thread kostas papageorgopoylos
Hi Fyi The following 2 tickets are currently blocking (for releases up to 1.5.2) the pattern of starting and stopping a SparkContext inside the same driver program: https://issues.apache.org/jira/browse/SPARK-11700 -> memory leak in SQLContext https://issues.apache.org/jira/browse/SPARK-11739 In

Re: [Spark SQL] SQLContext getOrCreate incorrect behaviour

2015-12-21 Thread Zhan Zhang
This looks to me like a very unusual use case. You stop the SparkContext and start another one. I don’t think that is well supported. Once the SparkContext is stopped, all its resources are supposed to be released. Is there any mandatory reason you have to stop and restart another SparkContext?

Re: [Spark SQL] SQLContext getOrCreate incorrect behaviour

2015-12-21 Thread Jerry Lam
Hi Zhan, I'm illustrating the issue via a simple example. However, it is not difficult to imagine use cases that need this behaviour. For example, you may want to release all of Spark's resources when it has not been used for longer than an hour, in a job server that works like a web service. Unless you can prevent
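The job-server scenario described above (stop the context after an idle period, recreate it on the next request) can be sketched as a small manager around the stop-and-recreate cycle. This is a toy illustration only: the `Ctx` and `IdleManager` classes below are hypothetical stand-ins, not Spark APIs, and real Spark up to 1.5.2 is exactly where this pattern was unsafe per the tickets cited later in the thread.

```scala
// Hypothetical idle-aware manager illustrating the stop-and-recreate
// pattern from the discussion. Ctx is a toy stand-in for SparkContext.
class Ctx {
  private var active = true
  def stop(): Unit = { active = false }
  def isStopped: Boolean = !active
}

class IdleManager(idleMillis: Long) {
  private var ctx: Option[Ctx] = None
  private var lastUsed: Long = 0L

  // Return a live context, recreating it if it was stopped or never made.
  def acquire(now: Long): Ctx = synchronized {
    lastUsed = now
    ctx match {
      case Some(c) if !c.isStopped => c
      case _ =>
        val c = new Ctx
        ctx = Some(c)
        c
    }
  }

  // Called periodically; stops the context once it has sat idle too long.
  def reapIfIdle(now: Long): Unit = synchronized {
    ctx.foreach { c =>
      if (!c.isStopped && now - lastUsed >= idleMillis) c.stop()
    }
  }
}
```

With a one-hour `idleMillis`, a reaper thread would call `reapIfIdle` periodically, and the next `acquire` after a reap transparently builds a fresh context.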

Re: [Spark SQL] SQLContext getOrCreate incorrect behaviour

2015-12-21 Thread Ted Yu
In Jerry's example, the first SparkContext, sc, has been stopped. So there would be only one SparkContext running at any given moment. Cheers On Mon, Dec 21, 2015 at 8:23 AM, Chester @work wrote: > Jerry > I thought you should not create more than one SparkContext

Re: [Spark SQL] SQLContext getOrCreate incorrect behaviour

2015-12-21 Thread Chester @work
Jerry I thought you should not create more than one SparkContext within one Jvm, ... Chester Sent from my iPhone > On Dec 20, 2015, at 2:59 PM, Jerry Lam wrote: > > Hi Spark developers, > > I found that SQLContext.getOrCreate(sc: SparkContext) does not behave >

Re: [Spark SQL] SQLContext getOrCreate incorrect behaviour

2015-12-20 Thread Yin Huai
Hi Jerry, Looks like https://issues.apache.org/jira/browse/SPARK-11739 is for the issue you described. It has been fixed in 1.6. With this change, when you call SQLContext.getOrCreate(sc2), we will first check if sc has been stopped. If so, we will create a new SQLContext using sc2. Thanks, Yin
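The fix Yin describes for SPARK-11739 can be sketched with a toy singleton: before returning the cached instance, check whether the context it was built on has been stopped, and if so rebuild from the newly supplied context. The `Ctx` and `SqlCtx` names below are hypothetical stand-ins to show the shape of the check, not Spark's actual implementation.

```scala
// Toy sketch of the post-fix getOrCreate logic (hypothetical classes,
// not Spark's real code). Ctx stands in for SparkContext.
class Ctx {
  private var active = true
  def stop(): Unit = { active = false }
  def isStopped: Boolean = !active
}

class SqlCtx(val ctx: Ctx)

object SqlCtx {
  private var cached: Option[SqlCtx] = None

  // Post-fix behaviour: if the cached instance was built on a context
  // that has since been stopped, replace it with one built on `ctx`.
  def getOrCreate(ctx: Ctx): SqlCtx = synchronized {
    cached match {
      case Some(existing) if !existing.ctx.isStopped => existing
      case _ =>
        val created = new SqlCtx(ctx)
        cached = Some(created)
        created
    }
  }
}
```

After stopping the first context and calling `getOrCreate` with a second one, the returned instance now wraps the live second context instead of the stopped first one.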

[Spark SQL] SQLContext getOrCreate incorrect behaviour

2015-12-20 Thread Jerry Lam
Hi Spark developers, I found that SQLContext.getOrCreate(sc: SparkContext) does not behave correctly when a different Spark context is provided.
```
val sc = new SparkContext
val sqlContext = SQLContext.getOrCreate(sc)
sc.stop
...
val sc2 = new SparkContext
val sqlContext2 =
```
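The message is truncated, but the failure mode can be illustrated with a toy stand-in for the cached-singleton pattern: the pre-1.6 `getOrCreate` caches the first instance and keeps returning it even after its underlying context is stopped and a new one is passed in. The `Ctx` and `SqlCtx` classes below are hypothetical illustrations, not Spark's real classes.

```scala
// Toy stand-in for the pre-1.6 SQLContext.getOrCreate behaviour.
// Ctx and SqlCtx are hypothetical, not Spark classes.
class Ctx {
  private var active = true
  def stop(): Unit = { active = false }
  def isStopped: Boolean = !active
}

class SqlCtx(val ctx: Ctx)

object SqlCtx {
  private var cached: Option[SqlCtx] = None

  // Pre-fix behaviour: always return the cached instance,
  // even if it was built on a context that has since been stopped.
  def getOrCreate(ctx: Ctx): SqlCtx = cached match {
    case Some(existing) => existing // ignores `ctx` entirely
    case None =>
      val created = new SqlCtx(ctx)
      cached = Some(created)
      created
  }
}

object Demo extends App {
  val c1 = new Ctx
  val s1 = SqlCtx.getOrCreate(c1)
  c1.stop()

  val c2 = new Ctx
  val s2 = SqlCtx.getOrCreate(c2)

  // s2 still wraps the stopped c1 -- this is the reported bug.
  println(s2.ctx eq c1)     // true
  println(s2.ctx.isStopped) // true
}
```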