Repository: spark
Updated Branches:
  refs/heads/master b040cef2e -> 285a7798e

[SPARK-18687][PYSPARK][SQL] Backward compatibility - creating a DataFrame on a 
new SQLContext object fails with a Derby error

This change makes SQLContext reuse the active SparkSession during construction 
when the sparkContext supplied is the same as the currently active SparkContext. 
Without this change, a new SparkSession is instantiated, which results in a Derby 
error when attempting to create a DataFrame using a new SQLContext object, even 
though the SparkContext supplied to the new SQLContext is the same as the 
currently active one. Refer to SPARK-18687 for details on the error and a repro.
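The fix below replaces `SparkSession(sparkContext)` with `SparkSession.builder.getOrCreate()`, so a second SQLContext built over the same SparkContext reuses the active session instead of creating a fresh one. The reuse pattern can be illustrated with a minimal plain-Python sketch; the `Session` and `Context` names here are hypothetical stand-ins, not PySpark's actual classes:

```python
# Minimal sketch of the get-or-create reuse pattern behind this fix.
# "Session" and "Context" are hypothetical analogues of SparkSession
# and SQLContext; this is not PySpark's implementation.

class Session:
    _active = None  # the "currently active" session, if any

    def __init__(self, context_id):
        self.context_id = context_id

    @classmethod
    def get_or_create(cls, context_id):
        # Reuse the active session when it wraps the same context,
        # rather than always constructing a new one (the old, buggy path).
        if cls._active is not None and cls._active.context_id == context_id:
            return cls._active
        cls._active = cls(context_id)
        return cls._active


class Context:
    """Analogue of SQLContext: wraps, and now reuses, a Session."""

    def __init__(self, context_id):
        self.session = Session.get_or_create(context_id)


c1 = Context("sc-1")
c2 = Context("sc-1")
assert c1.session is c2.session  # both contexts share one session
```

The old path was equivalent to calling `Session(context_id)` unconditionally, so each new `Context` got its own session; with an embedded metastore (Derby) only one session may hold the lock, hence the error described above.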

Verified with existing unit tests and a new unit test added to pyspark-sql:

/python/run-tests --python-executables=python --modules=pyspark-sql

Please review before opening a pull request.

Author: Vinayak <>
Author: Vinayak Joshi <>

Closes #16119 from vijoshi/SPARK-18687_master.


Branch: refs/heads/master
Commit: 285a7798e267311730b0163d37d726a81465468a
Parents: b040cef
Author: Vinayak <>
Authored: Fri Jan 13 18:35:12 2017 +0800
Committer: Wenchen Fan <>
Committed: Fri Jan 13 18:35:51 2017 +0800

 python/pyspark/sql/ | 2 +-
 python/pyspark/sql/   | 7 ++++++-
 2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/python/pyspark/sql/ b/python/pyspark/sql/
index de4c335..c22f4b8 100644
--- a/python/pyspark/sql/
+++ b/python/pyspark/sql/
@@ -73,7 +73,7 @@ class SQLContext(object):
         self._jsc = self._sc._jsc
         self._jvm = self._sc._jvm
         if sparkSession is None:
-            sparkSession = SparkSession(sparkContext)
+            sparkSession = SparkSession.builder.getOrCreate()
         if jsqlContext is None:
             jsqlContext = sparkSession._jwrapped
         self.sparkSession = sparkSession
diff --git a/python/pyspark/sql/ b/python/pyspark/sql/
index d178285..a825028 100644
--- a/python/pyspark/sql/
+++ b/python/pyspark/sql/
@@ -47,7 +47,7 @@ else:
     import unittest
 from pyspark import SparkContext
-from pyspark.sql import SparkSession, HiveContext, Column, Row
+from pyspark.sql import SparkSession, SQLContext, HiveContext, Column, Row
 from pyspark.sql.types import *
 from pyspark.sql.types import UserDefinedType, _infer_type
 from pyspark.tests import ReusedPySparkTestCase, SparkSubmitTests
@@ -206,6 +206,11 @@ class SQLTests(ReusedPySparkTestCase):
         shutil.rmtree(, ignore_errors=True)
+    def test_sqlcontext_reuses_sparksession(self):
+        sqlContext1 = SQLContext(
+        sqlContext2 = SQLContext(
+        self.assertTrue(sqlContext1.sparkSession is sqlContext2.sparkSession)
     def test_row_should_be_read_only(self):
         row = Row(a=1, b=2)
         self.assertEqual(1, row.a)
