Vladimir Feinberg created SPARK-16262:
-----------------------------------------
Summary: Impossible to remake new SparkContext using SparkSession
API in Pyspark
Key: SPARK-16262
URL: https://issues.apache.org/jira/browse/SPARK-16262
Project: Spark
Issue Type: Bug
Components: PySpark
Reporter: Vladimir Feinberg
There are several use cases in which one might want to stop and restart a
{{SparkSession}}, such as configuration changes or modular testing. The
following code demonstrates that, unless the hidden global is cleared manually
with {{SparkSession._instantiatedContext = None}}, it is impossible to create a
new Spark session after stopping one in the same process:
{code}
>>> from pyspark.sql import SparkSession
>>> spark = SparkSession.builder.getOrCreate()
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/06/28 11:28:10 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/06/28 11:28:10 WARN Utils: Your hostname, vlad-databricks resolves to a loopback address: 127.0.1.1; using 192.168.3.166 instead (on interface enp0s31f6)
16/06/28 11:28:10 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
>>> spark.stop()
>>> spark = SparkSession.builder.getOrCreate()
>>> spark.createDataFrame([(1,)])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyspark/sql/session.py", line 514, in createDataFrame
    rdd, schema = self._createFromLocal(map(prepare, data), schema)
  File "pyspark/sql/session.py", line 394, in _createFromLocal
    return self._sc.parallelize(data), schema
  File "pyspark/context.py", line 410, in parallelize
    numSlices = int(numSlices) if numSlices is not None else self.defaultParallelism
  File "pyspark/context.py", line 346, in defaultParallelism
    return self._jsc.sc().defaultParallelism()
AttributeError: 'NoneType' object has no attribute 'sc'
{code}
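Until this is fixed, the manual workaround mentioned above could be wrapped in a
small helper. This is only a sketch: {{stop_and_reset}} and its {{session_cls}}
parameter are hypothetical names introduced here, and
{{_instantiatedContext}} is the private attribute named in this report, which
may be absent or renamed in other PySpark versions.

```python
def stop_and_reset(spark, session_cls=None):
    """Stop `spark`, then clear the cached-session global named in this report.

    `session_cls` is a hypothetical hook so the helper can be exercised
    without a running Spark installation; it defaults to SparkSession.
    """
    if session_cls is None:
        # Imported lazily so the helper itself does not require PySpark.
        from pyspark.sql import SparkSession
        session_cls = SparkSession

    spark.stop()
    # Assumption: this private attribute is what keeps the stale session
    # alive, as described in this report. It is not a public API.
    session_cls._instantiatedContext = None
```

After {{stop_and_reset(spark)}}, a subsequent
{{SparkSession.builder.getOrCreate()}} should build a fresh session instead of
returning the stopped one, assuming the attribute behaves as described here.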