Hi all.
I read the source code in spark/python/pyspark/sql/connect/session.py on master in apache/spark (github.com), and the comment on the "stop" method reads as follows:
def stop(self) -> None:
    # Stopping the session will only close the connection to the current session (and
    # the life cycle of the session is maintained by the server),
    # whereas the regular PySpark session immediately terminates the Spark Context
    # itself, meaning that stopping all Spark sessions.
    # It is controversial to follow the existing the regular Spark session's behavior
    # specifically in Spark Connect the Spark Connect server is designed for
    # multi-tenancy - the remote client side cannot just stop the server and stop
    # other remote clients being used from other users.
So, that's how it was designed.
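In other words, stop() on a Spark Connect client only tears down that client's connection; the server-side session lifecycle and other tenants are unaffected. A minimal sketch of what that implies for client code (assuming the sc://172.29.190.147 endpoint from the message below is reachable):

from pyspark.sql import SparkSession

# Connect to the Spark Connect server and do some work.
spark = SparkSession.builder.remote("sc://172.29.190.147").getOrCreate()
df = spark.range(3)
df.show()

spark.stop()       # closes only this client's connection; the server keeps running
# df.show()        # would now fail with [NO_ACTIVE_SESSION]

# Reconnecting simply establishes a new session against the same server;
# DataFrames from the stopped session must be recreated on the new one.
spark = SparkSession.builder.remote("sc://172.29.190.147").getOrCreate()
spark.range(3).show()

That is also why the sdf.show() after spark.stop() in the message below fails with NO_ACTIVE_SESSION rather than shutting anything down on the server.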
eabour
From: eab...@163.com
Date: 2023-10-20 15:56
To: user @spark
Subject: spark.stop() cannot stop spark connect session
Hi,
my code:
from pyspark.sql import SparkSession
spark = SparkSession.builder.remote("sc://172.29.190.147").getOrCreate()

import pandas as pd

# Create a pandas DataFrame
pdf = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "gender": ["F", "M", "M"]
})

# Convert the pandas DataFrame to a Spark DataFrame
sdf = spark.createDataFrame(pdf)

# Show the Spark DataFrame
sdf.show()

spark.stop()
After stop(), executing sdf.show() throws:

pyspark.errors.exceptions.connect.SparkConnectException: [NO_ACTIVE_SESSION] No active Spark session found. Please create a new Spark session before running the code. Visit the Spark web UI at http://172.29.190.147:4040/connect/ to check if the current session is still running and has not been stopped yet.

Meanwhile the Spark Connect web UI still shows the session as online:
1 session(s) are online, running 0 Request(s)
Session Statistics (1)
User:           (empty)
Session ID:     29f05cde-8f8b-418d-95c0-8dbbbfb556d2
Start Time:     2023/10/20 15:30:04
Finish Time:    (empty)
Duration:       14 minutes 49 seconds
Total Execute:  2
eabour