Kontinuation commented on issue #1771:
URL: https://github.com/apache/sedona/issues/1771#issuecomment-2616864626

   The way we obtain the Spark session in 
[dataframe_api.py](https://github.com/apache/sedona/blob/sedona-1.7.0/python/sedona/sql/dataframe_api.py#L60-L78)
 is problematic in a multi-threaded environment. The "active session" is 
thread-local, and `SparkSession.getActiveSession` will only return a valid 
session in the thread that started the Spark session. I believe the Python 
backend is handling requests in a different thread, so that thread has no 
active session.
   
   What we need for calling a Sedona function is a JVMView object. We can 
obtain this object from `SparkContext._jvm` instead of `spark._jvm`. This does 
not rely on any thread-local state and will work correctly from any thread, as 
long as there is an active Spark context in the current process.
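
   To illustrate the difference, here is a minimal simulation that uses only 
the Python standard library. `FakeSession` and `FakeContext` are hypothetical 
stand-ins, not PySpark classes; they only mimic the behavior described above 
(a per-thread active session versus a process-wide `_jvm` attribute) and are 
not how Sedona or PySpark are actually implemented.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeContext:
    """Hypothetical stand-in for SparkContext: `_jvm` is shared process-wide."""
    _jvm = None


class FakeSession:
    """Hypothetical stand-in for SparkSession: the active session is per-thread."""
    _active = threading.local()

    def __init__(self):
        FakeContext._jvm = "jvm-view"        # placeholder for the JVMView object
        self._jvm = FakeContext._jvm
        FakeSession._active.session = self   # visible only to the creating thread

    @classmethod
    def getActiveSession(cls):
        return getattr(cls._active, "session", None)


def jvm_via_active_session():
    # Mirrors the current approach: go through the thread-local active session.
    session = FakeSession.getActiveSession()
    return None if session is None else session._jvm


def jvm_via_context():
    # Mirrors the proposed approach: read the process-wide class attribute.
    return FakeContext._jvm


FakeSession()  # "start Spark" in the main thread

# Simulate a backend that handles the request in a different thread.
with ThreadPoolExecutor(max_workers=1) as pool:
    worker_session_jvm = pool.submit(jvm_via_active_session).result()
    worker_context_jvm = pool.submit(jvm_via_context).result()

main_session_jvm = jvm_via_active_session()
```

   In this simulation the worker thread gets nothing back from the 
active-session lookup, while the class-level lookup returns the same handle in 
both threads.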


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
