Kristin Cowalcijk created SEDONA-706:
----------------------------------------
Summary: Python DataFrame API have problem working in multi-threaded environment
Key: SEDONA-706
URL: https://issues.apache.org/jira/browse/SEDONA-706
Project: Apache Sedona
Issue Type: Bug
Reporter: Kristin Cowalcijk
Fix For: 1.7.1

This issue was reported in [https://github.com/apache/sedona/issues/1771|https://github.com/apache/sedona/issues/1771]. The user wanted to call ST functions through the DataFrame API, but an exception was raised.

Further investigation showed that the DataFrame API relies on {{SparkSession.getActiveSession}} to construct Spark SQL UDF calls. The "active session" is thread-local, and {{SparkSession.getActiveSession}} only returns a valid session in the thread that started the Spark session. I believe the Python backend handles requests in a different thread, so that thread has no active session.

What we need for calling a Sedona function is a JVMView object. We can obtain this object from {{SparkContext._jvm}} instead of {{spark._jvm}}. This does not use any thread-local state and works correctly whenever there is an active Spark context in the current process.
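
The difference between the two lookups can be illustrated with a small standard-library model. This is only a sketch of the behavior described above, not PySpark or Sedona code: {{start_session}}, {{get_active_session}}, {{get_process_context}} and {{lookup_from_worker}} are hypothetical names, where the thread-local slot stands in for {{SparkSession.getActiveSession}} and the process-wide slot stands in for the state reachable through {{SparkContext._jvm}}.

```python
import threading

# Hypothetical stand-ins (not PySpark internals): one thread-local slot that
# models the "active session", one module-level slot that models state shared
# by the whole process.
_thread_state = threading.local()
_process_context = None


def start_session(name):
    """Record the session in both the thread-local and the process-wide slot."""
    global _process_context
    _thread_state.active_session = name
    _process_context = name


def get_active_session():
    """Thread-local lookup: only the thread that called start_session sees it."""
    return getattr(_thread_state, "active_session", None)


def get_process_context():
    """Process-wide lookup: every thread in the process sees the same value."""
    return _process_context


def lookup_from_worker():
    """Run both lookups in a fresh thread and return what that thread saw."""
    seen = {}

    def worker():
        seen["thread_local"] = get_active_session()
        seen["process_wide"] = get_process_context()

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return seen


# The main thread starts the session, as the application's startup code would.
start_session("demo-session")
main_view = {
    "thread_local": get_active_session(),
    "process_wide": get_process_context(),
}
# A request-handling thread then performs the same two lookups.
worker_view = lookup_from_worker()

print(main_view)    # both lookups succeed in the starting thread
print(worker_view)  # only the process-wide lookup succeeds in the worker
```

In this model the worker thread gets {{None}} from the thread-local lookup but still finds the process-wide value, which mirrors why a lookup that does not depend on thread-local state keeps working when requests are served from other threads.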