WeichenXu123 commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin 
Python thread into JVM's
URL: https://github.com/apache/spark/pull/24898#issuecomment-514085398
 
 
   @holdenk 
   > (Because if it's just fixing the job group ID we might be able to find a 
simpler way to track that information in the Python thread and pass it through 
each time)?
   
   +1. Pinning each Python-side thread to a JVM thread should be the simplest way. 
Otherwise, multiple threads in PySpark will cause confusion over thread-local 
information, and we would have to switch the JVM-side thread-local info for 
different Python threads.
   
   The current master code is in a buggy state, i.e.:
   If the Python side has multiple threads and the running times of their 
function calls do not overlap, they reuse the same JVM thread. If one Python 
thread's function call is occupying that JVM thread, a function call from 
another Python thread will use a new JVM thread.
   This buggy state leaves us no way to control thread-local info in PySpark, 
and any APIs that rely on thread-local info won't work correctly. So we have 
to fix this.
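
The behaviour described above can be sketched with a toy model. This is not Py4J or Spark code: `FakeJvmThread`, `FakeGateway`, `set_job_group`, and `get_job_group` are hypothetical names, and the only assumption is the one stated in the comment, namely that an idle JVM thread is reused and a new one is created when all are busy.

```python
import threading


class FakeJvmThread:
    """Hypothetical stand-in for one JVM thread and its thread-local properties."""

    def __init__(self, name):
        self.name = name
        self.local_props = {}


class FakeGateway:
    """Toy model: reuse an idle JVM thread if one exists, else create a new one."""

    def __init__(self):
        self._idle = []
        self._lock = threading.Lock()
        self._created = 0

    def acquire(self):
        with self._lock:
            if self._idle:
                return self._idle.pop()
            self._created += 1
            return FakeJvmThread("jvm-%d" % self._created)

    def release(self, jvm_thread):
        with self._lock:
            self._idle.append(jvm_thread)


def set_job_group(gateway, group):
    """Store a job group in whichever JVM thread serves this call."""
    jvm_thread = gateway.acquire()
    try:
        jvm_thread.local_props["jobGroup"] = group
        return jvm_thread.name
    finally:
        gateway.release(jvm_thread)


def get_job_group(gateway):
    """Read the job group from whichever JVM thread serves this call."""
    jvm_thread = gateway.acquire()
    try:
        return jvm_thread.name, jvm_thread.local_props.get("jobGroup")
    finally:
        gateway.release(jvm_thread)


gateway = FakeGateway()
results = {}

# Case 1: two Python threads whose calls do not overlap in time.
a = threading.Thread(
    target=lambda: results.update(a_set_on=set_job_group(gateway, "group-A")))
a.start()
a.join()
b = threading.Thread(
    target=lambda: results.update(b_sees=get_job_group(gateway)))
b.start()
b.join()
leaked = results["b_sees"]  # thread B is served by A's JVM thread

# Case 2: the first JVM thread is busy while another call arrives.
busy = gateway.acquire()
fresh = get_job_group(gateway)  # served by a newly created JVM thread
gateway.release(busy)

print(leaked, fresh)
```

In the first case the second Python thread reads a job group it never set; in the second case a call lands on a fresh JVM thread with no job group at all. Which of the two happens depends only on timing, which is why thread-local APIs cannot be relied on without pinning.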
