HyukjinKwon commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin 
Python thread into JVM's
URL: https://github.com/apache/spark/pull/24898#issuecomment-529158091
 
 
   @squito, the analysis in the JIRA description at SPARK-29017 seems to match 
the one here.
   
   I also agree with:
   
   > I think the right way to fix this is to keep a python thread-local 
tracking these properties, and then sending them through to the JVM on calls to 
submitJob. This is going to be a headache to get right, though; we've also got 
to handle implicit calls, eg. rdd.collect(), rdd.forEach(), etc. And of course 
users may have defined their own functions, which will be broken until they fix 
it to use the same thread-locals.
   
   My impression was that, to do this, we would basically need to land fixes in 
Py4J to store and set the local properties on every command interaction. I 
haven't looked closely at that yet, because I thought the approach in this PR 
is easier and cleaner, with minimal changes.
   
   So this needs some discussion and agreement on the approach we will take.

