HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] 
Add a mode to pin Python thread into JVM's
URL: https://github.com/apache/spark/pull/24898#discussion_r338841999
 
 

 ##########
 File path: python/pyspark/context.py
 ##########
 @@ -1010,13 +1010,42 @@ def setJobGroup(self, groupId, description, interruptOnCancel=False):
         ensure that the tasks are actually stopped in a timely manner, but is off by default due
         to HDFS-1208, where HDFS may respond to Thread.interrupt() by marking nodes as dead.
         """
+        warnings.warn(
 
 Review comment:
   What I am worried about is:
   
   Firstly, people use it even though it is buggy, because it works more or less
okay in a single thread without pin-thread mode. Even then, it still seems
possible that another thread is launched and the local properties are reset.
   
   Secondly, even with pin-thread mode, it does not work properly for
inherited threads, as you said.
   
   About warning vs info: PySpark currently does not have a proper logging
system, so we have to rely on either manual printing or the `warnings` module
(which can be integrated into a logging system later if we happen to add one to
PySpark).
   If we print manually, it is tricky for users to control the output. With
warnings, users can control it, for instance, whether the warning is printed
only for the first call or for every call.
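
   To illustrate the point about user control, here is a minimal sketch using only the standard `warnings` module. `set_job_group` is a hypothetical stand-in for the patched `setJobGroup`, the warning text is made up, and `count_warnings` is just a helper for the demonstration, not something from this PR.

```python
import warnings


def set_job_group(group_id):
    # Hypothetical stand-in for SparkContext.setJobGroup; the message text
    # is an assumption, not the wording used in the PR.
    warnings.warn(
        "Job group may not behave as expected across threads.",
        UserWarning,
    )


def count_warnings(action, calls=3):
    # Install a single filter with the given action, call the function
    # several times, and report how many warnings were actually emitted.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action)
        for _ in range(calls):
            set_job_group("my-group")
    return len(caught)


if __name__ == "__main__":
    for action in ("once", "always", "ignore"):
        print(action, count_warnings(action))
```

   A user who wants the warning only on the initial call can install a `"once"` filter, one who wants it on every call can use `"always"`, and `"ignore"` silences it. None of this is possible with a plain `print`.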
