WeichenXu123 commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin 
Python thread into JVM's
URL: https://github.com/apache/spark/pull/24898#issuecomment-503634199
 
 
   @HyukjinKwon 
   
   I suggest testing pinned threads with the following code:
   
   ```
   import random
   import threading
   import time
   
   # Assumes an active SparkSession bound to `spark` (e.g. in the pyspark shell).
   num_threads = 100
   loop_cnt_each_thread = 10
   
   jvm_thread_id_list = [None] * num_threads
   is_thread_check_pass = [True] * num_threads
   
   def test_jvm_thread_id(thread_idx):
       # Stagger the threads so that their JVM calls interleave.
       time.sleep(random.random())
       tid = spark._jvm.java.lang.Thread.currentThread().getId()
       jvm_thread_id_list[thread_idx] = tid
       for _ in range(loop_cnt_each_thread):
           # Every later call from this Python thread must hit the same JVM thread.
           cid = spark._jvm.java.lang.Thread.currentThread().getId()
           if cid != tid:
               is_thread_check_pass[thread_idx] = False
               break
           time.sleep(random.random())
   
   threads = [
       threading.Thread(target=test_jvm_thread_id, args=(i,))
       for i in range(num_threads)
   ]
   
   for t in threads:
       t.start()
   
   for t in threads:
       t.join()
   
   # Each Python thread stayed on a single JVM thread ...
   assert all(is_thread_check_pass)
   # ... and no two Python threads shared a JVM thread.
   assert num_threads == len(set(jvm_thread_id_list))
   ```
   
   The code above checks that:
   * Within one Python thread, repeated calls to JVM methods always run on 
the same JVM thread.
   * With multiple Python threads, each Python thread corresponds to a 
different underlying JVM thread.
   
   My test code uses `spark._jvm.java.lang.Thread.currentThread().getId()` so 
that we can check the JVM thread ID directly.
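
The two checks could also be factored into a reusable helper, roughly as sketched below. This is only a sketch under some assumptions: the name `check_thread_pinning` and its parameters are hypothetical and not part of the PR, and the ID source is passed in as a callable so the logic can be tried without a running Spark session. The barrier is an addition not present in the test above: it keeps every worker alive until all have finished, because the `Thread.getId()` Javadoc allows an ID to be reused once its thread has terminated.

```python
import threading


def check_thread_pinning(get_backend_thread_id, num_threads=8, calls_per_thread=5):
    """Return (stable, distinct) for a callable that reports a backend thread ID.

    stable:   every worker saw the same ID on each of its calls.
    distinct: no two workers saw the same ID.
    Hypothetical helper for illustration; not part of Spark or the PR.
    """
    first_ids = [None] * num_threads
    stable = [False] * num_threads
    # Keep all workers alive until every one has finished its checks, so an
    # ID cannot be recycled by a thread that starts after another has exited.
    barrier = threading.Barrier(num_threads)

    def worker(idx):
        try:
            first = get_backend_thread_id()
            first_ids[idx] = first
            stable[idx] = all(
                get_backend_thread_id() == first for _ in range(calls_per_thread)
            )
        finally:
            barrier.wait(timeout=60)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    return all(stable), len(set(first_ids)) == num_threads
```

With a live session, the callable would presumably be `lambda: spark._jvm.java.lang.Thread.currentThread().getId()`; passing `threading.get_ident` exercises the helper itself without Spark.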
