kw2542 commented on pull request #15081:
URL: https://github.com/apache/beam/pull/15081#issuecomment-878505102


   > > I am curious why artifact staging does not work with threads? I wonder 
if we should fix that instead of introducing yet more complexity to this 
already complex API.
   > > In Python, I thought we used processes instead of threads because of the 
GIL. But Java has no GIL, so I'm not sure there is an advantage to using 
processes.
   > 
   > Using threads still makes sense for IO bound tasks in Python since Python 
can parallelize IO effectively. Python's GIL is problematic for CPU bound tasks.
   
   @lukecwik @ibzib Correct me if I am wrong, but my understanding is that we 
use process mode mainly because it simplifies the workflow by reusing the 
boot executable, which can only be run in a subprocess, not in a thread. 
In addition, the boot executable itself starts the actual worker in a subprocess.
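   
   To illustrate the constraint, here is a minimal sketch only. `launch_boot` is a made-up name, and `sys.executable` stands in for the real boot binary, so none of this is Beam code. A standalone executable has to be started as a child process, and it can in turn spawn the worker as a grandchild:

```python
import subprocess
import sys


def launch_boot(worker_id: str) -> str:
    """Run a stand-in 'boot' executable that spawns a stand-in worker process."""
    # Hypothetical stand-in for the worker: prints its id and exits.
    worker_code = "import sys; print('worker ' + sys.argv[1])"
    # Hypothetical stand-in for boot: starts the worker in its own subprocess.
    boot_code = (
        "import subprocess, sys; "
        "subprocess.run([sys.executable, '-c', sys.argv[1], sys.argv[2]], check=True)"
    )
    # The pool can only reach the boot logic through a child process.
    result = subprocess.run(
        [sys.executable, "-c", boot_code, worker_code, worker_id],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

   Running the same logic in a thread would mean reimplementing what the boot executable does inside the pool itself.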
   
   It is true that we could implement a new workflow that supports thread mode 
instead of relying on the boot executable, but that would be significantly more 
work. Let me know if you think it is worth the effort.
   
   In addition, I am wondering if we could add a prepare step to external pool 
mode, so that we would not need to run artifact staging for each start worker 
request. WDYT?
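   
   A rough sketch of the prepare-step idea, assuming staging can be keyed by an environment id. The class, method names, and `stage_artifacts` callback are all hypothetical and not part of the Beam API:

```python
from typing import Callable, Dict, List


class ExternalWorkerPool:
    """Hypothetical pool that stages artifacts once per environment."""

    def __init__(self, stage_artifacts: Callable[[str], str]) -> None:
        # stage_artifacts is an assumed callback: environment id -> staged directory.
        self._stage_artifacts = stage_artifacts
        self._staged: Dict[str, str] = {}
        self.started: List[str] = []

    def prepare(self, environment_id: str) -> str:
        # Stage only on the first call for this environment; reuse the result afterwards.
        if environment_id not in self._staged:
            self._staged[environment_id] = self._stage_artifacts(environment_id)
        return self._staged[environment_id]

    def start_worker(self, worker_id: str, environment_id: str) -> str:
        # Each start worker request reuses the already staged artifacts.
        staged_dir = self.prepare(environment_id)
        self.started.append(worker_id)
        return staged_dir
```

   With this shape, two start worker requests for the same environment trigger staging only once. The sketch ignores concurrency and cleanup, which a real implementation would have to handle.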
   
   
   



