phoerious commented on a change in pull request #16658:
URL: https://github.com/apache/beam/pull/16658#discussion_r810545034
##########
File path: sdks/python/container/boot.go
##########
@@ -210,15 +224,29 @@ func main() {
wg.Add(len(workerIds))
for _, workerId := range workerIds {
go func(workerId string) {
+ defer wg.Done()
log.Printf("Executing: python %v", strings.Join(args, " "))
- log.Fatalf("Python exited: %v", execx.ExecuteEnv(map[string]string{"WORKER_ID": workerId}, "python", args...))
- wg.Done()
+ log.Printf("Python exited: %v", execx.ExecuteEnv(map[string]string{"WORKER_ID": workerId}, "python", args...))
}(workerId)
}
wg.Wait()
}
-// setup wheel specs according to installed python version
+// setupVenv initialize a local Python venv and set the corresponding env variables
+func setupVenv(dir string) error {
+ log.Printf("Initializing temporary Python venv ...")
+ if _, err := os.Stat(filepath.Join(dir, "pyenv.cfg")); os.IsNotExist(err) {
Review comment:
For the same reasons the "execute only once" logic was added: efficiency and
thread safety. Thread safety is no longer a problem once we use separate
venvs, since we no longer install into the same location. Efficiency may still
be an issue, but as written above, I don't think we can get away with reusing
the venvs: the lifecycle of a job is underspecified, and the worker pool has
no way of knowing when it is safe to clean a venv up.