tvalentyn commented on issue #29663:
URL: https://github.com/apache/beam/issues/29663#issuecomment-1845980486

   `semiPersistDir` should be set by the runner when it launches the SDK harness 
container. It might already be configurable, based on some references in the 
codebase: 
https://github.com/apache/beam/blob/90dd93f5241284da2e49c818af03e98b5132d30a/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L2746
   
   Would setting `semiPersistDir` to `spark.local.dir` work for all Spark 
runner users, so that users don't have to worry about this knob?
   
   Given that a venv in the semi-persist dir doesn't work for Dataflow, we could 
detect Dataflow as a special case and, if it is in use, set 
`RUN_PYTHON_SDK_IN_DEFAULT_ENVIRONMENT`. The extra venv wrap doesn't make much 
sense for Dataflow.
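
   A rough sketch of the two defaults suggested above, only to make the idea 
concrete. It is not how Beam does this today: `plan_harness_launch`, the 
runner names, the `/tmp` fallback, and the value `"1"` for the environment 
variable are all assumptions made for illustration. The only facts it relies 
on are that `spark.local.dir` may hold a comma-separated list of directories 
and that `RUN_PYTHON_SDK_IN_DEFAULT_ENVIRONMENT` is the variable named above.

```python
from typing import Dict, Optional, Tuple

# Assumed fallback location; the real default may differ.
FALLBACK_SEMI_PERSIST_DIR = "/tmp"


def plan_harness_launch(
    runner: str, spark_conf: Optional[Dict[str, str]] = None
) -> Tuple[str, Dict[str, str]]:
    """Hypothetical helper: pick a semi-persist dir and extra env vars
    for the SDK harness container based on the runner in use."""
    spark_conf = spark_conf or {}
    env: Dict[str, str] = {}
    semi_persist_dir = FALLBACK_SEMI_PERSIST_DIR

    kind = runner.strip().lower()
    if kind == "dataflow":
        # Special case: skip the extra venv wrap entirely.
        # The value "1" is an assumption; only the name comes from the thread.
        env["RUN_PYTHON_SDK_IN_DEFAULT_ENVIRONMENT"] = "1"
    elif kind == "spark":
        # spark.local.dir may be a comma-separated list; use the first entry.
        local_dirs = spark_conf.get("spark.local.dir", "")
        first = local_dirs.split(",")[0].strip()
        if first:
            semi_persist_dir = first

    return semi_persist_dir, env
```

   With something like this, Spark users would get a writable directory 
without touching the knob, and Dataflow would bypass the venv in the 
semi-persist dir altogether.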
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
