zendesk-kjaanson commented on issue #32743:
URL: https://github.com/apache/beam/issues/32743#issuecomment-2582822746

   I solved the issue for myself, though I am not sure how relevant it is to 
the issue at hand here. In my case, when trying to use PortableRunner with 
Flink via the Apache Flink Operator, the staging volume was not accessible to, 
or not the same for, the job manager and the task managers/workers. For some 
reason this causes empty files for `submission_environment_dependencies.txt` 
(and, if you use `save_main_session`, also for `pickled_main_session`) to 
appear in `/tmp/staged`, which then results in the 
`failed to retrieve staged files: failed to retrieve /tmp/staged 
in 3 attempts: failed to retrieve chunk for 
/tmp/staged/submission_environment_dependencies.txt` error when the 
worker process tries to load them.
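   
   A minimal sketch for spotting this symptom, assuming the staged artifacts 
live under `/tmp/staged` as in the error above. The helper name 
`find_empty_staged_files` is hypothetical and is not part of Beam; it only 
lists zero-byte files in a directory.

```python
from pathlib import Path


def find_empty_staged_files(staging_dir):
    """Return the sorted relative paths of zero-byte files under staging_dir.

    Hypothetical helper for illustration only; it is not a Beam API.
    """
    root = Path(staging_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"staging directory not found: {root}")
    return sorted(
        str(path.relative_to(root))
        for path in root.rglob("*")
        if path.is_file() and path.stat().st_size == 0
    )


if __name__ == "__main__":
    # Path taken from the error message; adjust it if your setup stages elsewhere.
    try:
        for name in find_empty_staged_files("/tmp/staged"):
            print(f"empty staged file: {name}")
    except FileNotFoundError as err:
        print(err)
```

   Running it inside a task manager/worker pod would show whether 
`submission_environment_dependencies.txt` or `pickled_main_session` arrived 
empty there.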
   
   My issue was solved when I was able to create a **working** shared staging 
volume across pods.
   
   _Fun side note that might be helpful for someone: when you try to create a 
host-mounted PersistentVolume with the ReadWriteMany access mode on Google's 
GKE and use it as a volume, it never actually tells you that you can't do it, 
but simply mounts random (different) volumes across all pods. The docs mention 
that it is not supposed to be supported :D. I went with the FUSE CSI driver, 
which solved the issue for GKE._
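
   One way to catch the "different volume per pod" case is a marker-file 
check: write a random token from one pod and compare it from another. This is 
only a sketch; the function names and the marker file name are hypothetical, 
and the mount path is whatever your pods use for the staging volume.

```python
import uuid
from pathlib import Path

# Hypothetical marker file name, chosen for this illustration only.
MARKER_NAME = ".shared-volume-check"


def write_marker(mount_dir):
    """Write a fresh random token into mount_dir and return it."""
    token = uuid.uuid4().hex
    (Path(mount_dir) / MARKER_NAME).write_text(token)
    return token


def marker_matches(mount_dir, expected_token):
    """Return True if mount_dir holds a marker file containing expected_token."""
    marker = Path(mount_dir) / MARKER_NAME
    return marker.is_file() and marker.read_text().strip() == expected_token
```

   For example, call `write_marker` on the mount path in the job manager pod, 
then call `marker_matches` with the returned token in each task manager pod. 
A `False` result means that pod is not seeing the same volume.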


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
