[ 
https://issues.apache.org/jira/browse/BEAM-13669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479895#comment-17479895
 ] 

Janek Bevendorff edited comment on BEAM-13669 at 1/21/22, 8:31 AM:
-------------------------------------------------------------------

Yes, this is a Kubernetes Deployment. It's less of an issue if I schedule the 
pods dynamically via Flink's native Kubernetes integration, but that has its 
own problems and doesn't always prevent this issue either, for example when a 
new job is submitted before Flink has torn down all previous pods. With 
persistent sidecar containers, the whole thing is kind of unusable at the moment.

The linked issue indeed points to the same problem, but it's not just extra 
dependencies: the application wheel itself is a problem, too.


> Install Python wheel and dependencies to local venv in SDK harness
> ------------------------------------------------------------------
>
>                 Key: BEAM-13669
>                 URL: https://issues.apache.org/jira/browse/BEAM-13669
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-harness
>    Affects Versions: 2.35.0
>            Reporter: Janek Bevendorff
>            Priority: P2
>
> With the {{--setup_file}} option, the Python SDK harness compiles and 
> installs a wheel and its package dependencies on the taskmanager SDK sidecar 
> containers. Unfortunately, this installation is global and persistent, so 
> consecutive job submissions will reuse the previously installed wheel instead 
> of reinstalling or updating it. This makes it impossible to submit updated 
> code (beyond the main .py file, which will always be fresh) without deleting 
> and recreating all taskmanager containers. It also interferes with the 
> dependencies of other jobs and can cause inconsistencies or errors that are 
> extremely hard to debug.
> *Suggestion:* Create a temporary venv per job, install packages there, and 
> delete the venv after job completion or failure.
> *Fixes:* Stale or conflicting dependencies in persistent SDK containers. Pip 
> will not reinstall existing packages by default, even if their version 
> numbers differ, and the number of installed dependency packages grows 
> unnecessarily over time. There is also no point in keeping the 
> {{--setup_file}} wheel installed for all eternity.
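
The per-job venv suggested in the issue could look roughly like the sketch 
below. It is an illustration only, not how the Beam SDK harness actually 
stages packages: the names job_venv and install_wheel are hypothetical, and 
where the harness would call them is left open. It relies only on the Python 
standard library (venv, tempfile, shutil, subprocess).

```python
import contextlib
import os
import shutil
import subprocess
import tempfile
import venv
from pathlib import Path


@contextlib.contextmanager
def job_venv(with_pip=True):
    """Create a throwaway venv for one job and delete it afterwards.

    Hypothetical helper: yields the path to the venv's Python interpreter.
    The venv is removed on normal completion and on failure alike.
    """
    root = Path(tempfile.mkdtemp(prefix="beam-job-venv-"))
    try:
        venv.EnvBuilder(with_pip=with_pip, clear=True).create(root)
        if os.name == "nt":
            yield root / "Scripts" / "python.exe"
        else:
            yield root / "bin" / "python"
    finally:
        # Runs whether the job body returned or raised.
        shutil.rmtree(root, ignore_errors=True)


def install_wheel(python, wheel_path):
    """Install a wheel into the venv that owns the given interpreter."""
    subprocess.run(
        [str(python), "-m", "pip", "install", str(wheel_path)],
        check=True,
    )
```

A job wrapper would then run something like "with job_venv() as py: 
install_wheel(py, wheel)" followed by starting the worker with that 
interpreter. Because every job starts from an empty environment, a rebuilt 
wheel with an unchanged version number is installed again rather than 
skipped, and nothing accumulates in the container's global site-packages.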



--
This message was sent by Atlassian Jira
(v8.20.1#820001)