[
https://issues.apache.org/jira/browse/FLINK-35338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rasmus Thygesen updated FLINK-35338:
------------------------------------
Description:
[This pull
request|https://github.com/apache/flink-kubernetes-operator/pull/609] made it
possible to enable FS plugins on the Flink Kubernetes Operator, which allows
reading the jar for a session job from various file systems. It normally works
well, but we run our cluster with *[Restricted Pod
Security|https://kubernetes.io/docs/concepts/security/pod-security-standards/#restricted]*.
Among other things, this means the Flink Operator pod is configured with
*readOnlyRootFilesystem* and *runAsNonRoot*, so we are not allowed to write to
our plugins directory.
We have tried using *operatorVolumes* and *operatorVolumeMounts* to mount a
volume at */opt/flink/plugins*, which makes the directory writable, but the
mount hides all the pre-installed plugins. With the pre-installed plugins gone
at startup, the operator sees the directory for the plugin we are trying to
install but does not find a jar file inside it, and therefore complains. We
think that when the pre-installed plugins are present, the operator takes a bit
longer before it starts reading the new plugin, which leaves enough time to
download the new plugin with curl.
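
For reference, the volume attempt described above might look roughly like the
following Helm values fragment. This is a minimal sketch, not the reporter's
actual configuration: the *create*/*data* layout is assumed from the chart's
default values and should be checked against the chart version in use, and the
volume name is a hypothetical placeholder.

```yaml
# values.yaml fragment for the flink-kubernetes-operator Helm chart (sketch).
# ASSUMPTION: the create/data layout matches the chart's default values.yaml.
operatorVolumes:
  create: true
  data:
    - name: flink-plugins            # hypothetical volume name
      emptyDir: {}                   # writable scratch volume

operatorVolumeMounts:
  create: true
  data:
    - name: flink-plugins
      # Mounting here makes the directory writable, but the empty volume
      # shadows every plugin that ships inside the operator image.
      mountPath: /opt/flink/plugins
```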
We are open to suggestions for how to solve this issue while keeping
*readOnlyRootFilesystem* and *runAsNonRoot*. We are considering a solution
where we mount a volume and use an init container to download all the
pre-installed plugins as well as any extra plugins we need, and we propose
adding a new value to the Flink Operator Helm chart for this.
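
The init-container idea could be sketched as the plain Kubernetes pod spec
fragment below. It is an illustration under stated assumptions, not an existing
Helm chart value: the image names, the volume name, the plugin directory name
and the download URL are all hypothetical placeholders. Because Kubernetes runs
init containers to completion before the main container starts, the jar would
already be in place when the operator scans the plugins directory.

```yaml
# Pod spec fragment (sketch). All image names, the volume name, the
# "extra-fs" directory and PLUGIN_URL are hypothetical placeholders.
spec:
  volumes:
    - name: flink-plugins
      emptyDir: {}
  initContainers:
    - name: fetch-plugins
      image: example.org/curl-image:tag   # hypothetical image with sh and curl
      command:
        - sh
        - -c
        - |
          set -e
          # Each plugin lives in its own subdirectory of the plugins directory.
          mkdir -p /plugins/extra-fs
          curl -fsSL -o /plugins/extra-fs/plugin.jar "$PLUGIN_URL"
          # The pre-installed plugins would have to be fetched into /plugins
          # here as well, since the volume hides the ones in the image.
      env:
        - name: PLUGIN_URL
          value: "https://example.org/path/to/plugin.jar"
      securityContext:
        runAsNonRoot: true
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
      volumeMounts:
        - name: flink-plugins
          mountPath: /plugins             # only the volume is writable
  containers:
    - name: flink-kubernetes-operator
      image: example.org/flink-kubernetes-operator:tag   # hypothetical
      securityContext:
        runAsNonRoot: true
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
      volumeMounts:
        - name: flink-plugins
          mountPath: /opt/flink/plugins
```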
We have tested that it also works if we build our own image with the plugin
added. However, we need to deploy the operator in different clusters with
different file system requirements, so we would have to create a new image for
each file system and update all our own images every time the official Flink
Operator image is updated.
> Enable FS Plugins as non-root
> -----------------------------
>
> Key: FLINK-35338
> URL: https://issues.apache.org/jira/browse/FLINK-35338
> Project: Flink
> Issue Type: New Feature
> Components: Kubernetes Operator
> Affects Versions: 1.8.0
> Reporter: Rasmus Thygesen
> Priority: Not a Priority