Re: flink-kubernetes-operator: image entrypoint misbehaves due to inability to write

Andrew Otto Thu, 01 Dec 2022 08:19:14 -0800

> Andrew please see my previous response, that covers the secrets case.
> kubernetes.jobmanager.entrypoint.args: -D datadog.secret.conf=$MY_SECRET_ENV


This way^?  Ya, that makes sense.  It'd be nice if there was a way to get
Secrets into the values used for rendering flink-conf.yaml too, so the
confs would all be in the same place.
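
To illustrate what I mean by getting Secrets into the rendered config, here is a minimal Python sketch of env-based rendering. It is not how Flink or the operator does it today. The function name render_conf, the ${VAR} placeholder syntax, and the DD_API_KEY variable are all made up for the example, and the config key is only illustrative:

```python
import re
from typing import Mapping

# Hypothetical placeholder syntax: ${NAME}. Flink itself does not
# expand placeholders like this in flink-conf.yaml.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def render_conf(template: str, env: Mapping[str, str]) -> str:
    """Replace every ${NAME} in the template with env[NAME].

    Raises KeyError if a referenced variable is not set, so a missing
    secret fails loudly instead of rendering an empty value.
    """

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    # Illustrative template; in a pod, env would be os.environ with the
    # variable populated from a Kubernetes Secret.
    template = "metrics.reporter.dghttp.apikey: ${DD_API_KEY}\n"
    print(render_conf(template, {"DD_API_KEY": "example-key"}), end="")
```

The rendered text would still have to be written somewhere writable, since the ConfigMap mount itself is read-only.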





On Thu, Dec 1, 2022 at 9:30 AM Gyula Fóra <gyula.f...@gmail.com> wrote:

> Andrew please see my previous response, that covers the secrets case.
>
> Gyula
>
> On Thu, Dec 1, 2022 at 2:54 PM Andrew Otto <o...@wikimedia.org> wrote:
>
>> > several failures to write into $FLINK_HOME/conf/.
>> I'm working on
>> <https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/>
>> building Flink and flink-kubernetes-operator images for the Wikimedia
>> Foundation, and I found this strange as well.  It makes sense in a docker /
>> docker-compose only environment, but in k8s, where a ConfigMap is
>> responsible for flink-conf.yaml (and logs all go to the console, not
>> FLINK_HOME/log), I'd prefer that the image not be modified by the
>> ENTRYPOINT.
>>
>> I believe that for flink-kubernetes-operator, the docker-entrypoint.sh
>> <https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh>
>> provided by flink-docker is not really needed.  It seems to be written more
>> for deployments outside of kubernetes.
>>  flink-kubernetes-operator never calls the built in subcommands (e.g.
>> standalone-job), and always runs in 'pass-through' mode, just execing the
>> args passed to it.  At WMF we build
>> <https://doc.wikimedia.org/docker-pkg/> our own images, so I'm planning
>> on removing all of the stuff in ENTRYPOINTs that mangles the image.
>> Anything that I might want to keep from docker-entrypoint.sh (like enabling
>> jemalloc
>> <https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/6/images/flink/Dockerfile.template#73>)
>> I should be able to do in the Dockerfile at image creation time.
>>
>> > want to set an API key as part of the flink-conf.yaml file, but we
>> > don't want it to be persisted in Kubernetes or in our version control
>> I personally am still pretty green at k8s, but would using Kubernetes
>> Secrets
>> <https://kubernetes.io/docs/concepts/configuration/secret/#use-case-secret-visible-to-one-container-in-a-pod>
>> work for your use case? I know we use them at WMF. From a quick glance,
>> I'm not sure how to combine them with flink-kubernetes-operator's ConfigMap
>> that renders flink-conf.yaml, but I feel like there should be a way.
>>
>>
>>
>>
>> On Wed, Nov 30, 2022 at 4:59 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
>>
>>> Hi Lucas!
>>>
>>> The Flink Kubernetes integration itself, not the operator, is responsible
>>> for mounting the ConfigMap and overwriting the entrypoint. Therefore this
>>> is not something we can easily change from the operator side. However, I
>>> think we are looking at the problem from the wrong side and there may be a
>>> solution already :)
>>>
>>> Ideally what you want is ENV replacement in Flink configuration. This is
>>> not something that the Flink community has added yet, unfortunately, but we
>>> have it on our radar for the operator at least (
>>> https://issues.apache.org/jira/browse/FLINK-27491). It will probably be
>>> added in the upcoming 1.4.0 version.
>>>
>>> This will be possible from Flink 1.16, which introduced a small feature
>>> that allows us to inject parameters into the Kubernetes entrypoints:
>>> https://issues.apache.org/jira/browse/FLINK-29123
>>>
>>> https://github.com/apache/flink/commit/c37643031dca2e6d4c299c0d704081a8bffece1d
>>>
>>> While it's not implemented in the operator yet, you could try setting
>>> the following config in Flink 1.16.0:
>>> kubernetes.jobmanager.entrypoint.args: -D datadog.secret.conf=$MY_SECRET_ENV
>>> kubernetes.taskmanager.entrypoint.args: -D datadog.secret.conf=$MY_SECRET_ENV
>>>
>>> If you use this configuration together with the default native mode in
>>> the operator, I believe it should work.
>>>
>>> Please try and let me know!
>>> Gyula
>>>
>>> On Wed, Nov 30, 2022 at 10:36 PM Lucas Caparelli <
>>> lucas.capare...@gympass.com> wrote:
>>>
>>>> Hello folks,
>>>>
>>>> Not sure if this is the best list for this, sorry if it isn't. I'd
>>>> appreciate some pointers :-)
>>>>
>>>> When using flink-kubernetes-operator [1], docker-entrypoint.sh [2] goes
>>>> through several failures to write into $FLINK_HOME/conf/. We believe this
>>>> is due to this volume being mounted from a ConfigMap, which means it's
>>>> read-only.
>>>>
>>>> This has been reported in the past in GCP's operator, but I was unable
>>>> to find any kind of resolution for it:
>>>> https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/issues/213
>>>>
>>>> In our use case, we want to set an API key as part of the
>>>> flink-conf.yaml file, but we don't want it to be persisted in Kubernetes or
>>>> in our version control, since it's sensitive data. This API Key is used by
>>>> Flink to report metrics to Datadog [3].
>>>>
>>>> We have automation in place which allows us to accomplish this by
>>>> setting environment variables pointing to a path in our secret manager,
>>>> which only gets injected during runtime. That part is working fine.
>>>>
>>>> However, we're trying to inject this secret using the FLINK_PROPERTIES
>>>> variable, which the docker-entrypoint script appends [4] to the
>>>> flink-conf.yaml file. That append fails because the filesystem where the
>>>> file lives is read-only.
>>>>
>>>> We attempted working around this in 2 different ways:
>>>>
>>>>   - providing our own .spec.containers[0].command, where we copied over
>>>> /opt/flink to /tmp/flink and set FLINK_HOME=/tmp/flink. This did not work
>>>> because the operator overwrote it and replaced it with its original
>>>> command/args;
>>>>   - providing an initContainer sharing the volumes so it could make the
>>>> copy without being overridden by the operator's command/args. This did not
>>>> work because the initContainer present in the spec never makes it to the
>>>> resulting Deployment; it seems the operator ignores it.
>>>>
>>>> We have some questions:
>>>>
>>>> 1. Is this overriding of the pod template present in FlinkDeployment
>>>> intentional? That is, should our custom command/args and initContainers
>>>> have been overwritten? If so, I find it a bit confusing that these fields
>>>> are present and available for use at all.
>>>> 2. Since the ConfigMap volume will always be mounted as read-only, it
>>>> seems to me there are some adjustments to be made in order for this script to
>>>> work correctly. Do you think it would make sense for the script to copy
>>>> over contents from the ConfigMap volume to a writable directory during
>>>> initialization, and then use this copy for any subsequent operation?
>>>> Perhaps copying over to $FLINK_HOME, which the user could set themselves,
>>>> maybe even with a sane default which wouldn't fail on writes (eg
>>>> /tmp/flink).
>>>>
>>>> Thanks in advance for your attention and hard work on the project!
>>>>
>>>> [1]: https://github.com/apache/flink-kubernetes-operator
>>>> [2]:
>>>> https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh
>>>> [3]: https://docs.datadoghq.com/integrations/flink/
>>>> [4]:
>>>> https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh#L86-L88
>>>>
>>>
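
For reference, the copy-to-a-writable-directory idea from Lucas's question 2 could look roughly like the Python sketch below. This is only an illustration under stated assumptions and is not what docker-entrypoint.sh does: the function name prepare_writable_conf and the /tmp/flink/conf target are hypothetical, and it assumes the process is then pointed at the copy (for example through the FLINK_CONF_DIR environment variable).

```python
import shutil
from pathlib import Path


def prepare_writable_conf(src_dir, dst_dir, extra_properties=""):
    """Copy a read-only conf directory to a writable one and append
    extra properties to the copied flink-conf.yaml.

    Returns the path of the writable copy.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    # copyfile copies contents only (no permission bits), so the copies
    # are writable by the current user. Entries starting with ".." are
    # skipped because ConfigMap mounts keep internal bookkeeping there.
    shutil.copytree(
        src,
        dst,
        copy_function=shutil.copyfile,
        ignore=shutil.ignore_patterns("..*"),
        dirs_exist_ok=True,  # requires Python 3.8+
    )
    if extra_properties:
        conf_file = dst / "flink-conf.yaml"
        existing = conf_file.read_text(encoding="utf-8") if conf_file.exists() else ""
        if existing and not existing.endswith("\n"):
            existing += "\n"
        conf_file.write_text(
            existing + extra_properties.rstrip("\n") + "\n", encoding="utf-8"
        )
    return dst


if __name__ == "__main__":
    import os

    # Hypothetical paths; adjust to the actual image layout.
    target = prepare_writable_conf(
        "/opt/flink/conf", "/tmp/flink/conf", os.environ.get("FLINK_PROPERTIES", "")
    )
    print(f"export FLINK_CONF_DIR={target}")
```

The original ConfigMap mount is never written to; only the copy is modified. This does not address the separate problem that the operator replaces custom command/args, so something would still need to run this step before Flink starts.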
