I second Thomas that we can support both Java 8 and 11.

Best,
Yangze Guo

On Wed, Mar 18, 2020 at 12:12 PM Thomas Weise <t...@apache.org> wrote:
>
> -->
>
> On Mon, Mar 16, 2020 at 1:58 AM Andrey Zagrebin <azagre...@apache.org> wrote:
>>
>> Thanks for the further feedback Thomas and Yangze.
>>
>> > A generic, dynamic configuration mechanism based on environment variables
>> > is essential and it is already supported via envsubst and an environment
>> > variable that can supply a configuration fragment
>>
>> True, we already have this. As I understand it, this was introduced for
>> flexibility: template a custom flink-conf.yaml with env vars, put it into
>> FLINK_PROPERTIES and merge it with the default one.
>> Could we achieve the same with dynamic properties (-Drpc.port=1234),
>> passed as image args at run time, instead of FLINK_PROPERTIES?
>> They could also be parametrised with env vars. This would require
>> jobmanager.sh to properly propagate them to
>> the StandaloneSessionClusterEntrypoint though:
>> https://github.com/docker-flink/docker-flink/pull/82#issuecomment-525285552
>> cc @Till
>> This would provide a unified configuration approach.
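>> For illustration only (the option and value are just examples, and this
>> assumes jobmanager.sh and the entry point forward the -D args down to the
>> entrypoint class), the two styles would look like:
>>
>>     # today: configuration fragment via the FLINK_PROPERTIES env var
>>     docker run -e FLINK_PROPERTIES="rpc.port: 1234" flink jobmanager
>>
>>     # proposed: the same option as a dynamic property arg to the entry point
>>     docker run flink jobmanager -Drpc.port=1234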
>>
>
> How would that look for the various use cases? The k8s operator would 
> need to generate the -Dabc .. -Dxyz entry point command instead of setting 
> the FLINK_PROPERTIES environment variable? Potentially that introduces 
> additional complexity for little gain. Do most deployment platforms that 
> support Docker containers handle the command line route well? Backward 
> compatibility may also be a concern.
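> For concreteness, in the operator-generated container spec that would roughly
> mean replacing something like (hypothetical snippet)
>
>     env:
>     - name: FLINK_PROPERTIES
>       value: "rpc.port: 1234"
>
> with something like
>
>     args: ["jobmanager", "-Drpc.port=1234"]
>
> in the pod template.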
>
>>
>> > On the flip side, attempting to support a fixed subset of configuration
>> > options is brittle and will probably lead to compatibility issues down the
>> > road
>>
>> I agree. The idea was to have just some shortcut scripted functions
>> to set options in flink-conf.yaml for a custom Dockerfile or entry point
>> script.
>> TASK_MANAGER_NUMBER_OF_TASK_SLOTS could be set as a dynamic property of
>> the started JM.
>> I am not sure how many users depend on it. Maybe we could remove it.
>> It also looks like we already have a somewhat unclean state in
>> docker-entrypoint.sh, where some ports are set to hardcoded values
>> and then FLINK_PROPERTIES is applied, potentially duplicating options in
>> the resulting flink-conf.yaml.
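>> For example (whether the args are actually forwarded is exactly what the
>> linked PR discussion is about), the current scripted shortcut
>>
>>     docker run -e TASK_MANAGER_NUMBER_OF_TASK_SLOTS=4 flink taskmanager
>>
>> could then become a plain dynamic property:
>>
>>     docker run flink taskmanager -Dtaskmanager.numberOfTaskSlots=4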
>
>
> That is indeed possible and duplicate entries from FLINK_PROPERTIES prevail. 
> Unfortunately, the special cases you mention were already established and the 
> generic mechanism was added later for the k8s operators.
>
>>
>>
>> I can see some potential usage of env vars as standard entry point inputs,
>> but only for purposes which cannot be achieved by passing entry point args
>> (as changing flink-conf.yaml options can). Nothing comes to mind
>> at the moment. It could be some setting specific to the running mode
>> of the entry point. The mode itself can stay the first arg of the entry
>> point.
>>
>> > I would second that it is desirable to support Java 11
>>
>> > Regarding supporting JAVA 11:
>> > - Not sure if it is necessary to ship JAVA. Maybe we could just change
>> > the base image from openjdk:8-jre to openjdk:11-jre in template docker
>> > file[1]. Correct me if I understand incorrectly. Also, I agree to move
>> > this out of the scope of this FLIP if it indeed takes much extra
>> > effort.
>>
>> This is what I meant by bumping up the Java version in the docker hub Flink
>> image:
>> FROM openjdk:8-jre -> FROM openjdk:11-jre
>> This can be polled separately in the user mailing list.
>
>
> That sounds reasonable as long as we can still support both Java versions 
> (i.e. provide separate images for 8 and 11).
>
>>
>>
>> > and in general use a base image that allows the (straightforward) use of
>> > more recent versions of other software (Python etc.)
>>
>> We can also poll whether to always include some version of Python in
>> the docker hub image.
>> A potential problem here is that once it is there, it is a hassle to
>> remove/change it in a custom extended Dockerfile.
>>
>> It would also be nice to avoid maintaining images for various combinations
>> of installed Java/Scala/Python on docker hub.
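>> Users who do need Python could still add it on top of the base image with a
>> couple of lines, e.g. (illustrative only, assuming the Debian-based image):
>>
>>     FROM flink
>>     # add Python on top of the base image instead of baking it in
>>     RUN apt-get update && \
>>         apt-get install -y python3 && \
>>         rm -rf /var/lib/apt/lists/*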
>>
>> > Regarding building from local dist:
>> > - Yes, I bring this up mostly for development purposes. Since k8s is
>> > popular, I believe more and more developers would like to test their
>> > work on a k8s cluster. I'm not sure whether all developers should write a
>> > custom docker file themselves in this scenario. Thus, I still prefer to
>> > provide a script for devs.
>> > - I agree to keep the scope of this FLIP mostly for those normal
>> > users. But as far as I can see, supporting building from local dist
>> > would not take much extra effort.
>> > - The maven docker plugin sounds good. I'll take a look at it.
>>
>> I would see any scripts introduced in this FLIP also as potential building
>> blocks for a custom dev Dockerfile.
>> Maybe this will be all we need for dev images, or we write a dev
>> Dockerfile, highly parametrised, for building a dev image.
>> If the scripts stay in apache/flink-docker, it is somewhat inconvenient,
>> but possible, to use them in the main Flink repo.
>> If we move them to apache/flink, then we will have to e.g. include them in
>> the release to make them easily available in apache/flink-docker and
>> maintain them in the main repo, although they are only docker specific.
>> All in all, I would say we can revisit this topic once we implement them.
>>
>> Best,
>> Andrey
>>
>> On Wed, Mar 11, 2020 at 8:58 AM Yangze Guo <karma...@gmail.com> wrote:
>>
>> > Thanks for the reply, Andrey.
>> >
>> > Regarding building from local dist:
>> > - Yes, I bring this up mostly for development purposes. Since k8s is
>> > popular, I believe more and more developers would like to test their
>> > work on a k8s cluster. I'm not sure whether all developers should write a
>> > custom docker file themselves in this scenario. Thus, I still prefer to
>> > provide a script for devs.
>> > - I agree to keep the scope of this FLIP mostly for those normal
>> > users. But as far as I can see, supporting building from local dist
>> > would not take much extra effort.
>> > - The maven docker plugin sounds good. I'll take a look at it.
>> >
>> > Regarding supporting JAVA 11:
>> > - Not sure if it is necessary to ship JAVA. Maybe we could just change
>> > the base image from openjdk:8-jre to openjdk:11-jre in template docker
>> > file[1]. Correct me if I understand incorrectly. Also, I agree to move
>> > this out of the scope of this FLIP if it indeed takes much extra
>> > effort.
>> >
>> > Regarding the custom configuration, the mechanism that Thomas mentioned
>> > LGTM.
>> >
>> > [1]
>> > https://github.com/apache/flink-docker/blob/master/Dockerfile-debian.template
>> >
>> > Best,
>> > Yangze Guo
>> >
>> > On Wed, Mar 11, 2020 at 5:52 AM Thomas Weise <t...@apache.org> wrote:
>> > >
>> > > Thanks for working on improvements to the Flink Docker container images.
>> > This will be important as more and more users are looking to adopt
>> > Kubernetes and other deployment tooling that relies on Docker images.
>> > >
>> > > A generic, dynamic configuration mechanism based on environment
>> > variables is essential and it is already supported via envsubst and an
>> > environment variable that can supply a configuration fragment:
>> > >
>> > >
>> > https://github.com/apache/flink-docker/blob/09adf2dcd99abfb6180e1e2b5b917b288e0c01f6/docker-entrypoint.sh#L88
>> > >
>> > https://github.com/apache/flink-docker/blob/09adf2dcd99abfb6180e1e2b5b917b288e0c01f6/docker-entrypoint.sh#L85
>> > >
>> > > This gives the necessary control for infrastructure use cases that aim
>> > > to supply deployment tooling to other users. An example in this category
>> > > is the FlinkK8sOperator:
>> > >
>> > > https://github.com/lyft/flinkk8soperator/tree/master/examples/wordcount
>> > >
>> > > On the flip side, attempting to support a fixed subset of configuration
>> > options is brittle and will probably lead to compatibility issues down the
>> > road:
>> > >
>> > >
>> > https://github.com/apache/flink-docker/blob/09adf2dcd99abfb6180e1e2b5b917b288e0c01f6/docker-entrypoint.sh#L97
>> > >
>> > > Besides the configuration, it may be worthwhile to see in which other
>> > ways the base Docker images can provide more flexibility to incentivize
>> > wider adoption.
>> > >
>> > > I would second that it is desirable to support Java 11 and in general
>> > use a base image that allows the (straightforward) use of more recent
>> > versions of other software (Python etc.)
>> > >
>> > >
>> > https://github.com/apache/flink-docker/blob/d3416e720377e9b4c07a2d0f4591965264ac74c5/Dockerfile-debian.template#L19
>> > >
>> > > Thanks,
>> > > Thomas
>> > >
>> > > On Tue, Mar 10, 2020 at 12:26 PM Andrey Zagrebin <azagre...@apache.org>
>> > wrote:
>> > >>
>> > >> Hi All,
>> > >>
>> > >> Thanks a lot for the feedback!
>> > >>
>> > >> *@Yangze Guo*
>> > >>
>> > >> > - Regarding the flink_docker_utils#install_flink function, I think it
>> > >> > should also support build from local dist and build from a
>> > >> > user-defined archive.
>> > >>
>> > >> I suppose you bring this up mostly for development purposes or power
>> > >> users.
>> > >> Most normal users are usually interested in mainstream released versions
>> > >> of Flink.
>> > >> Although you bring up a valid concern, my idea was to keep the scope of
>> > >> this FLIP mostly for those normal users.
>> > >> Power users are usually capable of designing a completely
>> > >> custom Dockerfile themselves.
>> > >> At the moment, we already have custom Dockerfiles, e.g. for tests in
>> > >> flink-end-to-end-tests/test-scripts/docker-hadoop-secure-cluster/Dockerfile.
>> > >> We can add something similar for development purposes and maybe
>> > >> introduce a special maven goal. There is a maven docker plugin, afaik.
>> > >> I will add this to the FLIP as a next step.
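>> > >> As a rough illustration of the dev use case (the path and version below
>> > >> are placeholders, not a final layout), such a Dockerfile could simply copy
>> > >> a locally built dist:
>> > >>
>> > >>     FROM openjdk:8-jre
>> > >>     # path/version are placeholders for a locally built distribution
>> > >>     COPY flink-dist/target/flink-1.11-SNAPSHOT-bin/flink-1.11-SNAPSHOT /opt/flink
>> > >>     ENV PATH=/opt/flink/bin:$PATH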
>> > >>
>> > >> > - It seems that the install_shaded_hadoop could be an option of
>> > >> > install_flink
>> > >>
>> > >> I would rather think about this as a separate, independent, optional step.
>> > >>
>> > >> > - Should we support JAVA 11? Currently, most of the docker files are based on
>> > >> > JAVA 8.
>> > >>
>> > >> Indeed, it is a valid concern. The Java version is a fundamental property
>> > >> of the docker image.
>> > >> Customising this in the current mainstream image is difficult; it would
>> > >> require shipping it w/o Java at all.
>> > >> It is a separate discussion whether we want to distribute docker hub
>> > >> images with different Java versions or just bump it to Java 11.
>> > >> This should be easy in a custom Dockerfile for development purposes,
>> > >> though, as mentioned before.
>> > >>
>> > >> > - I do not understand how to set config options through
>> > >> > "flink_docker_utils configure"? Does this step happen during the image
>> > >> > build or the container start? If it happens during the image build,
>> > >> > there would be a new image every time we change the config. If it is just
>> > >> > a part of the container entrypoint, I think there is no need to add a
>> > >> > configure command, we could just add all dynamic config options to the
>> > >> > args list of "start_jobmaster"/"start_session_jobmanager". Am I
>> > >> > understanding this correctly?
>> > >>
>> > >> `flink_docker_utils configure ...` can be called anywhere:
>> > >> - while building a custom image (`RUN flink_docker_utils configure ..`) by
>> > >> extending our base image from docker hub (`FROM flink`)
>> > >> - in a custom entry point as well
>> > >> I will check this, but if the user can also pass a dynamic config option,
>> > >> that also sounds like a good option.
>> > >> Our standard entry point script in the base image could just properly
>> > >> forward the arguments to the Flink process.
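>> > >> Just to illustrate the build-time case (the exact `flink_docker_utils`
>> > >> interface is still to be defined in the FLIP, so the call below is
>> > >> hypothetical):
>> > >>
>> > >>     FROM flink
>> > >>     # bake a config option into flink-conf.yaml while building the image
>> > >>     RUN flink_docker_utils configure taskmanager.numberOfTaskSlots 4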
>> > >>
>> > >> @Yang Wang
>> > >>
>> > >> > About docker utils
>> > >> > I really like the idea to provide some utils for the docker file and
>> > >> > entry point. The
>> > >> > `flink_docker_utils` will make it easier to build the image. I am not
>> > >> > sure about the
>> > >> > `flink_docker_utils start_jobmaster`. Do you mean when we build a
>> > docker
>> > >> > image, we
>> > >> > need to add `RUN flink_docker_utils start_jobmaster` in the docker
>> > file?
>> > >> > Why do we need this?
>> > >>
>> > >> This is a scripted action to start the JM. It can be called anywhere.
>> > >> Indeed, it does not make too much sense to run it in a Dockerfile.
>> > >> Mostly, the idea was to use it in a custom entry point. When our base
>> > >> docker hub image is started, its entry point can also be completely
>> > >> overridden.
>> > >> The actions are also sorted in the FLIP: for the Dockerfile or for the
>> > >> entry point.
>> > >> E.g. our standard entry point script in the base docker hub image can
>> > >> already use it.
>> > >> Anyway, it was just an example; the details are to be defined in Jira, imo.
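>> > >> As an illustration only (names and args are placeholders, details to be
>> > >> defined in Jira), a custom entry point could then be as simple as:
>> > >>
>> > >>     #!/usr/bin/env bash
>> > >>     # adjust flink-conf.yaml, then start the job master in the foreground
>> > >>     flink_docker_utils configure rest.port 8081
>> > >>     flink_docker_utils start_jobmaster "$@"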
>> > >>
>> > >> > About docker entry point
>> > >> > I agree with you that the docker entry point could be more powerful with
>> > >> > more functionality.
>> > >> > Mostly, it is about overriding the config options. If we support dynamic
>> > >> > properties, I think
>> > >> > it is more convenient for users without any learning curve.
>> > >> > `docker run flink session_jobmanager -D rest.bind-port=8081`
>> > >>
>> > >> Indeed, as mentioned before, it can be a better option.
>> > >> The standard entry point also decides at least what to run: JM or TM. I
>> > >> think we will see what else makes sense to include there during the
>> > >> implementation.
>> > >> Some specifics may be more convenient to set with env vars, as Konstantin
>> > >> mentioned.
>> > >>
>> > >> > About the logging
>> > >> > Updating the `log4j-console.properties` to support multiple appenders is a
>> > >> > better option.
>> > >> > Currently, the native K8s is suggesting users to debug the logs in this
>> > >> > way[1]. However,
>> > >> > there are also some problems. The stderr and stdout of the JM/TM processes
>> > >> > could not be
>> > >> > forwarded to the docker container console.
>> > >>
>> > >> Strange, we should check; maybe there is a docker option to query the
>> > >> container's stderr output as well.
>> > >> If we forward the Flink process stdout as usual in the bash console, it
>> > >> should not be a problem. Why can it not be forwarded?
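>> > >> Regarding the multiple appenders mentioned above, a rough sketch (log4j 1.x
>> > >> syntax, the pattern and levels are only examples) could be to let
>> > >> log4j-console.properties define both a console and a file appender:
>> > >>
>> > >>     log4j.rootLogger=INFO, console, file
>> > >>     # console appender, so `docker logs` / `kubectl logs` keep working
>> > >>     log4j.appender.console=org.apache.log4j.ConsoleAppender
>> > >>     log4j.appender.console.layout=org.apache.log4j.PatternLayout
>> > >>     log4j.appender.console.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p %c - %m%n
>> > >>     # file appender, so the web UI can still serve the log file
>> > >>     log4j.appender.file=org.apache.log4j.FileAppender
>> > >>     log4j.appender.file.file=${log.file}
>> > >>     log4j.appender.file.layout=org.apache.log4j.PatternLayout
>> > >>     log4j.appender.file.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p %c - %m%n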
>> > >>
>> > >> @Konstantin Knauf
>> > >>
>> > >> > For the entrypoint, have you considered to also allow setting
>> > configuration
>> > >> > via environment variables as in "docker run -e
>> > FLINK_REST_BIN_PORT=8081
>> > >> > ..."? This is quite common and more flexible, e.g. it makes it very
>> > easy to
>> > >> > pass values of Kubernetes Secrets into the Flink configuration.
>> > >>
>> > >> This is indeed an interesting option to pass arguments to the entry
>> > point
>> > >> in general.
>> > >> For the config options, the dynamic args can be a better option as
>> > >> mentioned above.
>> > >>
>> > >> > With respect to logging, I would opt to keep this very basic and to only
>> > >> > support logging to the console (maybe with a fix for the web user
>> > >> > interface). For everything else, users can easily build their own
>> > images
>> > >> > based on library/flink (provide the dependencies, change the logging
>> > >> > configuration).
>> > >>
>> > >> agree
>> > >>
>> > >> Thanks,
>> > >> Andrey
>> > >>
>> > >> On Sun, Mar 8, 2020 at 8:55 PM Konstantin Knauf <
>> > konstan...@ververica.com>
>> > >> wrote:
>> > >>
>> > >> > Hi Andrey,
>> > >> >
>> > >> > thanks a lot for this proposal. The variety of Docker files in the
>> > project
>> > >> > has been causing quite some confusion.
>> > >> >
>> > >> > For the entrypoint, have you considered to also allow setting
>> > >> > configuration via environment variables as in "docker run -e
>> > >> > FLINK_REST_BIN_PORT=8081 ..."? This is quite common and more
>> > flexible, e.g.
>> > >> > it makes it very easy to pass values of Kubernetes Secrets into the
>> > Flink
>> > >> > configuration.
>> > >> >
>> > >> > With respect to logging, I would opt to keep this very basic and to
>> > only
>> > >> > support logging to the console (maybe with a fix for the web user
>> > >> > interface). For everything else, users can easily build their own
>> > images
>> > >> > based on library/flink (provide the dependencies, change the logging
>> > >> > configuration).
>> > >> >
>> > >> > Cheers,
>> > >> >
>> > >> > Konstantin
>> > >> >
>> > >> >
>> > >> > On Thu, Mar 5, 2020 at 11:01 AM Yang Wang <danrtsey...@gmail.com>
>> > wrote:
>> > >> >
>> > >> >> Hi Andrey,
>> > >> >>
>> > >> >>
>> > >> >> Thanks for driving this significant FLIP. From the user ML, we can also
>> > >> >> see that there are
>> > >> >> many users running Flink in container environments. The docker image
>> > >> >> is then the
>> > >> >> very basic requirement. Just as you say, we should provide a unified
>> > >> >> place for all the various
>> > >> >> usages (e.g. session, job, native k8s, swarm, etc.).
>> > >> >>
>> > >> >>
>> > >> >> > About docker utils
>> > >> >>
>> > >> >> I really like the idea to provide some utils for the docker file and
>> > >> >> entry point. The
>> > >> >> `flink_docker_utils` will make it easier to build the image. I am not
>> > >> >> sure about the
>> > >> >> `flink_docker_utils start_jobmaster`. Do you mean when we build a
>> > docker
>> > >> >> image, we
>> > >> >> need to add `RUN flink_docker_utils start_jobmaster` in the docker
>> > file?
>> > >> >> Why do we need this?
>> > >> >>
>> > >> >>
>> > >> >> > About docker entry point
>> > >> >>
>> > >> >> I agree with you that the docker entry point could be more powerful with
>> > >> >> more functionality.
>> > >> >> Mostly, it is about overriding the config options. If we support dynamic
>> > >> >> properties, I think
>> > >> >> it is more convenient for users without any learning curve.
>> > >> >> `docker run flink session_jobmanager -D rest.bind-port=8081`
>> > >> >>
>> > >> >>
>> > >> >> > About the logging
>> > >> >>
>> > >> >> Updating the `log4j-console.properties` to support multiple appenders is
>> > >> >> a better option.
>> > >> >> Currently, the native K8s is suggesting users to debug the logs in this
>> > >> >> way[1]. However,
>> > >> >> there are also some problems. The stderr and stdout of JM/TM processes
>> > >> >> could not be
>> > >> >> forwarded to the docker container console.
>> > >> >>
>> > >> >>
>> > >> >> [1].
>> > >> >>
>> > https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html#log-files
>> > >> >>
>> > >> >>
>> > >> >> Best,
>> > >> >> Yang
>> > >> >>
>> > >> >>
>> > >> >>
>> > >> >>
>> > >> >> Andrey Zagrebin <azagre...@apache.org> 于2020年3月4日周三 下午5:34写道:
>> > >> >>
>> > >> >>> Hi All,
>> > >> >>>
>> > >> >>> If you have ever touched the docker topic in Flink, you
>> > >> >>> probably noticed that we have multiple places in docs and repos
>> > which
>> > >> >>> address its various concerns.
>> > >> >>>
>> > >> >>> We have prepared a FLIP [1] to simplify users' perception of the docker
>> > >> >>> topic in Flink. It mostly advocates for an approach of extending the
>> > >> >>> official Flink image from docker hub. For convenience, it can come with
>> > >> >>> a set of
>> > >> >>> bash utilities and documented examples of their usage. The utilities
>> > >> >>> allow users to:
>> > >> >>>
>> > >> >>>    - run the docker image in various modes (single job, session master,
>> > >> >>>    task manager, etc.)
>> > >> >>>    - customise the extending Dockerfile
>> > >> >>>    - and its entry point
>> > >> >>>
>> > >> >>> Eventually, the FLIP suggests removing all other user-facing Dockerfiles
>> > >> >>> and build scripts from the Flink repo, moving all docker docs to
>> > >> >>> apache/flink-docker, and adjusting existing docker use cases to refer to
>> > >> >>> this new approach (mostly Kubernetes now).
>> > >> >>>
>> > >> >>> The first contributed version of the Flink docker integration also
>> > >> >>> contained an example and docs for the integration with Bluemix in IBM
>> > >> >>> Cloud. We also
>> > >> >>> suggest maintaining it outside of the Flink repository (cc Markus
>> > >> >>> Müller).
>> > >> >>>
>> > >> >>> Thanks,
>> > >> >>> Andrey
>> > >> >>>
>> > >> >>> [1]
>> > >> >>>
>> > >> >>>
>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-111%3A+Docker+image+unification
>> > >> >>>
>> > >> >>
>> > >> >
>> > >> > --
>> > >> >
>> > >> > Konstantin Knauf | Head of Product
>> > >> >
>> > >> > +49 160 91394525
>> > >> >
>> > >> >
>> > >> > Follow us @VervericaData Ververica <https://www.ververica.com/>
>> > >> >
>> > >> >
>> > >> > --
>> > >> >
>> > >> > Join Flink Forward <https://flink-forward.org/> - The Apache Flink
>> > >> > Conference
>> > >> >
>> > >> > Stream Processing | Event Driven | Real Time
>> > >> >
>> > >> > --
>> > >> >
>> > >> > Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>> > >> >
>> > >> > --
>> > >> > Ververica GmbH
>> > >> > Registered at Amtsgericht Charlottenburg: HRB 158244 B
>> > >> > Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason,
>> > Ji
>> > >> > (Tony) Cheng
>> > >> >
>> >
