Hi all,

Thanks a lot for this FLIP and all the fruitful discussion. I am not sure
whether the following questions are in the scope of this FLIP, but I would
still appreciate your thoughts:

   1. Which Docker base image do we plan to use for Java? As far as I can
   see, openjdk:8-jre-alpine[1] is no longer officially supported by the
   OpenJDK project; openjdk:8-jre is considerably larger than
   openjdk:8-jre-slim, so we use the latter in our internal branch and it
   has worked fine so far.
   2. Could we execute the container CMD under *tini*[2] instead of a
   shell, for better process hygiene? As far as I can see, the JM and TM
   containers run their CMD in shell form, so the Flink process cannot
   receive the *TERM* signal when the pod is deleted[3]. Some of the
   resulting problems are:
      - The JM and the TMs get no chance to clean up; I created
      FLINK-15843[4] to track this problem.
      - The pod can take a long time (up to 40 seconds) to be deleted
      after the K8s API server receives the deletion request.

At the moment, we use *tini* in our internal branch for the native K8s
setup and it solves the problems mentioned above.
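
For illustration, a minimal sketch of how this could be wired into the
image (the base tag, package name, and entrypoint path are assumptions on
my side, not a settled proposal):

    FROM openjdk:8-jre-slim
    # Install tini so that PID 1 forwards TERM to the Flink process;
    # a static tini binary could be copied in instead.
    RUN apt-get update && apt-get install -y tini && \
        rm -rf /var/lib/apt/lists/*
    COPY docker-entrypoint.sh /
    # Exec-form ENTRYPOINT under tini: the JVM receives SIGTERM on pod
    # deletion instead of it being swallowed by a shell running as PID 1.
    ENTRYPOINT ["/usr/bin/tini", "--", "/docker-entrypoint.sh"]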

[1]
https://github.com/docker-library/docs/blob/master/openjdk/README.md#supported-tags-and-respective-dockerfile-links
https://github.com/docker-library/openjdk/commit/3eb0351b208d739fac35345c85e3c6237c2114ec#diff-f95ffa3d1377774732c33f7b8368e099
[2] https://github.com/krallin/tini
[3] https://docs.docker.com/engine/reference/commandline/kill/
[4] https://issues.apache.org/jira/browse/FLINK-15843

Regards,
Canbin Zheng

On Mon, Apr 6, 2020 at 5:34 PM Till Rohrmann <trohrm...@apache.org> wrote:

> Thanks for the feedback Niels. This is very helpful.
>
> 1. I agree `flink:latest` is nice for getting started, but in the long run
> people will want to pin their dependencies to a specific Flink version. I
> think the fix will happen as part of FLINK-15794.
>
> 2. SNAPSHOT docker images will be really helpful for developers as well as
> users who want to use the latest features. I believe that this will be a
> follow-up of this FLIP.
>
> 3. The goal of FLIP-111 is to create an image which allows starting a
> session cluster as well as a job cluster. Hence, I believe that we will
> solve this problem soon.
>
> 4. Same as 3. The new image will also contain the native K8s integration so
> that there is no need to create a special image modulo the artifacts you
> want to add.
>
> Additional notes:
>
> 1. I agree that one log makes it harder to separate different execution
> attempts or different tasks. On the other hand, it gives you an overall
> picture of what's happening in a Flink process. If things were split
> apart, it might become very hard to detect problems in the runtime which
> cause the user code to fail, or vice versa. In general, cross-correlation
> will be harder. I guess a solution could be to make this configurable. In
> any case, we should move the discussion about this topic into a separate
> thread.
>
> Cheers,
> Till
>
> On Mon, Apr 6, 2020 at 10:40 AM Niels Basjes <ni...@basjes.nl> wrote:
>
> > Hi all,
> >
> > Sorry for jumping in at this late point of the discussion.
> > I see a lot of things I really like and I would like to put my "needs"
> > and observations here too so you can take them into account (where
> > possible). I suspect there will be overlap with things you have already
> > taken into account.
> >
> >    1. No more 'flink:latest' docker image tag.
> >    Related to https://issues.apache.org/jira/browse/FLINK-15794
> >    What I have learned is that the 'latest' version of a docker image
> >    only makes sense IFF it is an almost standalone thing.
> >    So if I have a servlet that does something in isolation (like my hobby
> >    project https://hub.docker.com/r/nielsbasjes/yauaa ) then 'latest'
> >    makes sense.
> >    With Flink you have the application code and all nodes in the cluster
> >    depending on each other, and as such they must run the exact same
> >    version of the base software.
> >    So if you run Flink in a cluster (local/yarn/k8s/mesos/swarm/...)
> >    where the application and the nodes intercommunicate and closely
> >    depend on each other, then 'latest' is a bad idea.
> >       1. Assume I have an application built against the Flink N API and
> >       the cluster downloads the latest, which is also Flink N.
> >       Then a week later Flink N+1 is released and the API I use is
> >       deprecated, and a while later Flink N+2 is released and the
> >       deprecated API is removed: my application no longer works even
> >       though I have not changed anything.
> >       So I want my application to be 'pinned' to the exact version I
> >       built it with.
> >       2. I have a running cluster with my application, both running
> >       Flink N.
> >       I add some additional nodes and the new nodes pick up the Flink
> >       N+1 image ... now I have a cluster with mixed versions.
> >       3. The version of Flink is really the "Flink+Scala" version pair.
> >       If you have the right Flink but the wrong Scala you get really
> >       nasty errors: https://issues.apache.org/jira/browse/FLINK-16289
> >
> >    2. Deploy SNAPSHOT docker images (i.e. something like
> >    *flink:1.11-SNAPSHOT_2.12*).
> >    More and more use cases will be running on code delivered via Docker
> >    images instead of bare jar files.
> >    So if a "SNAPSHOT" is released and deployed into a 'staging' maven
> >    repo (which may be local to the developer's workstation), then in my
> >    opinion a "SNAPSHOT" docker image should be created/deployed at the
> >    same moment.
> >    Each time a "SNAPSHOT" docker image is released, it will overwrite
> >    the previous "SNAPSHOT".
> >    When the final version is released, the SNAPSHOTs of that version
> >    can/should be removed.
> >    This will make testing in clusters a lot easier.
> >    Also, building a local fix and then running it locally will work
> >    without additional modifications to the code.
> >
> >    3. Support for a 'single application cluster'.
> >    I've been playing around with the S3 plugin and found that it
> >    essentially requires all nodes to have full access to the credentials
> >    needed to connect to S3.
> >    This essentially means that a multi-tenant setup is not possible in
> >    these cases.
> >    So I think the single application cluster should be a feature
> >    available in all cases.
> >
> >    4. I would like a native-kubernetes-single-application base image.
> >    I can then create a derived image where I only add the jar of my
> >    application.
> >    My desire is that I can then create a k8s yaml file for kubectl that
> >    adds the needed configs/secrets/arguments/environment variables and
> >    starts the cluster and application.
> >    Because the native Kubernetes support scales automatically based on
> >    the application, this should 'just work'.
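> >
> >    A minimal sketch of what I mean (the base image tag and jar location
> >    are just my assumptions, not an existing convention):
> >
> >        FROM flink:1.10.0-scala_2.12
> >        # The only layer on top of the (pinned) base image is the
> >        # application jar itself.
> >        COPY target/my-application.jar /opt/flink/usrlib/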
> >
> > Additional note:
> >
> >    1. Job/Task attempt logging instead of task manager logging.
> >    *I realize this has nothing to do with the docker images.*
> >    I found something "hard to work with" while running some tests last
> >    week.
> >    The logging goes to a single log file per task manager.
> >    So if I have multiple things running in a single task manager, the
> >    logs are mixed together.
> >    Also, several attempts of the same task are mixed, which makes it
> >    very hard to find out 'what went wrong'.
> >
> >
> >
> > On Fri, Apr 3, 2020 at 4:27 PM Ufuk Celebi <u...@apache.org> wrote:
> >
> > > Thanks for the summary, Andrey. Good idea to link Patrick's document
> > > from the FLIP as a future direction so it doesn't get lost. Could you
> > > make sure to revive that discussion when FLIP-111 nears completion?
> > >
> > > This is good to go on my part. +1 to start the VOTE.
> > >
> > >
> > > @Till, @Yang: Thanks for the clarification about the output
> > > redirection. I didn't see that. The concern with the `tee` approach is
> > > that the file would grow indefinitely. I think we can solve this with
> > > regular logging by redirecting stderr to the ERROR log level, but I'm
> > > not sure. We can look at a potential solution when we get to that
> > > point. :-)
> > >
> > >
> > >
> > > On Fri, Apr 3, 2020 at 3:36 PM Andrey Zagrebin <azagre...@apache.org>
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > Patrick and Ufuk, thanks a lot for more ideas and suggestions!
> > > >
> > > > I have updated the FLIP according to the current state of discussion.
> > > > Now it also contains the implementation steps and future follow-ups.
> > > > Please, review if there are any concerns.
> > > > The order of the steps aims to keep Flink releasable at any point in
> > > > case something does not make it in on time.
> > > >
> > > > It looks like we are mostly reaching consensus on the open questions.
> > > > There is also a list of items which have been discussed in this
> > > > thread, with a short summary below.
> > > > As soon as there are no concerns, I will create a voting thread.
> > > >
> > > > I also added some thoughts on further customising the logging setup.
> > > > This may be an optional follow-up, in addition to the default logging
> > > > into files for the Web UI.
> > > >
> > > > # FLIP scope
> > > > The focus is on users of the official releases.
> > > > Create docs for how to use the official docker image.
> > > > Remove other Dockerfiles in the Flink repo.
> > > > Rely on running the official docker image in different modes (JM/TM).
> > > > Customise running the official image with env vars (this should
> > > > minimise manual manipulation of local files and the need to create a
> > > > custom image).
> > > >
> > > > # Base official image
> > > >
> > > > ## Java versions
> > > > There is a separate effort for this:
> > > > https://github.com/apache/flink-docker/pull/9
> > > >
> > > > # Run image
> > > >
> > > > ## Entry point modes
> > > > JM session, JM job, TM
> > > >
> > > > ## Entry point config
> > > > We use env vars for this, e.g. FLINK_PROPERTIES and
> > > > ENABLE_BUILT_IN_PLUGINS.
> > > >
> > > > ## Flink config options
> > > > We document the existing FLINK_PROPERTIES env var to override config
> > > > options in flink-conf.yaml.
> > > > Then, later on, we do not need to expose and handle any other special
> > > > env vars for config options (address, port, etc.).
> > > > The future plan is to make the Flink process configurable by env
> > > > vars, e.g. 'some.yaml.option: val' -> FLINK_SOME_YAML_OPTION=val.
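> > > >
> > > > For illustration, overriding config options via FLINK_PROPERTIES when
> > > > starting a TaskManager container could look roughly like this (the
> > > > image tag, option values and 'taskmanager' mode argument are
> > > > illustrative, not final):
> > > >
> > > >     # newline-separated flink-conf.yaml overrides
> > > >     FLINK_PROPERTIES='jobmanager.rpc.address: jm
> > > >     taskmanager.numberOfTaskSlots: 4'
> > > >     docker run --env FLINK_PROPERTIES="$FLINK_PROPERTIES" \
> > > >       flink:1.10.0-scala_2.12 taskmanager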
> > > >
> > > > ## Extra files: jars, custom logging properties
> > > > We can provide env vars to point to custom locations, e.g. in mounted
> > > > volumes.
> > > >
> > > > # Extend image
> > > >
> > > > ## Python/hadoop versions, activating certain libs/plugins
> > > > Users can install extra dependencies and change configs in their
> > > > custom image which extends our base image.
> > > >
> > > > # Logging
> > > >
> > > > ## Web UI
> > > > Modify the *log4j-console.properties* to also output logs into files
> > > > for the Web UI. Limit the log file size.
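> > > >
> > > > A rough sketch of the appender side (log4j 1.x syntax as used by the
> > > > current flink-dist; the appender name and size limits are
> > > > placeholders):
> > > >
> > > >     # keep console output for `docker logs` and add a bounded file
> > > >     log4j.rootLogger=INFO, console, rolling
> > > >     log4j.appender.rolling=org.apache.log4j.RollingFileAppender
> > > >     log4j.appender.rolling.File=${log.file}
> > > >     log4j.appender.rolling.MaxFileSize=100MB
> > > >     log4j.appender.rolling.MaxBackupIndex=1
> > > >     log4j.appender.rolling.layout=org.apache.log4j.PatternLayout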
> > > >
> > > > ## Container output
> > > > Separate effort for a proper split of the Flink process stdout and
> > > > stderr into files and the container output
> > > > (idea with the tee command: `program start-foreground 2>&1 | tee
> > > > flink-user-taskexecutor.out`).
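> > > >
> > > > A hedged sketch of how both streams could stay on the container
> > > > output while also being captured to files, using bash process
> > > > substitution (file names follow the idea above; note Ufuk's earlier
> > > > point that such files would still grow unbounded):
> > > >
> > > >     "$FLINK_HOME/bin/taskmanager.sh" start-foreground \
> > > >       > >(tee "$FLINK_HOME/log/flink-user-taskexecutor.out") \
> > > >       2> >(tee "$FLINK_HOME/log/flink-user-taskexecutor.err" >&2)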
> > > >
> > > > # Docker bash utils
> > > > We are not going to expose these utilities to users as an API.
> > > > Users should either be able to configure and run the standard entry
> > > > point, or the documentation should give short examples of how to
> > > > extend and customise the base image.
> > > >
> > > > During the implementation, we will see if it makes sense to factor
> > > > out certain bash procedures to reuse them, e.g. in custom dev
> > > > versions of the docker image.
> > > >
> > > > # Dockerfile / image for developers
> > > > We keep it on our future roadmap. This effort should help us
> > > > understand what we can reuse there.
> > > >
> > > > Best,
> > > > Andrey
> > > >
> > > >
> > > > On Fri, Apr 3, 2020 at 12:57 PM Till Rohrmann <trohrm...@apache.org>
> > > > wrote:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> just a small inline comment.
> > > >>
> > > >> On Fri, Apr 3, 2020 at 11:42 AM Ufuk Celebi <u...@apache.org> wrote:
> > > >>
> > > >> > Hey Yang,
> > > >> >
> > > >> > thanks! See inline answers.
> > > >> >
> > > >> > On Fri, Apr 3, 2020 at 5:11 AM Yang Wang <danrtsey...@gmail.com>
> > > >> > wrote:
> > > >> >
> > > >> > > Hi Ufuk,
> > > >> > >
> > > >> > > Thanks for making the conclusion and directly pointing out what
> > > >> > > needs to be done in FLIP-111. I agree with you that we should
> > > >> > > narrow down the scope and focus on the most important and basic
> > > >> > > part of the docker image unification.
> > > >> > >
> > > >> > >> (1) Extend the entrypoint script in apache/flink-docker to
> > > >> > >> start the job cluster entry point
> > > >> > >
> > > >> > > I want to add a small requirement for the entry point script.
> > > >> > > Currently, for the native K8s integration, we are using the
> > > >> > > apache/flink-docker image, but with a different entry point
> > > >> > > ("kubernetes-entry.sh"), which generates the Java command in
> > > >> > > KubernetesUtils and runs it. I really hope this could be merged
> > > >> > > into the apache/flink-docker "docker-entrypoint.sh".
> > > >> > >
> > > >> >
> > > >> > The script [1] only adds the FLINK_CLASSPATH env var, which seems
> > > >> > generally reasonable to me. But since principled classpath and
> > > >> > entrypoint configuration is somewhat related to the follow-up
> > > >> > improvement proposals, I could also see this being done after
> > > >> > FLIP-111.
> > > >> >
> > > >> >
> > > >> > >> (2) Extend the example log4j-console configuration
> > > >> > >> => support log retrieval from the Flink UI out of the box
> > > >> > >
> > > >> > > If you mean updating "flink-dist/conf/log4j-console.properties"
> > > >> > > to support both console and local log files, I will say "+1".
> > > >> > > But we need to find a proper way to make the stdout/stderr
> > > >> > > output available for both the console and the log files. Maybe
> > > >> > > Till's proposal could help to solve this:
> > > >> > > "`program 2>&1 | tee flink-user-taskexecutor.out`"
> > > >> > >
> > > >> >
> > > >> > I think we can simply add a rolling file appender with a limit on
> > > >> > the log size.
> > > >> >
> > > >> I think this won't solve Yang's concern. What he wants to achieve is
> > > >> that STDOUT and STDERR go to STDOUT and STDERR as well as into some
> > > >> *.out and *.err files which are accessible from the web UI. I don't
> > > >> think that a log appender will help with this problem.
> > > >>
> > > >> Cheers,
> > > >> Till
> > > >>
> > > >>
> > > >> > – Ufuk
> > > >> >
> > > >> > [1]
> > > >> > https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/kubernetes-bin/kubernetes-entry.sh
> > > >> >
> > > >>
> > > >
> > >
> >
> >
> > --
> > Best regards / Met vriendelijke groeten,
> >
> > Niels Basjes
> >
>
