Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

2020-02-21 Thread felixzheng zheng
Many thanks for the quick feedback, Till. You are right: it is not a
fundamentally different approach compared to what we have right now, and
all the Kubernetes resources created are the same. We aim to evolve the
existing decorator approach so that:
1. the decorators are monadic and smaller in size and functionality.
2. the new decorator design allows reusing the decorators between the
client and the cluster as much as possible.
3. all the decorators are independent of each other and can appear in
arbitrary order in the chain; they share the same APIs and follow a unified
orchestrator architecture, so that new developers can quickly understand
what should be done to introduce a new feature.

Besides that, the new approach allows us to add tests for every decorator
alone instead of doing a final test of all the decorators in
Fabric8ClientTest.java.
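
To make the three points concrete, here is a minimal sketch of small,
independent decorators behind one shared API. It is not the code from the
design document: SimplePod, StepDecorator, LabelDecorator,
ConfigMountDecorator and DecoratorChain are hypothetical stand-ins written
in plain Java, and the real FlinkPod and fabric8 types are deliberately
left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, immutable stand-in for the FlinkPod discussed in this thread.
final class SimplePod {
    final Map<String, String> labels;
    final List<String> volumeMounts;

    SimplePod(Map<String, String> labels, List<String> volumeMounts) {
        this.labels = Collections.unmodifiableMap(new LinkedHashMap<>(labels));
        this.volumeMounts = Collections.unmodifiableList(new ArrayList<>(volumeMounts));
    }

    static SimplePod empty() {
        return new SimplePod(Collections.emptyMap(), Collections.emptyList());
    }
}

// Simplified step API (an assumption): every decorator exposes this one method.
interface StepDecorator {
    SimplePod decorate(SimplePod pod);
}

// One small decorator per concern: this one only adds a label.
final class LabelDecorator implements StepDecorator {
    private final String key;
    private final String value;

    LabelDecorator(String key, String value) {
        this.key = key;
        this.value = value;
    }

    @Override
    public SimplePod decorate(SimplePod pod) {
        Map<String, String> labels = new LinkedHashMap<>(pod.labels);
        labels.put(key, value);
        return new SimplePod(labels, pod.volumeMounts);
    }
}

// This one only records a mount path for a configuration volume.
final class ConfigMountDecorator implements StepDecorator {
    private final String mountPath;

    ConfigMountDecorator(String mountPath) {
        this.mountPath = mountPath;
    }

    @Override
    public SimplePod decorate(SimplePod pod) {
        List<String> mounts = new ArrayList<>(pod.volumeMounts);
        mounts.add(mountPath);
        return new SimplePod(pod.labels, mounts);
    }
}

// The orchestrator just folds the steps over a base pod, in list order.
final class DecoratorChain {
    static SimplePod apply(SimplePod base, List<StepDecorator> steps) {
        SimplePod pod = base;
        for (StepDecorator step : steps) {
            pod = step.decorate(pod);
        }
        return pod;
    }
}

class DecoratorChainDemo {
    public static void main(String[] args) {
        SimplePod pod = DecoratorChain.apply(
                SimplePod.empty(),
                Arrays.asList(
                        new LabelDecorator("app", "flink"),
                        new ConfigMountDecorator("/opt/flink/conf")));
        System.out.println(pod.labels + " " + pod.volumeMounts);
    }
}
```

Because each step touches only its own part of the pod, the two decorators
above give the same result in either order, and each one can be tested on
an empty pod without building a Deployment or a ConfigMap first.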

Cheers,
Canbin Zheng

Till Rohrmann wrote on Sat, Feb 22, 2020 at 12:28 AM:

> Thanks for starting this discussion Canbin. If I understand your proposal
> correctly, then you would like to evolve the existing decorator approach so
> that decorators are monadic and smaller in size and functionality. The
> latter aspect will allow reusing them between the client and the cluster.
> Just to make sure, it is not a fundamentally different approach compared to
> what we have right now, is it?
>
> If this is the case, then I think it makes sense to reuse code as much as
> possible and to create small code units which are easier to test.
>
> Cheers,
> Till
>
> On Fri, Feb 21, 2020 at 4:41 PM felixzheng zheng 
> wrote:
>
> > Thanks for the feedback @Yang Wang. I would like to discuss some of the
> > details in depth about why I am confused about the existing design.
> >
> > Question 1: How do we mount a configuration file?
> >
> > For the existing design, we need several classes to finish it:
> >
> >    1. InitializerDecorator
> >    2. OwnerReferenceDecorator
> >    3. ConfigMapDecorator
> >    4. KubernetesUtils: provides the getConfigMapVolume method shared by
> >       the FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator.
> >    5. FlinkMasterDeploymentDecorator: mounts the ConfigMap volume.
> >    6. TaskManagerPodDecorator: mounts the ConfigMap volume.
> >    7. If, in the future, someone would like to introduce an init
> >       Container, the InitContainerDecorator has to mount the ConfigMap
> >       volume too.
> >
> >
> > I am confused about the current solution to mounting a configuration
> > file:
> >
> >    1. Actually, we do not need so many Decorators for mounting a file.
> >    2. If we would like to mount a new file, we have no choice but to
> >       repeat the same tedious and scattered routine.
> >    3. There’s no easy way to test the file mounting functionality alone;
> >       we have to construct the ConfigMap, the Deployment or the
> >       TaskManagerPod first and then do a final test.
> >
> >
> > The reason why it is so complex to mount a configuration file is that we
> > don’t fully consider the internal connections among those resources in
> > the existing design.
> >
> > The new abstraction we proposed could solve this kind of problem. The
> > new Decorator object is as follows:
> >
> > public interface KubernetesStepDecorator {
> >
> >   /**
> >    * Apply transformations to the given FlinkPod in accordance with this
> >    * feature. This can include adding labels/annotations, mounting
> >    * volumes, and setting startup command or parameters, etc.
> >    */
> >   FlinkPod decorateFlinkPod(FlinkPod flinkPod);
> >
> >   /**
> >    * Build the accompanying Kubernetes resources that should be introduced
> >    * to support this feature. This could only be applicable to the
> >    * client-side submission process.
> >    */
> >   List buildAccompanyingKubernetesResources() throws IOException;
> >
> > }
> >
> > The FlinkPod is a composition of the Pod, the main Container, the init
> > Container, and the sidecar Container.
> >
> > Next, we introduce a KubernetesStepDecorator implementation, the method
> > of buildAccompanyingKubernetesResources creates the corresponding
> > ConfigMap, and the method of decorateFlinkPod conf

Re: [DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

2020-02-21 Thread felixzheng zheng
ove the implementation.
>
>
> Regarding your two main points.
>
> >> Introduce a unified monadic-step based orchestrator architecture that
> >> has a better, cleaner and consistent abstraction for the Kubernetes
> >> resources construction process, both applicable to the client side and
> >> the master side.
>
> When I introduced the decorators for K8s in Flink, there was always a
> guideline in my mind that it should be easy to extend and to add new
> features. Just as you say, we have lots of functions to support, and K8s
> is also evolving very fast. The current `ConfigMapDecorator`,
> `FlinkMasterDeploymentDecorator`, and `TaskManagerPodDecorator` are a
> basic implementation with some prerequisite parameters. Of course we could
> chain more decorators to construct the K8s resources. For example,
> InitializerDecorator -> OwnerReferenceDecorator ->
> FlinkMasterDeploymentDecorator -> InitContainerDecorator ->
> SidecarDecorator -> etc.
>
> I am a little sceptical about splitting every parameter into a single
> decorator, since it does not bring much benefit. But I agree with moving
> some common parameters into separate decorators (e.g. volume mount). Also,
> introducing the `~Builder` class and spinning off the chained decorator
> calls from `Fabric8FlinkKubeClient` makes sense to me.
>
>
>
> >> Add some dedicated tools for centrally parsing, verifying, and managing
> >> the Kubernetes parameters.
>
> Currently, we always get the parameters directly from the Flink
> configuration (e.g.
> `flinkConfig.getString(KubernetesConfigOptions.CONTAINER_IMAGE)`). I think
> it could be improved by introducing some dedicated conf parser classes. It
> is a good idea.
>
>
> Best,
> Yang
>
>
>
>
> felixzheng zheng wrote on Fri, Feb 21, 2020 at 9:31 AM:
>
> > Hi everyone,
> >
> > I would like to kick off a discussion on refactoring the existing
> > Kubernetes resources construction architecture design.
> >
> > I created a design document [1] that clarifies our motivation to do this
> > and some improvement proposals for the new design.
> >
> > Briefly, we would like to
> > 1. Introduce a unified monadic-step based orchestrator architecture that
> > has a better, cleaner and consistent abstraction for the Kubernetes
> > resources construction process, both applicable to the client side and
> > the master side.
> > 2. Add some dedicated tools for centrally parsing, verifying, and
> > managing the Kubernetes parameters.
> >
> > It would be great to start the efforts before adding big features for the
> > native Kubernetes submodule, and Tison and I plan to work together for
> > this issue.
> >
> > Please let me know your thoughts.
> >
> > Regards,
> > Canbin Zheng
> >
> > [1]
> > https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
> > [2] https://issues.apache.org/jira/browse/FLINK-16194
> >
>


[DISCUSS] FLINK-16194: Refactor the Kubernetes architecture design

2020-02-20 Thread felixzheng zheng
Hi everyone,

I would like to kick off a discussion on refactoring the existing
Kubernetes resources construction architecture design.

I created a design document [1] that clarifies our motivation to do this
and some improvement proposals for the new design.

Briefly, we would like to
1. Introduce a unified monadic-step based orchestrator architecture that
has a better, cleaner and consistent abstraction for the Kubernetes
resources construction process, both applicable to the client side and the
master side.
2. Add some dedicated tools for centrally parsing, verifying, and managing
the Kubernetes parameters.
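
As a rough illustration of point 2, the sketch below gathers parsing,
defaults and validation in one place instead of reading raw configuration
values at every call site. It is only an assumption about what such a tool
could look like: the class KubernetesParametersSketch, its method names and
the key strings are chosen for illustration and are not taken from the
design document.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical central accessor for Kubernetes-related settings.
final class KubernetesParametersSketch {
    // Illustrative key names; treat them as assumptions of this sketch.
    static final String IMAGE_KEY = "kubernetes.container.image";
    static final String NAMESPACE_KEY = "kubernetes.namespace";
    static final String TM_CPU_KEY = "kubernetes.taskmanager.cpu";

    private final Map<String, String> config;

    KubernetesParametersSketch(Map<String, String> config) {
        this.config = new HashMap<>(config);
    }

    // Required parameter: fail early with a clear message if missing or blank.
    String getImage() {
        String image = config.get(IMAGE_KEY);
        if (image == null || image.trim().isEmpty()) {
            throw new IllegalArgumentException(IMAGE_KEY + " must be set");
        }
        return image.trim();
    }

    // Optional parameter with a default value.
    String getNamespace() {
        return config.getOrDefault(NAMESPACE_KEY, "default");
    }

    // Parsed and range-checked once, here, rather than by every caller.
    double getTaskManagerCpu() {
        String raw = config.getOrDefault(TM_CPU_KEY, "1.0");
        double cpu;
        try {
            cpu = Double.parseDouble(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(TM_CPU_KEY + " must be a number, got: " + raw, e);
        }
        if (!(cpu > 0)) {
            throw new IllegalArgumentException(TM_CPU_KEY + " must be positive, got: " + raw);
        }
        return cpu;
    }
}

class KubernetesParametersDemo {
    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(KubernetesParametersSketch.IMAGE_KEY, "flink:latest");
        KubernetesParametersSketch params = new KubernetesParametersSketch(conf);
        System.out.println(params.getImage() + " in namespace " + params.getNamespace());
    }
}
```

With an accessor like this, a decorator would ask for getImage() and never
see the raw configuration, so a missing or malformed value is reported once
with the name of the offending key.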

It would be great to start the efforts before adding big features for the
native Kubernetes submodule, and Tison and I plan to work together for this
issue.

Please let me know your thoughts.

Regards,
Canbin Zheng

[1]
https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
[2] https://issues.apache.org/jira/browse/FLINK-16194


Re: [ANNOUNCE] Yu Li became a Flink committer

2020-01-30 Thread felixzheng zheng
Congrats!

Benchao Li wrote on Fri, Jan 31, 2020 at 10:02 AM:

> Congratulations!!
>
> Biao Liu wrote on Wed, Jan 29, 2020 at 9:25 PM:
>
> > Congrats!
> >
> > On Wed, Jan 29, 2020 at 10:37 aihua li  wrote:
> >
> > > Congratulations Yu LI, well deserved.
> > >
> > > > On Jan 23, 2020, at 4:59 PM, Stephan Ewen wrote:
> > > >
> > > > Hi all!
> > > >
> > > > We are announcing that Yu Li has joined the rank of Flink committers.
> > > >
> > > > Yu joined already in late December, but the announcement got lost
> > > > because of the Christmas and New Years season, so here is a belated
> > > > proper announcement.
> > > >
> > > > Yu is one of the main contributors to the state backend components in
> > > > the recent year, working on various improvements, for example the
> > > > RocksDB memory management for 1.10.
> > > > He has also been one of the release managers for the big 1.10
> > > > release.
> > > >
> > > > Congrats for joining us, Yu!
> > > >
> > > > Best,
> > > > Stephan
> > >
> > > --
> >
> > Thanks,
> > Biao /'bɪ.aʊ/
> >
>
>
> --
>
> Benchao Li
> School of Electronics Engineering and Computer Science, Peking University
> Tel:+86-15650713730
> Email: libenc...@gmail.com; libenc...@pku.edu.cn
>


Re: [DISCUSS] Active Kubernetes integration phase2

2020-01-19 Thread felixzheng zheng
Hi Yang Wang,

Thanks for your effort on this topic and for inviting me. I am glad to join
the future work on native Kubernetes integration and will try to join the
Slack channel later.

Yang Wang wrote on Sun, Jan 19, 2020 at 6:04 PM:

> Hi Canbin Zheng,
>
> I have found that you created some tickets about Flink on Kubernetes. We
> have the same requirements. Maybe we could have more discussion about the
> use cases and implementation details. I have created a Slack channel [1];
> please join if you want.
>
> Dev users, if you have ideas about new features, please join us. Let's
> move Flink on Kubernetes forward together.
>
>
> [1].
>
> https://slack.com/share/ISUKQ2WG5/cXwlw5HZgp3AE5ElE8m580Pk/enQtOTEyNjcwMDk4NTQ5LWZiNTU0ZmJiZTU4MWU1OTk2YjllNmE0OTg2YjIxYjA2YmE1MWFlOTE4NWFhMDBkMzE4NDQzYjk1YmQwMDI2MzU
>
>
> Yang Wang wrote on Sun, Jan 19, 2020 at 5:07 PM:
>
> > Hi everyone,
> >
> >
> > Currently Flink supports the resource management systems YARN and Mesos.
> > However, they were not designed for fast-moving cloud-native
> > architectures, and they cannot support mixed workloads (e.g. batch,
> > streaming, deep learning, web services, etc.) relatively well. At the
> > same time, Kubernetes is evolving very fast to fill those gaps and become
> > the de facto orchestration framework. So running Flink on Kubernetes is
> > a very basic requirement for many users.
> >
> >
> > At least, we have the following advantages when natively running Flink on
> > Kubernetes.
> > * Flink KubernetesResourceManager will allocate TaskManager pods
> > dynamically based on the resource requirements of the jobs.
> > * Flink's bundled scripts can start/stop a session cluster on
> > Kubernetes; external tools are no longer needed.
> > * Compared with the YARN deployment, different Flink clusters can get
> > better isolation by leveraging the abilities of Kubernetes.
> >
> >
> > Recently, I have also found that more and more users are very interested
> > in running Flink on Kubernetes natively. The community has already made
> > some efforts [1], which will be released in 1.10. You are welcome to try
> > it and give us your feedback. However, it covers only the basic
> > requirements, and we still need many features before production. So I
> > want to start this discussion to collect the requirements that you have
> > come across. Feel free to share your valuable thoughts. We will try to
> > summarize them and create sub-tasks in the umbrella ticket [2]. I will
> > also move some existing tickets there for easier tracking.
> >
> >
> > Best,
> > Yang
> >
> > [1].
> > https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html
> > [2]. https://issues.apache.org/jira/browse/FLINK-14460
> >
>