Hi All!

Thanks for the questions. There are still quite a few unknowns and decisions to be made, but here are my current thoughts:

# 1 Flink Native vs Standalone integration
Maybe we should make this clearer in the FLIP, but we agreed to build the
first version of the operator on the native integration. While this clearly
does not cover all use cases and requirements, it should mean a much smaller
initial effort and a nicer first version.

# How do we run a Flink job from a CR?
I am leaning strongly toward using the ApplicationDeployer interface to
submit jobs directly from Java. Again, this would be a very nice and simple
Java solution. I think this will also help make the deployment interfaces
more solid so that we can then make them public.
If there is no way around it, we could also invoke the CLI classes from
within the application, but I would prefer not to.

# Pod template
I cannot comment on this yet :D

Cheers,
Gyula

On Wed, Jan 26, 2022 at 12:38 PM Yang Wang <danrtsey...@gmail.com> wrote:

> Hi Biao,
>
> # 1 Flink Native vs Standalone integration
> I think this discussion[1] has been trending toward the newly introduced
> Flink K8s operator starting with the native K8s integration first.
> Do you have any concerns about this?
>
> # 2 K8S StatefulSet v.s. K8S Deployment
> IIUC, FlinkDeployment is just a custom resource name. It does not mean
> that we need to create a corresponding K8s Deployment for the JobManager
> or TaskManager.
> If we are using the native K8s integration, the JobManager is started with
> a K8s Deployment, while TaskManagers are naked pods managed by
> FlinkResourceManager.
>
> Actually, I think "FlinkDeployment" is easier to understand than
> "FlinkStatefulSet" :)
>
>
> [1]. https://lists.apache.org/thread/l1dkp8v4bhlcyb4tdts99g7w4wdglfy4
>
>
> Best,
> Yang
>
> Biao Geng <biaoge...@gmail.com> wrote on Wed, Jan 26, 2022 at 18:00:
>
> > Hi Thomas,
> > Thanks a lot for the great efforts in this well-organized FLIP!
> > After reading the FLIP carefully, I think Yang has given some great
> > feedback, and I just want to share some of my concerns:
> > # 1 Flink Native vs Standalone integration
> > I believe it is reasonable to support both modes in the long run, but in
> > the FLIP and the previous thread[1], it seems that we have not decided
> > which one to implement initially. The FLIP mentions "Maybe start with
> > support for Flink Native" in order to reuse code from [2]. Is that the
> > one finally selected?
> > # 2 K8S StatefulSet v.s. K8S Deployment
> > In the CR example, I notice that the kind we use is FlinkDeployment. I
> > would like to check whether we have decided to use the K8S Deployment
> > workload resource. As the name implies, a StatefulSet is for stateful
> > apps, while a Deployment is usually for stateless apps. I think it is
> > worthwhile to consider the choice more carefully because of a use case
> > in the GCP operator[3], which may influence our other design choices
> > (like the Flink application deletion strategy).
> >
> > Again, thanks for the work. I believe this FLIP will be very useful for
> > many customers, and I hope I can contribute to its implementation!
> >
> > Best regards,
> > Biao Geng
> >
> > [1] https://lists.apache.org/thread/l1dkp8v4bhlcyb4tdts99g7w4wdglfy4
> > [2] https://github.com/wangyang0918/flink-native-k8s-operator
> > [3] https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/pull/354
> >
> > Yang Wang <danrtsey...@gmail.com> wrote on Wed, Jan 26, 2022 at 15:25:
> >
> > > Thanks Thomas for creating FLIP-212 to introduce the Flink Kubernetes
> > > Operator.
> > >
> > > The proposal already looks very good to me and has integrated all the
> > > input from the previous discussion (e.g. native K8s vs. standalone,
> > > Go vs. Java).
> > >
> > > I read the FLIP carefully and have some questions that need to be
> > > clarified.
> > >
> > > # How do we run a Flink job from a CR?
> > > 1. Start a session cluster and then submit the Flink job via the
> > >    REST API
> > > 2. Start a Flink application cluster which bundles one or more Flink
> > >    jobs
> > > It is not clear to me which way we will choose. It seems that the
> > > existing Google/Lyft K8s operators use #1, but I lean toward #2 in
> > > the newly introduced K8s operator.
> > > If #2 is the case, how could we get the job status once it has
> > > finished or failed? Maybe FLINK-24113[1] and FLINK-25715[2] could
> > > help. Or we may need to enable the Flink history server[3].
> > >
> > >
> > > # ApplicationDeployer Interface or "flink run-application" / "kubernetes-session.sh"
> > > How do we start the Flink application or session cluster?
> > > It would be great to have public and stable interfaces for deployment
> > > in Flink. But currently we only have an internal interface,
> > > *ApplicationDeployer*, to deploy the application cluster and
> > > no interface for deploying a session cluster.
> > > Of course, we could also use the CLI commands for submission. However,
> > > that will perform poorly when launching multiple applications.
> > >
> > >
> > > # Pod Template
> > > Is the pod template in the CR the same as what Flink already
> > > supports[4]?
> > > If so, I am afraid that not every field (e.g. cpu/memory resources)
> > > would take effect.
> > >
> > >
> > > [1]. https://issues.apache.org/jira/browse/FLINK-24113
> > > [2]. https://issues.apache.org/jira/browse/FLINK-25715
> > > [3]. https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/advanced/historyserver/
> > > [4]. https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/resource-providers/native_kubernetes/#pod-template
> > >
> > >
> > > Best,
> > > Yang
> > >
> > >
> > > Thomas Weise <t...@apache.org> wrote on Tue, Jan 25, 2022 at 13:08:
> > >
> > > > Hi,
> > > >
> > > > As promised in [1] we would like to start the discussion on the
> > > > addition of a Kubernetes operator to the Flink project as FLIP-212:
> > > >
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-212%3A+Introduce+Flink+Kubernetes+Operator
> > > >
> > > > Please note that the FLIP is currently focussed on the overall
> > > > direction; the intention is to fill in more details once we converge
> > > > on the high level plan.
> > > >
> > > > Thanks and looking forward to a lively discussion!
> > > >
> > > > Thomas
> > > >
> > > > [1] https://lists.apache.org/thread/l1dkp8v4bhlcyb4tdts99g7w4wdglfy4