Thanks Thomas & Biao for your feedback. Any additional opinions on how we should proceed with per job-mode? As you might have guessed, I am leaning towards proposing to deprecate per-job mode.
On Thu, Jan 13, 2022 at 5:11 PM Thomas Weise <t...@apache.org> wrote: > Regarding session mode: > > ## Session Mode > * main() method executed in client > > Session mode also supports execution of the main method on Jobmanager > with submission through REST API. That's how Flinkk k8s operators like > [1] work. It's actually an important capability because it allows for > allocation of the cluster resources prior to taking down the previous > job during upgrade when the goal is optimization for availability. > > Thanks, > Thomas > > [1] https://github.com/lyft/flinkk8soperator > > On Thu, Jan 13, 2022 at 12:32 AM Konstantin Knauf <kna...@apache.org> > wrote: > > > > Hi everyone, > > > > I would like to discuss and understand if the benefits of having Per-Job > > Mode in Apache Flink outweigh its drawbacks. > > > > > > *# Background: Flink's Deployment Modes* > > Flink currently has three deployment modes. They differ in the following > > dimensions: > > * main() method executed on Jobmanager or Client > > * dependencies shipped by client or bundled with all nodes > > * number of jobs per cluster & relationship between job and cluster > > lifecycle* (supported resource providers) > > > > ## Application Mode > > * main() method executed on Jobmanager > > * dependencies already need to be available on all nodes > > * dedicated cluster for all jobs executed from the same main()-method > > (Note: applications with more than one job, currently still significant > > limitations like missing high-availability). Technically, a session > cluster > > dedicated to all jobs submitted from the same main() method. > > * supported by standalone, native kubernetes, YARN > > > > ## Session Mode > > * main() method executed in client > > * dependencies are distributed from and by the client to all nodes > > * cluster is shared by multiple jobs submitted from different clients, > > independent lifecycle > > * supported by standalone, Native Kubernetes, YARN > > > > ## Per-Job Mode > > * main() method executed in client > > * dependencies are distributed from and by the client to all nodes > > * dedicated cluster for a single job > > * supported by YARN only > > > > > > *# Reasons to Keep** There are use cases where you might need the > > combination of a single job per cluster, but main() method execution in > the > > client. This combination is only supported by per-job mode. > > * It currently exists. Existing users will need to migrate to either > > session or application mode. > > > > > > *# Reasons to Drop** With Per-Job Mode and Application Mode we have two > > modes that for most users probably do the same thing. Specifically, for > > those users that don't care where the main() method is executed and want > to > > submit a single job per cluster. Having two ways to do the same thing is > > confusing. > > * Per-Job Mode is only supported by YARN anyway. If we keep it, we should > > work towards support in Kubernetes and Standalone, too, to reduce special > > casing. > > * Dropping per-job mode would reduce complexity in the code and allow us > to > > dedicate more resources to the other two deployment modes. > > * I believe with session mode and application mode we have to easily > > distinguishable and understandable deployment modes that cover Flink's > use > > cases: > > * session mode: olap-style, interactive jobs/queries, short lived > batch > > jobs, very small jobs, traditional cluster-centric deployment mode (fits > > the "Hadoop world") > > * application mode: long-running streaming jobs, large scale & > > heterogenous jobs (resource isolation!), application-centric deployment > > mode (fits the "Kubernetes world") > > > > > > *# Call to Action* > > * Do you use per-job mode? If so, why & would you be able to migrate to > one > > of the other methods? > > * Am I missing any pros/cons? > > * Are you in favor of dropping per-job mode midterm? > > > > Cheers and thank you, > > > > Konstantin > > > > -- > > > > Konstantin Knauf > > > > https://twitter.com/snntrable > > > > https://github.com/knaufk > -- Konstantin Knauf https://twitter.com/snntrable https://github.com/knaufk