Hi all,

And thanks for the discussion topics.

For the cluster lifecycle, it is the Entrypoint that will tear down
the cluster when the application finishes. Probably we should
emphasise it a bit more in the FLIP.
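
To make the lifecycle rule concrete, here is a minimal sketch of the teardown
condition as tison states it in his reply quoted further down in this thread
(the user main has reached its end and every submitted job is in a globally
terminal state). All names here (TeardownSketch, JobState, shouldTearDown) are
hypothetical and are not Flink's actual API.

```java
import java.util.List;

// Hypothetical sketch; none of these names are Flink classes.
class TeardownSketch {

    // Simplified stand-in for a job status, with a flag marking globally terminal states.
    enum JobState {
        RUNNING(false), FINISHED(true), CANCELED(true), FAILED(true);

        final boolean globallyTerminal;

        JobState(boolean globallyTerminal) {
            this.globallyTerminal = globallyTerminal;
        }
    }

    // The cluster may be torn down only when BOTH conditions hold:
    // the user main() has returned and every submitted job is globally terminal.
    static boolean shouldTearDown(boolean userMainFinished, List<JobState> submittedJobs) {
        return userMainFinished
                && submittedJobs.stream().allMatch(state -> state.globallyTerminal);
    }

    public static void main(String[] args) {
        System.out.println(shouldTearDown(true, List.of(JobState.FINISHED)));
        System.out.println(shouldTearDown(true, List.of(JobState.RUNNING)));
        System.out.println(shouldTearDown(false, List.of(JobState.FINISHED)));
    }
}
```

Note that in this sketch a main that submits no job at all allows teardown as
soon as it returns; whether that is the desired behaviour is an assumption of
the sketch, not something settled in the discussion.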

For the -R flag, this was in the PoC that I published just as a quick
implementation, so that I could move fast to the entrypoint part.
Personally, I would not even be against having a separate command in
the CLI for this, for example run-on-cluster or something along those
lines.
What do you think?

For fetching jars, in the FLIP we say that as a first implementation
we can have Local and DFS. I was wondering whether, in the case of YARN,
both could somehow be implemented using LocalResources, letting YARN do
the actual fetch. But I have not investigated it further. Do you have any
opinion on this?
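
As a rough illustration of the Local vs. DFS distinction, the sketch below
classifies a jar path by its URI scheme. It is only an assumption about how
such a distinction could be drawn; JarSourceSketch, JarLocation and classify
are hypothetical names, not Flink or YARN API, and the example paths are made
up.

```java
import java.net.URI;

// Hypothetical sketch; JarLocation and classify are illustrative names, not Flink API.
class JarSourceSketch {

    enum JarLocation { LOCAL, DFS }

    // A path with no scheme, or with the "file" scheme, is treated as local to the
    // machine running the entrypoint; any other scheme is treated as a DFS location.
    // URI.create throws IllegalArgumentException for strings that are not valid URIs
    // (for example, paths containing spaces).
    static JarLocation classify(String jarPath) {
        String scheme = URI.create(jarPath).getScheme();
        if (scheme == null || scheme.equalsIgnoreCase("file")) {
            return JarLocation.LOCAL;
        }
        return JarLocation.DFS;
    }

    public static void main(String[] args) {
        System.out.println(classify("/opt/flink/usrlib/job.jar"));
        System.out.println(classify("file:///opt/flink/usrlib/job.jar"));
        System.out.println(classify("hdfs:///flink/jars/job.jar"));
    }
}
```

In a YARN setup, a location classified as DFS would be the natural candidate
for registration as a LocalResource, but that mapping is exactly the part that
has not been investigated yet.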

Cheers,
Kostas

On Mon, Mar 9, 2020 at 10:47 AM Becket Qin <becket....@gmail.com> wrote:
>
> Thanks Yang,
>
> That would be very helpful!
>
> Jiangjie (Becket) Qin
>
> On Mon, Mar 9, 2020 at 3:31 PM Yang Wang <danrtsey...@gmail.com> wrote:
>>
>> Hi Becket,
>>
>> Thanks for your suggestion. We will update the FLIP to add/enrich the 
>> following parts.
>> * User cli option change, use "-R/--remote" to apply the cluster deploy mode
>> * Configuration change, how to specify remote user jars and dependencies
>> * The whole story about how "application mode" works, upload -> fetch -> 
>> submit job
>> * The cluster lifecycle, when and how the Flink cluster is destroyed
>>
>>
>> Best,
>> Yang
>>
>> On Mon, Mar 9, 2020 at 12:34 PM Becket Qin <becket....@gmail.com> wrote:
>>>
>>> Thanks for the reply, tison and Yang,
>>>
>>> Regarding the public interface, is "-R/--remote" option the only change? 
>>> Will the users also need to provide a remote location to upload and store 
>>> the jars, and a list of jars as dependencies to be uploaded?
>>>
>>> It would be important that the public interface section in the FLIP 
>>> includes all the user-facing changes, including the CLI, configuration, 
>>> metrics, etc. Can we update the FLIP to include the conclusions we have 
>>> reached here on the ML?
>>>
>>> Thanks,
>>>
>>> Jiangjie (Becket) Qin
>>>
>>> On Mon, Mar 9, 2020 at 11:59 AM Yang Wang <danrtsey...@gmail.com> wrote:
>>>>
>>>> Hi Becket,
>>>>
>>>> Thanks for jumping in and sharing your concerns. I second tison's answer 
>>>> and just want to make some additions.
>>>>
>>>>
>>>> > job submission interface
>>>>
>>>> This FLIP will introduce an interface for running the user `main()` on 
>>>> the cluster, named “ProgramDeployer”. However, it is not a public 
>>>> interface. It will be used in `CliFrontend` when the remote deploy 
>>>> option (-R/--remote-deploy) is specified. So the only change on the 
>>>> user side is the CLI option.
>>>>
>>>>
>>>> > How to fetch the jars?
>>>>
>>>> Both a “local path” and a “DFS path” could be supported for fetching the 
>>>> user jars and dependencies. Just as tison said, we could ship the user 
>>>> jar and dependencies from the client side to HDFS and use the entrypoint 
>>>> to fetch them.
>>>>
>>>> We also have some other practical ways to use the new “application mode”:
>>>> 1. Upload the user jars and dependencies to the DFS (e.g. HDFS, S3, 
>>>> Aliyun OSS) manually or via some external deployer system. For K8s, the 
>>>> user jars and dependencies could also be built into the Docker image.
>>>> 2. Specify the remote/local user jar and dependencies in `flink run`. 
>>>> Usually this could also be done by the external deployer system.
>>>> 3. When the `ClusterEntrypoint` is launched, it will fetch the jars and 
>>>> files automatically. We do not need any specific fetcher implementation, 
>>>> since we could leverage the Flink `FileSystem` to do this.
>>>>
>>>>
>>>>
>>>>
>>>> Best,
>>>> Yang
>>>>
>>>> On Mon, Mar 9, 2020 at 11:34 AM tison <wander4...@gmail.com> wrote:
>>>>>
>>>>> Hi Becket,
>>>>>
>>>>> Thanks for your attention to FLIP-85! I answered your questions inline.
>>>>>
>>>>> 1. What exactly will the job submission interface look like after this 
>>>>> FLIP? The FLIP template has a Public Interface section, but it was 
>>>>> removed from this FLIP.
>>>>>
>>>>> As Yang mentioned in this thread above:
>>>>>
>>>>> From the user's perspective, only a `-R/--remote-deploy` CLI option is 
>>>>> visible. They are not aware of the application mode.
>>>>>
>>>>> 2. How will the new ClusterEntrypoint fetch the jars from external 
>>>>> storage? What external storage will be supported out of the box? Will 
>>>>> this "jar fetcher" be pluggable? If so, what does the API look like and 
>>>>> how will users specify the custom "jar fetcher"?
>>>>>
>>>>> It depends actually. Here are several points:
>>>>>
>>>>> i. Currently, shipping user files is handled by Flink, and dependency 
>>>>> fetching can be handled by Flink too.
>>>>> ii. Currently, we only support local file system shipfiles. In 
>>>>> Application Mode, to support meaningful jar fetching, we should first 
>>>>> let users configure a richer set of shipfiles schemes.
>>>>> iii. Dependency fetching varies across deployments. That is, on YARN the 
>>>>> convention is to go through HDFS; on Kubernetes the convention is a 
>>>>> configured resource server, with fetching done by an initContainer.
>>>>>
>>>>> Thus, in the first phase of Application Mode, dependency fetching is 
>>>>> handled entirely within Flink.
>>>>>
>>>>> 3. It sounds like in this FLIP, the "session cluster" running the 
>>>>> application has the same lifecycle as the user application. How will the 
>>>>> session cluster be torn down after the application finishes? Will the 
>>>>> ClusterEntrypoint do that? Will there be an option of not tearing the 
>>>>> cluster down?
>>>>>
>>>>> The precondition for tearing down the cluster is that *both* of the 
>>>>> following hold:
>>>>>
>>>>> i. the user main has reached its end
>>>>> ii. all submitted jobs (currently, at most one) have reached a globally 
>>>>> terminal state
>>>>>
>>>>> As for the "how", it is an implementation topic, but conceptually it is 
>>>>> the ClusterEntrypoint's responsibility.
>>>>>
>>>>> >Will there be an option of not tearing the cluster down?
>>>>>
>>>>> I think the answer is "No", because the cluster is designed to be bound 
>>>>> to an Application. User logic that communicates with the job is always 
>>>>> in its `main`, and for historical information we have the history server.
>>>>>
>>>>> Best,
>>>>> tison.
>>>>>
>>>>>
>>>>> On Mon, Mar 9, 2020 at 8:12 AM Becket Qin <becket....@gmail.com> wrote:
>>>>>>
>>>>>> Hi Peter and Kostas,
>>>>>>
>>>>>> Thanks for creating this FLIP. Moving the JobGraph compilation to the 
>>>>>> cluster makes a lot of sense to me. FLIP-40 had exactly the same idea, 
>>>>>> but is currently dormant and can probably be superseded by this FLIP. 
>>>>>> After reading the FLIP, I still have a few questions.
>>>>>>
>>>>>> 1. What exactly will the job submission interface look like after this 
>>>>>> FLIP? The FLIP template has a Public Interface section, but it was 
>>>>>> removed from this FLIP.
>>>>>> 2. How will the new ClusterEntrypoint fetch the jars from external 
>>>>>> storage? What external storage will be supported out of the box? Will 
>>>>>> this "jar fetcher" be pluggable? If so, what does the API look like and 
>>>>>> how will users specify the custom "jar fetcher"?
>>>>>> 3. It sounds like in this FLIP, the "session cluster" running the 
>>>>>> application has the same lifecycle as the user application. How will the 
>>>>>> session cluster be torn down after the application finishes? Will the 
>>>>>> ClusterEntrypoint do that? Will there be an option of not tearing the 
>>>>>> cluster down?
>>>>>>
>>>>>> Maybe these have been discussed on the ML earlier, but I think they 
>>>>>> should be part of the FLIP as well.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Jiangjie (Becket) Qin
>>>>>>
>>>>>> On Thu, Mar 5, 2020 at 10:09 PM Kostas Kloudas <kklou...@gmail.com> 
>>>>>> wrote:
>>>>>>>
>>>>>>> Also from my side +1  to start voting.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Kostas
>>>>>>>
>>>>>>> On Thu, Mar 5, 2020 at 7:45 AM tison <wander4...@gmail.com> wrote:
>>>>>>> >
>>>>>>> > +1 to start voting.
>>>>>>> >
>>>>>>> > Best,
>>>>>>> > tison.
>>>>>>> >
>>>>>>> >
>>>>>>> > On Thu, Mar 5, 2020 at 2:29 PM Yang Wang <danrtsey...@gmail.com> wrote:
>>>>>>> >>
>>>>>>> >> Hi Peter,
>>>>>>> >> Thanks a lot for your response.
>>>>>>> >>
>>>>>>> >> Hi all @Kostas Kloudas @Zili Chen @Peter Huang @Rong Rong
>>>>>>> >> It seems that we have reached an agreement. The “application mode” 
>>>>>>> >> is regarded as an enhanced “per-job” mode. It is
>>>>>>> >> orthogonal to “cluster deploy”. Currently, we bind “per-job” 
>>>>>>> >> to `run-user-main-on-client` and “application mode”
>>>>>>> >> to `run-user-main-on-cluster`.
>>>>>>> >>
>>>>>>> >> Do you have any other concerns about moving FLIP-85 to voting?
>>>>>>> >>
>>>>>>> >>
>>>>>>> >> Best,
>>>>>>> >> Yang
>>>>>>> >>
>>>>>>> >> On Thu, Mar 5, 2020 at 12:48 PM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>>>>>>> >>>
>>>>>>> >>> Hi Yang and Kostas,
>>>>>>> >>>
>>>>>>> >>> Thanks for the clarification. It makes more sense to me if the 
>>>>>>> >>> long-term goal is to replace per-job mode with application mode 
>>>>>>> >>> in the future (once multiple `execute()` calls can be supported). 
>>>>>>> >>> Before that, it would be better to keep the concept of application 
>>>>>>> >>> mode internal. As Yang suggested, users only need to use a 
>>>>>>> >>> `-R/--remote-deploy` CLI option to launch a per-job cluster with 
>>>>>>> >>> the main function executed in the cluster entrypoint. +1 for the 
>>>>>>> >>> execution plan.
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>> Best Regards
>>>>>>> >>> Peter Huang
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>>
>>>>>>> >>> On Tue, Mar 3, 2020 at 7:11 AM Yang Wang <danrtsey...@gmail.com> 
>>>>>>> >>> wrote:
>>>>>>> >>>>
>>>>>>> >>>> Hi Peter,
>>>>>>> >>>>
>>>>>>> >>>> Having the application mode does not mean we will drop the 
>>>>>>> >>>> cluster-deploy
>>>>>>> >>>> option. I just want to share some thoughts about “Application 
>>>>>>> >>>> Mode”.
>>>>>>> >>>>
>>>>>>> >>>>
>>>>>>> >>>> 1. The application mode could cover the per-job semantics. Its 
>>>>>>> >>>> lifecycle is bound to the user `main()`, and all the jobs in the 
>>>>>>> >>>> user main will be executed in the same Flink cluster. In the first 
>>>>>>> >>>> phase of the FLIP-85 implementation, running the user main on the 
>>>>>>> >>>> cluster side could be supported in application mode.
>>>>>>> >>>>
>>>>>>> >>>> 2. Maybe in the future, we will also need to support multiple 
>>>>>>> >>>> `execute()` calls on the client side in the same Flink cluster. 
>>>>>>> >>>> Then the per-job mode will evolve into application mode.
>>>>>>> >>>>
>>>>>>> >>>> 3. From the user's perspective, only a `-R/--remote-deploy` CLI 
>>>>>>> >>>> option is visible. They are not aware of the application mode.
>>>>>>> >>>>
>>>>>>> >>>> 4. In the first phase, the application mode works as “per-job” 
>>>>>>> >>>> (only one job in the user main). We just leave more room for the 
>>>>>>> >>>> future.
>>>>>>> >>>>
>>>>>>> >>>>
>>>>>>> >>>> I am not against calling it “cluster deploy mode” if you all 
>>>>>>> >>>> think it is clearer for users.
>>>>>>> >>>>
>>>>>>> >>>>
>>>>>>> >>>>
>>>>>>> >>>> Best,
>>>>>>> >>>> Yang
>>>>>>> >>>>
>>>>>>> >>>> On Tue, Mar 3, 2020 at 6:49 PM Kostas Kloudas <kklou...@gmail.com> wrote:
>>>>>>> >>>>>
>>>>>>> >>>>> Hi Peter,
>>>>>>> >>>>>
>>>>>>> >>>>> I understand your point. This is why I was also a bit torn about 
>>>>>>> >>>>> the
>>>>>>> >>>>> name and my proposal was a bit aligned with yours (something 
>>>>>>> >>>>> along the
>>>>>>> >>>>> lines of "cluster deploy" mode).
>>>>>>> >>>>>
>>>>>>> >>>>> But many of the other participants in the discussion suggested the
>>>>>>> >>>>> "Application Mode". I think that the reasoning is that now the 
>>>>>>> >>>>> user's
>>>>>>> >>>>> Application is more self-contained.
>>>>>>> >>>>> It will be submitted to the cluster and the user can just 
>>>>>>> >>>>> disconnect.
>>>>>>> >>>>> In addition, as discussed briefly in the doc, in the future there 
>>>>>>> >>>>> may
>>>>>>> >>>>> be better support for multi-execute applications which will bring 
>>>>>>> >>>>> us
>>>>>>> >>>>> one step closer to the true "Application Mode". But this is how I
>>>>>>> >>>>> interpreted their arguments, of course they can also express their
>>>>>>> >>>>> thoughts on the topic :)
>>>>>>> >>>>>
>>>>>>> >>>>> Cheers,
>>>>>>> >>>>> Kostas
>>>>>>> >>>>>
>>>>>>> >>>>> On Mon, Mar 2, 2020 at 6:15 PM Peter Huang 
>>>>>>> >>>>> <huangzhenqiu0...@gmail.com> wrote:
>>>>>>> >>>>> >
>>>>>>> >>>>> > Hi Kostas,
>>>>>>> >>>>> >
>>>>>>> >>>>> > Thanks for updating the wiki. We have aligned with the 
>>>>>>> >>>>> > implementations in the doc. But I feel the naming is still a 
>>>>>>> >>>>> > little bit confusing from a user's perspective. It is well 
>>>>>>> >>>>> > known that Flink supports per-job clusters and session clusters. 
>>>>>>> >>>>> > That concept sits at the layer of how a job is managed within 
>>>>>>> >>>>> > Flink. The method introduced until now is a kind of mix of job 
>>>>>>> >>>>> > and session cluster to compromise on the implementation 
>>>>>>> >>>>> > complexity. We probably don't need to label it as Application 
>>>>>>> >>>>> > Mode at the same layer as per-job cluster and session cluster. 
>>>>>>> >>>>> > Conceptually, I think it is still a cluster mode implementation 
>>>>>>> >>>>> > for per-job cluster.
>>>>>>> >>>>> >
>>>>>>> >>>>> > To minimize confusion for users, I think it would be better to 
>>>>>>> >>>>> > make it just an option of the per-job cluster for each type of 
>>>>>>> >>>>> > cluster manager. What do you think?
>>>>>>> >>>>> >
>>>>>>> >>>>> >
>>>>>>> >>>>> > Best Regards
>>>>>>> >>>>> > Peter Huang
>>>>>>> >>>>> >
>>>>>>> >>>>> >
>>>>>>> >>>>> >
>>>>>>> >>>>> >
>>>>>>> >>>>> >
>>>>>>> >>>>> >
>>>>>>> >>>>> >
>>>>>>> >>>>> >
>>>>>>> >>>>> > On Mon, Mar 2, 2020 at 7:22 AM Kostas Kloudas 
>>>>>>> >>>>> > <kklou...@gmail.com> wrote:
>>>>>>> >>>>> >>
>>>>>>> >>>>> >> Hi Yang,
>>>>>>> >>>>> >>
>>>>>>> >>>>> >> The difference between per-job and application mode is that, 
>>>>>>> >>>>> >> as you
>>>>>>> >>>>> >> described, in the per-job mode the main is executed on the 
>>>>>>> >>>>> >> client
>>>>>>> >>>>> >> while in the application mode, the main is executed on the 
>>>>>>> >>>>> >> cluster.
>>>>>>> >>>>> >> I do not think we have to offer "application mode" with 
>>>>>>> >>>>> >> running the
>>>>>>> >>>>> >> main on the client side as this is exactly what the per-job 
>>>>>>> >>>>> >> mode does
>>>>>>> >>>>> >> currently and, as you described also, it would be redundant.
>>>>>>> >>>>> >>
>>>>>>> >>>>> >> Sorry if this was not clear in the document.
>>>>>>> >>>>> >>
>>>>>>> >>>>> >> Cheers,
>>>>>>> >>>>> >> Kostas
>>>>>>> >>>>> >>
>>>>>>> >>>>> >> On Mon, Mar 2, 2020 at 3:17 PM Yang Wang 
>>>>>>> >>>>> >> <danrtsey...@gmail.com> wrote:
>>>>>>> >>>>> >> >
>>>>>>> >>>>> >> > Hi Kostas,
>>>>>>> >>>>> >> >
>>>>>>> >>>>> >> > Thanks a lot for your conclusion and for updating the FLIP-85 
>>>>>>> >>>>> >> > wiki. Currently, I have no more questions about the 
>>>>>>> >>>>> >> > motivation, approach, fault tolerance and the first phase 
>>>>>>> >>>>> >> > implementation.
>>>>>>> >>>>> >> >
>>>>>>> >>>>> >> > I think the new title "Flink Application Mode" makes a lot of 
>>>>>>> >>>>> >> > sense to me. Especially for containerized environments, the 
>>>>>>> >>>>> >> > cluster deploy option will be very useful.
>>>>>>> >>>>> >> >
>>>>>>> >>>>> >> > Just one concern: how do we introduce this new application 
>>>>>>> >>>>> >> > mode to our users?
>>>>>>> >>>>> >> > Each user program (i.e. `main()`) is an application. 
>>>>>>> >>>>> >> > Currently, we intend to only support one `execute()`. So 
>>>>>>> >>>>> >> > what's the difference between per-job and application mode?
>>>>>>> >>>>> >> >
>>>>>>> >>>>> >> > For per-job, the user `main()` is always executed on the 
>>>>>>> >>>>> >> > client side. And for application mode, the user `main()` could 
>>>>>>> >>>>> >> > be executed on the client or the master side (configured via 
>>>>>>> >>>>> >> > CLI option). Right? We need to have a clear concept. 
>>>>>>> >>>>> >> > Otherwise, the users will be more and more confused.
>>>>>>> >>>>> >> >
>>>>>>> >>>>> >> >
>>>>>>> >>>>> >> > Best,
>>>>>>> >>>>> >> > Yang
>>>>>>> >>>>> >> >
>>>>>>> >>>>> >> > On Mon, Mar 2, 2020 at 5:58 PM Kostas Kloudas <kklou...@gmail.com> wrote:
>>>>>>> >>>>> >> >>
>>>>>>> >>>>> >> >> Hi all,
>>>>>>> >>>>> >> >>
>>>>>>> >>>>> >> >> I updated 
>>>>>>> >>>>> >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Flink+Application+Mode
>>>>>>> >>>>> >> >> based on the discussion we had here:
>>>>>>> >>>>> >> >>
>>>>>>> >>>>> >> >> https://docs.google.com/document/d/1ji72s3FD9DYUyGuKnJoO4ApzV-nSsZa0-bceGXW7Ocw/edit#
>>>>>>> >>>>> >> >>
>>>>>>> >>>>> >> >> Please let me know what you think and please keep the 
>>>>>>> >>>>> >> >> discussion in the ML :)
>>>>>>> >>>>> >> >>
>>>>>>> >>>>> >> >> Thanks for starting the discussion and I hope that soon we 
>>>>>>> >>>>> >> >> will be
>>>>>>> >>>>> >> >> able to vote on the FLIP.
>>>>>>> >>>>> >> >>
>>>>>>> >>>>> >> >> Cheers,
>>>>>>> >>>>> >> >> Kostas
>>>>>>> >>>>> >> >>
>>>>>>> >>>>> >> >> On Thu, Jan 16, 2020 at 3:40 AM Yang Wang 
>>>>>>> >>>>> >> >> <danrtsey...@gmail.com> wrote:
>>>>>>> >>>>> >> >> >
>>>>>>> >>>>> >> >> > Hi all,
>>>>>>> >>>>> >> >> >
>>>>>>> >>>>> >> >> > Thanks a lot for the feedback from @Kostas Kloudas. All your 
>>>>>>> >>>>> >> >> > concerns are on point. FLIP-85 is mainly focused on 
>>>>>>> >>>>> >> >> > supporting cluster mode for per-job, since it is more urgent 
>>>>>>> >>>>> >> >> > and has many more use cases in both Yarn and Kubernetes 
>>>>>>> >>>>> >> >> > deployments. For session clusters, we could have more 
>>>>>>> >>>>> >> >> > discussion in a new thread later.
>>>>>>> >>>>> >> >> >
>>>>>>> >>>>> >> >> > #1, How to download the user jars and dependencies for 
>>>>>>> >>>>> >> >> > per-job in cluster mode?
>>>>>>> >>>>> >> >> > For Yarn, we could register the user jars and dependencies as 
>>>>>>> >>>>> >> >> > LocalResource. They will be distributed by Yarn, and once the 
>>>>>>> >>>>> >> >> > JobManager and TaskManager are launched, the jars already 
>>>>>>> >>>>> >> >> > exist.
>>>>>>> >>>>> >> >> > For Standalone per-job and K8s, we expect that the user jars 
>>>>>>> >>>>> >> >> > and dependencies are built into the image. Alternatively, the 
>>>>>>> >>>>> >> >> > InitContainer could be used for downloading. It is natively 
>>>>>>> >>>>> >> >> > distributed, so we will not have a bottleneck.
>>>>>>> >>>>> >> >> >
>>>>>>> >>>>> >> >> > #2, Job graph recovery
>>>>>>> >>>>> >> >> > We could have an optimization to store the job graph on the 
>>>>>>> >>>>> >> >> > DFS. However, I suggest that building a new job graph from 
>>>>>>> >>>>> >> >> > the configuration be the default option, since we will not 
>>>>>>> >>>>> >> >> > always have a DFS store when deploying a Flink per-job 
>>>>>>> >>>>> >> >> > cluster. Of course, we assume that using the same 
>>>>>>> >>>>> >> >> > configuration (e.g. job_id, user_jar, main_class, main_args, 
>>>>>>> >>>>> >> >> > parallelism, savepoint_settings, etc.) will produce the same 
>>>>>>> >>>>> >> >> > job graph. I think the standalone per-job mode already has 
>>>>>>> >>>>> >> >> > similar behavior.
>>>>>>> >>>>> >> >> >
>>>>>>> >>>>> >> >> > #3, What happens with jobs that have multiple execute calls?
>>>>>>> >>>>> >> >> > Currently, it is really a problem. Even if we use a local 
>>>>>>> >>>>> >> >> > client on the Flink master side, it will behave differently 
>>>>>>> >>>>> >> >> > from client mode. In client mode, if we execute multiple 
>>>>>>> >>>>> >> >> > times, we will deploy a separate Flink cluster for each 
>>>>>>> >>>>> >> >> > execute. I am not quite sure whether that is reasonable. 
>>>>>>> >>>>> >> >> > However, I still think using the local client is a good 
>>>>>>> >>>>> >> >> > choice. We could continue the discussion in a new thread. 
>>>>>>> >>>>> >> >> > @Zili Chen <wander4...@gmail.com> Do you want to drive this?
>>>>>>> >>>>> >> >> >
>>>>>>> >>>>> >> >> >
>>>>>>> >>>>> >> >> >
>>>>>>> >>>>> >> >> > Best,
>>>>>>> >>>>> >> >> > Yang
>>>>>>> >>>>> >> >> >
>>>>>>> >>>>> >> >> > On Thu, Jan 16, 2020 at 1:55 AM Peter Huang 
>>>>>>> >>>>> >> >> > <huangzhenqiu0...@gmail.com> wrote:
>>>>>>> >>>>> >> >> >
>>>>>>> >>>>> >> >> > > Hi Kostas,
>>>>>>> >>>>> >> >> > >
>>>>>>> >>>>> >> >> > > Thanks for this feedback. I can't agree more with this 
>>>>>>> >>>>> >> >> > > opinion. The cluster mode should be added first in per-job 
>>>>>>> >>>>> >> >> > > cluster.
>>>>>>> >>>>> >> >> > >
>>>>>>> >>>>> >> >> > > 1) For job cluster implementation
>>>>>>> >>>>> >> >> > > 1. Job graph recovery: rebuild from configuration, or store 
>>>>>>> >>>>> >> >> > > a static job graph as in session cluster. I think the static 
>>>>>>> >>>>> >> >> > > one will be better because of the shorter recovery time.
>>>>>>> >>>>> >> >> > > Let me update the doc with details.
>>>>>>> >>>>> >> >> > >
>>>>>>> >>>>> >> >> > > 2. For jobs that execute multiple times, I think @Zili Chen 
>>>>>>> >>>>> >> >> > > <wander4...@gmail.com> has proposed the local client 
>>>>>>> >>>>> >> >> > > solution, which can actually run the program in the cluster 
>>>>>>> >>>>> >> >> > > entry point. We can put the implementation in the second 
>>>>>>> >>>>> >> >> > > stage, or even in a new FLIP for further discussion.
>>>>>>> >>>>> >> >> > >
>>>>>>> >>>>> >> >> > > 2) For session cluster implementation
>>>>>>> >>>>> >> >> > > We can disable the cluster mode for the session cluster in 
>>>>>>> >>>>> >> >> > > the first stage. I agree the jar downloading will be a 
>>>>>>> >>>>> >> >> > > painful thing. We can consider a PoC and performance 
>>>>>>> >>>>> >> >> > > evaluation first. If the end-to-end experience is good 
>>>>>>> >>>>> >> >> > > enough, then we can consider proceeding with the solution.
>>>>>>> >>>>> >> >> > >
>>>>>>> >>>>> >> >> > > Looking forward to more opinions from @Yang Wang 
>>>>>>> >>>>> >> >> > > <danrtsey...@gmail.com> @Zili Chen <wander4...@gmail.com> 
>>>>>>> >>>>> >> >> > > @Dian Fu <dian0511...@gmail.com>.
>>>>>>> >>>>> >> >> > >
>>>>>>> >>>>> >> >> > >
>>>>>>> >>>>> >> >> > > Best Regards
>>>>>>> >>>>> >> >> > > Peter Huang
>>>>>>> >>>>> >> >> > >
>>>>>>> >>>>> >> >> > > On Wed, Jan 15, 2020 at 7:50 AM Kostas Kloudas 
>>>>>>> >>>>> >> >> > > <kklou...@gmail.com> wrote:
>>>>>>> >>>>> >> >> > >
>>>>>>> >>>>> >> >> > >> Hi all,
>>>>>>> >>>>> >> >> > >>
>>>>>>> >>>>> >> >> > >> I am writing here as the discussion on the Google Doc 
>>>>>>> >>>>> >> >> > >> seems to be a
>>>>>>> >>>>> >> >> > >> bit difficult to follow.
>>>>>>> >>>>> >> >> > >>
>>>>>>> >>>>> >> >> > >> I think that in order to be able to make progress, it 
>>>>>>> >>>>> >> >> > >> would be helpful
>>>>>>> >>>>> >> >> > >> to focus on per-job mode for now.
>>>>>>> >>>>> >> >> > >> The reason is that:
>>>>>>> >>>>> >> >> > >>  1) making the (unique) JobSubmitHandler responsible 
>>>>>>> >>>>> >> >> > >> for creating the
>>>>>>> >>>>> >> >> > >> jobgraphs,
>>>>>>> >>>>> >> >> > >>   which includes downloading dependencies, is not an 
>>>>>>> >>>>> >> >> > >> optimal solution
>>>>>>> >>>>> >> >> > >>  2) even if we put the responsibility on the 
>>>>>>> >>>>> >> >> > >> JobMaster, currently each
>>>>>>> >>>>> >> >> > >> job has its own
>>>>>>> >>>>> >> >> > >>   JobMaster but they all run on the same process, so 
>>>>>>> >>>>> >> >> > >> we have again a
>>>>>>> >>>>> >> >> > >> single entity.
>>>>>>> >>>>> >> >> > >>
>>>>>>> >>>>> >> >> > >> Of course after this is done, and if we feel 
>>>>>>> >>>>> >> >> > >> comfortable with the
>>>>>>> >>>>> >> >> > >> solution, then we can go to the session mode.
>>>>>>> >>>>> >> >> > >>
>>>>>>> >>>>> >> >> > >> A second comment has to do with fault-tolerance in the 
>>>>>>> >>>>> >> >> > >> per-job,
>>>>>>> >>>>> >> >> > >> cluster-deploy mode.
>>>>>>> >>>>> >> >> > >> In the document, it is suggested that upon recovery, 
>>>>>>> >>>>> >> >> > >> the JobMaster of
>>>>>>> >>>>> >> >> > >> each job re-creates the JobGraph.
>>>>>>> >>>>> >> >> > >> I am just wondering if it is better to create and 
>>>>>>> >>>>> >> >> > >> store the jobGraph
>>>>>>> >>>>> >> >> > >> upon submission and only fetch it
>>>>>>> >>>>> >> >> > >> upon recovery so that we have a static jobGraph.
>>>>>>> >>>>> >> >> > >>
>>>>>>> >>>>> >> >> > >> Finally, I have a question which is what happens with 
>>>>>>> >>>>> >> >> > >> jobs that have
>>>>>>> >>>>> >> >> > >> multiple execute calls?
>>>>>>> >>>>> >> >> > >> The semantics seem to change compared to the current 
>>>>>>> >>>>> >> >> > >> behaviour, right?
>>>>>>> >>>>> >> >> > >>
>>>>>>> >>>>> >> >> > >> Cheers,
>>>>>>> >>>>> >> >> > >> Kostas
>>>>>>> >>>>> >> >> > >>
>>>>>>> >>>>> >> >> > >> On Wed, Jan 8, 2020 at 8:05 PM tison 
>>>>>>> >>>>> >> >> > >> <wander4...@gmail.com> wrote:
>>>>>>> >>>>> >> >> > >> >
>>>>>>> >>>>> >> >> > >> > Not always; Yang Wang is also not yet a committer, but 
>>>>>>> >>>>> >> >> > >> > he can join the channel. I could not find your id by 
>>>>>>> >>>>> >> >> > >> > clicking “Add new member in channel”, so I came to you 
>>>>>>> >>>>> >> >> > >> > and asked you to try out the link. Possibly I will find 
>>>>>>> >>>>> >> >> > >> > other ways, but the original purpose is that the slack 
>>>>>>> >>>>> >> >> > >> > channel is a public area where we discuss development...
>>>>>>> >>>>> >> >> > >> > Best,
>>>>>>> >>>>> >> >> > >> > tison.
>>>>>>> >>>>> >> >> > >> >
>>>>>>> >>>>> >> >> > >> >
>>>>>>> >>>>> >> >> > >> > On Thu, Jan 9, 2020 at 2:44 AM Peter Huang 
>>>>>>> >>>>> >> >> > >> > <huangzhenqiu0...@gmail.com> wrote:
>>>>>>> >>>>> >> >> > >> >
>>>>>>> >>>>> >> >> > >> > > Hi Tison,
>>>>>>> >>>>> >> >> > >> > >
>>>>>>> >>>>> >> >> > >> > > I am not a committer of Flink yet, so I think I 
>>>>>>> >>>>> >> >> > >> > > can't join it either.
>>>>>>> >>>>> >> >> > >> > >
>>>>>>> >>>>> >> >> > >> > >
>>>>>>> >>>>> >> >> > >> > > Best Regards
>>>>>>> >>>>> >> >> > >> > > Peter Huang
>>>>>>> >>>>> >> >> > >> > >
>>>>>>> >>>>> >> >> > >> > > On Wed, Jan 8, 2020 at 9:39 AM tison 
>>>>>>> >>>>> >> >> > >> > > <wander4...@gmail.com> wrote:
>>>>>>> >>>>> >> >> > >> > >
>>>>>>> >>>>> >> >> > >> > > > Hi Peter,
>>>>>>> >>>>> >> >> > >> > > >
>>>>>>> >>>>> >> >> > >> > > > Could you try out this link? 
>>>>>>> >>>>> >> >> > >> > > > https://the-asf.slack.com/messages/CNA3ADZPH
>>>>>>> >>>>> >> >> > >> > > >
>>>>>>> >>>>> >> >> > >> > > > Best,
>>>>>>> >>>>> >> >> > >> > > > tison.
>>>>>>> >>>>> >> >> > >> > > >
>>>>>>> >>>>> >> >> > >> > > >
>>>>>>> >>>>> >> >> > >> > > > On Thu, Jan 9, 2020 at 1:22 AM Peter Huang 
>>>>>>> >>>>> >> >> > >> > > > <huangzhenqiu0...@gmail.com> wrote:
>>>>>>> >>>>> >> >> > >> > > >
>>>>>>> >>>>> >> >> > >> > > > > Hi Tison,
>>>>>>> >>>>> >> >> > >> > > > >
>>>>>>> >>>>> >> >> > >> > > > > I can't join the group with the shared link. 
>>>>>>> >>>>> >> >> > >> > > > > Would you please add me to the group? My slack 
>>>>>>> >>>>> >> >> > >> > > > > account is huangzhenqiu0825.
>>>>>>> >>>>> >> >> > >> > > > > Thank you in advance.
>>>>>>> >>>>> >> >> > >> > > > >
>>>>>>> >>>>> >> >> > >> > > > >
>>>>>>> >>>>> >> >> > >> > > > > Best Regards
>>>>>>> >>>>> >> >> > >> > > > > Peter Huang
>>>>>>> >>>>> >> >> > >> > > > >
>>>>>>> >>>>> >> >> > >> > > > > On Wed, Jan 8, 2020 at 12:02 AM tison 
>>>>>>> >>>>> >> >> > >> > > > > <wander4...@gmail.com> wrote:
>>>>>>> >>>>> >> >> > >> > > > >
>>>>>>> >>>>> >> >> > >> > > > > > Hi Peter,
>>>>>>> >>>>> >> >> > >> > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > As described above, this effort should get 
>>>>>>> >>>>> >> >> > >> > > > > > attention from the people developing FLIP-73, 
>>>>>>> >>>>> >> >> > >> > > > > > a.k.a. Executor abstractions. I recommend that 
>>>>>>> >>>>> >> >> > >> > > > > > you join the public slack channel[1] for Flink 
>>>>>>> >>>>> >> >> > >> > > > > > Client API Enhancement, where you can try to 
>>>>>>> >>>>> >> >> > >> > > > > > share your detailed thoughts. It will possibly 
>>>>>>> >>>>> >> >> > >> > > > > > get more concrete attention.
>>>>>>> >>>>> >> >> > >> > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > Best,
>>>>>>> >>>>> >> >> > >> > > > > > tison.
>>>>>>> >>>>> >> >> > >> > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > [1] 
>>>>>>> >>>>> >> >> > >> > > > > > https://slack.com/share/IS21SJ75H/Rk8HhUly9FuEHb7oGwBZ33uL/enQtODg2MDYwNjE5MTg3LTA2MjIzNDc1M2ZjZDVlMjdlZjk1M2RkYmJhNjAwMTk2ZDZkODQ4NmY5YmI4OGRhNWJkYTViMTM1NzlmMzc4OWM
>>>>>>> >>>>> >> >> > >> > > > > >
>>>>>>> >>>>> >> >> > >> > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > On Tue, Jan 7, 2020 at 5:09 AM Peter Huang 
>>>>>>> >>>>> >> >> > >> > > > > > <huangzhenqiu0...@gmail.com> wrote:
>>>>>>> >>>>> >> >> > >> > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > Dear All,
>>>>>>> >>>>> >> >> > >> > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > Happy new year! Based on the existing 
>>>>>>> >>>>> >> >> > >> > > > > > > feedback from the community, we revised the 
>>>>>>> >>>>> >> >> > >> > > > > > > doc to cover session cluster support, the 
>>>>>>> >>>>> >> >> > >> > > > > > > concrete interface changes needed, and the 
>>>>>>> >>>>> >> >> > >> > > > > > > execution plan. Please take one more round of 
>>>>>>> >>>>> >> >> > >> > > > > > > review at your convenience.
>>>>>>> >>>>> >> >> > >> > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > https://docs.google.com/document/d/1aAwVjdZByA-0CHbgv16Me-vjaaDMCfhX7TzVVTuifYM/edit#
>>>>>>> >>>>> >> >> > >> > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > Best Regards
>>>>>>> >>>>> >> >> > >> > > > > > > Peter Huang
>>>>>>> >>>>> >> >> > >> > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > On Thu, Jan 2, 2020 at 11:29 AM Peter Huang 
>>>>>>> >>>>> >> >> > >> > > > > > > <huangzhenqiu0...@gmail.com> wrote:
>>>>>>> >>>>> >> >> > >> > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > > Hi Dian,
>>>>>>> >>>>> >> >> > >> > > > > > > > Thanks for giving us valuable feedback.
>>>>>>> >>>>> >> >> > >> > > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > > 1) It's better to have a whole design 
>>>>>>> >>>>> >> >> > >> > > > > > > > for this feature
>>>>>>> >>>>> >> >> > >> > > > > > > > Regarding the suggestion of enabling the 
>>>>>>> >>>>> >> >> > >> > > > > > > > cluster mode for session clusters as well, 
>>>>>>> >>>>> >> >> > >> > > > > > > > I think Flink already supports it: 
>>>>>>> >>>>> >> >> > >> > > > > > > > WebSubmissionExtension already allows 
>>>>>>> >>>>> >> >> > >> > > > > > > > users to start a job with a specified jar 
>>>>>>> >>>>> >> >> > >> > > > > > > > by using the web UI.
>>>>>>> >>>>> >> >> > >> > > > > > > > But we need to enable the feature from the 
>>>>>>> >>>>> >> >> > >> > > > > > > > CLI for both local and remote jars.
>>>>>>> >>>>> >> >> > >> > > > > > > > I will align with Yang Wang first about 
>>>>>>> >>>>> >> >> > >> > > > > > > > the details and update the design doc.
>>>>>>> >>>>> >> >> > >> > > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > > 2) It's better to consider the 
>>>>>>> >>>>> >> >> > >> > > > > > > > convenience for users, such
>>>>>>> >>>>> >> >> > >> as
>>>>>>> >>>>> >> >> > >> > > > > debugging
>>>>>>> >>>>> >> >> > >> > > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > > I am wondering whether we can store the 
>>>>>>> >>>>> >> >> > >> > > > > > > > exception in
>>>>>>> >>>>> >> >> > >> job graph
>>>>>>> >>>>> >> >> > >> > > > > > > > generation in application master. As no 
>>>>>>> >>>>> >> >> > >> > > > > > > > streaming graph can
>>>>>>> >>>>> >> >> > >> be
>>>>>>> >>>>> >> >> > >> > > > > > scheduled
>>>>>>> >>>>> >> >> > >> > > > > > > in
>>>>>>> >>>>> >> >> > >> > > > > > > > this case, no more TMs will 
>>>>>>> >>>>> >> >> > >> > > > > > > > be requested from
>>>>>>> >>>>> >> >> > >> > > FlinkRM.
>>>>>>> >>>>> >> >> > >> > > > > > > > If the AM is still running, users can 
>>>>>>> >>>>> >> >> > >> > > > > > > > still query it from
>>>>>>> >>>>> >> >> > >> CLI. As
>>>>>>> >>>>> >> >> > >> > > > it
>>>>>>> >>>>> >> >> > >> > > > > > > > requires more change, we can get some 
>>>>>>> >>>>> >> >> > >> > > > > > > > feedback from <
>>>>>>> >>>>> >> >> > >> > > > > > aljos...@apache.org
>>>>>>> >>>>> >> >> > >> > > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > > and @zjf...@gmail.com <zjf...@gmail.com>.
>>>>>>> >>>>> >> >> > >> > > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > > 3) It's better to consider the impact to 
>>>>>>> >>>>> >> >> > >> > > > > > > > the stability of
>>>>>>> >>>>> >> >> > >> the
>>>>>>> >>>>> >> >> > >> > > > cluster
>>>>>>> >>>>> >> >> > >> > > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > > I agree with Yang Wang's opinion.
>>>>>>> >>>>> >> >> > >> > > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > > Best Regards
>>>>>>> >>>>> >> >> > >> > > > > > > > Peter Huang
>>>>>>> >>>>> >> >> > >> > > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > > On Sun, Dec 29, 2019 at 9:44 PM Dian Fu <
>>>>>>> >>>>> >> >> > >> dian0511...@gmail.com>
>>>>>>> >>>>> >> >> > >> > > > > wrote:
>>>>>>> >>>>> >> >> > >> > > > > > > >
>>>>>>> >>>>> >> >> > >> > > > > > > >> Hi all,
>>>>>>> >>>>> >> >> > >> > > > > > > >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> Sorry to jump into this discussion. 
>>>>>>> >>>>> >> >> > >> > > > > > > >> Thanks everyone for the
>>>>>>> >>>>> >> >> > >> > > > > > discussion.
>>>>>>> >>>>> >> >> > >> > > > > > > >> I'm very interested in this topic 
>>>>>>> >>>>> >> >> > >> > > > > > > >> although I'm not an
>>>>>>> >>>>> >> >> > >> expert in
>>>>>>> >>>>> >> >> > >> > > > this
>>>>>>> >>>>> >> >> > >> > > > > > > part.
>>>>>>> >>>>> >> >> > >> > > > > > > >> So I'm glad to share my thoughts as 
>>>>>>> >>>>> >> >> > >> > > > > > > >> following:
>>>>>>> >>>>> >> >> > >> > > > > > > >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> 1) It's better to have a whole design 
>>>>>>> >>>>> >> >> > >> > > > > > > >> for this feature
>>>>>>> >>>>> >> >> > >> > > > > > > >> As we know, there are two deployment 
>>>>>>> >>>>> >> >> > >> > > > > > > >> modes: per-job mode
>>>>>>> >>>>> >> >> > >> and
>>>>>>> >>>>> >> >> > >> > > > session
>>>>>>> >>>>> >> >> > >> > > > > > > >> mode. I'm wondering which mode really 
>>>>>>> >>>>> >> >> > >> > > > > > > >> needs this feature.
>>>>>>> >>>>> >> >> > >> As the
>>>>>>> >>>>> >> >> > >> > > > > > design
>>>>>>> >>>>> >> >> > >> > > > > > > doc
>>>>>>> >>>>> >> >> > >> > > > > > > >> mentioned, per-job mode is more used 
>>>>>>> >>>>> >> >> > >> > > > > > > >> for streaming jobs and
>>>>>>> >>>>> >> >> > >> > > > session
>>>>>>> >>>>> >> >> > >> > > > > > > mode is
>>>>>>> >>>>> >> >> > >> > > > > > > >> usually used for batch jobs(Of course, 
>>>>>>> >>>>> >> >> > >> > > > > > > >> the job types and
>>>>>>> >>>>> >> >> > >> the
>>>>>>> >>>>> >> >> > >> > > > > > deployment
>>>>>>> >>>>> >> >> > >> > > > > > > >> modes are orthogonal). Usually 
>>>>>>> >>>>> >> >> > >> > > > > > > >> streaming job is only
>>>>>>> >>>>> >> >> > >> needed to
>>>>>>> >>>>> >> >> > >> > > be
>>>>>>> >>>>> >> >> > >> > > > > > > submitted
>>>>>>> >>>>> >> >> > >> > > > > > > >> once and it will run for days or weeks, 
>>>>>>> >>>>> >> >> > >> > > > > > > >> while batch jobs
>>>>>>> >>>>> >> >> > >> will be
>>>>>>> >>>>> >> >> > >> > > > > > > submitted
>>>>>>> >>>>> >> >> > >> > > > > > > >> more frequently compared with streaming 
>>>>>>> >>>>> >> >> > >> > > > > > > >> jobs. This means
>>>>>>> >>>>> >> >> > >> that
>>>>>>> >>>>> >> >> > >> > > > maybe
>>>>>>> >>>>> >> >> > >> > > > > > > session
>>>>>>> >>>>> >> >> > >> > > > > > > >> mode also needs this feature. However, 
>>>>>>> >>>>> >> >> > >> > > > > > > >> if we support this
>>>>>>> >>>>> >> >> > >> > > feature
>>>>>>> >>>>> >> >> > >> > > > in
>>>>>>> >>>>> >> >> > >> > > > > > > >> session mode, the application master 
>>>>>>> >>>>> >> >> > >> > > > > > > >> will become the new
>>>>>>> >>>>> >> >> > >> > > > centralized
>>>>>>> >>>>> >> >> > >> > > > > > > >> service(which should be solved). So in 
>>>>>>> >>>>> >> >> > >> > > > > > > >> this case, it's
>>>>>>> >>>>> >> >> > >> better to
>>>>>>> >>>>> >> >> > >> > > > > have
>>>>>>> >>>>> >> >> > >> > > > > > a
>>>>>>> >>>>> >> >> > >> > > > > > > >> complete design for both per-job mode 
>>>>>>> >>>>> >> >> > >> > > > > > > >> and session mode.
>>>>>>> >>>>> >> >> > >> > > > Furthermore,
>>>>>>> >>>>> >> >> > >> > > > > > > even
>>>>>>> >>>>> >> >> > >> > > > > > > >> if we can do it phase by phase, we need 
>>>>>>> >>>>> >> >> > >> > > > > > > >> to have a whole
>>>>>>> >>>>> >> >> > >> picture
>>>>>>> >>>>> >> >> > >> > > of
>>>>>>> >>>>> >> >> > >> > > > > how
>>>>>>> >>>>> >> >> > >> > > > > > > it
>>>>>>> >>>>> >> >> > >> > > > > > > >> works in both per-job mode and session 
>>>>>>> >>>>> >> >> > >> > > > > > > >> mode.
>>>>>>> >>>>> >> >> > >> > > > > > > >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> 2) It's better to consider the 
>>>>>>> >>>>> >> >> > >> > > > > > > >> convenience for users, such
>>>>>>> >>>>> >> >> > >> as
>>>>>>> >>>>> >> >> > >> > > > > > debugging
>>>>>>> >>>>> >> >> > >> > > > > > > >> After we finish this feature, the job 
>>>>>>> >>>>> >> >> > >> > > > > > > >> graph will be
>>>>>>> >>>>> >> >> > >> compiled in
>>>>>>> >>>>> >> >> > >> > > > the
>>>>>>> >>>>> >> >> > >> > > > > > > >> application master, which means that 
>>>>>>> >>>>> >> >> > >> > > > > > > >> users cannot easily
>>>>>>> >>>>> >> >> > >> get the
>>>>>>> >>>>> >> >> > >> > > > > > > exception
>>>>>>> >>>>> >> >> > >> > > > > > > >> message synchronously in the job client 
>>>>>>> >>>>> >> >> > >> > > > > > > >> if there are
>>>>>>> >>>>> >> >> > >> problems
>>>>>>> >>>>> >> >> > >> > > > during
>>>>>>> >>>>> >> >> > >> > > > > > the
>>>>>>> >>>>> >> >> > >> > > > > > > >> job graph compiling (especially for 
>>>>>>> >>>>> >> >> > >> > > > > > > >> platform users), such
>>>>>>> >>>>> >> >> > >> as the
>>>>>>> >>>>> >> >> > >> > > > > > > resource
>>>>>>> >>>>> >> >> > >> > > > > > > >> path is incorrect, the user program 
>>>>>>> >>>>> >> >> > >> > > > > > > >> itself has some
>>>>>>> >>>>> >> >> > >> problems,
>>>>>>> >>>>> >> >> > >> > > etc.
>>>>>>> >>>>> >> >> > >> > > > > > What
>>>>>>> >>>>> >> >> > >> > > > > > > I'm
>>>>>>> >>>>> >> >> > >> > > > > > > >> thinking is that maybe we should throw 
>>>>>>> >>>>> >> >> > >> > > > > > > >> the exceptions as
>>>>>>> >>>>> >> >> > >> early
>>>>>>> >>>>> >> >> > >> > > as
>>>>>>> >>>>> >> >> > >> > > > > > > possible
>>>>>>> >>>>> >> >> > >> > > > > > > >> (during job submission stage).
>>>>>>> >>>>> >> >> > >> > > > > > > >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> 3) It's better to consider the impact 
>>>>>>> >>>>> >> >> > >> > > > > > > >> to the stability of
>>>>>>> >>>>> >> >> > >> the
>>>>>>> >>>>> >> >> > >> > > > > cluster
>>>>>>> >>>>> >> >> > >> > > > > > > >> If we perform the compiling in the 
>>>>>>> >>>>> >> >> > >> > > > > > > >> application master, we
>>>>>>> >>>>> >> >> > >> should
>>>>>>> >>>>> >> >> > >> > > > > > > consider
>>>>>>> >>>>> >> >> > >> > > > > > > >> the impact of the compiling errors. 
>>>>>>> >>>>> >> >> > >> > > > > > > >> Although YARN could
>>>>>>> >>>>> >> >> > >> resume
>>>>>>> >>>>> >> >> > >> > > the
>>>>>>> >>>>> >> >> > >> > > > > > > >> application master in case of failures, 
>>>>>>> >>>>> >> >> > >> > > > > > > >> but in some cases
>>>>>>> >>>>> >> >> > >> the
>>>>>>> >>>>> >> >> > >> > > > > compiling
>>>>>>> >>>>> >> >> > >> > > > > > > >> failure may be a waste of cluster 
>>>>>>> >>>>> >> >> > >> > > > > > > >> resource and may impact
>>>>>>> >>>>> >> >> > >> the
>>>>>>> >>>>> >> >> > >> > > > > > stability
>>>>>>> >>>>> >> >> > >> > > > > > > of the
>>>>>>> >>>>> >> >> > >> > > > > > > >> cluster and the other jobs in the 
>>>>>>> >>>>> >> >> > >> > > > > > > >> cluster, such as the
>>>>>>> >>>>> >> >> > >> resource
>>>>>>> >>>>> >> >> > >> > > > path
>>>>>>> >>>>> >> >> > >> > > > > > is
>>>>>>> >>>>> >> >> > >> > > > > > > >> incorrect, the user program itself has 
>>>>>>> >>>>> >> >> > >> > > > > > > >> some problems(in
>>>>>>> >>>>> >> >> > >> this
>>>>>>> >>>>> >> >> > >> > > case,
>>>>>>> >>>>> >> >> > >> > > > > job
>>>>>>> >>>>> >> >> > >> > > > > > > >> failover cannot solve this kind of 
>>>>>>> >>>>> >> >> > >> > > > > > > >> problems) etc. In the
>>>>>>> >>>>> >> >> > >> current
>>>>>>> >>>>> >> >> > >> > > > > > > >> implementation, the compiling errors are 
>>>>>>> >>>>> >> >> > >> > > > > > > >> handled in the
>>>>>>> >>>>> >> >> > >> client
>>>>>>> >>>>> >> >> > >> > > side
>>>>>>> >>>>> >> >> > >> > > > > and
>>>>>>> >>>>> >> >> > >> > > > > > > there
>>>>>>> >>>>> >> >> > >> > > > > > > >> is no impact to the cluster at all.
>>>>>>> >>>>> >> >> > >> > > > > > > >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> Regarding 1), it's clearly pointed out 
>>>>>>> >>>>> >> >> > >> > > > > > > >> in the design doc
>>>>>>> >>>>> >> >> > >> that
>>>>>>> >>>>> >> >> > >> > > only
>>>>>>> >>>>> >> >> > >> > > > > > > per-job
>>>>>>> >>>>> >> >> > >> > > > > > > >> mode will be supported. However, I 
>>>>>>> >>>>> >> >> > >> > > > > > > >> think it's better to
>>>>>>> >>>>> >> >> > >> also
>>>>>>> >>>>> >> >> > >> > > > > consider
>>>>>>> >>>>> >> >> > >> > > > > > > the
>>>>>>> >>>>> >> >> > >> > > > > > > >> session mode in the design doc.
>>>>>>> >>>>> >> >> > >> > > > > > > >> Regarding 2) and 3), I have not seen 
>>>>>>> >>>>> >> >> > >> > > > > > > >> related sections
>>>>>>> >>>>> >> >> > >> in the
>>>>>>> >>>>> >> >> > >> > > > > design
>>>>>>> >>>>> >> >> > >> > > > > > > >> doc. It will be good if we can cover 
>>>>>>> >>>>> >> >> > >> > > > > > > >> them in the design
>>>>>>> >>>>> >> >> > >> doc.
>>>>>>> >>>>> >> >> > >> > > > > > > >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> Feel free to correct me if there is 
>>>>>>> >>>>> >> >> > >> > > > > > > >> anything I
>>>>>>> >>>>> >> >> > >> misunderstand.
>>>>>>> >>>>> >> >> > >> > > > > > > >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> Regards,
>>>>>>> >>>>> >> >> > >> > > > > > > >> Dian
>>>>>>> >>>>> >> >> > >> > > > > > > >>
>>>>>>> >>>>> >> >> > >> > > > > > > >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> > On Dec 27, 2019, at 3:13 AM, Peter Huang <
>>>>>>> >>>>> >> >> > >> huangzhenqiu0...@gmail.com>
>>>>>>> >>>>> >> >> > >> > > > wrote:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> > Hi Yang,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> > I can't agree more. The effort 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > definitely needs to align
>>>>>>> >>>>> >> >> > >> with
>>>>>>> >>>>> >> >> > >> > > > the
>>>>>>> >>>>> >> >> > >> > > > > > > final
>>>>>>> >>>>> >> >> > >> > > > > > > >> > goal of FLIP-73.
>>>>>>> >>>>> >> >> > >> > > > > > > >> > I am thinking about whether we can 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > achieve the goal with
>>>>>>> >>>>> >> >> > >> two
>>>>>>> >>>>> >> >> > >> > > > > phases.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> > 1) Phase I
>>>>>>> >>>>> >> >> > >> > > > > > > >> > As the CliFrontend will not be 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > deprecated soon, we can
>>>>>>> >>>>> >> >> > >> still
>>>>>>> >>>>> >> >> > >> > > > use
>>>>>>> >>>>> >> >> > >> > > > > > the
>>>>>>> >>>>> >> >> > >> > > > > > > >> > deployMode flag there,
>>>>>>> >>>>> >> >> > >> > > > > > > >> > pass the program info through Flink 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > configuration,  use
>>>>>>> >>>>> >> >> > >> the
>>>>>>> >>>>> >> >> > >> > > > > > > >> > ClassPathJobGraphRetriever
>>>>>>> >>>>> >> >> > >> > > > > > > >> > to generate the job graph in 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > ClusterEntrypoints of yarn
>>>>>>> >>>>> >> >> > >> and
>>>>>>> >>>>> >> >> > >> > > > > > > Kubernetes.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> > 2) Phase II
>>>>>>> >>>>> >> >> > >> > > > > > > >> > In  AbstractJobClusterExecutor, the 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > job graph is
>>>>>>> >>>>> >> >> > >> generated in
>>>>>>> >>>>> >> >> > >> > > > the
>>>>>>> >>>>> >> >> > >> > > > > > > >> execute
>>>>>>> >>>>> >> >> > >> > > > > > > >> > function. We can still
>>>>>>> >>>>> >> >> > >> > > > > > > >> > use the deployMode in it. With 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > deployMode = cluster, the
>>>>>>> >>>>> >> >> > >> > > execute
>>>>>>> >>>>> >> >> > >> > > > > > > >> function
>>>>>>> >>>>> >> >> > >> > > > > > > >> > only starts the cluster.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> > When 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > {Yarn/Kubernetes}PerJobClusterEntrypoint
>>>>>>> >>>>> >> >> > >> > > > > > > >> >  starts,
>>>>>>> >>>>> >> >> > >> it will
>>>>>>> >>>>> >> >> > >> > > > > start
>>>>>>> >>>>> >> >> > >> > > > > > > the
>>>>>>> >>>>> >> >> > >> > > > > > > >> > dispatcher first, then we can use
>>>>>>> >>>>> >> >> > >> > > > > > > >> > a ClusterEnvironment similar to 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > ContextEnvironment to
>>>>>>> >>>>> >> >> > >> submit
>>>>>>> >>>>> >> >> > >> > > the
>>>>>>> >>>>> >> >> > >> > > > > job
>>>>>>> >>>>> >> >> > >> > > > > > > >> with
>>>>>>> >>>>> >> >> > >> > > > > > > >> > jobName to the local
>>>>>>> >>>>> >> >> > >> > > > > > > >> > dispatcher. For the details, we need 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > more investigation.
>>>>>>> >>>>> >> >> > >> Let's
>>>>>>> >>>>> >> >> > >> > > > > wait
>>>>>>> >>>>> >> >> > >> > > > > > > >> > for @Aljoscha
>>>>>>> >>>>> >> >> > >> > > > > > > >> > Krettek <aljos...@apache.org> @Till 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > Rohrmann <
>>>>>>> >>>>> >> >> > >> > > > > trohrm...@apache.org
>>>>>>> >>>>> >> >> > >> > > > > > >'s
>>>>>>> >>>>> >> >> > >> > > > > > > >> > feedback after the holiday season.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> > Thank you in advance. Merry Christmas 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > and Happy New
>>>>>>> >>>>> >> >> > >> Year!!!
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> > Best Regards
>>>>>>> >>>>> >> >> > >> > > > > > > >> > Peter Huang
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> > On Wed, Dec 25, 2019 at 1:08 AM Yang 
>>>>>>> >>>>> >> >> > >> > > > > > > >> > Wang <
>>>>>>> >>>>> >> >> > >> > > > danrtsey...@gmail.com>
>>>>>>> >>>>> >> >> > >> > > > > > > >> wrote:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> Hi Peter,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> I think we need to reconsider 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> tison's suggestion
>>>>>>> >>>>> >> >> > >> seriously.
>>>>>>> >>>>> >> >> > >> > > > After
>>>>>>> >>>>> >> >> > >> > > > > > > >> FLIP-73,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> the deployJobCluster has
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> been moved into 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> `JobClusterExecutor#execute`. It 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> should
>>>>>>> >>>>> >> >> > >> not be
>>>>>>> >>>>> >> >> > >> > > > > > > perceived
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> for `CliFrontend`. That
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> means the user program will *ALWAYS* 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> be executed on
>>>>>>> >>>>> >> >> > >> client
>>>>>>> >>>>> >> >> > >> > > > side.
>>>>>>> >>>>> >> >> > >> > > > > > This
>>>>>>> >>>>> >> >> > >> > > > > > > >> is
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> the by design behavior.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> So, we could not just add `if(client 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> mode) .. else
>>>>>>> >>>>> >> >> > >> if(cluster
>>>>>>> >>>>> >> >> > >> > > > > mode)
>>>>>>> >>>>> >> >> > >> > > > > > > >> ...`
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> codes in `CliFrontend` to bypass
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> the executor. We need to find a 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> clean way to decouple
>>>>>>> >>>>> >> >> > >> > > executing
>>>>>>> >>>>> >> >> > >> > > > > > user
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> program and deploying per-job
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> cluster. Based on this, we could 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> support to execute user
>>>>>>> >>>>> >> >> > >> > > > program
>>>>>>> >>>>> >> >> > >> > > > > on
>>>>>>> >>>>> >> >> > >> > > > > > > >> client
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> or master side.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> Maybe Aljoscha and Jeff could give 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> some good
>>>>>>> >>>>> >> >> > >> suggestions.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> Best,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> Yang
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> Peter Huang 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> <huangzhenqiu0...@gmail.com> 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >> wrote on Wed, Dec 25, 2019
>>>>>>> >>>>> >> >> > >> > > > > at 4:03 AM:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> Hi Jingjing,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> The improvement proposed is a 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> deployment option for
>>>>>>> >>>>> >> >> > >> CLI. For
>>>>>>> >>>>> >> >> > >> > > > SQL
>>>>>>> >>>>> >> >> > >> > > > > > > based
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> Flink application, it is more 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> convenient to use the
>>>>>>> >>>>> >> >> > >> existing
>>>>>>> >>>>> >> >> > >> > > > > model
>>>>>>> >>>>> >> >> > >> > > > > > > in
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> SqlClient in which
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> the job graph is generated within 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> SqlClient. After
>>>>>>> >>>>> >> >> > >> adding
>>>>>>> >>>>> >> >> > >> > > the
>>>>>>> >>>>> >> >> > >> > > > > > > delayed
>>>>>>> >>>>> >> >> > >> > > > > > > >> job
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> graph generation, I think 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> no change is needed
>>>>>>> >>>>> >> >> > >> for
>>>>>>> >>>>> >> >> > >> > > > your
>>>>>>> >>>>> >> >> > >> > > > > > > side.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> Best Regards
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> Peter Huang
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> On Wed, Dec 18, 2019 at 6:01 AM 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> jingjing bai <
>>>>>>> >>>>> >> >> > >> > > > > > > >> baijingjing7...@gmail.com>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> wrote:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> hi peter:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>    we have extended SqlClient to 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> support sql job
>>>>>>> >>>>> >> >> > >> submit in
>>>>>>> >>>>> >> >> > >> > > web
>>>>>>> >>>>> >> >> > >> > > > > > base
>>>>>>> >>>>> >> >> > >> > > > > > > on
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> flink 1.9.   we support submit to 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> yarn on per job
>>>>>>> >>>>> >> >> > >> mode too.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>    in this case, the job graph 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> generated  on client
>>>>>>> >>>>> >> >> > >> side
>>>>>>> >>>>> >> >> > >> > > .  I
>>>>>>> >>>>> >> >> > >> > > > > > think
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> this
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> discussion is mainly to improve api 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> programme.  but in my
>>>>>>> >>>>> >> >> > >> case ,
>>>>>>> >>>>> >> >> > >> > > > > there
>>>>>>> >>>>> >> >> > >> > > > > > is
>>>>>>> >>>>> >> >> > >> > > > > > > >> no
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> jar to upload but only a sql 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> string .
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>    do you have more suggestions to 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> improve for sql mode
>>>>>>> >>>>> >> >> > >> or it
>>>>>>> >>>>> >> >> > >> > > is
>>>>>>> >>>>> >> >> > >> > > > > > only a
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> switch for api programme?
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> best
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> bai jj
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> Yang Wang <danrtsey...@gmail.com> 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>> wrote on Wed, Dec 18, 2019
>>>>>>> >>>>> >> >> > >> at 7:21 PM:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> I just want to revive this 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> discussion.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> Recently, I have been thinking about how 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> to natively run
>>>>>>> >>>>> >> >> > >> flink
>>>>>>> >>>>> >> >> > >> > > > > per-job
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> cluster on
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> Kubernetes.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> The per-job mode on Kubernetes is 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> very different
>>>>>>> >>>>> >> >> > >> from on
>>>>>>> >>>>> >> >> > >> > > > Yarn.
>>>>>>> >>>>> >> >> > >> > > > > > And
>>>>>>> >>>>> >> >> > >> > > > > > > >> we
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> will
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> have
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> the same deployment requirements 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> to the client and
>>>>>>> >>>>> >> >> > >> entry
>>>>>>> >>>>> >> >> > >> > > > > point.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> 1. Flink client does not always need a 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> local jar to start
>>>>>>> >>>>> >> >> > >> a
>>>>>>> >>>>> >> >> > >> > > Flink
>>>>>>> >>>>> >> >> > >> > > > > > > per-job
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> cluster. We could
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> support multiple URI schemes. For 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> example,
>>>>>>> >>>>> >> >> > >> > > > file:///path/of/my.jar
>>>>>>> >>>>> >> >> > >> > > > > > > means
>>>>>>> >>>>> >> >> > >> > > > > > > >> a
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> jar
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> located
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> at client side,
>>>>>>> >>>>> >> >> > >> hdfs://myhdfs/user/myname/flink/my.jar
>>>>>>> >>>>> >> >> > >> > > > means a
>>>>>>> >>>>> >> >> > >> > > > > > jar
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> located
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> at
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> remote hdfs, 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> local:///path/in/image/my.jar 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> means a
>>>>>>> >>>>> >> >> > >> jar
>>>>>>> >>>>> >> >> > >> > > > located
>>>>>>> >>>>> >> >> > >> > > > > > at
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> jobmanager side.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> 2. Support running user program 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> on master side. This
>>>>>>> >>>>> >> >> > >> also
>>>>>>> >>>>> >> >> > >> > > > > means
>>>>>>> >>>>> >> >> > >> > > > > > > the
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> entry
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> point
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> will generate the job graph on 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> master side. We could
>>>>>>> >>>>> >> >> > >> use
>>>>>>> >>>>> >> >> > >> > > the
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> ClasspathJobGraphRetriever
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> or start a local Flink client to 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> achieve this
>>>>>>> >>>>> >> >> > >> purpose.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> cc tison, Aljoscha & Kostas Do 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> you think this is the
>>>>>>> >>>>> >> >> > >> right
>>>>>>> >>>>> >> >> > >> > > > > > > >> direction we
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> need to work?
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> tison <wander4...@gmail.com> 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> wrote on Thu, Dec 12, 2019
>>>>>>> >>>>> >> >> > >> at 4:48 PM:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> A quick idea is that we separate 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> the deployment
>>>>>>> >>>>> >> >> > >> from user
>>>>>>> >>>>> >> >> > >> > > > > > program
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> that
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> it
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> has always been done
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> outside the program. On user 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> program executed there
>>>>>>> >>>>> >> >> > >> is
>>>>>>> >>>>> >> >> > >> > > > > always a
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> ClusterClient that communicates 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> with
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> an existing cluster, remote or 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> local. It will be
>>>>>>> >>>>> >> >> > >> another
>>>>>>> >>>>> >> >> > >> > > > > thread
>>>>>>> >>>>> >> >> > >> > > > > > > so
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> just
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> for
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> your information.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> Best,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> tison.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> tison <wander4...@gmail.com> 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> wrote on Thu, Dec 12, 2019
>>>>>>> >>>>> >> >> > >> at 4:40 PM:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> Hi Peter,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> Another concern I realized 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> recently is that with
>>>>>>> >>>>> >> >> > >> current
>>>>>>> >>>>> >> >> > >> > > > > > > Executors
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> abstraction(FLIP-73)
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> I'm afraid that user program is 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> designed to ALWAYS
>>>>>>> >>>>> >> >> > >> run
>>>>>>> >>>>> >> >> > >> > > on
>>>>>>> >>>>> >> >> > >> > > > > the
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>> client
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> side.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> Specifically,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> we deploy the job in executor 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> when env.execute
>>>>>>> >>>>> >> >> > >> is called.
>>>>>>> >>>>> >> >> > >> > > > This
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> abstraction
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> possibly prevents
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> Flink from running the user program on the 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> cluster side.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> For your proposal, in this case 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> we already
>>>>>>> >>>>> >> >> > >> compiled the
>>>>>>> >>>>> >> >> > >> > > > > > program
>>>>>>> >>>>> >> >> > >> > > > > > > >> and
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> run
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>> on
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> the client side,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> even if we deploy a cluster and 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> retrieve job graph
>>>>>>> >>>>> >> >> > >> from
>>>>>>> >>>>> >> >> > >> > > > program
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>> metadata, it
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> doesn't make
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> much sense.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> cc Aljoscha & Kostas what do 
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> you think about this
>>>>>>> >>>>> >> >> > >> > > > > constraint?
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> Best,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> tison.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>> On Tue, Dec 10, 2019 at 12:45 PM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> Hi Tison,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> Yes, you are right. I think I made the wrong argument
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> in the doc. Basically, the jar packaging problem only
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> affects platform users. In our internal deploy service,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> we further optimized the deployment latency by letting
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> users package flink-runtime together with the uber jar,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> so that we don't need to consider support for multiple
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> Flink versions for now. In the session client mode,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> Flink libs are shipped anyway as YARN local resources,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> so users don't actually need to package those libs into
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> the job jar.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> Best Regards
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> Peter Huang
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>> On Mon, Dec 9, 2019 at 8:35 PM tison <wander4...@gmail.com> wrote:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> 3. What do you mean about the package? Do users need
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> to compile their jars including flink-clients,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> flink-optimizer, flink-table code?
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>> The answer should be no, because they exist in the
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>> system classpath.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>> Best,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>> tison.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>> On Tue, Dec 10, 2019 at 12:18 PM Yang Wang <danrtsey...@gmail.com> wrote:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> Hi Peter,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> Thanks a lot for starting this discussion. I think
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> this is a very useful feature.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> This is not only an issue for YARN. I am focused on
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> the Flink on Kubernetes integration and have come
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> across the same problem. I do not want the job graph
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> to be generated on the client side. Instead, the user
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> jars are built into a user-defined image. When the
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> job manager is launched, we just need to generate the
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> job graph based on the local user jars.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> I have some small suggestions about this.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> 1. `ProgramJobGraphRetriever` is very similar to
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> `ClasspathJobGraphRetriever`. The difference is that
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> the former needs `ProgramMetadata` and the latter
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> needs some arguments. Is it possible to have a
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> unified `JobGraphRetriever` that supports both?
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> 2. Is it possible to not use a local user jar to
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> start a per-job cluster? In your case, the user jars
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> already exist on HDFS and we do need to download the
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> jars to the deployer service. Currently, we always
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> need a local user jar to start a Flink cluster. It
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> would be great if we could support remote user jars.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>> In the implementation, we assume users package
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>> flink-clients, flink-optimizer, flink-table together
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>> within the job jar. Otherwise, the job graph
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>> generation within JobClusterEntryPoint will fail.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> 3. What do you mean about the package? Do users need
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> to compile their jars including flink-clients,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> flink-optimizer, flink-table code?
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> Best,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> Yang
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>> On Tue, Dec 10, 2019 at 2:37 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> Dear All,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> Recently, the Flink community started to improve the
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> YARN cluster descriptor to make the job jar and
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> config files configurable from the CLI. This improves
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> the flexibility of Flink deployment in YARN per-job
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> mode. For platform users who manage tens of hundreds
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> of streaming pipelines for the whole org or company,
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> we found that job graph generation on the client side
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> is another pain point. Thus, we want to propose a
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> configurable feature for FlinkYarnSessionCli. The
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> feature allows users to choose job graph generation
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> in the Flink ClusterEntryPoint, so that the job jar
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> doesn't need to be available locally for job graph
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> generation. The proposal is organized as a FLIP:
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Delayed+JobGraph+Generation
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> Any questions and suggestions are welcome. Thank you
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> in advance.
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> Best Regards
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> Peter Huang
>>>>>>> >>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>
