[ 
https://issues.apache.org/jira/browse/FLINK-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880399#comment-16880399
 ] 

Till Rohrmann commented on FLINK-13132:
---------------------------------------

Hi [~ZhenqiuHuang], before diving into the solution, could you tell us why 
it is so costly to download the jar and generate the {{JobGraph}} on the client 
side?

As [~maguowei] said, the cluster also needs to get access to the user code jar 
somehow. Hence, it would also need to download the jar from somewhere. Maybe 
you can outline a bit how you envision the whole procedure to work in the 
future and how you intend to solve the problem.

A problem with the {{ClassPathJobGraphRetriever}} is that it generates the 
{{JobGraph}} from the user code every time the {{ClusterEntrypoint}} is 
started. Hence, if the user code contains non-deterministic logic, it will 
generate different {{JobGraphs}} across master failovers. Admittedly, this is 
also a problem for the {{StandaloneJobClusterEntrypoint}}. But before spreading 
this problem to the Yarn entrypoints as well, we should think about a possible 
solution.
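
To make the failover concern concrete, here is a minimal, self-contained sketch. It does not use any Flink API: the "plan" is just a list of vertex names standing in for a {{JobGraph}}, and all class and method names below are hypothetical. It only illustrates that re-running non-deterministic user code on every start can yield a different result each time.

```java
import java.util.Arrays;
import java.util.List;
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical illustration only; a List<String> stands in for a JobGraph.
public class JobGraphDeterminismSketch {

    // Stand-in for a user main method with non-deterministic logic:
    // a random suffix ends up in the generated plan.
    public static List<String> nonDeterministicPlan() {
        return Arrays.asList("source-" + UUID.randomUUID(), "sink");
    }

    // Stand-in for a user main method that always builds the same plan.
    public static List<String> deterministicPlan() {
        return Arrays.asList("source-orders", "sink");
    }

    // Mimics an entrypoint that regenerates the plan on every start:
    // once on the initial start and once more after a master failover.
    public static boolean stableAcrossRestarts(Supplier<List<String>> userMain) {
        List<String> firstStart = userMain.get();
        List<String> afterFailover = userMain.get();
        return firstStart.equals(afterFailover);
    }

    public static void main(String[] args) {
        // The regenerated plan differs from the original one.
        System.out.println(stableAcrossRestarts(
                JobGraphDeterminismSketch::nonDeterministicPlan));
        // The regenerated plan matches the original one.
        System.out.println(stableAcrossRestarts(
                JobGraphDeterminismSketch::deterministicPlan));
    }
}
```

In this toy model, the first check reports a mismatch and the second a match. Whether and how a real {{JobGraph}} differs depends on what the user code does.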

> Allow ClusterEntrypoints use user main method to generate job graph
> -------------------------------------------------------------------
>
>                 Key: FLINK-13132
>                 URL: https://issues.apache.org/jira/browse/FLINK-13132
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / YARN
>    Affects Versions: 1.8.0, 1.8.1
>            Reporter: Zhenqiu Huang
>            Assignee: Zhenqiu Huang
>            Priority: Minor
>
> We are building a service that can transparently deploy a job to different 
> cluster management systems, such as Yarn and another internal system. It is 
> very costly to download the jar and generate the JobGraph on the client side. 
> Thus, I want to propose an improvement that makes the Yarn entrypoints 
> configurable to use either FileJobGraphRetriever or 
> ClassPathJobGraphRetriever. This is actually a long-standing TODO in 
> AbstractYarnClusterDescriptor at line 834.
> https://github.com/apache/flink/blob/21468e0050dc5f97de5cfe39885e0d3fd648e399/flink-yarn/src/main/java/org/apache/flink/yarn/AbstractYarnClusterDescriptor.java#L834



