[ 
https://issues.apache.org/jira/browse/FLINK-19761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218891#comment-17218891
 ] 

Xuannan Su commented on FLINK-19761:
------------------------------------

[~trohrmann] The assumption is that the ShuffleMaster knows about the 
cluster partition you are asking for. That is, if NettyShuffleService is 
used, the cluster partition is only accessible within the same cluster. If 
you want to share the intermediate result across different clusters, you 
need an external shuffle service whose lifecycle is not bound to the cluster.
If the client asks for a cluster partition that the shuffle master doesn't 
know about, the job will fail, and it is up to the client to decide what to 
do. For the cached table in the Table API, the client can re-execute the 
original graph that produces the intermediate result.
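
To illustrate the intended behavior, here is a minimal, self-contained sketch.
It does not use the real Flink interfaces: ShuffleMasterSketch, lookup,
runConsumingJob and readOrRecompute are hypothetical names, and plain strings
stand in for IntermediateDataSetID and ShuffleDescriptor. It only shows the
shape of the idea: the consuming job fails if the shuffle master does not know
the requested cluster partition, and the client decides to fall back to
re-executing the original graph.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class ClusterPartitionLookupSketch {

    /** Hypothetical stand-in for a shuffle master that remembers registered partitions. */
    static final class ShuffleMasterSketch {
        // Maps a data set ID (a String here, an assumption) to its registered descriptors.
        private final Map<String, List<String>> descriptorsByDataSet = new HashMap<>();

        void registerPartition(String dataSetId, String descriptor) {
            descriptorsByDataSet
                    .computeIfAbsent(dataSetId, k -> new ArrayList<>())
                    .add(descriptor);
        }

        /** Returns a copy of the descriptors for the data set, or null if it is unknown. */
        List<String> lookup(String dataSetId) {
            List<String> found = descriptorsByDataSet.get(dataSetId);
            return found == null ? null : new ArrayList<>(found);
        }
    }

    /** Models the consuming job: it fails if the cluster partition is unknown. */
    static List<String> runConsumingJob(ShuffleMasterSketch master, String dataSetId) {
        List<String> descriptors = master.lookup(dataSetId);
        if (descriptors == null) {
            throw new IllegalStateException("Unknown cluster partition: " + dataSetId);
        }
        return descriptors;
    }

    /** Client-side policy: on failure, re-execute the graph that produces the result. */
    static List<String> readOrRecompute(
            ShuffleMasterSketch master,
            String dataSetId,
            Supplier<List<String>> reExecuteOriginalGraph) {
        try {
            return runConsumingJob(master, dataSetId);
        } catch (IllegalStateException unknownPartition) {
            return reExecuteOriginalGraph.get();
        }
    }

    public static void main(String[] args) {
        ShuffleMasterSketch master = new ShuffleMasterSketch();
        master.registerPartition("ds-1", "descriptor-a");

        // Known data set: the registered descriptor is reused.
        System.out.println(readOrRecompute(master, "ds-1", () -> Arrays.asList("recomputed")));
        // Unknown data set: the consuming job fails and the client recomputes.
        System.out.println(readOrRecompute(master, "ds-2", () -> Arrays.asList("recomputed")));
    }
}
```

With an external shuffle service, the same lookup could in principle be
answered for partitions produced by another cluster; with
NettyShuffleService, it would only succeed inside the producing cluster.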

> Add lookup method for registered ShuffleDescriptor in ShuffleMaster
> -------------------------------------------------------------------
>
>                 Key: FLINK-19761
>                 URL: https://issues.apache.org/jira/browse/FLINK-19761
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>            Reporter: Xuannan Su
>            Priority: Major
>
> Currently, the ShuffleMaster can register a partition and get the shuffle 
> descriptor. However, it lacks the ability to look up the registered 
> ShuffleDescriptors that belong to an IntermediateResult by the 
> IntermediateDataSetID.
> Adding a lookup method to the ShuffleMaster makes reusing the cluster 
> partition easier. For example, we don't have to return the 
> ShuffleDescriptor to the client just so that another job can somehow encode 
> the ShuffleDescriptor in its JobGraph to consume the cluster partition. 
> Instead, we only need to return the IntermediateDataSetID, which another job 
> can use to look up the ShuffleDescriptor.
> With the lookup method in ShuffleMaster, if we have an external shuffle 
> service and the lifecycle of the IntermediateResult is not bound to the 
> cluster, a job running on another cluster can look up the ShuffleDescriptor 
> and reuse the IntermediateResult even if the cluster that produced the 
> IntermediateResult has been shut down.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
