[
https://issues.apache.org/jira/browse/FLINK-24316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhu Zhu closed FLINK-24316.
---------------------------
Resolution: Duplicate
> Refactor IntermediateDataSet to have only one consumer
> ------------------------------------------------------
>
> Key: FLINK-24316
> URL: https://issues.apache.org/jira/browse/FLINK-24316
> Project: Flink
> Issue Type: Technical Debt
> Components: Runtime / Coordination
> Reporter: Zhilong Hong
> Priority: Major
> Fix For: 1.15.0
>
>
> Currently, IntermediateDataSet has an assumption that an IntermediateDataSet
> can be consumed by multiple consumers. However, this assumption has never
> came to reality. For an upstream vertex that is connected to multiple
> downstream vertices, it will generate multiple IntermediateDataSets. Each
> consumer is corresponding to one IntermediateDataSet.
> Furthermore, there are several checks in the code to make sure that an
> IntermediateDataSet has only one consumer, like
> {{Execution#getPartitionMaxParallelism}},
> {{SsgNetworkMemoryCalculationUtils#getMaxSubpartitionNums}}, and etc. These
> checks make the logic complicated. And it's hard to guarantee the
> consistency, because we can't make sure all the calls to {{getConsumers}}
> have this check in the future.
> Since multiple consumers for IntermediateDataSet may not come true in a long
> time, we think maybe it's better to refactor IntermediateDataSet to have only
> one consumer, as the discussion mentioned in
> [https://github.com/apache/flink/pull/16856#discussion_r691878539].
> If we are going to support multiple consumers for IntermediateDataSet in the
> future, we can just bring it back and refactor all the usages.
> As IntermediateDataSet changes, all classes related to it should change,
> including IntermediateResult, IntermediateResultPartition,
> DefaultResultPartition, and etc. All the related sanity checks need to be
> removed, too.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)