[
https://issues.apache.org/jira/browse/FLINK-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044280#comment-17044280
]
Gary Yao commented on FLINK-16001:
----------------------------------
[~wind_ljy] I am in favour of optimizing this if we can prove that it saves a
significant amount of time. I have my doubts that we can save much here,
considering that {{toPipelinedRegionsSet()}} is linear in the number of
distinct pipelined regions. In many jobs we will have only one region or
_"parallelism of the job"_ number of regions. If you want to work on this, can
you create a JMH microbenchmark?
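
As a starting point for such a comparison, here is a minimal sketch of the two
variants a benchmark would measure: a stream-based conversion and a plain loop
over the distinct regions. Everything in it is hypothetical and is not Flink's
actual code: the class {{RegionConversionSketch}}, the {{Region}} wrapper, the
{{buildRegions}} helper, and the use of {{Integer}} vertex IDs are assumptions
made only for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class RegionConversionSketch {

    // Hypothetical stand-in for a pipelined region: wraps a set of vertex IDs.
    public static final class Region {
        final Set<Integer> vertexIds;

        Region(Set<Integer> vertexIds) {
            this.vertexIds = vertexIds;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Region && vertexIds.equals(((Region) o).vertexIds);
        }

        @Override
        public int hashCode() {
            return vertexIds.hashCode();
        }
    }

    // Stream-based variant: one pass over the distinct regions.
    public static Set<Region> withStreams(Set<Set<Integer>> distinctRegions) {
        return distinctRegions.stream()
                .map(Region::new)
                .collect(Collectors.toSet());
    }

    // Loop-based variant: also one pass, without the stream pipeline.
    public static Set<Region> withLoop(Set<Set<Integer>> distinctRegions) {
        Set<Region> result = new HashSet<>();
        for (Set<Integer> vertexIds : distinctRegions) {
            result.add(new Region(vertexIds));
        }
        return result;
    }

    // Builds n single-vertex regions, mimicking the
    // "parallelism of the job" number of regions case (assumed input shape).
    public static Set<Set<Integer>> buildRegions(int n) {
        Set<Set<Integer>> regions = new HashSet<>();
        for (int i = 0; i < n; i++) {
            regions.add(Collections.singleton(i));
        }
        return regions;
    }

    public static void main(String[] args) {
        Set<Set<Integer>> regions = buildRegions(1000);
        // Both variants should produce equal sets; only their cost may differ.
        System.out.println(withStreams(regions).equals(withLoop(regions)));
    }
}
```

Both methods visit each region exactly once, so any difference would come from
constant overhead rather than asymptotic cost. To get trustworthy numbers, the
two methods would need to be run under JMH rather than timed from {{main}}.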
> Avoid using Java Streams in construction of ExecutionGraph
> ----------------------------------------------------------
>
> Key: FLINK-16001
> URL: https://issues.apache.org/jira/browse/FLINK-16001
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.10.0
> Reporter: Jiayi Liao
> Priority: Major
>
> I think we should avoid {{Java Streams}} in the construction of
> {{ExecutionGraph}}, for example in the function {{toPipelinedRegionsSet}} in
> {{PipelinedRegionComputeUtil}}, because job submission is definitely
> performance sensitive, especially when {{distinctRegions}} has a large
> cardinality.
> This also applies to some other places in the package
> {{org.apache.flink.runtime.executiongraph}}.
> cc [~trohrmann] [~gjy] [~zhuzh]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)