[ 
https://issues.apache.org/jira/browse/FLINK-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019220#comment-17019220
 ] 

Chongchen Chen commented on FLINK-15249:
----------------------------------------

[^new.diff]  Thank you, [~zhuzh] . I have improved the implementation, now for 
your testcase, they have close performance. actually, we cannot get any benefit 
from Disjoint Set in your testcase. your testcase is a simple chain, the cost 
of set union is very low, it's trying to union a big set and a set whose size 
is 1,  the cost is close to the cost of adding an element to a set. If you want 
to see performance improvement, you should construct some complex graph, e.g.  
DAG, tree. by the way, after reimplementation, for my testcase above, the 
disjoint set only costs 1700ms on my computer. it costs  40% time comparing to 
naive implementation. 

> Improve PipelinedRegions calculation with Union Set
> ---------------------------------------------------
>
>                 Key: FLINK-15249
>                 URL: https://issues.apache.org/jira/browse/FLINK-15249
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>            Reporter: Chongchen Chen
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: PipelinedRegionComputeUtil.diff, 
> RegionFailoverPerfTest.java, new.diff
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Union Set's Merge Set cost is O(1). current implementation is O(N). the 
> attachment is patch.
> [Disjoint Set Data 
> Structure|[https://en.wikipedia.org/wiki/Disjoint-set_data_structure]]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to