[
https://issues.apache.org/jira/browse/FLINK-36270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Abhi Gupta updated FLINK-36270:
-------------------------------
Description: While testing the DDB Streams connector, we found that a lot of
time is spent in the markAsFinished function because it calls
splitsAvailableForAssignment, which is O(n). Since n shards can be marked as
finished concurrently, the overall algorithm becomes O(n^2). Change the
algorithm to assign only child shards when a parent is finished. We can start
tracking the child shards of a shard in SplitTracker (was: In DDB Streams connector, while testing we
found out that when we are spending a lot of time in markAsFinished function
because we are calling splitsAvailableForAssignment which is O(n), and given n
shards can be marked as finished concurrently, the algorithm becomes O(n^2).
Change the algo to assign only child shards when a parent is finished. We can
start tracking child shards of a shard in SplitTracker)
> DDB Streams Connector performance issue due to splitsAvailableForAssignment
> function
> ------------------------------------------------------------------------------------
>
> Key: FLINK-36270
> URL: https://issues.apache.org/jira/browse/FLINK-36270
> Project: Flink
> Issue Type: Bug
> Components: Connectors / DynamoDB
> Reporter: Abhi Gupta
> Priority: Major
>
> While testing the DDB Streams connector, we found that a lot of time is
> spent in the markAsFinished function because it calls
> splitsAvailableForAssignment, which is O(n). Since n shards can be marked
> as finished concurrently, the overall algorithm becomes O(n^2). Change the
> algorithm to assign only child shards when a parent is finished. We can
> start tracking the child shards of a shard in SplitTracker.
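
The proposal in the description could look roughly like the sketch below. It is
only an illustration under stated assumptions, not the connector's actual
SplitTracker: the class name ChildTrackingSplitTracker, its fields, and the
method signatures are hypothetical, shards are represented by plain string IDs,
and each shard is assumed to have at most one parent. The point it shows is that
finishing a parent inspects only that parent's children instead of rescanning
every tracked split.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; names and signatures are assumptions, not the connector's API. */
class ChildTrackingSplitTracker {
    // Parent shard ID -> IDs of its child shards (the new index proposed in the issue).
    private final Map<String, List<String>> childrenByParent = new HashMap<>();
    private final Set<String> assigned = new HashSet<>();
    private final Set<String> finished = new HashSet<>();

    /** Registers a discovered shard; parentId is null for a shard without a parent. */
    void addShard(String shardId, String parentId) {
        if (parentId != null) {
            childrenByParent.computeIfAbsent(parentId, k -> new ArrayList<>()).add(shardId);
        }
    }

    void markAsAssigned(String shardId) {
        assigned.add(shardId);
    }

    boolean isFinished(String shardId) {
        return finished.contains(shardId);
    }

    /**
     * Marks a shard as finished and returns only its not-yet-assigned children.
     * The cost is proportional to the number of children of this shard,
     * not to the total number of tracked shards.
     */
    List<String> markAsFinished(String shardId) {
        finished.add(shardId);
        List<String> ready = new ArrayList<>();
        for (String child : childrenByParent.getOrDefault(shardId, Collections.emptyList())) {
            if (!assigned.contains(child)) {
                ready.add(child);
            }
        }
        return ready;
    }
}
```

With this shape, marking n shards as finished costs time proportional to the
total number of children rather than n full scans. The sketch does not handle a
child that is registered after its parent has already finished; a real
implementation would need to cover that case.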
--
This message was sent by Atlassian Jira
(v8.20.10#820010)