[
https://issues.apache.org/jira/browse/FLINK-31008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17686963#comment-17686963
]
Jingsong Lee commented on FLINK-31008:
--------------------------------------
[~Ming Li] Assigned to you.
> [Flink][Table Store] The Split allocation of the same bucket in
> ContinuousFileSplitEnumerator may be out of order
> -----------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-31008
> URL: https://issues.apache.org/jira/browse/FLINK-31008
> Project: Flink
> Issue Type: Bug
> Components: Table Store
> Reporter: ming li
> Assignee: ming li
> Priority: Major
>
> There are two places in {{ContinuousFileSplitEnumerator}} that add
> {{FileStoreSourceSplit}} to {{bucketSplits}}: {{addSplitsBack}} and
> {{processDiscoveredSplits}}. {{processDiscoveredSplits}} continuously
> checks for new splits and appends them to the queue, so at this point the
> splits in each bucket's queue are in order.
> {code:java}
> private void addSplits(Collection<FileStoreSourceSplit> splits) {
>     splits.forEach(this::addSplit);
> }
>
> private void addSplit(FileStoreSourceSplit split) {
>     bucketSplits
>             .computeIfAbsent(((DataSplit) split.split()).bucket(), i -> new LinkedList<>())
>             .add(split);
> }
> {code}
> However, when a task fails over, the splits that were previously assigned
> to it are returned. These returned splits are also appended to the end of
> the queue, so splits of the same bucket may be assigned out of order.
>
> I think these returned splits should be added to the head of the queue to
> ensure the order of allocation.
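
The proposal above could look roughly like the following. This is only a minimal sketch, not the actual patch: the class and method names are hypothetical, splits are modelled as plain strings instead of {{FileStoreSourceSplit}}, and the returned splits are assumed to arrive in their original assignment order.

```java
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the enumerator's per-bucket queues; splits are
// modelled as plain strings rather than FileStoreSourceSplit.
class BucketSplitQueueSketch {
    private final Map<Integer, Deque<String>> bucketSplits = new HashMap<>();

    // Newly discovered splits go to the tail, preserving discovery order.
    void addDiscovered(int bucket, String split) {
        bucketSplits.computeIfAbsent(bucket, b -> new LinkedList<>()).addLast(split);
    }

    // Returned splits go back to the head of the bucket's queue. Iterating in
    // reverse keeps the returned splits in their original relative order
    // (assumes the list is passed in assignment order).
    void addSplitsBack(int bucket, List<String> returned) {
        Deque<String> queue = bucketSplits.computeIfAbsent(bucket, b -> new LinkedList<>());
        for (int i = returned.size() - 1; i >= 0; i--) {
            queue.addFirst(returned.get(i));
        }
    }

    // Hands out the next split of a bucket, or null if none is queued.
    String poll(int bucket) {
        Deque<String> queue = bucketSplits.get(bucket);
        return queue == null ? null : queue.pollFirst();
    }
}
```

With this shape, splits that were handed out and then returned after a failover are assigned again before any split that was still waiting in the same bucket's queue.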
--
This message was sent by Atlassian Jira
(v8.20.10#820010)