[
https://issues.apache.org/jira/browse/FLINK-36239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated FLINK-36239:
-----------------------------------
Labels: pull-request-available (was: )
> DDB Streams Connector reprocessing due to DescribeStream inconsistencies for
> trimmed shards
> -------------------------------------------------------------------------------------------
>
> Key: FLINK-36239
> URL: https://issues.apache.org/jira/browse/FLINK-36239
> Project: Flink
> Issue Type: Bug
> Components: Connectors / DynamoDB
> Reporter: Abhi Gupta
> Priority: Major
> Labels: pull-request-available
>
> *Problem*
> Events can be reprocessed when DDB Streams shards are deleted by DDB after 24
> hours.
> *Root cause*
> We use the DDB DescribeStream API to retrieve the list of shards to consume
> from. This API is eventually consistent when it comes to deleting expired
> shards, so some responses will include an expired shard and some will not.
> In the DDB Streams connector, shards have the following lifecycle:
> # *Discovery*: The shard is discovered (known).
> # *Assign*: The shard is assigned (assigned).
> # *Finished*: Once the shard has been fully read, it is moved to finished
> (finished).
> # *Cleanup*: Once the shard is finished, DescribeStream no longer returns the
> shardId, and more than 24 hours have passed, we delete the shard from state
> (to prevent accumulating unnecessary state).
> *Example:*
> * We called DescribeStream and processed shard-a more than 24 hours ago.
> * The shard has since been removed by DDB because it is more than 24 hours
> old.
> * We make a DescribeStream call.
> * DescribeStream does not return this shard.
> * We make another DescribeStream call. Because of the inconsistency,
> DescribeStream returns the shard again, and we pass it to SplitTracker. The
> shard is no longer in the finished shards, because we deleted it from state
> shortly before for being more than 25 hours old, so it is processed a second
> time.
> *Fix*
> We will increase the shard retention to 48 hours to avoid these edge cases.
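
The scenario above can be replayed with a minimal sketch. This is not the connector's code: the class FinishedShardTrackerSketch and all of its methods are hypothetical, and it assumes that cleanup is driven only by the shard's finish time and by whether the latest DescribeStream response lists the shard.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of finished-shard tracking; not the real SplitTracker. */
final class FinishedShardTrackerSketch {
    private final Duration retention;
    // shardId -> time at which the shard was marked finished
    private final Map<String, Instant> finishedAt = new HashMap<>();

    FinishedShardTrackerSketch(Duration retention) {
        this.retention = retention;
    }

    void markFinished(String shardId, Instant when) {
        finishedAt.put(shardId, when);
    }

    /** Cleanup: forget a finished shard once DescribeStream omits it and it is older than the retention. */
    void cleanUp(Set<String> describeStreamShardIds, Instant now) {
        finishedAt.entrySet().removeIf(e ->
                !describeStreamShardIds.contains(e.getKey())
                        && Duration.between(e.getValue(), now).compareTo(retention) > 0);
    }

    /** Discovery: a listed shard is read again unless it is still remembered as finished. */
    boolean wouldReprocess(String shardId) {
        return !finishedAt.containsKey(shardId);
    }

    /** Replays the example: finish shard-a, get a response without it, then a stale response with it. */
    static boolean replayExample(Duration retention) {
        Instant t0 = Instant.parse("2024-01-01T00:00:00Z");
        FinishedShardTrackerSketch tracker = new FinishedShardTrackerSketch(retention);
        tracker.markFinished("shard-a", t0);
        // 25 hours later DescribeStream omits shard-a, so cleanup is allowed to run.
        tracker.cleanUp(Set.of(), t0.plus(Duration.ofHours(25)));
        // A later, inconsistent response lists shard-a again.
        return tracker.wouldReprocess("shard-a");
    }
}
```

In this model, replayExample(Duration.ofHours(24)) reports that shard-a would be read again, while replayExample(Duration.ofHours(48)) does not. The longer retention only helps if the inconsistent responses stop within the extra window, which is an assumption of the sketch rather than something the issue states.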
--
This message was sent by Atlassian Jira
(v8.20.10#820010)