[ https://issues.apache.org/jira/browse/BEAM-13009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anonymous updated BEAM-13009:
-----------------------------
    Status: Triage Needed  (was: Resolved)

> DynamoDBIO misses writing items if `withDeduplicateKeys` is not set
> -------------------------------------------------------------------
>
>                 Key: BEAM-13009
>                 URL: https://issues.apache.org/jira/browse/BEAM-13009
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-aws
>    Affects Versions: 2.27.0
>            Reporter: Lei Li
>            Assignee: Moritz Mack
>            Priority: P1
>              Labels: aws, data-loss, dynamodb
>             Fix For: 2.36.0
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> A new method `withDeduplicateKeys` was added to DynamoDBIO in 2.27.0.
> According to the
> [doc|https://beam.apache.org/releases/javadoc/2.27.0/index.html?org/apache/beam/sdk/io/aws/dynamodb/DynamoDBIO.html]
> it appears to be optional, and it is not shown in the examples either. But if
> no key names are set with it,
> [the deduplication logic|https://github.com/apache/beam/pull/12583/files#diff-0b5f7a7c1ee0ec890eef82e05e08ef1152421d2c8dcef11fca107f6af0d22e87R479-R492]
> still takes effect and uses an empty map as the `Map<String, AttributeValue>`
> part of the deduplication key. As a result, all items share the same key and
> are deduplicated, so only the last item is written to DynamoDB.
> I think we need to add a check on the deduplicate keys in
> `extractDeduplicateKeyValues` and skip the deduplication logic if they are
> empty.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
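
For illustration, the failure mode described in the report can be reproduced with a simplified model that does not depend on Beam or the AWS SDK. This is only a sketch under stated assumptions: items are plain `Map<String, String>` values rather than `Map<String, AttributeValue>`, and the class and method names (`DedupSketch`, `extractKey`, `dedupeAlways`, `dedupeUnlessEmpty`) are hypothetical and are not the actual DynamoDBIO code. It shows why an empty list of key names collapses every item onto one key, and what the proposed emptiness check would change.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DedupSketch {

  // Builds the deduplication key for one item from the configured key names.
  // With an empty keyNames list the result is always the empty map.
  static Map<String, String> extractKey(Map<String, String> item, List<String> keyNames) {
    Map<String, String> key = new LinkedHashMap<>();
    for (String name : keyNames) {
      key.put(name, item.get(name));
    }
    return key;
  }

  // Models the reported behavior: deduplication always runs, and the last item
  // seen for a given key replaces any earlier one.
  static List<Map<String, String>> dedupeAlways(
      List<Map<String, String>> items, List<String> keyNames) {
    Map<Map<String, String>, Map<String, String>> byKey = new LinkedHashMap<>();
    for (Map<String, String> item : items) {
      byKey.put(extractKey(item, keyNames), item);
    }
    return new ArrayList<>(byKey.values());
  }

  // Models the proposed check: skip deduplication when no key names are set.
  static List<Map<String, String>> dedupeUnlessEmpty(
      List<Map<String, String>> items, List<String> keyNames) {
    if (keyNames.isEmpty()) {
      return new ArrayList<>(items);
    }
    return dedupeAlways(items, keyNames);
  }

  public static void main(String[] args) {
    List<Map<String, String>> batch = List.of(
        Map.of("id", "1", "value", "a"),
        Map.of("id", "2", "value", "b"),
        Map.of("id", "3", "value", "c"));

    System.out.println(dedupeAlways(batch, List.of()).size());      // prints 1
    System.out.println(dedupeUnlessEmpty(batch, List.of()).size()); // prints 3
    System.out.println(dedupeAlways(batch, List.of("id")).size());  // prints 3
  }
}
```

In this model the single surviving item in the first case is the one with `id` 3, which matches the "only the last item is written" symptom in the description.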