[ 
https://issues.apache.org/jira/browse/BEAM-13009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous updated BEAM-13009:
-----------------------------
    Status: Triage Needed  (was: Resolved)

> DynamoDBIO misses writing items if `withDeduplicateKeys` is not set
> -------------------------------------------------------------------
>
>                 Key: BEAM-13009
>                 URL: https://issues.apache.org/jira/browse/BEAM-13009
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-aws
>    Affects Versions: 2.27.0
>            Reporter: Lei Li
>            Assignee: Moritz Mack
>            Priority: P1
>              Labels: aws, data-loss, dynamodb
>             Fix For: 2.36.0
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> A new method `withDeduplicateKeys` was added to DynamoDBIO in 2.27.0. It 
> appears to be optional according to the 
> [doc|https://beam.apache.org/releases/javadoc/2.27.0/index.html?org/apache/beam/sdk/io/aws/dynamodb/DynamoDBIO.html],
>  and it is not shown in the examples either. But if no key names are set 
> through it, [the deduplication 
> logic|https://github.com/apache/beam/pull/12583/files#diff-0b5f7a7c1ee0ec890eef82e05e08ef1152421d2c8dcef11fca107f6af0d22e87R479-R492]
>  still takes effect but uses an empty map as the `Map<String, 
> AttributeValue>` part of the deduplication key, which results in all items 
> having the same key and being deduplicated, writing only the last item to 
> DynamoDB.
> I think we need to add a check on the deduplicate keys in 
> `extractDeduplicateKeyValues`, and skip the deduplication logic if they are empty.
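
The behaviour described above can be reproduced with a minimal, self-contained sketch. This is not Beam's actual code: the class `DedupKeySketch` and its methods are hypothetical, and a plain `Map<String, String>` stands in for `Map<String, AttributeValue>` so that no AWS dependency is needed. It only illustrates why an empty list of key names collapses every item onto one key, and how the proposed guard would avoid that.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; names do not come from the Beam code base.
final class DedupKeySketch {

  // Builds the deduplication key from the configured key names only.
  // With no key names configured, the result is always an empty map.
  static Map<String, String> extractKeyValues(
      Map<String, String> item, List<String> keyNames) {
    Map<String, String> key = new LinkedHashMap<>();
    for (String name : keyNames) {
      if (item.containsKey(name)) {
        key.put(name, item.get(name));
      }
    }
    return key;
  }

  // Behaviour as reported: deduplication always runs, later items
  // overwrite earlier ones that share the same key.
  static List<Map<String, String>> dedupAlways(
      List<Map<String, String>> items, List<String> keyNames) {
    Map<Map<String, String>, Map<String, String>> byKey = new LinkedHashMap<>();
    for (Map<String, String> item : items) {
      byKey.put(extractKeyValues(item, keyNames), item);
    }
    return new ArrayList<>(byKey.values());
  }

  // Proposed guard: skip deduplication when no key names are configured.
  static List<Map<String, String>> dedupGuarded(
      List<Map<String, String>> items, List<String> keyNames) {
    if (keyNames.isEmpty()) {
      return new ArrayList<>(items);
    }
    return dedupAlways(items, keyNames);
  }

  // Three sample items; the first and third share the same "id".
  static List<Map<String, String>> sampleItems() {
    return List.of(
        Map.of("id", "1", "value", "a"),
        Map.of("id", "2", "value", "b"),
        Map.of("id", "1", "value", "c"));
  }

  public static void main(String[] args) {
    List<Map<String, String>> items = sampleItems();
    // No key names: every item maps to the empty key, one item survives.
    System.out.println(dedupAlways(items, List.of()).size());      // 1
    // Key name "id": only the true duplicate is dropped.
    System.out.println(dedupAlways(items, List.of("id")).size());  // 2
    // Guarded variant with no key names: all items are kept.
    System.out.println(dedupGuarded(items, List.of()).size());     // 3
  }
}
```

In the unguarded variant with no key names, the single surviving item is the last one in the batch, which matches the "writing only the last item" symptom in the report.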



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
