[
https://issues.apache.org/jira/browse/HUDI-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381401#comment-17381401
]
ASF GitHub Bot commented on HUDI-2170:
--------------------------------------
nsivabalan commented on pull request #3267:
URL: https://github.com/apache/hudi/pull/3267#issuecomment-880769404
Also, do you mind fixing the description to state specifically when the
issue happens? I feel it is currently very generic.
At least with respect to COW (copy-on-write) tables, my understanding is
that we don't have any problems:
- preCombine() is only used to dedup incoming records among themselves.
- To combine an incoming record with something on storage, we use
combineAndGetUpdateValue().
Correct me if my understanding is wrong.
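
For readers following the thread, the two-step flow described in this comment can be sketched roughly as below. This is a simplified model, not Hudi code: the class PayloadMergeSketch, the Rec type, and dedupIncoming are hypothetical names, and only the method names preCombine and combineAndGetUpdateValue are borrowed from the discussion. The comparison rules used in both steps are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model for illustration only; none of these types are Hudi classes.
final class PayloadMergeSketch {

    /** Hypothetical record: a key, an ordering (preCombine) value, and some data. */
    static final class Rec {
        final String key;
        final long orderingVal;
        final String value;

        Rec(String key, long orderingVal, String value) {
            this.key = key;
            this.orderingVal = orderingVal;
            this.value = value;
        }
    }

    /** Step 1: picks the winner between two incoming records that share a key. */
    static Rec preCombine(Rec earlier, Rec later) {
        // Assumption: the later record wins unless the earlier one orders strictly higher.
        return earlier.orderingVal > later.orderingVal ? earlier : later;
    }

    /** Deduplicates an incoming batch so that each key keeps a single record. */
    static Map<String, Rec> dedupIncoming(List<Rec> batch) {
        Map<String, Rec> byKey = new LinkedHashMap<>();
        for (Rec rec : batch) {
            // Map.merge passes (existing value, new value) to the remapping function.
            byKey.merge(rec.key, rec, PayloadMergeSketch::preCombine);
        }
        return byKey;
    }

    /** Step 2: merges one deduplicated incoming record with the record on storage. */
    static Rec combineAndGetUpdateValue(Rec stored, Rec incoming) {
        if (stored == null) {
            return incoming; // nothing on storage yet, so this is an insert
        }
        // Assumption: the stored record survives only if it orders strictly higher.
        return stored.orderingVal > incoming.orderingVal ? stored : incoming;
    }

    public static void main(String[] args) {
        Map<String, Rec> storage = new LinkedHashMap<>();
        storage.put("k1", new Rec("k1", 5, "stored"));

        List<Rec> batch = Arrays.asList(
                new Rec("k1", 7, "a"),
                new Rec("k1", 7, "b"),
                new Rec("k2", 1, "c"));

        for (Rec rec : dedupIncoming(batch).values()) {
            storage.put(rec.key, combineAndGetUpdateValue(storage.get(rec.key), rec));
        }
        storage.forEach((key, rec) -> System.out.println(key + " -> " + rec.value));
    }
}
```

In this model the batch is first reduced to one record per key, and only those survivors are compared against storage, so the two methods never see the same pair of records.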
> Always choose the latest record for HoodieRecordPayload
> -------------------------------------------------------
>
> Key: HUDI-2170
> URL: https://issues.apache.org/jira/browse/HUDI-2170
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Common Core
> Reporter: Danny Chen
> Assignee: Danny Chen
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Currently, {{OverwriteWithLatestAvroPayload.preCombine}} still chooses the old
> record when the new record has the same preCombine field value as the old one.
> It is more natural to keep the new incoming record instead, which is what
> {{DefaultHoodieRecordPayload.combineAndGetUpdateValue}} already does.
> See issue: https://github.com/apache/hudi/issues/3266.
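
The tie-breaking difference described in the issue comes down to whether the comparison on the ordering value is strict. The sketch below is a minimal model of that choice and is not taken from the Hudi source: TieBreakSketch, Payload, preferOldOnTie, and preferNewOnTie are hypothetical names introduced only for illustration.

```java
// Simplified model for illustration only; not the Hudi implementation.
final class TieBreakSketch {

    /** Hypothetical payload: an ordering (preCombine) value plus some data. */
    static final class Payload {
        final long orderingVal;
        final String data;

        Payload(long orderingVal, String data) {
            this.orderingVal = orderingVal;
            this.data = data;
        }
    }

    /** Models the behavior the issue reports: on equal ordering values the old record is kept. */
    static Payload preferOldOnTie(Payload oldRec, Payload newRec) {
        return newRec.orderingVal > oldRec.orderingVal ? newRec : oldRec;
    }

    /** Models the behavior the issue proposes: on equal ordering values the new record is kept. */
    static Payload preferNewOnTie(Payload oldRec, Payload newRec) {
        return newRec.orderingVal >= oldRec.orderingVal ? newRec : oldRec;
    }

    public static void main(String[] args) {
        Payload oldRec = new Payload(10, "old");
        Payload newRec = new Payload(10, "new");

        System.out.println(preferOldOnTie(oldRec, newRec).data); // prints "old"
        System.out.println(preferNewOnTie(oldRec, newRec).data); // prints "new"
    }
}
```

Both variants agree whenever the ordering values differ; they diverge only on ties, which is the case the issue is concerned with.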