voonhous commented on PR #8579:
URL: https://github.com/apache/hudi/pull/8579#issuecomment-1523138078
> > Results will be different if combineAndUpdateValue is invoked in order without invoking preCombine.
>
> What is exactly lost here?
# PreCombine + combineAndGetUpdateValue
```
Table schema: {id: int , name: string, price: double, _ts: int}
recordKey: id
precombineField: _ts
Table initial state:
[1 a1_0 10.0 1001]
Table performs an update with an incoming batch that has the following
records (2):
(precombine + combineAndGetUpdateValue)
[
[1 a1_0 11.0 999],
[1 a1_0 null 1001]
]
End state of the table:
[1 a1_0 11.0 1001]
```
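This is because the two records in incoming batch (2) are first merged with each other by `preCombine` (yielding `[1 a1_0 11.0 1001]`), and only then is the result merged with the stored record by `combineAndGetUpdateValue`.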
# combineAndGetUpdateValue ONLY
The result differs if `combineAndGetUpdateValue` is invoked on each record in order, without invoking `preCombine` first.
Example:
```
Table initial state:
[1 a1_0 10.0 1001]
Table performs an update:
(combineAndGetUpdateValue)
[1 a1_0 11.0 999]
Table performs an update again:
(combineAndGetUpdateValue)
[1 a1_0 null 1001]
End state of the table:
[1 a1_0 10.0 1001]
```
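The difference between the two paths can be reproduced with a small Python simulation. This is only a sketch, not Hudi's code: `merge`, `pre_combine`, and `combine_and_get_update_value` are hypothetical stand-ins, and the merge rule (larger `_ts` wins, ties go to the incoming record, null fields of the winner are filled from the loser) is an assumption chosen because it reproduces both end states shown above.

```python
from functools import reduce

ORDERING_FIELD = "_ts"  # precombineField from the example schema


def merge(current, incoming):
    """Hypothetical partial-update merge (an assumption, not Hudi's implementation).

    The record with the larger _ts wins, ties go to `incoming`,
    and any None field in the winner is filled from the loser.
    """
    if incoming[ORDERING_FIELD] >= current[ORDERING_FIELD]:
        winner, loser = incoming, current
    else:
        winner, loser = current, incoming
    return {k: (v if v is not None else loser[k]) for k, v in winner.items()}


def pre_combine(batch):
    # Stand-in for preCombine: collapse records sharing a key within one batch.
    return reduce(merge, batch)


def combine_and_get_update_value(stored, incoming):
    # Stand-in for combineAndGetUpdateValue: merge one incoming record with storage.
    return merge(stored, incoming)


stored = {"id": 1, "name": "a1_0", "price": 10.0, "_ts": 1001}
batch = [
    {"id": 1, "name": "a1_0", "price": 11.0, "_ts": 999},
    {"id": 1, "name": "a1_0", "price": None, "_ts": 1001},
]

# Path 1: preCombine the batch, then merge the single result with storage.
with_pre_combine = combine_and_get_update_value(stored, pre_combine(batch))

# Path 2: merge each incoming record with storage in order, no preCombine.
without_pre_combine = reduce(combine_and_get_update_value, batch, stored)

print(with_pre_combine)     # price 11.0
print(without_pre_combine)  # price 10.0
```

In the second path, the record with `_ts` 999 loses to the stored record before the `_ts` 1001 record arrives, so its price of 11.0 is never available to fill the null.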