xushiyan commented on code in PR #4676:
URL: https://github.com/apache/hudi/pull/4676#discussion_r969210823
##########
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/table/action/commit/FlinkWriteHelper.java:
##########
@@ -97,7 +98,9 @@ public List<HoodieRecord<T>> deduplicateRecords(
final T data1 = rec1.getData();
final T data2 = rec2.getData();
- @SuppressWarnings("unchecked") final T reducedData = (T) data2.preCombine(data1);
+ Properties properties = new Properties();
+ properties.put("schema", schemaString);
+ @SuppressWarnings("unchecked") final T reducedData = (T) data2.preCombine(data1, properties);
Review Comment:
@danny0405 @fengjian428 While a new `preCombine` API that takes a `Schema` does look more aligned with the existing design, the timing of this change is not great. A major refactoring is close to landing in https://github.com/apache/hudi/tree/release-feature-rfc46, which introduces a new unified public API for merging and deprecates HoodieRecordPayload; that makes introducing `preCombine(T oldValue, Schema schema, Properties properties)` very odd. We have to evaluate the business requirement thoroughly to make the decision:
option A) keep the old APIs as-is for partial update and implement a new partial-update merger once RFC-46 lands;
option B) add a new `preCombine` API so partial update keeps working via the deprecated APIs, while in parallel supporting partial update in the new merger APIs.
I think option B is acceptable given strong business requirements.
Please chime in here @alexeykudinkin @prasannarajaperumal
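For illustration only, the option-B shape could look roughly like the sketch below: a new overload carrying extra context via `Properties`, with a default implementation delegating to the old (to-be-deprecated) method so existing payloads keep working. The interface and class names here are hypothetical, not Hudi's actual API.

```java
import java.util.Properties;

// Hypothetical payload interface sketching option B: a new preCombine
// overload that receives Properties (e.g. carrying the schema string),
// while the old single-argument method stays supported but deprecated.
interface Payload<T extends Payload<T>> {
  @Deprecated
  T preCombine(T oldValue);

  // New overload; the default delegates to the old API so existing
  // payload implementations keep working unchanged.
  default T preCombine(T oldValue, Properties props) {
    return preCombine(oldValue);
  }
}

// Example payload that ignores the extra context and keeps the record
// with the larger ordering value.
class LatestWins implements Payload<LatestWins> {
  final long ts;

  LatestWins(long ts) {
    this.ts = ts;
  }

  @Override
  public LatestWins preCombine(LatestWins oldValue) {
    return this.ts >= oldValue.ts ? this : oldValue;
  }
}

public class PreCombineSketch {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("schema", "<schema-string-goes-here>"); // placeholder, not a real schema
    LatestWins older = new LatestWins(1L);
    LatestWins newer = new LatestWins(2L);
    // New call site passes Properties; old payloads still resolve via the default.
    LatestWins merged = newer.preCombine(older, props);
    System.out.println(merged.ts); // prints 2
  }
}
```

Payloads that need the schema (e.g. for partial update) would override the two-argument overload; everything else compiles and behaves exactly as before.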
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]