geserdugarov commented on code in PR #12796:
URL: https://github.com/apache/hudi/pull/12796#discussion_r1959596037


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteFunction.java:
##########
@@ -140,8 +172,29 @@ public void snapshotState() {
   }
 
   @Override
-  public void processElement(I value, ProcessFunction<I, Object>.Context ctx, Collector<Object> out) throws Exception {
-    bufferRecord((HoodieRecord<?>) value);
+  public void processElement(HoodieFlinkInternalRow record,
+                             ProcessFunction<HoodieFlinkInternalRow, Object>.Context ctx,
+                             Collector<Object> out) throws Exception {
+    RowData row = record.getRowData();

Review Comment:
   Postponed the conversion until just before flushing. In the latest
version, the conversion is localized in a separate method,
`StreamWriteFunction::convertToHoodieRecords`.
   
   However, simplifying `StreamWriteFunction` further requires a review of
`deduplicateRecordsIfNeeded`, and `FlinkWriteHelper::deduplicateRecords` looks
like a part that could be optimized. Added a corresponding task to the backlog:
HUDI-9043.
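
   To illustrate the idea of postponing conversion until flush, here is a
minimal standalone sketch. It is not the code from this PR: the class
`DeferredConversionBuffer`, its methods, and the size-based flush trigger are
hypothetical stand-ins, and the generic types `R` and `H` only play the roles
of the raw row and the converted record.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; not Hudi's StreamWriteFunction.
// Raw rows (R) are buffered as-is and converted to records (H) only at flush time.
final class DeferredConversionBuffer<R, H> {
    private final List<R> buffer = new ArrayList<>();
    private final List<List<H>> flushedBatches = new ArrayList<>();
    private final Function<R, H> converter;
    private final int maxBufferedRows;

    DeferredConversionBuffer(Function<R, H> converter, int maxBufferedRows) {
        this.converter = converter;
        this.maxBufferedRows = maxBufferedRows;
    }

    // Buffers the raw row without converting it; flushes once the buffer is full.
    void processElement(R row) {
        buffer.add(row);
        if (buffer.size() >= maxBufferedRows) {
            flush();
        }
    }

    // Converts all buffered rows in one place, then hands the batch over.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<H> batch = new ArrayList<>(buffer.size());
        for (R row : buffer) {
            batch.add(converter.apply(row));
        }
        buffer.clear();
        // A real sink would write the batch; here it is kept for inspection.
        flushedBatches.add(batch);
    }

    List<List<H>> flushedBatches() {
        return flushedBatches;
    }

    int bufferedCount() {
        return buffer.size();
    }
}
```

   In this sketch the converter is not invoked while rows are only being
buffered, so the conversion cost is paid once per flushed batch and stays in a
single method.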



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
