hudi-agent commented on code in PR #18384: URL: https://github.com/apache/hudi/pull/18384#discussion_r3253214755
########## hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordScanner.java: ########## Review Comment: 🤖 **Line 211:** I believe this is still the same NPE scenario from last round — when only `_hoodie_partition_path` is excluded but `_hoodie_record_key` is not, `populateMetaFields = flags.isRecordKeyPopulated() = true`, and downstream `processDataBlock` (line 472-474) builds `recordKeyPartitionPathFieldPair = Option.empty()`. That flows into `SpillableMapUtils.convertToHoodieRecordPayload` line 141 which does `record.get(PARTITION_PATH_METADATA_FIELD).toString()` on the now-null column → NPE. The existing tests only cover the case where record_key and partition_path are excluded together (via `EXCLUDE_ALL_EXCEPT_COMMIT_TIME`), so this edge case isn't exercised. Could the `populateMetaFields` flag here gate on both `isRecordKeyPopulated() && isPartitionPathPopulated()`, or could the field pair always be sourced from the already-resolved `recordKeyField`/`partitionPathFieldOpt`? <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag quality.</i></sub> ########## hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteMergeHandle.java: ########## Review Comment: 🤖 **Line 421:** I think this still has the issue I flagged last round — the preserve path only nullifies `_hoodie_file_name`, and `fileWriter.write(...)` routes to `writeAvro(...)` (verified in `HoodieAvroParquetWriter` line 85), which does NOT call `prepRecordWithMetadata` and therefore doesn't consult `HoodieMetaFieldFlags`. So if a transition grows the exclude list to include e.g. `_hoodie_record_key`, the first compaction reads old records (with the key populated) and writes them verbatim into the new file, leaving the new file inconsistent with the table's exclusion intent. Could you also null out the other 4 excluded meta fields here (the same `updateMetaField` pattern as file_name), or alternatively route this through a path that goes through `prepRecordWithMetadata`? <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag quality.</i></sub> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
