hudi-agent commented on code in PR #18837:
URL: https://github.com/apache/hudi/pull/18837#discussion_r3469851923
##########
hudi-trino/src/test/java/io/trino/plugin/hudi/split/TestHudiSplitFactory.java:
##########
@@ -181,8 +179,10 @@ private static HudiTableHandle createTableHandle()
"/test/path",
HoodieTableType.MERGE_ON_READ,
ImmutableList.of(),
+ ImmutableList.of(),
TupleDomain.all(),
TupleDomain.all(),
+ java.util.OptionalLong.empty(),
Review Comment:
🤖 nit: same as in `TestHudiPartitionInfoLoader` — `java.util.OptionalLong`
could be a proper import so this reads as `OptionalLong.empty()` like the other
`java.util.*` types in the file.
<sub><i>⚠️ AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-trino/src/test/java/io/trino/plugin/hudi/partition/TestHudiPartitionInfoLoader.java:
##########
@@ -161,12 +160,14 @@ private static HudiSplitFactory createSplitFactory()
TABLE_PATH,
HoodieTableType.COPY_ON_WRITE,
ImmutableList.of(),
+ ImmutableList.of(),
TupleDomain.all(),
TupleDomain.all(),
+ java.util.OptionalLong.empty(),
Review Comment:
🤖 nit: could you add `import java.util.OptionalLong;` and use
`OptionalLong.empty()` here? There's no naming conflict with the existing
`Optional` import, so the FQ name is a bit inconsistent with the rest of the
file. The same pattern appears in `TestHudiSplitFactory.java` at its equivalent
constructor call site.
<sub><i>⚠️ AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
##########
hudi-trino-plugin/src/main/java/io/trino/plugin/hudi/reader/HudiTrinoReaderContext.java:
##########
@@ -130,98 +149,16 @@ public IndexedRecord next()
}
@Override
- public IndexedRecord convertAvroRecord(IndexedRecord record)
+ protected Option<HoodieRecordMerger> getRecordMerger(RecordMergeMode
mergeMode, String mergeStrategyId, String mergeImplClasses)
{
- return record;
+ return Option.of(new HoodiePreCombineAvroRecordMerger());
Review Comment:
🤖 Confirming this still applies on the current revision — the method now
returns `HoodieAvroRecordMerger.INSTANCE` but still ignores
`mergeMode`/`mergeStrategyId`/`mergeImplClasses` entirely. For contrast,
`FlinkRowDataReaderContext.getRecordMerger` switches on `mergeMode`
(EVENT_TIME_ORDERING vs COMMIT_TIME_ORDERING → different mergers) and falls
back to `HoodieRecordUtils.createValidRecordMerger(...)` for custom mergers. A
table written with `COMMIT_TIME_ORDERING` read through Trino would get
event-time-style precombine semantics instead, so MoR results could differ from
Spark/Flink. Might be worth mirroring that dispatch, or at least scoping the
unconditional merger to the modes it's actually correct for.
##########
hudi-trino/src/main/java/io/trino/plugin/hudi/HudiPageSource.java:
##########
@@ -115,8 +132,14 @@ public long getMemoryUsage()
public void close()
throws IOException
{
- fileGroupReader.close();
- pageSource.close();
+ // recordIterator is the outermost wrapper: CloseableMappingIterator ->
+ // HoodieFileGroupReaderIterator -> fileGroupReader.close() ->
baseFileIterator.close()
+ // (which closes the Trino pageSource) + recordBuffer.close(). Closing
it alone therefore
+ // releases every underlying resource exactly once; also registering
fileGroupReader/
+ // pageSource would re-close the same handles, since close() does not
null its fields.
+ try (Closer closer = Closer.create()) {
Review Comment:
🤖 nit: could you just call `recordIterator.close()` directly here?
`Closer.create()` with a single registered resource adds indirection without
any benefit — its multi-resource exception-chaining behaviour only kicks in
when there are two or more closables. The existing comment already explains why
we only close `recordIterator`, so the intent would remain clear with a plain
direct call.
<sub><i>⚠️ AI-generated; verify before applying. React 👍/👎 to flag
quality.</i></sub>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]