hudi-agent commented on code in PR #18767:
URL: https://github.com/apache/hudi/pull/18767#discussion_r3260485265


##########
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestWriteBase.java:
##########
@@ -365,29 +477,51 @@ public TestHarness jobFailover() throws Exception {
       return this;
     }
 
-    public TestHarness noCompleteInstant() {
+    public TestHarness assertInstantRecommit() {
       // no complete instant
-      checkInstantState(HoodieInstant.State.COMPLETED, null);
+      assertNotNull(lastCompleteInstant(), "The instant should recommit");
+      return this;
+    }
+
+    /**
+     * Asserts the data files are non-empty.
+     */
+    public TestHarness assertDataFilesExists() {
+      assertTrue(fileExists(false), "Data files should have been created");
       return this;
     }
 
     /**
      * Asserts the data files are empty.
      */
     public TestHarness assertEmptyDataFiles() {
-      assertFalse(fileExists(), "No data files should have been created");
+      assertFalse(fileExists(false), "No data files should have been created");
       return this;
     }
 
-    private boolean fileExists() {
+    /**
+     * Asserts the data files are empty.
+     */
+    public TestHarness assertEmptyBaseFiles() {

Review Comment:
   🤖 nit: the Javadoc (line 503) is a copy-paste from `assertEmptyDataFiles` — 
it says "data files" instead of "base files". The assertion message on the next 
line has the same copy-paste issue (`"No data files should have been created"` 
→ `"No base files should have been created"`).
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



##########
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestWriteBase.java:
##########
@@ -144,7 +173,7 @@ public TestHarness preparePipeline(File basePath, Configuration conf) throws Exc
       this.pipeline = TestData.getWritePipeline(this.basePath, conf);
       // open the function and ingest data
       this.pipeline.openFunction();
-      this.ckpMetadata = CkpMetadata.getInstance(conf);
+      HoodieWriteConfig writeConfig = this.pipeline.getCoordinator().getWriteClient().getConfig();

Review Comment:
   🤖 nit: `writeConfig` is declared but never used — it looks like dead code 
left over from replacing the `ckpMetadata` assignment. Could you either wire it 
into the rest of the method or remove it?



##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/event/WriteMetadataEvent.java:
##########
@@ -108,6 +111,14 @@ public void setTaskID(int taskID) {
     this.taskID = taskID;
   }
 
+  public Long getCheckpointId() {

Review Comment:
   🤖 nit: could you change the return type to primitive `long` here? The 
underlying field is `long` and `setCheckpointId` also takes `long`, so 
returning boxed `Long` implies the value could be `null` when it never can be. 
Keeping it primitive stays consistent with `getTaskID()` and avoids any 
misleading nullable signal to callers.
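
   A minimal sketch of the suggested signature, for illustration only. `CheckpointEventSketch` is a hypothetical, trimmed-down stand-in and not the real `WriteMetadataEvent`; only the `long` field and the `setCheckpointId(long)` parameter type are taken from the comment above.

```java
// Hypothetical stand-in for WriteMetadataEvent; only the checkpoint-id
// accessors are modeled. Everything else about the real class is omitted.
final class CheckpointEventSketch {
  // The backing field is a primitive, so it can never hold null.
  private long checkpointId;

  public void setCheckpointId(long checkpointId) {
    this.checkpointId = checkpointId;
  }

  // Primitive return type: mirrors the field and the setter, and avoids
  // boxing on every call.
  public long getCheckpointId() {
    return checkpointId;
  }

  public static void main(String[] args) {
    CheckpointEventSketch event = new CheckpointEventSketch();
    event.setCheckpointId(42L);
    // Callers can compare with == directly; no null check or unboxing needed.
    System.out.println(event.getCheckpointId() == 42L);
  }
}
```

   A practical side effect: with a boxed `Long` getter, comparing two results with `==` compares object references rather than values, which is easy to get wrong. The primitive getter removes that pitfall.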



##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/FlinkWriteClients.java:
##########


Review Comment:
   🤖 **Line 268:** Could we log a WARN when this overrides a user-set policy? 
`isBlockingInstantGeneration` is false for MOR upserts without CDC (a very 
common case), so most users who explicitly configured `EAGER` will silently get 
`LAZY` here with no indication. A one-line WARN when the existing value differs 
from `LAZY` would make this much easier to debug.
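
   A rough sketch of what such a warning could look like. Everything here is an assumption made for illustration: the `Policy` enum, the `resolve` helper, and the use of `java.util.logging` are hypothetical stand-ins, and the real code in `FlinkWriteClients` may be structured differently (Hudi typically logs through its own logger setup). The only behavior taken from the comment is that the policy is forced to `LAZY` when blocking instant generation is off.

```java
import java.util.logging.Logger;

// Hypothetical sketch; not the actual FlinkWriteClients logic.
final class CleaningPolicySketch {
  // Stand-in for the policy type; only the two values named in the comment.
  enum Policy { EAGER, LAZY }

  private static final Logger LOG =
      Logger.getLogger(CleaningPolicySketch.class.getName());

  // Returns the effective policy. When instant generation is non-blocking,
  // the policy is forced to LAZY, and a WARN is logged if that differs from
  // what the user configured.
  static Policy resolve(Policy userPolicy, boolean blockingInstantGeneration) {
    if (blockingInstantGeneration) {
      return userPolicy;
    }
    if (userPolicy != Policy.LAZY) {
      LOG.warning("Overriding configured policy " + userPolicy
          + " with LAZY because instant generation is non-blocking");
    }
    return Policy.LAZY;
  }

  public static void main(String[] args) {
    // An explicitly configured EAGER is overridden, with a warning in the log.
    System.out.println(resolve(Policy.EAGER, false));
  }
}
```

   The warning fires only when the override actually changes the value, so users who already run with `LAZY` see no extra log noise.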



##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/utils/ExplicitClassloaderThreadFactory.java:
##########
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.sink.utils;
+
+import org.apache.flink.util.FatalExitExceptionHandler;
+
+import java.util.concurrent.ThreadFactory;
+
+/**
+ * A thread factory with explicit classloader.
+ */
+public class ExplicitClassloaderThreadFactory implements ThreadFactory {
+  private final String threadName;
+  private final ClassLoader cl;
+  private final Thread.UncaughtExceptionHandler errorHandler;
+
+  private Thread t;
+
+  public ExplicitClassloaderThreadFactory(

Review Comment:
   🤖 nit: the single-character field name `t` makes the guard check `if (t != 
null)` in `newThread()` harder to parse at a glance — it's not immediately 
obvious what `t` represents. Renaming it to `thread` would make the intent 
self-explanatory.
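
   For illustration, a sketch of how the class might read after the rename. The body of `newThread()` is not shown in the hunk above, so the guard behavior here (rejecting a second thread with an `IllegalStateException`) is an assumption, and `SingleThreadFactorySketch` is a hypothetical name rather than the class in this PR.

```java
import java.util.concurrent.ThreadFactory;

// Hypothetical sketch; the real newThread() body is not visible in the diff.
final class SingleThreadFactorySketch implements ThreadFactory {
  private final String threadName;
  private final ClassLoader cl;
  private final Thread.UncaughtExceptionHandler errorHandler;

  // Renamed from `t`: the one thread this factory has handed out, if any.
  private Thread thread;

  SingleThreadFactorySketch(
      String threadName,
      ClassLoader cl,
      Thread.UncaughtExceptionHandler errorHandler) {
    this.threadName = threadName;
    this.cl = cl;
    this.errorHandler = errorHandler;
  }

  @Override
  public synchronized Thread newThread(Runnable r) {
    // Assumed guard: the factory creates at most one thread.
    if (thread != null) {
      throw new IllegalStateException("Thread " + threadName + " was already created");
    }
    thread = new Thread(r, threadName);
    thread.setContextClassLoader(cl);
    thread.setUncaughtExceptionHandler(errorHandler);
    return thread;
  }
}
```

   With the longer name, the guard reads as "a thread already exists" without the reader having to look up the field declaration.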



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
