Copilot commented on code in PR #17335:
URL: https://github.com/apache/iotdb/pull/17335#discussion_r2971602035


##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/event/common/tsfile/PipeTsFileInsertionEvent.java:
##########
@@ -859,6 +865,17 @@ public PipeEventResource eventResourceBuilder() {
         this.eventParser);
   }
 
+  private void refreshModFileState() {
+    if (!shouldTransferModFile || Objects.isNull(resource)) {
+      isWithMod = false;
+      modFile = null;
+      return;
+    }
+
+    isWithMod = resource.anyModFileExists();
+    modFile = isWithMod ? resource.getExclusiveModFile().getFile() : null;

Review Comment:
   `refreshModFileState()` always resets `modFile` from 
`resource.getExclusiveModFile().getFile()`. After 
`internallyIncreaseResourceReferenceCount()` runs, `modFile` may already have 
been replaced with the hardlinked/copied file returned by 
`PipeTsFileResourceManager.increaseFileReference(...)`; subsequent calls to 
`getModFile()`/`isWithMod()` will refresh again and can overwrite that pinned 
path. This can break reference tracking (leaking the pinned mod copy and/or 
decreasing the wrong path) and can make the sink transfer a different file than 
the one whose reference count was increased. Consider preventing refresh from 
overwriting `modFile` once the event is pinned (e.g., when `referenceCount > 
0`), or track original-vs-pinned mod file paths separately.



##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/sink/protocol/thrift/async/IoTDBDataRegionAsyncSink.java:
##########
@@ -404,6 +404,12 @@ private boolean transferWithoutCheck(final 
TsFileInsertionEvent tsFileInsertionE
         throw new 
FileNotFoundException(pipeTsFileInsertionEvent.getTsFile().getAbsolutePath());
       }
 
+      final boolean supportMod = 
clientManager.supportModsIfIsDataNodeReceiver();
+      final File modFile =
+          (supportMod && pipeTsFileInsertionEvent.isWithMod())
+              ? pipeTsFileInsertionEvent.getModFile()
+              : null;

Review Comment:
   This new `modFile` snapshot still gates `getModFile()` behind 
`pipeTsFileInsertionEvent.isWithMod()`. With the new dynamic refresh semantics, 
this can miss a late-created mod file that appears after `isWithMod()` returns 
false (because `getModFile()` is never called). To reliably capture 
late-created mods and avoid double-refresh/TOCTOU, take a single snapshot via 
`getModFile()` (and then decide transfer based on `modFile != null`), while 
still honoring `supportMod`.
   ```suggestion
             supportMod ? pipeTsFileInsertionEvent.getModFile() : null;
   ```



##########
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/pipe/event/PipeTsFileInsertionEventTest.java:
##########
@@ -157,6 +159,82 @@ public void testAuthCheck() throws Exception {
     }
   }
 
+  @Test
+  public void testLateCreatedModFileCanStillBeObservedAfterShallowCopy() 
throws Exception {
+    final File baseDir = new File(TestConstant.BASE_OUTPUT_PATH, "late-mod");
+    final File tsFile = new File(baseDir, "late-mod.tsfile");
+    PipeTsFileInsertionEvent originalEvent = null;
+    PipeTsFileInsertionEvent copiedEvent = null;
+    try {
+      Assert.assertTrue(baseDir.mkdirs() || baseDir.exists());
+      Assert.assertTrue(tsFile.createNewFile() || tsFile.exists());
+
+      final TsFileResource resource = new TsFileResource(tsFile);
+      resource.setStatus(TsFileResourceStatus.NORMAL);
+
+      originalEvent =
+          new PipeTsFileInsertionEvent(
+              false,
+              "root.db",
+              resource,
+              null,
+              true,
+              false,
+              false,
+              null,
+              "testPipe",
+              1L,
+              null,
+              buildUnionPattern(true, Collections.singletonList(new 
IoTDBTreePattern(true, null))),
+              new TablePattern(false, null, null),
+              null,
+              null,
+              null,
+              true,
+              Long.MIN_VALUE,
+              Long.MAX_VALUE);
+      copiedEvent =
+          originalEvent.shallowCopySelfAndBindPipeTaskMetaForProgressReport(
+              "testPipeCopy",
+              2L,
+              null,
+              buildUnionPattern(true, Collections.singletonList(new 
IoTDBTreePattern(true, null))),
+              new TablePattern(false, null, null),
+              null,
+              null,
+              null,
+              true,
+              Long.MIN_VALUE,
+              Long.MAX_VALUE);
+
+      Assert.assertFalse(originalEvent.isWithMod());
+      Assert.assertFalse(copiedEvent.isWithMod());
+
+      resource
+          .getExclusiveModFile()
+          .write(new TreeDeletionEntry(new MeasurementPath("root.db.d1.s1"), 
0, 1));
+      final File modFile = resource.getExclusiveModFile().getFile();
+      Assert.assertTrue(modFile.exists());
+
+      Assert.assertTrue(originalEvent.isWithMod());
+      Assert.assertEquals(modFile, originalEvent.getModFile());
+      Assert.assertTrue(copiedEvent.isWithMod());
+      Assert.assertEquals(modFile, copiedEvent.getModFile());
+

Review Comment:
   The added test validates late-created mod visibility on the in-memory event 
objects, but it doesn't cover the more typical transfer path where 
`increaseReferenceCount()` is called and the mod is copied into the pipe 
hardlink/copy directory. Given the new dynamic refresh logic, adding assertions 
around `increaseReferenceCount()` + `getModFile()` would help catch regressions 
where the returned `modFile` path changes after pinning (and reference counts 
are decreased for a different file).
   ```suggestion
   
         // Verify that after increasing the reference count (typical transfer 
path),
         // the mod file observed by the copied event remains stable and valid.
         copiedEvent.increaseReferenceCount();
         final File pinnedModFile = copiedEvent.getModFile();
         Assert.assertNotNull(pinnedModFile);
         Assert.assertTrue(pinnedModFile.exists());
         Assert.assertEquals(pinnedModFile, copiedEvent.getModFile());
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to