Copilot commented on code in PR #17335:
URL: https://github.com/apache/iotdb/pull/17335#discussion_r2971602035
##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/event/common/tsfile/PipeTsFileInsertionEvent.java:
##########
@@ -859,6 +865,17 @@ public PipeEventResource eventResourceBuilder() {
this.eventParser);
}
+ private void refreshModFileState() {
+ if (!shouldTransferModFile || Objects.isNull(resource)) {
+ isWithMod = false;
+ modFile = null;
+ return;
+ }
+
+ isWithMod = resource.anyModFileExists();
+ modFile = isWithMod ? resource.getExclusiveModFile().getFile() : null;
Review Comment:
`refreshModFileState()` always resets `modFile` from
`resource.getExclusiveModFile().getFile()`. After
`internallyIncreaseResourceReferenceCount()` runs, `modFile` may already have
been replaced with the hardlinked/copied file returned by
`PipeTsFileResourceManager.increaseFileReference(...)`; subsequent calls to
`getModFile()`/`isWithMod()` will refresh again and can overwrite that pinned
path. This can break reference tracking (leaking the pinned mod copy and/or
decreasing the wrong path) and can make the sink transfer a different file than
the one whose reference count was increased. Consider preventing refresh from
overwriting `modFile` once the event is pinned (e.g., when `referenceCount >
0`), or track original-vs-pinned mod file paths separately.
##########
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/sink/protocol/thrift/async/IoTDBDataRegionAsyncSink.java:
##########
@@ -404,6 +404,12 @@ private boolean transferWithoutCheck(final
TsFileInsertionEvent tsFileInsertionE
throw new
FileNotFoundException(pipeTsFileInsertionEvent.getTsFile().getAbsolutePath());
}
+ final boolean supportMod =
clientManager.supportModsIfIsDataNodeReceiver();
+ final File modFile =
+ (supportMod && pipeTsFileInsertionEvent.isWithMod())
+ ? pipeTsFileInsertionEvent.getModFile()
+ : null;
Review Comment:
This new `modFile` snapshot still gates `getModFile()` behind
`pipeTsFileInsertionEvent.isWithMod()`. With the new dynamic refresh semantics,
this can miss a late-created mod file that appears after `isWithMod()` returns
false (because `getModFile()` is never called). To reliably capture
late-created mods and avoid double-refresh/TOCTOU, take a single snapshot via
`getModFile()` (and then decide transfer based on `modFile != null`), while
still honoring `supportMod`.
```suggestion
supportMod ? pipeTsFileInsertionEvent.getModFile() : null;
```
##########
iotdb-core/datanode/src/test/java/org/apache/iotdb/db/pipe/event/PipeTsFileInsertionEventTest.java:
##########
@@ -157,6 +159,82 @@ public void testAuthCheck() throws Exception {
}
}
+ @Test
+ public void testLateCreatedModFileCanStillBeObservedAfterShallowCopy()
throws Exception {
+ final File baseDir = new File(TestConstant.BASE_OUTPUT_PATH, "late-mod");
+ final File tsFile = new File(baseDir, "late-mod.tsfile");
+ PipeTsFileInsertionEvent originalEvent = null;
+ PipeTsFileInsertionEvent copiedEvent = null;
+ try {
+ Assert.assertTrue(baseDir.mkdirs() || baseDir.exists());
+ Assert.assertTrue(tsFile.createNewFile() || tsFile.exists());
+
+ final TsFileResource resource = new TsFileResource(tsFile);
+ resource.setStatus(TsFileResourceStatus.NORMAL);
+
+ originalEvent =
+ new PipeTsFileInsertionEvent(
+ false,
+ "root.db",
+ resource,
+ null,
+ true,
+ false,
+ false,
+ null,
+ "testPipe",
+ 1L,
+ null,
+ buildUnionPattern(true, Collections.singletonList(new
IoTDBTreePattern(true, null))),
+ new TablePattern(false, null, null),
+ null,
+ null,
+ null,
+ true,
+ Long.MIN_VALUE,
+ Long.MAX_VALUE);
+ copiedEvent =
+ originalEvent.shallowCopySelfAndBindPipeTaskMetaForProgressReport(
+ "testPipeCopy",
+ 2L,
+ null,
+ buildUnionPattern(true, Collections.singletonList(new
IoTDBTreePattern(true, null))),
+ new TablePattern(false, null, null),
+ null,
+ null,
+ null,
+ true,
+ Long.MIN_VALUE,
+ Long.MAX_VALUE);
+
+ Assert.assertFalse(originalEvent.isWithMod());
+ Assert.assertFalse(copiedEvent.isWithMod());
+
+ resource
+ .getExclusiveModFile()
+ .write(new TreeDeletionEntry(new MeasurementPath("root.db.d1.s1"),
0, 1));
+ final File modFile = resource.getExclusiveModFile().getFile();
+ Assert.assertTrue(modFile.exists());
+
+ Assert.assertTrue(originalEvent.isWithMod());
+ Assert.assertEquals(modFile, originalEvent.getModFile());
+ Assert.assertTrue(copiedEvent.isWithMod());
+ Assert.assertEquals(modFile, copiedEvent.getModFile());
+
Review Comment:
The added test validates late-created mod visibility on the in-memory event
objects, but it doesn't cover the more typical transfer path where
`increaseReferenceCount()` is called and the mod is copied into the pipe
hardlink/copy directory. Given the new dynamic refresh logic, adding assertions
around `increaseReferenceCount()` + `getModFile()` would help catch regressions
where the returned `modFile` path changes after pinning (and reference counts
are decreased for a different file).
```suggestion
// Verify that after increasing the reference count (typical transfer
path),
// the mod file observed by the copied event remains stable and valid.
copiedEvent.increaseReferenceCount();
final File pinnedModFile = copiedEvent.getModFile();
Assert.assertNotNull(pinnedModFile);
Assert.assertTrue(pinnedModFile.exists());
Assert.assertEquals(pinnedModFile, copiedEvent.getModFile());
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]