gianm commented on code in PR #19496:
URL: https://github.com/apache/druid/pull/19496#discussion_r3291631550


##########
processing/src/main/java/org/apache/druid/segment/file/PartialSegmentFileMapperV10.java:
##########
@@ -384,6 +485,92 @@ private void ensureFileDownloaded(String name, 
SegmentInternalFileMetadata fileM
     }
   }
 
+  /**
+   * Public entry point for cache-layer code that wants to ensure a container 
is materialized before any data is
+   * downloaded into it (e.g. when a per-bundle cache entry is mounted, the 
entry pre-allocates its container files
+   * so that subsequent {@link #mapFile} calls have somewhere to write into 
and the cache layer can charge the
+   * reservation up front).
+   */
+  public void initializeContainer(int containerIndex) throws IOException
+  {
+    checkClosed();
+    ensureContainerInitialized(containerIndex);
+  }
+
+  /**
+   * Reverse of {@link #initializeContainer(int)}: unmap the in-memory view of 
the container, delete the local
+   * container file, and clear the bitmap bits + {@link #downloadedFiles} 
entries for every internal file that lived
+   * in this container.
+   * <p>
+   * Used by per-bundle cache entries on unmount/eviction to release the disk 
and memory footprint of one bundle
+   * without affecting other bundles sharing the same {@link 
PartialSegmentFileMapperV10}. After eviction, subsequent
+   * {@link #mapFile} calls for files in this container will re-trigger 
downloads via {@link #initializeContainer}
+   * and the bitmap will be repopulated incrementally.
+   * <p>
+   * No-op if the container has not been initialized.
+   */
+  public void evictContainer(int containerIndex)
+  {
+    checkClosed();
+    containerLocks[containerIndex].lock();
+    try {
+      final MappedByteBuffer existing = containers[containerIndex];
+      if (existing != null) {
+        ByteBufferUtils.unmap(existing);
+        containers[containerIndex] = null;
+      }
+      // Try the cached containerFiles[i] first. If it's null, the container 
was never initialized in this mapper
+      // instance (typical right after create() with an empty bitmap), but the 
on-disk file may still exist from a
+      // previous run. Fall back to the deterministic path so eviction is 
always effective.
+      File containerFile = containerFiles[containerIndex];
+      if (containerFile == null) {
+        containerFile = new File(
+            localCacheDir,
+            StringUtils.format("%s.container.%05d", targetFilename, 
containerIndex)
+        );
+      }
+      if (containerFile.exists()) {
+        containerFile.delete();
+      }
+      containerFiles[containerIndex] = null;
+    }
+    finally {
+      containerLocks[containerIndex].unlock();
+    }
+
+    // clear bitmap bits + downloadedFiles entries for files that lived in 
this container. Iterates
+    // metadata.getFiles() without external synchronization: 
SegmentFileMetadata is constructed once at mapper
+    // creation and its file map is effectively immutable for the mapper's 
lifetime, so concurrent iteration is safe.
+    for (Map.Entry<String, SegmentInternalFileMetadata> entry : 
metadata.getFiles().entrySet()) {
+      if (entry.getValue().getContainer() != containerIndex) {
+        continue;
+      }
+      final String fileName = entry.getKey();
+      if (downloadedFiles.remove(fileName)) {

Review Comment:
   Can `evictContainer` race with `mapFile`? Does something prevent them from 
running concurrently with each other?
   
   I ask because it looks like this function (`evictContainer`) is removing the 
container from `containers` first and then removes the internal files from 
`downloadedFiles`. On the other hand, `mapFile` checks `downloadedFiles` first, 
and then if the file is there, it pulls the container from `containers` and 
dereferences it. Maybe it'd be null at that point if a concurrent 
`evictContainer` was happening?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to