gianm commented on code in PR #19496:
URL: https://github.com/apache/druid/pull/19496#discussion_r3291631550
##########
processing/src/main/java/org/apache/druid/segment/file/PartialSegmentFileMapperV10.java:
##########
@@ -384,6 +485,92 @@ private void ensureFileDownloaded(String name,
SegmentInternalFileMetadata fileM
}
}
+ /**
+ * Public entry point for cache-layer code that wants to ensure a container
is materialized before any data is
+ * downloaded into it (e.g. when a per-bundle cache entry is mounted, the
entry pre-allocates its container files
+ * so that subsequent {@link #mapFile} calls have somewhere to write into
and the cache layer can charge the
+ * reservation up front).
+ */
+ public void initializeContainer(int containerIndex) throws IOException
+ {
+ checkClosed();
+ ensureContainerInitialized(containerIndex);
+ }
+
+ /**
+ * Reverse of {@link #initializeContainer(int)}: unmap the in-memory view of
the container, delete the local
+ * container file, and clear the bitmap bits + {@link #downloadedFiles}
entries for every internal file that lived
+ * in this container.
+ * <p>
+ * Used by per-bundle cache entries on unmount/eviction to release the disk
and memory footprint of one bundle
+ * without affecting other bundles sharing the same {@link
PartialSegmentFileMapperV10}. After eviction, subsequent
+ * {@link #mapFile} calls for files in this container will re-trigger
downloads via {@link #initializeContainer}
+ * and the bitmap will be repopulated incrementally.
+ * <p>
+ * No-op if the container has not been initialized.
+ */
+ public void evictContainer(int containerIndex)
+ {
+ checkClosed();
+ containerLocks[containerIndex].lock();
+ try {
+ final MappedByteBuffer existing = containers[containerIndex];
+ if (existing != null) {
+ ByteBufferUtils.unmap(existing);
+ containers[containerIndex] = null;
+ }
+ // Try the cached containerFiles[i] first. If it's null, the container
was never initialized in this mapper
+ // instance (typical right after create() with an empty bitmap), but the
on-disk file may still exist from a
+ // previous run. Fall back to the deterministic path so eviction is
always effective.
+ File containerFile = containerFiles[containerIndex];
+ if (containerFile == null) {
+ containerFile = new File(
+ localCacheDir,
+ StringUtils.format("%s.container.%05d", targetFilename,
containerIndex)
+ );
+ }
+ if (containerFile.exists()) {
+ containerFile.delete();
+ }
+ containerFiles[containerIndex] = null;
+ }
+ finally {
+ containerLocks[containerIndex].unlock();
+ }
+
+ // clear bitmap bits + downloadedFiles entries for files that lived in
this container. Iterates
+ // metadata.getFiles() without external synchronization:
SegmentFileMetadata is constructed once at mapper
+ // creation and its file map is effectively immutable for the mapper's
lifetime, so concurrent iteration is safe.
+ for (Map.Entry<String, SegmentInternalFileMetadata> entry :
metadata.getFiles().entrySet()) {
+ if (entry.getValue().getContainer() != containerIndex) {
+ continue;
+ }
+ final String fileName = entry.getKey();
+ if (downloadedFiles.remove(fileName)) {
Review Comment:
Can `evictContainer` race with `mapFile`? Does something prevent them from
running concurrently with each other?
I ask because it looks like this function (`evictContainer`) is removing the
container from `containers` first and then removes the internal files from
`downloadedFiles`. On the other hand, `mapFile` checks `downloadedFiles` first,
and then if the file is there, it pulls the container from `containers` and
dereferences it. Maybe it'd be null at that point if a concurrent
`evictContainer` was happening?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]