aglinxinyuan opened a new issue, #5548:
URL: https://github.com/apache/texera/issues/5548
## Background
While reviewing #5447, Yicong-Huang flagged that the `Files.walk(...)`
stream-leak pattern Copilot caught in my test cleanup helpers may exist
elsewhere. A sweep of the codebase confirmed one production usage that leaks
the underlying directory handle:
`common/workflow-core/src/main/scala/org/apache/texera/amber/core/storage/util/dataset/GitVersionControlLocalFileStorage.java`,
lines 80-85:
```java
public static void deleteRepo(Path directoryPath) throws IOException {
Files.walk(directoryPath)
.sorted(Comparator.reverseOrder())
.map(Path::toFile)
.forEach(File::delete);
}
```
`Files.walk(...)` returns a closeable `java.util.stream.Stream` backed by an
open directory handle. Without an explicit `close()`, the handle stays open
until GC — which can flake temp-dir deletion on Windows and leak file
descriptors on long-lived JVMs (e.g. the dataset service that calls
`deleteRepo` whenever a Git-backed dataset is removed).
Every other `Files.walk` / `Files.list` usage in the codebase already wraps
the stream in `try/finally` and closes it (verified across `PveManager.scala`,
`HuggingFaceModelResource.scala`, `HuggingFaceModelResourceSpec.scala`).
## What needs to change
Convert `deleteRepo` to use try-with-resources so the stream is closed even
if iteration throws:
```java
public static void deleteRepo(Path directoryPath) throws IOException {
try (var stream = Files.walk(directoryPath)) {
stream
.sorted(Comparator.reverseOrder())
.map(Path::toFile)
.forEach(File::delete);
}
}
```
No behavior change for callers; just the stream-lifecycle fix.
## Scope
- Single-method edit in `GitVersionControlLocalFileStorage.java`.
- No new tests required (existing dataset deletion paths already exercise
this code).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]