[
https://issues.apache.org/jira/browse/TIKA-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18071444#comment-18071444
]
ASF GitHub Bot commented on TIKA-4704:
--------------------------------------
Copilot commented on code in PR #2743:
URL: https://github.com/apache/tika/pull/2743#discussion_r3039836124
##########
tika-pipes/tika-pipes-integration-tests/src/test/java/org/apache/tika/pipes/core/PassbackFilterTest.java:
##########
@@ -81,6 +81,7 @@ public void testPassbackFilter(@TempDir Path tmpDir) throws Exception {
.emitData()
.getMetadataList()
.get(0);
+ pipesClient.close();
assertEquals("TESTOVERLAPPINGTEXT.PDF",
metadata.get(TikaCoreProperties.RESOURCE_NAME_KEY));
Review Comment:
Calling `pipesClient.close()` here closes the client only on the success
path. If `process(...)` throws or an earlier assertion fails, the client is
never closed and its temp dir is left behind. Prefer try-with-resources for
`PipesClient` in this test, or ensure the close happens in a `finally` block
or an `@AfterEach` method.
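The difference between the two approaches can be sketched with plain JDK classes. `FakeClient` below is a hypothetical stand-in for `PipesClient` (an assumption made only so the example is self-contained; it is not Tika's API). It shows that try-with-resources runs `close()` on both the success path and the failure path:

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOnAllPaths {

    // Hypothetical stand-in for PipesClient: any AutoCloseable behaves the same way.
    static final class FakeClient implements AutoCloseable {
        private final List<String> log;

        FakeClient(List<String> log) {
            this.log = log;
        }

        String process(boolean fail) {
            if (fail) {
                throw new IllegalStateException("process failed");
            }
            return "ok";
        }

        @Override
        public void close() {
            // Record the call so the caller can verify the resource was released.
            log.add("closed");
        }
    }

    // Runs one "test body" and returns the sequence of events that happened.
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try (FakeClient client = new FakeClient(log)) {
            log.add(client.process(fail));
        } catch (IllegalStateException e) {
            // The resource has already been closed by the time this block runs.
            log.add("caught");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints [ok, closed]
        System.out.println(run(true));  // prints [closed, caught]
    }
}
```

With a bare `client.close()` call placed after `process(...)`, the failing case would skip the close entirely.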
##########
tika-pipes/tika-pipes-integration-tests/src/test/java/org/apache/tika/pipes/core/UnpackModeTest.java:
##########
@@ -310,24 +311,24 @@ public void testUnpackModeBytesEmittedToOutputDir(@TempDir Path tmp) throws Exce
Path outputDir = tmp.resolve("output");
Files.createDirectories(outputDir);
- PipesClient pipesClient = init(tmp, testDocWithEmbedded);
-
- ParseContext parseContext = new ParseContext();
- parseContext.set(ParseMode.class, ParseMode.UNPACK);
-
- PipesResult pipesResult = pipesClient.process(
- new FetchEmitTuple(testDocWithEmbedded, new FetchKey(fetcherName, testDocWithEmbedded),
- new EmitKey(emitterName, testDocWithEmbedded), new Metadata(), parseContext,
- FetchEmitTuple.ON_PARSE_EXCEPTION.EMIT));
-
- assertTrue(pipesResult.isSuccess(), "UNPACK should succeed");
-
- // Check that output files were created for the embedded documents
- // The exact naming depends on the EmittingUnpackHandler configuration
- // At minimum, we verify the metadata JSON was written
- assertTrue(Files.exists(outputDir.resolve(testDocWithEmbedded + ".json")) ||
- Files.list(outputDir).count() > 0,
- "Output directory should contain emitted files");
+ try (PipesClient pipesClient = init(tmp, testDocWithEmbedded)) {
+ ParseContext parseContext = new ParseContext();
+ parseContext.set(ParseMode.class, ParseMode.UNPACK);
+
+ PipesResult pipesResult = pipesClient.process(
+ new FetchEmitTuple(testDocWithEmbedded, new FetchKey(fetcherName, testDocWithEmbedded),
+ new EmitKey(emitterName, testDocWithEmbedded), new Metadata(), parseContext,
+ FetchEmitTuple.ON_PARSE_EXCEPTION.EMIT));
+
+ assertTrue(pipesResult.isSuccess(), "UNPACK should succeed");
+
+ // Check that output files were created for the embedded documents
+ // The exact naming depends on the EmittingUnpackHandler configuration
+ // At minimum, we verify the metadata JSON was written
+ assertTrue(Files.exists(outputDir.resolve(testDocWithEmbedded + ".json")) ||
+ Files.list(outputDir).count() > 0,
+ "Output directory should contain emitted files");
Review Comment:
`Files.list(outputDir)` returns a `Stream` that must be closed. Using it
inline in the assertion leaves the underlying directory handle open, which on
some platforms can prevent the temp dir from being deleted. Wrap the
`Files.list` call in try-with-resources (or replace it with a non-stream
alternative) before counting or checking entries.
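One way to do this, sketched with JDK classes only: the helper name `hasEntries` and the `demo` method are illustrative assumptions, not code from the PR. The helper opens the directory stream in try-with-resources and stops at the first entry instead of counting all of them:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class DirHasEntries {

    // Returns true if the directory contains at least one entry.
    // The stream from Files.list is closed when the try block exits.
    static boolean hasEntries(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.findAny().isPresent();
        }
    }

    // Small demonstration: checks an empty temp dir, then the same dir with one file,
    // and removes everything it created before returning.
    static boolean[] demo() {
        try {
            Path dir = Files.createTempDirectory("list-demo");
            try {
                boolean whenEmpty = hasEntries(dir);
                Path file = Files.createFile(dir.resolve("a.json"));
                boolean whenPopulated = hasEntries(dir);
                Files.delete(file);
                return new boolean[] {whenEmpty, whenPopulated};
            } finally {
                Files.delete(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println(result[0] + " " + result[1]); // prints false true
    }
}
```

In the test, the second half of the assertion could then read `hasEntries(outputDir)` rather than `Files.list(outputDir).count() > 0`.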
> Avoid remaining temp files
> --------------------------
>
> Key: TIKA-4704
> URL: https://issues.apache.org/jira/browse/TIKA-4704
> Project: Tika
> Issue Type: Task
> Affects Versions: 3.3.0
> Reporter: Tilman Hausherr
> Priority: Minor
> Fix For: 4.0.0, 3.3.1
>
> Attachments: screenshot-1.png
>
>
> This is my temp directory after a successful build of Tika 3. We should try
> to reduce the number of temp files left behind.
> !screenshot-1.png!