Copilot commented on code in PR #2728:
URL: https://github.com/apache/tika/pull/2728#discussion_r3017899281
##########
tika-server/tika-server-core/src/test/java/org/apache/tika/server/core/CXFTestBase.java:
##########
@@ -152,7 +152,7 @@ protected static String
getStringFromInputStream(InputStream in) throws IOExcept
public static InputStream gzip(InputStream is) throws IOException {
ByteArrayOutputStream bos = new ByteArrayOutputStream();
OutputStream gz = new GzipCompressorOutputStream(bos);
- IOUtils.copy(is, gz);
+ is.transferTo(gz);
gz.flush();
gz.close();
return new ByteArrayInputStream(bos.toByteArray());
Review Comment:
`gzip(InputStream)` creates a `GzipCompressorOutputStream` and closes it
only on the happy path. If `transferTo()` throws, the gzip stream won’t be
closed, which can leave the compressor in an inconsistent state and leak
resources. Use try-with-resources (or a finally) to guarantee the gzip stream
is closed even on exceptions.
##########
tika-server/tika-server-core/src/test/java/org/apache/tika/server/core/CXFTestBase.java:
##########
@@ -201,7 +202,7 @@ public void setUp() throws Exception {
// Initialize PipesParsingHelper for pipes-based parsing
// Merge the fetcher config with basePath pointing to the temp
directory
- this.pipesConfigPath = mergeFetcherConfig(this.pipesConfigPath,
inputTempDirectory);
+ this.pipesConfigPath = mergeFetcherConfig(tmpPipesConfigPath,
inputTempDirectory);
TikaJsonConfig tikaJsonConfig =
TikaJsonConfig.load(this.pipesConfigPath);
Review Comment:
In `setUp()`, `tmpPipesConfigPath` is always a temp file created by
`mergePassbackAllStrategy()`/`createDefaultTestConfig()`, but it is never
deleted after `mergeFetcherConfig(...)` writes a new temp config. This leaves
an intermediate temp file behind for each test run. Please delete
`tmpPipesConfigPath` once the merged config has been written (ideally in a
finally that won’t delete the final `pipesConfigPath`).
##########
tika-server/tika-server-core/src/test/java/org/apache/tika/server/core/CXFTestBase.java:
##########
@@ -268,6 +271,8 @@ private Path mergePassbackAllStrategy(InputStream
pipesConfigInputStream) throws
/**
* Merges the tika-server-fetcher configuration into the pipes config.
* The fetcher is configured with basePath pointing to the input temp
directory.
+ *
+ * @return new (different) JSON configPath in the temp director
Review Comment:
Javadoc typo: "temp director" should be "temp directory", and the `@return`
sentence should end with a period.
```suggestion
* @return new (different) JSON configPath in the temp directory.
```
##########
tika-server/tika-server-core/src/test/java/org/apache/tika/server/core/CXFTestBase.java:
##########
@@ -186,12 +186,13 @@ public void setUp() throws Exception {
Files.copy(getTikaConfigInputStream(), tmp,
StandardCopyOption.REPLACE_EXISTING);
InputStream pipesConfigInputStream = getPipesConfigInputStream();
+ Path tmpPipesConfigPath;
if (pipesConfigInputStream != null) {
// Test provided its own pipes config - merge in PASSBACK_ALL
emit strategy
- this.pipesConfigPath =
mergePassbackAllStrategy(pipesConfigInputStream);
+ tmpPipesConfigPath =
mergePassbackAllStrategy(pipesConfigInputStream);
} else {
// Create a default pipes config, merging metadata-filters
from tika config
- this.pipesConfigPath = createDefaultTestConfig(tmp);
+ tmpPipesConfigPath = createDefaultTestConfig(tmp);
}
Review Comment:
`pipesConfigInputStream` is obtained from `getPipesConfigInputStream()` and
passed into `mergePassbackAllStrategy(...)` without being closed. Even though
the current base implementation returns a `ByteArrayInputStream`, subclasses
can override this to return a stream that requires closing. Wrap
`pipesConfigInputStream` in try-with-resources when non-null.
##########
tika-server/tika-server-core/src/test/java/org/apache/tika/server/core/CXFTestBase.java:
##########
@@ -452,9 +459,8 @@ protected Map<String, byte[]>
readZipArchiveBytes(InputStream inputStream) throw
Enumeration<ZipArchiveEntry> entries = zip.getEntries();
while (entries.hasMoreElements()) {
ZipArchiveEntry entry = entries.nextElement();
- ByteArrayOutputStream bos = new ByteArrayOutputStream();
- IOUtils.copy(zip.getInputStream(entry), bos);
- data.put(entry.getName(), bos.toByteArray());
+ byte[] bytes = zip.getInputStream(entry).readAllBytes();
+ data.put(entry.getName(), bytes);
}
Review Comment:
In `readZipArchiveBytes(...)`, the per-entry stream returned by
`zip.getInputStream(entry)` is not closed. Wrap that stream in
try-with-resources before calling `readAllBytes()` to avoid leaking file
descriptors.
##########
tika-server/tika-server-core/src/test/java/org/apache/tika/server/core/CXFTestBase.java:
##########
@@ -429,9 +437,8 @@ protected Map<String, String> readZipArchive(InputStream
inputStream) throws IOE
Enumeration<ZipArchiveEntry> entries = zip.getEntries();
while (entries.hasMoreElements()) {
ZipArchiveEntry entry = entries.nextElement();
- ByteArrayOutputStream bos = new ByteArrayOutputStream();
- IOUtils.copy(zip.getInputStream(entry), bos);
- data.put(entry.getName(),
DigestUtils.md5Hex(bos.toByteArray()));
+ byte[] bytes = zip.getInputStream(entry).readAllBytes();
+ data.put(entry.getName(), DigestUtils.md5Hex(bytes));
}
Review Comment:
In `readZipArchive(...)`, the per-entry stream returned by
`zip.getInputStream(entry)` is not closed. This can leak file handles across
many entries. Wrap the entry stream in try-with-resources before calling
`readAllBytes()`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]