Copilot commented on code in PR #2881:
URL: https://github.com/apache/tika/pull/2881#discussion_r3380657769
##########
tika-eval/tika-eval-app/src/main/java/org/apache/tika/eval/app/ExtractComparerRunner.java:
##########
@@ -263,6 +272,53 @@ private static void deleteDirectory(Path dir) throws IOException {
         }
     }
+    /**
+     * Gzip the H2 db file (<dbPath>.mv.db -> <dbPath>.mv.db.gz) so it can be
+     * transferred. The db connection is already closed by {@link #execute} before
+     * this runs, so the file is unlocked. No-op (with a warning) when there is no
+     * on-disk file db to gzip: a temp db (no -d), or a non-file jdbc connection
+     * (e.g. mem/tcp). A {@code jdbc:h2:file:} URL is supported by extracting its
+     * file path.
+     */
+    private static void gzipDb(String dbPath, boolean usesTempDb) throws IOException {
+        if (usesTempDb) {
+            LOG.warn("-z (gzip) ignored: no -d db specified, so there is no db file to transfer");
+            return;
+        }
+        String filePath = dbPath;
+        if (dbPath.startsWith("jdbc:")) {
+            String prefix = "jdbc:h2:file:";
+            if (!dbPath.startsWith(prefix)) {
+                LOG.warn("-z (gzip) ignored: db is a non-file jdbc connection ({}), no local file to transfer",
+                        dbPath);
+                return;
+            }
+            // Strip the jdbc:h2:file: prefix and any ;OPTION=... suffix to get the file base path.
+            filePath = dbPath.substring(prefix.length());
+            int semi = filePath.indexOf(';');
+            if (semi >= 0) {
+                filePath = filePath.substring(0, semi);
+            }
+        }
Review Comment:
`gzipDb` only treats `jdbc:h2:file:` URLs as file-backed and will
incorrectly warn/skip for valid file-based H2 URLs like `jdbc:h2:/path/to/db`
(which is also what `H2Util` generates). It also doesn't handle callers passing
`-d ...mv.db`, which would make it look for `...mv.db.mv.db`. Consider
supporting the broader H2 file URL forms and trimming an existing `.mv.db`
suffix before appending it again.
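
A minimal sketch of what that could look like, not a drop-in patch. The class
`H2DbFileResolver`, the method `resolveDbFile`, and the list of non-file
schemes (`mem:`, `tcp:`, `ssl:`, `zip:`) are assumptions made for illustration
and should be checked against the H2 URL documentation. It does not expand `~`
and does not touch the filesystem.

```java
import java.util.Optional;

// Hypothetical helper, not part of tika-eval.
final class H2DbFileResolver {
    private static final String H2_PREFIX = "jdbc:h2:";
    private static final String FILE_SCHEME = "file:";
    private static final String MV_SUFFIX = ".mv.db";
    // Assumption: H2 URL schemes that do not map to a plain local db file.
    private static final String[] NON_FILE_SCHEMES = {"mem:", "tcp:", "ssl:", "zip:"};

    /**
     * Returns the on-disk .mv.db path for a -d path or an H2 file URL,
     * or empty when the input does not point at a local file db.
     */
    static Optional<String> resolveDbFile(String dbPath) {
        String base = dbPath;
        if (dbPath.startsWith("jdbc:")) {
            if (!dbPath.startsWith(H2_PREFIX)) {
                return Optional.empty(); // some other JDBC driver
            }
            base = dbPath.substring(H2_PREFIX.length());
            for (String scheme : NON_FILE_SCHEMES) {
                if (base.startsWith(scheme)) {
                    return Optional.empty();
                }
            }
            // "file:" is optional in embedded H2 URLs, so strip it only if present.
            if (base.startsWith(FILE_SCHEME)) {
                base = base.substring(FILE_SCHEME.length());
            }
            // Drop any ;OPTION=... settings.
            int semi = base.indexOf(';');
            if (semi >= 0) {
                base = base.substring(0, semi);
            }
        }
        // Avoid producing ...mv.db.mv.db when the caller already passed the suffix.
        if (base.endsWith(MV_SUFFIX)) {
            base = base.substring(0, base.length() - MV_SUFFIX.length());
        }
        if (base.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(base + MV_SUFFIX);
    }
}
```

With a helper along these lines, `gzipDb` would warn and return only when the
result is empty, and would otherwise gzip the returned path.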
##########
tika-eval/tika-eval-app/src/main/java/org/apache/tika/eval/app/ExtractComparerRunner.java:
##########
@@ -128,15 +129,23 @@ public static void main(String[] args) throws Exception {
         if (commandLine.hasOption('r')) {
             String reportsDir = commandLine.getOptionValue("rd", "reports");
             LOG.info("Running Report...");
-            ResultsReporter.main(new String[]{"-d", dbPath, "-rd", reportsDir});
+            if (dbPath.startsWith("jdbc:")) {
+                ResultsReporter.main(new String[]{"-jdbc", dbPath, "-rd", reportsDir});
+            } else {
+                ResultsReporter.main(new String[]{"-d", dbPath, "-rd", reportsDir});
+            }
             Path reportsDirPath = Paths.get(reportsDir);
             if (Files.isDirectory(reportsDirPath)) {
-                Path tgzPath = reportsDirPath.resolveSibling(reportsDir + ".tar.gz");
+                Path tgzPath = reportsDirPath.resolveSibling(reportsDirPath.getFileName() + ".tgz");
Review Comment:
`reportsDirPath.getFileName()` can return null when `reportsDirPath` is the
filesystem root (e.g., `-rd /` on *nix), which would cause a
NullPointerException when building the archive name. Use a null-safe basename
when constructing the `.tgz` path.
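
One possible shape for a null-safe basename, as a sketch only. The class
`ArchiveNames`, the method `tgzPathFor`, and the `fallbackName` parameter are
hypothetical names introduced here; which fallback name to use is a design
choice left to the PR author.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of tika-eval.
final class ArchiveNames {
    /**
     * Builds the sibling "<basename>.tgz" path for a reports directory.
     * Falls back to fallbackName when the directory has no usable file name,
     * e.g. a filesystem root (getFileName() is null) or the empty path.
     */
    static Path tgzPathFor(Path reportsDirPath, String fallbackName) {
        Path fileName = reportsDirPath.getFileName();
        String base = (fileName == null || fileName.toString().isEmpty())
                ? fallbackName
                : fileName.toString();
        // resolveSibling returns the given path itself when there is no parent.
        return reportsDirPath.resolveSibling(base + ".tgz");
    }

    public static void main(String[] args) {
        System.out.println(tgzPathFor(Paths.get("out", "reports"), "reports"));
        System.out.println(tgzPathFor(Paths.get("/"), "reports"));
    }
}
```

For a root directory the archive then resolves relative to the working
directory rather than throwing, which may or may not be the desired location.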