[ https://issues.apache.org/jira/browse/TIKA-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18083519#comment-18083519 ]
Hudson commented on TIKA-4737:
------------------------------
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk17 #1386 (See
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk17/1386/])
TIKA-4737 -- improve docs for tika-pipes via tika-app (#2836) (github:
[https://github.com/apache/tika/commit/4bfbdf22cf0d42f3661e5f0abcd42fd7e6190735])
* (edit) docs/modules/ROOT/pages/pipes/configuration.adoc
* (edit) docs/modules/ROOT/pages/using-tika/cli/index.adoc
* (edit) docs/modules/ROOT/pages/pipes/cpu-sizing.adoc
* (edit) docs/modules/ROOT/pages/migration-to-4x/index.adoc
* (edit) tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java
> tika-4.0.0-alpha1 - Batch mode is confusing
> -------------------------------------------
>
> Key: TIKA-4737
> URL: https://issues.apache.org/jira/browse/TIKA-4737
> Project: Tika
> Issue Type: Bug
> Environment: Windows 11 with Java 17
> Reporter: Adrian Bird
> Priority: Major
>
> Looking at the documentation, I've found it very confusing about when to use
> what I'll call 'standard' mode vs 'batch' mode.
> # [Batch
> Processing|https://tika.apache.org/docs/4.0.0-SNAPSHOT/using-tika/cli/index.html#_batch_processing_tika_async_cli]
> says 'For processing large numbers of files, use {{tika-async-cli}}. It
> uses the Tika Pipes architecture with forked JVM processes for fault
> tolerance.'
> The examples use 'tika-async-cli.jar', which doesn't exist, but the
> example does run with 'tika-app.jar'.
> # When using 'tika-app.jar', it is not clear what makes it run in 'batch' or
> 'standard' mode. My assumption is that it is the presence of the '-i' and
> '-o' options.
> # The help from the 'batch' process differs quite a lot from the options
> specified in the Batch Processing page above and in the 'standard' help
> output.
> # The Batch Processing page above doesn't say anything about how to use a
> config file, but the help does.
> # It is confusing to have two different ways of specifying the config file,
> depending on whether you are using the 'standard' '--config=file.json' or the
> 'batch' '-c file.json'.
> # It would also be useful if a message were output saying whether it is
> running in 'standard' or 'batch' mode.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)