[ https://issues.apache.org/jira/browse/TIKA-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17648161#comment-17648161 ]
Tim Allison edited comment on TIKA-3947 at 12/15/22 4:56 PM:
-------------------------------------------------------------
We can move tika-eval and tika-app over to the pipes module. We shouldn't need
to support four different methods of forking+parsing, and the pipes module is
used in both tika-server and tika-app.
This will help with build speed and focus energy/attention on the tika-pipes
module.
We'll need to test the performance differences. My sense is that tika-batch is
faster because it uses a single JVM for all parsing and has only one monitor
JVM, whereas tika-pipes processes a single file at a time in a JVM.
> Remove tika-batch in 3.x
> ------------------------
>
> Key: TIKA-3947
> URL: https://issues.apache.org/jira/browse/TIKA-3947
> Project: Tika
> Issue Type: Task
> Components: tika-batch
> Reporter: Tim Allison
> Priority: Major
> Labels: tika-3x
>