[jira] [Commented] (TIKA-4512) Experiment with pf4j for tika-pipes

Tim Allison (Jira) Fri, 17 Oct 2025 09:39:43 -0700


    [ https://issues.apache.org/jira/browse/TIKA-4512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18030670#comment-18030670 ]


Tim Allison commented on TIKA-4512:
-----------------------------------

The other major change that we should adopt from [~ndipiazza]'s POC is 
changing the configs so that they are simple containers for JSON.

When we move to pf4j, we don't want to have, e.g., the 
FileSystemFetcherConfig.class in both the main module and in the plugin module. 
It should only live in the plugin module.

Therefore, we can create a generic FetcherConfig class in tika-pipes-api and 
have it carry the JSON and the pluginId.
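
A minimal sketch of what such a generic container might look like. The field 
names, accessors, and the example plugin id and JSON keys are assumptions for 
illustration, not the actual tika-pipes-api code. The point is that the class 
has no dependency on any plugin type: the JSON stays an opaque string until 
the plugin that owns it deserializes it.

```java
import java.util.Objects;

// Hypothetical sketch of a generic config container for tika-pipes-api.
// It carries only the plugin id and the raw JSON; it never parses the JSON,
// so no plugin-specific config class is needed in the main module.
final class FetcherConfig {

    // Identifies the plugin that knows how to interpret the JSON (assumed name).
    private final String pluginId;

    // Opaque JSON payload, passed through untouched to the plugin.
    private final String json;

    FetcherConfig(String pluginId, String json) {
        this.pluginId = Objects.requireNonNull(pluginId, "pluginId");
        this.json = Objects.requireNonNull(json, "json");
    }

    String getPluginId() {
        return pluginId;
    }

    String getJson() {
        return json;
    }
}
```

Under this sketch, something like 
new FetcherConfig("file-system-fetcher", "{\"basePath\":\"/data\"}") could be 
built in the main module, and only the plugin module would map the JSON onto 
its own FileSystemFetcherConfig.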

More generally, I've observed the same problem between the main pipes process 
and the forked pipes process, where we only want to load the parsers in the 
forked process. As we have it set up currently, users can pass in a 
PDFParserConfig.class from the main process, but that means that they have to 
have the parsers on the classpath of both the main process and the forked 
process. This is less than ideal.

Fixing that at the parser level is an entirely different and much larger thing, 
but I think we can experiment with that pattern on the pipes side for now.

> Experiment with pf4j for tika-pipes
> -----------------------------------
>
>                 Key: TIKA-4512
>                 URL: https://issues.apache.org/jira/browse/TIKA-4512
>             Project: Tika
>          Issue Type: Sub-task
>            Reporter: Tim Allison
>            Priority: Major
>
> This is proof of concept work based on [~ndipiazza] 's TIKA-4272-docker 
> branch.
> I really like a lot of that work.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
