Tim Allison created TIKA-4543:
---------------------------------
Summary: Reorganize pipes implementation modules around resource
as opposed to task
Key: TIKA-4543
URL: https://issues.apache.org/jira/browse/TIKA-4543
Project: Tika
Issue Type: Task
Reporter: Tim Allison
We currently organize the pipes implementations by task: fetchers, emitters,
etc. Each implementation contains little code, yet we maintain a lot of modules.
It would be more efficient to group the modules by resource (tika-pipes-s3,
tika-pipes-file-system) and have each module include the fetchers, emitters,
etc. for that resource.
This way, if we're fetching from S3, iterating over a bucket, and writing to S3,
the application needs only the tika-pipes-s3 module, along with its heavy S3
dependencies.
If we're fetching from S3 and writing to a local file share, the dependencies
would be the same under the proposed reorganization as they are now.
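
To make the two scenarios above concrete, here is a rough sketch that lists
the modules a pipeline would pull in under each layout. The task-based module
names (tika-fetcher-s3 and so on) and both helper functions are illustrative
assumptions, not taken from the build; only tika-pipes-s3 and
tika-pipes-file-system come from this proposal.

```python
# Hypothetical sketch: the "tika-<task>-<resource>" names are assumed for
# illustration; only the "tika-pipes-<resource>" names appear in the proposal.

def modules_by_task(pipeline):
    """Task-based layout: one module per (task, resource) pair."""
    return {f"tika-{task}-{resource}" for task, resource in pipeline.items()}


def modules_by_resource(pipeline):
    """Resource-based layout: one module per resource, however many tasks use it."""
    return {f"tika-pipes-{resource}" for resource in pipeline.values()}


# Scenario 1: fetch from S3, iterate over a bucket, emit to S3.
all_s3 = {"fetcher": "s3", "iterator": "s3", "emitter": "s3"}

# Scenario 2: fetch from S3, emit to a local file share.
s3_to_fs = {"fetcher": "s3", "emitter": "file-system"}

for name, pipeline in [("all_s3", all_s3), ("s3_to_fs", s3_to_fs)]:
    print(name, sorted(modules_by_task(pipeline)), sorted(modules_by_resource(pipeline)))
```

In the first scenario the three task-based modules collapse into one
resource module. In the second, both layouts need two modules and still pull
in the S3 dependencies, so nothing changes for that application.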
This change would only be in 4.x.
I'm going to draft a PR off the TIKA-4519 branch unless there are objections.