Tim Allison created TIKA-4543:
---------------------------------

             Summary: Reorganize pipes implementation modules around resource as opposed to task
                 Key: TIKA-4543
                 URL: https://issues.apache.org/jira/browse/TIKA-4543
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison


We currently organize pipes implementations by task – fetchers, emitters, etc. 
Each of those modules contains very little code, and we have a lot of modules.

It would be more efficient to group the modules by resource (e.g. 
tika-pipes-s3, tika-pipes-file-system) and have each module include the 
fetchers, emitters, etc. for that resource.

This way, if we're pulling from s3, iterating over a bucket and writing to s3, 
the application only needs the tika-pipes-s3 module, which carries the heavy 
s3 dependencies.

If we're pulling from s3 and writing to a local file share, the required 
dependencies would be the same under the proposed reorganization as they are 
now.
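
To make the two cases above concrete, here is a rough sketch that models both layouts. Everything in it is an assumption for illustration only: the per-task module names, the "aws-sdk-s3" placeholder and the idea that the file-system resource needs no third-party dependencies are hypothetical, not the actual Tika module names or dependency lists.

```python
# Hypothetical model: each resource pulls in one set of third-party dependencies.
RESOURCE_DEPS = {
    "s3": {"aws-sdk-s3"},    # placeholder name for the heavy s3 client
    "file-system": set(),    # assumed to need nothing beyond the JDK
}


def modules_by_task(steps):
    """Current layout: one module per (task, resource) pair (names illustrative)."""
    return {f"{task}-{resource}" for task, resource in steps}


def modules_by_resource(steps):
    """Proposed layout: one module per resource, e.g. tika-pipes-s3."""
    return {f"tika-pipes-{resource}" for _, resource in steps}


def third_party_deps(steps):
    """Third-party dependencies follow the resources used, not the layout."""
    deps = set()
    for _, resource in steps:
        deps |= RESOURCE_DEPS[resource]
    return deps


# Case 1: iterate over a bucket, fetch from s3, emit to s3.
s3_only = [("iterator", "s3"), ("fetcher", "s3"), ("emitter", "s3")]
# Case 2: fetch from s3, emit to a local file share.
s3_to_fs = [("fetcher", "s3"), ("emitter", "file-system")]

for name, steps in (("s3 only", s3_only), ("s3 to file share", s3_to_fs)):
    print(name)
    print("  by task:    ", sorted(modules_by_task(steps)))
    print("  by resource:", sorted(modules_by_resource(steps)))
    print("  third-party:", sorted(third_party_deps(steps)))
```

Under these assumptions the s3-only pipeline drops from three modules to one, while the s3-to-file-share pipeline needs two modules and the same third-party dependencies in either layout.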

This change would only be in 4.x.

I'm going to draft a PR off the TIKA-4519 branch unless there are objections.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
