[ 
https://issues.apache.org/jira/browse/TIKA-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18042526#comment-18042526
 ] 

Tim Allison commented on TIKA-4316:
-----------------------------------

I've started drafting design notes, based on notes and observations from the 
large refactoring for 4.x on TIKA-4545: 
[https://cwiki.apache.org/confluence/display/TIKA/Design+notes+for+4.x]. I'll 
try to keep that page up to date as we move forward. Please edit/comment as you 
see fit.

> Goals for Tika 4.x
> ------------------
>
>                 Key: TIKA-4316
>                 URL: https://issues.apache.org/jira/browse/TIKA-4316
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> I proposed a tentative roadmap here: 
> https://lists.apache.org/thread/9yfzf6qwpc7c6qnlp4tdwsdrnjvv7r1z
> Let's use this ticket to discuss some high-level changes in 4.x.
> Some thoughts:
> 1) Require Java 17
> 2) Remove tika-batch in favor of tika-pipes with filesystem dependencies
> 3) Move tika-pipes to a separate module. Consider moving non-trivial 
> implementations of tika-pipes components to a separate project? Consider 
> using pf4j in tika-pipes and other components?
> 4) Remove the unsupported dl4j, sentiment analysis, and agepredictor modules, 
> and...?
> 5) Avoid fat jars where possible -- at least move tika-server to a lib/* 
> pattern with the assembly plugin or pf4j instead of the shade plugin
> 6) Use an auto-correcting linter instead of checkstyle (cosium with Google's 
> style format?)
> 7) Remove the legacy external parser mechanism in favor of the external2 
> mechanism



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
