Tim Allison created TIKA-4550:
---------------------------------

             Summary: Determine migration path for xml-json configuration in 4.x
                 Key: TIKA-4550
                 URL: https://issues.apache.org/jira/browse/TIKA-4550
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison


In main, the json configuration is now in effect from 
tika-parsers-standard-package through the rest of the build. 

We still have remnants of the old configuration in tika-app, but tika-server 
is json only.

What do we want the user experience to be when migrating from 3.x to 4.x?

I think most of the conversions should be fairly straightforward. Claude was 
able to do a pretty good job, for example. I worry a bit about the parser 
configurations, though.

Mechanically at a high level:

0) Abrupt transition. Rely on documentation. Remove all of TikaConfig from main 
now.

1) Same as option 0, but try to backport the json deserialization into 3.x so 
that users can get ready.

2) Keep TikaConfig in main, and make it optional in tika-app and tika-server 
for 4.x.

3) Something else?

My preference is for option 0. I worry about the amount of code and effort it 
would take to get options 1 and 2 right.

WDYT?

 

Things that would help with the transition:

0) documentation, obviously.

1) Perhaps some code that would convert at least the parsers section for the 
tika-parsers-standard parsers from xml to json.

2) other things?

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
