Tim Allison created TIKA-4550:
---------------------------------
Summary: Determine migration path for xml-json configuration in 4.x
Key: TIKA-4550
URL: https://issues.apache.org/jira/browse/TIKA-4550
Project: Tika
Issue Type: Task
Reporter: Tim Allison
In main, the JSON configuration is now in effect from
tika-parsers-standard-package through the rest of the build.
tika-app still has remnants of the old XML configuration, but tika-server is
JSON-only.
What do we want the user experience to be when migrating from 3.x to 4.x?
I think most of the conversions should be fairly straightforward; Claude, for
example, was able to do a pretty good job. I worry a bit about the parser
configurations, though.
Mechanically, at a high level, the options are:
0) Abrupt transition. Rely on documentation. Remove all of TikaConfig from main
now.
1) Same as option 0, but try to backport the JSON deserialization into 3.x so
that users can get ready.
2) Keep TikaConfig in main, and make it optional in tika-app and tika-server
for 4.x.
3) Something else?
My preference is for option 0. I worry about the amount of code and effort it
would take to get 1 and 2 right.
WDYT?
Things that would help with the transition:
0) Documentation, obviously.
1) Perhaps some code that would convert at least the parsers section for the
tika-parsers-standard parsers from XML to JSON.
2) Other things?