janhoy commented on PR #3674:
URL: https://github.com/apache/solr/pull/3674#issuecomment-3480639957
> The PR introduces breaking changes (therefore backporting should probably
> be avoided). Apache Tika 2 and 3 standardized the metadata fields, which affect
> the returned fields.
I tackled that in the `tikaserver` backend by adding a metadata mapper that,
if enabled, maps e.g. `dc.author` to `Author`, to match what users may have
come to expect from Tika 1.x. If you intend to pursue an upgrade in the 9.x
line, re-using that class could perhaps make the upgrade somewhat more
compatible. Whether it would be compatible enough to warrant this breaking
change in 9.x, I don't know.
I'd not be opposed to announcing that a "necessary" breaking change will
happen in, say, 9.11, due to security risks, and then preparing users for the
change. I kept the mapping option hidden and undocumented, since I don't want
us to have to support it. But one could offer a user-supplied map
`{"from": "to", "from2": "to2"}` where she could tailor this. Or perhaps that
would not be needed, since we already have the fmap feature that can map
fields, e.g. `fmap.dc.author=Author`.
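
To make the two options concrete, here is a minimal sketch in Python of how a user-supplied map could be applied, and how the same map translates into `fmap.*` request parameters. It is an illustration only, not the mapper class in the `tikaserver` backend: the names `LEGACY_FIELD_MAP`, `map_metadata` and `to_fmap_params` are hypothetical, and `dc.author` to `Author` is the only mapping taken from this comment.

```python
from typing import Dict, Mapping

# Hypothetical user-supplied map: Tika 2/3 field name -> legacy Tika 1.x name.
# Only the dc.author -> Author pair comes from the discussion above.
LEGACY_FIELD_MAP: Dict[str, str] = {"dc.author": "Author"}


def map_metadata(
    metadata: Mapping[str, str],
    field_map: Mapping[str, str],
    enabled: bool = True,
) -> Dict[str, str]:
    """Rename metadata keys found in field_map; leave all other keys as-is.

    If two source keys end up with the same target name, the later one wins.
    """
    if not enabled:
        return dict(metadata)
    return {field_map.get(name, name): value for name, value in metadata.items()}


def to_fmap_params(field_map: Mapping[str, str]) -> Dict[str, str]:
    """Express the same map as fmap.<source>=<target> request parameters."""
    return {f"fmap.{source}": target for source, target in field_map.items()}
```

With this sketch, `to_fmap_params(LEGACY_FIELD_MAP)` yields the `fmap.dc.author=Author` parameter mentioned above, which suggests a separate map option may indeed be redundant.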
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]