Tim Allison created TIKA-3750:
---------------------------------
Summary: Bug in sorting parsers
Key: TIKA-3750
URL: https://issues.apache.org/jira/browse/TIKA-3750
Project: Tika
Issue Type: Bug
Reporter: Tim Allison
Throughout our documentation and unit tests, we declare that parsers with a
different namespace than org.apache.tika should come first. The problem is
that the DefaultParser iterates through the sorted list of parsers and, for
each supported mime type, overwrites whatever parser was already registered
for that type. The last parser in the list therefore wins.
So, if there's a custom parser {{com.acme.parser.PDFParser}} that supports
{{application/pdf}}, that will be added to the map of parsers in DefaultParser
first and then overwritten by org.apache.tika's PDFParser.
We should instead sort non-o.a.t. parsers last, so that they are the ones
left in the map, no?
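
To make the ordering problem concrete, here is a minimal standalone sketch. It does not use Tika's actual classes: {{FakeParser}}, {{buildMap}}, {{customFirst}} and {{customLast}} are hypothetical names, and the map-building loop only assumes the "later entry overwrites earlier entry" behavior described above. The class names are used purely as strings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ParserOrderSketch {

    // Hypothetical stand-in for a parser: a class name plus one supported mime type.
    public static final class FakeParser {
        final String className;
        final String mimeType;

        public FakeParser(String className, String mimeType) {
            this.className = className;
            this.mimeType = mimeType;
        }
    }

    static boolean isTika(FakeParser p) {
        return p.className.startsWith("org.apache.tika.");
    }

    // Assumed behavior: each parser overwrites any earlier parser for the same mime type.
    public static Map<String, String> buildMap(List<FakeParser> parsers) {
        Map<String, String> byMime = new LinkedHashMap<>();
        for (FakeParser p : parsers) {
            byMime.put(p.mimeType, p.className);
        }
        return byMime;
    }

    // Ordering as described in the issue: non-o.a.t. parsers first (false sorts before true).
    public static List<FakeParser> customFirst(List<FakeParser> parsers) {
        List<FakeParser> sorted = new ArrayList<>(parsers);
        sorted.sort(Comparator.comparing((FakeParser p) -> isTika(p)));
        return sorted;
    }

    // Proposed ordering: non-o.a.t. parsers last, so they overwrite the o.a.t. ones.
    public static List<FakeParser> customLast(List<FakeParser> parsers) {
        List<FakeParser> sorted = new ArrayList<>(parsers);
        sorted.sort(Comparator.comparing((FakeParser p) -> !isTika(p)));
        return sorted;
    }

    public static void main(String[] args) {
        List<FakeParser> parsers = List.of(
                new FakeParser("org.apache.tika.parser.pdf.PDFParser", "application/pdf"),
                new FakeParser("com.acme.parser.PDFParser", "application/pdf"));

        // Custom parser sorted first: the o.a.t. parser is put last and wins.
        System.out.println(buildMap(customFirst(parsers)).get("application/pdf"));
        // Custom parser sorted last: the custom parser is put last and wins.
        System.out.println(buildMap(customLast(parsers)).get("application/pdf"));
    }
}
```

With the custom parser sorted first, the first line printed is the org.apache.tika class name; with it sorted last, the second line is {{com.acme.parser.PDFParser}}. An alternative that keeps the current ordering would be to skip mime types that are already in the map (for example with {{putIfAbsent}}), but that is a different change from the one proposed here.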