Tim Allison created TIKA-3750:
---------------------------------

             Summary: Bug in sorting parsers
                 Key: TIKA-3750
                 URL: https://issues.apache.org/jira/browse/TIKA-3750
             Project: Tika
          Issue Type: Bug
            Reporter: Tim Allison


Throughout our documentation and unit tests, we declare that parsers in a 
namespace other than org.apache.tika should come first.  The problem is that 
the DefaultParser iterates through the list of parsers and, for each supported 
mime type, lets a later parser overwrite an earlier one.

So, if there's a custom parser {{com.acme.parser.PDFParser}} that supports 
{{application/pdf}}, it will be added to the map of parsers in DefaultParser 
first and then overwritten by org.apache.tika's PDFParser.

We should instead sort non-org.apache.tika parsers last, so that they are the 
ones doing the overwriting, no?
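
To illustrate the overwrite described above, here is a minimal, self-contained 
sketch. It is not Tika's actual code: {{ParserOrderSketch}}, {{FakeParser}}, 
{{sort}} and {{buildMap}} are hypothetical names, and the only assumption is 
that the map is filled by a plain put per supported mime type, so the last 
parser in the list wins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the behaviour described in this issue; NOT Tika's code.
class ParserOrderSketch {

    // Hypothetical stand-in for a parser: a class name plus one mime type.
    static final class FakeParser {
        final String className;
        final String mimeType;

        FakeParser(String className, String mimeType) {
            this.className = className;
            this.mimeType = mimeType;
        }
    }

    static boolean isTikaParser(FakeParser p) {
        return p.className.startsWith("org.apache.tika.");
    }

    // Returns a sorted copy. customFirst=true puts non-org.apache.tika parsers
    // first; customFirst=false puts them last.
    static List<FakeParser> sort(List<FakeParser> parsers, boolean customFirst) {
        List<FakeParser> copy = new ArrayList<>(parsers);
        // Boolean ordering is false < true, so non-Tika parsers sort first.
        Comparator<FakeParser> customBeforeTika =
                Comparator.comparing(ParserOrderSketch::isTikaParser);
        copy.sort(customFirst ? customBeforeTika : customBeforeTika.reversed());
        return copy;
    }

    // Assumed fill strategy: a later parser overwrites an earlier one that
    // supports the same mime type.
    static Map<String, String> buildMap(List<FakeParser> sorted) {
        Map<String, String> byMime = new LinkedHashMap<>();
        for (FakeParser p : sorted) {
            byMime.put(p.mimeType, p.className);
        }
        return byMime;
    }

    public static void main(String[] args) {
        List<FakeParser> parsers = Arrays.asList(
                new FakeParser("org.apache.tika.parser.pdf.PDFParser", "application/pdf"),
                new FakeParser("com.acme.parser.PDFParser", "application/pdf"));

        // Custom parser sorted first: the org.apache.tika parser overwrites it.
        System.out.println(buildMap(sort(parsers, true)).get("application/pdf"));
        // Custom parser sorted last: the custom parser wins.
        System.out.println(buildMap(sort(parsers, false)).get("application/pdf"));
    }
}
```

With this fill strategy, only the custom-last ordering leaves 
{{com.acme.parser.PDFParser}} in the map for {{application/pdf}}.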



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
