[ https://issues.apache.org/jira/browse/TIKA-298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764471#action_12764471 ]
Ken Krugler commented on TIKA-298:
----------------------------------

Jukka said on the mailing list:

========================================================
Note that the MimeType.getSuperType() method already does some of this,
and we have related supertype settings stored in the tika-mimetypes.xml
configuration. The type registry could also be told about the +xml
convention and related implicit supertype settings like the ones encoded
in the MediaType.isSpecializationOf() method.

(Note that we currently have both MimeType and MediaType classes for
similar purposes. This is due to an ongoing redesign of the mime type
registry. For now it's probably best to work on the MimeType class until
the redesign is more complete.)
========================================================

> CompositeParser.getParser() should use mimetype hierarchy when falling back
> ---------------------------------------------------------------------------
>
>                 Key: TIKA-298
>                 URL: https://issues.apache.org/jira/browse/TIKA-298
>             Project: Tika
>          Issue Type: Improvement
>    Affects Versions: 0.4
>            Reporter: Ken Krugler
>
> CompositeParser.getParser() doesn't use supertypes when falling back - if it
> can't get a parser for the exact mimetype, then it goes
> straight to the fallback parser.
> So, for example, if the file mimetype is application/<whatever>+xml, and no
> parser exists for it, then you get the default "do nothing" parser rather
> than the XML parser.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
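
For illustration only, the fallback behaviour requested in this issue could be sketched roughly as below. This is not Tika code: the class SupertypeFallbackSketch, its methods, and the parser names are hypothetical stand-ins, and the only implicit rule modelled is the +xml convention mentioned above (any type ending in +xml is treated as a subtype of application/xml). Explicit supertype entries stand in for the settings kept in tika-mimetypes.xml.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not part of Tika: resolves a parser by walking up
// the media type hierarchy before giving up and using the fallback.
public class SupertypeFallbackSketch {
    // Media type -> parser name (strings stand in for real Parser objects).
    private final Map<String, String> parsers = new HashMap<>();
    // Explicitly configured supertypes (assumed analogue of tika-mimetypes.xml).
    private final Map<String, String> supertypes = new HashMap<>();
    private final String fallback;

    public SupertypeFallbackSketch(String fallback) {
        this.fallback = fallback;
    }

    public void registerParser(String type, String parserName) {
        parsers.put(type, parserName);
    }

    public void registerSupertype(String type, String supertype) {
        supertypes.put(type, supertype);
    }

    // Implicit rule: "<anything>+xml" is treated as a kind of application/xml.
    static String implicitSupertype(String type) {
        return type.endsWith("+xml") ? "application/xml" : null;
    }

    // Try the exact type, then each supertype in turn, then the fallback.
    public String getParser(String type) {
        Set<String> seen = new HashSet<>(); // guards against supertype cycles
        String current = type;
        while (current != null && seen.add(current)) {
            String parser = parsers.get(current);
            if (parser != null) {
                return parser;
            }
            String parent = supertypes.get(current);
            current = (parent != null) ? parent : implicitSupertype(current);
        }
        return fallback;
    }

    public static void main(String[] args) {
        SupertypeFallbackSketch registry = new SupertypeFallbackSketch("EmptyParser");
        registry.registerParser("application/xml", "XMLParser");
        System.out.println(registry.getParser("application/rdf+xml")); // XMLParser
        System.out.println(registry.getParser("image/png"));           // EmptyParser
    }
}
```

With a lookup like this, application/<whatever>+xml would reach the XML parser whenever no more specific parser is registered, and only types with no matching ancestor would get the "do nothing" parser.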