[
https://issues.apache.org/jira/browse/TIKA-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638903#comment-13638903
]
Nick Burch commented on TIKA-1110:
----------------------------------
CompositeParser always works with the canonical (normalised) form of the
mimetype. The Parser *must* list the canonical form as one it supports (it can
also list aliases if it wants, but that isn't needed).
Would you be able to provide a patch that changes ChmParser to list the
canonical mimetype, rather than an alias?
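To illustrate why listing only an alias fails, here is a minimal sketch of a lookup that normalises the incoming type before matching it against a parser's declared types. It is a simplified stand-in that assumes the lookup behaves as described in the comment above; it is not Tika's actual code. The class name CanonicalTypeLookup, the ALIASES table, and the helpers normalize and canParse are all hypothetical, and treating application/chm and application/x-chm as registered aliases is an assumption made for the example.

```java
import java.util.Map;
import java.util.Set;

// Simplified model of canonical-type lookup; NOT Tika's implementation.
class CanonicalTypeLookup {
    // Hypothetical alias table (alias -> canonical type). Assumes both
    // in-the-wild names are registered as aliases of the IANA type.
    static final Map<String, String> ALIASES = Map.of(
            "application/chm", "application/vnd.ms-htmlhelp",
            "application/x-chm", "application/vnd.ms-htmlhelp");

    // Map an alias to its canonical form; unknown types pass through unchanged.
    static String normalize(String type) {
        return ALIASES.getOrDefault(type, type);
    }

    // The incoming type is normalised first, so a parser matches only if its
    // declared set contains the canonical form.
    static boolean canParse(Set<String> supportedTypes, String incomingType) {
        return supportedTypes.contains(normalize(incomingType));
    }

    public static void main(String[] args) {
        Set<String> aliasOnly = Set.of("application/chm");
        Set<String> canonical = Set.of("application/vnd.ms-htmlhelp");

        // Alias-only declaration never matches the canonical type.
        System.out.println(canParse(aliasOnly, "application/vnd.ms-htmlhelp")); // false
        // Canonical declaration matches the official type and the aliases.
        System.out.println(canParse(canonical, "application/vnd.ms-htmlhelp")); // true
        System.out.println(canParse(canonical, "application/x-chm"));           // true
    }
}
```

In this model, declaring the canonical type alone is enough to cover the aliases as well, which is why listing the aliases is optional.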
> Incorrectly declared SUPPORTED_TYPES in ChmParser.
> --------------------------------------------------
>
> Key: TIKA-1110
> URL: https://issues.apache.org/jira/browse/TIKA-1110
> Project: Tika
> Issue Type: Bug
> Affects Versions: 1.3, 1.4
> Reporter: Andrzej Bialecki
> Fix For: 1.4
>
>
> [This
> link|http://www.iana.org/assignments/media-types/application/vnd.ms-htmlhelp]
> assigns the official mime type for these files to
> "application/vnd.ms-htmlhelp". In the wild there are also two other types
> used:
> * application/chm
> * application/x-chm
> tika-mimetypes.xml uses the correct official mime type, but ChmParser
> declares that it supports only "application/chm". For this reason content
> that uses the official mime type (e.g. coming via Detector or parsed using
> AutoDetectParser, or simply declared in metadata) fails to parse due to
> unknown mime type.
> The fix seems simple: ChmParser should also declare all of the above types
> in its SUPPORTED_TYPES.