[ https://issues.apache.org/jira/browse/TIKA-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638903#comment-13638903 ]

Nick Burch commented on TIKA-1110:
----------------------------------

CompositeParser always works with the canonical (normalised) form of the 
mimetype. The Parser *must* list the canonical form as one it supports (it can 
list aliases too if it wants, but that isn't needed).

Would you be able to do a patch that changes ChmParser to list the correct 
mimetype, rather than an alias?
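
To illustrate why listing only an alias fails, here is a minimal, self-contained 
sketch of the lookup described above. It is not Tika's code: the class name, the 
helper methods (normalise, index, findParser) and the plain-String alias table 
are all assumptions made for illustration, and only the three mimetypes are 
taken from the issue.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified model of a canonical-type lookup. NOT Tika's implementation;
// every name in this class is hypothetical.
class CanonicalTypeSketch {

    static final String CANONICAL = "application/vnd.ms-htmlhelp";

    // Assumed alias table: alias -> canonical type (types from TIKA-1110).
    static final Map<String, String> ALIASES = new HashMap<>();
    static {
        ALIASES.put("application/chm", CANONICAL);
        ALIASES.put("application/x-chm", CANONICAL);
    }

    // Map an alias to its canonical form; other types pass through unchanged.
    static String normalise(String type) {
        return ALIASES.getOrDefault(type, type);
    }

    // Index a parser under exactly the types it declares, nothing more.
    static Map<String, String> index(String parserName, Set<String> supportedTypes) {
        Map<String, String> parsers = new HashMap<>();
        for (String type : supportedTypes) {
            parsers.put(type, parserName);
        }
        return parsers;
    }

    // The lookup always uses the canonical form of the requested type.
    static String findParser(Map<String, String> parsers, String type) {
        return parsers.get(normalise(type));
    }

    public static void main(String[] args) {
        // Parser that declares only an alias, as in the bug report.
        Map<String, String> aliasOnly = index("ChmParser",
                new HashSet<>(Arrays.asList("application/chm")));
        System.out.println(findParser(aliasOnly, CANONICAL));          // null
        System.out.println(findParser(aliasOnly, "application/chm"));  // null

        // Parser that declares the canonical type; aliases are optional.
        Map<String, String> fixed = index("ChmParser",
                new HashSet<>(Arrays.asList(CANONICAL, "application/chm")));
        System.out.println(findParser(fixed, CANONICAL));              // ChmParser
        System.out.println(findParser(fixed, "application/x-chm"));    // ChmParser
    }
}
```

In this model the alias-only parser is never found, because every request is 
normalised before the lookup. Once the canonical type is declared, requests for 
any alias resolve as well.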
                
> Incorrectly declared SUPPORTED_TYPES in ChmParser.
> --------------------------------------------------
>
>                 Key: TIKA-1110
>                 URL: https://issues.apache.org/jira/browse/TIKA-1110
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.3, 1.4
>            Reporter: Andrzej Bialecki 
>             Fix For: 1.4
>
>
> [This 
> link|http://www.iana.org/assignments/media-types/application/vnd.ms-htmlhelp] 
> assigns the official mime type for these files to 
> "application/vnd.ms-htmlhelp". In the wild there are also two other types 
> used:
> * application/chm
> * application/x-chm
> tika-mimetypes.xml uses the correct official mime type, but ChmParser 
> declares that it supports only "application/chm". For this reason content 
> that uses the official mime type (e.g. coming via Detector or parsed using 
> AutoDetectParser, or simply declared in metadata) fails to parse due to 
> unknown mime type.
> The fix seems simple: ChmParser should also declare all of the above types 
> in its SUPPORTED_TYPES.
