[ 
https://issues.apache.org/jira/browse/TIKA-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Roizman updated TIKA-1110:
--------------------------------

    Attachment: TIKA-1110.patch

Nick, the patch lists all three types and also sets the content type in the
metadata to "application/vnd.ms-htmlhelp".

> Incorrectly declared SUPPORTED_TYPES in ChmParser.
> --------------------------------------------------
>
>                 Key: TIKA-1110
>                 URL: https://issues.apache.org/jira/browse/TIKA-1110
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.3, 1.4
>            Reporter: Andrzej Bialecki 
>             Fix For: 1.5
>
>         Attachments: TIKA-1110.patch
>
>
> The IANA registry
> (http://www.iana.org/assignments/media-types/application/vnd.ms-htmlhelp)
> gives the official MIME type for these files as
> "application/vnd.ms-htmlhelp". Two other types are also used in the wild:
> * application/chm
> * application/x-chm
> tika-mimetypes.xml uses the correct official MIME type, but ChmParser
> declares that it supports only "application/chm". As a result, content
> that carries the official MIME type (e.g. assigned by a Detector, parsed via
> AutoDetectParser, or simply declared in metadata) fails to parse because
> the MIME type is unknown to the parser.
> The fix seems simple - ChmParser should also declare all of the above types
> in its SUPPORTED_TYPES.
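
For illustration only, here is a rough sketch of the idea described above:
declare all three types in one set and check a declared type against it. This
is not the attached patch and not the actual ChmParser code. The class name
ChmTypesSketch and the supports() helper are hypothetical, and plain strings
stand in for Tika's MediaType objects so that the sketch compiles without
Tika on the classpath.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the parser class; not part of Tika.
final class ChmTypesSketch {

    // Assumption: strings replace org.apache.tika.mime.MediaType here so the
    // sketch has no external dependencies. The set lists the official IANA
    // type plus the two unofficial types named in the issue.
    static final Set<String> SUPPORTED_TYPES = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList(
                    "application/vnd.ms-htmlhelp",
                    "application/chm",
                    "application/x-chm")));

    // Hypothetical helper: returns true if the declared type is one of the
    // supported types, ignoring case and surrounding whitespace.
    static boolean supports(String declaredType) {
        if (declaredType == null) {
            return false;
        }
        String normalized = declaredType.trim().toLowerCase(Locale.ROOT);
        return SUPPORTED_TYPES.contains(normalized);
    }

    public static void main(String[] args) {
        String[] samples = {
            "application/vnd.ms-htmlhelp",
            "application/chm",
            "application/x-chm",
            "text/plain"
        };
        for (String type : samples) {
            System.out.println(type + " -> " + supports(type));
        }
    }
}
```

With only "application/chm" in the set, the first and third samples would be
rejected, which corresponds to the failure described in the issue.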



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
