[ https://issues.apache.org/jira/browse/TIKA-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch resolved TIKA-1044.
------------------------------

       Resolution: Fixed
    Fix Version/s: 1.3

Fixed in r1421646, along with a unit test based on your files, thanks!
                
> Can't parse Word files with no format set
> -----------------------------------------
>
>                 Key: TIKA-1044
>                 URL: https://issues.apache.org/jira/browse/TIKA-1044
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.0
>            Reporter: Jonas Wilhelmsson
>            Priority: Trivial
>             Fix For: 1.3
>
>         Attachments: test2.doc, test.docx
>
>
> While using Solr for indexing, we came across this Tika bug.
> When parsing a doc or docx file that contains text without any format set 
> (a format as applied inside Microsoft Word), the parser throws exceptions.
> Once a format is set on the text, the file is parsed correctly without 
> unexpected errors.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
