[ https://issues.apache.org/jira/browse/TIKA-3709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514776#comment-17514776 ]
Tim Allison commented on TIKA-3709:
-----------------------------------
Duplicates TIKA-1733, too? I'll take a look.
> RuntimeException when parsing word (.doc) document
> --------------------------------------------------
>
> Key: TIKA-3709
> URL: https://issues.apache.org/jira/browse/TIKA-3709
> Project: Tika
> Issue Type: Bug
> Affects Versions: 2.3.0
> Reporter: Johannes Wirkkala Westlund
> Priority: Minor
> Attachments: Avtalsvillkor (1).doc
>
>
> Hi,
> I have a Word file that throws the following error when I try to parse it with
> Tika:
> {code:java}
> Caused by: java.lang.IllegalArgumentException: This paragraph is not the first one in the table
> at org.apache.poi.hwpf.usermodel.Range.getTable(Range.java:810)
> at org.apache.tika.parser.microsoft.WordExtractor.handleParagraph(WordExtractor.java:272)
> at org.apache.tika.parser.microsoft.WordExtractor.handleHeaderFooter(WordExtractor.java:255)
> at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:210)
> at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:216)
> at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:173)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:289)
> ... 5 more
> {code}
> I have attached the document with this issue.
> Might be related to: https://issues.apache.org/jira/browse/TIKA-1251