[ https://issues.apache.org/jira/browse/TIKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mats Norén updated TIKA-109:
----------------------------

    Attachment: fil6.doc

Attached file fails with:

java.lang.StringIndexOutOfBoundsException: String index out of range: -4095
        at java.lang.AbstractStringBuilder.substring(AbstractStringBuilder.java:886)
        at java.lang.StringBuffer.substring(StringBuffer.java:417)
        at org.apache.poi.hwpf.model.TextPiece.substring(TextPiece.java:88)
        at org.apache.tika.parser.microsoft.WordParser.extractText(WordParser.java:163)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:61)
        at org.apache.tika.utils.ParseUtils.getStringContent(ParseUtils.java:173)
        at org.apache.tika.utils.ParseUtils.getStringContent(ParseUtils.java:233)
        at org.apache.tika.utils.ParseUtils.getStringContent(ParseUtils.java:251)
        at org.apache.tika.TestParsers.testWORDxtraction(TestParsers.java:120)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:40)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at com.intellij.rt.execution.application.AppMain.main(AppMain.java:90)

> WordParser fails on some Word files
> -----------------------------------
>
>                 Key: TIKA-109
>                 URL: https://issues.apache.org/jira/browse/TIKA-109
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.1-incubator
>         Environment: Windows XP
> Java(TM) SE Runtime Environment (build 1.6.0_03-b05)
>            Reporter: Mats Norén
>         Attachments: fil6.doc
>
>
> WordParser fails on some Word files. A negative value is passed to
> TextPiece.substring in POI in some corner case of the algorithm.
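
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the POI or Tika code: the class SubstringGuard and both of its methods are hypothetical names introduced only for illustration. It assumes the exception comes from an out-of-range index reaching StringBuffer.substring, and shows one possible guard that clamps the range before the call.

```java
public class SubstringGuard {

    /**
     * Hypothetical helper (not part of POI): clamps start and end into
     * [0, buf.length()] and ensures start <= end before calling substring.
     */
    static String safeSubstring(StringBuffer buf, int start, int end) {
        int s = Math.max(0, Math.min(start, buf.length()));
        int e = Math.max(s, Math.min(end, buf.length()));
        return buf.substring(s, e);
    }

    /** Returns true if the unguarded substring call rejects the range. */
    static boolean throwsOnRange(StringBuffer buf, int start, int end) {
        try {
            buf.substring(start, end);
            return false;
        } catch (StringIndexOutOfBoundsException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        StringBuffer buf = new StringBuffer("hello world");
        // A negative start index makes the unguarded call throw.
        System.out.println(throwsOnRange(buf, -4095, 5));
        // The clamped call returns whatever part of the range is valid.
        System.out.println(safeSubstring(buf, -4095, 5));
    }
}
```

Clamping hides the symptom and may silently drop text, so it is only a stopgap; the underlying offset calculation would still need to be corrected where the negative value originates.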

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
