[ https://issues.apache.org/jira/browse/TIKA-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mats Norén updated TIKA-109:
----------------------------

    Attachment: fil6.doc

Attached file fails with:

java.lang.StringIndexOutOfBoundsException: String index out of range: -4095
	at java.lang.AbstractStringBuilder.substring(AbstractStringBuilder.java:886)
	at java.lang.StringBuffer.substring(StringBuffer.java:417)
	at org.apache.poi.hwpf.model.TextPiece.substring(TextPiece.java:88)
	at org.apache.tika.parser.microsoft.WordParser.extractText(WordParser.java:163)
	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:61)
	at org.apache.tika.utils.ParseUtils.getStringContent(ParseUtils.java:173)
	at org.apache.tika.utils.ParseUtils.getStringContent(ParseUtils.java:233)
	at org.apache.tika.utils.ParseUtils.getStringContent(ParseUtils.java:251)
	at org.apache.tika.TestParsers.testWORDxtraction(TestParsers.java:120)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:40)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:90)

> WordParser fails on some Word files
> -----------------------------------
>
>                 Key: TIKA-109
>                 URL: https://issues.apache.org/jira/browse/TIKA-109
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.1-incubator
>         Environment: Windows XP
> Java(TM) SE Runtime Environment (build 1.6.0_03-b05)
>            Reporter: Mats Norén
>         Attachments: fil6.doc
>
>
> WordParser fails on some Word files. In some corner case of the algorithm, a negative
> index is passed to TextPiece.substring in POI.
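
For illustration only, here is a minimal sketch of the failure mode named in the trace: a negative start offset passed to StringBuffer.substring throws StringIndexOutOfBoundsException. The class SubstringGuard and the helper safeSubstring are hypothetical and are not part of POI or Tika. Clamping the offsets is only one possible defensive measure and could hide lost text; the actual cause of the negative offset in WordParser is not established here.

```java
public class SubstringGuard {

    // Hypothetical helper (not POI or Tika API): clamps both offsets into
    // [0, buf.length()] and forces start <= end before delegating.
    static String safeSubstring(StringBuffer buf, int start, int end) {
        int length = buf.length();
        int s = Math.max(0, Math.min(start, length));
        int e = Math.max(s, Math.min(end, length));
        return buf.substring(s, e);
    }

    // Returns true if the unguarded StringBuffer.substring call rejects the offsets.
    static boolean unguardedThrows(StringBuffer buf, int start, int end) {
        try {
            buf.substring(start, end);
            return false;
        } catch (StringIndexOutOfBoundsException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        StringBuffer text = new StringBuffer("Hello, Tika");
        // The negative start offset mirrors the -4095 seen in the report.
        System.out.println(unguardedThrows(text, -4095, 5)); // true
        System.out.println(safeSubstring(text, -4095, 5));   // Hello
    }
}
```

A guard like this would only mask the symptom. If the offsets are computed incorrectly upstream, the extracted text may still be incomplete.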
-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.