RTF parsing fails with Java 7 early access on 64bit platforms
-------------------------------------------------------------
Key: TIKA-621
URL: https://issues.apache.org/jira/browse/TIKA-621
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 0.9, 0.8
Environment: $ java -version
java version "1.7.0-ea"
Java(TM) SE Runtime Environment (build 1.7.0-ea-b134)
Java HotSpot(TM) 64-Bit Server VM (build 21.0-b04, mixed mode)
(Seen using this version of Java on both Windows 2008 and CentOS 5)
Reporter: Matt Sheppard
I've run across an RTF document that Tika fails to convert on 64-bit
platforms (Windows and Linux) when using the Java 7 early access version. The
same document converts successfully on 32-bit Windows and Linux, and on Java 6.
{noformat}
java -jar tika-app-0.9.jar -t full.rtf
Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.rtf.RTFParser@1fa78298
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:107)
at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:302)
at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:91)
Caused by: java.lang.NullPointerException
at javax.swing.text.GapContent.compare(Unknown Source)
at javax.swing.text.GapContent.findSortIndex(Unknown Source)
at javax.swing.text.GapContent.createPosition(Unknown Source)
at javax.swing.text.AbstractDocument.createPosition(Unknown Source)
at javax.swing.text.AbstractDocument$LeafElement.<init>(Unknown Source)
at javax.swing.text.AbstractDocument.createLeafElement(Unknown Source)
at javax.swing.text.DefaultStyledDocument$ElementBuffer.insertElement(Unknown Source)
at javax.swing.text.DefaultStyledDocument$ElementBuffer.insertUpdate(Unknown Source)
at javax.swing.text.DefaultStyledDocument$ElementBuffer.insert(Unknown Source)
at javax.swing.text.DefaultStyledDocument.insertUpdate(Unknown Source)
at javax.swing.text.AbstractDocument.handleInsertString(Unknown Source)
at javax.swing.text.AbstractDocument.insertString(Unknown Source)
at org.apache.tika.parser.rtf.RTFParser$CustomStyledDocument.insertString(RTFParser.java:376)
at javax.swing.text.rtf.RTFReader$DocumentDestination.deliverText(Unknown Source)
at javax.swing.text.rtf.RTFReader$TextHandlingDestination.handleText(Unknown Source)
at javax.swing.text.rtf.RTFReader.handleText(Unknown Source)
at javax.swing.text.rtf.RTFParser.write(Unknown Source)
at javax.swing.text.rtf.AbstractFilter.write(Unknown Source)
at javax.swing.text.rtf.AbstractFilter.readFromStream(Unknown Source)
at javax.swing.text.rtf.RTFEditorKit.read(Unknown Source)
at org.apache.tika.parser.rtf.RTFParser.parse(RTFParser.java:112)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
... 5 more
{noformat}
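
The trace shows the NullPointerException is raised inside the JDK's Swing text classes (GapContent.compare), reached through RTFEditorKit.read from Tika's RTFParser. To help tell a JDK problem from a Tika one, the same JDK path can be exercised without Tika. The following is only a minimal sketch: the class name RtfSwingRepro, the helper extractText, and the built-in sample RTF string are hypothetical, and it uses a plain DefaultStyledDocument rather than Tika's CustomStyledDocument. The tiny sample is not known to trigger the failure; passing the path of the failing document as the first argument is what would be of interest.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.InputStream;

import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical standalone check: reads RTF through the JDK's Swing
// RTFEditorKit, the same JDK entry point that appears in the stack trace.
public class RtfSwingRepro {

    // Parses the RTF stream into a styled document and returns its plain text.
    static String extractText(InputStream in) throws Exception {
        RTFEditorKit kit = new RTFEditorKit();
        DefaultStyledDocument doc = new DefaultStyledDocument();
        kit.read(in, doc, 0);
        return doc.getText(0, doc.getLength());
    }

    public static void main(String[] args) throws Exception {
        InputStream in;
        if (args.length > 0) {
            // Use the document that fails under Tika, e.g. full.rtf.
            in = new FileInputStream(args[0]);
        } else {
            // Assumed sample input, only to show that the call sequence runs.
            in = new ByteArrayInputStream(
                    "{\\rtf1\\ansi Hello, RTF}".getBytes("US-ASCII"));
        }
        try {
            System.out.println(extractText(in));
        } finally {
            in.close();
        }
    }
}
```

If this fails with the same NullPointerException on the 64-bit Java 7 early access build but succeeds on Java 6, that would point at the JDK rather than at Tika.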