Tim Allison created TIKA-3477:
---------------------------------
Summary: Fix new closed channel exception in MSOffice files in 2.x
Key: TIKA-3477
URL: https://issues.apache.org/jira/browse/TIKA-3477
Project: Tika
Issue Type: Task
Reporter: Tim Allison
There's a new exception when processing embedded files (?) in MSOffice OLE2 files on
{{main}}. I found this via regression comparisons against Tika 1.27.
{noformat}
Caused by: java.lang.RuntimeException: java.nio.channels.ClosedChannelException
at org.apache.poi.poifs.filesystem.POIFSStream$StreamBlockByteBufferIterator.<init>(POIFSStream.java:151)
at org.apache.poi.poifs.filesystem.POIFSStream.getBlockIterator(POIFSStream.java:95)
at org.apache.poi.poifs.filesystem.POIFSStream.iterator(POIFSStream.java:86)
at org.apache.poi.poifs.filesystem.POIFSDocument.getBlockIterator(POIFSDocument.java:183)
at org.apache.poi.poifs.filesystem.DocumentInputStream.<init>(DocumentInputStream.java:97)
at org.apache.poi.poifs.filesystem.DirectoryNode.createDocumentInputStream(DirectoryNode.java:159)
at org.apache.poi.hwpf.HWPFDocumentCore.getDocumentEntryBytes(HWPFDocumentCore.java:329)
at org.apache.poi.hwpf.HWPFDocumentCore.getEncryptionInfo(HWPFDocumentCore.java:254)
at org.apache.poi.hwpf.HWPFDocumentCore.getDocumentEntryBytes(HWPFDocumentCore.java:327)
at org.apache.poi.hwpf.HWPFDocumentCore.<init>(HWPFDocumentCore.java:169)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:193)
at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:155)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:216)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:173)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:289)
... 31 more
Caused by: java.nio.channels.ClosedChannelException
at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
{noformat}
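For anyone unfamiliar with the innermost cause, here is a minimal sketch of how a {{ClosedChannelException}} arises in plain NIO: any read issued on a {{FileChannel}} after it has been closed fails the open check. This only illustrates the exception itself and is not a reproduction of the Tika/POI code path. The class name {{ClosedChannelDemo}}, the method {{readAfterClose}} and the temp file are hypothetical, and where the channel actually gets closed in this issue is not established here.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of Tika or POI.
public class ClosedChannelDemo {

    // Opens a channel on a temp file, closes it, then attempts a read.
    // Returns the simple name of the exception raised by the read.
    static String readAfterClose() throws IOException {
        Path tmp = Files.createTempFile("closed-channel-demo", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3, 4});
            FileChannel channel = FileChannel.open(tmp, StandardOpenOption.READ);
            // Simulates some other component closing the underlying channel
            // while a consumer still holds a reference to it.
            channel.close();
            try {
                channel.read(ByteBuffer.allocate(4), 0);
                return "no exception";
            } catch (ClosedChannelException e) {
                return e.getClass().getSimpleName();
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAfterClose());
    }
}
```

If the regression follows this pattern, the thing to look for would be a file-backed resource that is closed before the embedded document is read. That is an assumption to verify against the actual code, not a diagnosis.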