[
https://issues.apache.org/jira/browse/TIKA-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380767#comment-17380767
]
Hudson commented on TIKA-3477:
------------------------------
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #280 (See
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk8/280/])
TIKA-3477 -- don't close embedded word doc because that in turn closes the
root. Unrelated issue, avoid NPE if ooxml part doesn't exist. (tallison:
[https://github.com/apache/tika/commit/070222a6a968aca6aff6fdf438379dfe4723e7e0])
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/AbstractOOXMLExtractor.java
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/parser/microsoft/WordExtractor.java
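
The failure mode named in the commit message above (closing an embedded document also closes the root) can be sketched with plain java.nio, without POI or Tika. This is an illustration under assumptions, not the code from the commit: SharedChannelSketch, EmbeddedView and rootBrokenAfter are hypothetical names, and the sketch only assumes that the embedded document borrows the same FileChannel as its container.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SharedChannelSketch {

    /** Hypothetical stand-in for an embedded document that borrows the root's channel. */
    static final class EmbeddedView implements AutoCloseable {
        private final FileChannel shared;

        EmbeddedView(FileChannel shared) {
            this.shared = shared;
        }

        int readFirstByte() throws IOException {
            ByteBuffer buf = ByteBuffer.allocate(1);
            shared.read(buf, 0); // positional read, does not move the channel position
            return buf.get(0);
        }

        @Override
        public void close() throws IOException {
            // The problematic part: closing the view closes the channel it does not own.
            shared.close();
        }
    }

    /** Returns true if the root channel can no longer be read after handling the embedded view. */
    static boolean rootBrokenAfter(boolean closeEmbedded) throws IOException {
        Path tmp = Files.createTempFile("shared-channel", ".bin");
        Files.write(tmp, new byte[] {42});
        try (FileChannel root = FileChannel.open(tmp, StandardOpenOption.READ)) {
            EmbeddedView embedded = new EmbeddedView(root);
            embedded.readFirstByte();
            if (closeEmbedded) {
                embedded.close();
            }
            try {
                // The container still needs its channel after the embedded part is done.
                root.read(ByteBuffer.allocate(1), 0);
                return false;
            } catch (ClosedChannelException e) {
                return true;
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("close embedded: root broken = " + rootBrokenAfter(true));
        System.out.println("leave embedded open: root broken = " + rootBrokenAfter(false));
    }
}
```

In this sketch, leaving the borrowed view open and letting the owner of the channel close it is what keeps later reads on the root working, which matches the approach the commit message describes.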
> Fix new closed channel exception in MSOffice files in 2.x
> ---------------------------------------------------------
>
> Key: TIKA-3477
> URL: https://issues.apache.org/jira/browse/TIKA-3477
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
> Fix For: 2.0.0
>
>
> There's a new exception when processing embedded files (?) in MSOffice OLE2
> files in {{main}}. I found this via regression comparisons against Tika 1.27.
> {noformat}
> Caused by: java.lang.RuntimeException: java.nio.channels.ClosedChannelException
> at org.apache.poi.poifs.filesystem.POIFSStream$StreamBlockByteBufferIterator.<init>(POIFSStream.java:151)
> at org.apache.poi.poifs.filesystem.POIFSStream.getBlockIterator(POIFSStream.java:95)
> at org.apache.poi.poifs.filesystem.POIFSStream.iterator(POIFSStream.java:86)
> at org.apache.poi.poifs.filesystem.POIFSDocument.getBlockIterator(POIFSDocument.java:183)
> at org.apache.poi.poifs.filesystem.DocumentInputStream.<init>(DocumentInputStream.java:97)
> at org.apache.poi.poifs.filesystem.DirectoryNode.createDocumentInputStream(DirectoryNode.java:159)
> at org.apache.poi.hwpf.HWPFDocumentCore.getDocumentEntryBytes(HWPFDocumentCore.java:329)
> at org.apache.poi.hwpf.HWPFDocumentCore.getEncryptionInfo(HWPFDocumentCore.java:254)
> at org.apache.poi.hwpf.HWPFDocumentCore.getDocumentEntryBytes(HWPFDocumentCore.java:327)
> at org.apache.poi.hwpf.HWPFDocumentCore.<init>(HWPFDocumentCore.java:169)
> at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:193)
> at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:155)
> at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:216)
> at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:173)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:289)
> ... 31 more
> Caused by: java.nio.channels.ClosedChannelException
> at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
> {noformat}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)