[ https://issues.apache.org/jira/browse/TIKA-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15493943#comment-15493943 ]
Hudson commented on TIKA-2055:
------------------------------
SUCCESS: Integrated in Jenkins build Tika-trunk #1100 (See
[https://builds.apache.org/job/Tika-trunk/1100/])
TIKA-2055 catch exception when totalTime out of unsigned int range in
(tallison: rev 27b9cf566da9772961b2fac3c2aa6cc1648ab2a5)
* (add) tika-parsers/src/test/resources/test-documents/testWORD_totalTimeOutOfRange.docx
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/MetadataExtractor.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java
> Exception on parsing .docx file
> -------------------------------
>
> Key: TIKA-2055
> URL: https://issues.apache.org/jira/browse/TIKA-2055
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.13
> Environment: Linux Centos 7
> Reporter: Sebastian Iturra
> Priority: Critical
> Fix For: 2.0, 1.14
>
>
> Command: java -jar tika-app-1.13.jar input.docx
> Exception in thread "main" org.apache.tika.exception.TikaException: Error creating OOXML extractor
> at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:120)
> at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:87)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
> at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:191)
> at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:480)
> at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:145)
> Caused by: org.apache.xmlbeans.impl.values.XmlValueOutOfRangeException: Invalid int value: 4294967295
> at org.apache.xmlbeans.impl.values.JavaIntHolder.set_text(JavaIntHolder.java:43)
> at org.apache.xmlbeans.impl.values.XmlObjectBase.update_from_wscanon_text(XmlObjectBase.java:1180)
> at org.apache.xmlbeans.impl.values.XmlObjectBase.check_dated(XmlObjectBase.java:1319)
> at org.apache.xmlbeans.impl.values.JavaIntHolder.getIntValue(JavaIntHolder.java:53)
> at org.openxmlformats.schemas.officeDocument.x2006.extendedProperties.impl.CTPropertiesImpl.getTotalTime(Unknown Source)
> at org.apache.tika.parser.microsoft.ooxml.MetadataExtractor.extractMetadata(MetadataExtractor.java:124)
> at org.apache.tika.parser.microsoft.ooxml.MetadataExtractor.extract(MetadataExtractor.java:62)
> at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:109)
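
The value in the trace, 4294967295, is 2^32 - 1, which does not fit in a signed 32-bit Java int (maximum 2147483647). The sketch below illustrates the general idea of tolerating such a value rather than failing the whole parse. It is not the actual change in MetadataExtractor.java: the class TotalTimeGuard and the method parseTotalTime are hypothetical names, and plain Integer.parseInt stands in for the XMLBeans getter so that the example runs with only the JDK.

```java
import java.util.OptionalInt;

// Hypothetical illustration only; not the code committed for TIKA-2055.
public class TotalTimeGuard {

    // Returns the parsed value, or empty when the text is missing or
    // does not fit in a signed 32-bit int.
    static OptionalInt parseTotalTime(String raw) {
        if (raw == null) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            // Out-of-range or malformed value: skip this one property
            // instead of aborting extraction of the whole document.
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseTotalTime("4294967295").isPresent()); // false
        System.out.println(parseTotalTime("42").getAsInt());          // 42
    }
}
```

Returning an empty result for the one bad property keeps the remaining metadata and the document text available to the caller.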