[ https://issues.apache.org/jira/browse/TIKA-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sameer Apte updated TIKA-3128:
------------------------------
    Summary: MOV file produces RuntimeException with 1.24.1, used to work with earlier version 1.19.1  (was: MOV file produces RuntimeException with 1.24.1, used to work with earlier version)

> MOV file produces RuntimeException with 1.24.1, used to work with earlier version 1.19.1
> ----------------------------------------------------------------------------------------
>
> Key: TIKA-3128
> URL: https://issues.apache.org/jira/browse/TIKA-3128
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.24.1
> Reporter: Sameer Apte
> Priority: Major
> Attachments: HDSIT_157516.mov
>
>
> Attached _mov_ file produces _RuntimeException_ when parsed with *tika v1.24.1*
> The same _mov_ file can be parsed without any issues with *tika v1.19.1*
> *Tika 1.19.1 standalone app _SUCCESSFUL_ run*
> {code:java}
> [sapte@sapte-dt tikatest]$ java -jar tika-app-1.19.1.jar -m HDSIT_157516.mov
> Jun 18, 2020 11:25:00 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
> WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed.
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> Jun 18, 2020 11:25:00 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
> WARNING: org.xerial's sqlite-jdbc is not loaded.
> Please provide the jar on your classpath to parse sqlite files.
> See tika-parsers/pom.xml for the correct version.
> Content-Length: 51066400
> Content-Type: application/mp4
> Creation-Date: 2015-05-18T16:23:25Z
> Last-Modified: 2015-05-18T16:31:09Z
> Last-Save-Date: 2015-05-18T16:31:09Z
> X-Parsed-By: org.apache.tika.parser.DefaultParser
> X-Parsed-By: org.apache.tika.parser.mp4.MP4Parser
> date: 2015-05-18T16:31:09Z
> dcterms:created: 2015-05-18T16:23:25Z
> dcterms:modified: 2015-05-18T16:31:09Z
> meta:creation-date: 2015-05-18T16:23:25Z
> meta:save-date: 2015-05-18T16:31:09Z
> modified: 2015-05-18T16:31:09Z
> resourceName: HDSIT_157516.mov
> tiff:ImageLength: 1080
> tiff:ImageWidth: 1920
> xmpDM:audioSampleRate: 30000
> xmpDM:duration: 125.99
> {code}
> *Tika 1.24.1 standalone app _RUNTIMEEXCEPTION_ run*
> {code:java}
> [sapte@sapte-dt tikatest]$ java -jar tika-app-1.24.1.jar -m HDSIT_157516.mov
> Jun 18, 2020 11:24:50 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
> WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed.
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
> for optional dependencies.
> Jun 18, 2020 11:24:50 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
> WARNING: org.xerial's sqlite-jdbc is not loaded.
> Please provide the jar on your classpath to parse sqlite files.
> See tika-parsers/pom.xml for the correct version.
> Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.mp4.MP4Parser@23348b5d
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:293)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>     at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:209)
>     at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:496)
>     at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:149)
> Caused by: java.lang.RuntimeException: box size of zero means 'till end of file. That is not yet supported
>     at org.mp4parser.AbstractBoxParser.parseBox(AbstractBoxParser.java:90)
>     at org.mp4parser.BasicContainer.initContainer(BasicContainer.java:107)
>     at org.mp4parser.boxes.sampleentry.VisualSampleEntry.parse(VisualSampleEntry.java:195)
>     at org.mp4parser.AbstractBoxParser.parseBox(AbstractBoxParser.java:115)
>     at org.mp4parser.BasicContainer.initContainer(BasicContainer.java:107)
>     at org.mp4parser.boxes.iso14496.part12.SampleDescriptionBox.parse(SampleDescriptionBox.java:91)
>     at org.mp4parser.AbstractBoxParser.parseBox(AbstractBoxParser.java:115)
>     at org.mp4parser.BasicContainer.initContainer(BasicContainer.java:107)
>     at org.mp4parser.support.AbstractContainerBox.parse(AbstractContainerBox.java:76)
>     at org.mp4parser.AbstractBoxParser.parseBox(AbstractBoxParser.java:115)
>     at org.mp4parser.BasicContainer.initContainer(BasicContainer.java:107)
>     at org.mp4parser.support.AbstractContainerBox.parse(AbstractContainerBox.java:76)
>     at org.mp4parser.AbstractBoxParser.parseBox(AbstractBoxParser.java:115)
>     at org.mp4parser.BasicContainer.initContainer(BasicContainer.java:107)
>     at org.mp4parser.support.AbstractContainerBox.parse(AbstractContainerBox.java:76)
>     at org.mp4parser.AbstractBoxParser.parseBox(AbstractBoxParser.java:115)
>     at org.mp4parser.BasicContainer.initContainer(BasicContainer.java:107)
>     at org.mp4parser.support.AbstractContainerBox.parse(AbstractContainerBox.java:76)
>     at org.mp4parser.AbstractBoxParser.parseBox(AbstractBoxParser.java:115)
>     at org.mp4parser.BasicContainer.initContainer(BasicContainer.java:107)
>     at org.mp4parser.support.AbstractContainerBox.parse(AbstractContainerBox.java:76)
>     at org.mp4parser.AbstractBoxParser.parseBox(AbstractBoxParser.java:115)
>     at org.mp4parser.BasicContainer.initContainer(BasicContainer.java:107)
>     at org.mp4parser.IsoFile.<init>(IsoFile.java:58)
>     at org.mp4parser.IsoFile.<init>(IsoFile.java:45)
>     at org.apache.tika.parser.mp4.MP4Parser.parse(MP4Parser.java:130)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     ... 5 more
> {code}
> Commit _8e2eb05292bc35503a3d82a908c426854e23ac83_ in v1.24.1 which switched the mp4 parser from _googlecode_ to _tallison_ appears to be directly responsible for the change in behavior.
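
For context on the "box size of zero" message in the trace: in the ISO base media file format, a box header's 32-bit size field may be 0, meaning the box extends to the end of the file, or 1, meaning a 64-bit size follows the type. The sketch below is only an illustration of that rule. It is not how Tika or mp4parser reads boxes, and the names `BoxHeaderSketch`, `Box` and `listBoxes` are hypothetical. It walks top-level boxes only, whereas the trace shows the zero size being hit inside a `VisualSampleEntry`, so it neither reproduces nor fixes the reported failure.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of Tika or mp4parser.
public class BoxHeaderSketch {

    /** One top-level box: its four-character type and its total size in bytes. */
    public static final class Box {
        public final String type;
        public final long size;

        Box(String type, long size) {
            this.type = type;
            this.size = size;
        }

        @Override
        public String toString() {
            return type + ":" + size;
        }
    }

    /** Lists the top-level boxes of a stream that holds exactly totalLength bytes. */
    public static List<Box> listBoxes(InputStream in, long totalLength) throws IOException {
        DataInputStream data = new DataInputStream(in);
        List<Box> boxes = new ArrayList<>();
        long offset = 0;
        while (totalLength - offset >= 8) {
            long size = data.readInt() & 0xFFFFFFFFL; // unsigned 32-bit size
            byte[] fourcc = new byte[4];
            data.readFully(fourcc);
            String type = new String(fourcc, StandardCharsets.ISO_8859_1);
            long headerLength = 8;
            if (size == 1) {
                // A 64-bit "largesize" follows the type field.
                size = data.readLong();
                headerLength = 16;
            } else if (size == 0) {
                // Size 0: the box runs to the end of the data.
                size = totalLength - offset;
            }
            if (size < headerLength || size > totalLength - offset) {
                throw new IOException("Invalid size " + size + " for box '" + type
                        + "' at offset " + offset);
            }
            skipFully(data, size - headerLength);
            boxes.add(new Box(type, size));
            offset += size;
        }
        return boxes;
    }

    /** Skips exactly count bytes, or throws EOFException if the stream ends first. */
    private static void skipFully(DataInputStream data, long count) throws IOException {
        long remaining = count;
        while (remaining > 0) {
            long skipped = data.skip(remaining);
            if (skipped <= 0) {
                // skip() may return 0 without being at EOF, so probe with a single read.
                if (data.read() < 0) {
                    throw new EOFException("Stream ended inside a box payload");
                }
                skipped = 1;
            }
            remaining -= skipped;
        }
    }

    public static void main(String[] args) throws IOException {
        // A 16-byte 'ftyp' box followed by an 'mdat' box whose size field is 0.
        byte[] sample = {
            0, 0, 0, 16, 'f', 't', 'y', 'p', 'q', 't', ' ', ' ', 0, 0, 0, 0,
            0, 0, 0, 0, 'm', 'd', 'a', 't', 1, 2, 3, 4
        };
        System.out.println(listBoxes(new ByteArrayInputStream(sample), sample.length));
    }
}
```

Resolving a size of 0 requires knowing the total length of the enclosing data up front, which is why the sketch takes `totalLength` as a parameter instead of reading until end of stream.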