[ https://issues.apache.org/jira/browse/TIKA-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Quentin Laville resolved TIKA-2891.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.21

> ForkClient "fillBootstrapJar()" lack few "MANIFEST.MF" properties
> -----------------------------------------------------------------
>
>                 Key: TIKA-2891
>                 URL: https://issues.apache.org/jira/browse/TIKA-2891
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.18
>            Reporter: Quentin Laville
>            Priority: Blocker
>              Labels: bug, forkclient, forkparser, parser
>             Fix For: 1.21
>
>
> Due to "OOM: heap space" errors caused by big ".doc" files, we decided to
> move to a "ForkParser" so that we can catch these errors, log them, and keep
> processing the next documents.
> Unfortunately, whenever a document contains an image, we get the following
> error:
> {code:java}
> Unexpected error in forked server process
> org.apache.tika.exception.TikaException: Unexpected error in forked server process
> ... (several lines showing that the call to "ForkParser.parse" failed)
> Cause: java.util.ServiceConfigurationError: javax.imageio.spi.ImageOutputStreamSpi: Provider com.github.jaiimageio.impl.stream.ChannelImageOutputStreamSpi could not be instantiated
>     at java.util.ServiceLoader.fail(ServiceLoader.java:232)
>     at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
>     at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
>     at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
>     at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
>     at javax.imageio.spi.IIORegistry.registerApplicationClasspathSpis(IIORegistry.java:210)
>     at javax.imageio.spi.IIORegistry.<init>(IIORegistry.java:138)
>     at javax.imageio.spi.IIORegistry.getDefaultInstance(IIORegistry.java:159)
>     at javax.imageio.ImageIO.<clinit>(ImageIO.java:66)
>     at org.apache.pdfbox.tools.imageio.ImageIOUtil.writeImage(ImageIOUtil.java:174)
>     ...
> Cause: java.lang.ExceptionInInitializerError:
>     at com.github.jaiimageio.impl.stream.ChannelImageOutputStreamSpi.<init>(ChannelImageOutputStreamSpi.java:66)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at java.lang.Class.newInstance(Class.java:442)
>     at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
>     at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
>     at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
>     at javax.imageio.spi.IIORegistry.registerApplicationClasspathSpis(IIORegistry.java:210)
>     ...
> Cause: java.lang.NullPointerException:
>     at com.github.jaiimageio.impl.common.PackageUtil.<clinit>(PackageUtil.java:91)
>     at com.github.jaiimageio.impl.stream.ChannelImageOutputStreamSpi.<init>(ChannelImageOutputStreamSpi.java:66)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>     at java.lang.Class.newInstance(Class.java:442)
>     at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
>     at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
>     at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
>     ...
> {code}
> This kind of error did not appear before, when we were only using an
> "AutoDetectParser".
> My search for a solution led me to "ForkClient", where you can see that only
> the "Main-Class" attribute is defined in "META-INF/MANIFEST.MF", whereas
> "com.github.jaiimageio.impl.common.PackageUtil.<clinit>(PackageUtil.java:91)"
> checks that "Implementation-Vendor" and "Implementation-Version" are not
> null.
> It's quite easy to reproduce:
> # download a simple example file from [this link|https://file-examples.com/wp-content/uploads/2017/10/file-sample_100kB.odt]
> # use this piece of code:
> {code:java}
> import java.io.FileInputStream
> import org.apache.tika.fork.ForkParser
> import org.apache.tika.io.TikaInputStream
> import org.apache.tika.metadata.Metadata
> import org.apache.tika.parser.{AutoDetectParser, ParseContext}
> import org.apache.tika.sax.BodyContentHandler
> // Scala: parse the sample in a forked JVM instead of the calling process
> def test = {
>   val forkParser = new ForkParser(ExtractText.getClass.getClassLoader, new AutoDetectParser())
>   val output = new BodyContentHandler()
>   val stream = TikaInputStream.get(new FileInputStream("/path/to/file-sample_100kB.odt"))
>   val ctx = new ParseContext()
>   forkParser.parse(stream, output, new Metadata(), ctx)
> }
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
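
The manifest gap described in the report can be illustrated with the standard `java.util.jar` API. The sketch below is not the change made for TIKA-2891 and does not use Tika's `ForkClient` code; it only shows, under stated assumptions, how a jar manifest can carry "Implementation-Vendor" and "Implementation-Version" alongside "Main-Class". The class name `BootstrapManifestSketch`, its helper methods, and the values `example.Main`, `Example Vendor`, and `0.0.0` are hypothetical placeholders.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical sketch: not Tika code, names and values are placeholders.
public class BootstrapManifestSketch {

    // Builds a manifest with the entry point plus the Implementation-* attributes.
    public static Manifest buildManifest(String mainClass, String vendor, String version) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version has to be present, otherwise java.util.jar.Manifest
        // does not write the other main attributes.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        attrs.put(Attributes.Name.IMPLEMENTATION_VENDOR, vendor);
        attrs.put(Attributes.Name.IMPLEMENTATION_VERSION, version);
        return manifest;
    }

    // Writes an in-memory jar whose only entry is META-INF/MANIFEST.MF.
    public static byte[] writeJar(Manifest manifest) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes, manifest)) {
            // Class and resource entries of a real bootstrap jar would be added here.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the manifest back from the jar bytes.
    public static Manifest readManifest(byte[] jarBytes) {
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            return jar.getManifest();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Manifest written = buildManifest("example.Main", "Example Vendor", "0.0.0");
        Attributes attrs = readManifest(writeJar(written)).getMainAttributes();
        System.out.println("Main-Class: " + attrs.getValue(Attributes.Name.MAIN_CLASS));
        System.out.println("Implementation-Vendor: "
                + attrs.getValue(Attributes.Name.IMPLEMENTATION_VENDOR));
        System.out.println("Implementation-Version: "
                + attrs.getValue(Attributes.Name.IMPLEMENTATION_VERSION));
    }
}
```

A manifest that defines only "Main-Class" would return `null` for the two `Implementation-*` lookups in `main`, which is consistent with the null values the report points to in `PackageUtil`.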