If you need DEBUG elsewhere, can you selectively set the log level for the ExternalParser to ERROR? Or is there a fix you'd recommend on the Tika side?
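
To illustrate the first option, here is a minimal sketch, assuming Log4j 2 is your logging backend (the layout of the traces suggests it, but that is a guess on my part). The logger name is taken from the class shown in both traces, since the exiftool and tesseract checks both log from ExternalParser.java:180; the appender name and pattern are placeholders, so adapt them to your existing configuration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a log4j2.xml; appender name and pattern are placeholders. -->
<Configuration status="WARN">
  <Appenders>
    <Console name="Console" target="SYSTEM_OUT">
      <PatternLayout pattern="%d %-5p [%t] %c{2} (%F:%L) - %m%n"/>
    </Console>
  </Appenders>
  <Loggers>
    <!-- Only ERROR and above from ExternalParser; it still uses the root appender. -->
    <Logger name="org.apache.tika.parser.external.ExternalParser" level="error"/>
    <!-- Everything else stays at DEBUG. -->
    <Root level="debug">
      <AppenderRef ref="Console"/>
    </Root>
  </Loggers>
</Configuration>
```

With a different backend the idea is the same: raise the level for that one logger name and leave the root logger at DEBUG.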

On Wed, Oct 5, 2022 at 7:22 AM Markus Jelsma <[email protected]> wrote:
>
> Hello,
>
> We use Tika embedded in our Java programs and recently upgraded from one of
> the last 1.x to 2.x, currently 2.4.1.
>
> Since then, with debug logging on, Tika spews out a few pretty big and
> partially repeating exceptions. This is not a real runtime problem, but just
> a distracting nuisance as my attention triggers when seeing stack traces.
>
> Is there something to do about it?
>
> This is the exif related trace:
> 2022-10-05 13:16:42,136 DEBUG [TEST-SequenceBlockMarkerTest.testDierenforum-seed#[5F443E2359FE59DA]] external.ExternalParser (ExternalParser.java:172) - exit value for ffmpeg: 0
> 2022-10-05 13:16:42,140 DEBUG [TEST-SequenceBlockMarkerTest.testDierenforum-seed#[5F443E2359FE59DA]] external.ExternalParser (ExternalParser.java:180) - exception trying to run exiftool
> java.io.IOException: Cannot run program "exiftool": error=2, No such file or directory
>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1128) ~[?:?]
>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1071) ~[?:?]
>     at java.lang.Runtime.exec(Runtime.java:592) ~[?:?]
>     at java.lang.Runtime.exec(Runtime.java:451) ~[?:?]
>     at org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:161) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.external.ExternalParsersConfigReader.readCheckTagAndCheck(ExternalParsersConfigReader.java:203) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.external.ExternalParsersConfigReader.readParser(ExternalParsersConfigReader.java:110) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.external.ExternalParsersConfigReader.read(ExternalParsersConfigReader.java:80) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.external.ExternalParsersConfigReader.read(ExternalParsersConfigReader.java:67) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.external.ExternalParsersConfigReader.read(ExternalParsersConfigReader.java:60) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.external.ExternalParsersFactory.create(ExternalParsersFactory.java:67) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.external.ExternalParsersFactory.create(ExternalParsersFactory.java:60) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.external.ExternalParsersFactory.create(ExternalParsersFactory.java:49) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.external.ExternalParsersFactory.create(ExternalParsersFactory.java:44) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.external.CompositeExternalParser.<init>(CompositeExternalParser.java:42) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.external.CompositeExternalParser.<init>(CompositeExternalParser.java:37) ~[tika-core-2.4.1.jar:2.4.1]
>     at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
>     at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:?]
>     at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:?]
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:490) ~[?:?]
>     at java.lang.Class.newInstance(Class.java:584) ~[?:?]
>     at org.apache.tika.utils.ServiceLoaderUtils.newInstance(ServiceLoaderUtils.java:80) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.ServiceLoader.loadStaticServiceProviders(ServiceLoader.java:345) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.DefaultParser.getDefaultParsers(DefaultParser.java:105) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.DefaultParser.<init>(DefaultParser.java:52) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.DefaultParser.<init>(DefaultParser.java:66) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig.getDefaultParser(TikaConfig.java:291) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig.access$900(TikaConfig.java:87) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig$ParserXmlLoader.createDefault(TikaConfig.java:878) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig$ParserXmlLoader.createDefault(TikaConfig.java:824) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig$XmlLoader.loadOverall(TikaConfig.java:648) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:170) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:150) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:142) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:138) ~[tika-core-2.4.1.jar:2.4.1]
>     at io.openindex.sax.SAXTestCase.getHandler(SAXTestCase.java:119) ~[test-classes/:?]
>     at io.openindex.sax.SAXTestCase.getHandler(SAXTestCase.java:112) ~[test-classes/:?]
>     at io.openindex.sax.SAXTestCase.getHandler(SAXTestCase.java:106) ~[test-classes/:?]
>     at io.openindex.sax.readable.marker.block.SequenceBlockMarkerTest.testDierenforum(SequenceBlockMarkerTest.java:91) ~[test-classes/:?]
>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
>     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
>     at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$2.evaluate(ThreadLeakControl.java:426) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:716) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner.access$200(RandomizedRunner.java:138) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:637) ~[randomizedtesting-runner-2.8.0.jar:?]
> Caused by: java.io.IOException: error=2, No such file or directory
>     at java.lang.ProcessImpl.forkAndExec(Native Method) ~[?:?]
>     at java.lang.ProcessImpl.<init>(ProcessImpl.java:340) ~[?:?]
>     at java.lang.ProcessImpl.start(ProcessImpl.java:271) ~[?:?]
>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1107) ~[?:?]
>     ... 59 more
>
> And here the Tesseract related trace:
> 2022-10-05 13:16:42,331 DEBUG [TEST-SequenceBlockMarkerTest.testDierenforum-seed#[5F443E2359FE59DA]] external.ExternalParser (ExternalParser.java:180) - exception trying to run tesseract
> java.io.IOException: Cannot run program "tesseract": error=2, No such file or directory
>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1128) ~[?:?]
>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1071) ~[?:?]
>     at java.lang.Runtime.exec(Runtime.java:592) ~[?:?]
>     at java.lang.Runtime.exec(Runtime.java:451) ~[?:?]
>     at org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:161) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:187) ~[tika-parsers-standard-package-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.ocr.TesseractOCRParser.initialize(TesseractOCRParser.java:529) ~[tika-parsers-standard-package-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.ServiceLoader.loadStaticServiceProviders(ServiceLoader.java:347) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.DefaultParser.getDefaultParsers(DefaultParser.java:105) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.DefaultParser.<init>(DefaultParser.java:52) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.parser.DefaultParser.<init>(DefaultParser.java:66) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig.getDefaultParser(TikaConfig.java:291) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig.access$900(TikaConfig.java:87) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig$ParserXmlLoader.createDefault(TikaConfig.java:878) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig$ParserXmlLoader.createDefault(TikaConfig.java:824) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig$XmlLoader.loadOverall(TikaConfig.java:648) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:170) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:150) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:142) ~[tika-core-2.4.1.jar:2.4.1]
>     at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:138) ~[tika-core-2.4.1.jar:2.4.1]
>     at io.openindex.sax.SAXTestCase.getHandler(SAXTestCase.java:119) ~[test-classes/:?]
>     at io.openindex.sax.SAXTestCase.getHandler(SAXTestCase.java:112) ~[test-classes/:?]
>     at io.openindex.sax.SAXTestCase.getHandler(SAXTestCase.java:106) ~[test-classes/:?]
>     at io.openindex.sax.readable.marker.block.SequenceBlockMarkerTest.testDierenforum(SequenceBlockMarkerTest.java:91) ~[test-classes/:?]
>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
>     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
>     at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$2.evaluate(ThreadLeakControl.java:426) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:716) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner.access$200(RandomizedRunner.java:138) ~[randomizedtesting-runner-2.8.0.jar:?]
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:637) ~[randomizedtesting-runner-2.8.0.jar:?]
> Caused by: java.io.IOException: error=2, No such file or directory
>     at java.lang.ProcessImpl.forkAndExec(Native Method) ~[?:?]
>     at java.lang.ProcessImpl.<init>(ProcessImpl.java:340) ~[?:?]
>     at java.lang.ProcessImpl.start(ProcessImpl.java:271) ~[?:?]
>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1107) ~[?:?]
>     ... 44 more
>
> Thanks,
> Markus
