I've opened CONNECTORS-1516 to track the Class Not Found issue, and also created an Apache POI bugzilla ticket, which is referenced.
Karl

On Tue, Jul 24, 2018 at 6:15 AM Karl Wright <[email protected]> wrote:

> The "class not found" error looks probably like a classloader issue with
> Tika -- the class is present in poi-ooxml-3.17.jar, although to be fair it
> might possibly be caused by an out-of-memory condition.
>
> You should be able to find the exception in the Simple History and figure
> out what document it came from from that. If not, then look at the log
> prior to the exception, and look at what Worker Thread 1 was doing.
>
> Karl
>
>
> On Tue, Jul 24, 2018 at 5:58 AM msaunier <[email protected]> wrote:
>
>> Re Karl,
>>
>> I have an Out of Memory Error today. I think I have an error with a
>> document. I have this WARNING before crash:
>>
>> ------------------------------------------------------------------------
>>
>> WARN 2018-07-24T11:46:22,098 (Worker thread '1') - Tika: Tika exception
>> extracting: TIKA-198: Illegal IOException from
>> org.apache.tika.parser.microsoft.OfficeParser@62980adb
>>
>> org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException
>> from org.apache.tika.parser.microsoft.OfficeParser@62980adb
>>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:286) ~[tika-core-1.17.jar:1.17]
>>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[tika-core-1.17.jar:1.17]
>>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143) ~[tika-core-1.17.jar:1.17]
>>     at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74) ~[mcf-tika-connector.jar:?]
>>     at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235) [mcf-tika-connector.jar:?]
>>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226) [mcf-agents.jar:?]
>>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077) [mcf-agents.jar:?]
>>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708) [mcf-agents.jar:?]
>>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756) [mcf-agents.jar:?]
>>     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583) [mcf-pull-agent.jar:?]
>>     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548) [mcf-pull-agent.jar:?]
>>     at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939) [mcf-jcifs-connector.jar:?]
>>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) [mcf-pull-agent.jar:?]
>> Caused by: java.io.IOException: java.lang.ClassNotFoundException: org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder
>>     at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:150) ~[?:?]
>>     at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:102) ~[?:?]
>>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:203) ~[?:?]
>>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132) ~[?:?]
>>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[?:?]
>>     ... 12 more
>> Caused by: java.lang.ClassNotFoundException: org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[?:1.8.0_171]
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_171]
>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) ~[?:1.8.0_171]
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[?:1.8.0_171]
>>     at org.apache.poi.poifs.crypt.EncryptionInfo.getBuilder(EncryptionInfo.java:222) ~[?:?]
>>     at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:148) ~[?:?]
>>     at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:102) ~[?:?]
>>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:203) ~[?:?]
>>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132) ~[?:?]
>>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[?:?]
>>     ... 12 more
>>
>> I think it’s a file, because RAM allocation have a weird behavior. In one
>> second, ManifoldCF (or Tika) allocate +6Go RAM.
>>
>> How Can I find the file?
>>
>> Thanks,
>>
>> Maxence,
>>
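
For the suggestion to look at the log prior to the exception and see what Worker Thread 1 was doing, a rough sketch along these lines might help. It is not part of ManifoldCF; the function name `entries_before` and the regular expression are made up for illustration, and it assumes every log entry follows the layout of the WARN line quoted in this thread (level, timestamp, thread name in parentheses, " - ", message). Whether the earlier entries actually name the document depends on the logging configuration.

```python
import re
from collections import deque

# Assumed entry layout, taken from the WARN line quoted in this thread:
#   WARN 2018-07-24T11:46:22,098 (Worker thread '1') - Tika: ...
ENTRY = re.compile(
    r"^\s*(?P<level>[A-Z]+)\s+(?P<ts>\S+)\s+\((?P<thread>[^)]*)\)\s+-\s+(?P<msg>.*)$"
)


def entries_before(lines, thread, marker, context=20):
    """Return up to `context` entries logged by `thread` before the first
    entry from that thread whose message contains `marker`."""
    recent = deque(maxlen=context)
    for line in lines:
        m = ENTRY.match(line)
        if m is None or m.group("thread") != thread:
            # Stack-trace continuation line, or an entry from another thread.
            continue
        if marker in m.group("msg"):
            return list(recent)
        recent.append(line.rstrip("\n"))
    return []  # the marker never appeared for this thread
```

Calling `entries_before(open(path), "Worker thread '1'", "TIKA-198")` on the ManifoldCF log (where `path` is wherever that log lives in your installation) would list the last entries that thread wrote before the warning, which is where a document path would show up if it was logged at all.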
