What version of MCF is this? That's important to know since Tika has had problems with this kind of thing in the past and this looks like something similar.
The problem you are reporting is due to either a missing jar or a bug in an internal Tika classloader. But I need to know whether this is a current bug or not, since we just went to a new Tika version.

Karl

On Tue, Jan 9, 2018 at 4:32 AM, msaunier <[email protected]> wrote:

> Hello Karl,
>
> I hope you are well today.
>
> I have 2 problems with ManifoldCF.
>
> -----------
>
> In **Outputs connectors** with the Solr connector: I have added a « Maximum
> document length » and I have « Excluded 5 mime types », but it does not work.
> I have attached a screenshot.
>
> ----------
>
> Second, I have a **Tika exception** in ManifoldCF. 3 documents are blocked:
>
> FATAL 2018-01-09T10:19:54,992 (Worker thread '5') - Error tossed: org.apache.poi.hwmf.record.HwmfFont.getCharSet()Lorg/apache/poi/hwmf/record/HwmfFont$WmfCharset;
> java.lang.NoSuchMethodError: org.apache.poi.hwmf.record.HwmfFont.getCharSet()Lorg/apache/poi/hwmf/record/HwmfFont$WmfCharset;
>     at org.apache.tika.parser.microsoft.WMFParser.parse(WMFParser.java:74) ~[?:?]
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[?:?]
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[?:?]
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135) ~[?:?]
>     at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72) ~[?:?]
>     at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102) ~[?:?]
>     at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedFile(AbstractOOXMLExtractor.java:375) ~[?:?]
>     at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedPart(AbstractOOXMLExtractor.java:260) ~[?:?]
>     at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.handleEmbeddedParts(AbstractOOXMLExtractor.java:205) ~[?:?]
>     at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:142) ~[?:?]
>     at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:142) ~[?:?]
>     at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:106) ~[?:?]
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[?:?]
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ~[?:?]
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135) ~[?:?]
>     at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74) ~[?:?]
>     at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235) ~[?:?]
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226) ~[mcf-agents.jar:?]
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077) ~[mcf-agents.jar:?]
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708) ~[mcf-agents.jar:?]
>     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756) ~[mcf-agents.jar:?]
>     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583) ~[mcf-pull-agent.jar:?]
>     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548) ~[mcf-pull-agent.jar:?]
>     at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939) ~[?:?]
>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) [mcf-pull-agent.jar:?]
>
> Do I need to create an incident ticket?
>
> ----------
>
> Thanks for your help.
>
> Best regards,
>
> msaunier
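
To tell a missing or mismatched jar apart from a classloader problem, it can help to check which jar actually supplies HwmfFont and what getCharSet() returns there. The following is a minimal Java sketch, not part of ManifoldCF or Tika; the class name JarConflictCheck and its helper methods are hypothetical. It only inspects the classpath it is run with, so it would need to be run against the same jars the MCF deployment loads, and it will not reproduce a fault inside a custom classloader.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical diagnostic helper (not an MCF or Tika class).
public class JarConflictCheck {

    /** Returns the jar or directory a class was loaded from, if the JVM exposes it. */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)"; // e.g. JDK classes
        }
        return src.getLocation().toString();
    }

    /** Reports where className was loaded from and the return type of each no-arg methodName(). */
    public static String describe(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);
            StringBuilder sb = new StringBuilder(className + " loaded from " + locationOf(cls));
            boolean found = false;
            for (Method m : cls.getMethods()) {
                if (m.getName().equals(methodName) && m.getParameterCount() == 0) {
                    sb.append("\n  ").append(methodName).append("() returns ")
                      .append(m.getReturnType().getName());
                    found = true;
                }
            }
            if (!found) {
                sb.append("\n  no public no-arg ").append(methodName).append("() found");
            }
            return sb.toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // Class missing entirely, or one of its dependencies could not be linked.
            return className + " could not be loaded: " + e;
        }
    }

    public static void main(String[] args) {
        // Class and method names taken from the NoSuchMethodError in the trace.
        System.out.println(describe("org.apache.poi.hwmf.record.HwmfFont", "getCharSet"));
    }
}
```

If the class loads but the reported return type is not org.apache.poi.hwmf.record.HwmfFont$WmfCharset (the type named in the error), the POI jar on that classpath likely does not match the one Tika was built against. If the class cannot be loaded at all, that points to a missing jar instead.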
