Re: Tika Error work around?
I’m assuming here that you’re using DIH or the extracting request handler.

There are quite a number of reasons to run Tika outside Solr so you can handle exceptional cases as you see fit, see: https://lucidworks.com/2012/02/14/indexing-with-solrj/

Best,
Erick

> On Mar 14, 2019, at 7:39 PM, wclarke wrote:
>
> I am getting an error that stops Tika fetching/processing/and committing when
> it reaches a specific language (Malayalam). Is there a work around?
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
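To illustrate the point about handling exceptional cases yourself: the following is a minimal sketch of the client-side pattern, not code from the linked article. The `TextExtractor` and `Indexer` interfaces are hypothetical stand-ins for a Tika parse call and a SolrJ add call, so that the example stays self-contained. The idea is simply that each document is extracted and indexed inside its own try/catch, so one document that Tika cannot handle is recorded and skipped instead of stopping the whole run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a client-side Tika extraction call. */
interface TextExtractor {
    String extract(String docId) throws Exception;
}

/** Hypothetical stand-in for sending one document to Solr through SolrJ. */
interface Indexer {
    void index(String docId, String text) throws Exception;
}

final class ResilientIndexingSketch {

    /**
     * Tries every document in order. A failure in extraction or indexing is
     * recorded against that document id and the loop moves on to the next one.
     * Returns the failed ids mapped to a description of the error.
     */
    static Map<String, String> indexAll(List<String> docIds,
                                        TextExtractor extractor,
                                        Indexer indexer) {
        Map<String, String> failures = new LinkedHashMap<>();
        for (String id : docIds) {
            try {
                indexer.index(id, extractor.extract(id));
            } catch (Exception | LinkageError e) {
                // LinkageError covers NoClassDefFoundError, which parsers can
                // raise when a dependency jar is missing.
                failures.put(id, e.toString());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> indexed = new ArrayList<>();
        Map<String, String> failures = indexAll(
                Arrays.asList("doc-1", "doc-2", "doc-3"),
                id -> {
                    // Simulate one document the extractor cannot handle.
                    if (id.equals("doc-2")) {
                        throw new IllegalStateException("simulated parse failure");
                    }
                    return "text of " + id;
                },
                (id, text) -> indexed.add(id));
        System.out.println("indexed=" + indexed + " failed=" + failures.keySet());
    }
}
```

In a real client the two lambdas would be replaced by actual Tika and SolrJ calls, and the failure map could be logged or retried later.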
Tika Error work around?
I am getting an error that stops Tika fetching, processing, and committing when it reaches a specific language (Malayalam). Is there a workaround?

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Tika error
Hi Arkadi,

You may want to post this on the u...@tika.apache.org list -- looks like you are missing the juniversalchardet library as part of your Solr Cell installation.

Cheers,
Chris

On 12/6/12 12:02 AM, Arkadi Colson ark...@smartbit.be wrote:

Anybody an idea?

Dec 5, 2012 3:52:32 PM org.apache.solr.client.solrj.impl.HttpClientUtil createClient
INFO: Creating new http client, config:maxConnections=500&maxConnectionsPerHost=16
Dec 5, 2012 3:52:33 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: [intradesk] webapp=/solr path=/update/extract params={literal.smsc_ssid=1499&commit=true&literal.id=1354722015&literal.smsc_date_edited=2012-11-05T15:09:47Z&literal.smsc_courseID=0&literal.smsc_date_created=2012-11-05T15:09:47Z&wt=json&literal.smsc_module=intradesk} {} 0 313
Dec 5, 2012 3:52:33 PM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.RuntimeException: java.lang.NoClassDefFoundError: org/mozilla/universalchardet/CharsetListener
    at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:469)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:297)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:931)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NoClassDefFoundError: org/mozilla/universalchardet/CharsetListener
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
    at org.apache.catalina.loader.WebappClassLoader.findClassInternal(WebappClassLoader.java:2904)
    at org.apache.catalina.loader.WebappClassLoader.findClass(WebappClassLoader.java:1173)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1681)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1559)
    at org.apache.tika.parser.txt.UniversalEncodingDetector.detect(UniversalEncodingDetector.java:40)
    at org.apache.tika.detect.AutoDetectReader.detect(AutoDetectReader.java:51)
    at org.apache.tika.detect.AutoDetectReader.init(AutoDetectReader.java:92)
    at org.apache.tika.detect.AutoDetectReader.init(AutoDetectReader.java:98)
    at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:70)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
    ... 15 more
Caused by: java.lang.ClassNotFoundException: org.mozilla.universalchardet.CharsetListener
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1714)
    at
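One quick way to test the "missing library" diagnosis above is to ask a classloader directly whether it can see the class named in the trace. The sketch below is only an illustration under assumptions: `ClasspathProbe` and `isLoadable` are hypothetical names, and the answer applies only to the classloader the probe runs under. Run from a plain command line it says nothing about what Tomcat's webapp classloader can see, so it would have to be run with the same classpath as the failing code to be meaningful.

```java
final class ClasspathProbe {

    /**
     * Returns true if the given loader can load the named class.
     * The class is not initialised, so no static initialisers run.
     */
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError for missing dependencies.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class reported as missing in the stack trace.
        String name = args.length > 0
                ? args[0]
                : "org.mozilla.universalchardet.CharsetListener";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        System.out.println(name + " loadable: " + isLoadable(name, loader));
    }
}
```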
Tika error
Anybody an idea?

Dec 5, 2012 3:52:32 PM org.apache.solr.client.solrj.impl.HttpClientUtil createClient
INFO: Creating new http client, config:maxConnections=500&maxConnectionsPerHost=16
Dec 5, 2012 3:52:33 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: [intradesk] webapp=/solr path=/update/extract params={literal.smsc_ssid=1499&commit=true&literal.id=1354722015&literal.smsc_date_edited=2012-11-05T15:09:47Z&literal.smsc_courseID=0&literal.smsc_date_created=2012-11-05T15:09:47Z&wt=json&literal.smsc_module=intradesk} {} 0 313
Dec 5, 2012 3:52:33 PM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.RuntimeException: java.lang.NoClassDefFoundError: org/mozilla/universalchardet/CharsetListener
    at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:469)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:297)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:931)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NoClassDefFoundError: org/mozilla/universalchardet/CharsetListener
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
    at org.apache.catalina.loader.WebappClassLoader.findClassInternal(WebappClassLoader.java:2904)
    at org.apache.catalina.loader.WebappClassLoader.findClass(WebappClassLoader.java:1173)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1681)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1559)
    at org.apache.tika.parser.txt.UniversalEncodingDetector.detect(UniversalEncodingDetector.java:40)
    at org.apache.tika.detect.AutoDetectReader.detect(AutoDetectReader.java:51)
    at org.apache.tika.detect.AutoDetectReader.init(AutoDetectReader.java:92)
    at org.apache.tika.detect.AutoDetectReader.init(AutoDetectReader.java:98)
    at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:70)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
    ... 15 more
Caused by: java.lang.ClassNotFoundException: org.mozilla.universalchardet.CharsetListener
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1714)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1559)
    ... 38 more
Dec 6, 2012 7:58:02 AM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: [intradesk] webapp=/solr path=/update/extract
Fwd: Tika error
However, the Tomcat logs are reporting:

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/juniversalchardet-1.0.3.jar' to classloader
Dec 6, 2012 3:42:57 PM org.apache.solr.core.SolrResourceLoader replaceClassLoader

-------- Original Message --------
Subject: Tika error
Date: Thu, 06 Dec 2012 09:02:14 +0100
From: Arkadi Colson ark...@smartbit.be
Reply-To: ark...@smartbit.be
Organization: Smartbit bvba
To: solr-user@lucene.apache.org solr-user@lucene.apache.org

Anybody an idea?

Dec 5, 2012 3:52:32 PM org.apache.solr.client.solrj.impl.HttpClientUtil createClient
INFO: Creating new http client, config:maxConnections=500&maxConnectionsPerHost=16
Dec 5, 2012 3:52:33 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: [intradesk] webapp=/solr path=/update/extract params={literal.smsc_ssid=1499&commit=true&literal.id=1354722015&literal.smsc_date_edited=2012-11-05T15:09:47Z&literal.smsc_courseID=0&literal.smsc_date_created=2012-11-05T15:09:47Z&wt=json&literal.smsc_module=intradesk} {} 0 313
Dec 5, 2012 3:52:33 PM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.RuntimeException: java.lang.NoClassDefFoundError: org/mozilla/universalchardet/CharsetListener
    at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:469)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:297)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:931)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.NoClassDefFoundError: org/mozilla/universalchardet/CharsetListener
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
    at org.apache.catalina.loader.WebappClassLoader.findClassInternal(WebappClassLoader.java:2904)
    at org.apache.catalina.loader.WebappClassLoader.findClass(WebappClassLoader.java:1173)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1681)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1559)
    at org.apache.tika.parser.txt.UniversalEncodingDetector.detect(UniversalEncodingDetector.java:40)
    at org.apache.tika.detect.AutoDetectReader.detect(AutoDetectReader.java:51)
    at org.apache.tika.detect.AutoDetectReader.init(AutoDetectReader.java:92)
    at org.apache.tika.detect.AutoDetectReader.init(AutoDetectReader.java:98)
    at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:70)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
    ... 15 more
Caused
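Since the log above shows the juniversalchardet jar being added to a classloader while the trace still reports the class as missing, one thing worth ruling out is whether that jar actually contains the class. The following is a minimal sketch using only the JDK's `java.util.jar` API; `JarClassCheck`, `entryNameFor` and `jarContains` are hypothetical names, and the default jar path is simply the one from the log line. A positive result does not prove the class is visible to the code that failed: the trace goes through Tomcat's `WebappClassLoader`, while the jar is added by `SolrResourceLoader`, and whether those two loaders can see each other's classes is not settled by anything in this thread.

```java
import java.io.IOException;
import java.util.jar.JarFile;

final class JarClassCheck {

    /** Converts a fully qualified class name to its entry path inside a jar. */
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath has an entry for the named class. */
    static boolean jarContains(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryNameFor(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults are taken from the log line and stack trace in this thread.
        String jarPath = args.length > 0
                ? args[0]
                : "/opt/solr/contrib/extraction/lib/juniversalchardet-1.0.3.jar";
        String className = args.length > 1
                ? args[1]
                : "org.mozilla.universalchardet.CharsetListener";
        System.out.println(className + " in " + jarPath + ": "
                + jarContains(jarPath, className));
    }
}
```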