Re: Tika, Solr running under Tomcat 6 on Debian
Hi All,

I have the same issue. I have installed a Solr instance on Tomcat 6. When I try to index a PDF, I run into the exception below:

    11 Apr, 2011 12:11:55 PM org.apache.solr.common.SolrException log
    SEVERE: java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
        at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
        at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)
    Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        ... 22 more

I could not find any Tika jar file. Could you please help me fix this issue?

Thanks,
Mike

--
View this message in context: http://lucene.472066.n3.nabble.com/Tika-Solr-running-under-Tomcat-6-on-Debian-tp993295p2805615.html
Sent from the Solr - User mailing list archive at Nabble.com.
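In a trace like the one above, the last "Caused by:" entry is usually the most useful line, because it names the class the class loader actually failed to find. The following is a minimal sketch of pulling that entry out of a log. It assumes the trace is available with its original line breaks (one frame per line); the function name root_cause is hypothetical and not part of Solr or Tomcat.

```python
import re

# Matches lines such as:
#   Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException
CAUSED_BY = re.compile(r"^Caused by: ([\w.$]+)(?:: (.*))?$", re.MULTILINE)


def root_cause(trace_text):
    """Return (exception_class, message) for the last 'Caused by:' line.

    Returns None when the trace contains no 'Caused by:' line at all.
    The message is an empty string when the exception carries none.
    """
    causes = CAUSED_BY.findall(trace_text)
    return causes[-1] if causes else None


if __name__ == "__main__":
    sample = (
        "SEVERE: java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException\n"
        "\tat java.lang.Class.forName0(Native Method)\n"
        "Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException\n"
        "\tat java.net.URLClassLoader$1.run(URLClassLoader.java:202)\n"
    )
    print(root_cause(sample))
```

For the trace in this message, the innermost cause is a ClassNotFoundException for org.apache.tika.exception.TikaException, which is consistent with the Tika jars not being on the web application's classpath.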
Re: Tika, Solr running under Tomcat 6 on Debian
In reply to Mike (satish01sud...@gmail.com), Mon, Apr 11, 2011 at 3:10 PM:

\apache-solr-3.1.0\contrib\extraction\lib\tika*.jar

--
Best Regards,
Roy Liu
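To confirm that a directory such as the one Roy names really contains the missing class, the jars can be scanned directly, since a jar is a zip archive. This is a minimal sketch; the function name find_jars_with_class is hypothetical, and the directory in the usage example is taken from Roy's message and may differ on other installations.

```python
import sys
import zipfile
from pathlib import Path


def find_jars_with_class(root, class_name):
    """Return the jars under `root` that contain `class_name`.

    `class_name` is a fully qualified Java class name, for example
    org.apache.tika.exception.TikaException.
    """
    # Inside a jar, classes are stored as path-like entries ending in .class
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for jar in sorted(Path(root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    matches.append(jar)
        except zipfile.BadZipFile:
            # Skip files that have a .jar suffix but are not valid archives
            continue
    return matches


if __name__ == "__main__":
    # Usage: python find_class.py <directory> <fully.qualified.ClassName>
    for path in find_jars_with_class(sys.argv[1], sys.argv[2]):
        print(path)
```

Run against apache-solr-3.1.0\contrib\extraction\lib with org.apache.tika.exception.TikaException as the class name, this lists the jar or jars that provide the class. The same check can then be repeated on whatever lib directory the Tomcat-hosted Solr instance actually loads from.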
Re: Tika, Solr running under Tomcat 6 on Debian
Hi Roy,

Thank you for the quick reply. When I tried to index the PDF file, I got a successful response (status 0, QTime 479).

Query:

    http://localhost:8080/solr/update/extract?stream.file=D:\mike\lucene\apache-solr-1.4.1\example\exampledocs\Struts%202%20Design%20and%20Programming1.pdf&stream.contentType=application/pdf&literal.id=Struts%202%20Design%20and%20Programming1.pdf&defaultField=text&commit=true

But when I searched for content from the PDF, I got no results. The response header showed status 0 and QTime 2, with the parameters indent=on, start=0, q=struts, rows=10, version=2.2.

Could you please let me know if I am doing anything wrong? It worked fine with the default Jetty server before I moved to Tomcat 6. I followed the installation steps from http://wiki.apache.org/solr/SolrTomcat (Tomcat on Windows, Single Solr app).

Thanks,
Mike

--
View this message in context: http://lucene.472066.n3.nabble.com/Tika-Solr-running-under-Tomcat-6-on-Debian-tp993295p2805974.html
Sent from the Solr - User mailing list archive at Nabble.com.
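The extract request in this message carries five parameters, and a file name with spaces, so it is easy to get a separator or an escape wrong when typing the URL by hand. A minimal sketch of building the same request programmatically follows. The parameter names and the base URL come from Mike's query; the function name build_extract_url is hypothetical. Note that urlencode escapes more aggressively than the hand-written URL (spaces become "+", backslashes become %5C), which decodes to the same values on the server side.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, stream_file, doc_id,
                      content_type="application/pdf"):
    """Build an /update/extract URL with correctly separated, escaped parameters."""
    params = [
        ("stream.file", stream_file),         # path as seen by the Solr server
        ("stream.contentType", content_type),
        ("literal.id", doc_id),               # value stored in the id field
        ("defaultField", "text"),
        ("commit", "true"),
    ]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_extract_url(
        "http://localhost:8080/solr/update/extract",
        r"D:\mike\lucene\apache-solr-1.4.1\example\exampledocs"
        r"\Struts 2 Design and Programming1.pdf",
        "Struts 2 Design and Programming1.pdf",
    ))
```

This only rules out a malformed request. It does not by itself explain why a search for "struts" returns nothing after a successful extract call.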
Re: Tika, Solr running under Tomcat 6 on Debian
In reply to Mike (satish01sud...@gmail.com), Mon, Apr 11, 2011 at 2:49 AM:

Ah! Did you set the UTF-8 parameter in Tomcat?

--
Lance Norskog
goks...@gmail.com
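Lance does not say which setting he means. One plausible reading, and it is an assumption here, is the URIEncoding="UTF-8" attribute on the Connector element in Tomcat's conf/server.xml, which the SolrTomcat wiki page describes. The sketch below checks a server.xml for connectors that lack it; the function name connectors_missing_uri_encoding is hypothetical.

```python
import xml.etree.ElementTree as ET


def connectors_missing_uri_encoding(server_xml_text, expected="UTF-8"):
    """Return the ports of <Connector> elements whose URIEncoding is not `expected`.

    `server_xml_text` is the content of Tomcat's conf/server.xml as a string.
    """
    root = ET.fromstring(server_xml_text)
    missing = []
    for connector in root.iter("Connector"):
        encoding = connector.get("URIEncoding", "")
        if encoding.upper() != expected.upper():
            # "?" marks a connector that declares no port attribute
            missing.append(connector.get("port", "?"))
    return missing


if __name__ == "__main__":
    sample = """
    <Server>
      <Service name="Catalina">
        <Connector port="8080" protocol="HTTP/1.1" />
        <Connector port="8009" protocol="AJP/1.3" URIEncoding="UTF-8" />
      </Service>
    </Server>
    """
    print(connectors_missing_uri_encoding(sample))
```

If the connector serving Solr shows up in the result, adding the attribute and restarting Tomcat is the change the wiki describes. Whether that is what affects the empty search results in this thread is not established by the messages here.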
Re: Tika, Solr running under Tomcat 6 on Debian
In reply to Tim AtLee (timat...@gmail.com), Sat, Jul 24, 2010 at 11:21 PM:

I would start over from the Solr 1.4.1 binary distribution and follow the instructions on the wiki: http://wiki.apache.org/solr/ExtractingRequestHandler

(Java classpath stuff is notoriously difficult, especially when dynamically configured and loaded. I often cannot tell if Java cannot load the class it prints, or if that class requires others.)
Tika, Solr running under Tomcat 6 on Debian
Hello,

I desperately hope someone can help me here... I'm a bit out of my league. I am trying to implement content extraction using Tika and Solr as part of a search package for a product I am using. I have been successful in getting Solr to work as far as indexing text and returning search results; however, I am hitting a wall when I try to use Tika for content extraction.

I added the following configuration to solrconfig.xml:

    <requestHandler name="/extract/tika" class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
      <lst name="defaults">
      </lst>
      <!-- This path only extracts - never updates -->
      <lst name="invariants">
        <bool name="extractOnly">true</bool>
      </lst>
    </requestHandler>

During a test, I receive the following error:

    org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.extraction.ExtractingRequestHandler'

The full text of this error is listed below.

As I indicated in the subject line, I am using Debian Linux Squeeze (testing). Tomcat is at version 6.0.26 and is installed by apt. Solr is also installed from apt, and is at version 1.4.0.2010.04.24.07.20.22. The output of java -version looks like this:

    java version "1.6.0_20"
    Java(TM) SE Runtime Environment (build 1.6.0_20-b02)

The JDK is also at the same version, and also from apt.

I have built Tika from source (nightly build) using mvn2, and placed the compiled jars in /lib. /lib is located at /var/solr/site/lib, along with /var/solr/site/conf and /var/solr/site/data. Hopefully this is the right place to put the jars.

I also tried building Solr from source (also the nightly build), and was able to get Solr sort of working (not Tika). I could run a single instance, but getting multiple instances running didn't seem to be in the cards. I didn't pursue this any further. If this is the route I should go down, and anyone can direct me on how to install a built Solr war and configure it so I can use multiple instances, I'll gladly try it out.

I found a similar issue to mine at http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200911.mbox/%3cd2b0462d72664840b72118cb4437cbd403e2a...@ndhamrexm22.amer.pfizer.com%3e

From that email, I tried copying the built Solr jars into the Solr site's lib directory, then realized that the likelihood of that working was pretty slim: jars built from a nightly build were probably not going to work with a .war from 1.4.0. As you might have guessed, it didn't. This is when I tried building Solr from source (thinking that if all the Solr stuff was at the same revision, it might work).

I have not tried any of this under Jetty. It's my understanding that Jetty won't let me run multiple instances, and since this is a requirement for what I'm doing, I'm more or less constrained to Tomcat.

I have also seen some other references to using OpenJDK instead of the Sun JDK. This resulted in the same error (I don't recall the site where I saw this referenced).

Any help would be greatly appreciated. I am new to Tomcat and Solr, so I may have some dumb follow-up questions that will be googled thoroughly first. Sorry in advance.

Tim

--

    org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.extraction.ExtractingRequestHandler'
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373)
        at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:414)
        at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:450)
        at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:152)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:557)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
        at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:295)
        at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:422)
        at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:115)
        at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3838)
        at org.apache.catalina.core.StandardContext.start(StandardContext.java:4488)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:546)
        at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:637)
        at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:563)
        at