Re: Tika, Solr running under Tomcat 6 on Debian

2011-04-11 Thread Mike
Hi All,

I have the same issue. I have installed a Solr instance on Tomcat 6. When I try
to index a PDF, I run into the exception below:

11 Apr, 2011 12:11:55 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NoClassDefFoundError: org/apache/tika/exception/TikaException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.ClassNotFoundException: org.apache.tika.exception.TikaException
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
... 22 more

I could not find any Tika jar file.
Could you please help me fix this issue?

Thanks,
Mike

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Tika-Solr-running-under-Tomcat-6-on-Debian-tp993295p2805615.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Tika, Solr running under Tomcat 6 on Debian

2011-04-11 Thread Roy Liu
\apache-solr-3.1.0\contrib\extraction\lib\tika*.jar
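
To check whether that directory really holds the class named in the stack trace, something like the following could be run against it. This is only a minimal sketch: JarClassFinder and its method names are made up for illustration and are not part of Solr or Tika. It assumes the jars sit directly in the directory passed as the first argument, and it relies only on the JDK's java.util.jar.JarFile.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: lists the jars in a directory that contain a given class. */
final class JarClassFinder {

    /** Converts "a.b.C" into the jar entry name "a/b/C.class". */
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns every *.jar located directly inside libDir that holds the class. */
    static List<Path> findJarsContaining(Path libDir, String className) throws IOException {
        String entry = entryName(className);
        List<Path> jars;
        try (Stream<Path> files = Files.list(libDir)) {
            jars = files.filter(p -> p.getFileName().toString().endsWith(".jar"))
                        .sorted()
                        .collect(Collectors.toList());
        }
        List<Path> hits = new ArrayList<>();
        for (Path jar : jars) {
            // Open each jar and look up the class file by its entry name.
            try (JarFile jf = new JarFile(jar.toFile())) {
                if (jf.getJarEntry(entry) != null) {
                    hits.add(jar);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: directory to scan (assumption: the extraction lib folder)
        Path libDir = Paths.get(args.length > 0 ? args[0] : ".");
        String cls = "org.apache.tika.exception.TikaException";
        List<Path> hits = findJarsContaining(libDir, cls);
        if (hits.isEmpty()) {
            System.out.println("No jar in " + libDir + " contains " + cls);
        } else {
            hits.forEach(p -> System.out.println(cls + " found in " + p));
        }
    }
}
```

Pointing it at the directory above should list the jar that provides TikaException, if one is present there. The same check can be repeated on whichever lib directory the Tomcat-hosted Solr instance actually reads.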

-- 
Best Regards,
Roy Liu



Re: Tika, Solr running under Tomcat 6 on Debian

2011-04-11 Thread Mike
Hi Roy,

Thank you for the quick reply. When I tried to index the PDF file, I got
the following response:

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">479</int>
</lst>

Query:
http://localhost:8080/solr/update/extract?stream.file=D:\mike\lucene\apache-solr-1.4.1\example\exampledocs\Struts%202%20Design%20and%20Programming1.pdf&stream.contentType=application/pdf&literal.id=Struts%202%20Design%20and%20Programming1.pdf&defaultField=text&commit=true
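
For reference, a request like the one above can be assembled in code so that each parameter is separated by & and URL-encoded. This is a minimal sketch, assuming the same host, port and parameter names as in the query. ExtractUrlBuilder is a hypothetical helper, not a Solr API. Note that URLEncoder writes spaces as + rather than %20, which is equivalent inside a query string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that joins query parameters with '&' and URL-encodes them. */
final class ExtractUrlBuilder {

    static String build(String baseUrl, Map<String, String> params)
            throws UnsupportedEncodingException {
        StringBuilder sb = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator)
              .append(URLEncoder.encode(e.getKey(), "UTF-8"))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            separator = '&'; // every parameter after the first is joined with '&'
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Values taken from the query in this message; LinkedHashMap keeps their order.
        String file = "D:\\mike\\lucene\\apache-solr-1.4.1\\example\\exampledocs\\"
                + "Struts 2 Design and Programming1.pdf";
        Map<String, String> params = new LinkedHashMap<>();
        params.put("stream.file", file);
        params.put("stream.contentType", "application/pdf");
        params.put("literal.id", "Struts 2 Design and Programming1.pdf");
        params.put("defaultField", "text");
        params.put("commit", "true");
        System.out.println(build("http://localhost:8080/solr/update/extract", params));
    }
}
```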

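But when I searched for content from the PDF, I did not get any
results:

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">2</int>
  <lst name="params">
    <str name="indent">on</str>
    <str name="start">0</str>
    <str name="q">struts</str>
    <str name="rows">10</str>
    <str name="version">2.2</str>
  </lst>
</lst>

Could you please let me know if I am doing anything wrong? It worked fine
when I tried it with the default Jetty server, before moving to Tomcat 6.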

I have followed installation steps from
http://wiki.apache.org/solr/SolrTomcat
(Tomcat on Windows Single Solr app).

Thanks,
Mike



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Tika-Solr-running-under-Tomcat-6-on-Debian-tp993295p2805974.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Tika, Solr running under Tomcat 6 on Debian

2011-04-11 Thread Lance Norskog
Ah! Did you set the UTF-8 parameter in Tomcat?

-- 
Lance Norskog
goks...@gmail.com


Re: Tika, Solr running under Tomcat 6 on Debian

2010-07-27 Thread Lance Norskog
I would start over from the Solr 1.4.1 binary distribution and follow
the instructions on the wiki:

http://wiki.apache.org/solr/ExtractingRequestHandler

(Java classpath stuff is notoriously difficult, especially when
dynamically configured and loaded. I often cannot tell if Java cannot
load the class it prints, or if that class requires others.)
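
One way to narrow down which of the two cases applies is to load the class by name and look at which error comes back. This is a rough sketch, assuming it runs with the same classpath (or class loader) as the web application. ClassLoadProbe is a hypothetical name, and the class names in main are simply the ones from this thread.

```java
/** Hypothetical probe: reports whether a class, or something it needs, fails to load. */
final class ClassLoadProbe {

    /** Tries to load className through loader and describes the outcome. */
    static String probe(String className, ClassLoader loader) {
        try {
            // initialize=true also links and initializes the class, which is
            // usually when missing dependencies surface.
            Class.forName(className, true, loader);
            return "OK: " + className;
        } catch (ClassNotFoundException e) {
            // The named class itself could not be located by this loader.
            return "NOT FOUND: " + className;
        } catch (NoClassDefFoundError e) {
            // The class was located, but a class it depends on was not.
            return "MISSING DEPENDENCY: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        String[] names = {
            "org.apache.solr.handler.extraction.ExtractingRequestHandler",
            "org.apache.tika.exception.TikaException"
        };
        for (String name : names) {
            System.out.println(probe(name, loader));
        }
    }
}
```

A "NOT FOUND" result points at the jar for the named class being absent from the classpath, while "MISSING DEPENDENCY" names the class that the loaded one could not resolve.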


Tika, Solr running under Tomcat 6 on Debian

2010-07-25 Thread Tim AtLee
Hello

I desperately hope someone can help me here...  I'm a bit out of my league
here.

I am trying to implement content extraction using Tika and Solr as part of a
search package for a product I am using.  I have been successful in getting
Solr to index text and return search results; however, I am hitting a wall
when I try to use Tika for content extraction.

I add the following configuration to solrconfig.xml:
  <requestHandler name="/extract/tika"
      class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
    <lst name="defaults">
    </lst>
    <!-- This path only extracts - never updates -->
    <lst name="invariants">
      <bool name="extractOnly">true</bool>
    </lst>
  </requestHandler>

During a test, I receive the following error:
org.apache.solr.common.SolrException: Error loading class
'org.apache.solr.handler.extraction.ExtractingRequestHandler'

The full text of this error is listed below.

So, as I indicated in the subject line, I am using Debian Linux Squeeze
(testing).  Tomcat is at version 6.0.26 and was installed via apt.

Solr is also installed from apt, and is at version:
1.4.0.2010.04.24.07.20.22.

The output of java -version looks like this:
java version "1.6.0_20"
Java(TM) SE Runtime Environment (build 1.6.0_20-b02)

The JDK is also at the same version, and also from apt.

I have built Tika from source (nightly build) using mvn2, and placed
the compiled jars in the lib directory at /var/solr/site/lib, alongside
/var/solr/site/conf and /var/solr/site/data.  Hopefully this is the
right place to put the jars.

I also tried building Solr from source (also the nightly build), and was
able to get Solr sort of working (not Tika).  I could run a single instance,
but getting multiple instances running didn't seem to be in the cards.  I
didn't pursue this any further.  If this is the route I should go down, and
anyone can direct me on how to install a built Solr war and configure it so
I can use multiple instances, I'll gladly try it out.

I found a similar issue to mine at
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200911.mbox/%3cd2b0462d72664840b72118cb4437cbd403e2a...@ndhamrexm22.amer.pfizer.com%3e
Following that email, I tried copying the built Solr jars into the Solr site's
lib directory, then realized that the likelihood of that working was pretty
slim - jars built from a nightly build were probably not going to work with
a .war from 1.4.0.  As you might have guessed, they didn't.  This is
when I tried building Solr from source (thinking that if all the Solr stuff
was at the same revision, it might work).

I have not tried all of this under Jetty.  It's my understanding that Jetty
won't let me do multiple instances, and since this is a requirement for what
I'm doing, I'm more or less constrained to Tomcat.

I have also seen some other references to using OpenJDK instead of Sun JDK.
Trying that resulted in the same error (I don't recall the site where I saw
this referenced).

Any help would be greatly appreciated.  I am new to Tomcat and Solr, so I
may have some dumb follow-up questions that will be googled thoroughly
first.  Sorry in advance..

Tim

--

-
org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.extraction.ExtractingRequestHandler'
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:414)
at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:450)
at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:152)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:557)
at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:295)
at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:422)
at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:115)
at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3838)
at org.apache.catalina.core.StandardContext.start(StandardContext.java:4488)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:546)
at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:637)
at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:563)
at