This is what I have; I didn't alter it, so I believe it's the default:

<!-- Solr Cell: -->
  <requestHandler name="/update/extract" 
    <lst name="defaults">
      <!-- All the main content goes into "text"... if you need to return
           the extracted text or do highlighting, use a stored field. -->
      <str name="fmap.content">text</str>
      <str name="lowernames">true</str>
      <str name="uprefix">ignored_</str>

      <!-- capture link hrefs but ignore div attributes -->
      <str name="captureAttr">true</str>
      <str name="fmap.a">links</str>
      <str name="fmap.div">ignored_</str>
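
For comparison, here is a sketch of what the complete handler entry would look like if it matches the stock Solr 1.4.x example solrconfig.xml. The excerpt above is cut off, so the class attribute (taken from the class named in the error message quoted below), the startup="lazy" attribute, and the closing tags are assumptions about the stock file rather than a copy of this install's config:

```xml
<!-- Solr Cell: sketch of the full entry. The class, startup, and closing
     tags are assumed from the stock example config, not from this install. -->
<requestHandler name="/update/extract"
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler"
                startup="lazy">
  <lst name="defaults">
    <!-- All the main content goes into "text"... if you need to return
         the extracted text or do highlighting, use a stored field. -->
    <str name="fmap.content">text</str>
    <str name="lowernames">true</str>
    <str name="uprefix">ignored_</str>

    <!-- capture link hrefs but ignore div attributes -->
    <str name="captureAttr">true</str>
    <str name="fmap.a">links</str>
    <str name="fmap.div">ignored_</str>
  </lst>
</requestHandler>
```

If the handler is declared with startup="lazy", it is only instantiated on the first request, which would be consistent with the error showing up when a PDF is added rather than when the core starts.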

-----Original Message-----
From: Grant Ingersoll
Sent: Monday, January 03, 2011 8:10 PM
Subject: Re: Setting up Solr for PDFs on JBoss

What does your solrconfig.xml look like for setting up the ExtractingRequestHandler?


On Jan 3, 2011, at 4:44 PM, Olson, Ron wrote:

> Hi all-
>
> After testing the PDF import functionality in my local copy of Solr 1.4.1
> with the included Jetty app server, I tried replicating it using my copy of
> Solr running in JBoss 5.10 (which uses Tomcat as its servlet container). When
> I try to add a PDF, I get an error buried in the stack trace:
>
> Caused by: org.apache.solr.common.SolrException: Error Instantiating Request Handler, org.apache.solr.handler.extraction.ExtractingRequestHandler is not a org.apache.solr.request.SolrRequestHandler
>
> I am using multiple cores, but they all use the common "lib" directory,
> instead of the "core/lib" directory. This lib directory is what is added to
> the classpath when JBoss starts ($JBOSS_HOME/server/solr_test/lib), so all
> the jars in this directory should be available to anything in the "deploy"
> directory (just mentioning in case people aren't familiar with JBoss). I've
> added all the jars from the contrib/extraction/lib directory, as well as the
> jars from dist.
>
> My lib directory is effectively:
> apache-solr-cell-1.4.1.jar            easymock.jar                          lucene-spellchecker-2.9.3.jar
> apache-solr-clustering-1.4.1.jar      fontbox-0.1.0.jar                     nekohtml-1.9.9.jar
> apache-solr-core-1.4.1.jar            geronimo-stax-api_1.0_spec-1.0.1.jar  ojdbc14.jar
> apache-solr-solrj-1.4.1.jar           geronimo-stax-api_1.0_spec-1.0.jar    ooxml-schemas-1.0.jar
> asm-3.1.jar                           icu4j-3.8.jar                         pdfbox-0.7.3.jar
> bcmail-jdk14-136.jar                  jcl-over-slf4j-1.5.5.jar              poi-3.5-beta6.jar
> bcprov-jdk14-136.jar                  jempbox-0.2.0.jar                     poi-ooxml-3.5-beta6.jar
> commons-codec-1.3.jar                 junit-4.3.jar                         poi-scratchpad-3.5-beta6.jar
> commons-compress-1.0.jar              log4j-1.2.14.jar                      slf4j-api-1.5.5.jar
> commons-csv-1.0-SNAPSHOT-r609327.jar  lucene-analyzers-2.9.3.jar            slf4j-jdk14-1.5.5.jar
> commons-fileupload-1.2.1.jar          lucene-core-2.9.3.jar                 tika-core-0.4.jar
> commons-httpclient-3.1.jar            lucene-highlighter-2.9.3.jar          tika-parsers-0.4.jar
> commons-io-1.4.jar                    lucene-memory-2.9.3.jar               wstx-asl-3.2.7.jar
> commons-lang-2.1.jar                  lucene-misc-2.9.3.jar                 xercesImpl-2.8.1.jar
> commons-logging-1.1.1.jar             lucene-queries-2.9.3.jar              xml-apis-1.0.b2.jar
> dom4j-1.6.1.jar                       lucene-snowball-2.9.3.jar             xmlbeans-2.3.0.jar
>
> I know several of these jars are already essentially present in JBoss (log4j,
> for example), but I'm at a loss as to what to remove/add to get it to work.
> Anyone have any ideas of configuring it under JBoss? The other cores are
> database-based (thus the use of ojdbc14.jar), and they work fine.
>
> Thanks for any help,
> Ron
> DISCLAIMER: This electronic message, including any attachments, files or 
> documents, is intended only for the addressee and may contain CONFIDENTIAL, 
> PROPRIETARY or LEGALLY PRIVILEGED information.  If you are not the intended 
> recipient, you are hereby notified that any use, disclosure, copying or 
> distribution of this message or any of the information included in or with it 
> is  unauthorized and strictly prohibited.  If you have received this message 
> in error, please notify the sender immediately by reply e-mail and 
> permanently delete and destroy this message and its attachments, along with 
> any copies thereof. This message does not create any contractual obligation 
> on behalf of the sender or Law Bulletin Publishing Company.
> Thank you.

Grant Ingersoll

