Hi Grant,
Sorry, I had already posted on the solr-user mailing list; I thought that since it 
had more to do with the SVN version, I would post here as well.
Here is a link to my mail: 
http://www.mail-archive.com/[email protected]/msg34964.html
As for the request handler:

<requestHandler name="/update/extract" 
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="fmap.content">content</str>
    <str name="lowernames">false</str>
    <str name="uprefix">tika_</str>
    <str name="defaultField">content</str>
    <str name="captureAttr">true</str>
    <str name="fmap.a">links</str>
  </lst>
</requestHandler>

The fields are basic indexed, multivalued strings, for testing purposes.
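
For reference, a request against a handler like this one can be sketched as below. This is only a minimal illustration, not the exact call used here: the base URL `http://localhost:8983/solr`, the document id `doc1`, and the helper name `build_extract_url` are assumptions. `literal.id` and `extractOnly` are ExtractingRequestHandler request parameters; with `extractOnly=true` Solr returns what Tika extracted instead of indexing it, which helps to see whether the content comes back blank.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, extract_only=True):
    """Build a URL for the /update/extract handler (hypothetical helper).

    base_url and doc_id are placeholders; adjust them to the real setup.
    """
    # literal.<field> sets a field value directly on the document.
    params = {"literal.id": doc_id}
    if extract_only:
        # Return the extracted content in the response instead of indexing it.
        params["extractOnly"] = "true"
    else:
        # Index the document and commit so it becomes searchable.
        params["commit"] = "true"
    return base_url.rstrip("/") + "/update/extract?" + urlencode(params)


if __name__ == "__main__":
    print(build_extract_url("http://localhost:8983/solr", "doc1"))
```

The PDF itself would then be sent as the body of a POST to that URL (for example as a multipart file upload), and the response compared with what the standalone Tika app produces for the same file.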
I would gladly help debug Solr or Tika, but I can't really compile Tika: it 
uses Maven, which cannot get through my company's proxy to download 
dependencies... I can debug Solr, though.
Marc

From: [email protected]
Subject: Re: Problem with PDF extraction
Date: Mon, 26 Apr 2010 18:08:04 -0400
To: [email protected]



Hi Marc,
Can you ask on [email protected] and give more information about any 
errors that occur in your Solr log, plus the setup of the 
ExtractingRequestHandler and the related schema?
-Grant
On Apr 26, 2010, at 5:04 PM, Marc Ghorayeb wrote:
Hello,
I have been having problems with PDFs randomly crashing the Solr 1.4 server, so I 
tried out the SVN version, which contains a newer Tika library. On its own, the 
Tika app correctly extracts the content of my PDF. However, inside Solr, when I 
upload a PDF file to my update/extract handler, it does not seem to parse it (a 
blank file is output...). The literal values do get indexed, though. I have 
had no luck in getting the Tika parsing to work. For some reason, I get the 
same result whether or not tika-parsers-0.7.jar is present in the lib 
folder, whereas if the tika-core-0.7 jar is absent, it just crashes (which 
seems normal to me...).
I don't seem to be the only one having this problem (on the user mailing list, 
that is). Can anyone help me out? It would be greatly appreciated.
I use a fairly classic schema and the default request handlers.
Marc Ghorayeb.

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem using Solr/Lucene: 
http://www.lucidimagination.com/search


                                          
