Re: Problem with PDF extraction

Grant Ingersoll Mon, 26 Apr 2010 15:08:36 -0700

Hi Marc,

Can you ask on [email protected] and give more information: any errors that 
occur in your Solr log, plus the setup of the ExtractingRequestHandler and the 
related schema?
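
One way to gather that information is to send the PDF to the extract handler with extractOnly=true, so Solr returns what Tika extracted instead of indexing it. The following is a minimal sketch, not a tested recipe: the host, port, and /solr/update/extract path are assumed defaults from the example setup, and build_extract_url and post_pdf are hypothetical helper names.

```python
import urllib.parse
import urllib.request

# Assumed default from the Solr example setup; adjust for your install.
SOLR_BASE = "http://localhost:8983/solr"


def build_extract_url(base, doc_id, extract_only=True):
    """Build an /update/extract URL carrying one literal field (hypothetical helper)."""
    params = [("literal.id", doc_id)]
    if extract_only:
        # extractOnly=true asks the handler to return the extracted
        # content in the response instead of indexing the document.
        params.append(("extractOnly", "true"))
    else:
        # Otherwise index the document and commit right away.
        params.append(("commit", "true"))
    return base.rstrip("/") + "/update/extract?" + urllib.parse.urlencode(params)


def post_pdf(url, pdf_path):
    """POST the raw PDF bytes and return the response body as text (hypothetical helper)."""
    with open(pdf_path, "rb") as fh:
        body = fh.read()
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/pdf"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling post_pdf(build_extract_url(SOLR_BASE, "doc1"), "sample.pdf") against a running server should show whether the response contains any extracted text; an empty body there, with no matching error in the Solr log, would point at the Tika parser setup inside Solr and not at the PDF itself.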


-Grant

On Apr 26, 2010, at 5:04 PM, Marc Ghorayeb wrote:

> Hello,
> 
> I have been having problems with PDFs randomly crashing the Solr 1.4 server, so 
> I tried the SVN version, which contains a newer Tika library. On its own, 
> the Tika app correctly extracts the content of my PDF. However, inside Solr, 
> when I upload a PDF file to my update/extract handler, it does not seem to 
> parse it (a blank file is output). The literal values do get indexed, 
> though. I have had no luck getting the Tika parsing to work. For some 
> reason, I get the same result whether or not tika-parsers-0.7.jar is 
> present in the lib folder, whereas if the tika-core-0.7 jar is absent, it 
> just crashes (which seems normal to me).
> 
> I don't seem to be the only one having this problem (on the user mailing list 
> that is). Can anyone help me out? It would be greatly appreciated.
> 
> I use a fairly classic schema and the default request handlers.
> 
> Marc Ghorayeb.
> 

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: 
http://www.lucidimagination.com/search
