Problem with PDF extraction

Marc Ghorayeb Mon, 26 Apr 2010 14:04:36 -0700
Hello,
I have been having problems with PDFs randomly crashing the Solr 1.4 server, so
I tried out the SVN version, which contains a newer Tika library. On its own,
the Tika app correctly extracts the content of my PDF. However, inside Solr,
when I upload a PDF file to my update/extract handler, it does not seem to
parse it (the output is blank). The literal values do get indexed, though. I
have had no luck getting the Tika parsing to work. For some reason, I get the
same result whether or not tika-parsers-0.7.jar is present in the lib folder,
whereas if the tika-core-0.7 jar is absent, it just crashes (which seems normal
to me).
Judging by the user mailing list, I don't seem to be the only one having this
problem. Can anyone help me out? It would be greatly appreciated.
I use a fairly classic schema and the default request handlers.
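
For reference, the kind of upload described above can be sketched roughly as
follows. This is only an illustration, not the exact request from this setup:
the base URL, the file name, the literal.id value and the helper names
build_extract_url and post_pdf are all assumptions. The extractOnly=true
parameter asks the extracting handler to return what Tika extracted instead of
indexing it, which may help show whether any text comes out of the PDF at all.

```python
import urllib.parse
import urllib.request


def build_extract_url(base_url, doc_id, extract_only=True):
    """Build a URL for the update/extract handler (illustrative values only)."""
    # literal.<field> sets a field value directly, bypassing Tika.
    params = {"literal.id": doc_id}
    if extract_only:
        # Return the extracted content in the response; nothing is indexed.
        params["extractOnly"] = "true"
    else:
        # Index the document and commit so it becomes searchable.
        params["commit"] = "true"
    query = urllib.parse.urlencode(params)
    return base_url.rstrip("/") + "/update/extract?" + query


def post_pdf(url, pdf_path):
    """POST the raw PDF bytes to the handler and return the response body."""
    with open(pdf_path, "rb") as fh:
        body = fh.read()
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/pdf"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical usage (needs a running Solr instance and a local PDF):
# url = build_extract_url("http://localhost:8983/solr", "doc1")
# print(post_pdf(url, "sample.pdf"))
```

If the response from an extractOnly request contains no text, the problem is
probably in the extraction step rather than in the schema or field mapping.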
Marc Ghorayeb