A little more info: this seems to be a classloading issue.  The tests pass, but 
they aren't loading the Tika libraries via the Solr ResourceLoader, whereas the 
example is.  Marc, one thing to try is to unjar the Solr WAR file and put the 
Tika libs in there, as I bet it will then work.  Note, however, that I haven't 
tried this.
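
For illustration only, here is a minimal JDK-only sketch of the kind of 
classloading mismatch suspected above. It is not Solr or Tika code. It assumes 
the parser lookup is driven by a registration file that only one class loader 
can see, which is an assumption about the cause, not a confirmed diagnosis. 
The names ClassLoaderLookupDemo, Parser, PdfParser, countProviders and 
loaderWithRegistration are all hypothetical.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ServiceLoader;

final class ClassLoaderLookupDemo {

    /** Hypothetical stand-in for a parser interface; this is not Tika's API. */
    public interface Parser {
        String name();
    }

    /** Hypothetical provider, registered only in the extra classpath root. */
    public static class PdfParser implements Parser {
        public PdfParser() {}
        @Override
        public String name() { return "pdf"; }
    }

    /** Counts the Parser providers that are discoverable through the given loader. */
    public static int countProviders(ClassLoader loader) {
        int count = 0;
        for (Parser ignored : ServiceLoader.load(Parser.class, loader)) {
            count++;
        }
        return count;
    }

    /**
     * Builds a child loader with one extra root directory that holds a
     * META-INF/services registration for PdfParser. The parent loader
     * cannot see that directory.
     */
    public static URLClassLoader loaderWithRegistration(ClassLoader parent)
            throws IOException {
        Path root = Files.createTempDirectory("plugin-libs");
        Path services = Files.createDirectories(root.resolve("META-INF/services"));
        Files.write(services.resolve(Parser.class.getName()),
                PdfParser.class.getName().getBytes(StandardCharsets.UTF_8));
        return new URLClassLoader(new URL[] { root.toUri().toURL() }, parent);
    }

    public static void main(String[] args) throws IOException {
        ClassLoader app = ClassLoaderLookupDemo.class.getClassLoader();
        try (URLClassLoader plugin = loaderWithRegistration(app)) {
            // Same lookup, two different loaders.
            System.out.println("app loader sees:    " + countProviders(app));
            System.out.println("plugin loader sees: " + countProviders(plugin));
        }
    }
}
```

The lookup through the application loader finds no provider, while the same 
lookup through the child loader finds one. If something similar is happening 
here, code that resolves parsers through the wrong loader would come up empty 
even though the libraries are present.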

On May 3, 2010, at 6:24 PM, Grant Ingersoll wrote:

> I've opened https://issues.apache.org/jira/browse/SOLR-1902 to track this.  
> It is indeed a bug somewhere (still investigating).  It seems that Tika is 
> now picking an EmptyParser implementation when trying to determine which 
> parser to use, despite the fact that it properly identifies the MIME Type.
> 
> -Grant
> 
> On May 3, 2010, at 5:36 PM, Grant Ingersoll wrote:
> 
>> I'm investigating.
>> 
>> On May 3, 2010, at 5:17 AM, Marc Ghorayeb wrote:
>> 
>>> 
>>> Hi,
>>> Grant, I confirm what Praveen has said: no PDF I try works with 
>>> the new Tika and SVN versions. :(
>>> Marc
>>> 
>>>> From: sagar...@opentext.com
>>>> To: solr-user@lucene.apache.org
>>>> Date: Mon, 3 May 2010 13:05:24 +0530
>>>> Subject: RE: Problem with pdf, upgrading Cell
>>>> 
>>>> Hello,
>>>> 
>>>> Please let me know if anybody figured out a way out of this issue. 
>>>> 
>>>> Thanks,
>>>> Sandhya
>>>> 
>>>> -----Original Message-----
>>>> From: Praveen Agrawal [mailto:pkal...@gmail.com] 
>>>> Sent: Friday, April 30, 2010 11:14 PM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Re: Problem with pdf, upgrading Cell
>>>> 
>>>> Grant,
>>>> You can try any of the sample PDFs that come in the /docs folder of the
>>>> Solr 1.4 distribution. I tried 'Installing Solr in Tomcat.pdf',
>>>> 'index.pdf', etc. Only metadata (i.e. stream_size and content_type, apart
>>>> from my own literals) is indexed; the content is missing.
>>>> 
>>>> 
>>>> On Fri, Apr 30, 2010 at 8:52 PM, Grant Ingersoll 
>>>> <gsing...@apache.org> wrote:
>>>> 
>>>>> Praveen and Marc,
>>>>> 
>>>>> Can you share the PDF (feel free to email my private email) that fails in
>>>>> Solr?
>>>>> 
>>>>> Thanks,
>>>>> Grant
>>>>> 
>>>>> 
>>>>> On Apr 30, 2010, at 7:55 AM, Marc Ghorayeb wrote:
>>>>> 
>>>>>> 
>>>>>> Hi
>>>>>> Nope, I didn't get it to work... Just like you, the command-line version
>>>>>> of Tika extracts the content correctly, but once included in Solr, no
>>>>>> content is extracted.
>>>>>> What I have tried so far:
>>>>>> - Updating the Tika libraries inside the public Solr 1.4 release: no
>>>>>>   luck there.
>>>>>> - Downloading the latest SVN version, compiling it, and starting from a
>>>>>>   simple schema: still no luck.
>>>>>> - Getting other versions compiled on Hudson (nightly builds) and testing
>>>>>>   them as well: still no extraction.
>>>>>> I sent a mail to the developers' mailing list, but they told me I should
>>>>>> just mail here. I hope some developer reads this, because it's quite an
>>>>>> important feature of Solr and somehow it got broken between the 1.4
>>>>>> release and the latest version in SVN.
>>>>>> Marc
>>>>> 
>>>>> --------------------------
>>>>> Grant Ingersoll
>>>>> http://www.lucidimagination.com/
>>>>> 
>>>>> Search the Lucene ecosystem using Solr/Lucene:
>>>>> http://www.lucidimagination.com/search
>>>>> 
>>>>> 
>> 
>> 
> 
> 

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: 
http://www.lucidimagination.com/search
