Hi Erick,

The document is committed successfully; it is just missing all of the
content extracted by Tika when I query for that document.

That is, the mapped content field attr_content is empty (fmap.content=attr_content):

<result name="response" numFound="1" start="0" maxScore="1.9162908">
<doc>
<float name="score">1.9162908</float>
<arr name="attr_character_count">
<str>24</str>
</arr>
<arr name="attr_content">
<str></str>
</arr>
<arr name="attr_creation_date">
<str>2009-04-16T11:32:00</str>
</arr>
<arr name="attr_date">
<str>2012-11-23T00:29:39.73</str>
</arr>

...

</result>
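
For anyone reproducing this, here is a minimal Python sketch for checking whether a query response contains any extracted text. It assumes the XML response format shown above; the helper names (field_text, has_extracted_content) and the SAMPLE string are illustrative only and are not part of Solr.

```python
import xml.etree.ElementTree as ET

# Illustrative response, trimmed from the query result above.
SAMPLE = """<response>
<result name="response" numFound="1" start="0">
<doc>
<arr name="attr_character_count"><str>24</str></arr>
<arr name="attr_content"><str></str></arr>
</doc>
</result>
</response>"""


def field_text(doc, name):
    """Return the stripped text of the field called `name`, or None if absent.

    Handles both single-valued (<str name="...">) and multi-valued
    (<arr name="..."><str>...</str></arr>) fields by joining all nested text.
    """
    for child in doc:
        if child.get("name") == name:
            return "".join(child.itertext()).strip()
    return None


def has_extracted_content(response_xml, field="attr_content"):
    """True if the first returned doc has non-blank text in `field`."""
    root = ET.fromstring(response_xml)
    doc = root.find(".//doc")
    if doc is None:
        return False
    return bool(field_text(doc, field))


print(has_extracted_content(SAMPLE))  # False: attr_content is present but empty
```

The check treats a missing field and a blank field the same way, which matches the symptom here: the field exists but holds an empty string.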


Brett.

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Sunday, November 25, 2012 9:27 PM
To: solr-user@lucene.apache.org
Subject: Re: Problem with Solr 3.6.1 extracting ODT content using SolrCell's 
ExtractingRequestHandler

Did you commit after you added the document but before you tried the search?

Best
Erick


On Fri, Nov 23, 2012 at 6:25 PM, Brett Melbourne < 
bmelbou...@halogensoftware.com> wrote:

> Hi all,
>
> I am encountering a problem where Solr 3.6.1 is not able to extract 
> the text content from ODT (Open Office Document) files submitted to 
> the ExtractingRequestHandler. I can reproduce this issue against the 
> example schema running with jetty.
>
> Executing a simple index request (based on the example in the wiki):
> curl "http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&commit=true" -F "myfile=@testfile.odt"
> returns no errors, and does not generate any exceptions in the log/console.
>
> A query for doc1 returns an empty attr_content field:
> <arr name="attr_content"> <str></str> </arr>
>
> Oddly enough, executing an "extractOnly=true" request against the 
> ExtractingRequestHandler with the same ODT file correctly returns the 
> text of the file.
>
> I am wondering:
>
> * Is this a known issue? (I couldn't find any mention of this
>   particular issue anywhere...)
>
> * Are there any workarounds or does anyone have any suggestions?
>
> Thanks,
>
> Brett.
>
>
