Re: Tika analyzers

Erick Erickson Wed, 30 Jul 2014 08:09:19 -0700

Hmmm, might a custom update processor do that? In an update
processor, you'd get the binary and be able to do anything at all
you wanted with it. I'm not quite clear on how the binary
gets through the Tika bits and gets passed in in the first place,
though....
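
A minimal sketch of the update-processor idea, using only the Java
standard library so it runs on its own. Everything named here is
hypothetical (BinaryFieldProcessorSketch, TextExtractor, the
"binary_text" field); none of it is a Solr or Tika class. In a real
setup the same logic would presumably live in the processAdd method of
an UpdateRequestProcessor subclass, with Tika doing the extraction,
but that wiring is an assumption and is not shown.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Stdlib-only model of "extract text from a binary field before indexing".
// Hypothetical names throughout; not part of Solr, Lucene, or Tika.
final class BinaryFieldProcessorSketch {

    // Stand-in for a real text extractor (for example, one backed by Tika).
    interface TextExtractor {
        String extract(byte[] content);
    }

    private final String binaryField;
    private final String textField;
    private final TextExtractor extractor;

    BinaryFieldProcessorSketch(String binaryField, String textField, TextExtractor extractor) {
        this.binaryField = binaryField;
        this.textField = textField;
        this.extractor = extractor;
    }

    // Returns a copy of the document with the extracted text added.
    // Documents without a byte[] value in the binary field pass through unchanged.
    Map<String, Object> process(Map<String, Object> doc) {
        Map<String, Object> out = new LinkedHashMap<>(doc);
        Object value = out.get(binaryField);
        if (value instanceof byte[]) {
            out.put(textField, extractor.extract((byte[]) value));
        }
        return out;
    }

    public static void main(String[] args) {
        // Fake extractor: treats the bytes as UTF-8 text instead of parsing a real file format.
        TextExtractor fake = bytes -> new String(bytes, StandardCharsets.UTF_8);
        BinaryFieldProcessorSketch processor =
                new BinaryFieldProcessorSketch("binary", "binary_text", fake);

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", "a nice doc");
        doc.put("name", "blabla.doc");
        doc.put("binary", "text hidden in the binary".getBytes(StandardCharsets.UTF_8));

        Map<String, Object> indexed = processor.process(doc);
        System.out.println(indexed.get("binary_text"));
    }
}
```

The extracted text goes into a separate field, so the existing
analyzers can handle it like any other text field and no Tika-aware
analyzer is needed.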


Best,
Erick


On Wed, Jul 30, 2014 at 6:00 AM, Tommaso Teofili <tommaso.teof...@gmail.com>
wrote:

> Hi all,
>
> while SolrCell works nicely when in need of indexing binary documents, I am
> wondering about the possibility of having Lucene / Solr documents that have
> binaries in specific Lucene fields, e.g. title="a nice doc",
> name="blabla.doc", binary="0x1234...".
>
> In that case the "binary" field should have an indexing analyzer which can
> extract the text from the binary and index it.
>
> Would it make sense to create a Tika based analyzer for that purpose?
>
> Regards,
> Tommaso
>
