Hmmm, might a custom update processor do that? In an update processor, you'd get the binary and could do anything you wanted with it. I'm not quite clear on how the binary gets past the Tika bits and gets passed in in the first place, though....
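
To make the idea concrete, here is a rough sketch in plain Java with no Solr or Tika dependency. Everything in it is hypothetical: the class name, the "binary_text" target field, and the map-based document are stand-ins for illustration only. In Solr itself the extension point would be an UpdateRequestProcessor (created by an UpdateRequestProcessorFactory), and the extractor function is where a Tika call would presumably go.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a custom update processor. This is NOT Solr's API;
// a document is modelled as a simple field-name -> value map.
final class BinaryFieldProcessorSketch {
    private final String binaryField;                 // field holding the raw bytes
    private final String textField;                   // assumed target field for extracted text
    private final Function<byte[], String> extractor; // where a Tika call would plug in

    BinaryFieldProcessorSketch(String binaryField, String textField,
                               Function<byte[], String> extractor) {
        this.binaryField = binaryField;
        this.textField = textField;
        this.extractor = extractor;
    }

    // Returns a copy of the document with the extracted text added.
    // Documents without a byte[] value in the binary field pass through unchanged.
    Map<String, Object> process(Map<String, Object> doc) {
        Map<String, Object> out = new LinkedHashMap<>(doc);
        Object raw = doc.get(binaryField);
        if (raw instanceof byte[]) {
            out.put(textField, extractor.apply((byte[]) raw));
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy extractor that treats the bytes as UTF-8 text; a real one would parse the format.
        BinaryFieldProcessorSketch processor = new BinaryFieldProcessorSketch(
                "binary", "binary_text", b -> new String(b, StandardCharsets.UTF_8));

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", "a nice doc");
        doc.put("name", "blabla.doc");
        doc.put("binary", "some extracted words".getBytes(StandardCharsets.UTF_8));

        System.out.println(processor.process(doc).get("binary_text"));
    }
}
```

The point of doing this at update time rather than in an analyzer is that the extracted text ends up as an ordinary field, so it can go through whatever analysis chain the schema already defines for text.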

Best,
Erick

On Wed, Jul 30, 2014 at 6:00 AM, Tommaso Teofili <tommaso.teof...@gmail.com> wrote:
> Hi all,
>
> while SolrCell works nicely when in need of indexing binary documents, I am
> wondering about the possibility of having Lucene / Solr documents that have
> binaries in specific Lucene fields, e.g. title="a nice doc",
> name="blabla.doc", binary="0x1234...".
>
> In that case the "binary" field should have an indexing analyzer which can
> extract the text from the binary and index it.
>
> Would it make sense to create a Tika based analyzer for that purpose?
>
> Regards,
> Tommaso
>