You may need to extract the text outside Solr and index plain text through an
external ingestion process. That gives you much more control over Tika's
attributes and behavior.
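
As a rough sketch of that approach (not a definitive implementation), the
script below sends a file to a standalone Tika Server and posts the returned
plain text to Solr's JSON update handler. The host names, ports, and the
collection name "mycollection" are assumptions for illustration; the "content"
field is the one mentioned in the question. Text inside images will only come
back if OCR is available to Tika, which relies on a separately installed
Tesseract.

```python
import json
import sys
import urllib.request

# Assumed local endpoints; adjust both to your own setup.
TIKA_URL = "http://localhost:9998/tika"  # Tika Server, default port
# "mycollection" is a hypothetical collection name.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update?commit=true"


def extract_text(path, tika_url=TIKA_URL):
    """Send a file to Tika Server and return the extracted plain text."""
    with open(path, "rb") as fh:
        data = fh.read()
    req = urllib.request.Request(
        tika_url,
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},  # ask Tika for plain text only
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8", errors="replace")


def build_solr_doc(doc_id, text):
    """Build the document to index; 'content' is the field from the thread."""
    return {"id": doc_id, "content": text.strip()}


def index_doc(doc, solr_url=SOLR_UPDATE_URL):
    """Post one document to Solr's JSON update handler; return the HTTP status."""
    body = json.dumps([doc]).encode("utf-8")
    req = urllib.request.Request(
        solr_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Only runs when a file path is passed on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    file_path = sys.argv[1]
    index_doc(build_solr_doc(file_path, extract_text(file_path)))
```

Keeping extraction in a separate step like this means the Tika configuration
(OCR, inline image handling, timeouts) can be changed and tested without
touching Solr.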

Rahul Singh

Anant Corporation

On Apr 9, 2018, 10:23 PM -0400, Zheng Lin Edwin Yeo wrote:
> Hi,
> Currently I am facing an issue whereby the text in image files such as jpg
> and bmp is not being extracted and indexed. After indexing, Tika did
> extract all the metadata and index it under the attr_* fields.
> However, the content field is always empty for image files. For other types
> of document files like .doc, the content is extracted correctly.
> I have already updated tika-parsers-1.17.jar, setting extractInlineImages
> to true under \prg\apache\tika\parser\pdf\.
> What could be the reason?
> I have just upgraded to Solr 7.3.0.
> Regards,
> Edwin
