On 6/26/2018 7:13 AM, neotorand wrote:
> Don't you think the method below is very expensive?
>
> autoParser.parse(input, textHandler, metadata, context);
>
> If the document is big, it will need enough memory to hold the whole
> document (i.e. the ContentHandler).
> Is there any other alternative?

I did find this:

https://stackoverflow.com/questions/25043720/using-poi-or-tika-to-extract-text-stream-to-stream-without-loading-the-entire-f

But I have no actual experience with Tika.  If you want to get a definitive answer, you will need to go to a Tika support resource.  Although Solr does incorporate Tika, we are not experts in its use.
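
For what it's worth, here is a rough sketch of the general idea behind that link, offered as an illustration and not as tested Tika advice. The parse() call takes a SAX ContentHandler, so a handler that forwards each chunk of text to a Writer as it arrives should avoid accumulating the whole extracted text in memory. The class name StreamingTextHandler is made up for this example, and the underlying parser may still buffer internally for some file formats. I believe Tika's own BodyContentHandler can also be constructed around a Writer, but please confirm that against the Tika documentation.

```java
import java.io.IOException;
import java.io.Writer;

import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: passes extracted text straight through to a Writer
// instead of collecting it in an in-memory buffer.
class StreamingTextHandler extends DefaultHandler {
    private final Writer out;

    StreamingTextHandler(Writer out) {
        this.out = out;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        try {
            // Write each chunk as soon as the parser reports it.
            out.write(ch, start, length);
        } catch (IOException e) {
            throw new SAXException("Could not write extracted text", e);
        }
    }

    @Override
    public void endDocument() throws SAXException {
        try {
            // Flush, but leave closing the Writer to the caller.
            out.flush();
        } catch (IOException e) {
            throw new SAXException("Could not flush extracted text", e);
        }
    }
}

// Assumed usage, mirroring the call from the question (untested with Tika):
//   Writer writer = Files.newBufferedWriter(Paths.get("out.txt"));
//   autoParser.parse(input, new StreamingTextHandler(writer), metadata, context);
```

With something like this, the memory needed for the extracted text would depend on the chunk size the parser emits, not on the total size of the document.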

Thanks,
Shawn
