Hello,

I am using the latest version of Tika (with Tesseract).

Some of the words extracted from an image embedded in a Microsoft doc come out misspelt.

What is the best way to handle this?

Can I extend Tika to read from, say, a cache of key-value pairs and use it
to correct Tika's output?
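
To make the idea concrete, here is a rough sketch of the kind of correction step I have in mind. It is plain Java and does not call any Tika API; it assumes the extracted text is already available as a String. The class name OcrCorrector, the in-memory Map standing in for the cache, and the sample corrections are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces known OCR misreadings with their corrections.
public class OcrCorrector {

    // A "word" here is any run of Unicode letters; everything else is copied as-is.
    private static final Pattern WORD = Pattern.compile("\\p{L}+");

    public static String correct(String text, Map<String, String> corrections) {
        Matcher m = WORD.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the punctuation/whitespace between the previous word and this one.
            out.append(text, last, m.start());
            String word = m.group();
            // Exact, case-sensitive lookup; unknown words pass through unchanged.
            out.append(corrections.getOrDefault(word, word));
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the key-value cache (assumed sample data).
        Map<String, String> corrections = new LinkedHashMap<>();
        corrections.put("lnvoice", "Invoice");
        corrections.put("arnount", "amount");

        System.out.println(correct("lnvoice arnount: 100", corrections));
    }
}
```

Only whole words are replaced, so a key never rewrites part of a longer word. What I do not know is where a step like this should be hooked into Tika, if at all.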

Please suggest.

Thanks
Naga
