(+) [email protected]

On Mon, Aug 21, 2017 at 4:14 PM, Naga Vijay <[email protected]> wrote:

> Hello,
>
> I am using latest version of Tika (with Tesseract).
>
> Some of the words in embedded image in a Microsoft doc are mis-spelt in
> the Tika output.
>
> What is the best way to handle this?
>
> Can I extend Tika to read from a cache having key-value pairs to correct
> the output of Tika?
>
> Please suggest.
>
> Thanks
> Naga
>
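
One option that does not depend on Tika internals is to post-process the extracted text once Tika has returned it, using a key-value table of known OCR misreadings. The sketch below is only an illustration of that idea: the `CORRECTIONS` table and the `correct_text` helper are hypothetical names, the sample entries are made up, and it assumes the keys are plain words and that matching should be case-sensitive.

```python
import re

# Hypothetical correction table: OCR misreading -> intended word.
# In practice this could be loaded from a file or a cache.
CORRECTIONS = {
    "lnvoice": "Invoice",
    "tbe": "the",
    "arnount": "amount",
}


def correct_text(text, corrections):
    """Replace whole-word occurrences of known misreadings in `text`."""
    if not corrections:
        return text
    # Build one alternation of all keys; \b restricts matches to whole words,
    # so a key such as "tbe" does not rewrite part of a longer word.
    alternation = "|".join(re.escape(key) for key in corrections)
    pattern = re.compile(r"\b(?:" + alternation + r")\b")
    return pattern.sub(lambda match: corrections[match.group(0)], text)


if __name__ == "__main__":
    # Stand-in for text extracted by Tika from the embedded image.
    extracted = "Pay tbe lnvoice arnount"
    print(correct_text(extracted, CORRECTIONS))
```

A lookup table like this only fixes misreadings that have already been seen, so it works best for a small, recurring vocabulary.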
