Ewan Mellor created TIKA-2581:
---------------------------------
Summary: testOCROutputsHOCR fails with Tesseract 4.0
Key: TIKA-2581
URL: https://issues.apache.org/jira/browse/TIKA-2581
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.17
Reporter: Ewan Mellor
TesseractOCRParserTest.testOCROutputsHOCR fails when run against Tesseract 4.0.
With 3.x, the hOCR output contains `<span>Happy</span>`, but with 4.0 the word is
wrapped in an additional `<strong>` element: `<span><strong>Happy</strong></span>`.
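
One possible way to make the check tolerant of both outputs is sketched below. This is only an illustration, not the fix applied in Tika: the class `HocrWordMatcher` and the method `containsWord` are hypothetical names, and accepting `<em>` in addition to `<strong>` is an assumption that goes beyond what this report shows.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika or of TesseractOCRParserTest.
class HocrWordMatcher {

    // Returns true if the hOCR fragment contains the word inside a <span>,
    // optionally wrapped in inline formatting tags such as <strong>.
    // Accepting <em> as well is an assumption, not something the report states.
    static boolean containsWord(String hocr, String word) {
        String inlineOpen = "(?:<(?:strong|em)>)*";
        String inlineClose = "(?:</(?:strong|em)>)*";
        Pattern p = Pattern.compile(
                "<span[^>]*>" + inlineOpen + Pattern.quote(word) + inlineClose + "</span>");
        return p.matcher(hocr).find();
    }

    public static void main(String[] args) {
        // Output form reported for Tesseract 3.x
        System.out.println(containsWord("<span>Happy</span>", "Happy"));
        // Output form reported for Tesseract 4.0
        System.out.println(containsWord("<span><strong>Happy</strong></span>", "Happy"));
    }
}
```

A test written this way would pass against either Tesseract version, at the cost of no longer asserting the exact markup.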