Matthew Caruana Galizia created TIKA-2174:
---------------------------------------------
Summary: JP2 and JPX (JPEG 2000) support not declared by
TesseractOCRParser
Key: TIKA-2174
URL: https://issues.apache.org/jira/browse/TIKA-2174
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.14
Reporter: Matthew Caruana Galizia
Tesseract produces OCR output fine for JPX images as of this version:
{noformat}
$ tesseract -v
tesseract 3.04.01
leptonica-1.73
libjpeg 8d : libpng 1.6.26 : libtiff 4.0.6 : zlib 1.2.5
{noformat}
However, these types are not declared by getSupportedTypes, so no output is
produced for PDFs that contain JPX images of scanned documents, for example.
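
To illustrate the kind of change being asked for, here is a minimal, dependency-free sketch of a supported-type set that also declares JPEG 2000. It is not the actual TesseractOCRParser code: the class name Jp2SupportSketch, the helper isSupported, and the baseline list of image types are all hypothetical. In Tika itself the method is getSupportedTypes(ParseContext) and returns a Set<MediaType>; plain MIME strings are used here only so the example runs without Tika on the classpath.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a parser's supported-type declaration.
final class Jp2SupportSketch {

    private static final Set<String> SUPPORTED_TYPES;

    static {
        Set<String> types = new LinkedHashSet<>();
        // Illustrative baseline only; not copied from Tika 1.14.
        types.add("image/png");
        types.add("image/jpeg");
        types.add("image/tiff");
        // The additions this issue asks for: JP2 and JPX (JPEG 2000).
        types.add("image/jp2");
        types.add("image/jpx");
        SUPPORTED_TYPES = Collections.unmodifiableSet(types);
    }

    // Mirrors the role of getSupportedTypes, minus the ParseContext argument.
    static Set<String> getSupportedTypes() {
        return SUPPORTED_TYPES;
    }

    // Case-insensitive lookup of a MIME type in the declared set.
    static boolean isSupported(String mimeType) {
        return mimeType != null
                && SUPPORTED_TYPES.contains(mimeType.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String type : new String[] {"image/jp2", "image/jpx", "image/gif"}) {
            System.out.println(type + " supported: " + isSupported(type));
        }
    }
}
```

The point of the sketch is that an image embedded in a PDF is only handed to the OCR parser if its type appears in this set, so a type that Tesseract can read but that is not declared yields no OCR output.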