Matthew Caruana Galizia created TIKA-2174:
---------------------------------------------

             Summary: JP2 and JPX (JPEG 2000) support not declared by TesseractOCRParser
                 Key: TIKA-2174
                 URL: https://issues.apache.org/jira/browse/TIKA-2174
             Project: Tika
          Issue Type: Bug
          Components: parser
    Affects Versions: 1.14
            Reporter: Matthew Caruana Galizia


Tesseract produces OCR output fine for JPX images as of this version:

{noformat}
  $ tesseract -v
     tesseract 3.04.01
       leptonica-1.73
         libjpeg 8d : libpng 1.6.26 : libtiff 4.0.6 : zlib 1.2.5
{noformat}

However, these types are not declared by getSupportedTypes, so no OCR output is 
produced for, for example, PDFs that contain JPX images of scanned documents.
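
To make the gap concrete, here is a minimal, self-contained sketch of what declaring the two JPEG 2000 types might look like. It is not Tika's actual code: the class OcrTypeSketch, the DECLARED_BEFORE list, and the supports helper are hypothetical stand-ins, and plain strings are used instead of Tika's MediaType so the example runs without any dependency. The only facts assumed are the registered media types image/jp2 and image/jpx.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a parser's supported-type declaration (not Tika's class).
class OcrTypeSketch {

    // Illustrative assumption: image types the parser already declares.
    private static final Set<String> DECLARED_BEFORE =
            new LinkedHashSet<>(Arrays.asList("image/png", "image/jpeg", "image/tiff"));

    // The JPEG 2000 types this issue asks to have declared.
    private static final Set<String> JPEG_2000 =
            new LinkedHashSet<>(Arrays.asList("image/jp2", "image/jpx"));

    // Returns the union of the previously declared types and the JPEG 2000 types.
    static Set<String> getSupportedTypes() {
        Set<String> all = new LinkedHashSet<>(DECLARED_BEFORE);
        all.addAll(JPEG_2000);
        return Collections.unmodifiableSet(all);
    }

    // A caller would skip OCR for any type that is not declared.
    static boolean supports(String mediaType) {
        return getSupportedTypes().contains(mediaType.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String type : new String[] {"image/jpx", "image/jp2", "application/pdf"}) {
            System.out.println(type + " -> " + supports(type));
        }
    }
}
```

In the sketch, an embedded image whose type is not in the declared set is never handed to OCR, which mirrors the behaviour described above for JPX images inside PDFs.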



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
