[ https://issues.apache.org/jira/browse/TIKA-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16195586#comment-16195586 ]
Hudson commented on TIKA-2473:
------------------------------

SUCCESS: Integrated in Jenkins build Tika-trunk #1375 (See [https://builds.apache.org/job/Tika-trunk/1375/])
TIKA-2473 PCX and DCX mime magic and detection unit tests (nick: [https://github.com/apache/tika/commit/450ab4bee5b91663c8d524ad1f6357147c6cd40f])
* (edit) tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
* (edit) tika-core/src/test/java/org/apache/tika/TikaDetectionTest.java
* (edit) tika-parsers/src/test/java/org/apache/tika/mime/TestMimeTypes.java

> PCX and DCX image support
> -------------------------
>
>                 Key: TIKA-2473
>                 URL: https://issues.apache.org/jira/browse/TIKA-2473
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.16
>            Reporter: Matthew Caruana Galizia
>
> It's straightforward in theory to implement support for PCX and DCX. There's
> support for it in Commons Imaging as well as in ImageIO via TwelveMonkeys.
> In practice, however, I'm not really sure how to implement support. We obviously
> want to OCR the images, but Tesseract has no support for the format. So where
> do we do the conversion to a BufferedImage? I tried to look for what is done
> to handle JBIG2 files but I can't find that anywhere.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)