[ https://issues.apache.org/jira/browse/TIKA-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16635726#comment-16635726 ]
Hudson commented on TIKA-2473:
------------------------------

FAILURE: Integrated in Jenkins build tika-branch-1x #108 (See [https://builds.apache.org/job/tika-branch-1x/108/])
TIKA-2473 - Replace com.sun.xml.bind:jaxb-impl and jaxb-core with (tallison: [https://github.com/apache/tika/commit/033758a5ab709401cdaf615477bde53ff729ed66])
* (edit) tika-langdetect/pom.xml
* (edit) CHANGES.txt
* (edit) tika-parent/pom.xml
* (edit) tika-parsers/pom.xml
* (edit) LICENSE.txt

> PCX and DCX image support
> -------------------------
>
>                 Key: TIKA-2473
>                 URL: https://issues.apache.org/jira/browse/TIKA-2473
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.16
>            Reporter: Matthew Caruana Galizia
>            Priority: Major
>
> It's straightforward in theory to implement support for PCX and DCX. There's
> support for it in Commons Imaging as well as in ImageIO via TwelveMonkeys.
> In practice, however, I'm not really sure how to implement support. We obviously
> want to OCR the images, but Tesseract has no support for the format. So where
> do we do the conversion to a BufferedImage? I tried to look for what is done
> to handle JBIG2 files but I can't find that anywhere.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
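
The conversion step the description asks about could be sketched roughly as follows. This is only an illustration, not how Tika handles it: the class name ImageToPng and the method convertToPng are hypothetical, and the sketch assumes that an ImageIO reader plugin for PCX/DCX (for example the TwelveMonkeys one mentioned in the description) is on the classpath. It decodes the file to a BufferedImage through ImageIO and re-encodes it as PNG, a format Tesseract can read.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of Tika.
final class ImageToPng {

    private ImageToPng() {
    }

    /**
     * Decodes any format that a registered ImageIO reader understands and
     * re-encodes it as PNG. Returns false when no reader claims the input
     * or no writer accepts the decoded image.
     */
    static boolean convertToPng(File source, File target) throws IOException {
        // ImageIO.read returns null when no registered ImageReader can decode the file.
        BufferedImage image = ImageIO.read(source);
        if (image == null) {
            return false;
        }
        // ImageIO.write returns false when no appropriate writer is found.
        return ImageIO.write(image, "png", target);
    }
}
```

Without a PCX reader on the classpath, convertToPng simply returns false for a .pcx file, so a caller could fall back to skipping OCR for that file. Where in the parser chain such a conversion belongs is still the open question of this issue.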