[ https://issues.apache.org/jira/browse/TIKA-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-2235.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.15
                   2.0

Thank you! Btw...did you notice TIKA-2232 via [~pascal.essiembre]? Make sure to add jpx dependencies to classpath...if you aren't already. :)

> Use Tesseract's recommended DPI for PDF images
> ----------------------------------------------
>
>                 Key: TIKA-2235
>                 URL: https://issues.apache.org/jira/browse/TIKA-2235
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.14
>            Reporter: Matthew Caruana Galizia
>            Priority: Minor
>              Labels: ocr, pdf
>             Fix For: 2.0, 1.15
>
>
> From the [Tesseract wiki|https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality]:
> {quote}
> Tesseract works best on images which have a DPI of at least 300 dpi....
> {quote}
> PDFParserConfig is currently initialised with a value of 200 for ocrDPI.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
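
For a sense of what the 200 vs. 300 ocrDPI difference in the issue description means in practice, here is a rough, standalone sketch of the arithmetic. It is not Tika code and does not call any Tika or PDFBox API. The class name OcrDpiSketch, the helper pixelsAt and the US Letter page size are made up for illustration. The only assumption is that the page is measured in PDF points at the default 72 points per inch.

```java
public class OcrDpiSketch {
    // Default PDF user space: 72 units ("points") per inch.
    static final double POINTS_PER_INCH = 72.0;

    // Hypothetical helper: pixel length of a page side rendered at the given DPI.
    static long pixelsAt(double points, int dpi) {
        return Math.round(points * dpi / POINTS_PER_INCH);
    }

    public static void main(String[] args) {
        // Illustrative page size: US Letter, 8.5 x 11 inches = 612 x 792 points.
        double widthPt = 612.0;
        double heightPt = 792.0;

        // 200 is the ocrDPI value the issue reports; 300 is the Tesseract wiki minimum.
        for (int dpi : new int[] {200, 300}) {
            System.out.printf("%d dpi -> %d x %d px%n",
                    dpi, pixelsAt(widthPt, dpi), pixelsAt(heightPt, dpi));
        }
    }
}
```

Going from 200 to 300 DPI makes each side 1.5 times longer, so the image handed to OCR has 2.25 times as many pixels. That is more detail for Tesseract to work with, at the cost of larger intermediate images.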