-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22402/
-----------------------------------------------------------

Review request for tika and Chris Mattmann.


Repository: tika


Description
-------

Integrating Tesseract OCR with Tika through a new Parser. See TIKA-93.


Diffs
-----

  trunk/tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRConfig.java PRE-CREATION
  trunk/tika-parsers/src/main/java/org/apache/tika/parser/ocr/TesseractOCRParser.java PRE-CREATION
  trunk/tika-parsers/src/main/resources/META-INF/services/org.apache.tika.parser.Parser 1601508
  trunk/tika-parsers/src/test/java/org/apache/tika/parser/ocr/TesseractOCRTest.java PRE-CREATION
  trunk/tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java 1601508
  trunk/tika-server/src/test/java/org/apache/tika/server/TikaMimeTypesTest.java 1601508

Diff: https://reviews.apache.org/r/22402/diff/


Testing
-------

Extracting the text from an embedded image in a DOCX, PPTX, and PDF.


Thanks,

Tyler Palsulich
