This is an automated email from the ASF dual-hosted git repository.
tallison pushed a change to branch TIKA-4259
in repository https://gitbox.apache.org/repos/asf/tika.git
from 97a0fb3ff TIKA-4259 -- refactor xml parser convenience methods out of ParseContext
add 7a03331f8 TIKA-4256 -- allow inlining of ocr'd content in the RecursiveParserWrapper (#1762)
add 019041117 TIKA-4257 -- lower dbf priority (#1773)
add 84dc15e7d TIKA-4166: add / move comment, update jetty, aws
add 6dfb4ad1f Merge remote-tracking branch 'origin/main' into TIKA-4259
No new revisions were added by this update.
Summary of changes:
.../ParentContentHandler.java} | 22 ++---
.../apache/tika/parser/RecursiveParserWrapper.java | 94 +++++++++++++++-------
tika-parent/pom.xml | 7 +-
.../apache/tika/parser/ocr/TesseractOCRConfig.java | 10 +++
.../apache/tika/parser/ocr/TesseractOCRParser.java | 36 ++++++++-
.../tika/parser/ocr/TesseractOCRParserTest.java | 18 +++++
6 files changed, 143 insertions(+), 44 deletions(-)
copy tika-core/src/main/java/org/apache/tika/{sax/ContentHandlerFactory.java => extractor/ParentContentHandler.java} (65%)