This is an automated email from the ASF dual-hosted git repository.

tallison pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/tika.git.

    from 8187f21  TIKA-2792 -- revert mp4parser dependency
     new 398bcd8  TIKA-2798 -- improve reporting for attachment diffs
     new d838bd7  TIKA-2791 -- add tags/structure to tika-eval

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 .../sax/AbstractRecursiveParserWrapperHandler.java |   4 +
 .../tika/sax/RecursiveParserWrapperHandler.java    |   1 +
 tika-eval/pom.xml                                  |   6 +-
 .../org/apache/tika/eval/AbstractProfiler.java     | 142 +++++++++++++++++----
 .../java/org/apache/tika/eval/ExtractComparer.java |  30 ++++-
 .../java/org/apache/tika/eval/ExtractProfiler.java |  27 +++-
 .../tika/eval/batch/ExtractComparerBuilder.java    |   2 +
 .../tika/eval/batch/ExtractProfilerBuilder.java    |   1 +
 .../main/java/org/apache/tika/eval/db/Cols.java    |  22 +++-
 .../org/apache/tika/eval/io/ExtractReader.java     |  71 +++++++----
 .../apache/tika/eval/util/ContentTagParser.java    |  89 +++++++++++++
 .../org/apache/tika/eval/util/ContentTags.java     |  63 +++++++++
 .../src/main/resources/comparison-reports.xml      |  40 +++++-
 .../org/apache/tika/eval/SimpleComparerTest.java   | 126 +++++++++++++-----
 .../resources/test-dirs/extractsA/file15_tags.json |  41 ++++++
 .../test-dirs/extractsA/file16_badTags.json        |  41 ++++++
 .../test-dirs/extractsA/file17_tagsOutOfOrder.json |  41 ++++++
 .../resources/test-dirs/extractsB/file15_tags.html |  31 +++++
 .../test-dirs/extractsB/file16_badTags.html        |  31 +++++
 19 files changed, 710 insertions(+), 99 deletions(-)
 create mode 100644 tika-eval/src/main/java/org/apache/tika/eval/util/ContentTagParser.java
 create mode 100644 tika-eval/src/main/java/org/apache/tika/eval/util/ContentTags.java
 create mode 100644 tika-eval/src/test/resources/test-dirs/extractsA/file15_tags.json
 create mode 100644 tika-eval/src/test/resources/test-dirs/extractsA/file16_badTags.json
 create mode 100644 tika-eval/src/test/resources/test-dirs/extractsA/file17_tagsOutOfOrder.json
 create mode 100644 tika-eval/src/test/resources/test-dirs/extractsB/file15_tags.html
 create mode 100644 tika-eval/src/test/resources/test-dirs/extractsB/file16_badTags.html