This is an automated email from the ASF dual-hosted git repository.

tallison pushed a change to branch branch_3x
in repository https://gitbox.apache.org/repos/asf/tika.git


    from 7d48f34719 TIKA-4488: update logback
     new 82f26b63dc TIKA-4646 -- extract hyperlinks from instrText and other areas in ooxml(#2578)
     new 65bf98d3c5 TIKA-4646 -- extract hyperlinks from instrText and other areas in ooxml(#2578) fix merge conflicts

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../main/java/org/apache/tika/metadata/Office.java |  51 +++
 .../org/apache/tika/parser/AutoDetectParser.java   |   8 +-
 .../microsoft/ooxml/AbstractOOXMLExtractor.java    |  26 ++
 .../microsoft/ooxml/FieldHyperlinkTracker.java     | 168 +++++++++
 .../microsoft/ooxml/OOXMLTikaBodyPartHandler.java  |  25 ++
 .../ooxml/OOXMLWordAndPowerPointTextHandler.java   | 187 +++++++++-
 .../ooxml/SXWPFWordExtractorDecorator.java         | 179 +++++++++-
 .../ooxml/XSSFExcelExtractorDecorator.java         | 390 +++++++++++++++++++++
 .../ooxml/XWPFWordExtractorDecorator.java          |  95 ++++-
 .../xslf/XSLFEventBasedPowerPointExtractor.java    |   5 +
 .../ooxml/xwpf/XWPFEventBasedWordExtractor.java    |   5 +
 .../tika/parser/microsoft/ExcelParserTest.java     |  43 +++
 .../parser/microsoft/ooxml/OOXMLParserTest.java    |  39 +++
 .../parser/microsoft/ooxml/SXWPFExtractorTest.java | 109 ++++++
 .../parser/microsoft/pst/OutlookPSTParserTest.java |   3 +
 .../test-documents/testAttachedTemplate.docx       | Bin 0 -> 2284 bytes
 .../test-documents/testDataConnections.xlsx        | Bin 0 -> 2967 bytes
 .../test/resources/test-documents/testDdeLink.xlsx | Bin 0 -> 3030 bytes
 .../resources/test-documents/testExternalRefs.docx | Bin 0 -> 2125 bytes
 .../resources/test-documents/testFrameset.docx     | Bin 0 -> 2328 bytes
 .../resources/test-documents/testHoverAndVml.docx  | Bin 0 -> 2270 bytes
 .../resources/test-documents/testInstrLink.docx    | Bin 0 -> 14464 bytes
 .../resources/test-documents/testMailMerge.docx    | Bin 0 -> 2306 bytes
 .../resources/test-documents/testSubdocument.docx  | Bin 0 -> 1980 bytes
 24 files changed, 1323 insertions(+), 10 deletions(-)
 create mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/FieldHyperlinkTracker.java
 create mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/test/resources/test-documents/testAttachedTemplate.docx
 create mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/test/resources/test-documents/testDataConnections.xlsx
 create mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/test/resources/test-documents/testDdeLink.xlsx
 create mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/test/resources/test-documents/testExternalRefs.docx
 create mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/test/resources/test-documents/testFrameset.docx
 create mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/test/resources/test-documents/testHoverAndVml.docx
 create mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/test/resources/test-documents/testInstrLink.docx
 create mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/test/resources/test-documents/testMailMerge.docx
 create mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/test/resources/test-documents/testSubdocument.docx
