[tika] branch TIKA-1599 updated (60479ddd4 -> 6ba636b57)
This is an automated email from the ASF dual-hosted git repository.

tallison pushed a change to branch TIKA-1599
in repository https://gitbox.apache.org/repos/asf/tika.git

    from 60479ddd4 TIKA-1599 -- migrate to jsoup parser -- checkstyle
     add 6ba636b57 TIKA-1599 -- migrate to jsoup parser -- fix bad auto replace all

No new revisions were added by this update.

Summary of changes:
 .../apache/tika/config/TIKA-2273-exclude-encoding-detector-default.xml | 2 +-
 .../org/apache/tika/config/TIKA-2485-encoding-detector-mark-limits.xml | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
[tika] branch TIKA-1599 updated (a47e37ced -> 60479ddd4)
This is an automated email from the ASF dual-hosted git repository.

tallison pushed a change to branch TIKA-1599
in repository https://gitbox.apache.org/repos/asf/tika.git

    from a47e37ced TIKA-1599 -- migrate to jsoup parser -- fix EncodingDetector and fix or disable unit tests
     add 60479ddd4 TIKA-1599 -- migrate to jsoup parser -- checkstyle

No new revisions were added by this update.

Summary of changes:
 .../src/test/java/org/apache/tika/parser/html/HtmlParserTest.java | 1 -
 1 file changed, 1 deletion(-)
[tika] branch TIKA-1599 updated (f05e9b45e -> a47e37ced)
This is an automated email from the ASF dual-hosted git repository.

tallison pushed a change to branch TIKA-1599
in repository https://gitbox.apache.org/repos/asf/tika.git

    from f05e9b45e TIKA-1599 -- migrate to jsoup parser -- mv tagsoup htmlparser to tika-parsrs-extended
     add a47e37ced TIKA-1599 -- migrate to jsoup parser -- fix EncodingDetector and fix or disable unit tests

No new revisions were added by this update.

Summary of changes:
 .../services/org.apache.tika.detect.EncodingDetector | 2 +-
 .../java/org/apache/tika/parser/html/HtmlParserTest.java | 13 +++--
 2 files changed, 8 insertions(+), 7 deletions(-)
[tika] branch TIKA-1599 updated (1d4e6ebb6 -> f05e9b45e)
This is an automated email from the ASF dual-hosted git repository. tallison pushed a change to branch TIKA-1599 in repository https://gitbox.apache.org/repos/asf/tika.git from 1d4e6ebb6 TIKA-1599 -- migrate to jsoup parser -- remove runtime exception add f05e9b45e TIKA-1599 -- migrate to jsoup parser -- mv tagsoup htmlparser to tika-parsrs-extended No new revisions were added by this update. Summary of changes: pom.xml| 2 + tika-bom/pom.xml | 11 ++- tika-parent/pom.xml| 5 + .../tika-parser-tagsoup-module/pom.xml | 34 +++ .../tika/parser/html/tagsoup}/DataURIScheme.java | 2 +- .../html/tagsoup}/DataURISchemeParseException.java | 2 +- .../parser/html/tagsoup}/DataURISchemeUtil.java| 2 +- .../parser/html/tagsoup}/DefaultHtmlMapper.java| 2 +- .../parser/html/tagsoup}/HtmlEncodingDetector.java | 2 +- .../tika/parser/html/tagsoup}/HtmlHandler.java | 2 +- .../tika/parser/html/tagsoup}/HtmlMapper.java | 2 +- .../tika/parser/html/tagsoup}/HtmlParser.java | 2 +- .../parser/html/tagsoup}/IdentityHtmlMapper.java | 2 +- .../html/tagsoup}/XHTMLDowngradeHandler.java | 2 +- .../tagsoup}/charsetdetector/CharsetAliases.java | 6 +- .../charsetdetector/CharsetDetectionResult.java| 2 +- .../tagsoup}/charsetdetector/MetaProcessor.java| 6 +- .../html/tagsoup}/charsetdetector/PreScanner.java | 2 +- .../StandardHtmlEncodingDetector.java | 6 +- .../charsets/ReplacementCharset.java | 2 +- .../charsets/XUserDefinedCharset.java | 2 +- .../org.apache.tika.detect.EncodingDetector| 2 +- .../services/org.apache.tika.parser.Parser | 2 +- .../StandardCharsets_unsupported_by_IANA.txt | 0 .../html/tagsoup}/DataURISchemeParserTest.java | 3 +- .../html/tagsoup}/HtmlEncodingDetectorTest.java| 3 +- .../tika/parser/html/tagsoup}/HtmlParserTest.java | 5 +- .../tika/parser/html/tagsoup}/SrcDocTest.java | 2 +- .../tagsoup}/StandardHtmlEncodingDetectorTest.java | 6 +- .../org/apache/tika/parser/html/tika-config.xml| 4 +- .../resources/test-documents/big-preamble.html | 0 
.../test-documents/boilerplate-whitespace.html | 0 .../test/resources/test-documents/boilerplate.html | 0 .../testBoilerplateMissingSpace.html | 0 .../test/resources/test-documents/testHTML.html| 0 .../test-documents/testHTMLBadScript.html | 0 .../test-documents/testHTMLGoodScript.html | 0 .../testHTMLNoisyMetaEncoding_1.html | 0 .../testHTMLNoisyMetaEncoding_2.html | 0 .../testHTMLNoisyMetaEncoding_3.html | 0 .../testHTMLNoisyMetaEncoding_4.html | 0 .../test-documents/testHTML_charset_utf16le.html | Bin .../test-documents/testHTML_charset_utf8.html | 0 .../testHTML_embedded_data_uri_js.html | 0 .../test-documents/testHTML_embedded_img.html | 0 .../testHTML_embedded_img_in_js.html | 0 .../resources/test-documents/testHTML_head.html| 0 .../test-documents/testHTML_metadata.html | 0 .../testHTML_metadata_two_titles.html | 0 .../resources/test-documents/testHTML_utf8.html| 0 .../test/resources/test-documents/testSrcDoc.html | 0 .../test-documents/testUserDefinedCharset.mhtml| 0 .../test/resources/test-documents/testXHTML.html | 0 .../src/test/resources/test-documents/tika434.html | 0 .../pom.xml| 46 ++--- .../tika-parser-html-module/pom.xml| 6 -- .../org.apache.tika.detect.EncodingDetector| 2 +- .../apache/tika/parser/html/HtmlParserTest.java| 107 +++-- ...TIKA-2273-exclude-encoding-detector-default.xml | 2 +- .../TIKA-2485-encoding-detector-mark-limits.xml| 2 +- 60 files changed, 138 insertions(+), 152 deletions(-) create mode 100644 tika-parsers/tika-parsers-extended/tika-parser-tagsoup-module/pom.xml copy tika-parsers/{tika-parsers-standard/tika-parsers-standard-modules/tika-parser-html-module/src/main/java/org/apache/tika/parser/html => tika-parsers-extended/tika-parser-tagsoup-module/src/main/java/org/apache/tika/parser/html/tagsoup}/DataURIScheme.java (98%) copy tika-parsers/{tika-parsers-standard/tika-parsers-standard-modules/tika-parser-html-module/src/main/java/org/apache/tika/parser/html => 
tika-parsers-extended/tika-parser-tagsoup-module/src/main/java/org/apache/tika/parser/html/tagsoup}/DataURISchemeParseException.java (95%) copy tika-parsers/{tika-parsers-standard/tika-parsers-standard-modules/tika-parser-html-module/src/main/java/org/apache/tika/parser/html =>
[tika] branch TIKA-1599 updated (d1bc68eb8 -> 1d4e6ebb6)
This is an automated email from the ASF dual-hosted git repository.

tallison pushed a change to branch TIKA-1599
in repository https://gitbox.apache.org/repos/asf/tika.git

    from d1bc68eb8 TIKA-1599 -- migrate to jsoup parser -- checkstyle fix
     add e04c47820 TIKA-4138 -- move BoilerpipeContentHandler (#1355)
     add d1a5fbc32 Merge remote-tracking branch 'origin/main' into TIKA-1599
     add 1d4e6ebb6 TIKA-1599 -- migrate to jsoup parser -- remove runtime exception

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt | 5 ++
 pom.xml | 1 +
 tika-app/pom.xml | 2 +-
 tika-bom/pom.xml | 2 +-
 tika-bundles/tika-bundle-standard/pom.xml | 2 +-
 tika-handlers/README.md | 2 +
 .../tika-emitter-jdbc => tika-handlers}/pom.xml | 24 ---
 .../tika-handler-boilerpipe}/pom.xml | 21 +++---
 .../sax/boilerpipe/BoilerpipeContentHandler.java | 0
 .../tika-parsers-standard-modules/pom.xml | 1 -
 .../tika-parser-html-commons/README.md | 22 ---
 .../tika-parser-html-commons/pom.xml | 74 --
 .../org/apache/tika/parser/html/JSoupParser.java | 2 +-
 .../tika-parsers-standard-package/pom.xml | 2 +-
 tika-server/tika-server-core/pom.xml | 2 +-
 tika-server/tika-server-standard/pom.xml | 6 +-
 16 files changed, 44 insertions(+), 124 deletions(-)
 create mode 100644 tika-handlers/README.md
 copy {tika-pipes/tika-emitters/tika-emitter-jdbc => tika-handlers}/pom.xml (70%)
 copy {tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-jdbc-commons => tika-handlers/tika-handler-boilerpipe}/pom.xml (66%)
 rename {tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-html-commons => tika-handlers/tika-handler-boilerpipe}/src/main/java/org/apache/tika/sax/boilerpipe/BoilerpipeContentHandler.java (100%)
 delete mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-html-commons/README.md
 delete mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-html-commons/pom.xml
[tika] branch TIKA-4138 deleted (was cbc46ee9b)
This is an automated email from the ASF dual-hosted git repository.

tallison pushed a change to branch TIKA-4138
in repository https://gitbox.apache.org/repos/asf/tika.git

     was cbc46ee9b TIKA-4138 -- move BoilerpipeContentHandler

The revisions that were on this branch are still contained in
other references; therefore, this change does not discard any commits
from the repository.
[tika] branch main updated (6871c9157 -> e04c47820)
This is an automated email from the ASF dual-hosted git repository.

tallison pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tika.git

    from 6871c9157 TIKA-4137 -- add a jdk21 build workflow
     add e04c47820 TIKA-4138 -- move BoilerpipeContentHandler (#1355)

No new revisions were added by this update.

Summary of changes:
 CHANGES.txt | 5 ++
 pom.xml | 1 +
 tika-app/pom.xml | 2 +-
 tika-bom/pom.xml | 2 +-
 tika-bundles/tika-bundle-standard/pom.xml | 2 +-
 tika-handlers/README.md | 2 +
 .../tika-emitter-jdbc => tika-handlers}/pom.xml | 24 ---
 .../tika-handler-boilerpipe}/pom.xml | 21 +++---
 .../sax/boilerpipe/BoilerpipeContentHandler.java | 0
 .../tika-parsers-standard-modules/pom.xml | 1 -
 .../tika-parser-html-commons/README.md | 22 ---
 .../tika-parser-html-commons/pom.xml | 74 --
 .../tika-parsers-standard-package/pom.xml | 2 +-
 tika-server/tika-server-core/pom.xml | 2 +-
 tika-server/tika-server-standard/pom.xml | 6 +-
 15 files changed, 43 insertions(+), 123 deletions(-)
 create mode 100644 tika-handlers/README.md
 copy {tika-pipes/tika-emitters/tika-emitter-jdbc => tika-handlers}/pom.xml (70%)
 copy {tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-jdbc-commons => tika-handlers/tika-handler-boilerpipe}/pom.xml (66%)
 rename {tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-html-commons => tika-handlers/tika-handler-boilerpipe}/src/main/java/org/apache/tika/sax/boilerpipe/BoilerpipeContentHandler.java (100%)
 delete mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-html-commons/README.md
 delete mode 100644 tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-html-commons/pom.xml
[tika] branch TIKA-1599 updated (b8d4e6d66 -> d1bc68eb8)
This is an automated email from the ASF dual-hosted git repository.

tallison pushed a change to branch TIKA-1599
in repository https://gitbox.apache.org/repos/asf/tika.git

    from b8d4e6d66 TIKA-1599 -- migrate to jsoup parser
     add d1bc68eb8 TIKA-1599 -- migrate to jsoup parser -- checkstyle fix

No new revisions were added by this update.

Summary of changes:
 .../src/main/java/org/apache/tika/example/TIAParsingExample.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[tika] 01/01: TIKA-1599 -- migrate to jsoup parser
This is an automated email from the ASF dual-hosted git repository. tallison pushed a commit to branch TIKA-1599 in repository https://gitbox.apache.org/repos/asf/tika.git commit b8d4e6d6670485bbb762c5b1e4fe9641cea94f25 Author: tallison AuthorDate: Fri Sep 22 12:23:24 2023 -0400 TIKA-1599 -- migrate to jsoup parser --- .../test/java/org/apache/tika/cli/TikaCLITest.java | 4 +- .../src/test/resources/test-data/tika-config1.xml | 2 +- .../org/apache/tika/example/TIAParsingExample.java | 6 +- .../src/test/resources/2.4.0-no-tesseract.txt | 8 +- .../src/test/resources/2.4.0-tesseract.txt | 8 +- .../src/test/resources/2.4.1-no-tesseract.txt | 8 +- .../src/test/resources/2.4.1-tesseract.txt | 8 +- .../tika-parser-html-module/pom.xml| 5 + .../org/apache/tika/parser/html/JSoupParser.java | 243 + .../services/org.apache.tika.parser.Parser | 2 +- .../org/apache/tika/parser/html/tika-config.xml| 4 +- .../tika/parser/mail/MailContentHandler.java | 4 +- .../tika/parser/microsoft/JackcessExtractor.java | 6 +- .../tika/parser/microsoft/OutlookExtractor.java| 6 +- .../tika/parser/microsoft/chm/ChmParser.java | 6 +- .../tika/parser/microsoft/rtf/RTFParserTest.java | 2 +- .../org/apache/tika/sax/BoilerpipeHandlerTest.java | 21 +- 17 files changed, 300 insertions(+), 43 deletions(-) diff --git a/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java b/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java index e6c5c2296..b8795225b 100644 --- a/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java +++ b/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java @@ -272,7 +272,7 @@ public class TikaCLITest { assertTrue(json.contains( "\"X-TIKA:Parsed-By\" : [ \"org.apache.tika.parser.DefaultParser\", " + -"\"org.apache.tika.parser.html.HtmlParser\" ],")); +"\"org.apache.tika.parser.html.JSoupParser\" ],")); //test legacy alphabetic sort of keys int enc = json.indexOf("\"Content-Encoding\""); int fb = json.indexOf("fb:admins"); @@ -467,7 +467,7 @@ public class 
TikaCLITest { getParamOutContent("--config=" + TEST_DATA_FILE.toString() + "/tika-config1.xml", resourcePrefix + "bad_xml.xml"); assertTrue(content.contains("apple")); -assertTrue(content.contains("org.apache.tika.parser.html.HtmlParser")); + assertTrue(content.contains("org.apache.tika.parser.html.JSoupParser")); } @Test diff --git a/tika-app/src/test/resources/test-data/tika-config1.xml b/tika-app/src/test/resources/test-data/tika-config1.xml index ff03407bc..52f4f0949 100644 --- a/tika-app/src/test/resources/test-data/tika-config1.xml +++ b/tika-app/src/test/resources/test-data/tika-config1.xml @@ -1,7 +1,7 @@ - + application/vnd.wap.xhtml+xml application/x-asp application/xhtml+xml diff --git a/tika-example/src/main/java/org/apache/tika/example/TIAParsingExample.java b/tika-example/src/main/java/org/apache/tika/example/TIAParsingExample.java index 5a9ee5dc5..748f83fae 100755 --- a/tika-example/src/main/java/org/apache/tika/example/TIAParsingExample.java +++ b/tika-example/src/main/java/org/apache/tika/example/TIAParsingExample.java @@ -47,7 +47,7 @@ import org.apache.tika.parser.ParseContext; import org.apache.tika.parser.Parser; import org.apache.tika.parser.ParserDecorator; import org.apache.tika.parser.html.HtmlMapper; -import org.apache.tika.parser.html.HtmlParser; +import org.apache.tika.parser.html.JSoupParser; import org.apache.tika.parser.html.IdentityHtmlMapper; import org.apache.tika.parser.txt.TXTParser; import org.apache.tika.parser.xml.XMLParser; @@ -117,7 +117,7 @@ public class TIAParsingExample { ContentHandler handler = new DefaultHandler(); Metadata metadata = new Metadata(); ParseContext context = new ParseContext(); -Parser parser = new HtmlParser(); +Parser parser = new JSoupParser(); parser.parse(stream, handler, metadata, context); } @@ -126,7 +126,7 @@ public class TIAParsingExample { ContentHandler handler = new DefaultHandler(); ParseContext context = new ParseContext(); Map parsersByType = new HashMap<>(); 
-parsersByType.put(MediaType.parse("text/html"), new HtmlParser()); +parsersByType.put(MediaType.parse("text/html"), new JSoupParser()); parsersByType.put(MediaType.parse("application/xml"), new XMLParser()); CompositeParser parser = new CompositeParser(); diff --git a/tika-parsers/tika-parsers-extended/tika-parser-scientific-package/src/test/resources/2.4.0-no-tesseract.txt b/tika-parsers/tika-parsers-extended/tika-parser-scientific-package/src/test/resources/2.4.0-no-tesseract.txt index a929ec74d..ca772e598 100644 ---
[tika] branch TIKA-1599 created (now b8d4e6d66)
This is an automated email from the ASF dual-hosted git repository.

tallison pushed a change to branch TIKA-1599
in repository https://gitbox.apache.org/repos/asf/tika.git

      at b8d4e6d66 TIKA-1599 -- migrate to jsoup parser

This branch includes the following new commits:

     new b8d4e6d66 TIKA-1599 -- migrate to jsoup parser

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
[tika] branch TIKA-4138 created (now cbc46ee9b)
This is an automated email from the ASF dual-hosted git repository.

tallison pushed a change to branch TIKA-4138
in repository https://gitbox.apache.org/repos/asf/tika.git

      at cbc46ee9b TIKA-4138 -- move BoilerpipeContentHandler

This branch includes the following new commits:

     new cbc46ee9b TIKA-4138 -- move BoilerpipeContentHandler

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
[tika] 01/01: TIKA-4138 -- move BoilerpipeContentHandler
This is an automated email from the ASF dual-hosted git repository. tallison pushed a commit to branch TIKA-4138 in repository https://gitbox.apache.org/repos/asf/tika.git commit cbc46ee9b5295bf14541da8d1f016261c5e30196 Author: tallison AuthorDate: Fri Sep 22 10:31:47 2023 -0400 TIKA-4138 -- move BoilerpipeContentHandler --- CHANGES.txt| 5 ++ pom.xml| 1 + tika-app/pom.xml | 2 +- tika-bom/pom.xml | 2 +- tika-bundles/tika-bundle-standard/pom.xml | 2 +- tika-handlers/README.md| 2 + tika-handlers/pom.xml | 48 ++ .../tika-handler-boilerpipe/pom.xml| 26 ++-- .../sax/boilerpipe/BoilerpipeContentHandler.java | 0 .../tika-parsers-standard-modules/pom.xml | 1 - .../tika-parser-html-commons/pom.xml | 74 -- .../tika-parsers-standard-package/pom.xml | 2 +- tika-server/tika-server-core/pom.xml | 2 +- tika-server/tika-server-standard/pom.xml | 6 +- 14 files changed, 86 insertions(+), 87 deletions(-) diff --git a/CHANGES.txt b/CHANGES.txt index 30c137609..408e42676 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,7 +1,12 @@ Release 3.0.0-BETA - ?? + BREAKING CHANGES + * Require Java 11 (TIKA-4128). + * The boilerpipe handler has been moved to tika-handler-boiler-pipe + + Other Changes/Updates * Fix bug in DateUtils that stripped timezone information from incoming Calendar objects (TIKA-4126). 
diff --git a/pom.xml b/pom.xml index ab6b22afa..31f025576 100644 --- a/pom.xml +++ b/pom.xml @@ -54,6 +54,7 @@ tika-example tika-java7 tika-detectors +tika-handlers diff --git a/tika-app/pom.xml b/tika-app/pom.xml index 9a48d2ea9..68ac79477 100644 --- a/tika-app/pom.xml +++ b/tika-app/pom.xml @@ -45,7 +45,7 @@ ${project.groupId} - tika-parser-html-commons + tika-handler-boilerpipe ${project.version} diff --git a/tika-bom/pom.xml b/tika-bom/pom.xml index ba2e19d73..5e1aca01e 100644 --- a/tika-bom/pom.xml +++ b/tika-bom/pom.xml @@ -222,7 +222,7 @@ org.apache.tika -tika-parser-html-commons +tika-handler-boilerpipe 3.0.0-SNAPSHOT diff --git a/tika-bundles/tika-bundle-standard/pom.xml b/tika-bundles/tika-bundle-standard/pom.xml index db605c044..1e18b1cb0 100644 --- a/tika-bundles/tika-bundle-standard/pom.xml +++ b/tika-bundles/tika-bundle-standard/pom.xml @@ -58,7 +58,7 @@ ${project.groupId} - tika-parser-html-commons + tika-handler-boilerpipe ${project.version} diff --git a/tika-handlers/README.md b/tika-handlers/README.md new file mode 100644 index 0..bb45651b3 --- /dev/null +++ b/tika-handlers/README.md @@ -0,0 +1,2 @@ +This package is intended to hold non-standard handlers. 
These may have dependencies that some don't want, +or they may have a focus that isn't general enough to warrant adding them to tika-core \ No newline at end of file diff --git a/tika-handlers/pom.xml b/tika-handlers/pom.xml new file mode 100644 index 0..fcab3eb20 --- /dev/null +++ b/tika-handlers/pom.xml @@ -0,0 +1,48 @@ + + +http://maven.apache.org/POM/4.0.0; + xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; + xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd;> + 4.0.0 + +org.apache.tika +tika-parent +3.0.0-SNAPSHOT +../tika-parent/pom.xml + + + tika-handlers + + Apache Tika handlers + pom + + +tika-handler-boilerpipe + + + + + ${project.groupId} + tika-core + ${project.version} + provided + + + \ No newline at end of file diff --git a/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-html-commons/README.md b/tika-handlers/tika-handler-boilerpipe/pom.xml similarity index 51% rename from tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-html-commons/README.md rename to tika-handlers/tika-handler-boilerpipe/pom.xml index 82fb00a47..05d0b69b3 100644 --- a/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-html-commons/README.md +++ b/tika-handlers/tika-handler-boilerpipe/pom.xml @@ -1,4 +1,5 @@ - -This module only contains the BoilerPipeContentHandler. The boilerpipe dependency is no -longer maintained and contains clashes with NekoHTML. +http://maven.apache.org/POM/4.0.0; + xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; + xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd;> + 4.0.0 + +org.apache.tika +tika-handlers +3.0.0-SNAPSHOT +../pom.xml + -In Tika 3.x, we
[tika] branch main updated: TIKA-4137 -- add a jdk21 build workflow
This is an automated email from the ASF dual-hosted git repository.

tallison pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tika.git

The following commit(s) were added to refs/heads/main by this push:
     new 6871c9157 TIKA-4137 -- add a jdk21 build workflow
6871c9157 is described below

commit 6871c9157ed58fe1a0249bbdf44ef76116dba767
Author: tallison
AuthorDate: Fri Sep 22 09:33:17 2023 -0400

    TIKA-4137 -- add a jdk21 build workflow
---
 .github/workflows/main-jdk21-build.yml | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/.github/workflows/main-jdk21-build.yml b/.github/workflows/main-jdk21-build.yml
new file mode 100644
index 0..946cbf0f9
--- /dev/null
+++ b/.github/workflows/main-jdk21-build.yml
@@ -0,0 +1,38 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+name: main jdk21 build
+
+on:
+  push:
+    branches: [ main ]
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        java: [ '21' ]
+
+    steps:
+      - uses: actions/checkout@v2
+      - name: Set up JDK ${{ matrix.java }}
+        uses: actions/setup-java@v1
+        with:
+          java-version: ${{ matrix.java }}
+      - name: Build with Maven
+        run: mvn clean test install javadoc:aggregate
[tika] branch main updated: Tika 4137 (#1353)
This is an automated email from the ASF dual-hosted git repository.

tallison pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tika.git

The following commit(s) were added to refs/heads/main by this push:
     new 72a81a16e Tika 4137 (#1353)
72a81a16e is described below

commit 72a81a16e39848dc15202f7e6f8d23661264dc13
Author: Thorsten Heit
AuthorDate: Fri Sep 22 15:29:05 2023 +0200

    Tika 4137 (#1353)

    * TIKA-4137 -- Building current Tika main branch fails under Java 20/21

    Authored-by: thorsten
---
 .../src/main/java/org/apache/tika/server/core/resource/TikaResource.java | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tika-server/tika-server-core/src/main/java/org/apache/tika/server/core/resource/TikaResource.java b/tika-server/tika-server-core/src/main/java/org/apache/tika/server/core/resource/TikaResource.java
index aadf86f30..2913e740b 100644
--- a/tika-server/tika-server-core/src/main/java/org/apache/tika/server/core/resource/TikaResource.java
+++ b/tika-server/tika-server-core/src/main/java/org/apache/tika/server/core/resource/TikaResource.java
@@ -676,6 +676,7 @@ public class TikaResource {
             handler.getTransformer().setOutputProperty(OutputKeys.METHOD, format);
             handler.getTransformer().setOutputProperty(OutputKeys.INDENT, "yes");
             handler.getTransformer().setOutputProperty(OutputKeys.ENCODING, UTF_8.name());
+            handler.getTransformer().setOutputProperty(OutputKeys.VERSION, "1.1");
             handler.setResult(new StreamResult(writer));
             content = new ExpandedTitleContentHandler(handler);
         } catch (TransformerConfigurationException e) {
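The one-line change in the TikaResource diff sets the XML version output property on the transformer behind a SAX handler. The following is a minimal, JDK-only sketch of that pattern, not Tika's actual code: the class name `XmlVersionSketch`, the helper names `newHandler` and `serialize`, and the fixed "xml" output method are all assumptions made for illustration.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

import static java.nio.charset.StandardCharsets.UTF_8;

// Hypothetical illustration class; not part of Tika.
public class XmlVersionSketch {

    // Builds a TransformerHandler configured with the same four output
    // properties that appear in the diff (METHOD, INDENT, ENCODING, VERSION).
    static TransformerHandler newHandler(String format, StringWriter writer)
            throws TransformerConfigurationException {
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer().setOutputProperty(OutputKeys.METHOD, format);
        handler.getTransformer().setOutputProperty(OutputKeys.INDENT, "yes");
        handler.getTransformer().setOutputProperty(OutputKeys.ENCODING, UTF_8.name());
        handler.getTransformer().setOutputProperty(OutputKeys.VERSION, "1.1");
        handler.setResult(new StreamResult(writer));
        return handler;
    }

    // Serializes a single <p> element containing the given text and
    // returns whatever the configured serializer wrote.
    static String serialize(String text) {
        StringWriter writer = new StringWriter();
        try {
            TransformerHandler handler = newHandler("xml", writer);
            handler.startDocument();
            handler.startElement("", "p", "p", new AttributesImpl());
            char[] chars = text.toCharArray();
            handler.characters(chars, 0, chars.length);
            handler.endElement("", "p", "p");
            handler.endDocument();
        } catch (TransformerConfigurationException | SAXException e) {
            // Wrapped so callers of this sketch need no checked exceptions.
            throw new IllegalStateException(e);
        }
        return writer.toString();
    }

    public static void main(String[] args) {
        System.out.println(serialize("hello"));
    }
}
```

Running `main` prints the serialized document. The exact declaration and whitespace depend on the JDK's built-in serializer, so the sketch makes no claim about the precise output.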
[tika] branch branch_2x updated: TIKA-4123: update netty, aws
This is an automated email from the ASF dual-hosted git repository. tilman pushed a commit to branch branch_2x in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/branch_2x by this push: new 1996d73ab TIKA-4123: update netty, aws 1996d73ab is described below commit 1996d73aba38232828e30031419c3389af79c592 Author: Tilman Hausherr AuthorDate: Fri Sep 22 08:13:42 2023 +0200 TIKA-4123: update netty, aws --- tika-parent/pom.xml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tika-parent/pom.xml b/tika-parent/pom.xml index 50bf4c243..81bbc9595 100644 --- a/tika-parent/pom.xml +++ b/tika-parent/pom.xml @@ -306,7 +306,7 @@ 2.27.0 -1.12.554 +1.12.555 9.5 1.1.0 @@ -402,7 +402,7 @@ 6.1.11 1.5.5-5 3.5.1 -4.1.97.Final +4.1.98.Final