This is an automated email from the ASF dual-hosted git repository.
rzo1 pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-stormcrawler.git
The following commit(s) were added to refs/heads/main by this push:
new 5d33ebb0 Upgrade to Storm 2.6.2, fix #1188 (#1189)
5d33ebb0 is described below
commit 5d33ebb0c8f71d0ab6e1295dba00a146932a11ff
Author: Julien Nioche <[email protected]>
AuthorDate: Mon Apr 15 09:08:11 2024 +0100
Upgrade to Storm 2.6.2, fix #1188 (#1189)
* Upgrade to Storm 2.6.2, fix #1188
Signed-off-by: Julien Nioche <[email protected]>
* Adding back hadoop client API as a dependency
Signed-off-by: Julien Nioche <[email protected]>
---------
Signed-off-by: Julien Nioche <[email protected]>
---
README.md | 2 +-
THIRD-PARTY.txt | 121 ++-------------------
.../main/resources/archetype-resources/README.md | 2 +-
.../src/main/resources/archetype-resources/pom.xml | 2 +-
.../src/main/resources/archetype-resources/pom.xml | 2 +-
external/warc/pom.xml | 7 ++
pom.xml | 2 +-
7 files changed, 22 insertions(+), 116 deletions(-)
diff --git a/README.md b/README.md
index 9c8aef40..cbe828b9 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,7 @@ StormCrawler is an open source collection of resources for building low-latency,
## Quickstart
-NOTE: These instructions assume that you have [Apache Maven](https://maven.apache.org/install.html) installed. You will need to install [Apache Storm 2.6.1](http://storm.apache.org/) to run the crawler.
+NOTE: These instructions assume that you have [Apache Maven](https://maven.apache.org/install.html) installed. You will need to install [Apache Storm 2.6.2](http://storm.apache.org/) to run the crawler.
StormCrawler requires Java 11 or above.
diff --git a/THIRD-PARTY.txt b/THIRD-PARTY.txt
index 551ebdce..9362d1bc 100644
--- a/THIRD-PARTY.txt
+++ b/THIRD-PARTY.txt
@@ -16,42 +16,28 @@ List of third-party dependencies grouped by their license type.
* aggs-matrix-stats (org.opensearch.plugin:aggs-matrix-stats-client:2.12.0 - https://github.com/opensearch-project/OpenSearch.git)
* Apache Avro (org.apache.avro:avro:1.11.3 - https://avro.apache.org)
- * Apache Commons BeanUtils (commons-beanutils:commons-beanutils:1.9.4 - https://commons.apache.org/proper/commons-beanutils/)
* Apache Commons CLI (commons-cli:commons-cli:1.6.0 - https://commons.apache.org/proper/commons-cli/)
* Apache Commons Codec (commons-codec:commons-codec:1.11 - http://commons.apache.org/proper/commons-codec/)
* Apache Commons Codec (commons-codec:commons-codec:1.15 - https://commons.apache.org/proper/commons-codec/)
* Apache Commons Codec (commons-codec:commons-codec:1.16.0 - https://commons.apache.org/proper/commons-codec/)
* Apache Commons Collections (commons-collections:commons-collections:3.2.2 - http://commons.apache.org/collections/)
* Apache Commons Collections (org.apache.commons:commons-collections4:4.4 - https://commons.apache.org/proper/commons-collections/)
- * Apache Commons Compress (org.apache.commons:commons-compress:1.21 - https://commons.apache.org/proper/commons-compress/)
+ * Apache Commons Compress (org.apache.commons:commons-compress:1.22 - https://commons.apache.org/proper/commons-compress/)
* Apache Commons Compress (org.apache.commons:commons-compress:1.24.0 - https://commons.apache.org/proper/commons-compress/)
- * Apache Commons Configuration (org.apache.commons:commons-configuration2:2.8.0 - https://commons.apache.org/proper/commons-configuration/)
+ * Apache Commons Configuration (org.apache.commons:commons-configuration2:2.10.1 - https://commons.apache.org/proper/commons-configuration/)
* Apache Commons Crypto (org.apache.commons:commons-crypto:1.1.0 - https://commons.apache.org/proper/commons-crypto/)
* Apache Commons CSV (org.apache.commons:commons-csv:1.10.0 - https://commons.apache.org/proper/commons-csv/)
* Apache Commons Exec (org.apache.commons:commons-exec:1.3 - http://commons.apache.org/proper/commons-exec/)
* Apache Commons IO (commons-io:commons-io:2.11.0 - https://commons.apache.org/proper/commons-io/)
- * Apache Commons Lang (org.apache.commons:commons-lang3:3.12.0 - https://commons.apache.org/proper/commons-lang/)
* Apache Commons Lang (org.apache.commons:commons-lang3:3.13.0 - https://commons.apache.org/proper/commons-lang/)
+ * Apache Commons Lang (org.apache.commons:commons-lang3:3.14.0 - https://commons.apache.org/proper/commons-lang/)
* Apache Commons Logging (commons-logging:commons-logging:1.2 - http://commons.apache.org/proper/commons-logging/)
* Apache Commons Math (org.apache.commons:commons-math3:3.6.1 - http://commons.apache.org/proper/commons-math/)
- * Apache Commons Net (commons-net:commons-net:3.9.0 - https://commons.apache.org/proper/commons-net/)
- * Apache Commons Text (org.apache.commons:commons-text:1.10.0 - https://commons.apache.org/proper/commons-text)
+ * Apache Commons Text (org.apache.commons:commons-text:1.11.0 - https://commons.apache.org/proper/commons-text)
* Apache FontBox (org.apache.pdfbox:fontbox:2.0.29 - http://pdfbox.apache.org/)
- * Apache Hadoop Annotations (org.apache.hadoop:hadoop-annotations:3.3.6 - no url defined)
* Apache Hadoop Auth (org.apache.hadoop:hadoop-auth:3.3.6 - no url defined)
- * Apache Hadoop Client Aggregator (org.apache.hadoop:hadoop-client:3.3.6 - no url defined)
* Apache Hadoop Common (org.apache.hadoop:hadoop-common:3.3.6 - no url defined)
- * Apache Hadoop HDFS (org.apache.hadoop:hadoop-hdfs:3.3.6 - no url defined)
- * Apache Hadoop HDFS Client (org.apache.hadoop:hadoop-hdfs-client:3.3.6 - no url defined)
- * Apache Hadoop MapReduce Common (org.apache.hadoop:hadoop-mapreduce-client-common:3.3.6 - no url defined)
- * Apache Hadoop MapReduce Core (org.apache.hadoop:hadoop-mapreduce-client-core:3.3.6 - no url defined)
- * Apache Hadoop MapReduce JobClient (org.apache.hadoop:hadoop-mapreduce-client-jobclient:3.3.6 - no url defined)
* Apache Hadoop shaded Guava (org.apache.hadoop.thirdparty:hadoop-shaded-guava:1.1.1 - https://www.apache.org/hadoop-thirdparty/hadoop-shaded-guava/)
- * Apache Hadoop shaded Protobuf 3.7 (org.apache.hadoop.thirdparty:hadoop-shaded-protobuf_3_7:1.1.1 - https://www.apache.org/hadoop-thirdparty/hadoop-shaded-protobuf_3_7/)
- * Apache Hadoop YARN API (org.apache.hadoop:hadoop-yarn-api:3.3.6 - no url defined)
- * Apache Hadoop YARN Client (org.apache.hadoop:hadoop-yarn-client:3.3.6 - no url defined)
- * Apache Hadoop YARN Common (org.apache.hadoop:hadoop-yarn-common:3.3.6 - no url defined)
- * Apache HBase - Annotations (org.apache.hbase:hbase-annotations:2.5.6-hadoop3 - https://hbase.apache.org/hbase-annotations)
* Apache HBase - Client (org.apache.hbase:hbase-client:2.5.6-hadoop3 - https://hbase.apache.org/hbase-build-configuration/hbase-client)
* Apache HBase - Common (org.apache.hbase:hbase-common:2.5.6-hadoop3 - https://hbase.apache.org/hbase-build-configuration/hbase-common)
* Apache HBase - Hadoop Compatibility (org.apache.hbase:hbase-hadoop-compat:2.5.6-hadoop3 - https://hbase.apache.org/hbase-build-configuration/hbase-hadoop-compat)
@@ -128,10 +114,8 @@ List of third-party dependencies grouped by their license type.
* Apache Tika XMP commons (org.apache.tika:tika-parser-xmp-commons:2.9.1 - https://tika.apache.org/tika-parser-xmp-commons/)
* Apache Tika ZIP commons (org.apache.tika:tika-parser-zip-commons:2.9.1 - https://tika.apache.org/tika-parser-zip-commons/)
* Apache XmpBox (org.apache.pdfbox:xmpbox:2.0.29 - https://www.apache.org/pdfbox-parent/xmpbox/)
- * Apache Yetus - Audience Annotations (org.apache.yetus:audience-annotations:0.5.0 - https://yetus.apache.org/audience-annotations)
- * Apache ZooKeeper - Jute (org.apache.zookeeper:zookeeper-jute:3.6.3 - http://zookeeper.apache.org/zookeeper-jute)
+ * Apache Yetus - Audience Annotations (org.apache.yetus:audience-annotations:0.13.0 - https://yetus.apache.org/audience-annotations)
* Apache ZooKeeper - Jute (org.apache.zookeeper:zookeeper-jute:3.9.1 - http://zookeeper.apache.org/zookeeper-jute)
- * Apache ZooKeeper - Server (org.apache.zookeeper:zookeeper:3.6.3 - http://zookeeper.apache.org/zookeeper)
* Apache ZooKeeper - Server (org.apache.zookeeper:zookeeper:3.9.1 - http://zookeeper.apache.org/zookeeper)
* AutoService (com.google.auto.service:auto-service-annotations:1.1.1 - https://github.com/google/auto/tree/main/service)
* AWS Java SDK for Amazon CloudSearch (com.amazonaws:aws-java-sdk-cloudsearch:1.12.663 - https://aws.amazon.com/sdkforjava)
@@ -141,15 +125,10 @@ List of third-party dependencies grouped by their license type.
* Byte Buddy (without dependencies) (net.bytebuddy:byte-buddy:1.14.11 - https://bytebuddy.net/byte-buddy)
* Caffeine cache (com.github.ben-manes.caffeine:caffeine:3.1.8 - https://github.com/ben-manes/caffeine)
* com.drewnoakes:metadata-extractor (com.drewnoakes:metadata-extractor:2.18.0 - https://drewnoakes.com/code/exif/)
- * Commons Daemon (commons-daemon:commons-daemon:1.0.13 - http://commons.apache.org/daemon/)
* Commons Lang (commons-lang:commons-lang:2.6 - http://commons.apache.org/lang/)
* Commons Logging (commons-logging:commons-logging:1.1.3 - http://commons.apache.org/proper/commons-logging/)
- * Commons Math (org.apache.commons:commons-math3:3.1.1 - http://commons.apache.org/math/)
* compiler (com.github.spullara.mustache.java:compiler:0.9.10 - http://github.com/spullara/mustache.java)
* Crawler-commons (com.github.crawler-commons:crawler-commons:1.4 - https://github.com/crawler-commons/crawler-commons)
- * Curator Client (org.apache.curator:curator-client:5.2.0 - http://curator.apache.org/curator-client)
- * Curator Framework (org.apache.curator:curator-framework:5.2.0 - http://curator.apache.org/curator-framework)
- * Curator Recipes (org.apache.curator:curator-recipes:5.2.0 - http://curator.apache.org/curator-recipes)
* error-prone annotations (com.google.errorprone:error_prone_annotations:2.14.0 - https://errorprone.info/error_prone_annotations)
* error-prone annotations (com.google.errorprone:error_prone_annotations:2.21.1 - https://errorprone.info/error_prone_annotations)
* Failsafe (dev.failsafe:failsafe:3.3.2 - https://failsafe.dev/failsafe)
@@ -159,7 +138,6 @@ List of third-party dependencies grouped by their license type.
* Gson (com.google.code.gson:gson:2.9.0 - https://github.com/google/gson/gson)
* Guava: Google Core Libraries for Java (com.google.guava:guava:18.0 - http://code.google.com/p/guava-libraries/guava)
* Guava: Google Core Libraries for Java (com.google.guava:guava:31.1-android - https://github.com/google/guava)
- * Guava: Google Core Libraries for Java (com.google.guava:guava:32.1.3-jre - https://github.com/google/guava)
* Guava: Google Core Libraries for Java (com.google.guava:guava:33.0.0-jre - https://github.com/google/guava)
* Guava InternalFutureFailureAccess and InternalFutures (com.google.guava:failureaccess:1.0.1 - https://github.com/google/guava/failureaccess)
* Guava InternalFutureFailureAccess and InternalFutures (com.google.guava:failureaccess:1.0.2 - https://github.com/google/guava/failureaccess)
@@ -184,15 +162,11 @@ List of third-party dependencies grouped by their license type.
* Jackson dataformat: CBOR (com.fasterxml.jackson.dataformat:jackson-dataformat-cbor:2.16.1 - https://github.com/FasterXML/jackson-dataformats-binary)
* Jackson dataformat: Smile (com.fasterxml.jackson.dataformat:jackson-dataformat-smile:2.16.1 - https://github.com/FasterXML/jackson-dataformats-binary)
* Jackson-dataformat-YAML (com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:2.16.1 - https://github.com/FasterXML/jackson-dataformats-text)
- * Jackson-JAXRS-base (com.fasterxml.jackson.jaxrs:jackson-jaxrs-base:2.12.7 - http://github.com/FasterXML/jackson-jaxrs-providers/jackson-jaxrs-base)
- * Jackson-JAXRS-JSON (com.fasterxml.jackson.jaxrs:jackson-jaxrs-json-provider:2.12.7 - http://github.com/FasterXML/jackson-jaxrs-providers/jackson-jaxrs-json-provider)
- * Jackson module: JAXB Annotations (com.fasterxml.jackson.module:jackson-module-jaxb-annotations:2.12.7 - https://github.com/FasterXML/jackson-modules-base)
* java-libpst (com.pff:java-libpst:0.9.3 - https://github.com/rjohnsondev/java-libpst)
* JCIP Annotations under Apache License (com.github.stephenc.jcip:jcip-annotations:1.0-1 - http://stephenc.github.com/jcip-annotations)
* JCL 1.2 implemented over SLF4J (org.slf4j:jcl-over-slf4j:2.0.10 - http://www.slf4j.org)
* JCL 1.2 implemented over SLF4J (org.slf4j:jcl-over-slf4j:2.0.9 - http://www.slf4j.org)
* JetBrains Java Annotations (org.jetbrains:annotations:24.1.0 - https://github.com/JetBrains/java-annotations)
- * Jettison (org.codehaus.jettison:jettison:1.1 - no url defined)
* JMES Path Query library (com.amazonaws:jmespath-java:1.12.663 - https://aws.amazon.com/sdkforjava)
* Joda-Time (joda-time:joda-time:2.12.2 - https://www.joda.org/joda-time/)
* Joda-Time (joda-time:joda-time:2.8.1 - http://www.joda.org/joda-time/)
@@ -203,17 +177,13 @@ List of third-party dependencies grouped by their license type.
* Kerb Simple Kdc (org.apache.kerby:kerb-simplekdc:1.0.1 - http://directory.apache.org/kerby/kerby-kerb/kerb-simplekdc)
* Kerby ASN1 Project (org.apache.kerby:kerby-asn1:1.0.1 - http://directory.apache.org/kerby/kerby-common/kerby-asn1)
* Kerby Config (org.apache.kerby:kerby-config:1.0.1 - http://directory.apache.org/kerby/kerby-common/kerby-config)
- * Kerby-kerb Admin (org.apache.kerby:kerb-admin:1.0.1 - http://directory.apache.org/kerby/kerby-kerb/kerb-admin)
* Kerby-kerb Client (org.apache.kerby:kerb-client:1.0.1 - http://directory.apache.org/kerby/kerby-kerb/kerb-client)
* Kerby-kerb Common (org.apache.kerby:kerb-common:1.0.1 - http://directory.apache.org/kerby/kerby-kerb/kerb-common)
* Kerby-kerb core (org.apache.kerby:kerb-core:1.0.1 - http://directory.apache.org/kerby/kerby-kerb/kerb-core)
* Kerby-kerb Crypto (org.apache.kerby:kerb-crypto:1.0.1 - http://directory.apache.org/kerby/kerby-kerb/kerb-crypto)
- * Kerby-kerb Identity (org.apache.kerby:kerb-identity:1.0.1 - http://directory.apache.org/kerby/kerby-kerb/kerb-identity)
- * Kerby-kerb Server (org.apache.kerby:kerb-server:1.0.1 - http://directory.apache.org/kerby/kerby-kerb/kerb-server)
* Kerby-kerb Util (org.apache.kerby:kerb-util:1.0.1 - http://directory.apache.org/kerby/kerby-kerb/kerb-util)
* Kerby PKIX Project (org.apache.kerby:kerby-pkix:1.0.1 - http://directory.apache.org/kerby/kerby-pkix)
* Kerby Util (org.apache.kerby:kerby-util:1.0.1 - http://directory.apache.org/kerby/kerby-common/kerby-util)
- * Kerby XDR Project (org.apache.kerby:kerby-xdr:1.0.1 - http://directory.apache.org/kerby/kerby-common/kerby-xdr)
* Kotlin Stdlib (org.jetbrains.kotlin:kotlin-stdlib:1.8.21 - https://kotlinlang.org/)
* Kotlin Stdlib Common (org.jetbrains.kotlin:kotlin-stdlib-common:1.9.10 - https://kotlinlang.org/)
* Kotlin Stdlib Jdk7 (org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.8.21 - https://kotlinlang.org/)
@@ -222,50 +192,17 @@ List of third-party dependencies grouped by their license type.
* language-detector (com.optimaize.languagedetector:language-detector:0.6 - https://github.com/optimaize/language-detector)
* mapper-extras (org.opensearch.plugin:mapper-extras-client:2.12.0 - https://github.com/opensearch-project/OpenSearch.git)
* Metrics Core (io.dropwizard.metrics:metrics-core:3.2.6 - http://metrics.dropwizard.io/metrics-core/)
- * Netty/All-in-One (io.netty:netty-all:4.1.89.Final - https://netty.io/netty-all/)
- * Netty/Buffer (io.netty:netty-buffer:4.1.89.Final - https://netty.io/netty-buffer/)
* Netty/Buffer (io.netty:netty-buffer:4.1.94.Final - https://netty.io/netty-buffer/)
- * Netty/Codec/DNS (io.netty:netty-codec-dns:4.1.89.Final - https://netty.io/netty-codec-dns/)
- * Netty/Codec/HAProxy (io.netty:netty-codec-haproxy:4.1.89.Final - https://netty.io/netty-codec-haproxy/)
- * Netty/Codec/HTTP (io.netty:netty-codec-http:4.1.89.Final - https://netty.io/netty-codec-http/)
- * Netty/Codec/HTTP2 (io.netty:netty-codec-http2:4.1.89.Final - https://netty.io/netty-codec-http2/)
- * Netty/Codec/Memcache (io.netty:netty-codec-memcache:4.1.89.Final - https://netty.io/netty-codec-memcache/)
- * Netty/Codec/MQTT (io.netty:netty-codec-mqtt:4.1.89.Final - https://netty.io/netty-codec-mqtt/)
- * Netty/Codec/Redis (io.netty:netty-codec-redis:4.1.89.Final - https://netty.io/netty-codec-redis/)
- * Netty/Codec/SMTP (io.netty:netty-codec-smtp:4.1.89.Final - https://netty.io/netty-codec-smtp/)
- * Netty/Codec/Socks (io.netty:netty-codec-socks:4.1.89.Final - https://netty.io/netty-codec-socks/)
- * Netty/Codec/Stomp (io.netty:netty-codec-stomp:4.1.89.Final - https://netty.io/netty-codec-stomp/)
- * Netty/Codec/XML (io.netty:netty-codec-xml:4.1.89.Final - https://netty.io/netty-codec-xml/)
- * Netty/Codec (io.netty:netty-codec:4.1.89.Final - https://netty.io/netty-codec/)
* Netty/Codec (io.netty:netty-codec:4.1.94.Final - https://netty.io/netty-codec/)
- * Netty/Common (io.netty:netty-common:4.1.89.Final - https://netty.io/netty-common/)
* Netty/Common (io.netty:netty-common:4.1.94.Final - https://netty.io/netty-common/)
- * Netty/Handler/Proxy (io.netty:netty-handler-proxy:4.1.89.Final - https://netty.io/netty-handler-proxy/)
- * Netty/Handler/Ssl/Ocsp (io.netty:netty-handler-ssl-ocsp:4.1.89.Final - https://netty.io/netty-handler-ssl-ocsp/)
- * Netty/Handler (io.netty:netty-handler:4.1.89.Final - https://netty.io/netty-handler/)
* Netty/Handler (io.netty:netty-handler:4.1.94.Final - https://netty.io/netty-handler/)
- * Netty/Resolver/DNS/Classes/MacOS (io.netty:netty-resolver-dns-classes-macos:4.1.89.Final - https://netty.io/netty-resolver-dns-classes-macos/)
- * Netty/Resolver/DNS/Native/MacOS (io.netty:netty-resolver-dns-native-macos:4.1.89.Final - https://netty.io/netty-resolver-dns-native-macos/)
- * Netty/Resolver/DNS (io.netty:netty-resolver-dns:4.1.89.Final - https://netty.io/netty-resolver-dns/)
- * Netty/Resolver (io.netty:netty-resolver:4.1.89.Final - https://netty.io/netty-resolver/)
* Netty/Resolver (io.netty:netty-resolver:4.1.94.Final - https://netty.io/netty-resolver/)
* Netty/TomcatNative [BoringSSL - Static] (io.netty:netty-tcnative-boringssl-static:2.0.61.Final - https://github.com/netty/netty-tcnative/netty-tcnative-boringssl-static/)
* Netty/TomcatNative [OpenSSL - Classes] (io.netty:netty-tcnative-classes:2.0.61.Final - https://github.com/netty/netty-tcnative/netty-tcnative-classes/)
- * Netty/Transport/Classes/Epoll (io.netty:netty-transport-classes-epoll:4.1.89.Final - https://netty.io/netty-transport-classes-epoll/)
* Netty/Transport/Classes/Epoll (io.netty:netty-transport-classes-epoll:4.1.94.Final - https://netty.io/netty-transport-classes-epoll/)
- * Netty/Transport/Classes/KQueue (io.netty:netty-transport-classes-kqueue:4.1.89.Final - https://netty.io/netty-transport-classes-kqueue/)
- * Netty/Transport/Native/Epoll (io.netty:netty-transport-native-epoll:4.1.63.Final - https://netty.io/netty-transport-native-epoll/)
- * Netty/Transport/Native/Epoll (io.netty:netty-transport-native-epoll:4.1.89.Final - https://netty.io/netty-transport-native-epoll/)
* Netty/Transport/Native/Epoll (io.netty:netty-transport-native-epoll:4.1.94.Final - https://netty.io/netty-transport-native-epoll/)
- * Netty/Transport/Native/KQueue (io.netty:netty-transport-native-kqueue:4.1.89.Final - https://netty.io/netty-transport-native-kqueue/)
- * Netty/Transport/Native/Unix/Common (io.netty:netty-transport-native-unix-common:4.1.89.Final - https://netty.io/netty-transport-native-unix-common/)
* Netty/Transport/Native/Unix/Common (io.netty:netty-transport-native-unix-common:4.1.94.Final - https://netty.io/netty-transport-native-unix-common/)
- * Netty/Transport/RXTX (io.netty:netty-transport-rxtx:4.1.89.Final - https://netty.io/netty-transport-rxtx/)
- * Netty/Transport/SCTP (io.netty:netty-transport-sctp:4.1.89.Final - https://netty.io/netty-transport-sctp/)
- * Netty/Transport/UDT (io.netty:netty-transport-udt:4.1.89.Final - https://netty.io/netty-transport-udt/)
- * Netty/Transport (io.netty:netty-transport:4.1.89.Final - https://netty.io/netty-transport/)
* Netty/Transport (io.netty:netty-transport:4.1.94.Final - https://netty.io/netty-transport/)
- * Netty (io.netty:netty:3.10.6.Final - http://netty.io/)
* Nimbus JOSE+JWT (com.nimbusds:nimbus-jose-jwt:9.8.1 - https://bitbucket.org/connect2id/nimbus-jose-jwt)
* Non-Blocking Reactive Foundation for the JVM (io.projectreactor:reactor-core:3.5.14 - https://github.com/reactor/reactor-core)
* Objenesis (org.objenesis:objenesis:3.3 - http://objenesis.org/objenesis)
@@ -314,14 +251,13 @@ List of third-party dependencies grouped by their license type.
* rome (com.rometools:rome:2.1.0 - http://rometools.com/rome)
* rome-utils (com.rometools:rome-utils:2.1.0 - http://rometools.com/rome-utils)
* server (org.opensearch:opensearch:2.12.0 - https://github.com/opensearch-project/OpenSearch.git)
- * Shaded Deps for Storm Client (org.apache.storm:storm-shaded-deps:2.6.1 - https://storm.apache.org/storm-shaded-deps)
+ * Shaded Deps for Storm Client (org.apache.storm:storm-shaded-deps:2.6.2 - https://storm.apache.org/storm-shaded-deps)
* SnakeYAML (org.yaml:snakeyaml:2.2 - https://bitbucket.org/snakeyaml/snakeyaml)
- * snappy-java (org.xerial.snappy:snappy-java:1.1.8.2 - https://github.com/xerial/snappy-java)
* sniffer (org.opensearch.client:opensearch-rest-client-sniffer:2.12.0 - https://github.com/opensearch-project/OpenSearch.git)
* SparseBitSet (com.zaxxer:SparseBitSet:1.2 - https://github.com/brettwooldridge/SparseBitSet)
- * storm-autocreds (org.apache.storm:storm-autocreds:2.6.1 - https://storm.apache.org/external/storm-autocreds)
- * Storm Client (org.apache.storm:storm-client:2.6.1 - https://storm.apache.org/storm-client)
- * storm-hdfs (org.apache.storm:storm-hdfs:2.6.1 - https://storm.apache.org/external/storm-hdfs)
+ * storm-autocreds (org.apache.storm:storm-autocreds:2.6.2 - https://storm.apache.org/external/storm-autocreds)
+ * Storm Client (org.apache.storm:storm-client:2.6.2 - https://storm.apache.org/storm-client)
+ * storm-hdfs (org.apache.storm:storm-hdfs:2.6.2 - https://storm.apache.org/external/storm-hdfs)
* swagger-annotations-jakarta (io.swagger.core.v3:swagger-annotations-jakarta:2.2.17 - https://github.com/swagger-api/swagger-core/modules/swagger-annotations-jakarta)
* TagSoup (org.ccil.cowan.tagsoup:tagsoup:1.2.1 - http://home.ccil.org/~cowan/XML/tagsoup/)
* T-Digest (com.tdunning:t-digest:3.2 - https://github.com/tdunning/t-digest)
@@ -331,22 +267,6 @@ List of third-party dependencies grouped by their license type.
* Xerces2-j (xerces:xercesImpl:2.12.2 - https://xerces.apache.org/xerces2-j/)
* XmlBeans (org.apache.xmlbeans:xmlbeans:5.1.1 - https://xmlbeans.apache.org/)
- Apache License, Version 2.0, Eclipse Public License - Version 1.0
-
- * Jetty :: Asynchronous HTTP Client (org.eclipse.jetty:jetty-client:9.4.51.v20230217 - https://eclipse.org/jetty/jetty-client)
- * Jetty :: Http Utility (org.eclipse.jetty:jetty-http:9.4.51.v20230217 - https://eclipse.org/jetty/jetty-http)
- * Jetty :: IO Utility (org.eclipse.jetty:jetty-io:9.4.51.v20230217 - https://eclipse.org/jetty/jetty-io)
- * Jetty :: Security (org.eclipse.jetty:jetty-security:9.4.51.v20230217 - https://eclipse.org/jetty/jetty-security)
- * Jetty :: Server Core (org.eclipse.jetty:jetty-server:9.4.51.v20230217 - https://eclipse.org/jetty/jetty-server)
- * Jetty :: Servlet Handling (org.eclipse.jetty:jetty-servlet:9.4.51.v20230217 - https://eclipse.org/jetty/jetty-servlet)
- * Jetty :: Utilities :: Ajax(JSON) (org.eclipse.jetty:jetty-util-ajax:9.4.51.v20230217 - https://eclipse.org/jetty/jetty-util-ajax)
- * Jetty :: Utilities (org.eclipse.jetty:jetty-util:9.4.51.v20230217 - https://eclipse.org/jetty/jetty-util)
- * Jetty :: Webapp Application Support (org.eclipse.jetty:jetty-webapp:9.4.51.v20230217 - https://eclipse.org/jetty/jetty-webapp)
- * Jetty :: Websocket :: API (org.eclipse.jetty.websocket:websocket-api:9.4.51.v20230217 - https://eclipse.org/jetty/websocket-parent/websocket-api)
- * Jetty :: Websocket :: Client (org.eclipse.jetty.websocket:websocket-client:9.4.51.v20230217 - https://eclipse.org/jetty/websocket-parent/websocket-client)
- * Jetty :: Websocket :: Common (org.eclipse.jetty.websocket:websocket-common:9.4.51.v20230217 - https://eclipse.org/jetty/websocket-parent/websocket-common)
- * Jetty :: XML utilities (org.eclipse.jetty:jetty-xml:9.4.51.v20230217 - https://eclipse.org/jetty/jetty-xml)
-
Apache License, Version 2.0, Eclipse Public License - Version 2.0
* Jetty :: ALPN :: Client (org.eclipse.jetty:jetty-alpn-client:10.0.19 - https://eclipse.dev/jetty/jetty-alpn-parent/jetty-alpn-client)
@@ -377,14 +297,12 @@ List of third-party dependencies grouped by their license type.
BSD 2-Clause License
- * dnsjava (dnsjava:dnsjava:2.1.7 - http://www.dnsjava.org)
* zstd-jni (com.github.luben:zstd-jni:1.5.5-5 - https://github.com/luben/zstd-jni)
BSD 3-Clause License
* Adobe XMPCore (com.adobe.xmp:xmpcore:6.1.11 - https://www.adobe.com/devnet/xmp/library/eula-xmp-library-java.html)
* asm (org.ow2.asm:asm:9.6 - http://asm.ow2.io/)
- * leveldbjni-all (org.fusesource.leveldbjni:leveldbjni-all:1.8 - http://leveldbjni.fusesource.org/leveldbjni-all)
* Protocol Buffer Java API (com.google.protobuf:protobuf-java:2.5.0 - http://code.google.com/p/protobuf)
* Protocol Buffers [Core] (com.google.protobuf:protobuf-java:3.21.7 - https://developers.google.com/protocol-buffers/protobuf-java/)
* Protocol Buffers [Core] (com.google.protobuf:protobuf-java:3.22.3 - https://developers.google.com/protocol-buffers/protobuf-java/)
@@ -396,10 +314,9 @@ List of third-party dependencies grouped by their license type.
BSD License
* curvesapi (com.github.virtuald:curvesapi:1.07 - https://github.com/virtuald/curvesapi)
- * JLine Bundle (org.jline:jline:3.9.0 - http://nexus.sonatype.org/oss-repository-hosting.html/jline-parent/jline)
* JMatIO (org.tallison:jmatio:1.5 - https://github.com/tballison/jmatio)
* JZlib (com.jcraft:jzlib:1.1.3 - http://www.jcraft.com/jzlib/)
- * Stax2 API (org.codehaus.woodstox:stax2-api:4.2.1 - http://github.com/FasterXML/stax2-api)
+ * Stax2 API (org.codehaus.woodstox:stax2-api:4.2 - http://github.com/FasterXML/stax2-api)
CDDL, v1.0, LGPL, v2.1 or later
@@ -411,21 +328,11 @@ List of third-party dependencies grouped by their license type.
Common Development and Distribution License
- * Expression Language 3.0 (org.glassfish:javax.el:3.0.1-b12 - http://uel.java.net)
- * Java Servlet API (javax.servlet:javax.servlet-api:3.1.0 - http://servlet-spec.java.net)
* javax.annotation API (javax.annotation:javax.annotation-api:1.3.2 - http://jcp.org/en/jsr/detail?id=250)
- * jsr311-api (javax.ws.rs:jsr311-api:1.1.1 - https://jsr311.dev.java.net)
Common Development and Distribution License (CDDL) v1.1, The GNU General Public License (GPL), Version 2, With Classpath Exception
* jaxb-api (javax.xml.bind:jaxb-api:2.3.0 - https://github.com/javaee/jaxb-spec/jaxb-api)
- * JAXB RI (com.sun.xml.bind:jaxb-impl:2.2.3-1 - http://jaxb.java.net/)
- * jersey-client (com.sun.jersey:jersey-client:1.19.4 - https://jersey.java.net/jersey-client/)
- * jersey-core (com.sun.jersey:jersey-core:1.19.4 - https://jersey.java.net/jersey-core/)
- * jersey-json (com.github.pjfanning:jersey-json:1.20 - https://github.com/pjfanning/jersey-1.x)
- * jersey-server (com.sun.jersey:jersey-server:1.19.4 - https://jersey.java.net/jersey-server/)
- * jersey-servlet (com.sun.jersey:jersey-servlet:1.19.4 - https://jersey.java.net/jersey-servlet/)
- * jsp-api (javax.servlet.jsp:jsp-api:2.1 - no url defined)
Eclipse Distribution License, Version 1.0
@@ -472,18 +379,10 @@ List of third-party dependencies grouped by their license type.
* XZ for Java (org.tukaani:xz:1.9 - https://tukaani.org/xz/java.html)
- Revised BSD
-
- * JSch (com.jcraft:jsch:0.1.55 - http://www.jcraft.com/jsch/)
-
Similar to Apache License but with the acknowledgment clause removed
* JDOM (org.jdom:jdom2:2.0.6.1 - http://www.jdom.org)
- The Go license
-
- * re2j (com.google.re2j:re2j:1.1 - http://github.com/google/re2j)
-
Unicode/ICU License
* icu4j (com.ibm.icu:icu4j:74.2 - https://icu.unicode.org/main/icu4j/)
diff --git a/archetype/src/main/resources/archetype-resources/README.md b/archetype/src/main/resources/archetype-resources/README.md
index 1d444792..72a4cfcf 100644
--- a/archetype/src/main/resources/archetype-resources/README.md
+++ b/archetype/src/main/resources/archetype-resources/README.md
@@ -3,7 +3,7 @@ Have a look at the code and resources and modify them to your heart's content.
# Prerequisites
-You need to install Apache Storm. The instructions on [setting up a Storm cluster](https://storm.apache.org/releases/2.6.1/Setting-up-a-Storm-cluster.html) should help. Alternatively,
+You need to install Apache Storm. The instructions on [setting up a Storm cluster](https://storm.apache.org/releases/2.6.2/Setting-up-a-Storm-cluster.html) should help. Alternatively,
the [stormcrawler-docker](https://github.com/DigitalPebble/stormcrawler-docker) project contains resources for running Apache Storm on Docker.
You also need to have an instance of URLFrontier running. See [the URLFrontier README](https://github.com/crawler-commons/url-frontier/tree/master/service); the easiest way is to use Docker, like so:
diff --git a/archetype/src/main/resources/archetype-resources/pom.xml b/archetype/src/main/resources/archetype-resources/pom.xml
index d33d0dd1..d5f83ca2 100644
--- a/archetype/src/main/resources/archetype-resources/pom.xml
+++ b/archetype/src/main/resources/archetype-resources/pom.xml
@@ -32,7 +32,7 @@ under the License.
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<stormcrawler.version>${project.version}</stormcrawler.version>
- <storm.version>2.6.1</storm.version>
+ <storm.version>2.6.2</storm.version>
</properties>
<build>
diff --git a/external/opensearch/archetype/src/main/resources/archetype-resources/pom.xml b/external/opensearch/archetype/src/main/resources/archetype-resources/pom.xml
index dc8f55b0..c10ca14c 100644
--- a/external/opensearch/archetype/src/main/resources/archetype-resources/pom.xml
+++ b/external/opensearch/archetype/src/main/resources/archetype-resources/pom.xml
@@ -34,7 +34,7 @@ under the License.
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<stormcrawler.version>${StormCrawlerVersion}</stormcrawler.version>
- <storm.version>2.6.1</storm.version>
+ <storm.version>2.6.2</storm.version>
</properties>
<build>
diff --git a/external/warc/pom.xml b/external/warc/pom.xml
index 9f641cd4..3022ece4 100644
--- a/external/warc/pom.xml
+++ b/external/warc/pom.xml
@@ -81,6 +81,13 @@ under the License.
</exclusions>
</dependency>
+ <!-- https://github.com/apache/incubator-stormcrawler/pull/1189 -->
+ <dependency>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-client-api</artifactId>
+ <version>3.3.6</version>
+ </dependency>
+
<dependency>
<groupId>org.netpreserve</groupId>
<artifactId>jwarc</artifactId>
diff --git a/pom.xml b/pom.xml
index e9e91cdb..f0f83353 100644
--- a/pom.xml
+++ b/pom.xml
@@ -63,7 +63,7 @@ under the License.
<additionalparam>-Xdoclint:none</additionalparam>
<!-- dependency versions -->
<junit.version>4.13.2</junit.version>
- <storm-client.version>2.6.1</storm-client.version>
+ <storm-client.version>2.6.2</storm-client.version>
<jackson.version>2.15.2</jackson.version>
<tika.version>2.9.1</tika.version>
<mockito.version>5.10.0</mockito.version>