[ https://issues.apache.org/jira/browse/HUDI-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17514429#comment-17514429 ]
Raymond Xu commented on HUDI-3610:
----------------------------------
Resolved with the user on Slack:
https://apache-hudi.slack.com/archives/C4D716NPQ/p1646038767655889
The dependency issue is resolved.
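
One way to narrow down a NoClassDefFoundError like the one quoted below is to check whether the jar placed in the Connect plugin directory actually ships the missing class. The following is a minimal sketch only, not necessarily how this issue was diagnosed; the helper name jar_contains_class is hypothetical, and the jar path in the usage comment is assumed from the Dockerfile in the issue description.

```python
import zipfile


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar (a zip archive) has an entry for class_name.

    class_name is a fully qualified Java class name, for example
    "org.apache.hadoop.fs.FSDataInputStream".
    """
    # A class a.b.C is stored in a jar under the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Example call (path assumed from the Dockerfile in the issue):
# jar_contains_class(
#     "/usr/share/java/hudi-kafka-connect-bundle-0.11.0-SNAPSHOT.jar",
#     "org.apache.hadoop.fs.FSDataInputStream",
# )
```

If the check returns False, the class has to come from another jar visible to the same plugin class loader.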
> Validate Hudi Kafka Connect Sink writing to S3
> ----------------------------------------------
>
> Key: HUDI-3610
> URL: https://issues.apache.org/jira/browse/HUDI-3610
> Project: Apache Hudi
> Issue Type: Task
> Components: kafka-connect
> Reporter: Ethan Guo
> Assignee: Raymond Xu
> Priority: Critical
> Fix For: 0.11.0
>
>
> From the community:
> Hi guys, I'm trying to implement this architecture with Hudi:
> DB table --> Debezium --> Kafka --> Hudi sink connector --> S3 bucket
> My setup:
> Kafka version 2.4
> Hudi version 0.10.1
> HDFS sink connector version 10.1.4
> I'm encountering this error:
> {code:java}
> ERROR WorkerSinkTask{id=<XXX>} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)
> java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
>     at org.apache.hudi.connect.HoodieSinkTask.start(HoodieSinkTask.java:80)
>     at org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:312)
>     at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:186)
>     at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:243)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
>     at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476)
>     at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
>     at org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:103)
>     at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
>     ... 9 more
> {code}
> This is the Dockerfile I used to build the custom image:
> {code:java}
> #==================
> FROM maven:3.8.4-openjdk-8-slim as build-hudi
> ENV HUDI_VERSION=0.10.1
> RUN mkdir /home/hudi && \
>     curl -L https://github.com/apache/hudi/archive/refs/tags/release-$HUDI_VERSION.tar.gz > hudi-release-$HUDI_VERSION.tar.gz && \
>     tar -xzvf ./hudi-release-$HUDI_VERSION.tar.gz -C /home/hudi && \
>     rm ./hudi-release-$HUDI_VERSION.tar.gz && \
>     cd /home/hudi/hudi-release-$HUDI_VERSION && \
>     mvn package -DskipTests -pl packaging/hudi-kafka-connect-bundle -am
> #==================
> FROM confluentinc/cp-kafka-connect:7.0.1
> ENV DEBEZIUM_VERSION=1.4.1.Final \
>     MAVEN_REPO_CORE="https://repo1.maven.org/maven2" \
>     CONNECTOR=mysql \
>     KAFKA_CONNECT_PLUGINS_DIR=/usr/share/java \
>     DATAGEN_VERSION=0.5.3 \
>     ADX_SINK_CONNECTOR_VERSION=2.2.0 \
>     AMAZON_S3_SINK_CONNECTOR_VERSION=10.0.3 \
>     HDFS2_SINK_CONNECTOR_VERSION=10.1.4 \
>     HUDI_OUTPUT_JAR_FILE="hudi-kafka-connect-bundle-0.11.0-SNAPSHOT.jar" \
>     HUDI_VERSION=0.10.1
> RUN curl -fSL -o /tmp/plugin.tar.gz \
>     $MAVEN_REPO_CORE/io/debezium/debezium-connector-$CONNECTOR/$DEBEZIUM_VERSION/debezium-connector-$CONNECTOR-$DEBEZIUM_VERSION-plugin.tar.gz && \
>     tar -xzf /tmp/plugin.tar.gz -C $KAFKA_CONNECT_PLUGINS_DIR && \
>     rm -f /tmp/plugin.tar.gz
> RUN confluent-hub install --no-prompt confluentinc/kafka-connect-datagen:$DATAGEN_VERSION && \
>     confluent-hub install --no-prompt microsoftcorporation/kafka-sink-azure-kusto:$ADX_SINK_CONNECTOR_VERSION && \
>     confluent-hub install --no-prompt confluentinc/kafka-connect-s3:$AMAZON_S3_SINK_CONNECTOR_VERSION && \
>     confluent-hub install --no-prompt confluentinc/kafka-connect-hdfs:$HDFS2_SINK_CONNECTOR_VERSION
> COPY --from=build-hudi \
>     /home/hudi/hudi-release-$HUDI_VERSION/packaging/hudi-kafka-connect-bundle/target/hudi-kafka-connect-bundle-$HUDI_VERSION.jar \
>     $KAFKA_CONNECT_PLUGINS_DIR/$HUDI_OUTPUT_JAR_FILE
> {code}