Hi,

You need to use the fat jar [1], as documented on the Kafka Table & SQL
connector page [2]. The flink-connector-kafka jar you downloaded does not
bundle the Kafka client classes, which is why the job fails with a
NoClassDefFoundError for org.apache.kafka.common.serialization.ByteArrayDeserializer;
the flink-sql-connector-kafka fat jar ships them.

[1] https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-kafka_2.11/1.12.2/flink-sql-connector-kafka_2.11-1.12.2.jar
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/kafka.html
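For example, the submit command would then look like the following (a sketch
only; the jar path assumes you download the fat jar to the same directory you
used before):

```shell
# Submit the PyFlink job with the fat (SQL) Kafka connector jar instead of
# the thin flink-connector-kafka jar, so the Kafka client classes are on the
# classpath. Paths are illustrative.
./bin/flink run \
  --python /home/ubuntu/load_kafka.py \
  --jarfile /home/ubuntu/flink-sql-connector-kafka_2.11-1.12.2.jar
```

No change to the Python code is needed for this; only the jar passed via
--jarfile differs.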

Regards,
Dian

> On Apr 19, 2021, at 1:26 AM, g.g.m.5...@web.de wrote:
> 
> 
> Hi,
> I am trying to run a very basic job in PyFlink (getting Data from a 
> Kafka-Topic and printing the stream).
> 
> In the command line I run:
> 
> ./bin/flink run \
> --python /home/ubuntu/load_kafka.py \
> --jarfile /home/ubuntu/flink-connector-kafka_2.12-1.12.2.jar
> 
> I downloaded the jar from:
> https://mvnrepository.com/artifact/org.apache.flink/flink-connector-kafka
> 
> Now I get the following error:
> 
> File "/home/ubuntu/load_kafka.py", line 16, in <module>
>    kafka_consumer = FlinkKafkaConsumer("twitter-stream-source", 
> json_row_schema, kafka_props)
>  File 
> "/home/ubuntu/flink-1.12.2/opt/python/pyflink.zip/pyflink/datastream/connectors.py",
>  line 179, in __init__
>  File 
> "/home/ubuntu/flink-1.12.2/opt/python/pyflink.zip/pyflink/datastream/connectors.py",
>  line 329, in _get_kafka_consumer
>  File 
> "/home/ubuntu/flink-1.12.2/opt/python/py4j-0.10.8.1-src.zip/py4j/java_gateway.py",
>  line 1553, in __call__
>  File 
> "/home/ubuntu/flink-1.12.2/opt/python/pyflink.zip/pyflink/util/exceptions.py",
>  line 147, in deco
>  File 
> "/home/ubuntu/flink-1.12.2/opt/python/py4j-0.10.8.1-src.zip/py4j/protocol.py",
>  line 326, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling 
> None.org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer.
> : java.lang.NoClassDefFoundError: 
> org/apache/kafka/common/serialization/ByteArrayDeserializer
>       at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer.setDeserializer(FlinkKafkaConsumer.java:322)
>       at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer.<init>(FlinkKafkaConsumer.java:223)
>       at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer.<init>(FlinkKafkaConsumer.java:154)
>       at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer.<init>(FlinkKafkaConsumer.java:139)
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>       at 
> org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
>       at 
> org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>       at 
> org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:238)
>       at 
> org.apache.flink.api.python.shaded.py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
>       at 
> org.apache.flink.api.python.shaded.py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
>       at 
> org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.kafka.common.serialization.ByteArrayDeserializer
>       at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>       at 
> org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:64)
>       at 
> org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:74)
>       at 
> org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:48)
>       at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
>       ... 15 more
> 
> org.apache.flink.client.program.ProgramAbortException
>       at 
> org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:124)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:349)
>       at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:219)
>       at 
> org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114)
>       at 
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:812)
>       at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:246)
>       at 
> org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1054)
>       at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
>       at 
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>       at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
> 
> I'm thinking that I might be providing the wrong jar, but don't really have 
> any idea.
> This is my code:
> 
> from pyflink.common.serialization import JsonRowDeserializationSchema
> from pyflink.common.typeinfo import Types
> from pyflink.datastream import StreamExecutionEnvironment
> from pyflink.datastream.connectors import FlinkKafkaConsumer
> 
> env = StreamExecutionEnvironment.get_execution_environment()
> env.set_parallelism(1)
> 
> type_info = Types.ROW([Types.ROW([Types.STRING(), Types.ROW([Types.INT(), 
> Types.INT(), Types.INT(), Types.INT()]), Types.STRING(), Types.STRING(), 
> Types.STRING(), Types.STRING(), Types.STRING()]),
> Types.ROW([Types.ROW([Types.ROW([Types.BOOLEAN(), Types.STRING(), 
> Types.STRING(), Types.STRING(), Types.ROW([Types.INT(), Types.INT(), 
> Types.INT(), Types.INT()]), Types.STRING()])])]),
> Types.ROW([Types.ROW([Types.STRING(), Types.STRING()])])])
> 
> json_row_schema = 
> JsonRowDeserializationSchema.builder().type_info(type_info).build()
> kafka_props = {'bootstrap.servers': 'localhost:9092', 'group.id': 
> 'twitter_consumers'}
> kafka_consumer = FlinkKafkaConsumer("twitter-stream-source", json_row_schema, 
> kafka_props)
> # research this
> kafka_consumer.set_start_from_earliest()
> 
> ds = env.add_source(kafka_consumer)
> ds.print()
> env.execute()
> 
> Thanks a lot!
