I am trying to read from Kafka and save the data as Parquet files on HDFS,
following this Stack Overflow question:
https://stackoverflow.com/questions/45827664/read-from-kafka-and-write-to-hdfs-in-parquet
The code is similar to:
    val df = spark
      .read
      .format("kafka")
      .option("kafka.bootstrap.servers", "host1:port1,host2:port2")
      .option("subscribe", "topic1")
      .load()
except that I am writing it in Java.
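For context, a minimal sketch of the whole batch job (read from Kafka, then write Parquet to HDFS) might look like the following in Scala. The application name and the HDFS output path are hypothetical placeholders, and the sketch assumes the spark-sql-kafka-0-10 package is on the classpath; the string casts are only one possible way to handle the binary key and value columns.

```scala
import org.apache.spark.sql.SparkSession

object KafkaToParquet {
  def main(args: Array[String]): Unit = {
    // Application name is a placeholder chosen for this sketch.
    val spark = SparkSession.builder().appName("KafkaToParquet").getOrCreate()

    // Batch read of the records currently available in the topic.
    val df = spark.read
      .format("kafka")
      .option("kafka.bootstrap.servers", "host1:port1,host2:port2")
      .option("subscribe", "topic1")
      .load()

    // Kafka delivers key and value as binary columns; cast them to strings
    // before writing. The output path below is hypothetical.
    df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
      .write
      .mode("append")
      .parquet("hdfs:///tmp/kafka-topic1-parquet")

    spark.stop()
  }
}
```

The Java version uses the same method names with explicit parentheses (spark.read().format("kafka") and so on), and the result type is Dataset<Row>.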
However, I keep getting the following warning:
CachedKafkaConsumer: CachedKafkaConsumer is not running in
UninterruptibleThread. It may hang when CachedKafkaConsumer's method are
interrupted because of KAFKA-1894.
How can I solve this? Thanks!
Regards,
Junfeng Chen