This used to work, but it appears that @FinalizeBundle (which KafkaIO
requires) was simply ignored for portable (Python) pipelines. It looks
relatively easy to fix.
-Max
On 07.07.20 03:37, Luke Cwik wrote:
The KafkaIO implementation relies on checkpointing to be able to update
the last committed offset. This is currently unsupported in the portable
Flink runner. BEAM-6868[1] is the associated JIRA. Please vote on it
and/or offer to provide an implementation.
1: https://issues.apache.org/jira/browse/BEAM-6868
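To illustrate why a reader depends on the runner honoring checkpoint/bundle finalization, here is a toy model in plain Python. It is only a sketch under stated assumptions: `ToyPartition`, `ToyReader`, and `run` are hypothetical names invented for this illustration and are not Beam, Flink, or Kafka APIs. The reader hands back a finalize callback with each bundle, and the committed offset only advances if the runner actually invokes it.

```python
# Toy model only: every name here is hypothetical, not a Beam or Kafka API.


class ToyPartition:
    """A partition that remembers the last committed offset."""

    def __init__(self, records):
        self.records = list(records)
        self.committed_offset = 0  # where a restarted reader would resume


class ToyReader:
    """Reads bundles and defers the offset commit to a finalize callback."""

    def __init__(self, partition):
        self.partition = partition
        self.position = partition.committed_offset

    def read_bundle(self, max_records):
        """Return (records, finalize) for the next bundle."""
        start = self.position
        end = min(start + max_records, len(self.partition.records))
        batch = self.partition.records[start:end]
        self.position = end

        def finalize():
            # Commit only once the runner says the bundle is durably processed.
            self.partition.committed_offset = end

        return batch, finalize


def run(partition, bundle_size, runner_finalizes):
    """Drain the partition; optionally skip finalization like an unsupporting runner."""
    reader = ToyReader(partition)
    consumed = []
    while True:
        batch, finalize = reader.read_bundle(bundle_size)
        if not batch:
            break
        consumed.extend(batch)
        if runner_finalizes:
            finalize()
    return consumed


supported = ToyPartition(["a", "b", "c", "d", "e"])
ignored = ToyPartition(["a", "b", "c", "d", "e"])
read_supported = run(supported, bundle_size=2, runner_finalizes=True)
read_ignored = run(ignored, bundle_size=2, runner_finalizes=False)
```

In this toy, both runs read all five records, but only the partition whose runner called the callback ends with an advanced `committed_offset`; the other stays at 0, so a restart would re-read from the beginning.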
On Mon, Jul 6, 2020 at 1:42 PM Piotr Filipiuk <piotr.filip...@gmail.com> wrote:
Hi,
I am trying to run a simple example that uses ReadFromKafka from the
Python API; however, I am getting the following error when using the
Flink Runner:
java.lang.UnsupportedOperationException: The ActiveBundle does not
have a registered bundle checkpoint handler.
See full log in read_from_kafka_flink.log
I am using:
Kafka 2.5.0
Beam 2.22.0
Flink 1.10
When using the Direct runner, the pipeline does not fail, but it does
not seem to consume any data (see read_from_kafka.log), even though
the updated offsets are being logged:
[2020-07-06 13:36:01,342] {worker_handlers.py:398} INFO - severity: INFO
timestamp {
seconds: 1594067761
nanos: 340000000
}
message: "Reader-0: reading from test-topic-0 starting at offset 165"
log_location: "org.apache.beam.sdk.io.kafka.KafkaUnboundedSource"
thread: "23"
I am running both Kafka and Flink locally. I would appreciate your
help understanding and fixing the issue.
--
Best regards,
Piotr