Thank you for looking into this.

I upvoted BEAM-6868 <https://issues.apache.org/jira/browse/BEAM-6868>.
Is there anything else I can do to help get that feature prioritized,
other than trying to contribute a fix myself?

Regarding DirectRunner: as I mentioned above, I can see in the
worker_handlers.py logs that some data is being consumed from Kafka, since
the offsets logged are consistent with what is being published. Is
DirectRunner known to work for this use case? I do not see any errors
(I attached all the logs). The apache/beam_java_sdk:2.22.0 container is
running (see logs attached).

On Thu, Jul 9, 2020 at 7:45 AM Maximilian Michels <[email protected]> wrote:

> This used to be working but it appears @FinalizeBundle (which KafkaIO
> requires) was simply ignored for portable (Python) pipelines. It looks
> relatively easy to fix.
>
> -Max
>
> On 07.07.20 03:37, Luke Cwik wrote:
> > The KafkaIO implementation relies on checkpointing to be able to update
> > the last committed offset. This is currently unsupported in the portable
> > Flink runner. BEAM-6868[1] is the associated JIRA. Please vote on it
> > and/or offer to provide an implementation.
> >
> > 1: https://issues.apache.org/jira/browse/BEAM-6868
> >
> > On Mon, Jul 6, 2020 at 1:42 PM Piotr Filipiuk <[email protected]
> > <mailto:[email protected]>> wrote:
> >
> >     Hi,
> >
> >     I am trying to run a simple example that uses Python API to
> >     ReadFromKafka, however I am getting the following error when using
> >     Flink Runner:
> >
> >     java.lang.UnsupportedOperationException: The ActiveBundle does not
> >     have a registered bundle checkpoint handler.
> >
> >     See full log in read_from_kafka_flink.log
> >
> >     I am using:
> >     Kafka 2.5.0
> >     Beam 2.22.0
> >     Flink 1.10
> >
> >     When using Direct runner, the pipeline does not fail but does not
> >     seem to be consuming any data (see read_from_kafka.log) even though
> >     the updated offsets are being logged:
> >
> >     [2020-07-06 13:36:01,342] {worker_handlers.py:398} INFO - severity:
> INFO
> >     timestamp {
> >        seconds: 1594067761
> >        nanos: 340000000
> >     }
> >     message: "Reader-0: reading from test-topic-0 starting at offset 165"
> >     log_location: "org.apache.beam.sdk.io.kafka.KafkaUnboundedSource"
> >     thread: "23"
> >
> >     I am running both Kafka and Flink locally. I would appreciate your
> >     help understanding and fixing the issue.
> >
> >     --
> >     Best regards,
> >     Piotr
> >
>


-- 
Best regards,
Piotr

Attachment: beam_java_sdk.log
