+1 for prioritizing this.

On Thu, Jul 9, 2020 at 8:08 AM Piotr Filipiuk <[email protected]>
wrote:

> Thank you for looking into this.
>
> I upvoted BEAM-6868 <https://issues.apache.org/jira/browse/BEAM-6868>.
> Is there anything else I can do to get that feature prioritized, other
> than trying to contribute myself?
>
> Regarding the DirectRunner: as I mentioned above, I can see in the
> worker_handlers.py logs that some data is being consumed from Kafka, since
> the logged offsets are consistent with what is being published. Is the
> DirectRunner known to work for this use case? I do not see any errors (I
> attached all the logs). The apache/beam_java_sdk:2.22.0 container is
> running (see logs attached).
>
> On Thu, Jul 9, 2020 at 7:45 AM Maximilian Michels <[email protected]> wrote:
>
>> This used to be working but it appears @FinalizeBundle (which KafkaIO
>> requires) was simply ignored for portable (Python) pipelines. It looks
>> relatively easy to fix.
>>
>> -Max
>>
>> On 07.07.20 03:37, Luke Cwik wrote:
>> > The KafkaIO implementation relies on checkpointing to be able to update
>> > the last committed offset. This is currently unsupported in the
>> portable
>> > Flink runner. BEAM-6868[1] is the associated JIRA. Please vote on it
>> > and/or offer to provide an implementation.
>> >
>> > 1: https://issues.apache.org/jira/browse/BEAM-6868
>> >
>> > On Mon, Jul 6, 2020 at 1:42 PM Piotr Filipiuk <[email protected]
>> > <mailto:[email protected]>> wrote:
>> >
>> >     Hi,
>> >
>> >     I am trying to run a simple example that uses the Python API's
>> >     ReadFromKafka; however, I am getting the following error when using
>> >     the Flink runner:
>> >
>> >     java.lang.UnsupportedOperationException: The ActiveBundle does not
>> >     have a registered bundle checkpoint handler.
>> >
>> >     See full log in read_from_kafka_flink.log
>> >
>> >     I am using:
>> >     Kafka 2.5.0
>> >     Beam 2.22.0
>> >     Flink 1.10
>> >
>> >     When using Direct runner, the pipeline does not fail but does not
>> >     seem to be consuming any data (see read_from_kafka.log) even though
>> >     the updated offsets are being logged:
>> >
>> >     [2020-07-06 13:36:01,342] {worker_handlers.py:398} INFO - severity:
>> INFO
>> >     timestamp {
>> >        seconds: 1594067761
>> >        nanos: 340000000
>> >     }
>> >     message: "Reader-0: reading from test-topic-0 starting at offset
>> 165"
>> >     log_location: "org.apache.beam.sdk.io.kafka.KafkaUnboundedSource"
>> >     thread: "23"
>> >
>> >     I am running both Kafka and Flink locally. I would appreciate your
>> >     help understanding and fixing the issue.
>> >
>> >     --
>> >     Best regards,
>> >     Piotr
>> >
>>
>
>
> --
> Best regards,
> Piotr
>
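
[Editor's note: for readers following along, a minimal version of the pipeline
described in the thread might look like the sketch below. The broker address
(localhost:9092), Flink master address (localhost:8081), and topic name
(test-topic) are assumptions based on the local setup mentioned above, not
values taken from the original poster's code. Running it requires a live Kafka
broker and Flink cluster, plus the Java expansion service that ReadFromKafka
pulls in, so treat it as a configuration sketch rather than a tested program.]

```python
# Minimal ReadFromKafka pipeline sketch for Beam 2.22.0 on the Flink runner.
# All host/port/topic values below are assumptions for a local setup.
import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=FlinkRunner",
    "--flink_version=1.10",
    "--flink_master=localhost:8081",   # assumed local Flink REST endpoint
    "--environment_type=LOOPBACK",
])

with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | "ReadFromKafka" >> ReadFromKafka(
         consumer_config={
             "bootstrap.servers": "localhost:9092",  # assumed local broker
             "auto.offset.reset": "earliest",
         },
         topics=["test-topic"])
     | "Print" >> beam.Map(print))
```

As discussed upthread, on Beam 2.22.0 this fails on the portable Flink runner
because bundle finalization (needed to commit Kafka offsets) was not yet wired
up there; the sketch only shows the shape of the pipeline being debugged.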
