[ 
https://issues.apache.org/jira/browse/FLINK-34400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815166#comment-17815166
 ] 

Rui Fan edited comment on FLINK-34400 at 2/7/24 8:51 AM:
---------------------------------------------------------

Hi [~asardaes], the Flink version is 1.18.1, right?

I saw that your log contains `Caused by: 
org.apache.kafka.clients.consumer.internals.NoAvailableBrokersException`. I'm 
not sure whether the root cause lies in your Kafka environment, or whether the 
Kafka cluster is fine but the Flink cluster cannot connect to it reliably.

 

Also, does your Flink job run well when you don't use watermark alignment?

If this issue is related to watermark alignment, would you mind providing 
simple demo code to reproduce it? I can help troubleshoot it.


was (Author: fanrui):
Hi [~asardaes], the Flink version is 1.18.1, right?

 

I saw that your log contains `Caused by: 
org.apache.kafka.clients.consumer.internals.NoAvailableBrokersException`. I'm 
not sure whether the root cause lies in your Kafka environment.

> Kafka sources with watermark alignment sporadically stop consuming
> ------------------------------------------------------------------
>
>                 Key: FLINK-34400
>                 URL: https://issues.apache.org/jira/browse/FLINK-34400
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.18.1
>            Reporter: Alexis Sarda-Espinosa
>            Priority: Major
>         Attachments: logs.txt
>
>
> I have 2 Kafka sources that read from different topics. I have assigned them 
> to the same watermark alignment group, and I have _not_ enabled idleness 
> explicitly in their watermark strategies. One topic remains pretty much empty 
> most of the time, while the other receives a few events per second all the 
> time. The parallelism of the active source is 2 and that of the other one is 
> 1; checkpoints run once every minute.
> This works correctly for some time (10 - 15 minutes in my case), but then one 
> of the active sources stops consuming, which causes lag to increase. Weirdly, 
> after another 15 minutes or so, all the backlog is consumed at once, and then 
> everything stops again.
> I'm attaching some logs from the Task Manager where the issue appears. You 
> will notice that the Kafka network client reports disconnections (a long time 
> after the deserializer stopped reporting that events were being consumed); 
> I'm not sure if this is related.
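
As background for the watermark alignment setup described in the issue: Flink pauses a source when its watermark is ahead of the minimum watermark in its alignment group by more than the configured maximum drift. The following is a minimal plain-Java sketch of that rule only. It is not Flink's implementation; the class name, method names, source labels, and numbers are hypothetical, and it leaves out idleness handling and the periodic update interval.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java illustration of the alignment rule; this is not Flink code. */
final class AlignmentSketch {

    /** Smallest watermark currently reported within one alignment group. */
    static long groupMinimum(Map<String, Long> watermarks) {
        return watermarks.values().stream()
                .mapToLong(Long::longValue)
                .min()
                .orElse(Long.MIN_VALUE);
    }

    /**
     * True when a source is further ahead of the group minimum than the
     * allowed drift. Assumes maxDriftMillis is non-negative.
     */
    static boolean shouldPause(long ownWatermark, long groupMinimum, long maxDriftMillis) {
        // Saturate instead of overflowing when the minimum is close to Long.MAX_VALUE.
        long limit = groupMinimum > Long.MAX_VALUE - maxDriftMillis
                ? Long.MAX_VALUE
                : groupMinimum + maxDriftMillis;
        return ownWatermark > limit;
    }

    public static void main(String[] args) {
        // Hypothetical watermarks (epoch millis) for two subtasks of one source.
        Map<String, Long> watermarks = new LinkedHashMap<>();
        watermarks.put("active-0", 100_000L);
        watermarks.put("active-1", 130_000L);

        long min = groupMinimum(watermarks);
        long maxDriftMillis = 20_000L;

        watermarks.forEach((name, wm) ->
                System.out.println(name + " paused: " + shouldPause(wm, min, maxDriftMillis)));
    }
}
```

Under this rule, a group member whose watermark does not advance holds the minimum back, so faster members can end up paused until the minimum moves forward. Whether that is what happens in this issue is not established by the sketch.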



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
