Hi,

We are seeing an issue with Flink in production. We are running version
1.7.
We started seeing sudden lag on Kafka, and the consumers were no longer
consuming messages. After enabling debug logging, we saw the errors
below:
[image: image.jpeg]

I am not sure why this occurs every day. When it happens, I can see that
the remaining workers aren't able to handle the load. Unless I restart my
jobs, processing does not resume. As a result, there is data loss as well.

In the graph below, there is a slight dip in consumption before 5:30. That
is when the incident happens, and it correlates with the logs.

[image: image.jpeg]

Any pointers/suggestions would be appreciated.

Thanks.
