Hi Ramya,

Unfortunately, I cannot see them.

Kostas

On Wed, Sep 23, 2020 at 10:27 AM Ramya Ramamurthy <hair...@gmail.com> wrote:
>
> Hi Kostas,
>
> Attaching the taskmanager logs regarding this issue.
> I have attached the Kafka-related metrics. I hope you can see them this time.
>
> I am not sure why we get this many disconnects from Kafka. Possibly because of these
> interruptions, we seem to slow down in our processing. At some point the
> memory also increases and the workers almost stagnate, not doing any
> processing. I have 3GB of heap committed and have allotted 5GB of memory to the pods.
>
> Thanks for your help.
>
> ~Ramya.
>
> On Tue, Sep 22, 2020 at 9:18 PM Kostas Kloudas <kklou...@gmail.com> wrote:
>>
>> Hi Ramya,
>>
>> Unfortunately, your images are blocked. Could you upload them somewhere and
>> post the links here?
>> I also think the TaskManager logs may help a bit more.
>> Could you please provide them here?
>>
>> Cheers,
>> Kostas
>>
>> On Tue, Sep 22, 2020 at 8:58 AM Ramya Ramamurthy <hair...@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > We are seeing an issue with Flink in our production environment; we are using version 1.7.
>> > We started seeing sudden lag on Kafka, and the consumers were no longer
>> > working/accepting messages. After enabling debug mode, the errors below
>> > were seen:
>> > [image: image.jpeg]
>> >
>> > I am not sure why this occurs every day, and when it happens, the
>> > remaining workers aren't able to handle the load. Unless I restart my
>> > jobs, I am unable to start processing again. This also causes data loss.
>> >
>> > In the graph below, there is a slight dip in consumption before 5:30. That
>> > is when this incident happens, as correlated with the logs.
>> >
>> > [image: image.jpeg]
>> >
>> > Any pointers/suggestions would be appreciated.
>> >
>> > Thanks.
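[Editor's note: the repeated Kafka disconnects described above are frequently tied to the consumer's heartbeat and timeout settings. As a sketch only, these are the standard Kafka consumer client properties typically inspected in such cases; the values shown are illustrative, not a recommendation, and in Flink 1.7 they would be passed to the FlinkKafkaConsumer through its Properties argument.]

```properties
# Illustrative values only -- tune against your broker's configuration.
session.timeout.ms=10000        # broker evicts the consumer if no heartbeat arrives within this window
heartbeat.interval.ms=3000      # heartbeat frequency; keep well below session.timeout.ms
request.timeout.ms=40000        # client-side timeout for broker requests
reconnect.backoff.ms=50         # delay before reconnecting after a broken connection
max.poll.interval.ms=300000     # max gap between polls before the consumer is considered failed
```

A consumer that heartbeats too slowly relative to `session.timeout.ms`, or that stalls between polls past `max.poll.interval.ms` (for example, during a long GC pause of the kind suggested by the memory growth described above), will be dropped from the group and forced to rejoin, which looks like repeated disconnects in the logs.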