Re: Troubleshooting checkpoint timeout

2021-10-26 Thread Piotr Nowojski
*From:* Piotr Nowojski *Sent:* Monday, 25 October 2021 15:51 *To:* Alexis Sarda-Espinosa *Cc:* Parag Somani; Caizhi Weng <tsreape...@gmail.com>; Flink ML *Subject:* Re: Troubleshooting checkpoint timeout Hi Alexis, > Should I under

RE: Troubleshooting checkpoint timeout

2021-10-25 Thread Alexis Sarda-Espinosa
checkpoints. Thanks again for all the info. Regards, Alexis. From: Piotr Nowojski Sent: Monday, 25 October 2021 15:51 To: Alexis Sarda-Espinosa Cc: Parag Somani; Caizhi Weng; Flink ML Subject: Re: Troubleshooting checkpoint timeout Hi Alexis, > Should I understand these metrics as a prope

Re: Troubleshooting checkpoint timeout

2021-10-25 Thread Piotr Nowojski
t are behind more data than before it restarted, no? Regards, Alexis. *From:* Piotr Nowojski *Sent:* Monday, 25 October 2021 13:35 *To:* Alexis Sarda-Espinosa *Cc:* Parag Somani; Caizhi Weng <tsreape...@gmail.com>

RE: Troubleshooting checkpoint timeout

2021-10-25 Thread Alexis Sarda-Espinosa
Sent: Monday, 25 October 2021 09:59 To: Alexis Sarda-Espinosa <alexis.sarda-espin...@microfocus.com> Cc: Parag Somani <somanipa...@gmail.com>; Caizhi Weng <tsreape...@gmail.com>; Flink ML <user@flink.apache.org> Subject: Re: Troubleshooting check

Re: Troubleshooting checkpoint timeout

2021-10-25 Thread Piotr Nowojski
stream operator has lower parallelism? Regards, Alexis. *From:* Piotr Nowojski *Sent:* Monday, 25 October 2021 09:59 *To:* Alexis Sarda-Espinosa *Cc:* Parag Somani; Caizhi Weng <tsreape...@gmail.com>; Flink ML *Subject

RE: Troubleshooting checkpoint timeout

2021-10-25 Thread Alexis Sarda-Espinosa
eam operator has lower parallelism? Regards, Alexis. From: Piotr Nowojski Sent: Monday, 25 October 2021 09:59 To: Alexis Sarda-Espinosa Cc: Parag Somani; Caizhi Weng; Flink ML Subject: Re: Troubleshooting checkpoint timeout Hi Alexis, You can read about those metrics in the documentation

Re: Troubleshooting checkpoint timeout

2021-10-25 Thread Piotr Nowojski
those metrics don’t really help me know in which areas to look for issues. Regards, Alexis. *From:* Alexis Sarda-Espinosa *Sent:* Wednesday, 20 October 2021 09:43 *To:* Parag Somani; Caizhi Weng <tsreape...@gmail.com>

RE: Troubleshooting checkpoint timeout

2021-10-21 Thread Alexis Sarda-Espinosa
, Alexis. From: Alexis Sarda-Espinosa Sent: Wednesday, 20 October 2021 09:43 To: Parag Somani; Caizhi Weng Cc: Flink ML Subject: RE: Troubleshooting checkpoint timeout Currently the windows are 10 minutes in size with a 1-minute slide time. The approximate throughput of 500 events/minute is already

RE: Troubleshooting checkpoint timeout

2021-10-20 Thread Alexis Sarda-Espinosa
; Flink ML Subject: Re: Troubleshooting checkpoint timeout I had a similar problem, where two concurrent checkpoints were configured. Also, I used to save them in S3 (using MinIO) in a k8s 1.18 environment. The Flink service kept getting restarted and timeouts were happening. It got resolved: 1. As MinIO ran

Re: Troubleshooting checkpoint timeout

2021-10-20 Thread Parag Somani
I had a similar problem, where two concurrent checkpoints were configured. Also, I used to save them in S3 (using MinIO) in a k8s 1.18 environment. The Flink service kept getting restarted and timeouts were happening. It got resolved: 1. As MinIO ran out of disk space, it caused checkpoint failures (this was

Re: Troubleshooting checkpoint timeout

2021-10-19 Thread Caizhi Weng
Hi! I see you're using sliding event time windows. What are the exact values of windowLengthMinutes and windowSlideTimeMinutes? If windowLengthMinutes is large and windowSlideTimeMinutes is small, then each record may be assigned to a large number of windows as the pipeline proceeds, thus gradually
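
To make the point about window length versus slide concrete, here is a minimal sketch in plain Python (not Flink code) of how many sliding windows a single record falls into. The function name and its parameters are hypothetical, chosen only for illustration, and the logic assumes the usual sliding-window rule with no offset: a window starts at every multiple of the slide and covers [start, start + size). The values used are the 10-minute size and 1-minute slide reported elsewhere in this thread.

```python
def sliding_window_starts(timestamp_ms, size_ms, slide_ms):
    """Return the start times (ms) of every sliding window containing timestamp_ms.

    Hypothetical helper for illustration; assumes windows start at multiples
    of slide_ms (no offset) and cover the half-open range [start, start + size_ms).
    """
    # Start of the latest window that begins at or before the timestamp.
    last_start = timestamp_ms - (timestamp_ms % slide_ms)
    starts = []
    start = last_start
    # Step back one slide at a time while the window still covers the timestamp.
    while start > timestamp_ms - size_ms:
        starts.append(start)
        start -= slide_ms
    return starts


MINUTE_MS = 60_000

# A record with event time 12 min 30 s, 10-minute windows, 1-minute slide.
windows = sliding_window_starts(
    timestamp_ms=12 * MINUTE_MS + 30_000,
    size_ms=10 * MINUTE_MS,
    slide_ms=1 * MINUTE_MS,
)
print(len(windows))  # 10: one window per slide step within the window length
```

Under these assumptions each record belongs to size / slide windows, so at roughly 500 events per minute the operator handles about 5000 record-to-window assignments per minute. How much state that implies depends on how the window function aggregates, which this sketch does not model.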