akashk99 commented on issue #31313:
URL: https://github.com/apache/beam/issues/31313#issuecomment-2137942022

   To reproduce this, here is the local setup I used:
   
   I ran `flink/start-cluster.sh`
   
   submitted the job using the `flink run` command with the `-d` flag
   
   and then stopped the job with a savepoint `./flink/bin/flink stop -p 
flink/savepoints cf78a44e6b10ab7062d3c02bb7d4e052`
   
   and then restarted the job using `flink run` with the savepoint path.
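
   For anyone retracing these steps, the cycle can be sketched roughly as
   below. This is only an illustration under assumptions: `PIPELINE_JAR` and
   its default value are hypothetical placeholders, the restore step uses the
   Flink CLI's `-s` option, and the helpers print each command (dry run)
   unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch of the stop-with-savepoint / restore cycle (not the exact commands used).
FLINK_HOME="${FLINK_HOME:-./flink}"
SAVEPOINT_DIR="${SAVEPOINT_DIR:-flink/savepoints}"
PIPELINE_JAR="${PIPELINE_JAR:-my-pipeline.jar}"  # hypothetical placeholder

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Submit the job in detached mode.
submit_detached() {
  run "$FLINK_HOME/bin/flink" run -d "$PIPELINE_JAR"
}

# Stop the job with the given ID, writing a savepoint under SAVEPOINT_DIR.
stop_with_savepoint() {
  run "$FLINK_HOME/bin/flink" stop -p "$SAVEPOINT_DIR" "$1"
}

# Resubmit the job, restoring state from the given savepoint path.
restore_from_savepoint() {
  run "$FLINK_HOME/bin/flink" run -s "$1" "$PIPELINE_JAR"
}

# Example (dry run):
#   submit_detached
#   stop_with_savepoint cf78a44e6b10ab7062d3c02bb7d4e052
#   restore_from_savepoint <savepoint path printed by the stop command>
```

   The savepoint path passed to the last helper is whatever `flink stop`
   reports on completion; it is left as a placeholder here.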
   
   After the restart, I searched the task manager logs for
`Starting getIterator request` and found 6 entries with the same timestamp as
my app's restart: 3 at sequence number and 3 at latest. I am not sure why the
latest ones are showing up, and I didn't see anything in the source code that
would cause this.
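
   A rough way to tally these entries per second is sketched below. It
   assumes each log line starts with a date field and a time field of the
   form `HH:mm:ss,SSS` (the usual Flink log layout); the function name
   `count_get_iterator` is hypothetical, and the awk step needs adjusting if
   the log pattern differs.

```shell
#!/usr/bin/env bash
# Count "Starting getIterator request" log lines, grouped by timestamp (to the second).
# Assumes fields 1 and 2 of each line are the date and the time (HH:mm:ss,SSS).
count_get_iterator() {
  grep -hF 'Starting getIterator request' "$@" |
    awk '{ sub(/,[0-9]+$/, "", $2); print $1, $2 }' |  # drop the milliseconds
    sort | uniq -c
}

# Example (log location is an assumption about the local layout):
#   count_get_iterator flink/log/*taskexecutor*.log
```

   Each output row is a count followed by the date and time, so a burst of
   requests at restart shows up as a single row with a count above one.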
   
   I also switched to Kafka and noticed the same behavior, so it seems to be
related to the runner. I was unable to fix the performance issues with
`beam_fn_api` and noticed that backpressure was causing my data to come in
waves. On a CPU chart, usage was very cyclic, with peaks of 99% and troughs of
8%, which leads me to believe that this pipeline option causes data to build up
and then rush through, spiking the CPU.
   
   I can make do with Kafka offset commits for now, but if there are any
pointers on how to fix this in the Beam source code, I'd be happy to take a
look and even submit a PR to be included in version 2.57. That said, I am still
hoping the issue is something on my end that can be fixed fairly easily.


