Hi,

I have a job that uses the State Processor API to load data from checkpoints on
Google Cloud Storage, do some processing, and then write the result to Google
Cloud Storage. The total data size is about 30-50 GB, and the job can take more
than 2 hours to finish. In the flame graph generated from the job, I found that
it spent most of its time in pthread_cond_wait, pthread_cond_timedwait, and
epoll_wait, so the job looks IO-bound. I found very few articles on State
Processor performance. Because the job takes a long time to run and Flink has
many parameters to adjust, I wonder whether anyone has experience improving
performance in such a case. Thanks for any comments.

Best wishes,
Chen-Che
