Hi,

We have a Flink 1.20.3 streaming job that uses the hashmap state backend. We normally run it with a job parallelism of 6, spread across 6 TaskManagers, and it behaves correctly. The JobManagers have a heap size of 300 MB and we have never had an issue with that. We also run HA with ZooKeeper.
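
In Flink configuration terms, the baseline corresponds roughly to the fragment below. This is only a sketch that maps the description above onto standard option names. Where the parallelism is actually set (it could equally be set in the job code or at submission) and everything else in the file are assumptions.

```yaml
# Baseline setup as described above; all other settings are omitted.
state.backend.type: hashmap          # hashmap state backend
parallelism.default: 6               # assumption: parallelism set via config
jobmanager.memory.heap.size: 300m    # JobManager JVM heap
high-availability.type: zookeeper    # HA with ZooKeeper
```
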
For one particular scenario, we increased the job parallelism to 10. We also adjusted the TaskManager parallelism and memory to match, but the JobManagers now run out of memory (OOM). We increased their heap size to 400 MB, which seems to help, but we still see occasional restarts.

Questions:
Is there a correlation between the job parallelism, the number of TaskManagers, and the memory usage of the JobManagers? Are there any tips on how to configure it (other than try/watch/repeat)? What should we take into account when sizing the JobManager memory?

Thanks,
JM
