Hi, I found that our Flink TaskManager tends to crash because the Python SDK harness is no longer reachable.
However, the Beam SDK harness does not appear to be a child process of the Flink TaskManager process, so the Flink metrics monitor cannot report the memory used by the harness (either Java or Python). That leaves me unable to debug the issue further. Is there a way to monitor Beam memory usage, especially for the harness processes?

I also suspect that the disconnect is caused by an OOM. If that is the case, what is the best way to limit the harness's resource usage? I noticed the resource hints feature <https://beam.apache.org/documentation/runtime/resource-hints/>, but the page mentions that not all runners honor those settings, and I couldn't find anything about resource hints for the Flink runner. Are resource hints the best way for us to limit memory usage, or is there another approach to avoid OOM for Python tasks running on the Flink runner?

Thanks.

Sincerely,
Lydian Lee
