Hi Lydian,

Note that there was a memory leak in certain versions of Beam: https://github.com/apache/beam/issues/28246. Make sure you are using a newer version. You might also find some of the debugging pointers there useful.
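On monitoring: if, as you describe, the harness is not a child of the taskmanager process, one rough option is to sample its resident memory directly on the host or in the container. Below is a minimal sketch, assuming Linux with /proc available. The `sdk_worker_main` pattern used to pick out Python harness processes is an assumption about how they appear in the command line, and the helper names are made up for illustration, so adjust both to your deployment.

```python
import os


def rss_kib(pid):
    """Return the resident set size of `pid` in KiB, or None if unavailable."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    # The line looks like "VmRSS:     12345 kB".
                    return int(line.split()[1])
    except OSError:
        # The process exited, is not readable, or /proc does not exist.
        return None
    return None  # e.g. kernel threads have no VmRSS entry


def cmdline(pid):
    """Return the command line of `pid` as one string, or '' if unavailable."""
    try:
        with open(f"/proc/{pid}/cmdline", "rb") as f:
            raw = f.read()
    except OSError:
        return ""
    return raw.replace(b"\0", b" ").decode(errors="replace").strip()


def harness_memory(pattern="sdk_worker_main"):
    """Map pid -> RSS in KiB for every process whose command line contains `pattern`.

    The default pattern is an assumption, not something confirmed in this thread.
    """
    if not os.path.isdir("/proc"):
        return {}
    usage = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        pid = int(entry)
        if pattern in cmdline(pid):
            rss = rss_kib(pid)
            if rss is not None:
                usage[pid] = rss
    return usage


if __name__ == "__main__":
    for pid, rss in sorted(harness_memory().items()):
        print(f"pid={pid} rss={rss / 1024:.1f} MiB")
```

Running this periodically (for example from a sidecar or a cron job) and logging the output should show whether the harness RSS grows steadily before the process disappears, which would point towards an OOM.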
To my knowledge, the Flink runner does not implement support for the min_ram resource hint. Also, the intent of that hint is to specify a lower bound rather than an upper bound.

On Wed, Feb 7, 2024 at 11:49 AM Lydian <[email protected]> wrote:

> Hi,
>
> I found that our Flink taskmanager is likely to crash because the Python harness is no longer reachable.
>
> However, it seems that the Beam harness is not a child process of the Flink taskmanager process, so the Flink metrics monitor cannot report the memory usage of the Beam SDK harness (either Java or Python). This makes me unable to debug the issue further. Is there a way to monitor Beam memory usage, especially for those harness processes?
>
> Also, it is very likely that the disconnect results from an OOM. If that is the case, what is the best way to limit the resource usage of the harness? I noticed there is a resource hint <https://beam.apache.org/documentation/runtime/resource-hints/>, but the documentation also mentions that not all runners will honor that setting, and I couldn't find anything about resource hints in relation to the Flink runner. Is that the best way for us to fix the memory usage, or is there another approach we can take to avoid OOM when Python tasks run on the Flink runner?
>
> Thanks
>
> Sincerely,
> Lydian Lee
