You don't need to kill the spark-submit process yourself. Just set the Spark conf spark.yarn.submit.waitAppCompletion to false, and the spark-submit process will exit right after YARN accepts the application.
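For example, the conf can be passed directly on the command line (the class name and jar path below are placeholders, not part of the original report):

```shell
# Submit in YARN cluster mode. With waitAppCompletion=false, the local
# spark-submit process exits as soon as YARN accepts the application,
# instead of staying resident for the lifetime of the job.
# com.example.MyJob and my-job.jar are illustrative placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.submit.waitAppCompletion=false \
  --class com.example.MyJob \
  my-job.jar
```

The same conf can also be set in spark-defaults.conf or via SparkSubmitHook's conf argument.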
On Fri, Jul 29, 2022 at 5:23 AM Tornike Gurgenidze <[email protected]> wrote:

> Hi all,
>
> I opened a ticket (https://github.com/apache/airflow/issues/24171) a
> while back and I just want to make sure that it got stale deservedly :)
>
> We used to have an issue with memory consumption on Airflow Celery
> workers, where tasks were often killed by the OOM killer. Most of our
> workload was running Spark jobs in YARN cluster mode using
> SparkSubmitHook. The main driver of the high memory consumption was the
> spark-submit processes, which took about 500 MB of memory each even
> though in YARN cluster mode they were doing essentially nothing. We
> changed the hook to kill the spark-submit process right after YARN
> accepts the application and to track the status with "yarn application
> -status" calls instead, similar to how Spark standalone mode is tracked
> right now, and the OOM issues went away.
>
> It seems like an issue that lots of other users with a similar usage
> pattern should probably be experiencing, unless they have unnecessarily
> large memory allocations on their Airflow workers. I want to know if
> anyone else has had a similar experience. Is it worth it to work on
> including our fix in the upstream repo? Or maybe everyone else has
> already switched to managed Spark services and it's just us? :)
>
> --
> Tornike

--
Best Regards

Jeff Zhang
