Hi Arthur,

If you are using Spark 3.x, you can use executor metrics for memory instrumentation. The metrics are available in the Web UI; see https://spark.apache.org/docs/latest/web-ui.html#stage-detail (search for "Peak execution memory").

Memory execution metrics are also available in the REST API and the Spark metrics system; see https://spark.apache.org/docs/latest/monitoring.html

Further information on the topic is also at https://db-blog.web.cern.ch/blog/luca-canali/2020-08-spark3-memory-monitoring

Best,
Luca
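To illustrate the REST API route mentioned above, here is a minimal sketch, not a definitive recipe. It assumes the executors endpoint described in the monitoring docs returns a `peakMemoryMetrics` object with a `JVMHeapMemory` value in bytes for each executor (Spark 3.x). The function names, the base URL in the usage comment, and the 1.3 headroom factor are illustrative assumptions, not Spark settings.

```python
import json
from urllib.request import urlopen


def fetch_executors(base_url, app_id):
    """Return the executor summaries published by the Spark REST API."""
    url = f"{base_url}/api/v1/applications/{app_id}/executors"
    with urlopen(url) as resp:
        return json.load(resp)


def peak_heap_mib(executors):
    """Map executor id -> peak JVM heap in MiB.

    Skips the driver entry and any executor that reports no peak metrics.
    """
    peaks = {}
    for ex in executors:
        if ex.get("id") == "driver":
            continue
        metrics = ex.get("peakMemoryMetrics")
        if not metrics or "JVMHeapMemory" not in metrics:
            continue
        peaks[ex["id"]] = metrics["JVMHeapMemory"] / (1024 * 1024)
    return peaks


def suggest_heap_mib(peaks, headroom=1.3):
    """Hypothetical heuristic: largest observed peak times a safety margin."""
    if not peaks:
        return None
    return int(max(peaks.values()) * headroom)


# Example usage (assumed URL of a running application's UI; app id is a placeholder):
#   executors = fetch_executors("http://localhost:4040", "<app-id>")
#   print(suggest_heap_mib(peak_heap_mib(executors)))
```

Note that for a job that already failed with OOM, the observed peak is capped by the configured heap, so it is better read as a lower bound than as a sizing target.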
-----Original Message-----
From: Arthur Li <[email protected]>
Sent: Thursday, December 23, 2021 15:11
To: [email protected]
Subject: How to estimate the executor memory size according by the data

Dear experts,

Recently some of my demo jobs, which consume data from a Hive database, have been hitting OOM errors. I know I can increase the executor memory size to eliminate the OOM error, but I don't know how to assess how much executor memory is needed, or how to adapt the executor memory size automatically to the data size.

Any suggestions are appreciated.

Arthur Li

---------------------------------------------------------------------
To unsubscribe e-mail: [email protected]
