umartin commented on issue #1040: URL: https://github.com/apache/sedona/issues/1040#issuecomment-1739141695
With a custom build of v1.4.1 in which the metrics are removed or replaced by a LongAccumulator, there is no regression in memory use.

I think the custom metric class in Sedona is built on a misconception. Spark already tracks accumulators per task, so there is no need for a map accumulator. The Sedona Metrics class seems to have a large memory overhead, especially when there are a large number of tasks.

Current metrics in Spark UI (leading to OOM for many tasks):

Accumulator summary: [screenshot]

Task details: [screenshot]

With LongAccumulator (no memory overhead):

Task details: [screenshot]

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
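To illustrate the memory argument in the comment, here is a minimal toy model in plain Python. It is not Spark or Sedona code. It assumes, hypothetically, that the map accumulator keeps one entry per task, and it compares the size of the merged state against a single running sum of the kind a long counter keeps. The function names (`merge_map_accumulators`, `merge_long_accumulators`, `simulate`) are made up for this sketch.

```python
# Toy model only: this is NOT Spark or Sedona code.
# Assumption: the map-style accumulator stores one entry per task id.


def merge_map_accumulators(task_updates):
    """Merge per-task {task_id: count} maps into one map (hypothetical map accumulator)."""
    merged = {}
    for update in task_updates:
        for task_id, count in update.items():
            merged[task_id] = merged.get(task_id, 0) + count
    return merged


def merge_long_accumulators(task_updates):
    """Merge per-task counts into a single running sum (long-counter style)."""
    total = 0
    for count in task_updates:
        total += count
    return total


def simulate(num_tasks, rows_per_task):
    """Return the merged map state and the merged long total for num_tasks tasks."""
    map_updates = [{task_id: rows_per_task} for task_id in range(num_tasks)]
    long_updates = [rows_per_task] * num_tasks
    return merge_map_accumulators(map_updates), merge_long_accumulators(long_updates)


if __name__ == "__main__":
    for num_tasks in (10, 1000, 100000):
        merged_map, total = simulate(num_tasks, rows_per_task=5)
        # The map holds one entry per task; the long counter is a single number.
        print(num_tasks, len(merged_map), total)
```

In this model the merged map grows linearly with the number of tasks, while the long counter stays a single value that carries the same total. Whether the real Sedona metric behaves exactly this way is an assumption of the sketch, not something the comment confirms.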
