jpcorreia99 commented on PR #45240: URL: https://github.com/apache/spark/pull/45240#issuecomment-1992568472
Hey @mridulm, thanks for commenting!

> would like to understand the usecase better here - It is still unclear to me what characteristics you are shooting for by this PR.

I recognize it's a bit of a contrived scenario. We give users the ability to write Spark code while we handle all of the infrastructure provisioning. We have defaults for the JVM size but allow users to override them. In Spark, the default rule for overhead (off-heap) allocation is 10% of the JVM heap, configurable via the `memoryOverheadFactor` config.

Let's now say that, as part of the infrastructure provisioning, we have to perform certain off-heap operations that we know will need 500MB of off-heap memory. We can:
- Directly set the `memoryOverhead` config. But here we risk actually giving the user less overhead memory than they would otherwise be given (e.g. if the user requested a 10GB JVM, with this conf they'd only get 500MB instead of the 1GB (10 * 0.1) they'd get otherwise).
- Set the `memoryOverheadFactor` to give us the desired value. But imagine a scenario where the default JVM allocation is 1GB: to get 500MB of off-heap we'd need a factor of 0.5. If the user then decided to request a 10GB JVM instead, we'd also be allocating 5GB of off-heap memory, which could be very wasteful.

That's why I'm proposing this config: a way to let off-heap memory allocation scale as it usually does, while guaranteeing that we have *at least* a certain configurable amount of off-heap memory! (I've added a rough sketch of the intended semantics at the end of this comment.)

> Reduction in OOM is mentioned https://github.com/apache/spark/pull/45240#issuecomment-1976891862 - but overhead does not impact that, heap memory does.

In our case, the OOMs we got were from the Kubernetes OOM killer, not JVM ones. They originated from the off-heap processes consuming too much memory and k8s killing the pod. So, in this scenario, increasing the overhead actually did stop the OOMs.

> Any increase in overhead memory implies reduction in available heap memory if the max container memory has a upper limit and it is at that limit (if container memory is not limited - there is no impact of memory available to user, tune memory and overhead independently).

That's a fair concern. In our case we size the containers based on the memory required for the JVM + overhead, instead of using a fixed-size container, so we don't run into this issue.

> The reason behind the question is, I am trying to understand if the default min (which was estimated quite a while ago) needs to be relooked or not - and understanding why your deployment needs a higher min will help.

The 384MB value does seem a bit arbitrary 😅 but it seems mostly fine. In our case we simply rely on heavy off-heap operations. Some examples would be package managers like conda and pip, ZSTD decompression (the JNI implementation delegates to native operations), and, up until Spark 3.2, the `SparkContext::addArchive` call launching `tar` processes for decompression.
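To make the intended semantics concrete, here is a minimal sketch of how a configurable minimum would combine with the existing factor-based rule. The config/parameter names and the helper below are illustrative assumptions, not the actual implementation in this PR:

```scala
// Rough sketch: overhead = max(heap * factor, hard-coded floor, user-configured floor).
// The name `minOverheadMib` is hypothetical; only the 384 MiB floor and the
// factor-based rule reflect existing Spark behaviour.
object OverheadSketch {
  // Spark's long-standing hard-coded minimum overhead, in MiB.
  val DefaultMinOverheadMib: Long = 384L

  def executorOverheadMib(
      executorMemoryMib: Long,
      overheadFactor: Double,
      minOverheadMib: Long = DefaultMinOverheadMib): Long = {
    math.max(
      (executorMemoryMib * overheadFactor).toLong,
      math.max(DefaultMinOverheadMib, minOverheadMib))
  }

  def main(args: Array[String]): Unit = {
    // 1 GiB heap, default 10% factor, but we require at least 500 MiB of overhead:
    println(executorOverheadMib(1024, 0.10, minOverheadMib = 500))  // 500
    // 10 GiB heap: the factor-based rule (1024 MiB) already exceeds the floor:
    println(executorOverheadMib(10240, 0.10, minOverheadMib = 500)) // 1024
  }
}
```

The point is that the user-configured floor only ever raises the overhead above what the factor-based rule would give: small JVMs get the guaranteed minimum, while large JVMs keep scaling with `memoryOverheadFactor` as they do today.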
