tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r829209348
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
yeah thanks this is what I was also just looking at but I'm not sure how
it was propagated to the executors. I was looking at through the
KubernetesDriverconf somehow or possible through the pod system properties:
MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString.
If you find it let me know, still investigating.
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
ok, I think I see how this is happening:
```
val resolvedDriverSpec = builder.buildFromFeatures(conf, kubernetesClient)
val configMapName = KubernetesClientUtils.configMapNameDriver
val confFilesMap =
KubernetesClientUtils.buildSparkConfDirFilesMap(configMapName,
conf.sparkConf, resolvedDriverSpec.systemProperties)
```
We build the driver spec, which includes the added system properties:
`spec.systemProperties ++ addedSystemProperties`
Added system properties in driver feature steps add the memory overhead
setting there:
```
val additionalProps = mutable.Map(
KUBERNETES_DRIVER_POD_NAME.key -> driverPodName,
"spark.app.id" -> conf.appId,
KUBERNETES_DRIVER_SUBMIT_CHECK.key -> "true",
MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
```
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
ok, I think I see how this is happening:
```
val resolvedDriverSpec = builder.buildFromFeatures(conf, kubernetesClient)
val configMapName = KubernetesClientUtils.configMapNameDriver
val confFilesMap =
KubernetesClientUtils.buildSparkConfDirFilesMap(configMapName,
conf.sparkConf, resolvedDriverSpec.systemProperties)
```
We build the driver spec, which includes the added system properties:
`spec.systemProperties ++ addedSystemProperties`
Added system properties in driver feature steps add the memory overhead
setting there:
```
val additionalProps = mutable.Map(
KUBERNETES_DRIVER_POD_NAME.key -> driverPodName,
"spark.app.id" -> conf.appId,
KUBERNETES_DRIVER_SUBMIT_CHECK.key -> "true",
MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
```
Then the KubernetesClientUtils.buildSparkConfDirFilesMap is called which
propagates it to the executors (I think)
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
this pr changed it to:
- MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
+ DRIVER_MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
And the ExecutorFeatureStep doesn't look at DRIVER_MEMORY_OVERHEAD.
Previous it was looking at the common MEMORY_OVERHEAD_FACTOR.key.
We could change this back so that it would propagate the default setting or
we could do the same logic looking to see if non jvm then use the 0.4. Not
sure which is easier will look some more.
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
this pr changed it to:
```
- MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
+ DRIVER_MEMORY_OVERHEAD_FACTOR.key -> overheadFactor.toString)
```
And the ExecutorFeatureStep doesn't look at DRIVER_MEMORY_OVERHEAD.
Previous it was looking at the common MEMORY_OVERHEAD_FACTOR.key.
We could change this back so that it would propagate the default setting or
we could do the same logic looking to see if non jvm then use the 0.4. Not
sure which is easier will look some more.
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
I think it might be easiest just to propagate it as
MEMORY_OVERHEAD_FACTOR. This way BasicExecutorFeatureStep doesn't need to
recheck if nonjvm resource. So its either explicitly set in
EXECUTOR_MEMORY_OVERHEAD_FACTOR or it uses MEMORY_OVERHEAD_FACTOR which would
either be explicitly set or default to 0.1 for jvm and 0.4 for nonjvm.
let me know if you see a problem with that.
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
I think it might be easiest just to propagate it as
MEMORY_OVERHEAD_FACTOR. This way BasicExecutorFeatureStep doesn't need to
recheck if nonjvm resource. So its either explicitly set in
EXECUTOR_MEMORY_OVERHEAD_FACTOR or it uses MEMORY_OVERHEAD_FACTOR which would
either be explicitly set or default to 0.1 for jvm and 0.4 for nonjvm.
let me know if you see a problem with that.
Need to update docs as well.
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
@Kimahriman do you have time to put up PR to fix?
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
in my mind this is a small bug and it is easily fixed. I disagree with
not a straight forward contribution. If you want to revert in branch 3.3 go
ahead and we can just remerge when fixed
@Kimahriman logic in ExecutorFeatureSTeps stays the same.
EXECUTOR_MEMORY_OVERHEAD_FACTOR is used if its specified, otherwise it falls
back to whatever MEMORY_OVERHEAD_FACTOR is and in this case the only difference
is DriverFeatureStep takes care of setting it to the default when its not
explicitly specified by user.
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
I should also state that if people don't want in 3.3, I'm personally
fine with it just need input from anyone interested in the feature.
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
@Kimahriman sorry, I guess there is one correction to what I say, we
only want MEMORY_OVERHEAD_FACTOR to reflect what it would be if it wasn't set
via DRIVER_MEMORY_OVERHEAD_FACTOR
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
I realized that might get ugly, I'll put up a pr shortly
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
##########
@@ -59,11 +59,16 @@ private[spark] class BasicExecutorFeatureStep(
private val isDefaultProfile = resourceProfile.id ==
ResourceProfile.DEFAULT_RESOURCE_PROFILE_ID
private val isPythonApp = kubernetesConf.get(APP_RESOURCE_TYPE) ==
Some(APP_RESOURCE_TYPE_PYTHON)
private val disableConfigMap =
kubernetesConf.get(KUBERNETES_EXECUTOR_DISABLE_CONFIGMAP)
+ private val memoryOverheadFactor = if
(kubernetesConf.contains(EXECUTOR_MEMORY_OVERHEAD_FACTOR)) {
+ kubernetesConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
+ } else {
+ kubernetesConf.get(MEMORY_OVERHEAD_FACTOR)
+ }
Review comment:
https://github.com/apache/spark/pull/35900 revert pr
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]