This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 2bca4426e6dd [SPARK-45447][MLLIB][TESTS] Reduce the memory required to start the `LocalClusterSparkContext` in the `mllib` module test cases
2bca4426e6dd is described below
commit 2bca4426e6dd830161ca518e22ca3ffb955fa24c
Author: YangJie <[email protected]>
AuthorDate: Sat Oct 7 15:37:37 2023 -0700
[SPARK-45447][MLLIB][TESTS] Reduce the memory required to start the `LocalClusterSparkContext` in the `mllib` module test cases
### What changes were proposed in this pull request?
This PR aims to reduce the chance of GA test tasks being killed by
decreasing the memory required to start `LocalClusterSparkContext`.
### Why are the changes needed?
The GA test task for mllib is sometimes killed, and the suites that most
often get killed are those that inherit from
`LocalClusterSparkContext`, such as:
- LBFGSClusterSuite:
https://github.com/apache/spark/actions/runs/6421930285/job/17437383223
- NaiveBayesClusterSuite:
https://github.com/apache/spark/actions/runs/6436807001/job/17480899578
- LassoClusterSuite:
https://github.com/apache/spark/actions/runs/6362618008/job/17289887813
- LBFGSClusterSuite:
https://github.com/apache/spark/actions/runs/6346972272/job/17289897396
So this PR works around the issue by reducing the startup memory required
for `LocalClusterSparkContext`.
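For reference, Spark's `local-cluster[N, C, M]` test master string encodes the number of workers, the cores per worker, and the memory per worker in MB, so lowering the third field is what reduces the memory each test worker requests at startup. The sketch below is a hypothetical parser written purely for illustration (it is not Spark's own master-URL parsing code) that shows how such a string decomposes:

```scala
// Hypothetical helper for illustration only -- not Spark's actual parser.
// Decomposes a master string of the form
// local-cluster[numWorkers, coresPerWorker, memoryPerWorkerMB].
object LocalClusterMaster {
  private val Pattern =
    """local-cluster\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]""".r

  def parse(master: String): Option[(Int, Int, Int)] = master match {
    case Pattern(workers, cores, memoryMb) =>
      Some((workers.toInt, cores.toInt, memoryMb.toInt))
    case _ => None // not a local-cluster master string
  }
}
```

Read this way, `local-cluster[2, 1, 512]` asks for two single-core workers with 512 MB each, half of the 1024 MB per worker used before this change.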
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Should monitor GA
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #43242 from LuciferYang/mllib-mem.
Lead-authored-by: YangJie <[email protected]>
Co-authored-by: yangjie01 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
.../scala/org/apache/spark/mllib/util/LocalClusterSparkContext.scala | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mllib/src/test/scala/org/apache/spark/mllib/util/LocalClusterSparkContext.scala b/mllib/src/test/scala/org/apache/spark/mllib/util/LocalClusterSparkContext.scala
index 79d4785fd6fa..892d2f627425 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/util/LocalClusterSparkContext.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/util/LocalClusterSparkContext.scala
@@ -20,6 +20,7 @@ package org.apache.spark.mllib.util
import org.scalatest.{BeforeAndAfterAll, Suite}
import org.apache.spark.{SparkConf, SparkContext}
+import org.apache.spark.internal.config.EXECUTOR_MEMORY
import org.apache.spark.internal.config.Network.RPC_MESSAGE_MAX_SIZE
trait LocalClusterSparkContext extends BeforeAndAfterAll { self: Suite =>
@@ -28,8 +29,9 @@ trait LocalClusterSparkContext extends BeforeAndAfterAll { self: Suite =>
override def beforeAll(): Unit = {
super.beforeAll()
val conf = new SparkConf()
- .setMaster("local-cluster[2, 1, 1024]")
+ .setMaster("local-cluster[2, 1, 512]")
.setAppName("test-cluster")
+ .set(EXECUTOR_MEMORY.key, "512m")
.set(RPC_MESSAGE_MAX_SIZE, 1) // set to 1MB to detect direct serialization of data
sc = new SparkContext(conf)
}
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]