This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 2bca4426e6dd [SPARK-45447][MLLIB][TESTS] Reduce the memory required to start the `LocalClusterSparkContext` in the `mllib` module test cases
2bca4426e6dd is described below
commit 2bca4426e6dd830161ca518e22ca3ffb955fa24c
Author: YangJie <[email protected]>
AuthorDate: Sat Oct 7 15:37:37 2023 -0700
[SPARK-45447][MLLIB][TESTS] Reduce the memory required to start the `LocalClusterSparkContext` in the `mllib` module test cases
### What changes were proposed in this pull request?
This PR aims to reduce the chance of GA test tasks being killed by
decreasing the memory required to start `LocalClusterSparkContext`.
### Why are the changes needed?
The GA test task for mllib is sometimes killed, and the suites that most
often get killed are those that inherit from
`LocalClusterSparkContext`, such as:
- LBFGSClusterSuite:
https://github.com/apache/spark/actions/runs/6421930285/job/17437383223
- NaiveBayesClusterSuite:
https://github.com/apache/spark/actions/runs/6436807001/job/17480899578
- LassoClusterSuite:
https://github.com/apache/spark/actions/runs/6362618008/job/17289887813
- LBFGSClusterSuite:
https://github.com/apache/spark/actions/runs/6346972272/job/17289897396
So this PR works around the issue by reducing the startup memory required
for `LocalClusterSparkContext`.
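For reference, Spark's `local-cluster[N, C, M]` test master string encodes the number of workers, the cores per worker, and the memory per worker in MB, so lowering the third field is what reduces the memory each test worker requests at startup. The sketch below is a hypothetical parser written purely for illustration (it is not Spark's own master-URL parsing code) that shows how such a string decomposes:

```scala
// Hypothetical helper for illustration only -- not Spark's actual parser.
// Decomposes a master string of the form
// local-cluster[numWorkers, coresPerWorker, memoryPerWorkerMB].
object LocalClusterMaster {
  private val Pattern =
    """local-cluster\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]""".r

  def parse(master: String): Option[(Int, Int, Int)] = master match {
    case Pattern(workers, cores, memoryMb) =>
      Some((workers.toInt, cores.toInt, memoryMb.toInt))
    case _ => None // not a local-cluster master string
  }
}
```

Read this way, `local-cluster[2, 1, 512]` asks for two single-core workers with 512 MB each, half of the 1024 MB per worker used before this change.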
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Should monitor GA
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #43242 from LuciferYang/mllib-mem.
Lead-authored-by: YangJie <[email protected]>
Co-authored-by: yangjie01 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
.../scala/org/apache/spark/mllib/util/LocalClusterSparkContext.scala | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mllib/src/test/scala/org/apache/spark/mllib/util/LocalClusterSparkContext.scala b/mllib/src/test/scala/org/apache/spark/mllib/util/LocalClusterSparkContext.scala
index 79d4785fd6fa..892d2f627425 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/util/LocalClusterSparkContext.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/util/LocalClusterSparkContext.scala
@@ -20,6 +20,7 @@ package org.apache.spark.mllib.util
import org.scalatest.{BeforeAndAfterAll, Suite}
import org.apache.spark.{SparkConf, SparkContext}
+import org.apache.spark.internal.config.EXECUTOR_MEMORY
import org.apache.spark.internal.config.Network.RPC_MESSAGE_MAX_SIZE
trait LocalClusterSparkContext extends BeforeAndAfterAll { self: Suite =>
@@ -28,8 +29,9 @@ trait LocalClusterSparkContext extends BeforeAndAfterAll { self: Suite =>
override def beforeAll(): Unit = {
super.beforeAll()
val conf = new SparkConf()
- .setMaster("local-cluster[2, 1, 1024]")
+ .setMaster("local-cluster[2, 1, 512]")
.setAppName("test-cluster")
+ .set(EXECUTOR_MEMORY.key, "512m")
.set(RPC_MESSAGE_MAX_SIZE, 1) // set to 1MB to detect direct serialization of data
sc = new SparkContext(conf)
}
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]