This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 71969fb9958 [SPARK-43231][ML][PYTHON][CONNECT][TESTS] Reduce the 
memory requirement in torch-related tests
71969fb9958 is described below

commit 71969fb9958f0022aabcad36c89a461029cb3b8c
Author: Ruifeng Zheng <ruife...@foxmail.com>
AuthorDate: Tue Apr 25 15:28:05 2023 +0800

    [SPARK-43231][ML][PYTHON][CONNECT][TESTS] Reduce the memory requirement in 
torch-related tests
    
    ### What changes were proposed in this pull request?
    Reduce the memory requirement in torch-related tests
    
    ### Why are the changes needed?
    The computation in the torch distributor actually happens in external torch 
processes, and the GitHub Actions resources are very limited; this PR tries to 
make the related tests more stable
    
    ### Does this PR introduce _any_ user-facing change?
    no, test-only
    
    ### How was this patch tested?
    CI; let me keep merging upcoming commits from master to see whether this 
change is stable enough
    
    Closes #40874 from zhengruifeng/torch_reduce_memory.
    
    Lead-authored-by: Ruifeng Zheng <ruife...@foxmail.com>
    Co-authored-by: Ruifeng Zheng <ruife...@apache.org>
    Signed-off-by: Ruifeng Zheng <ruife...@apache.org>
---
 python/pyspark/ml/tests/connect/test_parity_torch_distributor.py | 4 ++--
 python/pyspark/ml/torch/tests/test_distributor.py                | 8 ++++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/python/pyspark/ml/tests/connect/test_parity_torch_distributor.py 
b/python/pyspark/ml/tests/connect/test_parity_torch_distributor.py
index 55ea99a6540..b855332f96c 100644
--- a/python/pyspark/ml/tests/connect/test_parity_torch_distributor.py
+++ b/python/pyspark/ml/tests/connect/test_parity_torch_distributor.py
@@ -64,7 +64,7 @@ class TorchDistributorLocalUnitTestsOnConnect(
         builder = builder.config(
             "spark.driver.resource.gpu.discoveryScript", 
cls.gpu_discovery_script_file_name
         )
-        cls.spark = builder.remote("local-cluster[2,2,1024]").getOrCreate()
+        cls.spark = builder.remote("local-cluster[2,2,512]").getOrCreate()
 
     @classmethod
     def tearDownClass(cls):
@@ -126,7 +126,7 @@ class TorchDistributorDistributedUnitTestsOnConnect(
         builder = builder.config(
             "spark.worker.resource.gpu.discoveryScript", 
cls.gpu_discovery_script_file_name
         )
-        cls.spark = builder.remote("local-cluster[2,2,1024]").getOrCreate()
+        cls.spark = builder.remote("local-cluster[2,2,512]").getOrCreate()
 
     @classmethod
     def tearDownClass(cls):
diff --git a/python/pyspark/ml/torch/tests/test_distributor.py 
b/python/pyspark/ml/torch/tests/test_distributor.py
index ebd859031bd..9fd0b4cba94 100644
--- a/python/pyspark/ml/torch/tests/test_distributor.py
+++ b/python/pyspark/ml/torch/tests/test_distributor.py
@@ -148,6 +148,8 @@ def get_local_mode_conf():
     return {
         "spark.test.home": SPARK_HOME,
         "spark.driver.resource.gpu.amount": "3",
+        "spark.driver.memory": "512M",
+        "spark.executor.memory": "512M",
     }
 
 
@@ -158,6 +160,8 @@ def get_distributed_mode_conf():
         "spark.task.cpus": "2",
         "spark.task.resource.gpu.amount": "1",
         "spark.executor.resource.gpu.amount": "1",
+        "spark.driver.memory": "512M",
+        "spark.executor.memory": "512M",
     }
 
 
@@ -412,7 +416,7 @@ class 
TorchDistributorLocalUnitTests(TorchDistributorLocalUnitTestsMixin, unitte
             "spark.driver.resource.gpu.discoveryScript", 
cls.gpu_discovery_script_file_name
         )
 
-        sc = SparkContext("local-cluster[2,2,1024]", cls.__name__, conf=conf)
+        sc = SparkContext("local-cluster[2,2,512]", cls.__name__, conf=conf)
         cls.spark = SparkSession(sc)
 
     @classmethod
@@ -502,7 +506,7 @@ class TorchDistributorDistributedUnitTests(
             "spark.worker.resource.gpu.discoveryScript", 
cls.gpu_discovery_script_file_name
         )
 
-        sc = SparkContext("local-cluster[2,2,1024]", cls.__name__, conf=conf)
+        sc = SparkContext("local-cluster[2,2,512]", cls.__name__, conf=conf)
         cls.spark = SparkSession(sc)
 
     @classmethod


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

Reply via email to