This is an automated email from the ASF dual-hosted git repository.

shaneknapp pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 7b90fd2  [SPARK-35430][K8S] Switch on "PVs with local storage" integration test on Docker driver
7b90fd2 is described below

commit 7b90fd2ca79b9a1fec5fca0bdcc169c7962ad880
Author: attilapiros <piros.attila.zs...@gmail.com>
AuthorDate: Mon Aug 2 09:17:29 2021 -0700

    [SPARK-35430][K8S] Switch on "PVs with local storage" integration test on Docker driver
    
    ### What changes were proposed in this pull request?
    
    Switching the "PVs with local storage" integration test back on for the Docker driver.
    
    I have analyzed why this test was failing on my machine (I hope the root cause of the problem is OS agnostic).
    It failed because the host directory was mounted into the Minikube node with `--uid=185` (the Spark user's UID):
    
    ```
    $ minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} --9p-version=9p2000.L --gid=0 --uid=185 &; MOUNT_PID=$!
    ```
    
    This UID refers to a nonexistent user. See the number of occurrences of 185 in `/etc/passwd`:
    
    ```
    $ minikube ssh "grep -c 185 /etc/passwd"
    0
    ```
    
    This leads to a permission denied error. Skipping `--uid=185` won't help, although the path will be listable before the test execution:
    
    ```
    ╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-35430*›
    ╰─$ 📁  Mounting host path /var/folders/t_/fr_vqcyx23vftk81ftz1k5hw0000gn/T/tmp.k9X4Gecv into VM as /var/folders/t_/fr_vqcyx23vftk81ftz1k5hw0000gn/T/tmp.k9X4Gecv ...
        ▪ Mount type:
        ▪ User ID:      docker
        ▪ Group ID:     0
        ▪ Version:      9p2000.L
        ▪ Message Size: 262144
        ▪ Permissions:  755 (-rwxr-xr-x)
        ▪ Options:      map[]
        ▪ Bind Address: 127.0.0.1:51740
    🚀  Userspace file server: ufs starting
    
    ╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-35430*›
    ╰─$ minikube ssh "ls /var/folders/t_/fr_vqcyx23vftk81ftz1k5hw0000gn/T/tmp.k9X4Gecv"
    ╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-35430*›
    ╰─$
    ```
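
    For reference, the effective 9p mount parameters can also be inspected from inside the Minikube VM. This is a rough sketch, not taken from the original log; the exact `mount` output format depends on the VM image:

    ```
    $ minikube ssh "mount | grep 9p"
    ```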
    
    But the test will still fail, and after its execution `dmesg` shows the following errors:
    ```
    [13670.493359] bpfilter: Loaded bpfilter_umh pid 66153
    [13670.493363] bpfilter: write fail -32
    [13670.530737] bpfilter: Loaded bpfilter_umh pid 66155
    ...
    ```
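
    These kernel messages can be pulled out of the Minikube VM after a failed run; the exact invocation below is an assumption, not taken from the original log:

    ```
    $ minikube ssh "sudo dmesg" | grep bpfilter
    ```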
    
    This `bpfilter` is a firewall module, and we are back to a permission denied error when we try to list the mounted directory.
    
    The solution is to add a `spark` user with UID 185 when Minikube is started.
    
    **So this must be added to the Jenkins job (and the mount should use `--gid=0 --uid=185`)**:
    
    ```
    $ minikube ssh "sudo useradd spark -u 185 -g 0 -m -s /bin/bash"
    ```
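
    Putting the pieces together, the fixed setup looks roughly like this (a sketch assembled from the commands above; the `id -nu 185` check and its output are only an illustrative way to verify the user exists):

    ```
    # create the spark user (UID 185, GID 0) inside the Minikube node
    $ minikube ssh "sudo useradd spark -u 185 -g 0 -m -s /bin/bash"

    # verify that UID 185 now resolves to a user
    $ minikube ssh "id -nu 185"
    spark

    # mount the host path with the matching uid/gid mapping
    $ minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} --9p-version=9p2000.L --gid=0 --uid=185 &; MOUNT_PID=$!
    ```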
    
    ### Why are the changes needed?
    
    This integration test is needed to validate the PVs feature.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No. This change only affects tests.
    
    ### How was this patch tested?
    
    Running the test locally:
    ```
    KubernetesSuite:
    - Run SparkPi with no resources
    - Run SparkPi with a very long application name.
    - Use SparkLauncher.NO_RESOURCE
    - Run SparkPi with a master URL without a scheme.
    - Run SparkPi with an argument.
    - Run SparkPi with custom labels, annotations, and environment variables.
    - All pods have the same service account by default
    - Run extraJVMOptions check on driver
    - Run SparkRemoteFileTest using a remote data file
    - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
    - Run SparkPi with env and mount secrets.
    - Run PySpark on simple pi.py example
    - Run PySpark to test a pyfiles example
    - Run PySpark with memory customization
    - Run in client mode.
    - Start pod creation from template
    - PVs with local storage
    ```
    
    The "PVs with local storage" was successful but the next test `Launcher 
client dependencies` the minio stops the test executions on Mac (only on Mac):
    ```
    21/06/29 04:33:32.449 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: 🏃  Starting tunnel for service minio-s3.
    21/06/29 04:33:33.425 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: |----------------------------------|----------|-------------|------------------------|
    21/06/29 04:33:33.426 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: |            NAMESPACE             |   NAME   | TARGET PORT |          URL           |
    21/06/29 04:33:33.426 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: |----------------------------------|----------|-------------|------------------------|
    21/06/29 04:33:33.426 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: | 7855c37ca34340c49a98aa8439f4935c | minio-s3 |             | http://127.0.0.1:62138 |
    21/06/29 04:33:33.426 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: |----------------------------------|----------|-------------|------------------------|
    21/06/29 04:33:33.449 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: http://127.0.0.1:62138
    21/06/29 04:33:33.449 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: ❗  Because you are using a Docker driver on darwin, the terminal needs to be open to run it.
    ```
    This is a different problem, caused by a Docker Desktop limitation (https://docs.docker.com/docker-for-mac/networking/#per-container-ip-addressing-is-not-possible).
    
    Of course, with the default driver on Mac (hyperkit), all the tests pass:
    ```
    [INFO] --- scalatest-maven-plugin:2.0.0:test (integration-test)  spark-kubernetes-integration-tests_2.12 ---
    Discovery starting.
    Discovery completed in 498 milliseconds.
    Run starting. Expected test count is: 26
    KubernetesSuite:
    - Run SparkPi with no resources
    - Run SparkPi with a very long application name.
    - Use SparkLauncher.NO_RESOURCE
    - Run SparkPi with a master URL without a scheme.
    - Run SparkPi with an argument.
    - Run SparkPi with custom labels, annotations, and environment variables.
    - All pods have the same service account by default
    - Run extraJVMOptions check on driver
    - Run SparkRemoteFileTest using a remote data file
    - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
    - Run SparkPi with env and mount secrets.
    - Run PySpark on simple pi.py example
    - Run PySpark to test a pyfiles example
    - Run PySpark with memory customization
    - Run in client mode.
    - Start pod creation from template
    - PVs with local storage
    - Launcher client dependencies
    - SPARK-33615: Launcher client archives
    - SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
    - SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
    - Launcher python client dependencies using a zip file
    - Test basic decommissioning
    - Test basic decommissioning with shuffle cleanup
    - Test decommissioning with dynamic allocation & shuffle cleanups
    - Test decommissioning timeouts
    ...
    [INFO] BUILD SUCCESS
    ```
    
    Closes #32793 from attilapiros/SPARK-35430.
    
    Authored-by: attilapiros <piros.attila.zs...@gmail.com>
    Signed-off-by: shane knapp <incompl...@gmail.com>
---
 .../org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala   | 1 -
 .../org/apache/spark/deploy/k8s/integrationtest/PVTestsSuite.scala      | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala b/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
index 5007171..d65f594 100644
--- a/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
+++ b/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesSuite.scala
@@ -566,7 +566,6 @@ class KubernetesSuite extends SparkFunSuite
 
 private[spark] object KubernetesSuite {
   val k8sTestTag = Tag("k8s")
-  val pvTestTag = Tag("persistentVolume")
   val rTestTag = Tag("r")
   val MinikubeTag = Tag("minikube")
   val SPARK_PI_MAIN_CLASS: String = "org.apache.spark.examples.SparkPi"
diff --git a/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PVTestsSuite.scala b/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PVTestsSuite.scala
index 2f1a7aa..86f8cdd 100644
--- a/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PVTestsSuite.scala
+++ b/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/PVTestsSuite.scala
@@ -122,7 +122,7 @@ private[spark] trait PVTestsSuite { k8sSuite: KubernetesSuite =>
     }
   }
 
-  test("PVs with local storage", pvTestTag, MinikubeTag) {
+  test("PVs with local storage", k8sTestTag, MinikubeTag) {
     sparkAppConf
       .set(s"spark.kubernetes.driver.volumes.persistentVolumeClaim.data.mount.path",
         CONTAINER_MOUNT_PATH)
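
With `pvTestTag` removed, "PVs with local storage" is now selected by the same tag filters as the rest of the suite. As a hedged example, tag-based filtering through scalatest-maven-plugin would look roughly like this; the `test.include.tags` property name is an assumption about how Spark's build wires up scalatest's tag filters:

```
# hypothetical invocation: run only the k8s-tagged integration tests
$ mvn integration-test -pl :spark-kubernetes-integration-tests_2.12 -Dtest.include.tags=k8s
```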

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
