[spark] branch branch-3.0 updated: [SPARK-31244][K8S][TEST] Use Minio instead of Ceph in K8S DepsTestsSuite

2020-03-25 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 53221cd  [SPARK-31244][K8S][TEST] Use Minio instead of Ceph in K8S DepsTestsSuite
53221cd is described below

commit 53221cda408e9be5d0d2ff5946c200cb43647dd9
Author: Dongjoon Hyun 
AuthorDate: Wed Mar 25 12:38:15 2020 -0700

[SPARK-31244][K8S][TEST] Use Minio instead of Ceph in K8S DepsTestsSuite

### What changes were proposed in this pull request?

This PR (SPARK-31244) replaces `Ceph` with `Minio` in the K8S `DepsTestsSuite`.

### Why are the changes needed?

Currently, `DepsTestsSuite` uses `ceph` for S3 storage. However, the 
version in use and all newer releases are broken on new `minikube` releases. We 
should use a smaller and more robust alternative.

```
$ minikube version
minikube version: v1.8.2

$ minikube -p minikube docker-env | source

$ docker run -it --rm -e NETWORK_AUTO_DETECT=4 -e RGW_FRONTEND_PORT=8000 \
    -e SREE_PORT=5001 -e CEPH_DEMO_UID=nano -e CEPH_DAEMON=demo \
    ceph/daemon:v4.0.3-stable-4.0-nautilus-centos-7-x86_64 /bin/sh
2020-03-25 04:26:21  /opt/ceph-container/bin/entrypoint.sh: ERROR- it looks like we have not been able to discover the network settings

$ docker run -it --rm -e NETWORK_AUTO_DETECT=4 -e RGW_FRONTEND_PORT=8000 \
    -e SREE_PORT=5001 -e CEPH_DEMO_UID=nano -e CEPH_DAEMON=demo \
    ceph/daemon:v4.0.11-stable-4.0-nautilus-centos-7 /bin/sh
2020-03-25 04:20:30  /opt/ceph-container/bin/entrypoint.sh: ERROR- it looks like we have not been able to discover the network settings
```

Also, the `ceph` image is unnecessarily large (almost `1GB`) and still growing, 
while the `minio` image is only `55.8MB` and offers the same features.
```
$ docker images | grep ceph
ceph/daemon   v4.0.3-stable-4.0-nautilus-centos-7-x86_64   a6a05ccdf924   6 months ago   852MB
ceph/daemon   v4.0.11-stable-4.0-nautilus-centos-7         87f695550d8e   12 hours ago   901MB

$ docker images | grep minio
minio/minio   latest   95c226551ea6   5 days ago   55.8MB
```
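
To put the size difference in perspective, the following is a minimal sketch that computes how many times larger each `ceph` image is than the `minio` image. The sizes are copied from the listing above; the variable names and the `size_ratio` helper are hypothetical and not part of this PR.

```shell
# Image sizes in MB, copied from the `docker images` listing above.
ceph_old_mb=852
ceph_new_mb=901
minio_mb=55.8

# Hypothetical helper: print size $1 divided by size $2, to one decimal place.
size_ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

size_ratio "$ceph_old_mb" "$minio_mb"
size_ratio "$ceph_new_mb" "$minio_mb"
```

With these figures, both `ceph` images come out at roughly 15 to 16 times the size of the `minio` image.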

### Does this PR introduce any user-facing change?

No. (This is a test case change)

### How was this patch tested?

Pass the existing Jenkins K8s integration test job and test with the latest 
minikube.
```
$ minikube version
minikube version: v1.8.2

$ kubectl version --short
Client Version: v1.17.4
Server Version: v1.17.4

$ NO_MANUAL=1 ./dev/make-distribution.sh --r --pip --tgz -Pkubernetes
$ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --spark-tgz $PWD/spark-*.tgz
...
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage *** FAILED *** // This is irrelevant to this PR.
- Launcher client dependencies  // This is the test case fixed by this PR.
- Test basic decommissioning
- Run SparkR on simple dataframe.R example
Run completed in 12 minutes, 4 seconds.
...
```

The following is a snapshot of the `DepsTestsSuite` test while it is running.
```
$ kubectl get all -ncf9438dd8a65436686b1196a6b73000f
NAME                                                  READY   STATUS    RESTARTS   AGE
pod/minio-0                                           1/1     Running   0          70s
pod/spark-test-app-8494bddca3754390b9e59a2ef47584eb   1/1     Running   0          55s

NAME                                                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/minio-s3                                     NodePort    10.109.54.180   <none>        9000:30678/TCP               70s
service/spark-test-app-fd916b711061c7b8-driver-svc   ClusterIP   None            <none>        7078/TCP,7079/TCP,4040/TCP   55s

NAME                     READY   AGE
statefulset.apps/minio   1/1     70s
```

Closes #28015 from dongjoon-hyun/SPARK-31244.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 
