This is an automated email from the ASF dual-hosted git repository.
pingsutw pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/submarine.git
The following commit(s) were added to refs/heads/master by this push:
new 772875f SUBMARINE-728. The experiment example will cause OOM error.
772875f is described below
commit 772875f731a3db99175db1d218b8b6acb2486a83
Author: Kai-Hsun Chen <[email protected]>
AuthorDate: Thu Feb 11 14:56:23 2021 +0800
SUBMARINE-728. The experiment example will cause OOM error.
### What is this PR for?
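The experiment examples in `website/docs/api/experiment.md` request only 512M of memory for each Ps and Worker replica, which causes an OOM error when they are run. This PR raises the requests to 1024M for Ps and 2048M for Worker.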
### What type of PR is it?
[Improvement]
### Todos
### What is the Jira issue?
https://issues.apache.org/jira/projects/SUBMARINE/issues/SUBMARINE-728?filter=myopenissues
### How should this be tested?
* Step1: Create an experiment
```
curl -X POST -H "Content-Type: application/json" -d '
{
"meta": {
"name": "tf-mnist-json",
"namespace": "default",
"framework": "TensorFlow",
"cmd": "python /var/tf_mnist/mnist_with_summaries.py
--log_dir=/train/log --learning_rate=0.01 --batch_size=150",
"envVars": {
"ENV_1": "ENV1"
}
},
"environment": {
"image": "apache/submarine:tf-mnist-with-summaries-1.0"
},
"spec": {
"Ps": {
"replicas": 1,
"resources": "cpu=1,memory=1024M"
},
"Worker": {
"replicas": 1,
"resources": "cpu=1,memory=2048M"
}
}
}
' http://127.0.0.1:8080/api/v1/experiment
```
* Step2: Check the status of the Pods
```
kubectl get pods
```
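For Step2, the thing to look for is whether any container was terminated with reason `OOMKilled`. The following is a minimal Python sketch of that check, not part of this PR: it assumes `kubectl` is on the `PATH` and that the experiment Pods run in the `default` namespace, and the helper names `fetch_pods` and `oom_killed_containers` are hypothetical.

```python
import json
import subprocess


def oom_killed_containers(pod_list):
    """Return (pod name, container name) pairs whose current or
    previous state is a termination with reason OOMKilled.

    `pod_list` is the parsed JSON printed by `kubectl get pods -o json`.
    """
    hits = []
    for pod in pod_list.get("items", []):
        pod_name = pod.get("metadata", {}).get("name", "")
        # containerStatuses is absent while a Pod is still pending.
        for cs in pod.get("status", {}).get("containerStatuses", []):
            for key in ("state", "lastState"):
                terminated = cs.get(key, {}).get("terminated")
                if terminated and terminated.get("reason") == "OOMKilled":
                    hits.append((pod_name, cs.get("name", "")))
                    break  # report each container at most once
    return hits


def fetch_pods(namespace="default"):
    """Run kubectl and return the Pod list as a dict (assumes kubectl is installed)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `oom_killed_containers(fetch_pods())` after the experiment has been running for a while should return an empty list if the new memory settings are sufficient.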
### Screenshots (if appropriate)
### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No
Author: Kai-Hsun Chen <[email protected]>
Signed-off-by: Kevin <[email protected]>
Closes #510 from kevin85421/SUBMARINE-728 and squashes the following commits:
2d5ec8ee [Kai-Hsun Chen] SUBMARINE-728. The experiment example will cause OOM error.
---
website/docs/api/experiment.md | 36 ++++++++++++++++++------------------
1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/website/docs/api/experiment.md b/website/docs/api/experiment.md
index b000a42..8f77fc3 100644
--- a/website/docs/api/experiment.md
+++ b/website/docs/api/experiment.md
@@ -46,11 +46,11 @@ curl -X POST -H "Content-Type: application/json" -d '
"spec": {
"Ps": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=1024M"
},
"Worker": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=2048M"
}
}
}
@@ -84,11 +84,11 @@ curl -X POST -H "Content-Type: application/json" -d '
"spec": {
"Ps": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=1024M"
},
"Worker": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=2048M"
}
}
}
@@ -118,11 +118,11 @@ curl -X POST -H "Content-Type: application/json" -d '
"spec": {
"Ps": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=1024M"
},
"Worker": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=2048M"
}
}
}
@@ -157,11 +157,11 @@ Above example assume environment "my-submarine-env" already exists in Submarine.
"spec": {
"Ps": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=1024M"
},
"Worker": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=2048M"
}
}
}
@@ -205,11 +205,11 @@ curl -X GET http://127.0.0.1:8080/api/v1/experiment
"spec": {
"Ps": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=1024M"
},
"Worker": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=2048M"
}
}
}
@@ -284,11 +284,11 @@ curl -X GET http://127.0.0.1:8080/api/v1/experiment/experiment_1592057447228_000
"spec": {
"Ps": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=1024M"
},
"Worker": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=2048M"
}
}
}
@@ -318,11 +318,11 @@ curl -X PATCH -H "Content-Type: application/json" -d '
"spec": {
"Ps": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=1024M"
},
"Worker": {
"replicas": 2,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=2048M"
}
}
}
@@ -351,11 +351,11 @@ curl -X PATCH -H "Content-Type: application/json" -d '
"spec": {
"Ps": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=1024M"
},
"Worker": {
"replicas": 2,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=2048M"
}
}
}
@@ -397,11 +397,11 @@ curl -X DELETE http://127.0.0.1:8080/api/v1/experiment/experiment_1592057447228_
"spec": {
"Ps": {
"replicas": 1,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=1024M"
},
"Worker": {
"replicas": 2,
- "resources": "cpu=1,memory=512M"
+ "resources": "cpu=1,memory=2048M"
}
}
}