This is an automated email from the ASF dual-hosted git repository.

liuxun pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/submarine.git


The following commit(s) were added to refs/heads/master by this push:
     new f49c416  SUBMARINE-333. Docs of submarine server deployment
f49c416 is described below

commit f49c41621d16a3e2982fe53836d14a3c8853009f
Author: Wanqiang Ji <[email protected]>
AuthorDate: Sat Jan 4 23:58:08 2020 +0800

    SUBMARINE-333. Docs of submarine server deployment
    
    ### What is this PR for?
    This doc helps users quick-start the submarine server and train a TensorFlow job on a K8s cluster through the submarine server.
    
    ### What type of PR is it?
    Documentation
    
    ### Todos
    
    ### What is the Jira issue?
    https://issues.apache.org/jira/browse/SUBMARINE-333
    
    ### How should this be tested?
    None
    
    ### Screenshots (if appropriate)
    
    ### Questions:
    * Does the licenses files need update? No
    * Is there breaking changes for older versions? No
    * Does this needs documentation? No
    
    Author: Wanqiang Ji <[email protected]>
    
    Closes #143 from jiwq/SUBMARINE-333 and squashes the following commits:
    
    353199f [Wanqiang Ji] SUBMARINE-333. Docs of submarine server deployment
---
 dev-support/k8s/tfjob/crd.yaml                     |  47 ++++++++++
 .../k8s/tfjob/operator/cluster-role-binding.yaml   |  14 +++
 dev-support/k8s/tfjob/operator/cluster-role.yaml   |  95 ++++++++++++++++++++
 dev-support/k8s/tfjob/operator/deployment.yaml     |  30 +++++++
 dev-support/k8s/tfjob/operator/kustomization.yaml  |  15 ++++
 .../k8s/tfjob/operator/service-account.yaml        |  14 +++
 dev-support/k8s/tfjob/operator/service.yaml        |  19 ++++
 docs/README.md                                     |   3 +
 docs/design/submarine-server/jobspec.md            | 100 +++++++++++++++++++++
 docs/submarine-server/README.md                    |  99 ++++++++++++++++++++
 docs/submarine-server/ml-frameworks/README.md      |  22 +++++
 docs/submarine-server/ml-frameworks/tensorflow.md  |  52 +++++++++++
 docs/submarine-server/setup-kubernetes.md          |  99 ++++++++++++++++++++
 13 files changed, 609 insertions(+)

diff --git a/dev-support/k8s/tfjob/crd.yaml b/dev-support/k8s/tfjob/crd.yaml
new file mode 100644
index 0000000..b693c40
--- /dev/null
+++ b/dev-support/k8s/tfjob/crd.yaml
@@ -0,0 +1,47 @@
+apiVersion: apiextensions.k8s.io/v1beta1
+kind: CustomResourceDefinition
+metadata:
+  name: tfjobs.kubeflow.org
+spec:
+  additionalPrinterColumns:
+  - JSONPath: .status.conditions[-1:].type
+    name: State
+    type: string
+  - JSONPath: .metadata.creationTimestamp
+    name: Age
+    type: date
+  group: kubeflow.org
+  names:
+    kind: TFJob
+    plural: tfjobs
+    singular: tfjob
+  scope: Namespaced
+  subresources:
+    status: {}
+  validation:
+    openAPIV3Schema:
+      properties:
+        spec:
+          properties:
+            tfReplicaSpecs:
+              properties:
+                Chief:
+                  properties:
+                    replicas:
+                      maximum: 1
+                      minimum: 1
+                      type: integer
+                PS:
+                  properties:
+                    replicas:
+                      minimum: 1
+                      type: integer
+                Worker:
+                  properties:
+                    replicas:
+                      minimum: 1
+                      type: integer
+  versions:
+  - name: v1
+    served: true
+    storage: true
diff --git a/dev-support/k8s/tfjob/operator/cluster-role-binding.yaml b/dev-support/k8s/tfjob/operator/cluster-role-binding.yaml
new file mode 100644
index 0000000..e05aad7
--- /dev/null
+++ b/dev-support/k8s/tfjob/operator/cluster-role-binding.yaml
@@ -0,0 +1,14 @@
+---
+apiVersion: rbac.authorization.k8s.io/v1beta1
+kind: ClusterRoleBinding
+metadata:
+  labels:
+    app: tf-job-operator
+  name: tf-job-operator
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: tf-job-operator
+subjects:
+- kind: ServiceAccount
+  name: tf-job-operator
diff --git a/dev-support/k8s/tfjob/operator/cluster-role.yaml b/dev-support/k8s/tfjob/operator/cluster-role.yaml
new file mode 100644
index 0000000..72b2903
--- /dev/null
+++ b/dev-support/k8s/tfjob/operator/cluster-role.yaml
@@ -0,0 +1,95 @@
+---
+apiVersion: rbac.authorization.k8s.io/v1beta1
+kind: ClusterRole
+metadata:
+  labels:
+    app: tf-job-operator
+  name: tf-job-operator
+rules:
+- apiGroups:
+  - kubeflow.org
+  resources:
+  - tfjobs
+  - tfjobs/status
+  verbs:
+  - '*'
+- apiGroups:
+  - apiextensions.k8s.io
+  resources:
+  - customresourcedefinitions
+  verbs:
+  - '*'
+- apiGroups:
+  - ""
+  resources:
+  - pods
+  - services
+  - endpoints
+  - events
+  verbs:
+  - '*'
+- apiGroups:
+  - apps
+  - extensions
+  resources:
+  - deployments
+  verbs:
+  - '*'
+
+---
+
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: kubeflow-tfjobs-admin
+  labels:
+    rbac.authorization.kubeflow.org/aggregate-to-kubeflow-admin: "true"
+aggregationRule:
+  clusterRoleSelectors:
+  - matchLabels:
+      rbac.authorization.kubeflow.org/aggregate-to-kubeflow-tfjobs-admin: "true"
+rules: []
+
+---
+
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: kubeflow-tfjobs-edit
+  labels:
+    rbac.authorization.kubeflow.org/aggregate-to-kubeflow-edit: "true"
+    rbac.authorization.kubeflow.org/aggregate-to-kubeflow-tfjobs-admin: "true"
+rules:
+- apiGroups:
+  - kubeflow.org
+  resources:
+  - tfjobs
+  - tfjobs/status
+  verbs:
+  - get
+  - list
+  - watch
+  - create
+  - delete
+  - deletecollection
+  - patch
+  - update
+
+---
+
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: kubeflow-tfjobs-view
+  labels:
+    rbac.authorization.kubeflow.org/aggregate-to-kubeflow-view: "true"
+rules:
+- apiGroups:
+  - kubeflow.org
+  resources:
+  - tfjobs
+  - tfjobs/status
+  verbs:
+  - get
+  - list
+  - watch
diff --git a/dev-support/k8s/tfjob/operator/deployment.yaml b/dev-support/k8s/tfjob/operator/deployment.yaml
new file mode 100644
index 0000000..bacd44e
--- /dev/null
+++ b/dev-support/k8s/tfjob/operator/deployment.yaml
@@ -0,0 +1,30 @@
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: tf-job-operator
+spec:
+  replicas: 1
+  template:
+    metadata:
+      labels:
+        name: tf-job-operator
+    spec:
+      containers:
+      - command:
+        - /opt/kubeflow/tf-operator.v1
+        - --alsologtostderr
+        - -v=1
+        - --monitoring-port=8443
+        env:
+        - name: MY_POD_NAMESPACE
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.namespace
+        - name: MY_POD_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: metadata.name
+        image: gcr.io/kubeflow-images-public/tf_operator:kubeflow-tf-operator-postsubmit-v1-5adee6f-6109-a25c
+        name: tf-job-operator
+      serviceAccountName: tf-job-operator
diff --git a/dev-support/k8s/tfjob/operator/kustomization.yaml b/dev-support/k8s/tfjob/operator/kustomization.yaml
new file mode 100644
index 0000000..86826d7
--- /dev/null
+++ b/dev-support/k8s/tfjob/operator/kustomization.yaml
@@ -0,0 +1,15 @@
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+namespace: submarine
+resources:
+- cluster-role-binding.yaml
+- cluster-role.yaml
+- deployment.yaml
+- service-account.yaml
+- service.yaml
+commonLabels:
+  kustomize.component: tf-job-operator
+images:
+- name: gcr.io/kubeflow-images-public/tf_operator
+  newName: gcr.io/kubeflow-images-public/tf_operator
+  newTag: v0.7.0
diff --git a/dev-support/k8s/tfjob/operator/service-account.yaml b/dev-support/k8s/tfjob/operator/service-account.yaml
new file mode 100644
index 0000000..c7be4e3
--- /dev/null
+++ b/dev-support/k8s/tfjob/operator/service-account.yaml
@@ -0,0 +1,14 @@
+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  labels:
+    app: tf-job-dashboard
+  name: tf-job-dashboard
+---
+apiVersion: v1
+kind: ServiceAccount
+metadata:
+  labels:
+    app: tf-job-operator
+  name: tf-job-operator
diff --git a/dev-support/k8s/tfjob/operator/service.yaml b/dev-support/k8s/tfjob/operator/service.yaml
new file mode 100644
index 0000000..97f92e3
--- /dev/null
+++ b/dev-support/k8s/tfjob/operator/service.yaml
@@ -0,0 +1,19 @@
+---
+apiVersion: v1
+kind: Service
+metadata:
+  annotations:
+    prometheus.io/path: /metrics
+    prometheus.io/scrape: "true"
+    prometheus.io/port: "8443"
+  labels:
+    app: tf-job-operator
+  name: tf-job-operator
+spec:
+  ports:
+  - name: monitoring-port
+    port: 8443
+    targetPort: 8443
+  selector:
+    name: tf-job-operator
+  type: ClusterIP
diff --git a/docs/README.md b/docs/README.md
index fecafcc..854606d 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -32,6 +32,9 @@ Click below contents if you want to understand more.
 
 [Submarine Workbench Guide](./workbench/README.md)
 
+## Submarine Server
+[Submarine Server Guide](./submarine-server/README.md)
+
 ## Examples
 
 Here're some examples about Submarine usage.
diff --git a/docs/design/submarine-server/jobspec.md b/docs/design/submarine-server/jobspec.md
new file mode 100644
index 0000000..654f835
--- /dev/null
+++ b/docs/design/submarine-server/jobspec.md
@@ -0,0 +1,100 @@
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Generic Job Spec
+
+## Motivation
+As a machine learning platform, Submarine should support multiple machine learning frameworks, such as TensorFlow and PyTorch. Different frameworks use different distributed components for a training job, so we designed a generic job spec that abstracts the training job across frameworks. In this way, the submarine-server can hide the complexity of underlying infrastructure differences and provide a cleaner interface for managing jobs.
+
+## Proposal
+Considering the TensorFlow and PyTorch frameworks, we propose one spec that consists of a library spec, a submitter spec, and task specs. For example:
+```yaml
+name: "mnist"
+librarySpec:
+  name: "TensorFlow"
+  version: "2.1.0"
+  image: "gcr.io/kubeflow-ci/tf-mnist-with-summaries:1.0"
+  cmd: "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150"
+  envVars:
+    ENV_1: "ENV1"
+submitterSpec:
+  type: "k8s"
+  configPath:
+  namespace: "submarine"
+  kind: "TFJob"
+  apiVersion: "kubeflow.org/v1"
+taskSpecs:
+  Ps:
+    name: tensorflow
+    replicas: 2
+    resources: "cpu=4,memory=2048M,nvidia.com/gpu=1"
+  Worker:
+    name: tensorflow
+    replicas: 2
+    resources: "cpu=4,memory=2048M,nvidia.com/gpu=1"
+```
+
+### Library Spec
+The library spec describes the machine learning framework. Its fields are listed below:
+
+| field | type | optional | description |
+|---|---|---|---|
+| name | string | NO | Machine Learning Framework name. Such as: TensorFlow/PyTorch etc. |
+| version | string | NO | The version of ML framework. Such as: 2.1.0 |
+| image | string | NO | The public image used for each task if not specified. Such as: apache/submarine |
+| cmd | string | YES | The public entry cmd for the task if not specified. |
+| envVars | key/value | YES | The public env vars for the task if not specified. |
+
+### Submitter Spec
+The submitter spec describes the submitter that the user specified, such as yarn, yarnservice, or k8s. Its fields are listed below:
+
+| field | type | optional | description |
+|---|---|---|---|
+| type | string | NO | The submitter type, supports k8s, yarn, yarnservice |
+| configPath | string | NO | The config path of the specified resource manager |
+| namespace | string | NO | It's known as queue in Apache Hadoop YARN and namespace in Kubernetes. |
+| kind | string | YES | It's used for k8s submitter, supports TFJob |
+| apiVersion | string | YES | It should pair with the kind, such as the TFJob's api version is `kubeflow.org/v1` |
+
+### Task Spec
+The task spec describes a task. Tasks make up the job, so they must be specified when submitting the job. All tasks should be put into a key/value collection. For example:
+```yaml
+taskSpecs:
+  Ps:
+    name: tensorflow
+    replicas: 2
+    resources: "cpu=4,memory=2048M,nvidia.com/gpu=1"
+  Worker:
+    name: tensorflow
+    replicas: 2
+    resources: "cpu=4,memory=2048M,nvidia.com/gpu=1"
+```
+
+Its fields are listed below:
+
+| field | type | optional | description |
+|---|---|---|---|
+| name | string | YES | The job task name; if not specified, the library name is used |
+| image | string | YES | The job task image |
+| cmd | string | YES | The entry command for running task |
+| envVars | key/value | YES | The env vars for the task |
+| resources | string | NO | The resource limit for the task. Format: cpu=%s,memory=%s,nvidia.com/gpu=%s |
+
+## Implements
+For more info see [SUBMARINE-321](https://issues.apache.org/jira/browse/SUBMARINE-321)
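The `resources` string in the Task Spec table follows the pattern `cpu=%s,memory=%s,nvidia.com/gpu=%s`. As a minimal sketch of how a client might split that string into key/value pairs before building a job spec (the function name `parse_resources` is hypothetical and is not part of Submarine):

```python
def parse_resources(resources: str) -> dict:
    """Split a string such as "cpu=4,memory=2048M,nvidia.com/gpu=1" into a dict.

    Hypothetical helper for illustration only. Values are kept as strings
    because the spec does not say how units such as "2048M" are interpreted.
    """
    parsed = {}
    for item in resources.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        key, sep, value = item.partition("=")
        if not sep or not key.strip() or not value.strip():
            raise ValueError(f"malformed resource entry: {item!r}")
        parsed[key.strip()] = value.strip()
    return parsed


if __name__ == "__main__":
    print(parse_resources("cpu=4,memory=2048M,nvidia.com/gpu=1"))
```

Keeping the values as strings avoids guessing at unit handling; a real client would likely need to agree on units with the server.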
diff --git a/docs/submarine-server/README.md b/docs/submarine-server/README.md
new file mode 100644
index 0000000..30236ed
--- /dev/null
+++ b/docs/submarine-server/README.md
@@ -0,0 +1,99 @@
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Submarine Server Guide
+This guide covers deploying the submarine server and running a training job with it.
+
+## Prepare environment
+Submarine runs on **Linux** and **macOS** and requires Java 1.8.x or higher. We provide tutorials for both a learning environment and a production environment. For more deployment info see [Deploy Submarine Server on Kubernetes](./setup-kubernetes.md).
+
+## Training
+We designed a generic job spec for training jobs. We suggest reading the job spec before submitting a job.
+
+### Job Spec
+The job spec is the DSL for the submarine server. It consists of library, submitter, and task specs. For example:
+```yaml
+name: "mnist"
+librarySpec:
+  name: "TensorFlow"
+  version: "2.1.0"
+  image: "gcr.io/kubeflow-ci/tf-mnist-with-summaries:1.0"
+  cmd: "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150"
+  envVars:
+    ENV_1: "ENV1"
+submitterSpec:
+  type: "k8s"
+  configPath:
+  namespace: "submarine"
+  kind: "TFJob"
+  apiVersion: "kubeflow.org/v1"
+taskSpecs:
+  Ps:
+    name: tensorflow
+    replicas: 2
+    resources: "cpu=4,memory=2048M,nvidia.com/gpu=1"
+  Worker:
+    name: tensorflow
+    replicas: 2
+    resources: "cpu=4,memory=2048M,nvidia.com/gpu=1"
+```
+or
+```json
+{
+  "name": "mnist",
+  "librarySpec": {
+    "name": "TensorFlow",
+    "version": "2.1.0",
+    "image": "gcr.io/kubeflow-ci/tf-mnist-with-summaries:1.0",
+    "cmd": "python /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log --learning_rate=0.01 --batch_size=150",
+    "envVars": {
+      "ENV_1": "ENV1"
+    }
+  },
+  "submitterSpec": {
+    "type": "k8s",
+    "configPath": null,
+    "namespace": "submarine",
+    "kind": "TFJob",
+    "apiVersion": "kubeflow.org/v1"
+  },
+  "taskSpecs": {
+    "Ps": {
+      "name": "tensorflow",
+      "replicas": 2,
+      "resources": "cpu=4,memory=2048M,nvidia.com/gpu=1"
+    },
+    "Worker": {
+      "name": "tensorflow",
+      "replicas": 2,
+      "resources": "cpu=4,memory=2048M,nvidia.com/gpu=1"
+    }
+  }
+}
+```
+
+For more info see [here](../design/submarine-server/jobspec.md).
+
+### Submit Job
+You can use Postman to post the job to the server, or run the following `curl` command:
+```
+curl -H "Content-Type: application/json" --request POST \
+--data 
`{"name":"mnist","librarySpec":{"name":"TensorFlow","version":"2.1.0","image":"gcr.io/kubeflow-ci/tf-mnist-with-summaries:1.0","cmd":"python
 /var/tf_mnist/mnist_with_summaries.py --log_dir=/train/log 
--learning_rate=0.01 
--batch_size=150","envVars":{"ENV_1":"ENV1"}},"submitterSpec":{"type":"k8s","configPath":null,"namespace":"submarine","kind":"TFJob","apiVersion":"kubeflow.org/v1"},"taskSpecs":{"Ps":{"name":"tensorflow","replicas":2,"resources":"cpu=4,memory=2048M,nvidia.com/gpu=
 [...]
+http://127.0.0.1:8080/api/v1/jobs
+```
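The request above can also be sent from a script. The following is a minimal sketch using only the Python standard library. It assumes the server is reachable at the REST URL shown in this guide, and the helper names `build_job_spec` and `submit_job` are hypothetical. The response format is not described in the guide, so the raw body is returned as text.

```python
import json
import urllib.request

# REST endpoint taken from the guide; adjust host and port for your deployment.
JOBS_URL = "http://127.0.0.1:8080/api/v1/jobs"


def build_job_spec() -> dict:
    """Return the mnist job spec from the guide as a Python dict."""
    task = {
        "name": "tensorflow",
        "replicas": 2,
        "resources": "cpu=4,memory=2048M,nvidia.com/gpu=1",
    }
    return {
        "name": "mnist",
        "librarySpec": {
            "name": "TensorFlow",
            "version": "2.1.0",
            "image": "gcr.io/kubeflow-ci/tf-mnist-with-summaries:1.0",
            "cmd": "python /var/tf_mnist/mnist_with_summaries.py "
                   "--log_dir=/train/log --learning_rate=0.01 --batch_size=150",
            "envVars": {"ENV_1": "ENV1"},
        },
        "submitterSpec": {
            "type": "k8s",
            "configPath": None,
            "namespace": "submarine",
            "kind": "TFJob",
            "apiVersion": "kubeflow.org/v1",
        },
        # Copy the task dict so Ps and Worker can be edited independently.
        "taskSpecs": {"Ps": dict(task), "Worker": dict(task)},
    }


def submit_job(spec: dict, url: str = JOBS_URL) -> str:
    """POST the spec as JSON and return the raw response body (hypothetical helper)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With a running server, `print(submit_job(build_job_spec()))` would send the same payload as the `curl` command.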
diff --git a/docs/submarine-server/ml-frameworks/README.md b/docs/submarine-server/ml-frameworks/README.md
new file mode 100644
index 0000000..a2b0b82
--- /dev/null
+++ b/docs/submarine-server/ml-frameworks/README.md
@@ -0,0 +1,22 @@
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Machine Learning Framework
+Submarine 0.3.0 and above supports training TensorFlow jobs on Kubernetes through tf-operator. For more info see [here](./tensorflow.md).
+
diff --git a/docs/submarine-server/ml-frameworks/tensorflow.md b/docs/submarine-server/ml-frameworks/tensorflow.md
new file mode 100644
index 0000000..c03a703
--- /dev/null
+++ b/docs/submarine-server/ml-frameworks/tensorflow.md
@@ -0,0 +1,52 @@
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Deploy Tensorflow Operator on Kubernetes
+
+## TFJob
+We support TensorFlow jobs on Kubernetes by using tf-operator as the runtime. For more info about tf-operator see [here](https://github.com/kubeflow/tf-operator).
+
+### Deploy tf-operator
+Run the following commands:
+```
+kubectl apply -f ./dev-support/k8s/tfjob/crd.yaml
+kubectl kustomize ./dev-support/k8s/tfjob/operator | kubectl apply -f -
+```
+
+> Since K8s 1.14, kubectl also supports managing Kubernetes objects using a kustomization file. For more info see [kustomization](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/kustomization/)
+
+The default namespace is `submarine`. To use a different namespace, edit `./dev-support/k8s/tfjob/operator/kustomization.yaml` and replace `${NAMESPACE}` as shown below:
+```yaml
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+namespace: ${NAMESPACE}
+resources:
+- cluster-role-binding.yaml
+- cluster-role.yaml
+- deployment.yaml
+- service-account.yaml
+- service.yaml
+commonLabels:
+  kustomize.component: tf-job-operator
+images:
+- name: gcr.io/kubeflow-images-public/tf_operator
+  newName: gcr.io/kubeflow-images-public/tf_operator
+  newTag: v0.7.0
+```
+
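Since the placeholder in the file above is written as `${NAMESPACE}`, one way to fill it in is plain template substitution. This is only a sketch: the namespace `ml-team` and the helper `render_namespace` are made up for illustration, and editing the file by hand works just as well.

```python
from string import Template

# Only the line containing the placeholder is shown here; the full
# kustomization.yaml text could be passed through the same call.
KUSTOMIZATION_SNIPPET = "namespace: ${NAMESPACE}\n"


def render_namespace(text: str, namespace: str) -> str:
    """Replace ${NAMESPACE} in the given text (hypothetical helper)."""
    return Template(text).substitute(NAMESPACE=namespace)


if __name__ == "__main__":
    print(render_namespace(KUSTOMIZATION_SNIPPET, "ml-team"), end="")
```

`Template.substitute` raises `KeyError` if the text contains a placeholder that is not supplied, which helps catch a half-edited file.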
diff --git a/docs/submarine-server/setup-kubernetes.md b/docs/submarine-server/setup-kubernetes.md
new file mode 100644
index 0000000..264eb09
--- /dev/null
+++ b/docs/submarine-server/setup-kubernetes.md
@@ -0,0 +1,99 @@
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Deploy Submarine Server on Kubernetes
+This guide covers deploying the submarine server on a Kubernetes cluster.
+
+## Experiment environment
+
+### Setup Kubernetes
+We recommend using [`kind`](https://kind.sigs.k8s.io/) to set up a Kubernetes cluster on a local machine.
+
+Run the following commands:
+```
+kind create cluster --image kindest/node:v1.15.6 --name k8s-submarine
+kubectl create namespace submarine
+```
+
+### Kubernetes Dashboard (optional)
+
+#### Deploy
+To deploy the Dashboard, execute the following command:
+```
+kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta8/aio/deploy/recommended.yaml
+```
+
+#### Create RBAC
+To grant the dashboard cluster access permission, run the following commands:
+```
+kubectl create serviceaccount dashboard-admin-sa
+kubectl create clusterrolebinding dashboard-admin-sa --clusterrole=cluster-admin --serviceaccount=default:dashboard-admin-sa
+```
+
+#### Get access token (optional)
+If you want to use a token to log in to the dashboard, run the following commands to get it:
+```
+kubectl get secrets
+# select the right dashboard-admin-sa-token to describe the secret
+kubectl describe secret dashboard-admin-sa-token-6nhkx
+```
+
+#### Access
+To start the dashboard service, we can run the following command:
+```
+kubectl proxy
+```
+
+Now access Dashboard at:
+> http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
+
+### Setup Submarine
+
+#### Local
+
+##### Get package
+You can download submarine from the releases page or build it from source.
+
+##### Configuration
+Copy the kube config into `conf/k8s/config` or modify the `conf/submarine-site.xml`:
+```
+<property>
+  <name>submarine.k8s.kube.config</name>
+  <value>PATH_TO_KUBE_CONFIG</value>
+</property>
+```
+
+##### Start Submarine Server
+To run the submarine server, execute the following command:
+```
+./bin/submarine-daemon.sh start getMysqlJar
+```
+
+The REST API URL is: `http://127.0.0.1:8080/api/v1/jobs`
+
+#### Deploy Tensorflow Operator
+For more info see [deploy tensorflow operator](./ml-frameworks/tensorflow.md).
+
+## Production environment
+
+### Setup Kubernetes
+For more info see https://kubernetes.io/docs/setup/#production-environment
+
+### Setup Submarine
+It will come soon.

