[submarine] branch master updated: SUBMARINE-1318. Add documentation for serve

cdmikechen Tue, 04 Oct 2022 03:46:52 -0700

This is an automated email from the ASF dual-hosted git repository.

cdmikechen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/submarine.git



The following commit(s) were added to refs/heads/master by this push:
     new 61716e2d SUBMARINE-1318. Add documentation for serve
61716e2d is described below

commit 61716e2db25641f51d22116260c162246ffc834f
Author: cdmikechen <cdmikec...@hotmail.com>
AuthorDate: Sun Sep 18 14:14:23 2022 +0800

    SUBMARINE-1318. Add documentation for serve
    
    ### What is this PR for?
    Add documentation for serve and revise some text.
    
    ### What type of PR is it?
    Documentation
    
    ### Todos
    * [x] - Serve document
    
    ### What is the Jira issue?
    https://issues.apache.org/jira/projects/SUBMARINE/issues/SUBMARINE-1318
    
    ### How should this be tested?
    No.
    
    ### Screenshots (if appropriate)
    
    ![serve-document](https://user-images.githubusercontent.com/12069428/190888319-cb6bd904-ec31-4da4-ab74-2aec021178e3.png)
    
    ### Questions:
    * Do the license files need updating? No
    * Are there breaking changes for older versions? No
    * Does this need new documentation? Yes
    
    Author: cdmikechen <cdmikec...@hotmail.com>
    
    Signed-off-by: cdmikechen <cdmikec...@apache.org>
    
    Closes #996 from cdmikechen/SUBMARINE-1318 and squashes the following commits:
    
    b012e1fe [cdmikechen] Fix example link
    10e1828e [cdmikechen] Add documentation for serve
---
 dev-support/examples/quickstart/train.py          |   2 +-
 submarine-serve/installation/seldon-secret.yaml   |   2 +-
 website/docs/gettingStarted/quickstart.md         | 199 +++++++++++++++-------
 website/static/img/quickstart-artifacts.png       | Bin 0 -> 46012 bytes
 website/static/img/submarine-model-registry.png   | Bin 0 -> 75839 bytes
 website/static/img/submarine-register-model.png   | Bin 0 -> 71220 bytes
 website/static/img/submarine-serve-model.png      | Bin 0 -> 129247 bytes
 website/static/img/submarine-serve-prediction.png | Bin 0 -> 159760 bytes
 8 files changed, 140 insertions(+), 63 deletions(-)
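
The `quickstart.md` change in this commit updates the pod listing that readers are expected to see under "Ensure submarine is ready". A minimal sketch of checking such a listing programmatically is given here, assuming the default `kubectl get pods` column layout (NAME, READY, STATUS, ...). The helper name `pods_not_ready` is hypothetical and is not part of Submarine.

```python
def pods_not_ready(listing):
    """Return names of pods whose READY count is incomplete or whose STATUS is not Running."""
    not_ready = []
    # Skip the header row (NAME READY STATUS RESTARTS AGE).
    for line in listing.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed rows
        name, ready, status = fields[0], fields[1], fields[2]
        have, want = ready.split("/")
        if have != want or status != "Running":
            not_ready.append(name)
    return not_ready


# Sample text copied from the updated listing in quickstart.md.
sample = """\
NAME                                              READY   STATUS    RESTARTS   AGE
notebook-controller-deployment-66d85984bf-x562z   1/1     Running   0          7h7m
training-operator-6dcd5b9c64-nxwr2                1/1     Running   0          7h7m
submarine-operator-9cb7bc84d-brddz                1/1     Running   0          7h7m
"""

print(pods_not_ready(sample))  # prints []
```

In practice the listing would come from running `kubectl get pods -n submarine` and capturing its output; the sketch only covers the parsing step.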

diff --git a/dev-support/examples/quickstart/train.py b/dev-support/examples/quickstart/train.py
index 5653fa97..fd4eb42d 100644
--- a/dev-support/examples/quickstart/train.py
+++ b/dev-support/examples/quickstart/train.py
@@ -14,7 +14,7 @@
 # ==============================================================================
 """
 An example of multi-worker training with Keras model using Strategy API.
-https://github.com/kubeflow/tf-operator/blob/master/examples/v1/distribution_strategy/keras-API/multi_worker_strategy-with-keras.py
+https://github.com/kubeflow/training-operator/blob/master/examples/tensorflow/distribution_strategy/keras-API/multi_worker_strategy-with-keras.py
 """
 import tensorflow as tf
 import tensorflow_datasets as tfds
diff --git a/submarine-serve/installation/seldon-secret.yaml b/submarine-serve/installation/seldon-secret.yaml
index 4a7d0c37..64aed2f4 100644
--- a/submarine-serve/installation/seldon-secret.yaml
+++ b/submarine-serve/installation/seldon-secret.yaml
@@ -18,7 +18,7 @@ apiVersion: v1
 kind: Secret
 metadata:
   name: submarine-serve-secret
-  namespace: default
+  namespace: submarine-user-test
 type: Opaque
 stringData:
   RCLONE_CONFIG_S3_TYPE: s3
diff --git a/website/docs/gettingStarted/quickstart.md b/website/docs/gettingStarted/quickstart.md
index 44d1bd27..0ad835ef 100644
--- a/website/docs/gettingStarted/quickstart.md
+++ b/website/docs/gettingStarted/quickstart.md
@@ -37,7 +37,7 @@ This document gives you a quick view on the basic usage of Submarine platform. Y
 
 2. Start minikube cluster and install Istio
 
-```
+```bash
 minikube start --vm-driver=docker --cpus 8 --memory 8192 --kubernetes-version v1.21.2
 istioctl install -y
 # Or if you want to support Pod Security Policy (https://minikube.sigs.k8s.io/docs/tutorials/using_psp), you can use the following command to start cluster
@@ -48,14 +48,14 @@ minikube start --extra-config=apiserver.enable-admission-plugins=PodSecurityPoli
 
 1. Clone the project
 
-```
+```bash
 git clone https://github.com/apache/submarine.git
 cd submarine
 ```
 
 2. Create necessary namespaces
 
-```
+```bash
 kubectl create namespace submarine
 kubectl create namespace submarine-user-test
 kubectl label namespace submarine istio-injection=enabled
@@ -64,28 +64,34 @@ kubectl label namespace submarine-user-test istio-injection=enabled
 
 3. Install the submarine operator and dependencies by helm chart
 
-```
+```bash
 helm install submarine ./helm-charts/submarine -n submarine
 ```
 
 4. Create a Submarine custom resource and the operator will create the submarine server, database, etc. for us.
 
-```
+```bash
 kubectl apply -f submarine-cloud-v2/artifacts/examples/example-submarine.yaml -n submarine-user-test
 ```
 
-### Ensure submarine is ready
+5. Install the packages that submarine serve depends on: istio and seldon-core
 
+```bash
+./submarine-serve/installation/install.sh
 ```
+
+### Ensure submarine is ready
+
+```bash
 $ kubectl get pods -n submarine
 NAME                                              READY   STATUS    RESTARTS   AGE
 notebook-controller-deployment-66d85984bf-x562z   1/1     Running   0          7h7m
-pytorch-operator-7d778f4859-g7xph                 2/2     Running   0          7h7m
-tf-job-operator-7d895bf77c-75n72                  2/2     Running   0          7h7m
+training-operator-6dcd5b9c64-nxwr2                1/1     Running   0          7h7m
+submarine-operator-9cb7bc84d-brddz                1/1     Running   0          7h7m
 
 $ kubectl get pods -n submarine-user-test
 NAME                                     READY   STATUS    RESTARTS   AGE
-submarine-database-bdcb77549-rq2ds       2/2     Running   0          7h6m
+submarine-database-0                     1/1     Running   0          7h6m
 submarine-minio-686b8777ff-zg4d2         2/2     Running   0          7h6m
 submarine-mlflow-68c5559dcb-lkq4g        2/2     Running   0          7h6m
 submarine-server-7c6d7bcfd8-5p42w        2/2     Running   0          9m33s
@@ -96,7 +102,7 @@ submarine-tensorboard-57c5b64778-t4lww   2/2     Running   0          7h6m
 
 1. Exposing service
 
-```
+```bash
 kubectl port-forward --address 0.0.0.0 -n istio-system service/istio-ingressgateway 32080:80
 ```
 
@@ -116,75 +122,86 @@ Take a simple mnist tensorflow script as an example. We choose `MultiWorkerMirro
 ```python
 """
 ./dev-support/examples/quickstart/train.py
-Reference: https://github.com/kubeflow/tf-operator/blob/master/examples/v1/distribution_strategy/keras-API/multi_worker_strategy-with-keras.py
+Reference: https://github.com/kubeflow/training-operator/blob/master/examples/tensorflow/distribution_strategy/keras-API/multi_worker_strategy-with-keras.py
 """
 
-import tensorflow_datasets as tfds
 import tensorflow as tf
+import tensorflow_datasets as tfds
+from packaging.version import Version
 from tensorflow.keras import layers, models
+
 import submarine
 
+
 def make_datasets_unbatched():
-  BUFFER_SIZE = 10000
+    BUFFER_SIZE = 10000
 
-  # Scaling MNIST data from (0, 255] to (0., 1.]
-  def scale(image, label):
-    image = tf.cast(image, tf.float32)
-    image /= 255
-    return image, label
+    # Scaling MNIST data from (0, 255] to (0., 1.]
+    def scale(image, label):
+        image = tf.cast(image, tf.float32)
+        image /= 255
+        return image, label
 
-  datasets, _ = tfds.load(name='mnist', with_info=True, as_supervised=True)
+    # If we use tensorflow_datasets > 3.1.0, we need to disable GCS
+    # https://github.com/tensorflow/datasets/issues/2761#issuecomment-1187413141
+    if Version(tfds.__version__) > Version("3.1.0"):
+        tfds.core.utils.gcs_utils._is_gcs_disabled = True
+    datasets, _ = tfds.load(name="mnist", with_info=True, as_supervised=True)
 
-  return datasets['train'].map(scale).cache().shuffle(BUFFER_SIZE)
+    return datasets["train"].map(scale).cache().shuffle(BUFFER_SIZE)
 
 
 def build_and_compile_cnn_model():
-  model = models.Sequential()
-  model.add(
-      layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
-  model.add(layers.MaxPooling2D((2, 2)))
-  model.add(layers.Conv2D(64, (3, 3), activation='relu'))
-  model.add(layers.MaxPooling2D((2, 2)))
-  model.add(layers.Conv2D(64, (3, 3), activation='relu'))
-  model.add(layers.Flatten())
-  model.add(layers.Dense(64, activation='relu'))
-  model.add(layers.Dense(10, activation='softmax'))
-
-  model.summary()
-
-  model.compile(optimizer='adam',
-                loss='sparse_categorical_crossentropy',
-                metrics=['accuracy'])
-
-  return model
-
-def main():
-  strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy(
-      communication=tf.distribute.experimental.CollectiveCommunication.AUTO)
+    model = models.Sequential()
+    model.add(layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)))
+    model.add(layers.MaxPooling2D((2, 2)))
+    model.add(layers.Conv2D(64, (3, 3), activation="relu"))
+    model.add(layers.MaxPooling2D((2, 2)))
+    model.add(layers.Conv2D(64, (3, 3), activation="relu"))
+    model.add(layers.Flatten())
+    model.add(layers.Dense(64, activation="relu"))
+    model.add(layers.Dense(10, activation="softmax"))
 
-  BATCH_SIZE_PER_REPLICA = 4
-  BATCH_SIZE = BATCH_SIZE_PER_REPLICA * strategy.num_replicas_in_sync
+    model.summary()
 
-  with strategy.scope():
-    ds_train = make_datasets_unbatched().batch(BATCH_SIZE).repeat()
-    options = tf.data.Options()
-    options.experimental_distribute.auto_shard_policy = \
-        tf.data.experimental.AutoShardPolicy.DATA
-    ds_train = ds_train.with_options(options)
-    # Model building/compiling need to be within `strategy.scope()`.
-    multi_worker_model = build_and_compile_cnn_model()
+    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
 
-  class MyCallback(tf.keras.callbacks.Callback):
-    def on_epoch_end(self, epoch, logs=None):
-      # monitor the loss and accuracy
-      print(logs)
-      submarine.log_metrics({"loss": logs["loss"], "accuracy": logs["accuracy"]}, epoch)
+    return model
 
-  multi_worker_model.fit(ds_train, epochs=10, steps_per_epoch=70, callbacks=[MyCallback()])
 
+def main():
+    strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy(
+        communication=tf.distribute.experimental.CollectiveCommunication.AUTO
+    )
+
+    BATCH_SIZE_PER_REPLICA = 4
+    BATCH_SIZE = BATCH_SIZE_PER_REPLICA * strategy.num_replicas_in_sync
+
+    with strategy.scope():
+        ds_train = make_datasets_unbatched().batch(BATCH_SIZE).repeat()
+        options = tf.data.Options()
+        options.experimental_distribute.auto_shard_policy = (
+            tf.data.experimental.AutoShardPolicy.DATA
+        )
+        ds_train = ds_train.with_options(options)
+        # Model building/compiling need to be within `strategy.scope()`.
+        multi_worker_model = build_and_compile_cnn_model()
+
+    class MyCallback(tf.keras.callbacks.Callback):
+        def on_epoch_end(self, epoch, logs=None):
+            # monitor the loss and accuracy
+            print(logs)
+            submarine.log_metric("loss", logs["loss"], epoch)
+            submarine.log_metric("accuracy", logs["accuracy"], epoch)
+
+    multi_worker_model.fit(ds_train, epochs=10, steps_per_epoch=70, callbacks=[MyCallback()])
+    # save model
+    submarine.save_model(multi_worker_model, "tensorflow")
+
+
+if __name__ == "__main__":
+    main()
 
-if __name__ == '__main__':
-  main()
 ```
 
 ### 2. Prepare an environment compatible with the training
@@ -218,4 +235,64 @@ eval $(minikube docker-env)
 
 ![](/img/quickstart-ui-0-7-0.png)
 
-### 5. Serve the model (In development)
+### 5. Serve the model
+
+1. Before serving, we need to register a new model.
+
+![](/img/submarine-register-model.png)
+
+2. Then check the output model on the experiment page.
+
+![](/img/quickstart-artifacts.png)
+
+3. Click the button and register the model.
+
+![](/img/submarine-register-model.png)
+
+4. Go to the model page and deploy our model for serving.
+
+![](/img/submarine-serve-model.png)
+
+5. We can run the following commands to get the `VirtualService` and `Endpoint` that use istio for external port forwarding or ingress.
+
+```bash
+## get VirtualService with your model name
+kubectl describe VirtualService -n submarine-user-test -l model-name=tf-mnist
+
+Name:         submarine-model-1-2508dd65692740b18ff5c6c6c162b863
+Namespace:    submarine-user-test
+Labels:       model-id=2508dd65692740b18ff5c6c6c162b863
+              model-name=tf-mnist
+              model-version=1
+Annotations:  <none>
+API Version:  networking.istio.io/v1beta1
+Kind:         VirtualService
+Metadata:
+  Creation Timestamp:  2022-09-18T05:26:38Z
+  Generation:          1
+  Managed Fields:
+    ...
+Spec:
+  Gateways:
+    istio-system/seldon-gateway
+  Hosts:
+    *
+  Http:
+    Match:
+      Uri:
+        Prefix:  /seldon/submarine-user-test/1/1/
+    Rewrite:
+      Uri:  /
+    Route:
+      Destination:
+        Host:  submarine-model-1-2508dd65692740b18ff5c6c6c162b863
+        Port:
+          Number:  8000
+Events:            <none>
+```
+
+6. Once the model is being served, we can test its predictions with the example Python script [serve_predictions.py](https://github.com/apache/submarine/blob/master/dev-support/examples/quickstart/serve_predictions.py).
+
+![](/img/submarine-serve-prediction.png)
+  
+   
\ No newline at end of file
diff --git a/website/static/img/quickstart-artifacts.png b/website/static/img/quickstart-artifacts.png
new file mode 100644
index 00000000..61be8593
Binary files /dev/null and b/website/static/img/quickstart-artifacts.png differ
diff --git a/website/static/img/submarine-model-registry.png b/website/static/img/submarine-model-registry.png
new file mode 100644
index 00000000..cf6f2819
Binary files /dev/null and b/website/static/img/submarine-model-registry.png differ
diff --git a/website/static/img/submarine-register-model.png b/website/static/img/submarine-register-model.png
new file mode 100644
index 00000000..a520be6c
Binary files /dev/null and b/website/static/img/submarine-register-model.png differ
diff --git a/website/static/img/submarine-serve-model.png b/website/static/img/submarine-serve-model.png
new file mode 100644
index 00000000..19683325
Binary files /dev/null and b/website/static/img/submarine-serve-model.png differ
diff --git a/website/static/img/submarine-serve-prediction.png b/website/static/img/submarine-serve-prediction.png
new file mode 100644
index 00000000..7286e639
Binary files /dev/null and b/website/static/img/submarine-serve-prediction.png differ
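
The `Prefix` in the `VirtualService` output added to `quickstart.md` is the path under which the istio gateway exposes the served model. A minimal sketch of composing a request URL and payload from it follows. The host and port come from the `kubectl port-forward ... 32080:80` step in the quickstart. The `v1/models/<name>:predict` suffix and the `instances` payload shape are assumptions (a TensorFlow Serving style REST convention) and are not taken from `serve_predictions.py`. The helper names are hypothetical.

```python
import json


def build_predict_url(base, prefix, model_name):
    """Join the gateway base URL, the VirtualService prefix, and an assumed predict path."""
    # Assumption: the model server follows the TensorFlow Serving REST convention.
    path = prefix.rstrip("/") + f"/v1/models/{model_name}:predict"
    return base.rstrip("/") + path


def build_payload(image):
    """Wrap a single image in an assumed {"instances": [...]} request body."""
    return json.dumps({"instances": [image]})


# A blank 28x28x1 image, matching the CNN's input_shape in train.py.
blank_image = [[[0.0] for _ in range(28)] for _ in range(28)]

url = build_predict_url(
    "http://localhost:32080",            # from the port-forward step
    "/seldon/submarine-user-test/1/1/",  # Prefix from the VirtualService output
    "tf-mnist",                          # model-name label from the same output
)
payload = build_payload(blank_image)

print(url)
```

The sketch stops short of sending the request; posting `payload` to `url` with any HTTP client would be the next step once the model is deployed and the port-forward is running.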


