This is an automated email from the ASF dual-hosted git repository.
pingsutw pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/submarine.git
The following commit(s) were added to refs/heads/master by this push:
new efb4e7c SUBMARINE-1163. Remove mlflow
efb4e7c is described below
commit efb4e7c578f9ee5188414b2140ab7b4a9de1ec92
Author: jeff-901 <[email protected]>
AuthorDate: Tue Jan 4 14:59:32 2022 +0800
SUBMARINE-1163. Remove mlflow
### What is this PR for?
Remove mlflow from submarine
### What type of PR is it?
Feature
### Todos
* [x] - Remove mlflow from sdk
* [x] - Update document
### What is the Jira issue?
https://issues.apache.org/jira/browse/SUBMARINE-1163
### How should this be tested?
### Screenshots (if appropriate)
### Questions:
* Do the license files need updating? No
* Are there breaking changes for older versions? Yes
* Does this need new documentation? Yes
Author: jeff-901 <[email protected]>
Signed-off-by: Kevin <[email protected]>
Closes #857 from jeff-901/SUBMARINE-1163 and squashes the following commits:
ad4f9b57 [jeff-901] style
f7f037ed [jeff-901] website
3259af42 [jeff-901] fix err
4fbe1591 [jeff-901] mllfow
7c3aa487 [jeff-901] dev-support
fcb0f9a4 [jeff-901] github action
6a4a6714 [jeff-901] operator
e8ded8a2 [jeff-901] java server
77f11b19 [jeff-901] add mlflow ui back
1fb64813 [jeff-901] update sidebar
0509c5f2 [jeff-901] update quickstart doc
0b8d3104 [jeff-901] update doc
12394590 [jeff-901] remove from sdk
a3cf6a24 [jeff-901] remove from server
d0489bec [jeff-901] deleted pod
---
dev-support/misc/serve/readme.md | 3 -
dev-support/misc/serve/serve.yaml | 91 --------
submarine-sdk/pysubmarine/setup.py | 1 -
submarine-sdk/pysubmarine/submarine/__init__.py | 2 -
.../submarine/client/api/experiment_api.py | 99 ---------
.../pysubmarine/submarine/models/client.py | 126 -----------
.../pysubmarine/submarine/models/constant.py | 21 --
.../pysubmarine/submarine/models/utils.py | 85 --------
submarine-sdk/pysubmarine/tests/models/pytorch.py | 27 ---
.../pysubmarine/tests/models/test_model.py | 67 ------
.../pysubmarine/tests/models/test_model_e2e.py | 50 -----
.../submitter/k8s/parser/ServeSpecParser.java | 232 ---------------------
.../experiment-home/experiment-home.component.html | 11 +
.../experiment-home/experiment-home.component.ts | 21 ++
.../src/app/services/experiment.service.ts | 13 ++
website/docs/gettingStarted/quickstart.md | 20 +-
.../docs/userDocs/submarine-sdk/model-client.md | 143 -------------
.../submarine-sdk/pysubmarine/development.md | 2 +-
website/sidebars.js | 1 -
website/static/img/quickstart-ui-1.png | Bin 0 -> 88062 bytes
website/static/img/quickstart-ui-2.png | Bin 0 -> 65574 bytes
21 files changed, 54 insertions(+), 961 deletions(-)
diff --git a/dev-support/misc/serve/readme.md b/dev-support/misc/serve/readme.md
deleted file mode 100644
index ff0bd84..0000000
--- a/dev-support/misc/serve/readme.md
+++ /dev/null
@@ -1,3 +0,0 @@
-## Serve YAML
-
-This is the yaml version of resource created by `submitter.createServe()`.
diff --git a/dev-support/misc/serve/serve.yaml b/dev-support/misc/serve/serve.yaml
deleted file mode 100644
index 626680d..0000000
--- a/dev-support/misc/serve/serve.yaml
+++ /dev/null
@@ -1,91 +0,0 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-
-apiVersion: apps/v1
-kind: Deployment
-metadata:
- name: serve-example # name == ${model-registry}-{version}
-spec:
- selector:
- matchLabels:
- app: serve-example-pod #
- template:
- metadata:
- labels:
- app: serve-example-pod
- spec:
- containers:
- - name: serve-example-container
- image: apache/submarine:serve-0.7.0-SNAPSHOT
- command:
- - "mlflow"
- - "models"
- - "serve"
- - "--model-uri"
- - "models:/simple-nn-model/1"
- - "--host"
- - "0.0.0.0" # make it accessible from the outside
- imagePullPolicy: IfNotPresent
- ports:
- - containerPort: 5000
- readinessProbe: # make container ready until mlflow serving server is ready to receive request
- httpGet:
- path: /ping # from mlflow scoring_server
- port: 5000
----
-apiVersion: v1
-kind: Service
-metadata:
- name: serve-example-service
-spec:
- type: ClusterIP
- selector:
- app: serve-example-pod
- ports:
- - protocol: TCP
- port: 5000 # port on service
- targetPort: 5000 # port on container
----
-apiVersion: traefik.containo.us/v1alpha1
-kind: IngressRoute
-metadata:
- name: serve-example-ingressroute
-spec:
- entryPoints:
- - web
- routes:
- - kind: Rule
- match: "PathPrefix(`/serve/mymodel`)"
- middlewares:
- - name: stripprefix
- services:
- - kind: Service
- name: serve-example-service
- port: 5000
----
-# strip the prefix
-# e.g. Make a HTTP POST: localhost:32080/serve/mymodel/invocations
-# The serve pod (with ingressroute `/serve/mymodel/`) receives path "/serve/mymodel/invocations"
-# We should strip the prefix and make it become "/invocations"
-apiVersion: traefik.containo.us/v1alpha1
-kind: Middleware
-metadata:
- name: stripprefix
-spec:
- stripPrefix:
- prefixes:
- - /serve/mymodel
diff --git a/submarine-sdk/pysubmarine/setup.py b/submarine-sdk/pysubmarine/setup.py
index 720a561..18abee4 100644
--- a/submarine-sdk/pysubmarine/setup.py
+++ b/submarine-sdk/pysubmarine/setup.py
@@ -39,7 +39,6 @@ setup(
"certifi>=14.05.14",
"python-dateutil>=2.5.3",
"pyarrow==0.17.0",
- "mlflow>=1.15.0",
"boto3>=1.17.58",
"click==8.0.3",
"rich==10.15.2",
diff --git a/submarine-sdk/pysubmarine/submarine/__init__.py b/submarine-sdk/pysubmarine/submarine/__init__.py
index 069a762..580ad61 100644
--- a/submarine-sdk/pysubmarine/submarine/__init__.py
+++ b/submarine-sdk/pysubmarine/submarine/__init__.py
@@ -16,7 +16,6 @@
import submarine.tracking.fluent
import submarine.utils as utils
from submarine.client.api.experiment_client import ExperimentClient
-from submarine.models.client import ModelsClient
log_param = submarine.tracking.fluent.log_param
log_metric = submarine.tracking.fluent.log_metric
@@ -31,5 +30,4 @@ __all__ = [
"set_db_uri",
"get_db_uri",
"ExperimentClient",
- "ModelsClient",
]
diff --git a/submarine-sdk/pysubmarine/submarine/client/api/experiment_api.py b/submarine-sdk/pysubmarine/submarine/client/api/experiment_api.py
index b207b0b..bd2284d 100644
--- a/submarine-sdk/pysubmarine/submarine/client/api/experiment_api.py
+++ b/submarine-sdk/pysubmarine/submarine/client/api/experiment_api.py
@@ -487,105 +487,6 @@ class ExperimentApi(object):
collection_formats=collection_formats,
)
- def get_m_lflow_info(self, **kwargs): # noqa: E501
- """Get mlflow's information # noqa: E501
-
- This method makes a synchronous HTTP request by default. To make an
- asynchronous HTTP request, please pass async_req=True
- >>> thread = api.get_m_lflow_info(async_req=True)
- >>> result = thread.get()
-
- :param async_req bool: execute request asynchronously
- :param _preload_content: if False, the urllib3.HTTPResponse object will
- be returned without reading/decoding response
- data. Default is True.
- :param _request_timeout: timeout setting for this request. If one
- number provided, it will be total request
- timeout. It can also be a pair (tuple) of
- (connection, read) timeouts.
- :return: JsonResponse
- If the method is called asynchronously,
- returns the request thread.
- """
- kwargs["_return_http_data_only"] = True
- return self.get_m_lflow_info_with_http_info(**kwargs) # noqa: E501
-
- def get_m_lflow_info_with_http_info(self, **kwargs): # noqa: E501
- """Get mlflow's information # noqa: E501
-
- This method makes a synchronous HTTP request by default. To make an
- asynchronous HTTP request, please pass async_req=True
- >>> thread = api.get_m_lflow_info_with_http_info(async_req=True)
- >>> result = thread.get()
-
- :param async_req bool: execute request asynchronously
- :param _return_http_data_only: response data without head status code
- and headers
- :param _preload_content: if False, the urllib3.HTTPResponse object will
- be returned without reading/decoding response
- data. Default is True.
- :param _request_timeout: timeout setting for this request. If one
- number provided, it will be total request
- timeout. It can also be a pair (tuple) of
- (connection, read) timeouts.
- :return: tuple(JsonResponse, status_code(int), headers(HTTPHeaderDict))
- If the method is called asynchronously,
- returns the request thread.
- """
-
- local_var_params = locals()
-
- all_params = []
- all_params.extend(
- ["async_req", "_return_http_data_only", "_preload_content", "_request_timeout"]
- )
-
- for key, val in six.iteritems(local_var_params["kwargs"]):
- if key not in all_params:
- raise ApiTypeError(
- "Got an unexpected keyword argument '%s' to method get_m_lflow_info" % key
- )
- local_var_params[key] = val
- del local_var_params["kwargs"]
-
- collection_formats = {}
-
- path_params = {}
-
- query_params = []
-
- header_params = {}
-
- form_params = []
- local_var_files = {}
-
- body_params = None
- # HTTP header `Accept`
- header_params["Accept"] = self.api_client.select_header_accept(
- ["application/json; charset=utf-8"]
- ) # noqa: E501
-
- # Authentication setting
- auth_settings = [] # noqa: E501
-
- return self.api_client.call_api(
- "/v1/experiment/mlflow",
- "GET",
- path_params,
- query_params,
- header_params,
- body=body_params,
- post_params=form_params,
- files=local_var_files,
- response_type="JsonResponse", # noqa: E501
- auth_settings=auth_settings,
- async_req=local_var_params.get("async_req"),
- _return_http_data_only=local_var_params.get("_return_http_data_only"), # noqa: E501
- _preload_content=local_var_params.get("_preload_content", True),
- _request_timeout=local_var_params.get("_request_timeout"),
- collection_formats=collection_formats,
- )
-
def get_tensorboard_info(self, **kwargs): # noqa: E501
"""Get tensorboard's information # noqa: E501
diff --git a/submarine-sdk/pysubmarine/submarine/models/client.py b/submarine-sdk/pysubmarine/submarine/models/client.py
deleted file mode 100644
index da13082..0000000
--- a/submarine-sdk/pysubmarine/submarine/models/client.py
+++ /dev/null
@@ -1,126 +0,0 @@
-"""
- Licensed to the Apache Software Foundation (ASF) under one
- or more contributor license agreements. See the NOTICE file
- distributed with this work for additional information
- regarding copyright ownership. The ASF licenses this file
- to you under the Apache License, Version 2.0 (the
- "License"); you may not use this file except in compliance
- with the License. You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an
- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- KIND, either express or implied. See the License for the
- specific language governing permissions and limitations
- under the License.
-"""
-import os
-import time
-
-import mlflow
-from mlflow.exceptions import MlflowException
-from mlflow.tracking import MlflowClient
-
-from .constant import (
- AWS_ACCESS_KEY_ID,
- AWS_SECRET_ACCESS_KEY,
- MLFLOW_S3_ENDPOINT_URL,
- MLFLOW_TRACKING_URI,
-)
-from .utils import exist_ps, get_job_id, get_worker_index
-
-
-class ModelsClient:
- def __init__(
- self,
- tracking_uri: str = None,
- registry_uri: str = None,
- aws_access_key_id: str = None,
- aws_secret_access_key: str = None,
- ):
- """
- Set up mlflow server connection, including: s3 endpoint, aws, tracking server
- """
- # if setting url in environment variable,
- # there is no need to set it by MlflowClient() or mlflow.set_tracking_uri() again
- os.environ["MLFLOW_S3_ENDPOINT_URL"] = registry_uri or MLFLOW_S3_ENDPOINT_URL
- os.environ["AWS_ACCESS_KEY_ID"] = aws_access_key_id or AWS_ACCESS_KEY_ID
- os.environ["AWS_SECRET_ACCESS_KEY"] = aws_secret_access_key or AWS_SECRET_ACCESS_KEY
- os.environ["MLFLOW_TRACKING_URI"] = tracking_uri or MLFLOW_TRACKING_URI
- self.client = MlflowClient()
- self.type_to_log_model = {
- "pytorch": mlflow.pytorch.log_model,
- "sklearn": mlflow.sklearn.log_model,
- "tensorflow": mlflow.tensorflow.log_model,
- "keras": mlflow.keras.log_model,
- }
-
- def start(self):
- """
- 1. Start a new Mlflow run
- 2. Direct the logging of the artifacts and metadata
- to the Run named "worker_i" under Experiment "job_id"
- 3. If in distributed training, worker and job id would be parsed from environment variable
- 4. If in local traning, worker and job id will be generated.
- :return: Active Run
- """
- experiment_name = get_job_id()
- run_name = get_worker_index()
- experiment_id = self._get_or_create_experiment(experiment_name)
- return mlflow.start_run(run_name=run_name, experiment_id=experiment_id)
-
- def log_param(self, key: str, value: str):
- mlflow.log_param(key, value)
-
- def log_params(self, params):
- mlflow.log_params(params)
-
- def log_metric(self, key: str, value: str, step=None):
- mlflow.log_metric(key, value, step)
-
- def log_metrics(self, metrics, step=None):
- mlflow.log_metrics(metrics, step)
-
- def load_model(self, name: str, version: str):
- model = mlflow.pyfunc.load_model(model_uri=f"models:/{name}/{version}")
- return model
-
- def update_model(self, name: str, new_name: str):
- self.client.rename_registered_model(name=name, new_name=new_name)
-
- def delete_model(self, name: str, version: str):
- self.client.delete_model_version(name=name, version=version)
-
- def save_model(self, model_type, model, artifact_path, registered_model_name=None):
- run_name = get_worker_index()
- if exist_ps():
- # TODO for Tensorflow ParameterServer strategy
- return
- elif run_name == "worker-0":
- if model_type in self.type_to_log_model:
- self.type_to_log_model[model_type](
- model, artifact_path, registered_model_name=registered_model_name
- )
- else:
- raise MlflowException("No valid type of model has been matched")
-
- def _get_or_create_experiment(self, experiment_name):
- """
- Return the id of experiment.
- If non-exist, create one. Otherwise, return the existing one.
- :return: Experiment id
- """
- try:
- experiment = mlflow.get_experiment_by_name(experiment_name)
- if experiment is None: # if not found
- run_name = get_worker_index()
- if run_name == "worker-0":
- raise MlflowException("No valid experiment has been found")
- else:
- while experiment is None:
- time.sleep(1)
- experiment = mlflow.get_experiment_by_name(experiment_name)
- return experiment.experiment_id # if found
- except MlflowException:
- experiment = mlflow.create_experiment(name=experiment_name)
- return experiment
diff --git a/submarine-sdk/pysubmarine/submarine/models/constant.py b/submarine-sdk/pysubmarine/submarine/models/constant.py
deleted file mode 100644
index a9c876e..0000000
--- a/submarine-sdk/pysubmarine/submarine/models/constant.py
+++ /dev/null
@@ -1,21 +0,0 @@
-"""
- Licensed to the Apache Software Foundation (ASF) under one
- or more contributor license agreements. See the NOTICE file
- distributed with this work for additional information
- regarding copyright ownership. The ASF licenses this file
- to you under the Apache License, Version 2.0 (the
- "License"); you may not use this file except in compliance
- with the License. You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an
- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- KIND, either express or implied. See the License for the
- specific language governing permissions and limitations
- under the License.
-"""
-
-MLFLOW_S3_ENDPOINT_URL = "http://submarine-minio-service:9000"
-AWS_ACCESS_KEY_ID = "submarine_minio"
-AWS_SECRET_ACCESS_KEY = "submarine_minio"
-MLFLOW_TRACKING_URI = "http://submarine-mlflow-service:5000"
diff --git a/submarine-sdk/pysubmarine/submarine/models/utils.py b/submarine-sdk/pysubmarine/submarine/models/utils.py
deleted file mode 100644
index 88c2e00..0000000
--- a/submarine-sdk/pysubmarine/submarine/models/utils.py
+++ /dev/null
@@ -1,85 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-from __future__ import print_function
-
-import json
-import os
-import uuid
-
-from submarine.utils import env
-
-_JOB_ID_ENV_VAR = "JOB_ID"
-
-_TF_CONFIG = "TF_CONFIG"
-_CLUSTER_SPEC = "CLUSTER_SPEC"
-_CLUSTER = "cluster"
-_JOB_NAME = "JOB_NAME"
-_TYPE = "type"
-_TASK = "task"
-_INDEX = "index"
-_RANK = "RANK"
-
-
-def get_job_id():
- """
- Get the current experiment id.
- :return The experiment id:
- """
- # Get yarn application or K8s experiment ID when running distributed training
- if env.get_env(_JOB_ID_ENV_VAR) is not None:
- return env.get_env(_JOB_ID_ENV_VAR)
- else: # set Random ID when running local training
- job_id = uuid.uuid4().hex
- os.environ[_JOB_ID_ENV_VAR] = job_id
- return job_id
-
-
-def get_worker_index():
- """
- Get the current worker index.
- :return: The worker index:
- """
- # Get TensorFlow worker index
- if env.get_env(_TF_CONFIG) is not None:
- tf_config = json.loads(os.environ.get(_TF_CONFIG))
- task_config = tf_config.get(_TASK)
- task_type = task_config.get(_TYPE)
- task_index = task_config.get(_INDEX)
- worker_index = task_type + "-" + str(task_index)
- elif env.get_env(_CLUSTER_SPEC) is not None:
- cluster_spec = json.loads(os.environ.get(_CLUSTER_SPEC))
- task_config = cluster_spec.get(_TASK)
- task_type = task_config.get(_JOB_NAME)
- task_index = task_config.get(_INDEX)
- worker_index = task_type + "-" + str(task_index)
- # Get PyTorch worker index
- elif env.get_env(_RANK) is not None:
- rank = env.get_env(_RANK)
- worker_index = "worker-" + rank
- # Set worker index to "worker-0" When running local training
- else:
- worker_index = "worker-0"
-
- return worker_index
-
-
-def exist_ps():
- if env.get_env(_TF_CONFIG) is not None:
- tf_config = json.loads(os.environ.get(_TF_CONFIG))
- cluster = tf_config.get(_CLUSTER)
- if "ps" in cluster:
- return True
- return False
diff --git a/submarine-sdk/pysubmarine/tests/models/pytorch.py b/submarine-sdk/pysubmarine/tests/models/pytorch.py
deleted file mode 100644
index eeffc30..0000000
--- a/submarine-sdk/pysubmarine/tests/models/pytorch.py
+++ /dev/null
@@ -1,27 +0,0 @@
-"""
- Licensed to the Apache Software Foundation (ASF) under one
- or more contributor license agreements. See the NOTICE file
- distributed with this work for additional information
- regarding copyright ownership. The ASF licenses this file
- to you under the Apache License, Version 2.0 (the
- "License"); you may not use this file except in compliance
- with the License. You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an
- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- KIND, either express or implied. See the License for the
- specific language governing permissions and limitations
- under the License.
-"""
-import torch
-
-
-class LinearNNModel(torch.nn.Module):
- def __init__(self):
- super(LinearNNModel, self).__init__()
- self.linear = torch.nn.Linear(1, 1) # One in and one out
-
- def forward(self, x):
- y_pred = self.linear(x)
- return y_pred
diff --git a/submarine-sdk/pysubmarine/tests/models/test_model.py b/submarine-sdk/pysubmarine/tests/models/test_model.py
deleted file mode 100644
index f94acf1..0000000
--- a/submarine-sdk/pysubmarine/tests/models/test_model.py
+++ /dev/null
@@ -1,67 +0,0 @@
-"""
- Licensed to the Apache Software Foundation (ASF) under one
- or more contributor license agreements. See the NOTICE file
- distributed with this work for additional information
- regarding copyright ownership. The ASF licenses this file
- to you under the Apache License, Version 2.0 (the
- "License"); you may not use this file except in compliance
- with the License. You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an
- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- KIND, either express or implied. See the License for the
- specific language governing permissions and limitations
- under the License.
-"""
-
-import mlflow
-import numpy as np
-from mlflow.tracking import MlflowClient
-from pytorch import LinearNNModel
-
-from submarine import ModelsClient
-
-
-class TestSubmarineModelsClient:
- def setUp(self):
- pass
-
- def tearDown(self):
- pass
-
- def test_save_model(self, mocker):
- mock_method = mocker.patch.object(ModelsClient, "save_model")
- client = ModelsClient()
- model = LinearNNModel()
- name = "simple-nn-model"
- client.save_model("pytorch", model, name)
- mock_method.assert_called_once_with("pytorch", model, "simple-nn-model")
-
- def test_update_model(self, mocker):
- mock_method = mocker.patch.object(MlflowClient, "rename_registered_model")
- client = ModelsClient()
- name = "simple-nn-model"
- new_name = "new-simple-nn-model"
- client.update_model(name, new_name)
- mock_method.assert_called_once_with(name="simple-nn-model", new_name="new-simple-nn-model")
-
- def test_load_model(self, mocker):
- mock_method = mocker.patch.object(mlflow.pyfunc, "load_model")
- mock_method.return_value = mlflow.pytorch._PyTorchWrapper(LinearNNModel())
- client = ModelsClient()
- name = "simple-nn-model"
- version = "1"
- model = client.load_model(name, version)
- mock_method.assert_called_once_with(model_uri="models:/simple-nn-model/1")
- x = np.float32([[1.0], [2.0]])
- y = model.predict(x)
- assert y.shape[0] == 2
- assert y.shape[1] == 1
-
- def test_delete_model(self, mocker):
- mock_method = mocker.patch.object(MlflowClient, "delete_model_version")
- client = ModelsClient()
- name = "simple-nn-model"
- client.delete_model(name, "1")
- mock_method.assert_called_once_with(name="simple-nn-model", version="1")
diff --git a/submarine-sdk/pysubmarine/tests/models/test_model_e2e.py b/submarine-sdk/pysubmarine/tests/models/test_model_e2e.py
deleted file mode 100644
index 635a7c3..0000000
--- a/submarine-sdk/pysubmarine/tests/models/test_model_e2e.py
+++ /dev/null
@@ -1,50 +0,0 @@
-"""
- Licensed to the Apache Software Foundation (ASF) under one
- or more contributor license agreements. See the NOTICE file
- distributed with this work for additional information
- regarding copyright ownership. The ASF licenses this file
- to you under the Apache License, Version 2.0 (the
- "License"); you may not use this file except in compliance
- with the License. You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an
- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- KIND, either express or implied. See the License for the
- specific language governing permissions and limitations
- under the License.
-"""
-
-import numpy as np
-import pytest
-from pytorch import LinearNNModel
-
-from submarine import ModelsClient
-
-
[email protected](name="models_client", scope="class")
-def models_client_fixture():
- client = ModelsClient("http://localhost:5001", "http://localhost:9000")
- return client
-
-
[email protected]
-class TestSubmarineModelsClientE2E:
- def test_model(self, models_client):
- model = LinearNNModel()
- # log
- name = "simple-nn-model"
- models_client.save_model("pytorch", model, name, registered_model_name=name)
- # update
- new_name = "new-simple-nn-model"
- models_client.update_model(name, new_name)
- # load
- name = new_name
- version = "1"
- model = models_client.load_model(name, version)
- x = np.float32([[1.0], [2.0]])
- y = model.predict(x)
- assert y.shape[0] == 2
- assert y.shape[1] == 1
- # delete
- models_client.delete_model(name, "1")
diff --git a/submarine-server/server-submitter/submitter-k8s/src/main/java/org/apache/submarine/server/submitter/k8s/parser/ServeSpecParser.java b/submarine-server/server-submitter/submitter-k8s/src/main/java/org/apache/submarine/server/submitter/k8s/parser/ServeSpecParser.java
deleted file mode 100644
index 0e42568..0000000
--- a/submarine-server/server-submitter/submitter-k8s/src/main/java/org/apache/submarine/server/submitter/k8s/parser/ServeSpecParser.java
+++ /dev/null
@@ -1,232 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements. See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership. The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License. You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied. See the License for the
- * specific language governing permissions and limitations
- * under the License.
- */
-
-package org.apache.submarine.server.submitter.k8s.parser;
-
-import org.apache.submarine.server.submitter.k8s.model.ingressroute.IngressRoute;
-import org.apache.submarine.server.submitter.k8s.model.ingressroute.IngressRouteSpec;
-import org.apache.submarine.server.submitter.k8s.model.ingressroute.SpecRoute;
-import org.apache.submarine.server.submitter.k8s.model.middlewares.Middlewares;
-import org.apache.submarine.server.submitter.k8s.model.middlewares.MiddlewaresSpec;
-import org.apache.submarine.server.submitter.k8s.model.middlewares.StripPrefix;
-
-import io.kubernetes.client.custom.IntOrString;
-import io.kubernetes.client.openapi.models.V1Container;
-import io.kubernetes.client.openapi.models.V1ContainerPort;
-import io.kubernetes.client.openapi.models.V1Deployment;
-import io.kubernetes.client.openapi.models.V1DeploymentSpec;
-import io.kubernetes.client.openapi.models.V1HTTPGetAction;
-import io.kubernetes.client.openapi.models.V1LabelSelector;
-import io.kubernetes.client.openapi.models.V1ObjectMeta;
-import io.kubernetes.client.openapi.models.V1PodSpec;
-import io.kubernetes.client.openapi.models.V1PodTemplateSpec;
-import io.kubernetes.client.openapi.models.V1Probe;
-import io.kubernetes.client.openapi.models.V1Service;
-import io.kubernetes.client.openapi.models.V1ServicePort;
-import io.kubernetes.client.openapi.models.V1ServiceSpec;
-
-import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.Collections;
-import java.util.HashMap;
-import java.util.HashSet;
-import java.util.Map;
-
-public class ServeSpecParser {
-
- // names
- String generalName;
- String podName;
- String containerName;
- String routeName;
- String svcName;
- String middlewareName;
-
- // path
- String routePath;
-
- // model_uri
- String modelURI;
-
- // cluster related
- String namespace;
- int PORT = 5000; // mlflow serve server is listening on 5000
-
- // constructor
- public ServeSpecParser(String modelName, String modelVersion, String namespace) {
- // names assignment
- generalName = modelName + "-" + modelVersion;
- podName = generalName + "-pod";
- containerName = generalName + "-container";
- routeName = generalName + "-ingressroute";
- svcName = generalName + "-service";
- middlewareName = generalName + "-middleware";
- // path assignment
- routePath = String.format("/serve/%s", generalName);
- // uri assignment
- modelURI = String.format("models:/%s/%s", modelName, modelVersion);
- // nameSpace
- this.namespace = namespace;
- }
-
- public V1Deployment getDeployment() {
- // Container related
- // TODO(byronhsu) This should not be hard-coded.
- final String serveImage =
- "apache/submarine:serve-0.7.0-SNAPSHOT";
-
- ArrayList<String> cmds = new ArrayList<>(
- Arrays.asList("mlflow", "models", "serve",
- "--model-uri", modelURI, "--host", "0.0.0.0")
- );
-
- V1Deployment deployment = new V1Deployment();
-
- V1ObjectMeta deploymentMetedata = new V1ObjectMeta();
- deploymentMetedata.setName(generalName);
- deployment.setMetadata(deploymentMetedata);
-
- V1DeploymentSpec deploymentSpec = new V1DeploymentSpec();
- deploymentSpec.setSelector(
- new V1LabelSelector().matchLabels(Collections.singletonMap("app", podName)) // match the template
- );
-
- V1PodTemplateSpec deploymentTemplateSpec = new V1PodTemplateSpec();
- deploymentTemplateSpec.setMetadata(
- new V1ObjectMeta().labels(Collections.singletonMap("app", podName)) // bind to replicaset and service
- );
-
- V1PodSpec deploymentTemplatePodSpec = new V1PodSpec();
-
- V1Container container = new V1Container();
- container.setName(containerName);
- container.setImage(serveImage);
- container.setCommand(cmds);
- container.setImagePullPolicy("IfNotPresent");
- container.addPortsItem(new V1ContainerPort().containerPort(PORT));
- container.setReadinessProbe(
- new V1Probe().httpGet(new V1HTTPGetAction().path("/ping").port(new IntOrString(PORT)))
- );
-
-
- deploymentTemplatePodSpec.addContainersItem(container);
- deploymentTemplateSpec.setSpec(deploymentTemplatePodSpec);
- deploymentSpec.setTemplate(deploymentTemplateSpec);
- deployment.setSpec(deploymentSpec);
-
- return deployment;
- }
- public V1Service getService() {
- V1Service svc = new V1Service();
- svc.metadata(new V1ObjectMeta().name(svcName));
-
- V1ServiceSpec svcSpec = new V1ServiceSpec();
- svcSpec.setSelector(Collections.singletonMap("app", podName)); // bind to pod
- svcSpec.addPortsItem(new V1ServicePort().protocol("TCP").targetPort(
- new IntOrString(PORT)).port(PORT));
- svc.setSpec(svcSpec);
- return svc;
- }
-
- public IngressRoute getIngressRoute() {
- IngressRoute ingressRoute = new IngressRoute();
- ingressRoute.setMetadata(
- new V1ObjectMeta().name(routeName).namespace((namespace))
- );
-
- IngressRouteSpec ingressRouteSpec = new IngressRouteSpec();
- ingressRouteSpec.setEntryPoints(new HashSet<>(Collections.singletonList("web")));
- SpecRoute specRoute = new SpecRoute();
- specRoute.setKind("Rule");
- specRoute.setMatch(String.format("PathPrefix(`%s`)", routePath));
-
- Map<String, Object> service = new HashMap<String, Object>() {{
- put("kind", "Service");
- put("name", svcName);
- put("port", PORT);
- put("namespace", namespace);
- }};
-
- specRoute.setServices(new HashSet<Map<String, Object>>() {{
- add(service);
- }});
-
- Map<String, String> middleware = new HashMap<String, String>() {{
- put("name", middlewareName);
- }};
-
- specRoute.setMiddlewares(new HashSet<Map<String, String>>() {{
- add(middleware);
- }});
-
- ingressRouteSpec.setRoutes(new HashSet<SpecRoute>() {{
- add(specRoute);
- }});
-
- ingressRoute.setSpec(ingressRouteSpec);
- return ingressRoute;
- }
-
- public Middlewares getMiddlewares() {
- Middlewares middleware = new Middlewares();
- middleware.setMetadata(new V1ObjectMeta().name(middlewareName).namespace(namespace));
-
- MiddlewaresSpec middlewareSpec = new MiddlewaresSpec().stripPrefix(
- new StripPrefix().prefixes(Arrays.asList(routePath))
- );
- middleware.setSpec(middlewareSpec);
- return middleware;
- }
-
-
- public String getGeneralName() {
- return this.generalName;
- }
-
- public String getPodName() {
- return this.podName;
- }
-
- public String getContainerName() {
- return this.containerName;
- }
- public String getRouteName() {
- return this.routeName;
- }
-
- public String getSvcName() {
- return this.svcName;
- }
-
- public String getMiddlewareName() {
- return this.middlewareName;
- }
-
- public String getRoutePath() {
- return this.routePath;
- }
-
- public String getModelURI() {
- return this.modelURI;
- }
-
- public String getNamespace() {
- return this.namespace;
- }
-}
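The removed Java class above wired together a Deployment, a Service, and a Traefik IngressRoute that shared names and a port constant. As a hedged, language-neutral illustration of just the Service piece it built (`getService()`): a manifest selecting the serve pod by its `app` label and exposing one TCP port. The names and the port value below are placeholders, not values taken from the real class.

```python
# Hypothetical sketch of the Service manifest the removed getService() produced.
# PORT is a stand-in for the class's port constant.
PORT = 5000

def serve_service_manifest(svc_name: str, pod_name: str) -> dict:
    """Build a plain-dict Service equivalent: select the pod by its
    "app" label and forward one TCP port straight through."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": svc_name},
        "spec": {
            "selector": {"app": pod_name},  # bind the Service to the pod
            "ports": [{"protocol": "TCP", "port": PORT, "targetPort": PORT}],
        },
    }

manifest = serve_service_manifest("serve-svc", "serve-pod")
```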
diff --git a/submarine-workbench/workbench-web/src/app/pages/workbench/experiment/experiment-home/experiment-home.component.html b/submarine-workbench/workbench-web/src/app/pages/workbench/experiment/experiment-home/experiment-home.component.html
index 3fa8ffb..9edbce6 100644
--- a/submarine-workbench/workbench-web/src/app/pages/workbench/experiment/experiment-home/experiment-home.component.html
+++ b/submarine-workbench/workbench-web/src/app/pages/workbench/experiment/experiment-home/experiment-home.component.html
@@ -34,6 +34,17 @@
target="_blank"
nzType="primary"
style="margin: 0px 4px 0px 4px"
+ [nzLoading]="isMlflowLoading"
+ [href]="mlflowUrl"
+ >
+ <i nz-icon nzType="radar-chart"></i>
+ MLflow UI
+ </a>
+ <a
+ nz-button
+ target="_blank"
+ nzType="primary"
+ style="margin: 0px 4px 0px 4px"
[nzLoading]="isTensorboardLoading"
[href]="tensorboardUrl"
>
diff --git a/submarine-workbench/workbench-web/src/app/pages/workbench/experiment/experiment-home/experiment-home.component.ts b/submarine-workbench/workbench-web/src/app/pages/workbench/experiment/experiment-home/experiment-home.component.ts
index e0a0de2..e78b1a8 100644
--- a/submarine-workbench/workbench-web/src/app/pages/workbench/experiment/experiment-home/experiment-home.component.ts
+++ b/submarine-workbench/workbench-web/src/app/pages/workbench/experiment/experiment-home/experiment-home.component.ts
@@ -72,6 +72,7 @@ export class ExperimentHomeComponent implements OnInit {
this.experimentService.emitInfo(null);
this.getTensorboardInfo(1000, 50000);
+ this.getMlflowInfo(1000, 100000);
this.onSwitchAutoReload();
}
@@ -172,4 +173,24 @@ export class ExperimentHomeComponent implements OnInit {
(err) => console.log(err)
);
}
+
+ getMlflowInfo(period: number, due: number) {
+ interval(period)
+ .pipe(
+ mergeMap(() => this.experimentService.getMlflowInfo()),
+ retryWhen((error) => error),
+ tap((x) => console.log(x)),
+ filter((res) => res.available),
+ take(1),
+ timeout(due)
+ )
+ .subscribe(
+ (res) => {
+ this.isMlflowLoading = !res.available;
+ this.mlflowUrl = res.url;
+ },
+ (err) => console.log(err)
+ );
+ }
}
+
\ No newline at end of file
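The `getMlflowInfo` polling above chains RxJS operators: poll on an `interval`, retry failed requests, pass only responses where `available` is true, `take(1)`, and give up after a `timeout`. The same control flow can be sketched imperatively; a minimal Python equivalent with the HTTP call stubbed out (the helper name and stub data are illustrative, not part of the codebase):

```python
import time

def poll_until_available(fetch, period_s=0.01, timeout_s=1.0):
    """Call fetch() every period_s seconds until it reports available,
    mirroring the interval/retryWhen/filter/take(1)/timeout pipeline."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            res = fetch()  # may raise, like the retried HTTP request
        except Exception:
            res = None     # retryWhen: swallow the error and poll again
        if res and res.get("available"):
            return res     # filter(available) + take(1)
        time.sleep(period_s)
    raise TimeoutError("service did not become available in time")

# Stubbed fetch: unavailable twice, then ready with a URL.
responses = iter([{"available": False}, {"available": False},
                  {"available": True, "url": "/mlflow"}])
result = poll_until_available(lambda: next(responses))
```

The component then flips its loading flag and stores the URL once the first available response arrives, exactly as the `subscribe` callback does.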
diff --git a/submarine-workbench/workbench-web/src/app/services/experiment.service.ts b/submarine-workbench/workbench-web/src/app/services/experiment.service.ts
index 56969b9..5690cfc 100644
--- a/submarine-workbench/workbench-web/src/app/services/experiment.service.ts
+++ b/submarine-workbench/workbench-web/src/app/services/experiment.service.ts
@@ -278,6 +278,19 @@ export class ExperimentService {
);
}
+ getMlflowInfo(): Observable<MlflowInfo> {
+ const apiUrl = this.baseApi.getRestApi('/v1/experiment/mlflow');
+ return this.httpClient.get<Rest<MlflowInfo>>(apiUrl).pipe(
+ switchMap((res) => {
+ if (res.success) {
+ return of(res.result);
+ } else {
+ throw this.baseApi.createRequestError(res.message, res.code, apiUrl, 'get');
+ }
+ })
+ );
+ }
+
durationHandle(secs: number) {
const hr = Math.floor(secs / 3600);
const min = Math.floor((secs - hr * 3600) / 60);
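The new `getMlflowInfo` service method unwraps the workbench's REST envelope: when `success` is true it emits `result`, otherwise it raises a request error built from `message` and `code`. A minimal Python sketch of that unwrap rule, with a hypothetical stand-in error type:

```python
class RestRequestError(Exception):
    """Stand-in for the error baseApi.createRequestError produces."""

def unwrap_rest(envelope: dict):
    """Return envelope['result'] when success is true, else raise,
    mirroring the switchMap in getMlflowInfo()."""
    if envelope.get("success"):
        return envelope["result"]
    raise RestRequestError(envelope.get("message"), envelope.get("code"))

info = unwrap_rest({"success": True,
                    "result": {"available": True, "url": "/mlflow"}})
```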
diff --git a/website/docs/gettingStarted/quickstart.md b/website/docs/gettingStarted/quickstart.md
index 0d38456..60f8086 100644
--- a/website/docs/gettingStarted/quickstart.md
+++ b/website/docs/gettingStarted/quickstart.md
@@ -111,7 +111,7 @@ Reference: https://github.com/kubeflow/tf-operator/blob/master/examples/v1/distr
import tensorflow_datasets as tfds
import tensorflow as tf
from tensorflow.keras import layers, models
-from submarine import ModelsClient
+import submarine
def make_datasets_unbatched():
BUFFER_SIZE = 10000
@@ -167,14 +167,12 @@ def main():
def on_epoch_end(self, epoch, logs=None):
# monitor the loss and accuracy
print(logs)
- modelClient.log_metrics({"loss": logs["loss"], "accuracy": logs["accuracy"]}, epoch)
+ submarine.log_metrics({"loss": logs["loss"], "accuracy": logs["accuracy"]}, epoch)
- with modelClient.start() as run:
- multi_worker_model.fit(ds_train, epochs=10, steps_per_epoch=70, callbacks=[MyCallback()])
+ multi_worker_model.fit(ds_train, epochs=10, steps_per_epoch=70, callbacks=[MyCallback()])
if __name__ == '__main__':
- modelClient = ModelsClient()
main()
```
@@ -200,14 +198,12 @@ eval $(minikube docker-env)
4. The experiment is successfully submitted

-### 4. Monitor the process (modelClient)
+### 4. Monitor the process
-1. In our code, we use `modelClient` from `submarine-sdk` to record the metrics. To see the result, click `MLflow UI` in the workbench.
-2. To compare the metrics of each worker, you can select all workers and then click `compare`
-
- 
-
- 
+1. In our code, we use `submarine` from `submarine-sdk` to record the metrics. To see the result, click the corresponding experiment named `quickstart` in the workbench.
+2. To see the metrics of each worker, select a worker from the list at the top left.
+
+
### 5. Serve the model (In development)
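The quickstart migration above replaces the `ModelsClient` context manager with module-level `submarine` functions: the `with modelClient.start() as run:` wrapper disappears and the callback calls `submarine.log_metrics(...)` directly. A minimal, runnable sketch of the migrated callback shape, with the SDK call stubbed out so it executes anywhere (the stub `log_metrics` stands in for `submarine.log_metrics`, which requires a Submarine deployment):

```python
# Stand-in for submarine.log_metrics: record (step, metrics) pairs locally.
logged = []

def log_metrics(metrics, step):
    logged.append((step, metrics))

class MyCallback:
    """Shape of the quickstart's Keras callback after the migration:
    metrics are forwarded straight to a module-level logging function,
    with no run context to enter or exit."""
    def on_epoch_end(self, epoch, logs=None):
        log_metrics({"loss": logs["loss"], "accuracy": logs["accuracy"]}, epoch)

cb = MyCallback()
cb.on_epoch_end(0, {"loss": 0.5, "accuracy": 0.8})
```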
diff --git a/website/docs/userDocs/submarine-sdk/model-client.md b/website/docs/userDocs/submarine-sdk/model-client.md
deleted file mode 100644
index 247acc6..0000000
--- a/website/docs/userDocs/submarine-sdk/model-client.md
+++ /dev/null
@@ -1,143 +0,0 @@
----
-title: Model Client
----
-
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements. See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership. The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied. See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-## class ModelsClient()
-
-The submarine ModelsClient provides a high-level API for logging metrics / parameters and managing models.
-
-### `ModelsClient(tracking_uri=None, registry_uri=None)->ModelsClient`
-
-Initialize a `ModelsClient` instance.
-
-> **Parameters**
- **tracking_uri**: If run in Submarine, you do not need to specify it. Otherwise, specify the external tracking_uri.
- **registry_uri**: If run in Submarine, you do not need to specify it. Otherwise, specify the external registry_uri.
-
-> **Returns**
- - ModelsClient instance
-
-Example
-
-```python
-from submarine import ModelsClient
-
-modelClient = ModelsClient(tracking_uri="0.0.0.0:4000", registry_uri="0.0.0.0:5000")
-```
-### `ModelsClient.start()->[Active Run]`
-
-For details, see [Active Run](https://mlflow.org/docs/latest/_modules/mlflow/tracking/fluent.html#ActiveRun).
-
-Start a new MLflow run, and direct the logging of the artifacts and metadata to the Run named "worker_i" under Experiment "job_id". In distributed training, the worker and job id are parsed from environment variables. In local training, they are generated.
-
-> **Returns**
- - Active Run
-
-### `ModelsClient.log_param(key, value)->None`
-
-Log a parameter under the current run.
-
-> **Parameters**
- - **key** – Parameter name
- - **value** – Parameter value
-
-Example
-
-```python
-from submarine import ModelsClient
-
-modelClient = ModelsClient()
-with modelClient.start() as run:
- modelClient.log_param("learning_rate", 0.01)
-```
-
-### `ModelsClient.log_params(params)->None`
-
-Log a batch of params for the current run.
-
-> **Parameters**
- - **params** – Dictionary of param_name: String -> value
-
-Example
-
-```python
-from submarine import ModelsClient
-
-params = {"learning_rate": 0.01, "n_estimators": 10}
-
-modelClient = ModelsClient()
-with modelClient.start() as run:
- modelClient.log_params(params)
-```
-
-### `ModelsClient.log_metric(self, key, value, step=None)->None`
-
-Log a metric under the current run.
-
-> **Parameters**
- - **key** – Metric name (string).
- - **value** – Metric value (float).
- - **step** – Metric step (int). Defaults to zero if unspecified.
-
-Example
-
-```python
-from submarine import ModelsClient
-
-modelClient = ModelsClient()
-with modelClient.start() as run:
- modelClient.log_metric("mse", 2500.00)
-```
-
-### `ModelsClient.log_metrics(self, metrics, step=None)->None`
-
-Log multiple metrics for the current run.
-
-> **Parameters**
- - **metrics** – Dictionary of metric_name: String -> value: Float.
- **step** – A single integer step at which to log the specified metrics. If unspecified, each metric is logged at step zero.
-
-Example
-
-```python
-from submarine import ModelsClient
-
-metrics = {"mse": 2500.00, "rmse": 50.00}
-
-modelClient = ModelsClient()
-with modelClient.start() as run:
- modelClient.log_metrics(metrics)
-```
-
-### `(Beta) ModelsClient.save_model(self, model_type, model, artifact_path, registered_model_name=None)`
-
-Save model to model registry.
-### `(Beta) ModelsClient.load_model(self, name, version)->mlflow.pyfunc.PyFuncModel`
-
-Load a model from model registry.
-### `(Beta) ModelsClient.update_model(self, name, new_name)->None`
-
-Rename a model in the model registry.
-
-### `(Beta) ModelsClient.delete_model(self, name, version)->None`
-
-Delete a model in model registry.
diff --git a/website/docs/userDocs/submarine-sdk/pysubmarine/development.md b/website/docs/userDocs/submarine-sdk/pysubmarine/development.md
index 40bb253..d0f1899 100644
--- a/website/docs/userDocs/submarine-sdk/pysubmarine/development.md
+++ b/website/docs/userDocs/submarine-sdk/pysubmarine/development.md
@@ -155,7 +155,7 @@ to generate pysubmarine client API that used to communicate with submarine serve
### Model Management Model Development
For local development, we can access cluster's service easily thanks to
[telepresence](https://www.telepresence.io/).
-To elaborate, we can develop the sdk in local but can reach out to mlflow server by proxy.
+To elaborate, we can develop the SDK locally but still reach the database and minio server through the proxy.
1. Install telepresence follow [the instruction](https://www.telepresence.io/reference/install).
2. Start proxy pod
diff --git a/website/sidebars.js b/website/sidebars.js
index 794742d..5855017 100644
--- a/website/sidebars.js
+++ b/website/sidebars.js
@@ -39,7 +39,6 @@ module.exports = {
{
"Submarine SDK": [
"userDocs/submarine-sdk/experiment-client",
- "userDocs/submarine-sdk/model-client",
"userDocs/submarine-sdk/tracking",
],
},
diff --git a/website/static/img/quickstart-ui-1.png b/website/static/img/quickstart-ui-1.png
new file mode 100644
index 0000000..b9e8289
Binary files /dev/null and b/website/static/img/quickstart-ui-1.png differ
diff --git a/website/static/img/quickstart-ui-2.png b/website/static/img/quickstart-ui-2.png
new file mode 100644
index 0000000..b4580e9
Binary files /dev/null and b/website/static/img/quickstart-ui-2.png differ
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]