tengqm commented on code in PR #6052:
URL: https://github.com/apache/gravitino/pull/6052#discussion_r1900049193

########## docs/manage-model-metadata-using-gravitino.md: ##########
@@ -0,0 +1,637 @@ +--- +title: Manage model metadata using Gravitino +slug: /manage-model-metadata-using-gravitino +date: 2024-12-26 +keyword: Gravitino model metadata manage +license: This software is licensed under the Apache License version 2. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +This page introduces how to manage model metadata in Apache Gravitino. Gravitino model catalog +is a kind of model registry, which provides the ability to manage machine learning models' +versioning metadata. It follows the typical Gravitino 3-level namespace (catalog, schema, and +model) and supports managing the versions for each model. + +Currently, it supports model and model version registering, listing, loading, and deleting. + +To use model catalog, please make sure that: + + - Gravitino server has started, and assume the host and port [http://localhost:8090](http://localhost:8090).

Review Comment:
```suggestion
 - The Gravitino server has started and is serving at, e.g. [http://localhost:8090](http://localhost:8090).
```

########## docs/manage-model-metadata-using-gravitino.md: ##########
@@ -0,0 +1,637 @@ +--- +title: Manage model metadata using Gravitino +slug: /manage-model-metadata-using-gravitino +date: 2024-12-26 +keyword: Gravitino model metadata manage +license: This software is licensed under the Apache License version 2. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +This page introduces how to manage model metadata in Apache Gravitino. Gravitino model catalog +is a kind of model registry, which provides the ability to manage machine learning models' +versioning metadata. It follows the typical Gravitino 3-level namespace (catalog, schema, and +model) and supports managing the versions for each model. + +Currently, it supports model and model version registering, listing, loading, and deleting.
+ +To use model catalog, please make sure that: + + - Gravitino server has started, and assume the host and port [http://localhost:8090](http://localhost:8090). + - A metalake has been created and [enabled](./manage-metalake-using-gravitino.md#enable-a-metalake) + +## Catalog operations + +### Create a catalog + +:::info +For a model catalog, you must specify the catalog `type` as `MODEL` when creating the catalog. +Please also be aware that the `provider` is not required for a model catalog. +::: + +You can create a catalog by sending a `POST` request to the `/api/metalakes/{metalake_name}/catalogs` +endpoint or just use the Gravitino Java/Python client. The following is an example of creating a +catalog: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "name": "model_catalog", + "type": "MODEL", + "comment": "This is a model catalog", + "properties": { + "k1": "v1" + } +}' http://localhost:8090/api/metalakes/example/catalogs +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +GravitinoClient gravitinoClient = GravitinoClient + .builder("http://localhost:8090") + .withMetalake("example") + .build(); + +Map<String, String> properties = ImmutableMap.<String, String>builder() + .put("k1", "v1") + .build(); + +Catalog catalog = gravitinoClient.createCatalog( + "model_catalog", + Type.MODEL, + "This is a model catalog", + properties); +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") +catalog = gravitino_client.create_catalog(name="catalog", + type=Catalog.Type.MODEL, + provider=None, + comment="This is a model catalog", + properties={"k1": "v1"}) +``` + +</TabItem> +</Tabs> + +### Load a catalog + +Refer to [Load a 
catalog](./manage-relational-metadata-using-gravitino.md#load-a-catalog) +in relational catalog for more details. For a model catalog, the load operation is the same. + +### Alter a catalog + +Refer to [Alter a catalog](./manage-relational-metadata-using-gravitino.md#alter-a-catalog) +in relational catalog for more details. For a model catalog, the alter operation is the same. + +### Drop a catalog + +Refer to [Drop a catalog](./manage-relational-metadata-using-gravitino.md#drop-a-catalog) +in relational catalog for more details. For a model catalog, the drop operation is the same. + +### List all catalogs in a metalake + +Please refer to [List all catalogs in a metalake](./manage-relational-metadata-using-gravitino.md#list-all-catalogs-in-a-metalake) +in relational catalog for more details. For a model catalog, the list operation is the same. + +### List all catalogs' information in a metalake + +Please refer to [List all catalogs' information in a metalake](./manage-relational-metadata-using-gravitino.md#list-all-catalogs-information-in-a-metalake) +in relational catalog for more details. For a model catalog, the list operation is the same. + +## Schema operations + +`Schema` is a virtual namespace in a model catalog, which is used to organize the models. It +is similar to the concept of `schema` in the relational catalog. + +:::tip +Users should create a metalake and a catalog before creating a schema. +::: + +### Create a schema + +You can create a schema by sending a `POST` request to the `/api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas` +endpoint or just use the Gravitino Java/Python client. 
The following is an example of creating a +schema: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "name": "model_schema", + "comment": "This is a model schema", + "properties": { + "k1": "v1" + } +}' http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +GravitinoClient gravitinoClient = GravitinoClient + .builder("http://localhost:8090") + .withMetalake("example") + .build(); + +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); + +SupportsSchemas supportsSchemas = catalog.asSchemas(); + +Map<String, String> schemaProperties = ImmutableMap.<String, String>builder() + .put("k1", "v1") + .build(); +Schema schema = supportsSchemas.createSchema( + "model_schema", + "This is a schema", + schemaProperties); +// ... +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +catalog.as_schemas().create_schema(name="model_schema", + comment="This is a schema", + properties={"k1": "v1"}) +``` + +</TabItem> +</Tabs> + +### Load a schema + +Please refer to [Load a schema](./manage-relational-metadata-using-gravitino.md#load-a-schema) +in relational catalog for more details. For a model catalog, the schema load operation is the +same. + +### Alter a schema + +Please refer to [Alter a schema](./manage-relational-metadata-using-gravitino.md#alter-a-schema) +in relational catalog for more details. For a model catalog, the schema alter operation is the +same. + +### Drop a schema + +Please refer to [Drop a schema](./manage-relational-metadata-using-gravitino.md#drop-a-schema) +in relational catalog for more details. 
For a model catalog, the schema drop operation is the +same. + +Note that the drop operation will delete all the model metadata under this schema if `cascade` +is set to `true`. + +### List all schemas under a catalog + +Please refer to [List all schemas under a catalog](./manage-relational-metadata-using-gravitino.md#list-all-schemas-under-a-catalog) +in relational catalog for more details. For a model catalog, the schema list operation is the +same. + +## Model operations + +:::tip + - Users should create a metalake, a catalog, and a schema before creating a model. +::: + +### Register a model + +You can create a model by sending a `POST` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models` endpoint or just use the Gravitino +Java/Python client. The following is an example of creating a model: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "name": "example_model", + "comment": "This is an example model", + "properties": { + "k1": "v1" + } +}' http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +GravitinoClient gravitinoClient = GravitinoClient + .builder("http://localhost:8090") + .withMetalake("example") + .build(); + +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +Map<String, String> propertiesMap = ImmutableMap.<String, String>builder() + .put("k1", "v1") + .build(); + +Model model = catalog.asModelCatalog().registerModel( + NameIdentifier.of("model_schema", "example_model"), + "This is an example model", + propertiesMap); +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = 
gravitino_client.load_catalog(name="model_catalog") +model = catalog.as_model_catalog().register_model(ident=NameIdentifier.of("model_schema", "example_model"), + comment="This is an example model", + properties={"k1": "v1"}) +``` + +</TabItem> +</Tabs> + +### Get a model + +You can get a model by sending a `GET` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models/{model_name}` endpoint or by using the +Gravitino Java/Python client. The following is an example of getting a model: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" \ +http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models/example_model +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +// ... +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +Model model = catalog.asModelCatalog().getModel(NameIdentifier.of("model_schema", "example_model")); +// ... +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +model: Model = catalog.as_model_catalog().get_model(ident=NameIdentifier.of("model_schema", "example_model")) +``` + +</TabItem> +</Tabs> + +### Delete a model + +You can delete a model by sending a `DELETE` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models/{model_name}` endpoint or by using the +Gravitino Java/Python client. 
The following is an example of deleting a model: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X DELETE -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" \ +http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models/example_model +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +// ... +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +catalog.asModelCatalog().deleteModel(NameIdentifier.of("model_schema", "example_model")); +// ... +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +catalog.as_model_catalog().delete_model(NameIdentifier.of("model_schema", "example_model")) +``` + +</TabItem> +</Tabs> + +Note that the delete operation will delete all the model versions under this model. + +### List models + +You can list all the models in a schema by sending a `GET` request to the `/api/metalakes/ +{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/models` endpoint or by using the +Gravitino Java/Python client. The following is an example of listing all the models in a schema: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" \ +http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +// ... +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +NameIdentifier[] identifiers = catalog.asModelCatalog().listModels(Namespace.of("model_schema")); +// ... 
+``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +model_list = catalog.as_model_catalog().list_models(namespace=Namespace.of("model_schema")) +``` + +</TabItem> +</Tabs> + +## Model version operations + +:::tip + - Users should create a metalake, a catalog, a schema, and a model before linking a model version + to the model. +::: + +### Link a model version + +You can link a model version by sending a `POST` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models/{model_name}` endpoint or by using +the Gravitino Java/Python client. The following is an example of linking a model version: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "uri": "path/to/model", + "aliases": ["alias1", "alias2"], + "comment": "This is version 0", + "properties": { + "k1": "v1" + } +}' http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models/example_model +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +// ... 
+Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +catalog.asModelCatalog().linkModelVersion( + NameIdentifier.of("model_schema", "example_model"), + "path/to/model", + new String[] {"alias1", "alias2"}, + "This is version 0", + ImmutableMap.of("k1", "v1")); +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +catalog.as_model_catalog().link_model_version(model_ident=NameIdentifier.of("model_schema", "example_model"), + uri="path/to/model", + aliases=["alias1", "alias2"], + comment="This is version 0", + properties={"k1": "v1"}) +``` + +</TabItem> +</Tabs> + +The comment and properties of model version can be different from the model. + +### Get a model version + +You can get a model version by sending a `GET` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models/{model_name}/versions/{version_number}`

Review Comment:
Okay, here it is ... `/api/.../models/{model}/versions/{version}` is more RESTful than the previous API. We really should POST to the `/api/.../models/{model}/versions` URI for "creating" a ModelVersion resource then.

########## docs/model-catalog.md: ##########
@@ -0,0 +1,87 @@ +--- +title: "Model catalog" +slug: /model-catalog +date: 2024-12-26 +keyword: model catalog +license: "This software is licensed under the Apache License version 2." +--- + +## Introduction + +Model catalog is a metadata catalog that offers the unified interfaces to manage the metadata of +machine learning models in a centralized way. It follows the typical Gravitino 3-level namespace +(catalog, schema, and model) to manage the ML models metadata. In addition, it supports +managing the versions for each model.
+ +The advantages of using model catalog are: + +* Centralized management of ML models with user defined namespaces. Users can better discover + and govern the models from semantic level, rather than managing the model files directly. +* Version management for each model. Users can easily track the model versions and manage the + model lifecycle. + +The key concept of model management is to manage the path (URI) of the model. Instead of +managing the model storage path physically and separately, model metadata defines the mapping +relation between the model name and the storage path. In the meantime, with the support of +extensible properties of model metadata, users can define the model metadata with more detailed information +rather than just the storage path. + +* **Model**: The model is a metadata defined in the model catalog, to manage the ML models. Each + model can have many **Model Versions**, and each version can have its own properties. Models + can be retrieved by the name. +* **Model Version**: The model version is a metadata defined in the model catalog, to manage each + version of the ML model. Each version has a unique version number, and can have its own + properties and storage path. Model version can be retrieved by the model name and version + number. Also, each version can have a list of aliases, which can also be used to retrieve. + +## Catalog + +### Catalog properties + +Model catalog doesn't have specific properties. It uses the [common catalog properties](./gravitino-server-config.md#apache-gravitino-catalog-properties-configuration).

Review Comment:
```suggestion
A model catalog doesn't have specific properties. It uses the [common catalog properties](./gravitino-server-config.md#apache-gravitino-catalog-properties-configuration).
```

########## docs/manage-model-metadata-using-gravitino.md: ##########
@@ -0,0 +1,637 @@ +--- +title: Manage model metadata using Gravitino +slug: /manage-model-metadata-using-gravitino +date: 2024-12-26 +keyword: Gravitino model metadata manage +license: This software is licensed under the Apache License version 2. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +This page introduces how to manage model metadata in Apache Gravitino. Gravitino model catalog +is a kind of model registry, which provides the ability to manage machine learning models' +versioning metadata. It follows the typical Gravitino 3-level namespace (catalog, schema, and +model) and supports managing the versions for each model. + +Currently, it supports model and model version registering, listing, loading, and deleting. + +To use model catalog, please make sure that:

Review Comment:
```suggestion
To use the model catalog, please make sure that:
```

########## docs/manage-model-metadata-using-gravitino.md: ##########
@@ -0,0 +1,637 @@ +--- +title: Manage model metadata using Gravitino +slug: /manage-model-metadata-using-gravitino +date: 2024-12-26 +keyword: Gravitino model metadata manage +license: This software is licensed under the Apache License version 2. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +This page introduces how to manage model metadata in Apache Gravitino. Gravitino model catalog +is a kind of model registry, which provides the ability to manage machine learning models' +versioning metadata. It follows the typical Gravitino 3-level namespace (catalog, schema, and

Review Comment:
```suggestion
versioned metadata. It follows the typical Gravitino 3-level namespace (catalog, schema, and
```

########## docs/manage-model-metadata-using-gravitino.md: ##########
@@ -0,0 +1,637 @@ +--- +title: Manage model metadata using Gravitino +slug: /manage-model-metadata-using-gravitino +date: 2024-12-26 +keyword: Gravitino model metadata manage +license: This software is licensed under the Apache License version 2. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +This page introduces how to manage model metadata in Apache Gravitino. Gravitino model catalog +is a kind of model registry, which provides the ability to manage machine learning models' +versioning metadata. It follows the typical Gravitino 3-level namespace (catalog, schema, and +model) and supports managing the versions for each model. + +Currently, it supports model and model version registering, listing, loading, and deleting. + +To use model catalog, please make sure that: + + - Gravitino server has started, and assume the host and port [http://localhost:8090](http://localhost:8090). + - A metalake has been created and [enabled](./manage-metalake-using-gravitino.md#enable-a-metalake) + +## Catalog operations + +### Create a catalog + +:::info +For a model catalog, you must specify the catalog `type` as `MODEL` when creating the catalog. +Please also be aware that the `provider` is not required for a model catalog. +::: + +You can create a catalog by sending a `POST` request to the `/api/metalakes/{metalake_name}/catalogs` +endpoint or just use the Gravitino Java/Python client. 
The following is an example of creating a +catalog: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "name": "model_catalog", + "type": "MODEL", + "comment": "This is a model catalog", + "properties": { + "k1": "v1" + } +}' http://localhost:8090/api/metalakes/example/catalogs +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +GravitinoClient gravitinoClient = GravitinoClient + .builder("http://localhost:8090") + .withMetalake("example") + .build(); + +Map<String, String> properties = ImmutableMap.<String, String>builder() + .put("k1", "v1") + .build(); + +Catalog catalog = gravitinoClient.createCatalog( + "model_catalog", + Type.MODEL, + "This is a model catalog", + properties); +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") +catalog = gravitino_client.create_catalog(name="catalog", + type=Catalog.Type.MODEL, + provider=None, + comment="This is a model catalog", + properties={"k1": "v1"}) +``` + +</TabItem> +</Tabs> + +### Load a catalog + +Refer to [Load a catalog](./manage-relational-metadata-using-gravitino.md#load-a-catalog) +in relational catalog for more details. For a model catalog, the load operation is the same. + +### Alter a catalog + +Refer to [Alter a catalog](./manage-relational-metadata-using-gravitino.md#alter-a-catalog) +in relational catalog for more details. For a model catalog, the alter operation is the same. + +### Drop a catalog + +Refer to [Drop a catalog](./manage-relational-metadata-using-gravitino.md#drop-a-catalog) +in relational catalog for more details. For a model catalog, the drop operation is the same. 
+ +### List all catalogs in a metalake + +Please refer to [List all catalogs in a metalake](./manage-relational-metadata-using-gravitino.md#list-all-catalogs-in-a-metalake) +in relational catalog for more details. For a model catalog, the list operation is the same. + +### List all catalogs' information in a metalake + +Please refer to [List all catalogs' information in a metalake](./manage-relational-metadata-using-gravitino.md#list-all-catalogs-information-in-a-metalake) +in relational catalog for more details. For a model catalog, the list operation is the same. + +## Schema operations + +`Schema` is a virtual namespace in a model catalog, which is used to organize the models. It +is similar to the concept of `schema` in the relational catalog. + +:::tip +Users should create a metalake and a catalog before creating a schema. +::: + +### Create a schema + +You can create a schema by sending a `POST` request to the `/api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas` +endpoint or just use the Gravitino Java/Python client. 
The following is an example of creating a +schema: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "name": "model_schema", + "comment": "This is a model schema", + "properties": { + "k1": "v1" + } +}' http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +GravitinoClient gravitinoClient = GravitinoClient + .builder("http://localhost:8090") + .withMetalake("example") + .build(); + +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); + +SupportsSchemas supportsSchemas = catalog.asSchemas(); + +Map<String, String> schemaProperties = ImmutableMap.<String, String>builder() + .put("k1", "v1") + .build(); +Schema schema = supportsSchemas.createSchema( + "model_schema", + "This is a schema", + schemaProperties); +// ... +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +catalog.as_schemas().create_schema(name="model_schema", + comment="This is a schema", + properties={"k1": "v1"}) +``` + +</TabItem> +</Tabs> + +### Load a schema + +Please refer to [Load a schema](./manage-relational-metadata-using-gravitino.md#load-a-schema) +in relational catalog for more details. For a model catalog, the schema load operation is the +same. + +### Alter a schema + +Please refer to [Alter a schema](./manage-relational-metadata-using-gravitino.md#alter-a-schema) +in relational catalog for more details. For a model catalog, the schema alter operation is the +same. + +### Drop a schema + +Please refer to [Drop a schema](./manage-relational-metadata-using-gravitino.md#drop-a-schema) +in relational catalog for more details. 
For a model catalog, the schema drop operation is the +same. + +Note that the drop operation will delete all the model metadata under this schema if `cascade` +is set to `true`. + +### List all schemas under a catalog + +Please refer to [List all schemas under a catalog](./manage-relational-metadata-using-gravitino.md#list-all-schemas-under-a-catalog) +in relational catalog for more details. For a model catalog, the schema list operation is the +same. + +## Model operations + +:::tip + - Users should create a metalake, a catalog, and a schema before creating a model. +::: + +### Register a model

Review Comment:
Better to use consistent terms: either 'register' vs. 'unregister', or 'create' vs. 'delete'. Pairing 'register' with 'delete' sounds weird.

########## docs/model-catalog.md: ##########
@@ -0,0 +1,87 @@ +--- +title: "Model catalog" +slug: /model-catalog +date: 2024-12-26 +keyword: model catalog +license: "This software is licensed under the Apache License version 2." +--- + +## Introduction + +Model catalog is a metadata catalog that offers the unified interfaces to manage the metadata of

Review Comment:
```suggestion
A model catalog is a metadata catalog that provides a unified interface to manage the metadata of
```

########## docs/model-catalog.md: ##########
@@ -0,0 +1,87 @@ +--- +title: "Model catalog" +slug: /model-catalog +date: 2024-12-26 +keyword: model catalog +license: "This software is licensed under the Apache License version 2." +--- + +## Introduction + +Model catalog is a metadata catalog that offers the unified interfaces to manage the metadata of +machine learning models in a centralized way. It follows the typical Gravitino 3-level namespace +(catalog, schema, and model) to manage the ML models metadata. In addition, it supports +managing the versions for each model. + +The advantages of using model catalog are: + +* Centralized management of ML models with user defined namespaces. 
Users can better discover + and govern the models from semantic level, rather than managing the model files directly. +* Version management for each model. Users can easily track the model versions and manage the + model lifecycle. + +The key concept of model management is to manage the path (URI) of the model. Instead of +managing the model storage path physically and separately, model metadata defines the mapping

Review Comment:
```suggestion
managing the model storage physically and separately, model metadata defines the mapping
```

########## docs/manage-model-metadata-using-gravitino.md: ##########
@@ -0,0 +1,637 @@ +--- +title: Manage model metadata using Gravitino +slug: /manage-model-metadata-using-gravitino +date: 2024-12-26 +keyword: Gravitino model metadata manage +license: This software is licensed under the Apache License version 2. +--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +This page introduces how to manage model metadata in Apache Gravitino. Gravitino model catalog +is a kind of model registry, which provides the ability to manage machine learning models' +versioning metadata. It follows the typical Gravitino 3-level namespace (catalog, schema, and +model) and supports managing the versions for each model. + +Currently, it supports model and model version registering, listing, loading, and deleting. + +To use model catalog, please make sure that: + + - Gravitino server has started, and assume the host and port [http://localhost:8090](http://localhost:8090). + - A metalake has been created and [enabled](./manage-metalake-using-gravitino.md#enable-a-metalake) + +## Catalog operations + +### Create a catalog + +:::info +For a model catalog, you must specify the catalog `type` as `MODEL` when creating the catalog. +Please also be aware that the `provider` is not required for a model catalog. 
+::: + +You can create a catalog by sending a `POST` request to the `/api/metalakes/{metalake_name}/catalogs` +endpoint or just use the Gravitino Java/Python client. The following is an example of creating a +catalog: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "name": "model_catalog", + "type": "MODEL", + "comment": "This is a model catalog", + "properties": { + "k1": "v1" + } +}' http://localhost:8090/api/metalakes/example/catalogs +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +GravitinoClient gravitinoClient = GravitinoClient + .builder("http://localhost:8090") + .withMetalake("example") + .build(); + +Map<String, String> properties = ImmutableMap.<String, String>builder() + .put("k1", "v1") + .build(); + +Catalog catalog = gravitinoClient.createCatalog( + "model_catalog", + Type.MODEL, + "This is a model catalog", + properties); +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") +catalog = gravitino_client.create_catalog(name="catalog", + type=Catalog.Type.MODEL, + provider=None, + comment="This is a model catalog", + properties={"k1": "v1"}) +``` + +</TabItem> +</Tabs> + +### Load a catalog + +Refer to [Load a catalog](./manage-relational-metadata-using-gravitino.md#load-a-catalog) +in relational catalog for more details. For a model catalog, the load operation is the same. + +### Alter a catalog + +Refer to [Alter a catalog](./manage-relational-metadata-using-gravitino.md#alter-a-catalog) +in relational catalog for more details. For a model catalog, the alter operation is the same. + +### Drop a catalog + +Refer to [Drop a catalog](./manage-relational-metadata-using-gravitino.md#drop-a-catalog) +in relational catalog for more details. 
For a model catalog, the drop operation is the same. + +### List all catalogs in a metalake + +Please refer to [List all catalogs in a metalake](./manage-relational-metadata-using-gravitino.md#list-all-catalogs-in-a-metalake) +in relational catalog for more details. For a model catalog, the list operation is the same. + +### List all catalogs' information in a metalake + +Please refer to [List all catalogs' information in a metalake](./manage-relational-metadata-using-gravitino.md#list-all-catalogs-information-in-a-metalake) +in relational catalog for more details. For a model catalog, the list operation is the same. + +## Schema operations + +`Schema` is a virtual namespace in a model catalog, which is used to organize the models. It +is similar to the concept of `schema` in the relational catalog. + +:::tip +Users should create a metalake and a catalog before creating a schema. +::: + +### Create a schema + +You can create a schema by sending a `POST` request to the `/api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas` +endpoint or just use the Gravitino Java/Python client. 
The following is an example of creating a +schema: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "name": "model_schema", + "comment": "This is a model schema", + "properties": { + "k1": "v1" + } +}' http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +GravitinoClient gravitinoClient = GravitinoClient + .builder("http://localhost:8090") + .withMetalake("example") + .build(); + +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); + +SupportsSchemas supportsSchemas = catalog.asSchemas(); + +Map<String, String> schemaProperties = ImmutableMap.<String, String>builder() + .put("k1", "v1") + .build(); +Schema schema = supportsSchemas.createSchema( + "model_schema", + "This is a schema", + schemaProperties); +// ... +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +catalog.as_schemas().create_schema(name="model_schema", + comment="This is a schema", + properties={"k1": "v1"}) +``` + +</TabItem> +</Tabs> + +### Load a schema + +Please refer to [Load a schema](./manage-relational-metadata-using-gravitino.md#load-a-schema) +in relational catalog for more details. For a model catalog, the schema load operation is the +same. + +### Alter a schema + +Please refer to [Alter a schema](./manage-relational-metadata-using-gravitino.md#alter-a-schema) +in relational catalog for more details. For a model catalog, the schema alter operation is the +same. + +### Drop a schema + +Please refer to [Drop a schema](./manage-relational-metadata-using-gravitino.md#drop-a-schema) +in relational catalog for more details. 
For a model catalog, the schema drop operation is the +same. + +Note that the drop operation will delete all the model metadata under this schema if `cascade` +set to `true`. + +### List all schemas under a catalog + +Please refer to [List all schemas under a catalog](./manage-relational-metadata-using-gravitino.md#list-all-schemas-under-a-catalog) +in relational catalog for more details. For a model catalog, the schema list operation is the +same. + +## Model operations + +:::tip + - Users should create a metalake, a catalog, and a schema before creating a model. +::: + +### Register a model + +You can create a model by sending a `POST` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models` endpoint or just use the Gravitino +Java/Python client. The following is an example of creating a model: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "name": "example_model", + "comment": "This is an example model", + "properties": { + "k1": "v1" + } +}' http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +GravitinoClient gravitinoClient = GravitinoClient + .builder("http://localhost:8090") + .withMetalake("example") + .build(); + +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +Map<String, String> propertiesMap = ImmutableMap.<String, String>builder() + .put("k1", "v1") + .build(); + +Model = catalog.asModelCatalog().registerModel( + NameIdentifier.of("model_schema", "example_model"), + "This is an example model", + propertiesMap); +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = 
gravitino_client.load_catalog(name="model_catalog") +model = catalog.as_model_catalog().register_model(ident=NameIdentifier.of("model_schema", "example_model"), + comment="This is an example model", + properties={"k1": "v1"}) +``` + +</TabItem> +</Tabs> + +### Get a model + +You can get a model by sending a `GET` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models/{model_name}` endpoint or by using the +Gravitino Java/Python client. The following is an example of getting a model: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" \ +http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models/example_model +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +// ... +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +Model model = catalog.asModelCatalog().getModel(NameIdentifier.of("model_schema", "example_model")); +// ... +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +model: Model = catalog.as_model_catalog().get_model(ident=NameIdentifier.of("model_schema", "example_model")) +``` + +</TabItem> +</Tabs> + +### Delete a model + +You can delete a model by sending a `DELETE` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models/{model_name}` endpoint or by using the +Gravitino Java/Python client. 
The following is an example of deleting a model: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X DELETE -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" \ +http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models/example_model +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +// ... +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +catalog.asModelCatalog().deleteModel(NameIdentifier.of("model_schema", "example_model")); +// ... +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +catalog.as_model_catalog().delete_model(NameIdentifier.of("model_schema", "example_model")) +``` + +</TabItem> +</Tabs> + +Note that the delete operation will delete all the model versions under this model. + +### List models + +You can list all the models in a schema by sending a `GET` request to the `/api/metalakes/ +{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/moldes` endpoint or by using the +Gravitino Java/Python client. The following is an example of listing all the models in a schema: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" \ +http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +// ... +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +NameIdentifier[] identifiers = catalog.asModelCatalog().listModels(Namespace.of("model_schema")); +// ... 
+``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +model_list = catalog.as_model_catalog().list_models(namespace=Namespace.of("model_schema"))) +``` + +</TabItem> +</Tabs> + +## Model version operations + +:::tip + - Users should create a metalake, a catalog, a schema, and a model before link a model version + to the model. +::: + +### Link a model version + +You can link a model version by sending a `POST` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models/{model_name}` endpoint or by using +the Gravitino Java/Python client. The following is an example of linking a model version: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "uri": "path/to/model", + "aliases": ["alias1", "alias2"], Review Comment: Actually, the real "thing" we POST to the endpoint is the "alias"... So ... maybe POSTing to `/api/.../models/{model}/aliases` makes better sense. ########## docs/model-catalog.md: ########## @@ -0,0 +1,87 @@ +--- +title: "Model catalog" +slug: /model-catalog +date: 2024-12-26 +keyword: model catalog +license: "This software is licensed under the Apache License version 2." +--- + +## Introduction + +Model catalog is a metadata catalog that offers the unified interfaces to manage the metadata of +machine learning models in a centralized way. It follows the typical Gravitino 3-level namespace +(catalog, schema, and model) to manage the ML models metadata. In additional, it supports +managing the versions for each model. + +The advantages of using model catalog are: + +* Centralized management of ML models with user defined namespaces. 
Users can better discover + and govern the models from sematic level, rather than managing the model files directly. +* Version management for each model. Users can easily track the model versions and manage the + model lifecycle. + +The key concept of model management is to manage the path (URI) of the model. Instead of +managing the model storage path physically and separately, model metadata defines the mapping +relation between the model name and the storage path. In the meantime, with the support of +extensible properties of model metadata, users can define the model metadata with more detailed information +rather than just the storage path. + +* **Model**: The model is a metadata defined in the model catalog, to manage the ML models. Each Review Comment: ```suggestion * **Model**: A model is a metadata object defined in the model catalog, to manage an ML model. Each ``` ########## docs/model-catalog.md: ########## @@ -0,0 +1,87 @@ +--- +title: "Model catalog" +slug: /model-catalog +date: 2024-12-26 +keyword: model catalog +license: "This software is licensed under the Apache License version 2." +--- + +## Introduction + +Model catalog is a metadata catalog that offers the unified interfaces to manage the metadata of +machine learning models in a centralized way. It follows the typical Gravitino 3-level namespace +(catalog, schema, and model) to manage the ML models metadata. In additional, it supports Review Comment: ```suggestion (catalog, schema, and model) to manage the ML models metadata. In addition, it supports ``` ########## docs/manage-model-metadata-using-gravitino.md: ########## @@ -0,0 +1,637 @@ +--- +title: Manage model metadata using Gravitino +slug: /manage-model-metadata-using-gravitino +date: 2024-12-26 +keyword: Gravitino model metadata manage +license: This software is licensed under the Apache License version 2. 
+--- + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +This page introduces how to manage model metadata in Apache Gravitino. Gravitino model catalog +is a kind of model registry, which provides the ability to manage machine learning models' +versioning metadata. It follows the typical Gravitino 3-level namespace (catalog, schema, and +model) and supports managing the versions for each model. + +Currently, it supports model and model version registering, listing, loading, and deleting. + +To use model catalog, please make sure that: + + - Gravitino server has started, and assume the host and port [http://localhost:8090](http://localhost:8090). + - A metalake has been created and [enabled](./manage-metalake-using-gravitino.md#enable-a-metalake) + +## Catalog operations + +### Create a catalog + +:::info +For a model catalog, you must specify the catalog `type` as `MODEL` when creating the catalog. +Please also be aware that the `provider` is not required for a model catalog. +::: + +You can create a catalog by sending a `POST` request to the `/api/metalakes/{metalake_name}/catalogs` +endpoint or just use the Gravitino Java/Python client. 
The following is an example of creating a +catalog: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "name": "model_catalog", + "type": "MODEL", + "comment": "This is a model catalog", + "properties": { + "k1": "v1" + } +}' http://localhost:8090/api/metalakes/example/catalogs +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +GravitinoClient gravitinoClient = GravitinoClient + .builder("http://localhost:8090") + .withMetalake("example") + .build(); + +Map<String, String> properties = ImmutableMap.<String, String>builder() + .put("k1", "v1") + .build(); + +Catalog catalog = gravitinoClient.createCatalog( + "model_catalog", + Type.MODEL, + "This is a model catalog", + properties); +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") +catalog = gravitino_client.create_catalog(name="catalog", + type=Catalog.Type.MODEL, + provider=None, + comment="This is a model catalog", + properties={"k1": "v1"}) +``` + +</TabItem> +</Tabs> + +### Load a catalog + +Refer to [Load a catalog](./manage-relational-metadata-using-gravitino.md#load-a-catalog) +in relational catalog for more details. For a model catalog, the load operation is the same. + +### Alter a catalog + +Refer to [Alter a catalog](./manage-relational-metadata-using-gravitino.md#alter-a-catalog) +in relational catalog for more details. For a model catalog, the alter operation is the same. + +### Drop a catalog + +Refer to [Drop a catalog](./manage-relational-metadata-using-gravitino.md#drop-a-catalog) +in relational catalog for more details. For a model catalog, the drop operation is the same. 
+ +### List all catalogs in a metalake + +Please refer to [List all catalogs in a metalake](./manage-relational-metadata-using-gravitino.md#list-all-catalogs-in-a-metalake) +in relational catalog for more details. For a model catalog, the list operation is the same. + +### List all catalogs' information in a metalake + +Please refer to [List all catalogs' information in a metalake](./manage-relational-metadata-using-gravitino.md#list-all-catalogs-information-in-a-metalake) +in relational catalog for more details. For a model catalog, the list operation is the same. + +## Schema operations + +`Schema` is a virtual namespace in a model catalog, which is used to organize the models. It +is similar to the concept of `schema` in the relational catalog. + +:::tip +Users should create a metalake and a catalog before creating a schema. +::: + +### Create a schema + +You can create a schema by sending a `POST` request to the `/api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas` +endpoint or just use the Gravitino Java/Python client. 
The following is an example of creating a +schema: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "name": "model_schema", + "comment": "This is a model schema", + "properties": { + "k1": "v1" + } +}' http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +GravitinoClient gravitinoClient = GravitinoClient + .builder("http://localhost:8090") + .withMetalake("example") + .build(); + +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); + +SupportsSchemas supportsSchemas = catalog.asSchemas(); + +Map<String, String> schemaProperties = ImmutableMap.<String, String>builder() + .put("k1", "v1") + .build(); +Schema schema = supportsSchemas.createSchema( + "model_schema", + "This is a schema", + schemaProperties); +// ... +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +catalog.as_schemas().create_schema(name="model_schema", + comment="This is a schema", + properties={"k1": "v1"}) +``` + +</TabItem> +</Tabs> + +### Load a schema + +Please refer to [Load a schema](./manage-relational-metadata-using-gravitino.md#load-a-schema) +in relational catalog for more details. For a model catalog, the schema load operation is the +same. + +### Alter a schema + +Please refer to [Alter a schema](./manage-relational-metadata-using-gravitino.md#alter-a-schema) +in relational catalog for more details. For a model catalog, the schema alter operation is the +same. + +### Drop a schema + +Please refer to [Drop a schema](./manage-relational-metadata-using-gravitino.md#drop-a-schema) +in relational catalog for more details. 
For a model catalog, the schema drop operation is the +same. + +Note that the drop operation will delete all the model metadata under this schema if `cascade` +set to `true`. + +### List all schemas under a catalog + +Please refer to [List all schemas under a catalog](./manage-relational-metadata-using-gravitino.md#list-all-schemas-under-a-catalog) +in relational catalog for more details. For a model catalog, the schema list operation is the +same. + +## Model operations + +:::tip + - Users should create a metalake, a catalog, and a schema before creating a model. +::: + +### Register a model + +You can create a model by sending a `POST` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models` endpoint or just use the Gravitino +Java/Python client. The following is an example of creating a model: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" -d '{ + "name": "example_model", + "comment": "This is an example model", + "properties": { + "k1": "v1" + } +}' http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +GravitinoClient gravitinoClient = GravitinoClient + .builder("http://localhost:8090") + .withMetalake("example") + .build(); + +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +Map<String, String> propertiesMap = ImmutableMap.<String, String>builder() + .put("k1", "v1") + .build(); + +Model = catalog.asModelCatalog().registerModel( + NameIdentifier.of("model_schema", "example_model"), + "This is an example model", + propertiesMap); +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = 
gravitino_client.load_catalog(name="model_catalog") +model = catalog.as_model_catalog().register_model(ident=NameIdentifier.of("model_schema", "example_model"), + comment="This is an example model", + properties={"k1": "v1"}) +``` + +</TabItem> +</Tabs> + +### Get a model + +You can get a model by sending a `GET` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models/{model_name}` endpoint or by using the +Gravitino Java/Python client. The following is an example of getting a model: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" \ +http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models/example_model +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +// ... +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +Model model = catalog.asModelCatalog().getModel(NameIdentifier.of("model_schema", "example_model")); +// ... +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +model: Model = catalog.as_model_catalog().get_model(ident=NameIdentifier.of("model_schema", "example_model")) +``` + +</TabItem> +</Tabs> + +### Delete a model + +You can delete a model by sending a `DELETE` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models/{model_name}` endpoint or by using the +Gravitino Java/Python client. 
The following is an example of deleting a model: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X DELETE -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" \ +http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models/example_model +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +// ... +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +catalog.asModelCatalog().deleteModel(NameIdentifier.of("model_schema", "example_model")); +// ... +``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +catalog.as_model_catalog().delete_model(NameIdentifier.of("model_schema", "example_model")) +``` + +</TabItem> +</Tabs> + +Note that the delete operation will delete all the model versions under this model. + +### List models + +You can list all the models in a schema by sending a `GET` request to the `/api/metalakes/ +{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/moldes` endpoint or by using the +Gravitino Java/Python client. The following is an example of listing all the models in a schema: + +<Tabs groupId="language" queryString> +<TabItem value="shell" label="Shell"> + +```shell +curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \ +-H "Content-Type: application/json" \ +http://localhost:8090/api/metalakes/example/catalogs/model_catalog/schemas/model_schema/models +``` + +</TabItem> +<TabItem value="java" label="Java"> + +```java +// ... +Catalog catalog = gravitinoClient.loadCatalog("model_catalog"); +NameIdentifier[] identifiers = catalog.asModelCatalog().listModels(Namespace.of("model_schema")); +// ... 
+``` + +</TabItem> +<TabItem value="python" label="Python"> + +```python +gravitino_client: GravitinoClient = GravitinoClient(uri="http://localhost:8090", metalake_name="example") + +catalog: Catalog = gravitino_client.load_catalog(name="model_catalog") +model_list = catalog.as_model_catalog().list_models(namespace=Namespace.of("model_schema"))) +``` + +</TabItem> +</Tabs> + +## Model version operations + +:::tip + - Users should create a metalake, a catalog, a schema, and a model before link a model version + to the model. +::: + +### Link a model version + +You can link a model version by sending a `POST` request to the `/api/metalakes/{metalake_name} +/catalogs/{catalog_name}/schemas/{schema_name}/models/{model_name}` endpoint or by using Review Comment: Shall we do a POST to `/api/.../models/{model}/versions` instead? POSTing a thing to the URI of an existing resource doesn't look very RESTful. ########## docs/model-catalog.md: ########## @@ -0,0 +1,87 @@ +--- +title: "Model catalog" +slug: /model-catalog +date: 2024-12-26 +keyword: model catalog +license: "This software is licensed under the Apache License version 2." +--- + +## Introduction + +Model catalog is a metadata catalog that offers the unified interfaces to manage the metadata of +machine learning models in a centralized way. It follows the typical Gravitino 3-level namespace +(catalog, schema, and model) to manage the ML models metadata. In additional, it supports +managing the versions for each model. + +The advantages of using model catalog are: + +* Centralized management of ML models with user defined namespaces. Users can better discover + and govern the models from sematic level, rather than managing the model files directly. +* Version management for each model. Users can easily track the model versions and manage the + model lifecycle. + +The key concept of model management is to manage the path (URI) of the model. 
Instead of +managing the model storage path physically and separately, model metadata defines the mapping +relation between the model name and the storage path. In the meantime, with the support of +extensible properties of model metadata, users can define the model metadata with more detailed information +rather than just the storage path. + +* **Model**: The model is a metadata defined in the model catalog, to manage the ML models. Each + model can have many **Model Versions**, and each version can have its own properties. Models + can be retrieved by the name. +* **Model Version**: The model version is a metadata defined in the model catalog, to manage each + version of the ML model. Each version has a unique version number, and can have its own + properties and storage path. Model version can be retrieved by the model name and version + number. Also, each version can have a list of aliases, which can also be used to retrieve. Review Comment: If we want to use the term "model version" to represent an object, i.e. a specific version of a model, we could formally coin the term "ModelVersion" and then use it consistently across the docs and the code base. We would then avoid the plain phrase "model version", which may cause confusion. ########## docs/model-catalog.md: ########## @@ -0,0 +1,87 @@ +--- +title: "Model catalog" +slug: /model-catalog +date: 2024-12-26 +keyword: model catalog +license: "This software is licensed under the Apache License version 2." +--- + +## Introduction + +Model catalog is a metadata catalog that offers the unified interfaces to manage the metadata of +machine learning models in a centralized way. It follows the typical Gravitino 3-level namespace +(catalog, schema, and model) to manage the ML models metadata. In additional, it supports +managing the versions for each model. + +The advantages of using model catalog are: + +* Centralized management of ML models with user defined namespaces. 
Users can better discover + and govern the models from sematic level, rather than managing the model files directly. +* Version management for each model. Users can easily track the model versions and manage the + model lifecycle. + +The key concept of model management is to manage the path (URI) of the model. Instead of +managing the model storage path physically and separately, model metadata defines the mapping +relation between the model name and the storage path. In the meantime, with the support of Review Comment: ```suggestion relation between the model name and the storage (path). In the meantime, with the support of ```
