This is an automated email from the ASF dual-hosted git repository.

jmclean pushed a commit to branch justinmclean-patch-1
in repository https://gitbox.apache.org/repos/asf/gravitino.git
commit 5579b1d524a657e9dc565ea4a4575f027f85e1df
Author: Justin Mclean <[email protected]>
AuthorDate: Tue Aug 13 16:59:22 2024 +1000

    Add trademark and incubating

    Need to refer to the project as an incubating one and add ASF trademarks. Fixed a few minor English issues.
---
 docs/how-to-use-python-client.md | 57 +++++++++++++++++++++-------------------
 1 file changed, 30 insertions(+), 27 deletions(-)

diff --git a/docs/how-to-use-python-client.md b/docs/how-to-use-python-client.md
index 5b3f4aa0c..721c326ce 100644
--- a/docs/how-to-use-python-client.md
+++ b/docs/how-to-use-python-client.md
@@ -5,23 +5,23 @@ date: 2024-05-09
 keyword: Gravitino Python client
 license: This software is licensed under the Apache License version 2.
 ---
-# Apache Gravitino Python client
+# Apache Gravitino™ (incubating) Python client
-Apache Gravitino is a high-performance, geo-distributed, and federated metadata lake.
-It manages the metadata directly in different sources, types, and regions, also provides users
-the unified metadata access for data and AI assets.
+Apache Gravitino (incubating) is a high-performance, geo-distributed, federated metadata lake.
+It manages the metadata directly in different sources, types, and regions, and also provides users
+with unified metadata access for data and AI assets.
 Gravitino Python client helps data scientists easily manage metadata using Python language.
- 
+
 ## Use Guidance
-You can use Gravitino Python client library with Spark, PyTorch, Tensorflow, Ray and Python environment.
+You can use the Gravitino Python client library with Spark, PyTorch, Tensorflow, Ray, and Python environment.
-First of all, You must have a Gravitino server set up and run, You can refer document of
-[How to install Gravitino](./how-to-install.md) to build Gravitino server from source code and
-install it in your local.
+First of all, You must have a Gravitino server set up and run, You can refer to the document
+[How to install Gravitino](https://datastrato.ai/docs/latest/how-to-install/) to build Gravitino server from source code and
+install it on your local machine.
 ### Apache Gravitino Python client API
@@ -29,15 +29,15 @@ install it in your local.
 pip install apache-gravitino
 ```
-1. [Manage metalake using Gravitino Python API](./manage-metalake-using-gravitino.md?language=python)
-2. [Manage fileset metadata using Gravitino Python API](./manage-fileset-metadata-using-gravitino.md?language=python)
+1. [Manage metalake using Gravitino Python API](https://datastrato.ai/docs/latest/manage-metalake-using-gravitino/?language=python)
+2. [Manage fileset metadata using Gravitino Python API](https://datastrato.ai/docs/latest/manage-fileset-metadata-using-gravitino/?language=python)
 ### Apache Gravitino Fileset Example
-We offer a playground environment to help you quickly understand how to use Gravitino Python
+We offer a playground environment to help you quickly understand how to use the Gravitino Python
 client to manage non-tabular data on HDFS via Fileset in Gravitino. You can refer to the
-document [How to use the playground](./how-to-use-the-playground.md)
-to launch a Gravitino server, HDFS and Jupyter notebook environment in you local Docker environment.
+document [How to use the playground](https://datastrato.ai/docs/latest/how-to-use-the-playground/)
+to launch a Gravitino server, HDFS, and Jupyter Notebook environment in your local Docker environment.
 Waiting for the playground Docker environment to start, you can directly open `http://localhost:8888/lab/tree/gravitino-fileset-example.ipynb` in the browser and run the example.
@@ -46,31 +46,31 @@ The [gravitino-fileset-example](https://github.com/apache/gravitino-playground/b
 contains the following code snippets:
 1. Install HDFS Python client.
-2. Create a HDFS client to connect HDFS and to do some test operations.
-3. Install Gravitino Python client.
-4. Initialize Gravitino admin client and create a Gravitino metalake.
-5. Initialize Gravitino client and list metalakes.
+2. Create an HDFS client to connect HDFS and to do some test operations.
+3. Install the Gravitino Python client.
+4. Initialize the Gravitino admin client and create a Gravitino metalake.
+5. Initialize the Gravitino client and list metalakes.
 6. Create a Gravitino `Catalog` and special `type` is `Catalog.Type.FILESET` and `provider` is
- [hadoop](./hadoop-catalog.md)
-7. Create a Gravitino `Schema` with the `location` pointed to a HDFS path, and use `hdfs client` to
+ [hadoop](https://datastrato.ai/docs/latest/hadoop-catalog/)
+7. Create a Gravitino `Schema` with the `location` pointed to an HDFS path, and use `hdfs client` to
 check if the schema location is successfully created in HDFS.
-8. Create a `Fileset` with `type` is [Fileset.Type.MANAGED](./manage-fileset-metadata-using-gravitino.md#fileset-operations),
+8. Create a `Fileset` with `type` is [Fileset.Type.MANAGED](https://datastrato.ai/docs/latest/manage-fileset-metadata-using-gravitino/#fileset-operations),
 use `hdfs client` to check if the fileset location was successfully created in HDFS.
 9. Drop this `Fileset.Type.MANAGED` type fileset and check if the fileset location was successfully deleted in HDFS.
-10. Create a `Fileset` with `type` is [Fileset.Type.EXTERNAL](./manage-fileset-metadata-using-gravitino.md#fileset-operations)
+10. Create a `Fileset` with `type` is [Fileset.Type.EXTERNAL](https://datastrato.ai/docs/latest/manage-fileset-metadata-using-gravitino/#fileset-operations)
 and `location` pointed to exist HDFS path
 11. Drop this `Fileset.Type.EXTERNAL` type fileset and check if the fileset location was not deleted in HDFS.
-## How to development Apache Gravitino Python Client
+## Apache Gravitino Python Client code
-You can ues any IDE to develop Gravitino Python Client. Directly open the client-python module project in the IDE.
+You can use any IDE to further develop the Gravitino Python Client. Directly open the client-python module project in the IDE.
 ### Prerequisites
 + Python 3.8+
-+ Refer to [How to build Gravitino](./how-to-build.md#prerequisites) to have necessary build
++ Refer to [How to build Gravitino](https://datastrato.ai/docs/latest/how-to-build/#prerequisites) to have necessary build
 environment ready for building.
 ### Build and testing
@@ -95,11 +95,11 @@ You can ues any IDE to develop Gravitino Python Client. Directly open the client
 4. Run integration tests
- Because Python client connects to Gravitino Server to run integration tests,
+ Because Python client connects to the Gravitino Server to run integration tests,
 So it runs `./gradlew compileDistribution -x test` command automatically to compile the Gravitino project in the `distribution` directory. When you run integration tests via Gradle command or IDE, Gravitino integration test framework (`integration_test_env.py`)
- will start and stop Gravitino server automatically.
+ will start and stop the Gravitino server automatically.
 ```shell
 ./gradlew :clients:client-python:test
@@ -137,3 +137,6 @@ Incubation is required of all newly accepted projects until a further review ind
 and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
+
+Apache and Gravitino are either registered trademarks or trademarks of The Apache Software Foundation.
+
