Copilot commented on code in PR #9173:
URL: https://github.com/apache/gravitino/pull/9173#discussion_r2559880784
##########
docs/generic-lakehouse-catalog.md:
##########
@@ -0,0 +1,281 @@
+---
+title: "Lakehouse catalog"
+slug: /lakehouse-catalog
+keywords:
+ - lakehouse
+ - lance
+ - metadata
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Introduction
+
+Generic Lakehouse Catalog is a Gravitino catalog implementation that enables
Gravitino to interact with Lakehouse storage systems that use file system for
storing tabular data. Such lakehouse system could be built on top of object
stores like Amazon S3, Azure Blob Storage, Google Cloud Storage, or HDFS.
+Theoretically, it can work with any lakehouse storage system that supports
standard file system operations such as Apache Iceberg, Lance, Delta Lake, and
Apache Hudi. However, currently Gravitino only provides native support for
Lance-based lakehouse storage systems.
+
+### Requirements and limitations
+
+- The lakehouse storage system must support standard file system operations
such as listing directories, reading files, and writing files.
+
+## Catalog
+
+### Catalog capabilities
+
+All capabilities are the same as relational catalog, please refer to [Manage
Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md) for more details.
+
+### Catalog properties
+
+The only property that need to be noted for a generic lakehouse catalog is
`location`. This property specifies the root location of the lakehouse storage
system. All schemas and tables will be stored under this location if not
+specified otherwise in schema or table properties.
+
+
+### Catalog operations
+
+All operations are the same as relational catalog, please refer to [Manage
Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md) for more details.
+
+One thing need to be noted is that the provider will be `generic_lakehouse`
when creating a generic lakehouse catalog.
+That is:
+
+<Tabs groupId='language' queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "generic_lakehouse_catalog",
+ "type": "RELATIONAL",
+ "comment": "comment",
+ "provider": "generic-lakehouse",
+ "properties": {
+ }
+}' http://localhost:8090/api/metalakes/metalake/catalogs
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+
+// Assuming you have just created a metalake named `metalake`
+GravitinoClient gravitinoClient = GravitinoClient
+ .builder("http://127.0.0.1:8090")
+ .withMetalake("metalake")
+ .build();
+
+Map<String, String> genericCatalogProperties = ImmutableMap.<String,
String>builder()
+ .put("location", "hdfs://localhost:9000/user/lakehouse") // The root
location of the lakehouse storage system
+ .build();
+
+Catalog catalog = gravitinoClient.createCatalog("generic_lakehouse_catalog",
+ Type.RELATIONAL,
+ "generic_lakehouse",
+ "This is a generic lakehouse catalog",
+ genericCatalogProperties); // Please change the properties according to
the value of the provider.
+// ...
+```
+
+</TabItem>
+</Tabs>
+
+
+## Schema
+
+### Schema capabilities
+
+All capabilities are the same as relational catalog, please refer to [Manage
Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md) for more details.
+
+### Schema properties
+
+The same as catalog properties, please refer to [Catalog
properties](#catalog-properties) section for more details. Schema `location`
property can be used to specify the location to store all tables under this
schema.
+
+### Schema operations
+
+Please refer to [Manage Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md#schema-operations)
for more details.
+
+## Table
+
+### Table capabilities
+
+Currently, for a lance table, Gravitino supports the following capabilities:
+- List
+- Load
+- Alter (partial supported)
+- Create/register
+- Drop and truncate
+
+### Table partitions
+
+Not support now
+
+### Table sort orders
+
+Not support now.
+
+### Table distributions
+
+Not support now.
+
+### Table column types
+
+Since Lance uses Apache Arrow as the table schema, the following table shows
the mapping between Gravitino types and Arrow types:
+
+| Gravitino Type | Arrow Type |
+|----------------------------------|-----------------------------------------|
+| `Struct` | `Struct` |
+| `Map` | `Map` |
+| `List` | `Array` |
+| `Boolean` | `Boolean` |
+| `Byte` | `Int8` |
+| `Short` | `Int16` |
+| `Integer` | `Int32` |
+| `Long` | `Int64` |
+| `Float` | `Float` |
+| `Double` | `Double` |
+| `String` | `Utf8` |
+| `Binary` | `Binary` |
+| `Decimal(p, s)` | `Decimal(p, s)` (128-bit) |
+| `Date` | `Date` |
+| `Timestamp`/`Timestamp(6)` | `TimestampType withoutZone` |
+| `Timestamp(0)` | `TimestampType Second withoutZone` |
+| `Timestamp(3)` | `TimestampType Millisecond withoutZone` |
+| `Timestamp(9)` | `TimestampType Nanosecond withoutZone` |
+| `Timestamp_tz`/`Timestamp_tz(6)` | `TimestampType Microsecond withUtc` |
+| `Timestamp_tz(0)` | `TimestampType Second withUtc` |
+| `Timestamp_tz(3)` | `TimestampType Millisecond withUtc` |
+| `Timestamp_tz(9)` | `TimestampType Nanosecond withUtc` |
+| `Time`/`Time(9)` | `Time Nanosecond` |
+| `Null` | `Null` |
+| `Fixed(n)` | `Fixed-Size Binary(n)` |
+| `Interval_year` | `Interval(YearMonth)` |
+| `Interval_day` | `Duration(Microsecond)` |
+| `External(arrow_field_json_str)` | Any Arrow Field (see note below) |
+
+`External(arrow_field_json_str)`:
+
+As the table above shows, Gravitino provides mappings for most common data
types. However,
+in some cases, you may need to use an Arrow data type that is not directly
supported by Gravitino.
+
+To address this, Gravitino introduces the `External(arrow_field_json_str)`
type,
+which allows you to define any Arrow data type by providing the JSON string of
an Arrow `Field`.
+
+The JSON string must conform to the Apache Arrow `Field`
[specification](https://github.com/apache/arrow-java/blob/ed81e5981a2bee40584b3a411ed755cb4cc5b91f/vector/src/main/java/org/apache/arrow/vector/types/pojo/Field.java#L80C1-L86C68),
+including details such as the field name, data type, and nullability.
+Here are some examples of how to use `External` type for various Arrow types
that are not natively supported by Gravitino:
+
+| Arrow Type | External type
|
+|-------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `Large Utf8` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largeutf8\"},\"children\":[]}")`
|
+| `Large Binary` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largebinary\"},\"children\":[]}")`
|
+| `Large List` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largelist\"},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",
\"bitWidth\":32, \"isSigned\": true},\"children\":[]}]}")`
|
+| `Fixed-Size List` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"fixedsizelist\",
\"listSize\":10},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",
\"bitWidth\":32, \"isSigned\": true},\"children\":[]}]}")` |
+
+**Important considerations:**
+- The `name` attribute and `nullable` attribute in the JSON string must
exactly match the corresponding column name and nullable in the Gravitino table.
+- The `children` array should be empty for primitive types. For complex types
like `Struct` or `List`, it must contain the definitions of the child fields.
+
+### Table properties
+
+Currently, the following properties are required for a table in a generic
lakehouse catalog
+
+| Configuration item | Description
| Default value | Required
| Since version |
+|--------------------|-----------------------------------------------------------------------------------------------|---------------|-----------------------------------------------------------------------------|---------------|
+| `format` | The format for a table, it can be `lance`,
`iceberg`,..., currently, it only supports `lance` | (none) | Yes
| 1.1.0
|
+| `location` | The location to storage the table meta and data.
| (none) | No, but if this is not
set in catalog or schema, then it's a required value | 1.1.0 |
+
+Of course, apart from the above-required properties, you can also set other
table properties supported by the underlying lakehouse storage system or your
custom properties.
+
+### Table indexes
+
+This part is almost the same as relational catalog, please refer to [Manage
Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md#table-partitioning-distribution-sort-ordering-and-indexes)
for more details.
+However, different lakehouse storage systems may have different supports for
indexes, and the following tables show the support for indexes in a Lance-based
lakehouse storage system.
+
+| Index type | Description
| Lance |
+|------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|
+| SCALAR | SCALAR index is used to optimize searches on scalar data types
such as integers, floats, and so on.
| Y |
+| VECTOR | VECTOR index is used to optimize similarity searches in
high-dimensional vector spaces.
| Y |
+| BTREE | BTREE index is a balanced tree data structure that maintains
sorted data and allows for logarithmic time complexity for search, insert, and
delete operations.
| Y |
+| INVERTED | INVERTED index is a data structure used to optimize full-text
searches by mapping terms to their locations within a dataset, allowing for
quick retrieval of documents containing specific words or phrases.
| Y |
+| IVF_FLAT | IVF_FLAT (Inverted File with Flat quantization) index is used
for efficient similarity searches in high-dimensional vector spaces by
partitioning the vector space into clusters and storing vectors in a flat
structure within each cluster.
| Y |
+| IVF_SQ | IVF_SQ (Inverted File with Scalar Quantization) index is used
for efficient similarity searches in high-dimensional vector spaces by
partitioning the vector space into clusters and storing quantized
representations of vectors within each cluster to reduce memory usage.
| Y |
+| IVF_PQ | IVF_PQ (Inverted File with Product Quantization) index is used
for efficient similarity searches in high-dimensional vector spaces by
partitioning the vector space into clusters and storing product-quantized
representations of vectors within each cluster to achieve a balance between
search accuracy and memory efficiency. | Y |
+
+Another point is that **NOT all lakehouse table support creating index when
creating table**, and Lance is one of them. So when creating a lance table, you
cannot specify indexes at the same time. You need to create the table first,
then create indexes on the table.
Review Comment:
Grammatical error: "NOT all lakehouse table support creating index" should
be "NOT all lakehouse tables support creating indexes" (plural forms).
```suggestion
Another point is that **NOT all lakehouse tables support creating indexes
when creating a table**, and Lance is one of them. So when creating a lance
table, you cannot specify indexes at the same time. You need to create the
table first, then create indexes on the table.
```
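A related note on the `External(arrow_field_json_str)` examples quoted in this
hunk: the escaped JSON is easy to get wrong by hand. Below is a minimal Python
sketch of generating it, assuming only the field layout shown in the doc's own
examples (`name`, `nullable`, `type`, `children`). The helper names
`arrow_field_json` and `external_type` are hypothetical and not part of
Gravitino or Arrow.

```python
import json


def arrow_field_json(name, nullable, arrow_type, children=None):
    """Serialize an Arrow Field description to a compact JSON string.

    Hypothetical helper: the key layout follows the examples in the doc.
    """
    field = {
        "name": name,
        "nullable": nullable,
        "type": arrow_type,
        "children": children or [],
    }
    # Compact separators reproduce the whitespace-free form used in the doc.
    return json.dumps(field, separators=(",", ":"))


def external_type(field_json):
    """Wrap the JSON in External("...") with backslashes and quotes escaped."""
    escaped = field_json.replace("\\", "\\\\").replace('"', '\\"')
    return f'External("{escaped}")'


# Primitive type: `children` stays empty.
large_utf8 = arrow_field_json("col_name", True, {"name": "largeutf8"})

# Complex type: `children` carries the element field definition.
element = {
    "name": "element",
    "nullable": True,
    "type": {"name": "int", "bitWidth": 32, "isSigned": True},
    "children": [],
}
large_list = arrow_field_json(
    "col_name", True, {"name": "largelist"}, children=[element]
)

print(external_type(large_utf8))
print(external_type(large_list))
```

Generating the string this way keeps `name` and `nullable` in one place, which
may make it easier to keep them in sync with the Gravitino column definition.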
##########
docs/generic-lakehouse-catalog.md:
##########
@@ -0,0 +1,281 @@
+---
+title: "Lakehouse catalog"
+slug: /lakehouse-catalog
+keywords:
+ - lakehouse
+ - lance
+ - metadata
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Introduction
+
+Generic Lakehouse Catalog is a Gravitino catalog implementation that enables
Gravitino to interact with Lakehouse storage systems that use file system for
storing tabular data. Such lakehouse system could be built on top of object
stores like Amazon S3, Azure Blob Storage, Google Cloud Storage, or HDFS.
+Theoretically, it can work with any lakehouse storage system that supports
standard file system operations such as Apache Iceberg, Lance, Delta Lake, and
Apache Hudi. However, currently Gravitino only provides native support for
Lance-based lakehouse storage systems.
+
+### Requirements and limitations
+
+- The lakehouse storage system must support standard file system operations
such as listing directories, reading files, and writing files.
+
+## Catalog
+
+### Catalog capabilities
+
+All capabilities are the same as relational catalog, please refer to [Manage
Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md) for more details.
+
+### Catalog properties
+
+The only property that need to be noted for a generic lakehouse catalog is
`location`. This property specifies the root location of the lakehouse storage
system. All schemas and tables will be stored under this location if not
+specified otherwise in schema or table properties.
+
+
+### Catalog operations
+
+All operations are the same as relational catalog, please refer to [Manage
Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md) for more details.
+
+One thing need to be noted is that the provider will be `generic_lakehouse`
when creating a generic lakehouse catalog.
Review Comment:
Grammatical error: "One thing need to be noted is that" should be "One thing
to note is that".
```suggestion
One thing to note is that the provider will be `generic_lakehouse` when
creating a generic lakehouse catalog.
```
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,206 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Background
+
+Since version 1.1.0, Gravitino includes a REST service for Lance datasets. The
Lance REST service is a web service that allows you to interact with Lance
datasets over HTTP. It provides endpoints for querying, inserting, updating,
and deleting data in Lance datasets.
+It abides by the [Lance REST API
specification](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/lancedb/lance-namespace/refs/heads/main/docs/src/rest.yaml).
More details about the specification, please refer to docs
[here](https://lance.org/format/namespace/impls/rest/)
+
+Besides, Lance REST service can be run standalone without Gravitino server,
+
+
+
+## Capabilities
+
+The Lance REST service supports the APIs defined in the Lance REST API
specification. The following are some of the key capabilities of the Lance REST
service:
+- Namespace management including creating namespace, listing namespaces,
describing, deleting namespace, namespace exists check.
+- Table management including creating tables including creating empty tables,
dropping tables, registering tables and unregistering tables.
+- Index management including creating index, listing indexes. Dropping index
is not supported in 1.1.0.
+
+Full Supports are listed as the following table:
+
+| Operation ID | Description
| Since version |
+|----------------------|-----------------------------------------------------------------------------------------------------|---------------|
+| CreateNamespace | Create a Lance namespace
| 1.1.0 |
+| ListNamespaces | List all namespaces under a specific namespace
| 1.1.0 |
+| DescribeNamespace | Get details of a specific namespace
| 1.1.0 |
+| DropNamespace | Delete a specific namespace
| 1.1.0 |
+| NamespaceExists | Check if a namespace exists
| 1.1.0 |
+| ListTables | List all tables in a specific namespace
| 1.1.0 |
+| CreateTable | Create a new table in a specific namespace
| 1.1.0 |
+| DropTable | Delete a specific table from a namespace, drop table
will drop metadata and Lance data all together | 1.1.0 |
+| TableExists | Check if a specific table exists in a namespace
| 1.1.0 |
+| RegisterTable | Register an existing Lance table to a specific
namespace | 1.1.0 |
+| deregisterTable | Unregister a specific table from a namespace, it will
only remove metadata, Lance data will be kept | 1.1.0 |
+| CreateIndex | Create an index on a specific table
| 1.1.0 |
+| ListIndexes | List all indexes on a specific table
| 1.1.0 |
+
+## Getting started
+
+### Running Lance REST service with Gravitino
+
+To use the Lance REST service, you need to have Gravitino server running with
Lance REST service enabled. The following are configurations to enable Lance
REST service in Gravitino server.
+
+| Configuration item | Description
| Default value | Required
| Since Version |
+|------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|--------------------------------------------|---------------|
+| `gravitino.auxService.names` | Auxiliary service that runs
Lance REST service, currently it supports `iceberg-rest` and `lance-rest`. It
should include `lance-rest` if you want to start the Lance REST service like
`lance-rest`, or `lance-rest, iceberg-rest` | iceberg-rest,lance-rest | Yes if
Lance REST service is going to run | 0.2.0 |
+| `gravitino.lance-rest.classpath` | The class path of
lance-rest service, it's the relative path compared with Gravitino home.
| lance-rest-server/libs |
Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.port` | The port number that Lance
REST service listens on.
| 9101 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.host` | The host name that Lance
REST service run in
| 0.0.0.0 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.namespace-backend` | backend to store namespace
metadata, currently it only supports `gravitino`
| gravitino | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI, it
should be set when `namespace-backend` is `gravitino`
| http://localhost:8090 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name, it
should be set when `namespace-backend` is `gravitino`
| (none) | Yes if
Lance REST service is going to run | 1.1.0 |
+
+### Running Lance REST service standalone
+
+To run Lance REST service standalone without Gravitino server, you can use the
following command:
+
+```shell
+{GRAVITINO_HOME}/bin/gravitino-lance-rest-server.sh start
+```
+
+The following configurations are required to run Lance REST service
standalone, you can set them in `gravitino-lance-rest-server.conf` file or pass
them as command line arguments.
+Typically, you only need to change the following configurations:
+
+| Configuration item | Description
| Default value
| Required | Since Version |
+|------------------------------------------------|------------------------------------------------------------------------------------|--------------------------|--------------------------------------------|---------------|
+| `gravitino.lance-rest.namespace-backend` | backend to store namespace
metadata, currently it only supports `gravitino` | gravitino
| Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI, it
should be set when `namespace-backend` is `gravitino` |
http://localhost:8090 | Yes if Lance REST service is going to run | 1.1.0
|
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name, it
should be set when `namespace-backend` is `gravitino` | (none)
| Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.port` | The port number that Lance
REST service listens on. | 9101
| Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.host` | The host name that Lance
REST service run in | 0.0.0.0
| Yes if Lance REST service is going to run | 1.1.0 |
+
+`namespace-backend`, `uri`, `port` and `host` have the same meaning as
described in the previous section, and they have the default values. In most
cases you only need to change `metalake-name` to your Gravitino metalake name.
+For other configurations listed in the file, just keep their default values.
+
+## Using Lance REST service
+
+Currently, as the Lance REST service only support Gravitino backend, so there
are some limitations when using Lance REST service:
+- You need to have a running Gravitino server with a metalake created.
+- As Gravitino has three hierarchies: catalog -> schema -> table, so when you
create namespaces or a table via Lance REST service, you need to make sure the
parent hierarchy exists. For example, when you create a namespace
`lance_catalog/schema`, you need to make sure the catalog `lance_catalog`
already exists in Gravitino metalake. If not, you need to create the
namespace(catalog) `lance_catalog` first.
+- Currently, we can only support two layers of namespaces and then tables,
that is to say, you can create namespace like `lance_catalog/schema`, but you
cannot create namespace like `lance_catalog/schema/sub_schema`. Tables can only
be created under the namespace `lance_catalog/schema`.
+
+## Example
+
+When Gravitino server is started with Lance REST service starts successfully,
and a `generic-lakehouse` catalog named `lance_catalog` is created in Gravitino
metalake, you can use the following Python code to interact with Lance REST
service:
Review Comment:
Grammatical error: "When Gravitino server is started with Lance REST service
starts successfully" has two competing verbs. It should be "When Gravitino
server with Lance REST service starts successfully".
```suggestion
When Gravitino server with Lance REST service starts successfully, and a
`generic-lakehouse` catalog named `lance_catalog` is created in Gravitino
metalake, you can use the following Python code to interact with Lance REST
service:
```
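The two-level namespace limit described in the "Using Lance REST service"
section of this hunk could also be illustrated with a short check. This is a
sketch only: the `/` separator mirrors the notation used in the doc text and
may not be the delimiter the REST API actually uses, and both function names
are hypothetical.

```python
def namespace_levels(namespace, delimiter="/"):
    """Split a namespace path into its non-empty levels."""
    return [part for part in namespace.split(delimiter) if part]


def is_supported_namespace(namespace, delimiter="/"):
    """Accept `catalog` or `catalog/schema`; reject deeper nesting."""
    return 1 <= len(namespace_levels(namespace, delimiter)) <= 2


for candidate in (
    "lance_catalog",
    "lance_catalog/schema",
    "lance_catalog/schema/sub_schema",
):
    print(candidate, is_supported_namespace(candidate))
```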
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,206 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Background
+
+Since version 1.1.0, Gravitino includes a REST service for Lance datasets. The
Lance REST service is a web service that allows you to interact with Lance
datasets over HTTP. It provides endpoints for querying, inserting, updating,
and deleting data in Lance datasets.
+It abides by the [Lance REST API
specification](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/lancedb/lance-namespace/refs/heads/main/docs/src/rest.yaml).
More details about the specification, please refer to docs
[here](https://lance.org/format/namespace/impls/rest/)
+
+Besides, Lance REST service can be run standalone without Gravitino server,
+
+
+
+## Capabilities
+
+The Lance REST service supports the APIs defined in the Lance REST API
specification. The following are some of the key capabilities of the Lance REST
service:
+- Namespace management including creating namespace, listing namespaces,
describing, deleting namespace, namespace exists check.
+- Table management including creating tables including creating empty tables,
dropping tables, registering tables and unregistering tables.
+- Index management including creating index, listing indexes. Dropping index
is not supported in 1.1.0.
+
+Full Supports are listed as the following table:
+
+| Operation ID | Description
| Since version |
+|----------------------|-----------------------------------------------------------------------------------------------------|---------------|
+| CreateNamespace | Create a Lance namespace
| 1.1.0 |
+| ListNamespaces | List all namespaces under a specific namespace
| 1.1.0 |
+| DescribeNamespace | Get details of a specific namespace
| 1.1.0 |
+| DropNamespace | Delete a specific namespace
| 1.1.0 |
+| NamespaceExists | Check if a namespace exists
| 1.1.0 |
+| ListTables | List all tables in a specific namespace
| 1.1.0 |
+| CreateTable | Create a new table in a specific namespace
| 1.1.0 |
+| DropTable | Delete a specific table from a namespace, drop table
will drop metadata and Lance data all together | 1.1.0 |
+| TableExists | Check if a specific table exists in a namespace
| 1.1.0 |
+| RegisterTable | Register an existing Lance table to a specific
namespace | 1.1.0 |
+| deregisterTable | Unregister a specific table from a namespace, it will
only remove metadata, Lance data will be kept | 1.1.0 |
+| CreateIndex | Create an index on a specific table
| 1.1.0 |
+| ListIndexes | List all indexes on a specific table
| 1.1.0 |
+
+## Getting started
+
+### Running Lance REST service with Gravitino
+
+To use the Lance REST service, you need to have Gravitino server running with
Lance REST service enabled. The following are configurations to enable Lance
REST service in Gravitino server.
+
+| Configuration item | Description
| Default value | Required
| Since Version |
+|------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|--------------------------------------------|---------------|
+| `gravitino.auxService.names` | Auxiliary service that runs
Lance REST service, currently it supports `iceberg-rest` and `lance-rest`. It
should include `lance-rest` if you want to start the Lance REST service like
`lance-rest`, or `lance-rest, iceberg-rest` | iceberg-rest,lance-rest | Yes if
Lance REST service is going to run | 0.2.0 |
+| `gravitino.lance-rest.classpath` | The class path of
lance-rest service, it's the relative path compared with Gravitino home.
| lance-rest-server/libs |
Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.port` | The port number that Lance
REST service listens on.
| 9101 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.host` | The host name that Lance
REST service run in
| 0.0.0.0 | Yes if
Lance REST service is going to run | 1.1.0 |
Review Comment:
Grammatical error: "The host name that Lance REST service run in" should be
"The hostname that the Lance REST service runs on".
```suggestion
| `gravitino.lance-rest.host` | The hostname that the
Lance REST service runs on
| 0.0.0.0 | Yes
if Lance REST service is going to run | 1.1.0 |
```
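Since every row in this table is marked as required but only `metalake-name`
has no default, a reader may want a quick way to see which keys still need a
value. The following is a rough Python sketch, assuming the configuration file
uses `key = value` lines with `#` comments (an assumption, not confirmed by
this diff). The defaults are copied from the quoted table, and the helper
names are hypothetical.

```python
# Keys and defaults copied from the configuration table quoted above.
REQUIRED_KEYS = [
    "gravitino.lance-rest.namespace-backend",
    "gravitino.lance-rest.gravitino.uri",
    "gravitino.lance-rest.gravitino.metalake-name",
    "gravitino.lance-rest.port",
    "gravitino.lance-rest.host",
]
DEFAULTS = {
    "gravitino.lance-rest.namespace-backend": "gravitino",
    "gravitino.lance-rest.gravitino.uri": "http://localhost:8090",
    "gravitino.lance-rest.port": "9101",
    "gravitino.lance-rest.host": "0.0.0.0",
}


def parse_conf(text):
    """Parse `key = value` lines, skipping blanks and `#` comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def missing_keys(conf):
    """Return required keys that have neither a configured value nor a default."""
    merged = {**DEFAULTS, **conf}
    return [key for key in REQUIRED_KEYS if not merged.get(key)]


# Hypothetical excerpt of a configuration file.
sample = """
# Lance REST settings
gravitino.lance-rest.port = 9101
gravitino.lance-rest.host = 0.0.0.0
"""

print(missing_keys(parse_conf(sample)))
```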
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,206 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Background
+
+Since version 1.1.0, Gravitino includes a REST service for Lance datasets. The
Lance REST service is a web service that allows you to interact with Lance
datasets over HTTP. It provides endpoints for querying, inserting, updating,
and deleting data in Lance datasets.
+It abides by the [Lance REST API
specification](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/lancedb/lance-namespace/refs/heads/main/docs/src/rest.yaml).
More details about the specification, please refer to docs
[here](https://lance.org/format/namespace/impls/rest/)
+
+Besides, Lance REST service can be run standalone without Gravitino server,
+
+
+
+## Capabilities
+
+The Lance REST service supports the APIs defined in the Lance REST API
specification. The following are some of the key capabilities of the Lance REST
service:
+- Namespace management including creating namespace, listing namespaces,
describing, deleting namespace, namespace exists check.
+- Table management including creating tables including creating empty tables,
dropping tables, registering tables and unregistering tables.
+- Index management including creating index, listing indexes. Dropping index
is not supported in 1.1.0.
+
+Full Supports are listed as the following table:
+
+| Operation ID | Description
| Since version |
+|----------------------|-----------------------------------------------------------------------------------------------------|---------------|
+| CreateNamespace | Create a Lance namespace
| 1.1.0 |
+| ListNamespaces | List all namespaces under a specific namespace
| 1.1.0 |
+| DescribeNamespace | Get details of a specific namespace
| 1.1.0 |
+| DropNamespace | Delete a specific namespace
| 1.1.0 |
+| NamespaceExists | Check if a namespace exists
| 1.1.0 |
+| ListTables | List all tables in a specific namespace
| 1.1.0 |
+| CreateTable | Create a new table in a specific namespace
| 1.1.0 |
+| DropTable | Delete a specific table from a namespace, drop table
will drop metadata and Lance data all together | 1.1.0 |
+| TableExists | Check if a specific table exists in a namespace
| 1.1.0 |
+| RegisterTable | Register an existing Lance table to a specific
namespace | 1.1.0 |
+| deregisterTable | Unregister a specific table from a namespace, it will
only remove metadata, Lance data will be kept | 1.1.0 |
+| CreateIndex | Create an index on a specific table
| 1.1.0 |
+| ListIndexes | List all indexes on a specific table
| 1.1.0 |
+
+## Getting started
+
+### Running Lance REST service with Gravitino
+
+To use the Lance REST service, you need to have Gravitino server running with
Lance REST service enabled. The following are configurations to enable Lance
REST service in Gravitino server.
+
+| Configuration item | Description
| Default value | Required
| Since Version |
+|------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|--------------------------------------------|---------------|
+| `gravitino.auxService.names` | Auxiliary service that runs
Lance REST service, currently it supports `iceberg-rest` and `lance-rest`. It
should include `lance-rest` if you want to start the Lance REST service like
`lance-rest`, or `lance-rest, iceberg-rest` | iceberg-rest,lance-rest | Yes if
Lance REST service is going to run | 0.2.0 |
+| `gravitino.lance-rest.classpath` | The class path of
lance-rest service, it's the relative path compared with Gravitino home.
| lance-rest-server/libs |
Yes if Lance REST service is going to run | 1.1.0 |
Review Comment:
Grammatical error: "The class path of lance-rest service, it's the relative
path compared with Gravitino home" is awkwardly phrased. Should be "The
classpath of the Lance REST service, relative to the Gravitino home directory."
```suggestion
| `gravitino.lance-rest.classpath` | The classpath of the
Lance REST service, relative to the Gravitino home directory.
| lance-rest-server/libs | Yes
if Lance REST service is going to run | 1.1.0 |
```
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,206 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Background
+
+Since version 1.1.0, Gravitino includes a REST service for Lance datasets. The
Lance REST service is a web service that allows you to interact with Lance
datasets over HTTP. It provides endpoints for querying, inserting, updating,
and deleting data in Lance datasets.
+It abides by the [Lance REST API
specification](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/lancedb/lance-namespace/refs/heads/main/docs/src/rest.yaml).
More details about the specification, please refer to docs
[here](https://lance.org/format/namespace/impls/rest/)
+
+Besides, Lance REST service can be run standalone without Gravitino server,
+
+
+
Review Comment:
[nitpick] Inconsistent spacing: There's an extra blank line at line 19 that
breaks the flow of the document structure.
```suggestion
```
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,206 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Background
+
+Since version 1.1.0, Gravitino includes a REST service for Lance datasets. The
Lance REST service is a web service that allows you to interact with Lance
datasets over HTTP. It provides endpoints for querying, inserting, updating,
and deleting data in Lance datasets.
+It abides by the [Lance REST API
specification](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/lancedb/lance-namespace/refs/heads/main/docs/src/rest.yaml).
More details about the specification, please refer to docs
[here](https://lance.org/format/namespace/impls/rest/)
+
+Besides, Lance REST service can be run standalone without Gravitino server,
+
+
+
+## Capabilities
+
+The Lance REST service supports the APIs defined in the Lance REST API
specification. The following are some of the key capabilities of the Lance REST
service:
+- Namespace management including creating namespace, listing namespaces,
describing, deleting namespace, namespace exists check.
+- Table management including creating tables including creating empty tables,
dropping tables, registering tables and unregistering tables.
+- Index management including creating index, listing indexes. Dropping index
is not supported in 1.1.0.
+
+Full Supports are listed as the following table:
+
+| Operation ID | Description
| Since version |
+|----------------------|-----------------------------------------------------------------------------------------------------|---------------|
+| CreateNamespace | Create a Lance namespace
| 1.1.0 |
+| ListNamespaces | List all namespaces under a specific namespace
| 1.1.0 |
+| DescribeNamespace | Get details of a specific namespace
| 1.1.0 |
+| DropNamespace | Delete a specific namespace
| 1.1.0 |
+| NamespaceExists | Check if a namespace exists
| 1.1.0 |
+| ListTables | List all tables in a specific namespace
| 1.1.0 |
+| CreateTable | Create a new table in a specific namespace
| 1.1.0 |
+| DropTable | Delete a specific table from a namespace, drop table
will drop metadata and Lance data all together | 1.1.0 |
+| TableExists | Check if a specific table exists in a namespace
| 1.1.0 |
+| RegisterTable | Register an existing Lance table to a specific
namespace | 1.1.0 |
+| deregisterTable | Unregister a specific table from a namespace, it will
only remove metadata, Lance data will be kept | 1.1.0 |
+| CreateIndex | Create an index on a specific table
| 1.1.0 |
+| ListIndexes | List all indexes on a specific table
| 1.1.0 |
+
+## Getting started
+
+### Running Lance REST service with Gravitino
+
+To use the Lance REST service, you need to have Gravitino server running with
Lance REST service enabled. The following are configurations to enable Lance
REST service in Gravitino server.
+
+| Configuration item | Description
| Default value | Required
| Since Version |
+|------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|--------------------------------------------|---------------|
+| `gravitino.auxService.names` | Auxiliary service that runs
Lance REST service, currently it supports `iceberg-rest` and `lance-rest`. It
should include `lance-rest` if you want to start the Lance REST service like
`lance-rest`, or `lance-rest, iceberg-rest` | iceberg-rest,lance-rest | Yes if
Lance REST service is going to run | 0.2.0 |
+| `gravitino.lance-rest.classpath` | The class path of
lance-rest service, it's the relative path compared with Gravitino home.
| lance-rest-server/libs |
Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.port` | The port number that Lance
REST service listens on.
| 9101 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.host` | The host name that Lance
REST service run in
| 0.0.0.0 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.namespace-backend` | backend to store namespace
metadata, currently it only supports `gravitino`
| gravitino | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI, it
should be set when `namespace-backend` is `gravitino`
| http://localhost:8090 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name, it
should be set when `namespace-backend` is `gravitino`
| (none) | Yes if
Lance REST service is going to run | 1.1.0 |
+
+### Running Lance REST service standalone
+
+To run Lance REST service standalone without Gravitino server, you can use the
following command:
+
+```shell
+{GRAVITINO_HOME}/bin/gravitino-lance-rest-server.sh start
+```
+
+The following configurations are required to run Lance REST service
standalone, you can set them in `gravitino-lance-rest-server.conf` file or pass
them as command line arguments.
+Typically, you only need to change the following configurations:
+
+| Configuration item | Description
| Default value
| Required | Since Version |
+|------------------------------------------------|------------------------------------------------------------------------------------|--------------------------|--------------------------------------------|---------------|
+| `gravitino.lance-rest.namespace-backend` | backend to store namespace
metadata, currently it only supports `gravitino` | gravitino
| Yes if Lance REST service is going to run | 1.1.0 |
Review Comment:
Capitalization: "backend to store namespace metadata" should be "Backend to
store namespace metadata" to match the style of other descriptions in the
table.
```suggestion
| `gravitino.lance-rest.namespace-backend` | Backend to store namespace metadata, currently it only supports `gravitino` | gravitino | Yes if Lance REST service is going to run | 1.1.0 |
```
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,206 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Background
+
+Since version 1.1.0, Gravitino includes a REST service for Lance datasets. The
Lance REST service is a web service that allows you to interact with Lance
datasets over HTTP. It provides endpoints for querying, inserting, updating,
and deleting data in Lance datasets.
+It abides by the [Lance REST API
specification](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/lancedb/lance-namespace/refs/heads/main/docs/src/rest.yaml).
More details about the specification, please refer to docs
[here](https://lance.org/format/namespace/impls/rest/)
+
+Besides, Lance REST service can be run standalone without Gravitino server,
+
+
+
+## Capabilities
+
+The Lance REST service supports the APIs defined in the Lance REST API
specification. The following are some of the key capabilities of the Lance REST
service:
+- Namespace management including creating namespace, listing namespaces,
describing, deleting namespace, namespace exists check.
+- Table management including creating tables including creating empty tables,
dropping tables, registering tables and unregistering tables.
+- Index management including creating index, listing indexes. Dropping index
is not supported in 1.1.0.
+
+Full Supports are listed as the following table:
+
+| Operation ID | Description
| Since version |
+|----------------------|-----------------------------------------------------------------------------------------------------|---------------|
+| CreateNamespace | Create a Lance namespace
| 1.1.0 |
+| ListNamespaces | List all namespaces under a specific namespace
| 1.1.0 |
+| DescribeNamespace | Get details of a specific namespace
| 1.1.0 |
+| DropNamespace | Delete a specific namespace
| 1.1.0 |
+| NamespaceExists | Check if a namespace exists
| 1.1.0 |
+| ListTables | List all tables in a specific namespace
| 1.1.0 |
+| CreateTable | Create a new table in a specific namespace
| 1.1.0 |
+| DropTable | Delete a specific table from a namespace, drop table
will drop metadata and Lance data all together | 1.1.0 |
+| TableExists | Check if a specific table exists in a namespace
| 1.1.0 |
+| RegisterTable | Register an existing Lance table to a specific
namespace | 1.1.0 |
+| deregisterTable | Unregister a specific table from a namespace, it will
only remove metadata, Lance data will be kept | 1.1.0 |
+| CreateIndex | Create an index on a specific table
| 1.1.0 |
+| ListIndexes | List all indexes on a specific table
| 1.1.0 |
+
+## Getting started
+
+### Running Lance REST service with Gravitino
+
+To use the Lance REST service, you need to have Gravitino server running with
Lance REST service enabled. The following are configurations to enable Lance
REST service in Gravitino server.
+
+| Configuration item | Description
| Default value | Required
| Since Version |
+|------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|--------------------------------------------|---------------|
+| `gravitino.auxService.names` | Auxiliary service that runs
Lance REST service, currently it supports `iceberg-rest` and `lance-rest`. It
should include `lance-rest` if you want to start the Lance REST service like
`lance-rest`, or `lance-rest, iceberg-rest` | iceberg-rest,lance-rest | Yes if
Lance REST service is going to run | 0.2.0 |
+| `gravitino.lance-rest.classpath` | The class path of
lance-rest service, it's the relative path compared with Gravitino home.
| lance-rest-server/libs |
Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.port` | The port number that Lance
REST service listens on.
| 9101 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.host` | The host name that Lance
REST service run in
| 0.0.0.0 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.namespace-backend` | backend to store namespace
metadata, currently it only supports `gravitino`
| gravitino | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI, it
should be set when `namespace-backend` is `gravitino`
| http://localhost:8090 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name, it
should be set when `namespace-backend` is `gravitino`
| (none) | Yes if
Lance REST service is going to run | 1.1.0 |
+
+### Running Lance REST service standalone
+
+To run Lance REST service standalone without Gravitino server, you can use the
following command:
+
+```shell
+{GRAVITINO_HOME}/bin/gravitino-lance-rest-server.sh start
+```
+
+The following configurations are required to run Lance REST service
standalone, you can set them in `gravitino-lance-rest-server.conf` file or pass
them as command line arguments.
+Typically, you only need to change the following configurations:
+
+| Configuration item | Description
| Default value
| Required | Since Version |
+|------------------------------------------------|------------------------------------------------------------------------------------|--------------------------|--------------------------------------------|---------------|
+| `gravitino.lance-rest.namespace-backend` | backend to store namespace
metadata, currently it only supports `gravitino` | gravitino
| Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI, it
should be set when `namespace-backend` is `gravitino` |
http://localhost:8090 | Yes if Lance REST service is going to run | 1.1.0
|
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name, it
should be set when `namespace-backend` is `gravitino` | (none)
| Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.port` | The port number that Lance
REST service listens on. | 9101
| Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.host` | The host name that Lance
REST service run in | 0.0.0.0
| Yes if Lance REST service is going to run | 1.1.0 |
Review Comment:
Grammatical error: Same issue as line 57 - "The host name that Lance REST
service run in" should be "The hostname that the Lance REST service runs on".
```suggestion
| `gravitino.lance-rest.host` | The hostname that the Lance REST service runs on | 0.0.0.0 | Yes if Lance REST service is going to run | 1.1.0 |
```
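Of the five keys in this standalone table, only `metalake-name` has no default, so it is the one a standalone setup has to supply. A minimal sketch of that single edit, assuming a `key = value` file: it works on a scratch copy rather than the real `gravitino-lance-rest-server.conf`, and `my_metalake` is a placeholder.

```shell
# Scratch copy standing in for gravitino-lance-rest-server.conf (assumption).
CONF_FILE="$(mktemp)"
printf '%s\n' \
  'gravitino.lance-rest.gravitino.uri = http://localhost:8090' \
  'gravitino.lance-rest.gravitino.metalake-name =' > "$CONF_FILE"

# Hypothetical metalake name; replace with the one defined in Gravitino.
METALAKE="my_metalake"

# Rewrite only the metalake-name line and leave every other key untouched.
sed "s|^\(gravitino\.lance-rest\.gravitino\.metalake-name\).*|\1 = ${METALAKE}|" \
  "$CONF_FILE" > "$CONF_FILE.new"
mv "$CONF_FILE.new" "$CONF_FILE"

grep 'metalake-name' "$CONF_FILE"
```

Writing to a temporary file and moving it into place avoids relying on `sed -i`, whose syntax differs between GNU and BSD implementations.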
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,206 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Background
+
+Since version 1.1.0, Gravitino includes a REST service for Lance datasets. The
Lance REST service is a web service that allows you to interact with Lance
datasets over HTTP. It provides endpoints for querying, inserting, updating,
and deleting data in Lance datasets.
+It abides by the [Lance REST API
specification](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/lancedb/lance-namespace/refs/heads/main/docs/src/rest.yaml).
More details about the specification, please refer to docs
[here](https://lance.org/format/namespace/impls/rest/)
+
+Besides, Lance REST service can be run standalone without Gravitino server,
Review Comment:
Punctuation error: the sentence ends with a comma instead of a period.
```suggestion
Besides, Lance REST service can be run standalone without Gravitino server.
```
##########
docs/generic-lakehouse-catalog.md:
##########
@@ -0,0 +1,281 @@
+---
+title: "Lakehouse catalog"
+slug: /lakehouse-catalog
+keywords:
+ - lakehouse
+ - lance
+ - metadata
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Introduction
+
+Generic Lakehouse Catalog is a Gravitino catalog implementation that enables
Gravitino to interact with Lakehouse storage systems that use file system for
storing tabular data. Such lakehouse system could be built on top of object
stores like Amazon S3, Azure Blob Storage, Google Cloud Storage, or HDFS.
+Theoretically, it can work with any lakehouse storage system that supports
standard file system operations such as Apache Iceberg, Lance, Delta Lake, and
Apache Hudi. However, currently Gravitino only provides native support for
Lance-based lakehouse storage systems.
+
+### Requirements and limitations
+
+- The lakehouse storage system must support standard file system operations
such as listing directories, reading files, and writing files.
+
+## Catalog
+
+### Catalog capabilities
+
+All capabilities are the same as relational catalog, please refer to [Manage
Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md) for more details.
+
+### Catalog properties
+
+The only property that need to be noted for a generic lakehouse catalog is
`location`. This property specifies the root location of the lakehouse storage
system. All schemas and tables will be stored under this location if not
Review Comment:
Grammatical error: "The only property that need to be noted" should be "The
only property that needs to be noted" (singular subject requires singular verb).
```suggestion
The only property that needs to be noted for a generic lakehouse catalog is
`location`. This property specifies the root location of the lakehouse storage
system. All schemas and tables will be stored under this location if not
```
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,206 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Background
+
+Since version 1.1.0, Gravitino includes a REST service for Lance datasets. The
Lance REST service is a web service that allows you to interact with Lance
datasets over HTTP. It provides endpoints for querying, inserting, updating,
and deleting data in Lance datasets.
+It abides by the [Lance REST API
specification](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/lancedb/lance-namespace/refs/heads/main/docs/src/rest.yaml).
More details about the specification, please refer to docs
[here](https://lance.org/format/namespace/impls/rest/)
+
+Besides, Lance REST service can be run standalone without Gravitino server,
+
+
+
+## Capabilities
+
+The Lance REST service supports the APIs defined in the Lance REST API
specification. The following are some of the key capabilities of the Lance REST
service:
+- Namespace management including creating namespace, listing namespaces,
describing, deleting namespace, namespace exists check.
+- Table management including creating tables including creating empty tables,
dropping tables, registering tables and unregistering tables.
+- Index management including creating index, listing indexes. Dropping index
is not supported in 1.1.0.
+
+Full Supports are listed as the following table:
+
+| Operation ID | Description
| Since version |
+|----------------------|-----------------------------------------------------------------------------------------------------|---------------|
+| CreateNamespace | Create a Lance namespace
| 1.1.0 |
+| ListNamespaces | List all namespaces under a specific namespace
| 1.1.0 |
+| DescribeNamespace | Get details of a specific namespace
| 1.1.0 |
+| DropNamespace | Delete a specific namespace
| 1.1.0 |
+| NamespaceExists | Check if a namespace exists
| 1.1.0 |
+| ListTables | List all tables in a specific namespace
| 1.1.0 |
+| CreateTable | Create a new table in a specific namespace
| 1.1.0 |
+| DropTable | Delete a specific table from a namespace, drop table
will drop metadata and Lance data all together | 1.1.0 |
+| TableExists | Check if a specific table exists in a namespace
| 1.1.0 |
+| RegisterTable | Register an existing Lance table to a specific
namespace | 1.1.0 |
+| deregisterTable | Unregister a specific table from a namespace, it will
only remove metadata, Lance data will be kept | 1.1.0 |
+| CreateIndex | Create an index on a specific table
| 1.1.0 |
+| ListIndexes | List all indexes on a specific table
| 1.1.0 |
+
+## Getting started
+
+### Running Lance REST service with Gravitino
+
+To use the Lance REST service, you need to have Gravitino server running with
Lance REST service enabled. The following are configurations to enable Lance
REST service in Gravitino server.
+
+| Configuration item | Description
| Default value | Required
| Since Version |
+|------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|--------------------------------------------|---------------|
+| `gravitino.auxService.names` | Auxiliary service that runs
Lance REST service, currently it supports `iceberg-rest` and `lance-rest`. It
should include `lance-rest` if you want to start the Lance REST service like
`lance-rest`, or `lance-rest, iceberg-rest` | iceberg-rest,lance-rest | Yes if
Lance REST service is going to run | 0.2.0 |
+| `gravitino.lance-rest.classpath` | The class path of
lance-rest service, it's the relative path compared with Gravitino home.
| lance-rest-server/libs |
Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.port` | The port number that Lance
REST service listens on.
| 9101 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.host` | The host name that Lance
REST service run in
| 0.0.0.0 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.namespace-backend` | backend to store namespace
metadata, currently it only supports `gravitino`
| gravitino | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI, it
should be set when `namespace-backend` is `gravitino`
| http://localhost:8090 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name, it
should be set when `namespace-backend` is `gravitino`
| (none) | Yes if
Lance REST service is going to run | 1.1.0 |
+
+### Running Lance REST service standalone
+
+To run Lance REST service standalone without Gravitino server, you can use the
following command:
+
+```shell
+{GRAVITINO_HOME}/bin/gravitino-lance-rest-server.sh start
+```
+
+The following configurations are required to run Lance REST service
standalone, you can set them in `gravitino-lance-rest-server.conf` file or pass
them as command line arguments.
+Typically, you only need to change the following configurations:
+
+| Configuration item | Description
| Default value
| Required | Since Version |
+|------------------------------------------------|------------------------------------------------------------------------------------|--------------------------|--------------------------------------------|---------------|
+| `gravitino.lance-rest.namespace-backend` | backend to store namespace
metadata, currently it only supports `gravitino` | gravitino
| Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI, it
should be set when `namespace-backend` is `gravitino` |
http://localhost:8090 | Yes if Lance REST service is going to run | 1.1.0
|
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name, it
should be set when `namespace-backend` is `gravitino` | (none)
| Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.port` | The port number that Lance
REST service listens on. | 9101
| Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.host` | The host name that Lance
REST service run in | 0.0.0.0
| Yes if Lance REST service is going to run | 1.1.0 |
+
+`namespace-backend`, `uri`, `port` and `host` have the same meaning as
described in the previous section, and they have the default values. In most
cases you only need to change `metalake-name` to your Gravitino metalake name.
+For other configurations listed in the file, just keep their default values.
+
+## Using Lance REST service
+
+Currently, as the Lance REST service only support Gravitino backend, so there
are some limitations when using Lance REST service:
Review Comment:
Grammatical error: "the Lance REST service only support Gravitino backend"
should be "the Lance REST service only supports the Gravitino backend"
(subject-verb agreement). The "so" after the "as" clause is also redundant
and is dropped in the suggestion.
```suggestion
Currently, as the Lance REST service only supports the Gravitino backend,
there are some limitations when using the Lance REST service:
```
##########
docs/generic-lakehouse-catalog.md:
##########
@@ -0,0 +1,281 @@
+---
+title: "Lakehouse catalog"
+slug: /lakehouse-catalog
+keywords:
+ - lakehouse
+ - lance
+ - metadata
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Introduction
+
+Generic Lakehouse Catalog is a Gravitino catalog implementation that enables
Gravitino to interact with Lakehouse storage systems that use file system for
storing tabular data. Such lakehouse system could be built on top of object
stores like Amazon S3, Azure Blob Storage, Google Cloud Storage, or HDFS.
+Theoretically, it can work with any lakehouse storage system that supports
standard file system operations such as Apache Iceberg, Lance, Delta Lake, and
Apache Hudi. However, currently Gravitino only provides native support for
Lance-based lakehouse storage systems.
+
+### Requirements and limitations
+
+- The lakehouse storage system must support standard file system operations
such as listing directories, reading files, and writing files.
+
+## Catalog
+
+### Catalog capabilities
+
+All capabilities are the same as relational catalog, please refer to [Manage
Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md) for more details.
+
+### Catalog properties
+
+The only property that need to be noted for a generic lakehouse catalog is
`location`. This property specifies the root location of the lakehouse storage
system. All schemas and tables will be stored under this location if not
+specified otherwise in schema or table properties.
+
+
+### Catalog operations
+
+All operations are the same as relational catalog, please refer to [Manage
Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md) for more details.
+
+One thing need to be noted is that the provider will be `generic_lakehouse`
when creating a generic lakehouse catalog.
+That is:
+
+<Tabs groupId='language' queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+ "name": "generic_lakehouse_catalog",
+ "type": "RELATIONAL",
+ "comment": "comment",
+ "provider": "generic-lakehouse",
+ "properties": {
+ }
+}' http://localhost:8090/api/metalakes/metalake/catalogs
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+
+// Assuming you have just created a metalake named `metalake`
+GravitinoClient gravitinoClient = GravitinoClient
+ .builder("http://127.0.0.1:8090")
+ .withMetalake("metalake")
+ .build();
+
+Map<String, String> genericCatalogProperties = ImmutableMap.<String,
String>builder()
+ .put("location", "hdfs://localhost:9000/user/lakehouse") // The root
location of the lakehouse storage system
+ .build();
+
+Catalog catalog = gravitinoClient.createCatalog("generic_lakehouse_catalog",
+ Type.RELATIONAL,
+ "generic_lakehouse",
+ "This is a generic lakehouse catalog",
+ genericCatalogProperties); // Please change the properties according to
the value of the provider.
+// ...
+```
+
+</TabItem>
+</Tabs>
+
+
+## Schema
+
+### Schema capabilities
+
+All capabilities are the same as relational catalog, please refer to [Manage
Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md) for more details.
+
+### Schema properties
+
+The same as catalog properties, please refer to [Catalog
properties](#catalog-properties) section for more details. Schema `location`
property can be used to specify the location to store all tables under this
schema.
+
+### Schema operations
+
+Please refer to [Manage Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md#schema-operations)
for more details.
+
+## Table
+
+### Table capabilities
+
+Currently, for a lance table, Gravitino supports the following capabilities:
+- List
+- Load
+- Alter (partial supported)
+- Create/register
+- Drop and truncate
+
+### Table partitions
+
+Not support now
+
+### Table sort orders
+
+Not support now.
+
+### Table distributions
+
+Not support now.
+
+### Table column types
+
+Since Lance uses Apache Arrow as the table schema, the following table shows
the mapping between Gravitino types and Arrow types:
+
+| Gravitino Type | Arrow Type |
+|----------------------------------|-----------------------------------------|
+| `Struct` | `Struct` |
+| `Map` | `Map` |
+| `List` | `Array` |
+| `Boolean` | `Boolean` |
+| `Byte` | `Int8` |
+| `Short` | `Int16` |
+| `Integer` | `Int32` |
+| `Long` | `Int64` |
+| `Float` | `Float` |
+| `Double` | `Double` |
+| `String` | `Utf8` |
+| `Binary` | `Binary` |
+| `Decimal(p, s)` | `Decimal(p, s)` (128-bit) |
+| `Date` | `Date` |
+| `Timestamp`/`Timestamp(6)` | `TimestampType withoutZone` |
+| `Timestamp(0)` | `TimestampType Second withoutZone` |
+| `Timestamp(3)` | `TimestampType Millisecond withoutZone` |
+| `Timestamp(9)` | `TimestampType Nanosecond withoutZone` |
+| `Timestamp_tz`/`Timestamp_tz(6)` | `TimestampType Microsecond withUtc` |
+| `Timestamp_tz(0)` | `TimestampType Second withUtc` |
+| `Timestamp_tz(3)` | `TimestampType Millisecond withUtc` |
+| `Timestamp_tz(9)` | `TimestampType Nanosecond withUtc` |
+| `Time`/`Time(9)` | `Time Nanosecond` |
+| `Null` | `Null` |
+| `Fixed(n)` | `Fixed-Size Binary(n)` |
+| `Interval_year` | `Interval(YearMonth)` |
+| `Interval_day` | `Duration(Microsecond)` |
+| `External(arrow_field_json_str)` | Any Arrow Field (see note below) |
+
+`External(arrow_field_json_str)`:
+
+As the table above shows, Gravitino provides mappings for most common data
types. However,
+in some cases, you may need to use an Arrow data type that is not directly
supported by Gravitino.
+
+To address this, Gravitino introduces the `External(arrow_field_json_str)`
type,
+which allows you to define any Arrow data type by providing the JSON string of
an Arrow `Field`.
+
+The JSON string must conform to the Apache Arrow `Field`
[specification](https://github.com/apache/arrow-java/blob/ed81e5981a2bee40584b3a411ed755cb4cc5b91f/vector/src/main/java/org/apache/arrow/vector/types/pojo/Field.java#L80C1-L86C68),
+including details such as the field name, data type, and nullability.
+Here are some examples of how to use `External` type for various Arrow types
that are not natively supported by Gravitino:
+
+| Arrow Type | External type
|
+|-------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `Large Utf8` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largeutf8\"},\"children\":[]}")`
|
+| `Large Binary` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largebinary\"},\"children\":[]}")`
|
+| `Large List` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"largelist\"},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",
\"bitWidth\":32, \"isSigned\": true},\"children\":[]}]}")`
|
+| `Fixed-Size List` |
`External("{\"name\":\"col_name\",\"nullable\":true,\"type\":{\"name\":\"fixedsizelist\",
\"listSize\":10},\"children\":[{\"name\":\"element\",\"nullable\":true,\"type\":{\"name\":\"int\",
\"bitWidth\":32, \"isSigned\": true},\"children\":[]}]}")` |
+
+**Important considerations:**
+- The `name` attribute and `nullable` attribute in the JSON string must
exactly match the corresponding column name and nullable in the Gravitino table.
+- The `children` array should be empty for primitive types. For complex types
like `Struct` or `List`, it must contain the definitions of the child fields.
+
+### Table properties
+
+Currently, the following properties are required for a table in a generic
lakehouse catalog
+
+| Configuration item | Description
| Default value | Required
| Since version |
+|--------------------|-----------------------------------------------------------------------------------------------|---------------|-----------------------------------------------------------------------------|---------------|
+| `format` | The format for a table, it can be `lance`,
`iceberg`,..., currently, it only supports `lance` | (none) | Yes
| 1.1.0
|
+| `location` | The location to storage the table meta and data.
| (none) | No, but if this is not
set in catalog or schema, then it's a required value | 1.1.0 |
+
+Of course, apart from the above-required properties, you can also set other
table properties supported by the underlying lakehouse storage system or your
custom properties.
+
+### Table indexes
+
+This part is almost the same as relational catalog, please refer to [Manage
Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md#table-partitioning-distribution-sort-ordering-and-indexes)
for more details.
+However, different lakehouse storage systems may have different supports for
indexes, and the following tables show the support for indexes in a Lance-based
lakehouse storage system.
+
+| Index type | Description
| Lance |
+|------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|
+| SCALAR | SCALAR index is used to optimize searches on scalar data types
such as integers, floats, and so on.
| Y |
+| VECTOR | VECTOR index is used to optimize similarity searches in
high-dimensional vector spaces.
| Y |
+| BTREE | BTREE index is a balanced tree data structure that maintains
sorted data and allows for logarithmic time complexity for search, insert, and
delete operations.
| Y |
+| INVERTED | INVERTED index is a data structure used to optimize full-text
searches by mapping terms to their locations within a dataset, allowing for
quick retrieval of documents containing specific words or phrases.
| Y |
+| IVF_FLAT | IVF_FLAT (Inverted File with Flat quantization) index is used
for efficient similarity searches in high-dimensional vector spaces by
partitioning the vector space into clusters and storing vectors in a flat
structure within each cluster.
| Y |
+| IVF_SQ | IVF_SQ (Inverted File with Scalar Quantization) index is used
for efficient similarity searches in high-dimensional vector spaces by
partitioning the vector space into clusters and storing quantized
representations of vectors within each cluster to reduce memory usage.
| Y |
+| IVF_PQ | IVF_PQ (Inverted File with Product Quantization) index is used
for efficient similarity searches in high-dimensional vector spaces by
partitioning the vector space into clusters and storing product-quantized
representations of vectors within each cluster to achieve a balance between
search accuracy and memory efficiency. | Y |
+
+Another point is that **NOT all lakehouse table support creating index when
creating table**, and Lance is one of them. So when creating a lance table, you
cannot specify indexes at the same time. You need to create the table first,
then create indexes on the table.
+
+### Table operations
+
+Please refer to [Manage Relational Metadata Using
Gravitino](./manage-relational-metadata-using-gravitino.md#table-operations)
for more details.
+
+The only difference is when creating/registering a table, you need to specify
the `format` property to indicate the underlying lakehouse storage system
format, e.g. `lance`.
Review Comment:
Wording: "when creating/registering a table, you need to specify" reads more
clearly as "when creating or registering a table, you need to specify". The
slash is a style issue rather than a grammatical error.
```suggestion
The only difference is when creating or registering a table, you need to
specify the `format` property to indicate the underlying lakehouse storage
system format, e.g. `lance`.
```
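A related note on the `External(arrow_field_json_str)` section in the hunk above: the hand-escaped JSON strings in the examples table are easy to get wrong. The following is a minimal Python sketch, assuming only the JSON shape shown in that table, that builds the Arrow `Field` JSON with `json.dumps` instead of writing it by hand. The helper names `arrow_field` and `external_json` are hypothetical and are not part of Gravitino or Arrow.

```python
import json


def arrow_field(name, nullable, arrow_type, children=()):
    """Build a dict in the Arrow Field JSON shape used by the docs table.

    Hypothetical helper for illustration; not a Gravitino or Arrow API.
    """
    return {
        "name": name,
        "nullable": nullable,
        "type": arrow_type,
        "children": list(children),  # empty for primitive types
    }


def external_json(field):
    """Serialize a field dict to a compact JSON string."""
    return json.dumps(field, separators=(",", ":"))


# Primitive type: `children` stays empty.
large_utf8 = external_json(arrow_field("col_name", True, {"name": "largeutf8"}))

# Complex type: `children` carries the element field definition.
element = arrow_field(
    "element", True, {"name": "int", "bitWidth": 32, "isSigned": True}
)
fixed_size_list = external_json(
    arrow_field("col_name", True, {"name": "fixedsizelist", "listSize": 10}, [element])
)

print(large_utf8)
print(fixed_size_list)
```

The `name` and `nullable` arguments would still need to match the Gravitino column definition, as the "Important considerations" list in the hunk states.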
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,206 @@
+---
+title: "Lance REST service"
+slug: /lance-rest-service
+keywords:
+ - Lance REST
+license: "This software is licensed under the Apache License version 2."
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+## Background
+
+Since version 1.1.0, Gravitino includes a REST service for Lance datasets. The
Lance REST service is a web service that allows you to interact with Lance
datasets over HTTP. It provides endpoints for querying, inserting, updating,
and deleting data in Lance datasets.
+It abides by the [Lance REST API
specification](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/lancedb/lance-namespace/refs/heads/main/docs/src/rest.yaml).
More details about the specification, please refer to docs
[here](https://lance.org/format/namespace/impls/rest/)
+
+Besides, Lance REST service can be run standalone without Gravitino server,
+
+
+
+## Capabilities
+
+The Lance REST service supports the APIs defined in the Lance REST API
specification. The following are some of the key capabilities of the Lance REST
service:
+- Namespace management including creating namespace, listing namespaces,
describing, deleting namespace, namespace exists check.
+- Table management including creating tables including creating empty tables,
dropping tables, registering tables and unregistering tables.
+- Index management including creating index, listing indexes. Dropping index
is not supported in 1.1.0.
+
+Full Supports are listed as the following table:
+
+| Operation ID | Description
| Since version |
+|----------------------|-----------------------------------------------------------------------------------------------------|---------------|
+| CreateNamespace | Create a Lance namespace
| 1.1.0 |
+| ListNamespaces | List all namespaces under a specific namespace
| 1.1.0 |
+| DescribeNamespace | Get details of a specific namespace
| 1.1.0 |
+| DropNamespace | Delete a specific namespace
| 1.1.0 |
+| NamespaceExists | Check if a namespace exists
| 1.1.0 |
+| ListTables | List all tables in a specific namespace
| 1.1.0 |
+| CreateTable | Create a new table in a specific namespace
| 1.1.0 |
+| DropTable | Delete a specific table from a namespace, drop table
will drop metadata and Lance data all together | 1.1.0 |
+| TableExists | Check if a specific table exists in a namespace
| 1.1.0 |
+| RegisterTable | Register an existing Lance table to a specific
namespace | 1.1.0 |
+| deregisterTable | Unregister a specific table from a namespace, it will
only remove metadata, Lance data will be kept | 1.1.0 |
Review Comment:
Spelling/capitalization inconsistency: "deregisterTable" should be
capitalized as "DeregisterTable" to match the pattern of other operation IDs in
the table.
```suggestion
| DeregisterTable      | Unregister a specific table from a namespace, it will only remove metadata, Lance data will be kept | 1.1.0         |
```
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,206 @@
+Full Supports are listed as the following table:
Review Comment:
Awkward phrasing: "Full Supports are listed" should be "Full support is
listed" or "Full capabilities are listed".
```suggestion
Full capabilities are listed in the following table:
```
##########
docs/lance-rest-service.md:
##########
@@ -0,0 +1,206 @@
+Full Supports are listed as the following table:
+
+| Operation ID | Description
| Since version |
+|----------------------|-----------------------------------------------------------------------------------------------------|---------------|
+| CreateNamespace | Create a Lance namespace
| 1.1.0 |
+| ListNamespaces | List all namespaces under a specific namespace
| 1.1.0 |
+| DescribeNamespace | Get details of a specific namespace
| 1.1.0 |
+| DropNamespace | Delete a specific namespace
| 1.1.0 |
+| NamespaceExists | Check if a namespace exists
| 1.1.0 |
+| ListTables | List all tables in a specific namespace
| 1.1.0 |
+| CreateTable | Create a new table in a specific namespace
| 1.1.0 |
+| DropTable | Delete a specific table from a namespace, drop table
will drop metadata and Lance data all together | 1.1.0 |
+| TableExists | Check if a specific table exists in a namespace
| 1.1.0 |
+| RegisterTable | Register an existing Lance table to a specific
namespace | 1.1.0 |
+| deregisterTable | Unregister a specific table from a namespace, it will
only remove metadata, Lance data will be kept | 1.1.0 |
+| CreateIndex | Create an index on a specific table
| 1.1.0 |
+| ListIndexes | List all indexes on a specific table
| 1.1.0 |
+
+## Getting started
+
+### Running Lance REST service with Gravitino
+
+To use the Lance REST service, you need to have Gravitino server running with
Lance REST service enabled. The following are configurations to enable Lance
REST service in Gravitino server.
+
+| Configuration item | Description
| Default value | Required
| Since Version |
+|------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|--------------------------------------------|---------------|
+| `gravitino.auxService.names` | Auxiliary service that runs
Lance REST service, currently it supports `iceberg-rest` and `lance-rest`. It
should include `lance-rest` if you want to start the Lance REST service like
`lance-rest`, or `lance-rest, iceberg-rest` | iceberg-rest,lance-rest | Yes if
Lance REST service is going to run | 0.2.0 |
+| `gravitino.lance-rest.classpath` | The class path of
lance-rest service, it's the relative path compared with Gravitino home.
| lance-rest-server/libs |
Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.port` | The port number that Lance
REST service listens on.
| 9101 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.host` | The host name that Lance
REST service run in
| 0.0.0.0 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.namespace-backend` | backend to store namespace
metadata, currently it only supports `gravitino`
| gravitino | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI, it
should be set when `namespace-backend` is `gravitino`
| http://localhost:8090 | Yes if
Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name, it
should be set when `namespace-backend` is `gravitino`
| (none) | Yes if
Lance REST service is going to run | 1.1.0 |
+
+### Running Lance REST service standalone
+
+To run Lance REST service standalone without Gravitino server, you can use the
following command:
+
+```shell
+{GRAVITINO_HOME}/bin/gravitino-lance-rest-server.sh start
+```
+
+The following configurations are required to run Lance REST service
standalone, you can set them in `gravitino-lance-rest-server.conf` file or pass
them as command line arguments.
+Typically, you only need to change the following configurations:
+
+| Configuration item | Description
| Default value
| Required | Since Version |
+|------------------------------------------------|------------------------------------------------------------------------------------|--------------------------|--------------------------------------------|---------------|
+| `gravitino.lance-rest.namespace-backend` | backend to store namespace
metadata, currently it only supports `gravitino` | gravitino
| Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.gravitino.uri` | Gravitino server URI, it
should be set when `namespace-backend` is `gravitino` |
http://localhost:8090 | Yes if Lance REST service is going to run | 1.1.0
|
+| `gravitino.lance-rest.gravitino.metalake-name` | Gravitino metalake name, it
should be set when `namespace-backend` is `gravitino` | (none)
| Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.port` | The port number that Lance
REST service listens on. | 9101
| Yes if Lance REST service is going to run | 1.1.0 |
+| `gravitino.lance-rest.host` | The host name that Lance
REST service run in | 0.0.0.0
| Yes if Lance REST service is going to run | 1.1.0 |
+
+`namespace-backend`, `uri`, `port` and `host` have the same meaning as
described in the previous section, and they have the default values. In most
cases you only need to change `metalake-name` to your Gravitino metalake name.
+For other configurations listed in the file, just keep their default values.
+
+## Using Lance REST service
+
+Currently, as the Lance REST service only support Gravitino backend, so there
are some limitations when using Lance REST service:
+- You need to have a running Gravitino server with a metalake created.
+- As Gravitino has three hierarchies: catalog -> schema -> table, so when you
create namespaces or a table via Lance REST service, you need to make sure the
parent hierarchy exists. For example, when you create a namespace
`lance_catalog/schema`, you need to make sure the catalog `lance_catalog`
already exists in Gravitino metalake. If not, you need to create the
namespace(catalog) `lance_catalog` first.
+- Currently, we can only support two layers of namespaces and then tables,
that is to say, you can create namespace like `lance_catalog/schema`, but you
cannot create namespace like `lance_catalog/schema/sub_schema`. Tables can only
be created under the namespace `lance_catalog/schema`.
+
+## Example
+
+When Gravitino server is started with Lance REST service starts successfully,
and a `generic-lakehouse` catalog named `lance_catalog` is created in Gravitino
metalake, you can use the following Python code to interact with Lance REST
service:
+
+
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+# Create a namespace
+# mode can be create or exist_ok or overwrite
+curl -X POST http://localhost:9101/lance/v1/namespace/lance_catalog/create -H
'Content-Type: application/json' -d '{
+ "id": ["lance_catalog"],
+ "mode": "create"
+}'
+
+# Create a schema namespace
+# %24 is the URL encoded character for $
+curl -X POST
http://localhost:9101/lance/v1/namespace/lance_catalog%24schema/create -H
'Content-Type: application/json' -d '{
+ "id": ["lance_catalog", "schema"],
+ "mode": "create"
+}'
+
+# register a table
+curl -X POST
http://localhost:9101/lance/v1/table/lance_catalog2%24schema%24table01/register
-H 'Content-Type: application/json' -d '{
+ "id": ["lance_catalog","schema","table01"],
+ "location": "/tmp/lance_catalog/schema/table01"
+}'
+
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+// implementation("com.lancedb:lance-namespace-core:0.0.19")
+
+private final BufferAllocator allocator = new RootAllocator(Long.MAX_VALUE);
+LanceNamespace ns = LanceNamespace.connect("rest", Map.of("uri",
"http://localhost:9101/lance"));
+HashMap<String, String> props = Maps.newHashMap();
+props.put(RestNamespaceConfig.URI, getLanceRestServiceUrl());
+props.put(RestNamespaceConfig.DELIMITER,
RestNamespaceConfig.DELIMITER_DEFAULT);
+LanceNamespace ns = LanceNamespaces.connect("rest", props, null, allocator);
+
Review Comment:
There is a duplicate `LanceNamespace ns` declaration. Line 130 declares `ns`
and line 134 declares it again with different parameters, which does not
compile when both statements share one scope. Remove one of them, or present
them as clearly labeled alternative approaches.
```suggestion
// You can initialize the LanceNamespace in one of two ways:
// Approach 1: Simple connection with URI
LanceNamespace ns = LanceNamespace.connect("rest", Map.of("uri",
"http://localhost:9101/lance"));
// Approach 2: Advanced connection with custom properties and allocator
// HashMap<String, String> props = Maps.newHashMap();
// props.put(RestNamespaceConfig.URI, getLanceRestServiceUrl());
// props.put(RestNamespaceConfig.DELIMITER, RestNamespaceConfig.DELIMITER_DEFAULT);
// LanceNamespace ns = LanceNamespaces.connect("rest", props, null, allocator);
```
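Separately, on the shell example in the same hunk: the request paths join the namespace levels with `$` and percent-encode it as `%24`. A minimal Python sketch of that encoding follows, assuming the `$` delimiter and the `/v1/namespace/{id}/create` and `/v1/table/{id}/register` path shapes seen in the curl commands. The function names are hypothetical, and the sketch only builds URL strings without contacting any server.

```python
from urllib.parse import quote

# Delimiter between namespace levels, as used in the curl examples.
DELIMITER = "$"


def encode_id(levels):
    """Join the id levels with the delimiter and percent-encode the result."""
    return quote(DELIMITER.join(levels), safe="")


def create_namespace_url(base, levels):
    """URL for the namespace create call (path shape taken from the curl example)."""
    return f"{base}/v1/namespace/{encode_id(levels)}/create"


def register_table_url(base, levels):
    """URL for the table register call (path shape taken from the curl example)."""
    return f"{base}/v1/table/{encode_id(levels)}/register"


base = "http://localhost:9101/lance"
print(create_namespace_url(base, ["lance_catalog", "schema"]))
print(register_table_url(base, ["lance_catalog", "schema", "table01"]))
```

Deriving the path from the same list that goes into the request body's `id` field keeps the two in sync.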
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.