(gravitino) branch main updated: [#9539] doc(udf): add user doc for Function management (#9971)

jshao Thu, 12 Feb 2026 22:43:20 -0800

This is an automated email from the ASF dual-hosted git repository.

jshao pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/gravitino.git



The following commit(s) were added to refs/heads/main by this push:
     new f2865e1f9a [#9539] doc(udf): add user doc for Function management 
(#9971)
f2865e1f9a is described below

commit f2865e1f9ae6622d5a85f002f55cd6baf28dbba0
Author: mchades <[email protected]>
AuthorDate: Fri Feb 13 14:43:10 2026 +0800

    [#9539] doc(udf): add user doc for Function management (#9971)
    
    ### What changes were proposed in this pull request?
    
    add user doc for Function management
    
    ### Why are the changes needed?
    
    Fix: #9539
    
    ### Does this PR introduce _any_ user-facing change?
    
    no
    
    ### How was this patch tested?
    
    no need
---
 docs/index.md                                      |   2 +
 ...manage-user-defined-function-using-gravitino.md | 703 +++++++++++++++++++++
 docs/spark-connector/spark-catalog-hive.md         |   3 +-
 docs/spark-connector/spark-connector-udf.md        |  80 +++
 4 files changed, 786 insertions(+), 2 deletions(-)

diff --git a/docs/index.md b/docs/index.md
index 7e8b7e0f53..922eb94423 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -61,6 +61,8 @@ You can use either to manage metadata. See
   messaging metadata.
 * [Manage model metadata using 
Gravitino](./manage-model-metadata-using-gravitino.md) to learn how to manage
   model metadata.
+* [Manage user-defined functions using 
Gravitino](./manage-user-defined-function-using-gravitino.md) to learn how to 
manage
+  user-defined functions.
 
 Also, you can find the complete REST API definition in
 [Gravitino Open API](./api/rest/gravitino-rest-api),
diff --git a/docs/manage-user-defined-function-using-gravitino.md 
b/docs/manage-user-defined-function-using-gravitino.md
new file mode 100644
index 0000000000..2316e7d8b4
--- /dev/null
+++ b/docs/manage-user-defined-function-using-gravitino.md
@@ -0,0 +1,703 @@
+---
+title: Manage user-defined functions using Gravitino
+slug: /manage-user-defined-function-using-gravitino
+keyword: Gravitino user-defined function UDF manage
+license: This software is licensed under the Apache License version 2.
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+This page introduces how to manage user-defined functions (UDFs) in Apache 
Gravitino. Gravitino
+provides a centralized function registry that allows you to define custom 
functions once and
+share them across multiple compute engines like Spark and Trino.
+
+A function in Gravitino is characterized by:
+
+- **Name**: The function identifier within a schema.
+- **Function type**: `SCALAR` (row-by-row operations), `AGGREGATE` (group 
operations), or
+  `TABLE` (set-returning operations).
+- **Deterministic**: Whether the function always returns the same result for 
the same input.
+- **Definitions**: One or more overloads, each with a specific parameter list, 
return type
+  (or return columns for table functions), and one or more implementations for 
different
+  runtimes (e.g. Spark, Trino).
+
+Each definition can have multiple implementations in different languages (SQL, 
Java, Python)
+targeting different runtimes. **Each definition must have at most one 
implementation per
+runtime** — for example, you cannot have two implementations both targeting 
`SPARK` in the
+same definition. To replace an existing implementation, use `updateImpl` 
instead of `addImpl`.
+
+| Language | Key fields             | Description                              
            |
+|----------|------------------------|------------------------------------------------------|
+| SQL      | `sql`                  | An inline SQL expression.                
            |
+| Java     | `className`            | Fully qualified Java class name.         
            |
+| Python   | `handler`, `codeBlock` | Python handler entry point and optional 
inline code. |
+
+To use function management, please make sure that:
+
+ - The Gravitino server has started and is serving at, e.g. 
[http://localhost:8090](http://localhost:8090).
+ - A metalake has been created.
+ - A catalog has been created within the metalake.
+ - A schema has been created within the catalog.
+
+## Function operations
+
+### Register a function
+
+You can register a function by sending a `POST` request to the 
`/api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/functions`
+endpoint or just use the Gravitino Java/Python client. The following is an 
example of registering
+a scalar function:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+  "name": "add_one",
+  "functionType": "SCALAR",
+  "deterministic": true,
+  "comment": "A scalar function that adds one to the input",
+  "definitions": [
+    {
+      "parameters": [
+        {"name": "x", "dataType": "integer"}
+      ],
+      "returnType": "integer",
+      "impls": [
+        {
+          "language": "SQL",
+          "runtime": "TRINO",
+          "sql": "x + 1"
+        }
+      ]
+    }
+  ]
+}' 
http://localhost:8090/api/metalakes/example/catalogs/my_catalog/schemas/my_schema/functions
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+GravitinoClient gravitinoClient = GravitinoClient
+    .builder("http://localhost:8090";)
+    .withMetalake("example")
+    .build();
+
+Catalog catalog = gravitinoClient.loadCatalog("my_catalog");
+FunctionCatalog functionCatalog = catalog.asFunctionCatalog();
+
+// Define the function parameter
+FunctionParam param = FunctionParams.of("x", Types.IntegerType.get());
+
+// Create a SQL implementation for Trino
+FunctionImpl sqlImpl = FunctionImpls.ofSql(
+    FunctionImpl.RuntimeType.TRINO, "x + 1");
+
+// Create the function definition with return type
+FunctionDefinition definition = FunctionDefinitions.of(
+    new FunctionParam[] {param},
+    Types.IntegerType.get(),
+    new FunctionImpl[] {sqlImpl});
+
+// Register the function
+Function function = functionCatalog.registerFunction(
+    NameIdentifier.of("my_schema", "add_one"),
+    "A scalar function that adds one to the input",
+    FunctionType.SCALAR,
+    true,
+    new FunctionDefinition[] {definition});
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient = 
GravitinoClient(uri="http://localhost:8090";, metalake_name="example")
+
+catalog: Catalog = gravitino_client.load_catalog(name="my_catalog")
+function_catalog = catalog.as_function_catalog()
+
+# Define the function parameter
+param = FunctionParams.of("x", Types.IntegerType.get())
+
+# Create a SQL implementation for Trino
+sql_impl = (
+    SQLImpl.builder()
+    .with_runtime_type(SQLImpl.RuntimeType.TRINO)
+    .with_sql("x + 1")
+    .build()
+)
+
+# Create the function definition
+definition = FunctionDefinitions.of([param], Types.IntegerType.get(), 
[sql_impl])
+
+# Register the function
+function = function_catalog.register_function(
+    ident=NameIdentifier.of("my_schema", "add_one"),
+    comment="A scalar function that adds one to the input",
+    function_type=FunctionType.SCALAR,
+    deterministic=True,
+    definitions=[definition],
+)
+```
+
+</TabItem>
+</Tabs>
+
+For table-valued functions, use `returnColumns` instead of `returnType` in the 
function
+definition, and use `FunctionType.TABLE` as the function type:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+  "name": "generate_series",
+  "functionType": "TABLE",
+  "deterministic": true,
+  "comment": "A table function that generates a series of integers",
+  "definitions": [
+    {
+      "parameters": [
+        {"name": "start_val", "dataType": "integer"},
+        {"name": "end_val", "dataType": "integer"}
+      ],
+      "returnColumns": [
+        {"name": "value", "dataType": "integer", "comment": "The generated 
integer value"}
+      ],
+      "impls": [
+        {
+          "language": "JAVA",
+          "runtime": "SPARK",
+          "className": "com.example.GenerateSeriesFunction",
+          "resources": {
+            "jars": ["hdfs:///path/to/udtf.jar"]
+          }
+        }
+      ]
+    }
+  ]
+}' 
http://localhost:8090/api/metalakes/example/catalogs/my_catalog/schemas/my_schema/functions
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+// ...
+FunctionCatalog functionCatalog = catalog.asFunctionCatalog();
+
+FunctionParam[] params = new FunctionParam[] {
+    FunctionParams.of("start_val", Types.IntegerType.get()),
+    FunctionParams.of("end_val", Types.IntegerType.get())
+};
+
+FunctionColumn[] returnColumns = new FunctionColumn[] {
+    FunctionColumn.of("value", Types.IntegerType.get(), "The generated integer 
value")
+};
+
+FunctionImpl javaImpl = FunctionImpls.ofJava(
+    FunctionImpl.RuntimeType.SPARK,
+    "com.example.GenerateSeriesFunction",
+    FunctionResources.of(
+        new String[] {"hdfs:///path/to/udtf.jar"}, null, null),
+    null);
+
+FunctionDefinition definition = FunctionDefinitions.of(
+    params, returnColumns, new FunctionImpl[] {javaImpl});
+
+Function function = functionCatalog.registerFunction(
+    NameIdentifier.of("my_schema", "generate_series"),
+    "A table function that generates a series of integers",
+    FunctionType.TABLE,
+    true,
+    new FunctionDefinition[] {definition});
+// ...
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+# ...
+function_catalog = catalog.as_function_catalog()
+
+# Define the function parameters
+param_start = FunctionParams.of("start_val", Types.IntegerType.get())
+param_end = FunctionParams.of("end_val", Types.IntegerType.get())
+
+# Define the return columns for the table function
+return_columns = [
+    FunctionColumn.of("value", Types.IntegerType.get(), "The generated integer 
value")
+]
+
+java_impl = (
+    JavaImpl.builder()
+    .with_runtime_type(JavaImpl.RuntimeType.SPARK)
+    .with_class_name("com.example.GenerateSeriesFunction")
+    .build()
+)
+
+# Use of_table() for table-valued functions
+definition = FunctionDefinitions.of_table([param_start, param_end], 
return_columns, [java_impl])
+
+function = function_catalog.register_function(
+    ident=NameIdentifier.of("my_schema", "generate_series"),
+    comment="A table function that generates a series of integers",
+    function_type=FunctionType.TABLE,
+    deterministic=True,
+    definitions=[definition],
+)
+```
+
+</TabItem>
+</Tabs>
+
+### Get a function
+
+You can get a function by sending a `GET` request to the 
`/api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/functions/{function_name}`
+endpoint or by using the Gravitino Java/Python client. The following is an 
example of getting
+a function:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" \
+http://localhost:8090/api/metalakes/example/catalogs/my_catalog/schemas/my_schema/functions/add_one
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+// ...
+Catalog catalog = gravitinoClient.loadCatalog("my_catalog");
+FunctionCatalog functionCatalog = catalog.asFunctionCatalog();
+
+Function function = functionCatalog.getFunction(
+    NameIdentifier.of("my_schema", "add_one"));
+// ...
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient = 
GravitinoClient(uri="http://localhost:8090";, metalake_name="example")
+
+catalog: Catalog = gravitino_client.load_catalog(name="my_catalog")
+function = catalog.as_function_catalog().get_function(
+    ident=NameIdentifier.of("my_schema", "add_one"))
+```
+
+</TabItem>
+</Tabs>
+
+### List functions
+
+You can list all the functions in a schema by sending a `GET` request to the 
`/api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/functions`
+endpoint or by using the Gravitino Java/Python client. The following is an 
example of listing
+all the functions in a schema:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" \
+http://localhost:8090/api/metalakes/example/catalogs/my_catalog/schemas/my_schema/functions
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+// ...
+Catalog catalog = gravitinoClient.loadCatalog("my_catalog");
+FunctionCatalog functionCatalog = catalog.asFunctionCatalog();
+
+NameIdentifier[] identifiers = functionCatalog.listFunctions(
+    Namespace.of("my_schema"));
+// ...
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient = 
GravitinoClient(uri="http://localhost:8090";, metalake_name="example")
+
+catalog: Catalog = gravitino_client.load_catalog(name="my_catalog")
+functions = catalog.as_function_catalog().list_functions(
+    namespace=Namespace.of("my_schema"))
+```
+
+</TabItem>
+</Tabs>
+
+You can also list functions with detailed information by adding the `details` 
query parameter.
+This returns the full function objects instead of just the identifiers.
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X GET -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" \
+"http://localhost:8090/api/metalakes/example/catalogs/my_catalog/schemas/my_schema/functions?details=true";
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+// ...
+Catalog catalog = gravitinoClient.loadCatalog("my_catalog");
+FunctionCatalog functionCatalog = catalog.asFunctionCatalog();
+
+Function[] functions = functionCatalog.listFunctionInfos(
+    Namespace.of("my_schema"));
+// ...
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient = 
GravitinoClient(uri="http://localhost:8090";, metalake_name="example")
+
+catalog: Catalog = gravitino_client.load_catalog(name="my_catalog")
+functions = catalog.as_function_catalog().list_function_infos(
+    namespace=Namespace.of("my_schema"))
+```
+
+</TabItem>
+</Tabs>
+
+### Alter a function
+
+You can modify a function by sending a `PUT` request to the 
`/api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/functions/{function_name}`
+endpoint or using the Gravitino Java/Python client. The following is an 
example of updating a
+function's comment:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+  "updates": [
+    {
+      "@type": "updateComment",
+      "newComment": "An improved scalar function that adds one"
+    }
+  ]
+}' 
http://localhost:8090/api/metalakes/example/catalogs/my_catalog/schemas/my_schema/functions/add_one
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+// ...
+Catalog catalog = gravitinoClient.loadCatalog("my_catalog");
+FunctionCatalog functionCatalog = catalog.asFunctionCatalog();
+
+Function updated = functionCatalog.alterFunction(
+    NameIdentifier.of("my_schema", "add_one"),
+    FunctionChange.updateComment("An improved scalar function that adds one"));
+// ...
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient = 
GravitinoClient(uri="http://localhost:8090";, metalake_name="example")
+
+catalog: Catalog = gravitino_client.load_catalog(name="my_catalog")
+function = catalog.as_function_catalog().alter_function(
+    NameIdentifier.of("my_schema", "add_one"),
+    FunctionChange.update_comment("An improved scalar function that adds one"),
+)
+```
+
+</TabItem>
+</Tabs>
+
+#### Supported modifications
+
+The following operations are supported for altering a function:
+
+| Operation                 | JSON Example                                     
                                           | Java Method                        
                    | Python Method                                           |
+|---------------------------|---------------------------------------------------------------------------------------------|--------------------------------------------------------|---------------------------------------------------------|
+| **Update comment**        | `{"@type":"updateComment","newComment":"new 
comment"}`                                      | 
`FunctionChange.updateComment("new comment")`          | 
`FunctionChange.update_comment("new comment")`          |
+| **Add definition**        | `{"@type":"addDefinition","definition":{...}}`   
                                           | 
`FunctionChange.addDefinition(definition)`             | 
`FunctionChange.add_definition(definition)`             |
+| **Remove definition**     | 
`{"@type":"removeDefinition","parameters":[{"name":"x","dataType":"integer"}]}` 
            | `FunctionChange.removeDefinition(params)`              | 
`FunctionChange.remove_definition(params)`              |
+| **Add implementation**    | 
`{"@type":"addImpl","parameters":[...],"implementation":{...}}`                 
            | `FunctionChange.addImpl(params, impl)`                 | 
`FunctionChange.add_impl(params, impl)`                 |
+| **Update implementation** | 
`{"@type":"updateImpl","parameters":[...],"runtime":"SPARK","implementation":{...}}`
        | `FunctionChange.updateImpl(params, runtime, impl)`     | 
`FunctionChange.update_impl(params, runtime, impl)`     |
+| **Remove implementation** | 
`{"@type":"removeImpl","parameters":[{"name":"x","dataType":"integer"}],"runtime":"SPARK"}`
 | `FunctionChange.removeImpl(params, RuntimeType.SPARK)` | 
`FunctionChange.remove_impl(params, RuntimeType.SPARK)` |
+
+:::note
+When using `addImpl`, the runtime of the new implementation must not already 
exist in the
+ target definition. Use `updateImpl` to replace an existing implementation for 
a given runtime.
+:::
+
+The following is an example of adding a new implementation to an existing 
function definition:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X PUT -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+  "updates": [
+    {
+      "@type": "addImpl",
+      "parameters": [
+        {"name": "x", "dataType": "integer"}
+      ],
+      "implementation": {
+        "language": "JAVA",
+        "runtime": "TRINO",
+        "className": "com.example.AddOneFunction",
+        "resources": {
+          "jars": ["hdfs:///path/to/udf.jar"]
+        }
+      }
+    }
+  ]
+}' 
http://localhost:8090/api/metalakes/example/catalogs/my_catalog/schemas/my_schema/functions/add_one
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+// ...
+FunctionParam[] params = new FunctionParam[] {
+    FunctionParams.of("x", Types.IntegerType.get())
+};
+FunctionImpl javaImpl = FunctionImpls.ofJava(
+    FunctionImpl.RuntimeType.TRINO,
+    "com.example.AddOneFunction",
+    FunctionResources.of(
+        new String[] {"hdfs:///path/to/udf.jar"}, null, null),
+    null);
+
+Function updated = functionCatalog.alterFunction(
+    NameIdentifier.of("my_schema", "add_one"),
+    FunctionChange.addImpl(params, javaImpl));
+// ...
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+# ...
+param = FunctionParams.of("x", Types.IntegerType.get())
+
+java_impl = (
+    JavaImpl.builder()
+    .with_runtime_type(JavaImpl.RuntimeType.TRINO)
+    .with_class_name("com.example.AddOneFunction")
+    .build()
+)
+
+function = catalog.as_function_catalog().alter_function(
+    NameIdentifier.of("my_schema", "add_one"),
+    FunctionChange.add_impl([param], java_impl),
+)
+```
+
+</TabItem>
+</Tabs>
+
+### Drop a function
+
+You can drop a function by sending a `DELETE` request to the 
`/api/metalakes/{metalake_name}/catalogs/{catalog_name}/schemas/{schema_name}/functions/{function_name}`
+endpoint or by using the Gravitino Java/Python client. The following is an 
example of dropping
+a function:
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X DELETE -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" \
+http://localhost:8090/api/metalakes/example/catalogs/my_catalog/schemas/my_schema/functions/add_one
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+// ...
+Catalog catalog = gravitinoClient.loadCatalog("my_catalog");
+boolean dropped = catalog.asFunctionCatalog().dropFunction(
+    NameIdentifier.of("my_schema", "add_one"));
+// ...
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+gravitino_client: GravitinoClient = 
GravitinoClient(uri="http://localhost:8090";, metalake_name="example")
+
+catalog: Catalog = gravitino_client.load_catalog(name="my_catalog")
+dropped: bool = catalog.as_function_catalog().drop_function(
+    NameIdentifier.of("my_schema", "add_one"))
+```
+
+</TabItem>
+</Tabs>
+
+## Advanced examples
+
+### Register a function with multiple overloads
+
+A function can have multiple definitions (overloads) with different parameter 
lists. Each
+definition has its own return type and implementations.
+
+<Tabs groupId="language" queryString>
+<TabItem value="shell" label="Shell">
+
+```shell
+curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
+-H "Content-Type: application/json" -d '{
+  "name": "add",
+  "functionType": "SCALAR",
+  "deterministic": true,
+  "comment": "An overloaded add function",
+  "definitions": [
+    {
+      "parameters": [
+        {"name": "x", "dataType": "integer"},
+        {"name": "y", "dataType": "integer"}
+      ],
+      "returnType": "integer",
+      "impls": [
+        {
+          "language": "SQL",
+          "runtime": "TRINO",
+          "sql": "x + y"
+        }
+      ]
+    },
+    {
+      "parameters": [
+        {"name": "x", "dataType": "double"},
+        {"name": "y", "dataType": "double"}
+      ],
+      "returnType": "double",
+      "impls": [
+        {
+          "language": "SQL",
+          "runtime": "TRINO",
+          "sql": "x + y"
+        }
+      ]
+    }
+  ]
+}' 
http://localhost:8090/api/metalakes/example/catalogs/my_catalog/schemas/my_schema/functions
+```
+
+</TabItem>
+<TabItem value="java" label="Java">
+
+```java
+// ...
+FunctionCatalog functionCatalog = catalog.asFunctionCatalog();
+
+// Definition 1: integer overload
+FunctionDefinition intDef = FunctionDefinitions.of(
+    new FunctionParam[] {
+        FunctionParams.of("x", Types.IntegerType.get()),
+        FunctionParams.of("y", Types.IntegerType.get())
+    },
+    Types.IntegerType.get(),
+    new FunctionImpl[] {
+        FunctionImpls.ofSql(FunctionImpl.RuntimeType.TRINO, "x + y")
+    });
+
+// Definition 2: double overload
+FunctionDefinition doubleDef = FunctionDefinitions.of(
+    new FunctionParam[] {
+        FunctionParams.of("x", Types.DoubleType.get()),
+        FunctionParams.of("y", Types.DoubleType.get())
+    },
+    Types.DoubleType.get(),
+    new FunctionImpl[] {
+        FunctionImpls.ofSql(FunctionImpl.RuntimeType.TRINO, "x + y")
+    });
+
+Function function = functionCatalog.registerFunction(
+    NameIdentifier.of("my_schema", "add"),
+    "An overloaded add function",
+    FunctionType.SCALAR,
+    true,
+    FunctionDefinitions.of(intDef, doubleDef));
+// ...
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+# ...
+function_catalog = catalog.as_function_catalog()
+
+# Definition 1: integer overload
+int_param_x = FunctionParams.of("x", Types.IntegerType.get())
+int_param_y = FunctionParams.of("y", Types.IntegerType.get())
+sql_int = (
+    SQLImpl.builder()
+    .with_runtime_type(SQLImpl.RuntimeType.TRINO)
+    .with_sql("x + y")
+    .build()
+)
+int_def = FunctionDefinitions.of([int_param_x, int_param_y], 
Types.IntegerType.get(), [sql_int])
+
+# Definition 2: double overload
+double_param_x = FunctionParams.of("x", Types.DoubleType.get())
+double_param_y = FunctionParams.of("y", Types.DoubleType.get())
+sql_double = (
+    SQLImpl.builder()
+    .with_runtime_type(SQLImpl.RuntimeType.TRINO)
+    .with_sql("x + y")
+    .build()
+)
+double_def = FunctionDefinitions.of([double_param_x, double_param_y], 
Types.DoubleType.get(), [sql_double])
+
+function = function_catalog.register_function(
+    ident=NameIdentifier.of("my_schema", "add"),
+    comment="An overloaded add function",
+    function_type=FunctionType.SCALAR,
+    deterministic=True,
+    definitions=[int_def, double_def],
+)
+```
+
+</TabItem>
+</Tabs>
+
+## Using functions in compute engines
+
+Once a function is registered in Gravitino, it can be used in supported 
compute engines.
+The engine's connector loads the function from Gravitino and invokes the 
appropriate
+implementation based on the runtime.
+
+| Engine | Runtime             | Documentation                                 
                                     |
+|--------|---------------------|------------------------------------------------------------------------------------|
+| Spark  | `RuntimeType.SPARK` | [Spark connector - User-defined 
functions](spark-connector/spark-connector-udf.md) |
+
+:::note
+Support for additional engines (e.g. Trino, Flink) will be documented as they 
become available.
+:::
diff --git a/docs/spark-connector/spark-catalog-hive.md 
b/docs/spark-connector/spark-catalog-hive.md
index 9510f9cc9e..0b8824533e 100644
--- a/docs/spark-connector/spark-catalog-hive.md
+++ b/docs/spark-connector/spark-catalog-hive.md
@@ -11,10 +11,9 @@ With the Apache Gravitino Spark connector, accessing data or 
managing metadata i
 
 Supports most DDL and DML operations in SparkSQL, except such operations:
 
-- Function operations 
+- Function operations (Gravitino UDFs are supported, see [Spark connector - 
User-defined functions](spark-connector-udf.md))
 - Partition operations
 - View operations
-- Querying UDF
 - `LOAD` clause
 - `CREATE TABLE LIKE` clause
 - `TRUCATE TABLE` clause
diff --git a/docs/spark-connector/spark-connector-udf.md 
b/docs/spark-connector/spark-connector-udf.md
new file mode 100644
index 0000000000..a25241616b
--- /dev/null
+++ b/docs/spark-connector/spark-connector-udf.md
@@ -0,0 +1,80 @@
+---
+title: "Spark connector - User-defined functions"
+slug: /spark-connector/spark-connector-udf
+keyword: spark connector UDF user-defined function
+license: "This software is licensed under the Apache License version 2."
+---
+
+## Overview
+
+The Apache Gravitino Spark connector supports loading user-defined functions 
(UDFs) registered
+in the Gravitino function registry. Once a function is
+[registered in Gravitino](../manage-user-defined-function-using-gravitino.md), 
Spark can discover and
+invoke it through standard Spark SQL syntax — no additional `CREATE FUNCTION` 
statement is needed.
+
+:::note
+Currently, only **Java implementations** with `RuntimeType.SPARK` are 
supported in the Spark
+connector. SQL and Python implementations registered in Gravitino cannot yet 
be invoked
+directly from Spark. Support for additional languages is planned for future 
releases.
+:::
+
+## Prerequisites
+
+Before using Gravitino UDFs in Spark, ensure the following:
+
+1. The **Spark connector is configured** and the catalog is accessible
+   (see [Spark connector setup](spark-connector.md)).
+2. The function has been **registered in Gravitino** with at least one 
definition that includes
+   a Java implementation targeting `RuntimeType.SPARK`
+   (see [Register a 
function](../manage-user-defined-function-using-gravitino.md#register-a-function)).
+3. The **JAR containing the UDF class** is available on the Spark classpath 
(e.g. via
+   `--jars` or `spark.jars` configuration).
+
+## Java UDF requirements
+
+The Java class specified in `className` of the function implementation must 
implement Spark's
+`org.apache.spark.sql.connector.catalog.functions.UnboundFunction` interface. 
For details on
+implementing custom Spark functions, refer to the
+[Spark DataSource V2 Functions 
documentation](https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/connector/catalog/functions/UnboundFunction.html).
+
+Key points:
+
+- The class must have a **public no-arg constructor**.
+- The class must be on the **Spark driver and executor classpath**.
+- Only functions with `RuntimeType.SPARK` are visible to the Spark connector; 
implementations
+  targeting other runtimes (e.g. `TRINO`) are filtered out.
+
+## Calling functions in Spark SQL
+
+Use the fully qualified three-part name `catalog.schema.function_name` to call 
a
+Gravitino-registered function:
+
+```sql
+-- Call a scalar function
+SELECT my_catalog.my_schema.add_one(42);
+
+-- Use in a query
+SELECT id, my_catalog.my_schema.add_one(value) AS incremented
+FROM my_catalog.my_schema.my_table;
+```
+
+:::tip
+You can simplify the syntax by setting the default catalog and schema first:
+
+```sql
+USE my_catalog;
+USE my_schema;
+SELECT add_one(42);
+```
+:::
+
+## Discovering functions
+
+The Spark connector only exposes functions that have at least one Java 
implementation with
+`RuntimeType.SPARK`. Functions with only non-Spark implementations (e.g. 
`TRINO`) are not
+listed or loadable.
+
+```sql
+-- List all available functions in a schema (includes Gravitino UDFs with 
Spark runtime)
+SHOW FUNCTIONS IN my_catalog.my_schema;
+```

(gravitino) branch main updated: [#9539] doc(udf): add user doc for Function management (#9971)

Reply via email to