This is an automated email from the ASF dual-hosted git repository.
yunhong pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/fluss.git
The following commit(s) were added to refs/heads/main by this push:
new 3c53a205d Rebalance procedure docs (#2355)
3c53a205d is described below
commit 3c53a205d2922a53be8e43ca1d57599848ee845f
Author: yunhong <[email protected]>
AuthorDate: Thu Jan 15 16:07:11 2026 +0800
Rebalance procedure docs (#2355)
* [docs] Add document for rebalance call procedure
* address baiye's comments
---
.../procedure/ListRebalanceProcessProcedure.java | 2 +-
website/docs/engine-flink/procedures.md | 205 +++++++++++++++++++++
2 files changed, 206 insertions(+), 1 deletion(-)
diff --git
a/fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/procedure/ListRebalanceProcessProcedure.java
b/fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/procedure/ListRebalanceProcessProcedure.java
index 722fc0c07..bc0bf99f7 100644
---
a/fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/procedure/ListRebalanceProcessProcedure.java
+++
b/fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/procedure/ListRebalanceProcessProcedure.java
@@ -65,7 +65,7 @@ public class ListRebalanceProcessProcedure extends
ProcedureBase {
Optional<RebalanceProgress> progressOpt =
admin.listRebalanceProgress(rebalanceId).get();
if (!progressOpt.isPresent()) {
- return new String[] {"No rebalance progress found."};
+ return new String[0];
}
return progressToString(progressOpt.get());
diff --git a/website/docs/engine-flink/procedures.md
b/website/docs/engine-flink/procedures.md
index eeab4496f..50c8377a6 100644
--- a/website/docs/engine-flink/procedures.md
+++ b/website/docs/engine-flink/procedures.md
@@ -277,4 +277,209 @@ CALL sys.reset_cluster_configs(
CALL sys.reset_cluster_configs(
config_keys => 'kv.rocksdb.shared-rate-limiter.bytes-per-sec',
'datalake.format'
);
+```
+
+## Rebalance Procedures
+
+Fluss provides procedures to rebalance buckets across the cluster based on
workload.
+Rebalancing primarily occurs in the following scenarios: Offline existing
tabletServers
+from the cluster, adding new tabletServers to the cluster, and routine
adjustments for load imbalance.
+
+### add_server_tag
+
+Add server tag to TabletServers in the cluster. For example, adding
`tabletServer-0` with `PERMANENT_OFFLINE` tag
+indicates that `tabletServer-0` is about to be permanently decommissioned, and
during the next rebalance,
+all buckets on this node need to be migrated away.
+
+**Syntax:**
+
+```sql
+CALL [catalog_name.]sys.add_server_tag(
+ tabletServers => 'STRING',
+ serverTag => 'STRING'
+)
+```
+
+**Parameters:**
+
+- `tabletServers` (required): The TabletServer IDs to add tag to. Can be a
single server ID (e.g., `'0'`) or multiple IDs separated by commas (e.g.,
`'0,1,2'`).
+- `serverTag` (required): The tag to add to the TabletServers. Valid values
are:
+ - `'PERMANENT_OFFLINE'`: Indicates the TabletServer is permanently offline
and will be decommissioned. All buckets on this server will be migrated during
the next rebalance.
+ - `'TEMPORARY_OFFLINE'`: Indicates the TabletServer is temporarily offline
(e.g., for upgrading). Buckets may be temporarily migrated but can return after
the server comes back online.
+
+**Returns:** An array with a single element `'success'` if the operation
completes successfully.
+
+**Example:**
+
+```sql title="Flink SQL"
+-- Use the Fluss catalog (replace 'fluss_catalog' with your catalog name if
different)
+USE fluss_catalog;
+
+-- Add PERMANENT_OFFLINE tag to a single TabletServer
+CALL sys.add_server_tag('0', 'PERMANENT_OFFLINE');
+
+-- Add TEMPORARY_OFFLINE tag to multiple TabletServers
+CALL sys.add_server_tag('1,2,3', 'TEMPORARY_OFFLINE');
+```
+
+### remove_server_tag
+
+Remove server tag from TabletServers in the cluster. This operation is
typically used when a previously tagged TabletServer is ready to return to
normal service, or to cancel a planned offline operation.
+
+**Syntax:**
+
+```sql
+CALL [catalog_name.]sys.remove_server_tag(
+ tabletServers => 'STRING',
+ serverTag => 'STRING'
+)
+```
+
+**Parameters:**
+
+- `tabletServers` (required): The TabletServer IDs to remove tag from. Can be
a single server ID (e.g., `'0'`) or multiple IDs separated by commas (e.g.,
`'0,1,2'`).
+- `serverTag` (required): The tag to remove from the TabletServers. Valid
values are:
+ - `'PERMANENT_OFFLINE'`: Remove the permanent offline tag from the
TabletServer.
+ - `'TEMPORARY_OFFLINE'`: Remove the temporary offline tag from the
TabletServer.
+
+**Returns:** An array with a single element `'success'` if the operation
completes successfully.
+
+**Example:**
+
+```sql title="Flink SQL"
+-- Use the Fluss catalog (replace 'fluss_catalog' with your catalog name if
different)
+USE fluss_catalog;
+
+-- Remove PERMANENT_OFFLINE tag from a single TabletServer
+CALL sys.remove_server_tag('0', 'PERMANENT_OFFLINE');
+
+-- Remove TEMPORARY_OFFLINE tag from multiple TabletServers
+CALL sys.remove_server_tag('1,2,3', 'TEMPORARY_OFFLINE');
+```
+
+### rebalance
+
+Trigger a rebalance operation to redistribute buckets across TabletServers in
the cluster. This procedure helps balance workload based on specified goals,
such as distributing replicas or leaders evenly across the cluster.
+
+**Syntax:**
+
+```sql
+CALL [catalog_name.]sys.rebalance(
+ priorityGoals => 'STRING'
+)
+```
+
+**Parameters:**
+
+- `priorityGoals` (required): The rebalance goals to achieve, specified as
goal types. Can be a single goal (e.g., `'REPLICA_DISTRIBUTION'`) or multiple
goals separated by commas (e.g., `'REPLICA_DISTRIBUTION,LEADER_DISTRIBUTION'`).
Valid goal types are:
+ - `'REPLICA_DISTRIBUTION'`: Generates replica movement tasks to ensure the
number of replicas on each TabletServer is near balanced.
+ - `'LEADER_DISTRIBUTION'`: Generates leadership movement and leader
replica movement tasks to ensure the number of leader replicas on each
TabletServer is near balanced.
+
+**Returns:** An array with a single element containing the rebalance ID (e.g.,
`'rebalance-12345'`), which can be used to track or cancel the rebalance
operation.
+
+**Important Notes:**
+
+- Multiple goals can be specified in priority order. The system will attempt
to achieve goals in the order specified.
+- Rebalance operations run asynchronously in the background. Use the returned
rebalance ID to monitor progress.
+- The rebalance operation respects server tags set by `add_server_tag`. For
example, servers marked with `PERMANENT_OFFLINE` will have their buckets
migrated away.
+
+**Example:**
+
+```sql title="Flink SQL"
+-- Use the Fluss catalog (replace 'fluss_catalog' with your catalog name if
different)
+USE fluss_catalog;
+
+-- Trigger rebalance with replica distribution goal
+CALL sys.rebalance('REPLICA_DISTRIBUTION');
+
+-- Trigger rebalance with multiple goals in priority order
+CALL sys.rebalance('REPLICA_DISTRIBUTION,LEADER_DISTRIBUTION');
+```
+
+### list_rebalance
+
+Query the progress and status of a rebalance operation. This procedure allows
you to monitor ongoing or completed rebalance operations to track their
progress and view detailed information about bucket movements.
+
+**Syntax:**
+
+```sql
+-- List the most recent rebalance progress
+CALL [catalog_name.]sys.list_rebalance()
+
+-- List a specific rebalance progress by ID
+CALL [catalog_name.]sys.list_rebalance(
+ rebalanceId => 'STRING'
+)
+```
+
+**Parameters:**
+
+- `rebalanceId` (optional): The rebalance ID to query. If omitted, returns the
progress of the most recent rebalance operation. The rebalance ID is returned
when calling the `rebalance` procedure.
+
+**Returns:** An array of strings containing:
+- Rebalance ID: The unique identifier of the rebalance operation
+- Rebalance total status: The overall status of the rebalance. Possible values
are:
+ - `NOT_STARTED`: The rebalance has been created but not yet started
+ - `REBALANCING`: The rebalance is currently in progress
+ - `COMPLETED`: The rebalance has successfully completed
+ - `FAILED`: The rebalance has failed
+ - `CANCELED`: The rebalance has been canceled
+- Rebalance progress: The completion percentage (e.g., `75.5%`)
+- Rebalance detail progress for bucket: Detailed progress information for each
bucket being moved
+
+If no rebalance is found, returns empty line.
+
+**Example:**
+
+```sql title="Flink SQL"
+-- Use the Fluss catalog (replace 'fluss_catalog' with your catalog name if
different)
+USE fluss_catalog;
+
+-- List the most recent rebalance progress
+CALL sys.list_rebalance();
+
+-- List a specific rebalance progress by ID
+CALL sys.list_rebalance('rebalance-12345');
+```
+
+### cancel_rebalance
+
+Cancel an ongoing rebalance operation. This procedure allows you to stop a
rebalance that is in progress, which is useful when you need to halt bucket
redistribution due to operational requirements or unexpected issues.
+
+**Syntax:**
+
+```sql
+-- Cancel the most recent rebalance operation
+CALL [catalog_name.]sys.cancel_rebalance()
+
+-- Cancel a specific rebalance operation by ID
+CALL [catalog_name.]sys.cancel_rebalance(
+ rebalanceId => 'STRING'
+)
+```
+
+**Parameters:**
+
+- `rebalanceId` (optional): The rebalance ID to cancel. If omitted, cancels
the most recent rebalance operation. The rebalance ID is returned when calling
the `rebalance` procedure.
+
+**Returns:** An array with a single element `'success'` if the operation
completes successfully.
+
+**Important Notes:**
+
+- Only rebalance operations in `NOT_STARTED` or `REBALANCING` status can be
canceled.
+- Canceling a rebalance will stop bucket movements, but already completed
bucket migrations will not be rolled back.
+- After cancellation, the rebalance status will change to `CANCELED`.
+- You can verify the cancellation by calling `list_rebalance` to check the
status.
+
+**Example:**
+
+```sql title="Flink SQL"
+-- Use the Fluss catalog (replace 'fluss_catalog' with your catalog name if
different)
+USE fluss_catalog;
+
+-- Cancel the most recent rebalance operation
+CALL sys.cancel_rebalance();
+
+-- Cancel a specific rebalance operation by ID
+CALL sys.cancel_rebalance('rebalance-12345');
```
\ No newline at end of file