This is an automated email from the ASF dual-hosted git repository.

yunhong pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/fluss.git


The following commit(s) were added to refs/heads/main by this push:
     new 3c53a205d Rebalance procedure docs (#2355)
3c53a205d is described below

commit 3c53a205d2922a53be8e43ca1d57599848ee845f
Author: yunhong <[email protected]>
AuthorDate: Thu Jan 15 16:07:11 2026 +0800

    Rebalance procedure docs (#2355)
    
    * [docs] Add document for rebalance call procedure
    
    * address baiye's comments
---
 .../procedure/ListRebalanceProcessProcedure.java   |   2 +-
 website/docs/engine-flink/procedures.md            | 205 +++++++++++++++++++++
 2 files changed, 206 insertions(+), 1 deletion(-)

diff --git 
a/fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/procedure/ListRebalanceProcessProcedure.java
 
b/fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/procedure/ListRebalanceProcessProcedure.java
index 722fc0c07..bc0bf99f7 100644
--- 
a/fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/procedure/ListRebalanceProcessProcedure.java
+++ 
b/fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/procedure/ListRebalanceProcessProcedure.java
@@ -65,7 +65,7 @@ public class ListRebalanceProcessProcedure extends 
ProcedureBase {
         Optional<RebalanceProgress> progressOpt = 
admin.listRebalanceProgress(rebalanceId).get();
 
         if (!progressOpt.isPresent()) {
-            return new String[] {"No rebalance progress found."};
+            return new String[0];
         }
 
         return progressToString(progressOpt.get());
diff --git a/website/docs/engine-flink/procedures.md 
b/website/docs/engine-flink/procedures.md
index eeab4496f..50c8377a6 100644
--- a/website/docs/engine-flink/procedures.md
+++ b/website/docs/engine-flink/procedures.md
@@ -277,4 +277,209 @@ CALL sys.reset_cluster_configs(
 CALL sys.reset_cluster_configs(
   config_keys => 'kv.rocksdb.shared-rate-limiter.bytes-per-sec', 
'datalake.format'
 );
+```
+
+## Rebalance Procedures
+
+Fluss provides procedures to rebalance buckets across the cluster based on 
workload.
+Rebalancing primarily occurs in the following scenarios: Offline existing 
tabletServers
+from the cluster, adding new tabletServers to the cluster, and routine 
adjustments for load imbalance.
+
+### add_server_tag
+
+Add server tag to TabletServers in the cluster. For example, adding 
`tabletServer-0` with `PERMANENT_OFFLINE` tag
+indicates that `tabletServer-0` is about to be permanently decommissioned, and 
during the next rebalance,
+all buckets on this node need to be migrated away.
+
+**Syntax:**
+
+```sql
+CALL [catalog_name.]sys.add_server_tag(
+  tabletServers => 'STRING',
+  serverTag => 'STRING'
+)
+```
+
+**Parameters:**
+
+- `tabletServers` (required): The TabletServer IDs to add tag to. Can be a 
single server ID (e.g., `'0'`) or multiple IDs separated by commas (e.g., 
`'0,1,2'`).
+- `serverTag` (required): The tag to add to the TabletServers. Valid values 
are:
+    - `'PERMANENT_OFFLINE'`: Indicates the TabletServer is permanently offline 
and will be decommissioned. All buckets on this server will be migrated during 
the next rebalance.
+    - `'TEMPORARY_OFFLINE'`: Indicates the TabletServer is temporarily offline 
(e.g., for upgrading). Buckets may be temporarily migrated but can return after 
the server comes back online.
+
+**Returns:** An array with a single element `'success'` if the operation 
completes successfully.
+
+**Example:**
+
+```sql title="Flink SQL"
+-- Use the Fluss catalog (replace 'fluss_catalog' with your catalog name if 
different)
+USE fluss_catalog;
+
+-- Add PERMANENT_OFFLINE tag to a single TabletServer
+CALL sys.add_server_tag('0', 'PERMANENT_OFFLINE');
+
+-- Add TEMPORARY_OFFLINE tag to multiple TabletServers
+CALL sys.add_server_tag('1,2,3', 'TEMPORARY_OFFLINE');
+```
+
+### remove_server_tag
+
+Remove server tag from TabletServers in the cluster. This operation is 
typically used when a previously tagged TabletServer is ready to return to 
normal service, or to cancel a planned offline operation.
+
+**Syntax:**
+
+```sql
+CALL [catalog_name.]sys.remove_server_tag(
+  tabletServers => 'STRING',
+  serverTag => 'STRING'
+)
+```
+
+**Parameters:**
+
+- `tabletServers` (required): The TabletServer IDs to remove tag from. Can be 
a single server ID (e.g., `'0'`) or multiple IDs separated by commas (e.g., 
`'0,1,2'`).
+- `serverTag` (required): The tag to remove from the TabletServers. Valid 
values are:
+    - `'PERMANENT_OFFLINE'`: Remove the permanent offline tag from the 
TabletServer.
+    - `'TEMPORARY_OFFLINE'`: Remove the temporary offline tag from the 
TabletServer.
+
+**Returns:** An array with a single element `'success'` if the operation 
completes successfully.
+
+**Example:**
+
+```sql title="Flink SQL"
+-- Use the Fluss catalog (replace 'fluss_catalog' with your catalog name if 
different)
+USE fluss_catalog;
+
+-- Remove PERMANENT_OFFLINE tag from a single TabletServer
+CALL sys.remove_server_tag('0', 'PERMANENT_OFFLINE');
+
+-- Remove TEMPORARY_OFFLINE tag from multiple TabletServers
+CALL sys.remove_server_tag('1,2,3', 'TEMPORARY_OFFLINE');
+```
+
+### rebalance
+
+Trigger a rebalance operation to redistribute buckets across TabletServers in 
the cluster. This procedure helps balance workload based on specified goals, 
such as distributing replicas or leaders evenly across the cluster.
+
+**Syntax:**
+
+```sql
+CALL [catalog_name.]sys.rebalance(
+  priorityGoals => 'STRING'
+)
+```
+
+**Parameters:**
+
+- `priorityGoals` (required): The rebalance goals to achieve, specified as 
goal types. Can be a single goal (e.g., `'REPLICA_DISTRIBUTION'`) or multiple 
goals separated by commas (e.g., `'REPLICA_DISTRIBUTION,LEADER_DISTRIBUTION'`). 
Valid goal types are:
+    - `'REPLICA_DISTRIBUTION'`: Generates replica movement tasks to ensure the 
number of replicas on each TabletServer is near balanced.
+    - `'LEADER_DISTRIBUTION'`: Generates leadership movement and leader 
replica movement tasks to ensure the number of leader replicas on each 
TabletServer is near balanced.
+
+**Returns:** An array with a single element containing the rebalance ID (e.g., 
`'rebalance-12345'`), which can be used to track or cancel the rebalance 
operation.
+
+**Important Notes:**
+
+- Multiple goals can be specified in priority order. The system will attempt 
to achieve goals in the order specified.
+- Rebalance operations run asynchronously in the background. Use the returned 
rebalance ID to monitor progress.
+- The rebalance operation respects server tags set by `add_server_tag`. For 
example, servers marked with `PERMANENT_OFFLINE` will have their buckets 
migrated away.
+
+**Example:**
+
+```sql title="Flink SQL"
+-- Use the Fluss catalog (replace 'fluss_catalog' with your catalog name if 
different)
+USE fluss_catalog;
+
+-- Trigger rebalance with replica distribution goal
+CALL sys.rebalance('REPLICA_DISTRIBUTION');
+
+-- Trigger rebalance with multiple goals in priority order
+CALL sys.rebalance('REPLICA_DISTRIBUTION,LEADER_DISTRIBUTION');
+```
+
+### list_rebalance
+
+Query the progress and status of a rebalance operation. This procedure allows 
you to monitor ongoing or completed rebalance operations to track their 
progress and view detailed information about bucket movements.
+
+**Syntax:**
+
+```sql
+-- List the most recent rebalance progress
+CALL [catalog_name.]sys.list_rebalance()
+
+-- List a specific rebalance progress by ID
+CALL [catalog_name.]sys.list_rebalance(
+  rebalanceId => 'STRING'
+)
+```
+
+**Parameters:**
+
+- `rebalanceId` (optional): The rebalance ID to query. If omitted, returns the 
progress of the most recent rebalance operation. The rebalance ID is returned 
when calling the `rebalance` procedure.
+
+**Returns:** An array of strings containing:
+- Rebalance ID: The unique identifier of the rebalance operation
+- Rebalance total status: The overall status of the rebalance. Possible values 
are:
+    - `NOT_STARTED`: The rebalance has been created but not yet started
+    - `REBALANCING`: The rebalance is currently in progress
+    - `COMPLETED`: The rebalance has successfully completed
+    - `FAILED`: The rebalance has failed
+    - `CANCELED`: The rebalance has been canceled
+- Rebalance progress: The completion percentage (e.g., `75.5%`)
+- Rebalance detail progress for bucket: Detailed progress information for each 
bucket being moved
+
+If no rebalance is found, returns empty line.
+
+**Example:**
+
+```sql title="Flink SQL"
+-- Use the Fluss catalog (replace 'fluss_catalog' with your catalog name if 
different)
+USE fluss_catalog;
+
+-- List the most recent rebalance progress
+CALL sys.list_rebalance();
+
+-- List a specific rebalance progress by ID
+CALL sys.list_rebalance('rebalance-12345');
+```
+
+### cancel_rebalance
+
+Cancel an ongoing rebalance operation. This procedure allows you to stop a 
rebalance that is in progress, which is useful when you need to halt bucket 
redistribution due to operational requirements or unexpected issues.
+
+**Syntax:**
+
+```sql
+-- Cancel the most recent rebalance operation
+CALL [catalog_name.]sys.cancel_rebalance()
+
+-- Cancel a specific rebalance operation by ID
+CALL [catalog_name.]sys.cancel_rebalance(
+  rebalanceId => 'STRING'
+)
+```
+
+**Parameters:**
+
+- `rebalanceId` (optional): The rebalance ID to cancel. If omitted, cancels 
the most recent rebalance operation. The rebalance ID is returned when calling 
the `rebalance` procedure.
+
+**Returns:** An array with a single element `'success'` if the operation 
completes successfully.
+
+**Important Notes:**
+
+- Only rebalance operations in `NOT_STARTED` or `REBALANCING` status can be 
canceled.
+- Canceling a rebalance will stop bucket movements, but already completed 
bucket migrations will not be rolled back.
+- After cancellation, the rebalance status will change to `CANCELED`.
+- You can verify the cancellation by calling `list_rebalance` to check the 
status.
+
+**Example:**
+
+```sql title="Flink SQL"
+-- Use the Fluss catalog (replace 'fluss_catalog' with your catalog name if 
different)
+USE fluss_catalog;
+
+-- Cancel the most recent rebalance operation
+CALL sys.cancel_rebalance();
+
+-- Cancel a specific rebalance operation by ID
+CALL sys.cancel_rebalance('rebalance-12345');
 ```
\ No newline at end of file

Reply via email to