[ 
https://issues.apache.org/jira/browse/KUDU-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18066309#comment-18066309
 ] 

Gabriella Lotz commented on KUDU-3731:
--------------------------------------

Ksck output for 2026.03.17.:
{code:java}
...
Tablet Replica Count Summary
   Statistic    | Replica Count
----------------+---------------
 Minimum        | 8
 First Quartile | 8
 Median         | 8
 Third Quartile | 8
 Maximum        | 8
Total Count Summary
                | Total Count
----------------+-------------
 Masters        | 1
 Tablet Servers | 3
 Tables         | 1
 Tablets        | 8
 Replicas       | 24
==================
Warnings:
==================
Some masters have unsafe, experimental, or hidden flags set
OK{code}
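The totals above can be cross-checked against the table definition: 8 tablets with 3 replicas each should give 24 replicas. Below is a minimal sketch of that check, assuming every table in the cluster uses the same replication factor; {{check_replica_total}} is a hypothetical helper, not part of the Kudu CLI.

```shell
#!/usr/bin/env bash
# Sketch only: check_replica_total is a hypothetical helper, not a Kudu tool.
# It reads ksck summary text on stdin and verifies that
#   Replicas == Tablets * <replication factor>
# Assumption: all tables share the replication factor passed as $1.
check_replica_total() {
  awk -v rf="$1" -F'|' '
    # Match the "Tablets" and "Replicas" rows of the Total Count Summary.
    # "Tablet Servers" does not match because the name must be exact.
    $1 ~ /^ *Tablets *$/  { gsub(/ /, "", $2); tablets = $2 }
    $1 ~ /^ *Replicas *$/ { gsub(/ /, "", $2); replicas = $2 }
    END {
      if (tablets == "" || replicas == "") {
        print "summary rows not found"
        exit 2
      }
      if (replicas + 0 == tablets * rf) {
        printf "OK: %d tablets x %d = %d replicas\n", tablets, rf, replicas
        exit 0
      }
      printf "MISMATCH: %d tablets x %d != %d replicas\n", tablets, rf, replicas
      exit 1
    }
  '
}
```

It could be fed directly from the health check, for example {{kudu cluster ksck <master> | check_replica_total 3}}. A non-zero exit status signals either a mismatch or a missing summary.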
Log findings:
{code:java}
# grep -i rebalanc /var/log/kudu/kudu-master.INFO | tail -20
I20260317 08:00:06.239782 821025 auto_leader_rebalancer.cc:125] leader rebalance for table user_events
I20260317 08:00:06.239840 821025 auto_leader_rebalancer.cc:125] leader rebalance for table time_series_table
I20260317 08:00:06.239856 821025 auto_leader_rebalancer.cc:125] leader rebalance for table user_events
I20260317 08:00:06.239988 821025 auto_leader_rebalancer.cc:359] table: user_events, leader rebalance finish, leader transfer count: 0
I20260317 08:00:06.240016 821025 auto_leader_rebalancer.cc:414] All tables' leader rebalancing finished this round
I20260317 09:00:06.240218 821025 auto_leader_rebalancer.cc:125] leader rebalance for table user_events
I20260317 09:00:06.240336 821025 auto_leader_rebalancer.cc:125] leader rebalance for table time_series_table
I20260317 09:00:06.240382 821025 auto_leader_rebalancer.cc:125] leader rebalance for table user_events
I20260317 09:00:06.240657 821025 auto_leader_rebalancer.cc:359] table: user_events, leader rebalance finish, leader transfer count: 0
I20260317 09:00:06.240732 821025 auto_leader_rebalancer.cc:414] All tables' leader rebalancing finished this round
I20260317 10:00:06.241026 821025 auto_leader_rebalancer.cc:125] leader rebalance for table user_events
I20260317 10:00:06.241196 821025 auto_leader_rebalancer.cc:125] leader rebalance for table time_series_table
I20260317 10:00:06.241241 821025 auto_leader_rebalancer.cc:125] leader rebalance for table user_events
I20260317 10:00:06.241433 821025 auto_leader_rebalancer.cc:359] table: user_events, leader rebalance finish, leader transfer count: 0
I20260317 10:00:06.241477 821025 auto_leader_rebalancer.cc:414] All tables' leader rebalancing finished this round
I20260317 11:00:06.241743 821025 auto_leader_rebalancer.cc:125] leader rebalance for table user_events
I20260317 11:00:06.241837 821025 auto_leader_rebalancer.cc:125] leader rebalance for table time_series_table
I20260317 11:00:06.241860 821025 auto_leader_rebalancer.cc:125] leader rebalance for table user_events
I20260317 11:00:06.242039 821025 auto_leader_rebalancer.cc:359] table: user_events, leader rebalance finish, leader transfer count: 0
I20260317 11:00:06.242084 821025 auto_leader_rebalancer.cc:414] All tables' leader rebalancing finished this round {code}
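To follow the leader transfer counts over the two-week run without reading every round, the "leader rebalance finish" entries can be aggregated per table. This is a rough sketch, assuming each log entry sits on a single line in the master log file (the wrapping seen in email is not present on disk); {{summarize_leader_transfers}} is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: summarize_leader_transfers is a hypothetical helper.
# It reads master log lines on stdin and prints, per table, how many
# leader rebalance rounds finished and how many leader transfers they made.
summarize_leader_transfers() {
  awk '
    /leader rebalance finish, leader transfer count:/ {
      # Example entry:
      #   ... table: user_events, leader rebalance finish, leader transfer count: 0
      table = $0
      sub(/^.*table: /, "", table)   # drop everything up to the table name
      sub(/,.*$/, "", table)         # drop everything after the table name
      rounds[table]++
      transfers[table] += $NF        # last field is the transfer count
    }
    END {
      # Note: awk does not guarantee the order of tables in this loop.
      for (t in rounds)
        printf "%s rounds=%d transfers=%d\n", t, rounds[t], transfers[t]
    }
  '
}
```

For example: {{summarize_leader_transfers < /var/log/kudu/kudu-master.INFO}}. A table that never appears in the output had no "leader rebalance finish" entry in the log that was read.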
 

 

> Long-running smoke test
> -----------------------
>
>                 Key: KUDU-3731
>                 URL: https://issues.apache.org/jira/browse/KUDU-3731
>             Project: Kudu
>          Issue Type: Sub-task
>            Reporter: Gabriella Lotz
>            Assignee: Gabriella Lotz
>            Priority: Major
>
> h4. Step 0: Start a cluster with 1 master and 3 tablet servers.
> h4. Step 1: Set the following flags.
> --auto_rebalancing_enabled=true
> --auto_rebalancing_interval_seconds=60
> --auto_leader_rebalancing_enabled=true
> h4. Step 2: Create the following tables.
> {code:java}
> kudu table create <master> '{
>   "table_name": "user_events",
>   "schema": {
>     "columns": [
>       {"column_name": "user_id",  "column_type": "STRING", "is_nullable": false},
>       {"column_name": "event_id", "column_type": "INT64",  "is_nullable": false},
>       {"column_name": "data",     "column_type": "STRING", "is_nullable": true}
>     ],
>     "key_column_names": ["user_id", "event_id"]
>   },
>   "partition": {
>     "hash_partitions": [{"columns": ["user_id", "event_id"], "num_buckets": 8}]
>   },
>   "num_replicas": 3
> }'
> kudu table create <master> '{
>   "table_name": "time_series_table",
>   "schema": {
>     "columns": [
>       {"column_name": "ts",        "column_type": "UNIXTIME_MICROS", "is_nullable": false},
>       {"column_name": "sensor_id", "column_type": "STRING",          "is_nullable": false},
>       {"column_name": "value",     "column_type": "DOUBLE",          "is_nullable": true}
>     ],
>     "key_column_names": ["ts", "sensor_id"]
>   },
>   "partition": {
>     "range_partition": {
>       "columns": ["ts"],
>       "range_bounds": [
>         {"upper_bound": {"bound_type": "exclusive", "bound_values": ["1704067200000000"]}},
>         {
>           "lower_bound": {"bound_type": "inclusive", "bound_values": ["1704067200000000"]},
>           "upper_bound": {"bound_type": "exclusive", "bound_values": ["1735689600000000"]}
>         },
>         {"lower_bound": {"bound_type": "inclusive", "bound_values": ["1735689600000000"]}}
>       ]
>     }
>   },
>   "num_replicas": 3
> }' {code}
> h4. Step 3: Start loadgen in tmux
> tmux new -s smoke
> Inside tmux:
> while true; do
>   kudu perf loadgen <master> \
>     --table_name=user_events \
>     --num_threads=4 \
>     --num_rows_per_thread=500000 \
>     --flush_per_n_rows=1000 \
>     --run_cleanup # cleanup so that disk doesn't fill up
>   sleep 10
> done
> h4. Step 4: Monitor findings for 2 weeks. (start 2026.03.13.)
>  # Is the cluster healthy?
> kudu cluster ksck <master>
>  # Check whether the auto-rebalancer is running.
> grep -i rebalanc /var/log/kudu/kudu-master.INFO | tail -20



