ChuanFF opened a new pull request, #12991:
URL: https://github.com/apache/apisix/pull/12991

   ### Background
   In production environments, when new upstream nodes join a service cluster, 
a sudden traffic spike can degrade performance or even crash the service because 
caches are cold, database connection pools are not yet established, or JIT 
compilation is incomplete. APISIX currently has no mechanism that protects new 
nodes by shifting traffic to them gradually.
   
   ### Solution
   When `warm_up_conf` is configured in an upstream, APISIX automatically 
detects newly added nodes and calculates temporary weights for them. Each 
temporary weight starts at a configured minimum percentage of the node's 
original weight and gradually increases to the original weight over a specified 
warm-up period, so traffic shifts to the new node smoothly.
   
   ### Key Design Points
   1. **Warm-up Algorithm Reference**: Inspired by Envoy's Slow Start 
implementation ([Envoy Slow Start 
Documentation](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/slow_start))
   2. **Node Identification**: Automatically identifies new nodes via the 
`update_time` field while preserving timestamps for existing nodes
   3. **Weight Calculation**: Uses an exponential growth curve controlled by 
the `aggression` parameter
   4. **Format Limitation**: Only supports array-format node definitions 
(`[{"host": "127.0.0.1", "port": 8080}]`), not map format (`{"127.0.0.1:8080": 
100}`)
   5. **Configuration Inheritance**: Supports warm-up configuration at Route, 
Service, and Upstream levels
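
For reference, the weight formula in the Envoy Slow Start documentation linked in point 1 can be written as below. The mapping to this PR's fields is an assumption: $w$ is the node's configured weight, $p_{\min}$ is `min_weight_percent`, $a$ is `aggression`, $T$ is `slow_start_time_seconds`, and $t$ is the number of seconds since the node's `update_time`. The curve implemented in this PR may differ in detail.

$$
w_{\text{tmp}} = w \cdot \max\!\left(\frac{p_{\min}}{100},\; f^{\,1/a}\right),
\qquad f = \frac{\max(t, 1)}{T}
$$

With $a = 1$ the temporary weight grows linearly in $t$. Larger values of $a$ raise the weight faster early in the warm-up window.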
   
   ### Configuration Example
   ```json
   {
     "type": "roundrobin",
     "nodes": [
       {"host": "127.0.0.1", "port": 1980, "weight": 100},
       {"host": "127.0.0.1", "port": 1981, "weight": 100}
     ],
     "warm_up_conf": {
       "slow_start_time_seconds": 30,
       "min_weight_percent": 10,
       "aggression": 1,
       "interval": 1
     }
   }
   ```
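
To make the example concrete, here is a minimal sketch of how a new node's temporary weight might evolve under this configuration. It assumes the Envoy Slow Start formula (`weight * max(min_weight_percent, time_factor ^ (1 / aggression))`) applies unchanged. The function name `warm_up_weight`, the rounding, and the floor of 1 are hypothetical choices of this sketch, not taken from the PR. The `interval` field is not used here; it is assumed to control how often the weight is recomputed.

```python
def warm_up_weight(weight, elapsed_seconds, slow_start_time_seconds,
                   min_weight_percent, aggression):
    """Temporary weight of a node that joined `elapsed_seconds` ago."""
    if elapsed_seconds >= slow_start_time_seconds:
        return weight  # warm-up finished: use the configured weight
    # Envoy clamps the elapsed time to at least one second.
    time_factor = max(elapsed_seconds, 1) / slow_start_time_seconds
    scale = max(min_weight_percent / 100, time_factor ** (1 / aggression))
    # Rounding and the floor of 1 are assumptions of this sketch.
    return max(1, round(weight * scale))


conf = {"slow_start_time_seconds": 30, "min_weight_percent": 10,
        "aggression": 1, "interval": 1}

# Weight of a new node (configured weight 100) at selected times after joining.
schedule = {
    t: warm_up_weight(100, t, conf["slow_start_time_seconds"],
                      conf["min_weight_percent"], conf["aggression"])
    for t in (0, 6, 15, 30)
}
print(schedule)  # {0: 10, 6: 20, 15: 50, 30: 100}
```

Under these assumptions the new node starts at 10% of its configured weight, reaches half at the midpoint of the window, and returns to the full weight of 100 after 30 seconds.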
   
   ### Technical Implementation
   - Adds a `warm_up_conf` configuration item with warm-up duration, minimum 
weight percentage, and growth curve parameters
   - Adjusts the effective node weights based on `update_time` when the load 
balancer calculates weights
   - Keeps weight updates consistent through configuration version management
   - Maintains `update_time` timestamps automatically when nodes are added or 
removed
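
The timestamp handling described above could look roughly like the following sketch. It is illustrative only: the helper name `stamp_nodes` and the choice to match nodes by `host` and `port` are assumptions, and the PR's actual matching logic is not shown here.

```python
import time


def stamp_nodes(old_nodes, new_nodes, now=None):
    """Return new_nodes with update_time carried over or freshly set."""
    now = time.time() if now is None else now
    # Index existing timestamps by (host, port); this key is an assumption.
    known = {(n["host"], n["port"]): n.get("update_time") for n in old_nodes}
    stamped = []
    for node in new_nodes:
        previous = known.get((node["host"], node["port"]))
        # Existing node: keep its timestamp so warm-up is not restarted.
        # New node: stamp it with the current time so warm-up starts now.
        update_time = previous if previous is not None else now
        stamped.append({**node, "update_time": update_time})
    return stamped


old = [{"host": "127.0.0.1", "port": 1980, "weight": 100, "update_time": 1000}]
new = [{"host": "127.0.0.1", "port": 1980, "weight": 100},
       {"host": "127.0.0.1", "port": 1981, "weight": 100}]
result = stamp_nodes(old, new, now=1060)
```

In this sketch the node on port 1980 keeps `update_time` 1000, the node on port 1981 is stamped with 1060, and a node missing from the new list simply drops out along with its timestamp.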
   
   ### Testing
   The test cases verify:
   1. Traffic skew when new nodes join
   2. Traffic growth curve during warm-up
   3. Normal load balancing after warm-up completion
   4. Timestamp preservation during configuration updates
   5. Compatibility across different configuration levels
   
   ### Checklist
   
   - [x] I have explained the need for this PR and the problem it solves
   - [x] I have explained the changes or the new features added to this PR
   - [x] I have added tests corresponding to this change
   - [x] I have updated the documentation to reflect this change
   - [x] I have verified that this change is backward compatible (If not, 
please discuss on the [APISIX mailing 
list](https://github.com/apache/apisix/tree/master#community) first)
   
   

