xianshun163 opened a new issue #5766:
URL: https://github.com/apache/apisix/issues/5766
### Issue description
Consider this scenario:
1. My app has 3 pods.
2. Pod 1 becomes too busy to respond.
3. APISIX proxies a request to pod 1, and it fails with a timeout after 6s.
4. APISIX retries on the next pod and succeeds.
Then another request comes in, and APISIX goes through the same steps again.
I think that when pod 1 times out, it would be good to have a breaker
that skips pod 1 for some seconds,
so that later requests are not proxied to pod 1.
After those seconds, APISIX tries pod 1 again; if the request succeeds,
APISIX goes back to balancing across pods 1~3.
It works like the api-breaker plugin, but it acts on a single pod.
I modified balancer.lua to implement this; the code is
below. Is there a better way to do this?
The code (balancer.lua):
```lua
local function pick_server(route, ctx)
    if ctx.balancer_try_count > 1 then
        if ctx.server_picker and ctx.server_picker.after_balance then
            ctx.server_picker.after_balance(ctx, true)
        end
        -- Added call: a retry means the previously picked node has just failed.
        do_unhealthy_process(route, ctx)
    -- ............

-- The function itself; it works like the api-breaker plugin.
-- As a local function, it has to be defined above pick_server in the file.
local function do_unhealthy_process(route, ctx)
    local node_unhealthy_key = gen_node_unhealthy_key(ctx.balancer_ip,
                                                      ctx.balancer_port, ctx)
    local node_healthy_key = gen_node_healthy_key(ctx.balancer_ip,
                                                  ctx.balancer_port, ctx)
    local node_lasttime_key = gen_node_lasttime_key(ctx.balancer_ip,
                                                    ctx.balancer_port, ctx)

    local plugins = route.value.plugins
    local conf = plugins and plugins[node_break_plugin_name]
    local node_failures
    local max_breaker_sec
    if conf then
        -- Use the plugin settings when the route configures them.
        node_failures = conf.unhealthy.node_failures
        max_breaker_sec = conf.max_breaker_sec
    else
        -- Otherwise fall back to defaults.
        node_failures = 1
        max_breaker_sec = 10
    end

    -- incr(key, value, init, init_ttl): start the counter at 0 and let it
    -- expire. Treating max_breaker_sec * 10 as the TTL is an assumption.
    local node_unhealthy_count, err = shared_buffer:incr(node_unhealthy_key,
                                            1, 0, max_breaker_sec * 10)
    if not node_unhealthy_count then
        core.log.warn("failed to incr node_unhealthy_key: ",
                      node_unhealthy_key, " err: ", err)
        return
    end

    core.log.warn("a node is unhealthy, node_unhealthy_key: ",
                  node_unhealthy_key, " count: ", node_unhealthy_count)
    shared_buffer:delete(node_healthy_key)

    -- Every node_failures failures, mark the node as broken for
    -- max_breaker_sec seconds.
    if node_unhealthy_count % node_failures == 0 then
        core.log.warn("the node to fail: ", ctx.balancer_ip, ":",
                      ctx.balancer_port)
        shared_buffer:set(node_lasttime_key, ngx.time(), max_breaker_sec)
    end
end
```
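
To make the intended behavior easier to discuss, here is a minimal sketch of the per-node breaker described above: skip a node for some seconds after it fails, then let it back in. It is not APISIX code. The `Breaker` table and its methods are hypothetical names, a plain Lua table stands in for the shared dict, and the current time is passed in as `now` so the logic can run outside nginx.

```lua
-- Hypothetical, self-contained sketch; none of these names exist in APISIX.
local Breaker = {}
Breaker.__index = Breaker

function Breaker.new(node_failures, breaker_sec)
    return setmetatable({
        node_failures = node_failures,  -- failures needed to trip the breaker
        breaker_sec = breaker_sec,      -- how long a tripped node is skipped
        failures = {},                  -- node key -> failure count
        open_until = {},                -- node key -> time the skip window ends
    }, Breaker)
end

-- Record a failed attempt (for example a timeout) against a node.
function Breaker:on_failure(node, now)
    local count = (self.failures[node] or 0) + 1
    self.failures[node] = count
    if count >= self.node_failures then
        self.open_until[node] = now + self.breaker_sec
    end
end

-- Record a successful attempt: the node counts as healthy again.
function Breaker:on_success(node)
    self.failures[node] = nil
    self.open_until[node] = nil
end

-- True while the balancer should skip the node.
function Breaker:is_open(node, now)
    local until_ts = self.open_until[node]
    return until_ts ~= nil and now < until_ts
end

-- Return the nodes that are currently eligible for load balancing.
function Breaker:filter(nodes, now)
    local picked = {}
    for _, node in ipairs(nodes) do
        if not self:is_open(node, now) then
            picked[#picked + 1] = node
        end
    end
    return picked
end
```

In this sketch, once the skip window ends the node is offered to the balancer again. A success clears its state, and another failure reopens the window.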
### Environment
- apisix version (cmd: `apisix version`): 2.10.1
- OS (cmd: `uname -a`): CentOS 7
- OpenResty / Nginx version (cmd: `nginx -V` or `openresty -V`):
- etcd version, if have (cmd: run `curl http://127.0.0.1:9090/v1/server_info` to get the info from server-info API):
- apisix-dashboard version, if have:
- the plugin runner version, if the issue is about a plugin runner (cmd: depended on the kind of runner):
- luarocks version, if the issue is about installation (cmd: `luarocks --version`):