xianshun163 opened a new issue #5766:
URL: https://github.com/apache/apisix/issues/5766


   ### Issue description
   
   Consider this scenario:
   1. My app has 3 pods.
   2. Pod 1 becomes too busy and cannot respond.
   3. APISIX calls pod 1 and the request fails with a timeout after 6s.
   4. APISIX retries the next pod and succeeds.
   Then another request comes in, and APISIX goes through the same steps again (the kind of upstream config this assumes is sketched below).
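
   For illustration, the upstream definition behind this scenario might look roughly like the table below (a sketch only; the node addresses, the 6s timeouts and the retry count are hypothetical placeholders, not taken from my real config):

   ```lua
   -- hypothetical upstream config, written as the Lua table APISIX sees after decoding,
   -- only to show where the 6s timeout and the "retry the next pod" behaviour come from
   local upstream = {
       type = "roundrobin",
       nodes = {
           ["10.0.0.1:8080"] = 1,   -- pod 1 (the busy one)
           ["10.0.0.2:8080"] = 1,   -- pod 2
           ["10.0.0.3:8080"] = 1,   -- pod 3
       },
       retries = 2,                                    -- try up to 2 other nodes after a failure
       timeout = { connect = 6, send = 6, read = 6 },  -- the 6s timeout in step 3
   }
   ```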
   
   I think that when pod 1 times out, it would be good to have a breaker that
   skips pod 1 for some seconds, so that later requests are not proxied to it.
   After some seconds we try to proxy to pod 1 again; if it succeeds, APISIX
   balances across pods 1~3 as before.
   
   It works like the api-breaker plugin, but it acts on the upstream node (pod).
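
   For illustration, the per-route plugin config that the patch below expects could look roughly like this (the plugin name and field layout are my own sketch; only `unhealthy.node_failures` and `max_breaker_sec` are actually read by the code):

   ```lua
   -- hypothetical config read via route.value.plugins[node_break_plugin_name] in the patch
   local node_break_conf = {
       unhealthy = {
           node_failures = 3,   -- trip the breaker after 3 failures on the same node
       },
       max_breaker_sec = 10,    -- skip the tripped node for at most this many seconds
   }
   ```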
   
   I modified balancer.lua to implement this; the code is below. Is there a
   better way to do this?
   
   Here is the code, in balancer.lua:
   ```lua
   local function pick_server(route, ctx)
       if ctx.balancer_try_count > 1 then
           if ctx.server_picker and ctx.server_picker.after_balance then
               ctx.server_picker.after_balance(ctx, true)
           end
           -- added: reaching a retry means the node picked just before has failed,
           -- so record that failure for the node
           do_unhealthy_process(route, ctx)
     ............

   -- here is the new function; it works like the api-breaker plugin, but per upstream node

   local function do_unhealthy_process(route, ctx)
       local node_unhealthy_key = gen_node_unhealthy_key(ctx.balancer_ip, ctx.balancer_port, ctx)
       local node_healthy_key = gen_node_healthy_key(ctx.balancer_ip, ctx.balancer_port, ctx)
       local node_lasttime_key = gen_node_lasttime_key(ctx.balancer_ip, ctx.balancer_port, ctx)

       -- read the breaker thresholds from the per-route plugin config, with defaults
       local conf = route.value.plugins[node_break_plugin_name]
       local node_failures
       local max_breaker_sec
       if conf then
           node_failures = conf.unhealthy.node_failures
           max_breaker_sec = conf.max_breaker_sec
       else
           node_failures = 1
           max_breaker_sec = 10
       end

       -- count this node's failures in the shared dict
       -- (init to 0, let the counter expire after max_breaker_sec * 10 seconds)
       local node_unhealthy_count, err = shared_buffer:incr(node_unhealthy_key, 1, 0,
                                                            max_breaker_sec * 10)
       if err then
           core.log.warn("failed to incr node_unhealthy_key: ", node_unhealthy_key,
                         " err: ", err)
           return
       end
       core.log.warn("a node is unhealthy, node_unhealthy_key: ",
                     node_unhealthy_key, " count: ", node_unhealthy_count)

       -- the node just failed, so clear its healthy counter
       shared_buffer:delete(node_healthy_key)

       -- every node_failures failures, trip the breaker: remember the trip time
       -- so the node can be skipped for max_breaker_sec seconds
       if node_unhealthy_count % node_failures == 0 then
           core.log.warn("mark the node as failed: ", ctx.balancer_ip, ":", ctx.balancer_port)
           shared_buffer:set(node_lasttime_key, ngx.time(), max_breaker_sec)
       end
   end
   ```
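
   For completeness, the `gen_node_*_key` helpers referenced above are not part of the snippet; they might be implemented roughly like this (a sketch only; the exact key format and the use of `ctx.matched_route.value.id` for scoping are my assumptions):

   ```lua
   -- sketch: build shared-dict keys scoped per route, so two routes
   -- do not share a node's breaker state (assumption, not the real patch)
   local function gen_node_key(prefix, ip, port, ctx)
       return prefix .. "#" .. ctx.matched_route.value.id .. "#" .. ip .. ":" .. port
   end

   local function gen_node_unhealthy_key(ip, port, ctx)
       return gen_node_key("unhealthy", ip, port, ctx)
   end

   local function gen_node_healthy_key(ip, port, ctx)
       return gen_node_key("healthy", ip, port, ctx)
   end

   local function gen_node_lasttime_key(ip, port, ctx)
       return gen_node_key("lasttime", ip, port, ctx)
   end
   ```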
   
   
   
   
   ### Environment
   
   - apisix version (cmd: `apisix version`): 2.10.1
   - OS (cmd: `uname -a`): CentOS 7
   - OpenResty / Nginx version (cmd: `nginx -V` or `openresty -V`):
   - etcd version, if have (cmd: run `curl http://127.0.0.1:9090/v1/server_info` to get the info from server-info API):
   - apisix-dashboard version, if have:
   - the plugin runner version, if the issue is about a plugin runner (cmd: depended on the kind of runner):
   - luarocks version, if the issue is about installation (cmd: `luarocks --version`):
   

