Wauux opened a new issue #4941:
URL: https://github.com/apache/apisix/issues/4941


   ### Issue description
   
   # apisix 2.8: reproducing the stack overflow caused by DNS-resolved IP changes
   
   ##### Symptom:
   
   The server returns 500 errors.
   
   
   
   ##### Error log stack trace:
   
   ```
   2021/08/30 09:52:15 [error] 10780#10780: *35662 lua entry thread aborted: runtime error: stack overflow
   stack traceback:
   coroutine 0:
           [C]: in function 'nkeys'
           /usr/local/apisix/apisix/core/table.lua:108: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           ..., client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
   ```
   
   
   
   ##### When the problem occurs:
   
   In production, the IP that the CNAME resolves to changes at irregular intervals.
   
   
   
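   ##### Steps to reproduce:
   
   1. Configure a fake CNAME, www.test.com, as the target node of the APISIX upstream.
   
   2. On the local machine, use dnsmasq to keep switching the IP that www.test.com resolves to between 127.0.0.1 and 192.168.33.101 (two addresses are enough to reproduce the problem) while sending requests continuously.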
   
      ```
      Debug log showing the resolved IP changing:
      
      tail -f /usr/local/apisix/logs/error.log | grep "dns resolver domain: www.test.com to"
      2021/08/30 10:45:36 [info] 18176#18176: *51148 [lua] resolver.lua:43: parse_domain(): dns resolver domain: www.test.com to 127.0.0.1, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      2021/08/30 10:45:37 [info] 18176#18176: *51161 [lua] resolver.lua:43: parse_domain(): dns resolver domain: www.test.com to 192.168.33.101, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      2021/08/30 10:45:37 [info] 18176#18176: *51174 [lua] resolver.lua:43: parse_domain(): dns resolver domain: www.test.com to 127.0.0.1, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      2021/08/30 10:45:38 [info] 18176#18176: *51186 [lua] resolver.lua:43: parse_domain(): dns resolver domain: www.test.com to 192.168.33.101, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      2021/08/30 10:45:38 [info] 18176#18176: *51200 [lua] resolver.lua:43: parse_domain(): dns resolver domain: www.test.com to 127.0.0.1, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      2021/08/30 10:45:39 [info] 18176#18176: *51215 [lua] resolver.lua:43: parse_domain(): dns resolver domain: www.test.com to 192.168.33.101, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      ```
   
      
   
   3. After a while, the stack overflow shown above is reproduced, and memory is close to being exhausted.
   
      ```
       
      During this process, encoding errors caused by excessively nested JSON also appear:
      
      json.lua:82: failed to encode: Cannot serialise, excessive nesting (1001) force: true, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      2021/08/30 10:47:37 [warn] 18176#18176: *54246 [lua] json.lua:82: failed to encode: Cannot serialise, excessive nesting (1001)
      ```
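
   The text `excessive nesting (1001)` matches the encode depth check in lua-cjson, whose limit defaults to 1000 levels, which suggests the logged route table has grown past 1000 levels of nesting. As a minimal debugging sketch (the helper name `nesting_depth` and the `dns_value` chain below are illustrative assumptions, not APISIX code), the depth of a table can be measured without recursion:

   ```lua
   -- Hypothetical debugging helper, not part of APISIX.
   -- Returns the nesting depth of `t`, stopping as soon as `limit` is
   -- exceeded so that a self-referencing table still terminates.
   local function nesting_depth(t, limit)
       limit = limit or 1000
       local max_depth = 0
       local stack = { { t, 1 } }  -- explicit stack instead of recursion
       while #stack > 0 do
           local item = table.remove(stack)
           local node, depth = item[1], item[2]
           if depth > max_depth then
               max_depth = depth
           end
           if max_depth > limit then
               return max_depth  -- deeper than the limit: stop walking
           end
           for _, v in pairs(node) do
               if type(v) == "table" then
                   stack[#stack + 1] = { v, depth + 1 }
               end
           end
       end
       return max_depth
   end

   -- Illustrative data only: a chain of 1500 wrapped tables.
   local deep = {}
   for _ = 1, 1500 do
       deep = { dns_value = deep }
   end

   print("exceeds 1000 levels:", nesting_depth(deep) > 1000)
   ```

   Because the walk stops one level past `limit`, the helper only reports that the limit was exceeded, not the true depth of a very deep or cyclic table.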
   
   
   
   ##### Relevant code:
   
   parse_domain_in_route.lua
   
   ```lua
   local function parse_domain_in_route(route)
       local nodes = route.value.upstream.nodes
       local new_nodes, err = parse_domain_for_nodes(nodes)   
       if not new_nodes then
           return nil, err
       end
   
       local up_conf = route.dns_value and route.dns_value.upstream
       local ok = upstream_util.compare_upstream_node(up_conf, new_nodes)
       if ok then
           return route
       end
   
       -- don't modify the modifiedIndex to avoid plugin cache miss because of DNS resolve result
       -- has changed
   
       route.dns_value = core.table.deepcopy(route.value)
       route.dns_value.upstream.nodes = new_nodes
       core.log.info("parse route which contain domain: ",
                     core.json.delay_encode(route, true))
       return route
   end
   ```
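
   The traceback shows `_deepcopy` calling itself once per nesting level, so the copy fails as soon as the input is nested deeply enough or refers back to itself. The sketch below is not the APISIX `core.table.deepcopy` implementation: `naive_deepcopy`, `safe_deepcopy`, and the `parent` back-reference are hypothetical, and whether the real route table contains such a reference has not been confirmed from the APISIX source. It only illustrates how remembering already-copied tables keeps the recursion bounded.

   ```lua
   -- Hypothetical stand-in for a recursive copy without cycle detection.
   local function naive_deepcopy(orig)
       if type(orig) ~= "table" then
           return orig
       end
       local copy = {}
       for k, v in pairs(orig) do
           copy[k] = naive_deepcopy(v)
       end
       return copy
   end

   -- Cycle-safe variant: `seen` maps each source table to its copy,
   -- so a table reached a second time is reused instead of re-copied.
   local function safe_deepcopy(orig, seen)
       if type(orig) ~= "table" then
           return orig
       end
       seen = seen or {}
       if seen[orig] then
           return seen[orig]
       end
       local copy = {}
       seen[orig] = copy
       for k, v in pairs(orig) do
           copy[k] = safe_deepcopy(v, seen)
       end
       return copy
   end

   -- Illustrative data only (assumption): a route-like table whose
   -- upstream points back to the route itself.
   local route = {
       value = {
           upstream = {
               nodes = { { host = "www.test.com", port = 80, weight = 1 } },
           },
       },
   }
   route.value.upstream.parent = route

   -- The naive copy never terminates on the cycle and overflows the stack;
   -- pcall turns that into a returned error instead of aborting.
   local ok, err = pcall(naive_deepcopy, route)
   print("naive copy succeeded:", ok, err)

   -- The cycle-safe copy terminates and preserves the back-reference shape.
   local copy = safe_deepcopy(route)
   print("safe copy keeps cycle:", copy.value.upstream.parent == copy)
   ```

   Tracking visited tables also preserves shared sub-tables in the copy, at the cost of one extra lookup table per call.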
   
   
   ### Environment
   
   - apisix version (cmd: `apisix version`): 2.8
   - OS (cmd: `uname -a`):
   - Linux ip-10-195-19-241.ap-southeast-1.compute.internal 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
   - OpenResty / Nginx version (cmd: `nginx -V` or `openresty -V`):
   - nginx version: openresty/1.19.3.2
   - etcd version, if have (cmd: run `curl http://127.0.0.1:9090/v1/server_info` to get the info from server-info API):
   - {"hostname":"ip-10-195-19-241.ap-southeast-1.compute.internal","version":"2.6","boot_time":1630375405,"etcd_version":"3.4.0","up_time":4503,"last_report_time":1630379905,"id":"a2315b76-470d-4457-8625-f943ff24eac8"}
   - apisix-dashboard version, if have:
   - luarocks version, if the issue is about installation (cmd: `luarocks --version`):
   
   
   ### Steps to reproduce
   
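   ##### Steps to reproduce:
   
   1. Configure a fake CNAME, www.test.com, as the target node of the APISIX upstream.
   
   2. On the local machine, use dnsmasq to keep switching the IP that www.test.com resolves to between 127.0.0.1 and 192.168.33.101 (two addresses are enough to reproduce the problem) while sending requests continuously.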
   
      ```
      Debug log showing the resolved IP changing:
      
      tail -f /usr/local/apisix/logs/error.log | grep "dns resolver domain: www.test.com to"
      2021/08/30 10:45:36 [info] 18176#18176: *51148 [lua] resolver.lua:43: parse_domain(): dns resolver domain: www.test.com to 127.0.0.1, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      2021/08/30 10:45:37 [info] 18176#18176: *51161 [lua] resolver.lua:43: parse_domain(): dns resolver domain: www.test.com to 192.168.33.101, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      2021/08/30 10:45:37 [info] 18176#18176: *51174 [lua] resolver.lua:43: parse_domain(): dns resolver domain: www.test.com to 127.0.0.1, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      2021/08/30 10:45:38 [info] 18176#18176: *51186 [lua] resolver.lua:43: parse_domain(): dns resolver domain: www.test.com to 192.168.33.101, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      2021/08/30 10:45:38 [info] 18176#18176: *51200 [lua] resolver.lua:43: parse_domain(): dns resolver domain: www.test.com to 127.0.0.1, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      2021/08/30 10:45:39 [info] 18176#18176: *51215 [lua] resolver.lua:43: parse_domain(): dns resolver domain: www.test.com to 192.168.33.101, client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
      ```
   
      
   
   3. After a while, the stack overflow shown above is reproduced, and memory is close to being exhausted.
   
   ### Actual result
   
   APISIX returns 500 errors; the recursive table copy overflows the stack.
   
   ### Error log
   
   ```
   2021/08/30 09:52:15 [error] 10780#10780: *35662 lua entry thread aborted: runtime error: stack overflow
   stack traceback:
   coroutine 0:
           [C]: in function 'nkeys'
           /usr/local/apisix/apisix/core/table.lua:108: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           /usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
           ..., client: 192.168.33.133, server: _, request: "GET / HTTP/1.1", host: "192.168.33.133:9080"
   ```
   
   ### Expected result
   
   A change in the DNS resolution result causes a stack overflow.



