Wauux opened a new issue #4941:
URL: https://github.com/apache/apisix/issues/4941
### Issue description
# APISIX 2.8: reproducing the stack overflow triggered when the DNS-resolved IP changes
##### Symptom:
The server returns a 500 error.
##### Error log stack trace:
```
2021/08/30 09:52:15 [error] 10780#10780: *35662 lua entry thread aborted:
runtime error: stack overflow
stack traceback:
coroutine 0:
[C]: in function 'nkeys'
/usr/local/apisix/apisix/core/table.lua:108: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
..., client: 192.168.33.133, server: _, request: "GET / HTTP/1.1",
host: "192.168.33.133:9080"
```
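##### When the problem occurs:
In production, the IP that the CNAME resolves to changes at irregular intervals.
##### Steps to reproduce:
1. Configure a fake CNAME, www.test.com, as the target node of the APISIX upstream.
2. On the local machine, use dnsmasq to switch the IP that www.test.com resolves to back and forth between 127.0.0.1 and 192.168.33.101 (two addresses are enough to reproduce), while sending requests continuously.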
```
The debug log shows the DNS-resolved IP changing:
tail -f /usr/local/apisix/logs/error.log | grep "dns resolver domain: www.test.com to"
2021/08/30 10:45:36 [info] 18176#18176: *51148 [lua] resolver.lua:43:
parse_domain(): dns resolver domain: www.test.com to 127.0.0.1, client:
192.168.33.133, server: _, request: "GET / HTTP/1.1", host:
"192.168.33.133:9080"
2021/08/30 10:45:37 [info] 18176#18176: *51161 [lua] resolver.lua:43:
parse_domain(): dns resolver domain: www.test.com to 192.168.33.101, client:
192.168.33.133, server: _, request: "GET / HTTP/1.1", host:
"192.168.33.133:9080"
2021/08/30 10:45:37 [info] 18176#18176: *51174 [lua] resolver.lua:43:
parse_domain(): dns resolver domain: www.test.com to 127.0.0.1, client:
192.168.33.133, server: _, request: "GET / HTTP/1.1", host:
"192.168.33.133:9080"
2021/08/30 10:45:38 [info] 18176#18176: *51186 [lua] resolver.lua:43:
parse_domain(): dns resolver domain: www.test.com to 192.168.33.101, client:
192.168.33.133, server: _, request: "GET / HTTP/1.1", host:
"192.168.33.133:9080"
2021/08/30 10:45:38 [info] 18176#18176: *51200 [lua] resolver.lua:43:
parse_domain(): dns resolver domain: www.test.com to 127.0.0.1, client:
192.168.33.133, server: _, request: "GET / HTTP/1.1", host:
"192.168.33.133:9080"
2021/08/30 10:45:39 [info] 18176#18176: *51215 [lua] resolver.lua:43:
parse_domain(): dns resolver domain: www.test.com to 192.168.33.101, client:
192.168.33.133, server: _, request: "GET / HTTP/1.1", host:
"192.168.33.133:9080"
```
3. After a while the stack overflow reproduces, and memory is close to being exhausted.
```
Along the way, JSON serialization errors caused by excessive nesting also appear:
json.lua:82: failed to encode: Cannot serialise, excessive nesting
(1001) force: true, client: 192.168.33.133, server: _, request: "GET /
HTTP/1.1", host: "192.168.33.133:9080"
2021/08/30 10:47:37 [warn] 18176#18176: *54246 [lua] json.lua:82: failed
to encode: Cannot serialise, excessive nesting (1001)
```
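
The `excessive nesting (1001)` warning suggests that the table passed to the JSON encoder is more than 1000 levels deep, which matches lua-cjson's default encode depth limit of 1000. A minimal sketch of a depth probe that could help confirm this is shown below. `table_depth` and its parameters are hypothetical names, not part of APISIX, and the sample tables are made up for illustration.

```lua
-- Hypothetical helper: returns the nesting depth of `t`, giving up and
-- returning `limit + 1` as soon as the depth exceeds `limit`.
-- `on_path` holds the tables on the current path, so a cycle is reported
-- as "over the limit" instead of recursing forever.
local function table_depth(t, limit, depth, on_path)
    if type(t) ~= "table" then
        return depth or 0
    end
    depth = (depth or 0) + 1
    on_path = on_path or {}
    if depth > limit or on_path[t] then
        return limit + 1
    end
    on_path[t] = true
    local max = depth
    for _, v in pairs(t) do
        local d = table_depth(v, limit, depth, on_path)
        if d > max then
            max = d
        end
        if max > limit then
            break
        end
    end
    on_path[t] = nil
    return max
end

-- A small table with three levels of nesting.
local shallow = { a = { b = { c = 1 } } }
print(table_depth(shallow, 1000))   -- 3

-- A chain of 1001 nested tables, one level past the limit.
local deep = {}
local cur = deep
for _ = 1, 1000 do
    cur.child = {}
    cur = cur.child
end
print(table_depth(deep, 1000))      -- 1001, i.e. over the limit
```

Logging such a value for `route` on each DNS change would show whether the structure keeps getting deeper or is already cyclic after the first copy. The report itself does not say which of the two is the case.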
##### Relevant code:
parse_domain_in_route.lua
```lua
local function parse_domain_in_route(route)
    local nodes = route.value.upstream.nodes
    local new_nodes, err = parse_domain_for_nodes(nodes)
    if not new_nodes then
        return nil, err
    end

    local up_conf = route.dns_value and route.dns_value.upstream
    local ok = upstream_util.compare_upstream_node(up_conf, new_nodes)
    if ok then
        return route
    end

    -- don't modify the modifiedIndex to avoid plugin cache miss because of
    -- DNS resolve result has changed
    route.dns_value = core.table.deepcopy(route.value)
    route.dns_value.upstream.nodes = new_nodes
    core.log.info("parse route which contain domain: ",
                  core.json.delay_encode(route, true))
    return route
end
```
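
The stack trace shows `_deepcopy` calling itself once per nesting level, so a copy of this shape runs out of stack when its input is very deep or contains a reference cycle. The sketch below contrasts a naive recursive copy with a cycle-aware one. It is an illustration under stated assumptions, not the APISIX implementation: `naive_deepcopy`, `safe_deepcopy` and the `parent` back-reference on the sample route are all hypothetical, and the report does not establish that the real route table is cyclic.

```lua
-- Naive recursive copy: one stack frame per nesting level, no cycle detection.
local function naive_deepcopy(orig)
    if type(orig) ~= "table" then
        return orig
    end
    local copy = {}
    for k, v in pairs(orig) do
        copy[k] = naive_deepcopy(v)
    end
    return copy
end

-- Cycle-aware copy: `seen` maps each source table to its copy, so a table
-- that is reached a second time is reused instead of being walked again.
local function safe_deepcopy(orig, seen)
    if type(orig) ~= "table" then
        return orig
    end
    seen = seen or {}
    if seen[orig] then
        return seen[orig]
    end
    local copy = {}
    seen[orig] = copy
    for k, v in pairs(orig) do
        copy[k] = safe_deepcopy(v, seen)
    end
    return copy
end

-- Hypothetical route whose upstream points back at the route itself.
local route = { value = { upstream = { nodes = { { host = "127.0.0.1" } } } } }
route.value.upstream.parent = route

-- The naive copy never terminates on a cycle; pcall catches the overflow.
local ok, err = pcall(naive_deepcopy, route)
print(ok, err)                              -- ok is false

-- The cycle-aware copy terminates and keeps the cycle inside the copy.
local copy = safe_deepcopy(route)
print(copy.value.upstream.parent == copy)   -- true
```

The memo table handles cycles and shared references, but the recursion depth still grows with genuine nesting, so a structure that really is thousands of levels deep needs a fix at the point where the nesting is created.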
### Environment
- apisix version (cmd: `apisix version`): 2.8
- OS (cmd: `uname -a`):
  - Linux ip-10-195-19-241.ap-southeast-1.compute.internal 3.10.0-1062.12.1.el7.x86_64 #1 SMP Tue Feb 4 23:02:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
- OpenResty / Nginx version (cmd: `nginx -V` or `openresty -V`):
  - nginx version: openresty/1.19.3.2
- etcd version, if have (cmd: run `curl http://127.0.0.1:9090/v1/server_info` to get the info from server-info API):
  - {"hostname":"ip-10-195-19-241.ap-southeast-1.compute.internal","version":"2.6","boot_time":1630375405,"etcd_version":"3.4.0","up_time":4503,"last_report_time":1630379905,"id":"a2315b76-470d-4457-8625-f943ff24eac8"}
- apisix-dashboard version, if have:
- luarocks version, if the issue is about installation (cmd: `luarocks --version`):
### Steps to reproduce
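##### Steps to reproduce:
1. Configure a fake CNAME, www.test.com, as the target node of the APISIX upstream.
2. On the local machine, use dnsmasq to switch the IP that www.test.com resolves to back and forth between 127.0.0.1 and 192.168.33.101 (two addresses are enough to reproduce), while sending requests continuously.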
```
The debug log shows the DNS-resolved IP changing:
tail -f /usr/local/apisix/logs/error.log | grep "dns resolver domain: www.test.com to"
2021/08/30 10:45:36 [info] 18176#18176: *51148 [lua] resolver.lua:43:
parse_domain(): dns resolver domain: www.test.com to 127.0.0.1, client:
192.168.33.133, server: _, request: "GET / HTTP/1.1", host:
"192.168.33.133:9080"
2021/08/30 10:45:37 [info] 18176#18176: *51161 [lua] resolver.lua:43:
parse_domain(): dns resolver domain: www.test.com to 192.168.33.101, client:
192.168.33.133, server: _, request: "GET / HTTP/1.1", host:
"192.168.33.133:9080"
2021/08/30 10:45:37 [info] 18176#18176: *51174 [lua] resolver.lua:43:
parse_domain(): dns resolver domain: www.test.com to 127.0.0.1, client:
192.168.33.133, server: _, request: "GET / HTTP/1.1", host:
"192.168.33.133:9080"
2021/08/30 10:45:38 [info] 18176#18176: *51186 [lua] resolver.lua:43:
parse_domain(): dns resolver domain: www.test.com to 192.168.33.101, client:
192.168.33.133, server: _, request: "GET / HTTP/1.1", host:
"192.168.33.133:9080"
2021/08/30 10:45:38 [info] 18176#18176: *51200 [lua] resolver.lua:43:
parse_domain(): dns resolver domain: www.test.com to 127.0.0.1, client:
192.168.33.133, server: _, request: "GET / HTTP/1.1", host:
"192.168.33.133:9080"
2021/08/30 10:45:39 [info] 18176#18176: *51215 [lua] resolver.lua:43:
parse_domain(): dns resolver domain: www.test.com to 192.168.33.101, client:
192.168.33.133, server: _, request: "GET / HTTP/1.1", host:
"192.168.33.133:9080"
```
3. After a while the stack overflow reproduces, and memory is close to being exhausted.
### Actual result
APISIX returns a 500 error: the recursive table copy overflows the stack.
### Error log
```
2021/08/30 09:52:15 [error] 10780#10780: *35662 lua entry thread aborted:
runtime error: stack overflow
stack traceback:
coroutine 0:
[C]: in function 'nkeys'
/usr/local/apisix/apisix/core/table.lua:108: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
/usr/local/apisix/apisix/core/table.lua:111: in function '_deepcopy'
..., client: 192.168.33.133, server: _, request: "GET / HTTP/1.1",
host: "192.168.33.133:9080"
```
### Expected result
A change in the DNS resolution result should not cause a stack overflow.