ZhangShangyu opened a new issue, #9805:
URL: https://github.com/apache/apisix/issues/9805
### Current Behavior
### Bug description:
We have implemented a service discovery plugin to meet these requirements:
1. Get the service id from the host subdomain, then set the upstream to the IP
of that service id.
2. The host subdomain is dynamic, so we cannot enumerate every host and configure
a dedicated route for each one.
for example:
a.myhost.com should route to ip A
b.myhost.com should route to ip B
c.myhost.com should route to ip C
and so on
A demo of the service discovery plugin looks like this:
```lua
-- demo_discover.lua
local _M = {}

-- static demo mapping: service id -> upstream node
local demo_nodes_tab = {
    a = {host = "10.0.0.1", port = 80},
    b = {host = "10.0.0.2", port = 80},
    -- ...
}

function _M.nodes(service_name, discovery_args)
    -- host example: a.myhost.com
    local host = ngx.var.host
    -- dots are escaped: a bare "." in a Lua pattern matches any character
    local service_id = host:match("^([^.]+)%.myhost%.com$")
    return demo_nodes_tab[service_id]
end

return _M
```
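The host-to-node lookup can be checked outside OpenResty with a small standalone sketch. It assumes the host is passed in as a parameter instead of being read from `ngx.var.host`, and the helper names `service_id_of` and `nodes_for` are hypothetical, not part of the plugin. The pattern is anchored and its dots are escaped, because an unescaped `.` in a Lua pattern matches any character.

```lua
-- Standalone sketch of the lookup; helper names are hypothetical.
local demo_nodes_tab = {
    a = { host = "10.0.0.1", port = 80 },
    b = { host = "10.0.0.2", port = 80 },
}

-- Extract the first label of "<id>.myhost.com"; returns nil for other hosts.
local function service_id_of(host)
    return host:match("^([^.]+)%.myhost%.com$")
end

-- Map a request host to its demo node, or nil when the host is unknown.
local function nodes_for(host)
    local id = service_id_of(host)
    return id and demo_nodes_tab[id] or nil
end

print(service_id_of("a.myhost.com"))       --> a
print(nodes_for("b.myhost.com").host)      --> 10.0.0.2
print(service_id_of("a-myhost.com"))       --> nil
```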
Then create a service and a route for it:
```
service:
{
    "id": "demo_service",
    "name": "demo_service",
    "upstream": {
        "discovery_type": "demo_discover",
        "service_name": "demo_service",
        "pass_host": "pass",
        "scheme": "http",
        "type": "roundrobin"
    }
}
route:
{
    "host": "*.myhost.com",
    "name": "demo_route",
    "service_id": "demo_service",
    "uri": "/*"
}
```
Send some requests to a.myhost.com and b.myhost.com and
**the upstream IP is sometimes wrong. It looks like an ABA problem:**
* sometimes a request to a.myhost.com gets the IP of b.myhost.com
* sometimes a request to b.myhost.com gets the IP of a.myhost.com
### root cause:
**1. upstream_conf comes from a cache**
In the APISIX source code:
https://github.com/apache/apisix/blob/release/3.4/apisix/init.lua#L631C24-L631C43
https://github.com/apache/apisix/blob/release/3.4/apisix/plugin.lua#L634
If route.value.service_id is set, `plugin.merge_service_route` is called.
`plugin.merge_service_route` uses an lrucache keyed by the service_conf key and
version.
**2. cached upstream_conf is used in set_upstream**
https://github.com/apache/apisix/blob/release/3.4/apisix/init.lua#L478
**3. original_nodes of the cached upstream_conf is compared with the newly
discovered nodes**
https://github.com/apache/apisix/blob/release/3.4/apisix/upstream.lua#L277
https://github.com/apache/apisix/blob/release/3.4/apisix/utils/upstream.lua#L37
**4. upstream_conf is updated according to the comparison result**
https://github.com/apache/apisix/blob/release/3.4/apisix/upstream.lua#L285
https://github.com/apache/apisix/blob/release/3.4/apisix/upstream.lua#L336
If original_nodes is the same as the new nodes, fill_node_info is applied to the
old cached upstream_conf and everything is fine.
If original_nodes differs from the new nodes, nodes is updated to the new nodes,
then upstream_conf is cloned into new_up_conf,
and after that fill_node_info is applied to new_up_conf.
**The cause of the bug is here**: original_nodes in the old cached upstream_conf is
not updated by fill_node_info, because new_up_conf is a different table created by the clone.
For example:
1. Request a.myhost.com first. The nodes in the cache are IP A, and
original_nodes is IP A.
2. Then request b.myhost.com. The newly discovered IP B differs from original_nodes
(IP A), so nodes is updated to IP B, but the cached original_nodes is
not updated and is still IP A.
3. Request a.myhost.com again. The newly discovered IP A is compared with original_nodes
(IP A) and they are the same! **So nodes is not updated to the newly discovered IP A, and
the request gets the wrong upstream node, IP B!**
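The three steps above can be replayed with a simplified model in plain Lua. This is only a sketch of the flow described in this issue, not the APISIX code: `set_upstream`, `fill_node_info`, `same_nodes` and `shallow_clone` are hypothetical stand-ins, `fill_node_info` is assumed to do nothing except record `original_nodes`, and the cached table is seeded with the state described after step 1.

```lua
-- Simplified model of the described flow; all names are stand-ins.
local A = { { host = "10.0.0.1", port = 80 } }
local B = { { host = "10.0.0.2", port = 80 } }
local discovered = { ["a.myhost.com"] = A, ["b.myhost.com"] = B }

local function same_nodes(x, y)
    if #x ~= #y then return false end
    for i = 1, #x do
        if x[i].host ~= y[i].host or x[i].port ~= y[i].port then
            return false
        end
    end
    return true
end

local function shallow_clone(t)
    local c = {}
    for k, v in pairs(t) do c[k] = v end
    return c
end

-- Assumption: the only modelled effect is recording original_nodes.
local function fill_node_info(up_conf)
    up_conf.original_nodes = up_conf.nodes
end

-- Cached object in the state described after step 1 (a.myhost.com).
local cached_up_conf = { nodes = A, original_nodes = A }

local function set_upstream(host)
    local up_conf = cached_up_conf
    local new_nodes = discovered[host]
    if not same_nodes(up_conf.original_nodes, new_nodes) then
        up_conf.nodes = new_nodes
        up_conf = shallow_clone(up_conf)  -- a different table from the cached one
    end
    fill_node_info(up_conf)  -- on a mismatch this only touches the clone
    return up_conf.nodes[1].host
end

local second = set_upstream("b.myhost.com")  -- step 2
local third = set_upstream("a.myhost.com")   -- step 3
print(second, third)  -- third is 10.0.0.2 although a.myhost.com was requested
```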
### solution
Call fill_node_info before cloning into new_up_conf, so that original_nodes in the
cached up_conf is updated as well.
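The proposed ordering can be sketched with a simplified model in plain Lua. It is not a patch against APISIX: every function name is a hypothetical stand-in, `fill_node_info` is assumed to do nothing except record `original_nodes`, and the cached table starts in the state reached after a first request to a.myhost.com.

```lua
-- Simplified model of the proposed ordering; all names are stand-ins.
local A = { { host = "10.0.0.1", port = 80 } }
local B = { { host = "10.0.0.2", port = 80 } }
local discovered = { ["a.myhost.com"] = A, ["b.myhost.com"] = B }

local function same_nodes(x, y)
    if #x ~= #y then return false end
    for i = 1, #x do
        if x[i].host ~= y[i].host or x[i].port ~= y[i].port then
            return false
        end
    end
    return true
end

local function shallow_clone(t)
    local c = {}
    for k, v in pairs(t) do c[k] = v end
    return c
end

-- Assumption: the only modelled effect is recording original_nodes.
local function fill_node_info(up_conf)
    up_conf.original_nodes = up_conf.nodes
end

-- Cached object after a first request to a.myhost.com.
local cached_up_conf = { nodes = A, original_nodes = A }

local function set_upstream(host)
    local up_conf = cached_up_conf
    local new_nodes = discovered[host]
    if not same_nodes(up_conf.original_nodes, new_nodes) then
        up_conf.nodes = new_nodes
        fill_node_info(up_conf)           -- update the cached table first
        up_conf = shallow_clone(up_conf)  -- the clone inherits the fresh state
    else
        fill_node_info(up_conf)
    end
    return up_conf.nodes[1].host
end

local r1 = set_upstream("b.myhost.com")
local r2 = set_upstream("a.myhost.com")
local r3 = set_upstream("b.myhost.com")
print(r1, r2, r3)  -- each request now gets the node of its own host
```

With this ordering, original_nodes in the cached table always reflects the most recent discovery result, so the next comparison is made against current data.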
### Expected Behavior
_No response_
### Error Logs
_No response_
### Steps to Reproduce
See the description above.
### Environment
- APISIX version 2.15.3