On 2/5/2023 7:01 PM, J Carter wrote:
> Hi Aleksei,
>
> Why not permanently assign the task of resolving a given upstream server
> group (all servers/peers within it) to a single worker?
>
> It seems that this approach would resolve the SRV issues, and remove the
> need for the shared queue of tasks.
>
> The load would still be spread evenly for the most realistic scenarios -
> which is where there are many upstream server groups of few servers, as
> opposed to few upstream server groups of many servers.

The intent of the change was exactly the opposite: to avoid any permanent
assignment of periodic tasks to a worker and allow another process to
resume resolving if the original assignee exits, whether normally
or abnormally. I'm not even doing enough for that -- I should've kept
in-progress tasks at the end of the queue with expires = resolver
timeout + a small constant, and retried from another process when the
timeout is reached, but the idea was abandoned for a minuscule
improvement of insertion time. I expect to be asked to reconsider, as
patch 6/6 does not cover all the possible situations where we want to
recover a stale task.
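
To illustrate the abandoned idea, here is a rough sketch of a queue where a
claimed task stays queued with expires = now + resolver timeout + a small
constant, so that any process can pick it up again if the claimant never
finishes. All names here (task_t, task_queue_t, queue_claim, queue_complete)
are hypothetical and are not taken from the patch series or from nginx; the
locking that a shared-memory queue would need is omitted.

```c
#include <stddef.h>

#define MAX_TASKS 8

typedef struct {
    int   id;       /* which periodic resolve task this is */
    long  expires;  /* time (seconds) at which the task becomes due */
    int   owner;    /* worker that claimed it last, -1 if none */
} task_t;

typedef struct {
    task_t  tasks[MAX_TASKS];  /* kept sorted by expires, earliest first */
    size_t  n;
} task_queue_t;

/* Move tasks[i] to its sorted position after its expires has changed. */
static void
queue_reposition(task_queue_t *q, size_t i)
{
    task_t  t = q->tasks[i];

    while (i > 0 && q->tasks[i - 1].expires > t.expires) {
        q->tasks[i] = q->tasks[i - 1];
        i--;
    }

    while (i + 1 < q->n && q->tasks[i + 1].expires <= t.expires) {
        q->tasks[i] = q->tasks[i + 1];
        i++;
    }

    q->tasks[i] = t;
}

/*
 * Claim the earliest due task for a worker.  The task is not removed:
 * it is pushed back with a deadline, so a stale claim becomes due again
 * once the deadline passes.  Returns the task id, or -1 if nothing is due.
 */
static int
queue_claim(task_queue_t *q, int worker, long now, long resolver_timeout)
{
    const long  slack = 1;  /* the "small constant"; value is arbitrary */
    int         id;

    if (q->n == 0 || q->tasks[0].expires > now) {
        return -1;
    }

    id = q->tasks[0].id;
    q->tasks[0].owner = worker;
    q->tasks[0].expires = now + resolver_timeout + slack;
    queue_reposition(q, 0);

    return id;
}

/* Normal completion: release the task and schedule the next resolve. */
static void
queue_complete(task_queue_t *q, int id, long now, long valid)
{
    size_t  i;

    for (i = 0; i < q->n; i++) {
        if (q->tasks[i].id == id) {
            q->tasks[i].owner = -1;
            q->tasks[i].expires = now + valid;
            queue_reposition(q, i);
            return;
        }
    }
}
```

The cost mentioned above shows up in queue_reposition(): a claimed task has
to be re-inserted at the right place instead of simply being taken off the
head of the queue.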

A permanent assignment of a whole upstream would also require notifying
other processes that the upstream is no longer assigned if the worker
exits, or consistently recovering that assignment over a restart of a
single worker (e.g. after a crash - not a regular situation, but one we
should take into account nonetheless). And the benefit is not quite
obvious - I mentioned that resolving SRVs with a lot of records may take
longer to update the list of peers, but the situation with contention is
not expected to change significantly* if we pin these tasks to a single
worker, as another worker may be doing the same for another upstream.

Most importantly, this isn't even a bottleneck. It only slightly
exacerbates an existing problem with certain balancers that already
suffer from the overuse of locks, in a configuration that was
specifically crafted to amplify and highlight the difference and is far
from the most realistic scenarios you describe.

* Pending verification on a performance test stand.