Hi Marc,
On Fri, Mar 24, 2023 at 01:52:05PM -0500, Marc West wrote:
> Hi,
>
> I saw in the 2.6.10 release notes to report any issues that seem like
> they could be related to the concurrency changes. When reloading config
> on 2.6.10 or 2.6.11 on FreeBSD 13.1-RELEASE the old process does not
> exit and starts to use 100%+ CPU. This does not happen on 2.6.9 with
> the same config.
Hmmm that's not cool, it could indeed be related to the recent fixes,
thanks for reporting it.
> As a test I set nbthread=1 on 2.6.11 instead of the 40 default on this
> system and cannot reproduce the problem. Reloads are also good on test
> VMs with only 1 core. Trying different nbthread values in increments of
> 2 up to 40 gave mixed results, sometimes reloads are OK and other times
> the old process has the issue with the same nbthread value that was OK on
> a previous test. Higher nbthread values seem to more reliably fail and
> my original value of 36 always fails with active traffic.
>
> Everything else runs perfectly on 2.6.11 except this issue during reloads
> and the workaround is to just manually kill -9 the old process.
>
> Here is an example after a few config reloads while making changes (with
> nbthread=36):
Is it easy for you to reproduce, and is the reproducer reliable enough,
or does it require long exposure to production traffic? I'm asking
because I'd like you to run a few extra tests, but I don't want to
cause trouble on your production systems.
> Some ktrace output from one of the old processes is below, I can send
> the full ~130MB dump to a developer if helpful.
(...)
It seems to show that some events are ignored. Here are a few things
I'm thinking about to try to narrow down the cause:
  - if it only affects nbthread > 1, it could be related to listeners
    or to idle connection takeover between threads on the backend. The
    latter can be disabled with the following global setting (see the
    sketch after this list):
        tune.idle-pool.shared off
  - for listeners, one way to keep them from being shared between
    threads (and to improve overall performance) is to enable sharding,
    which results in one FD per thread. It will consume more FDs (each
    listener will have nbthread sockets) but usually significantly
    improves resource usage and lowers contention. Here what I'm really
    looking for is to make sure no listening FD is shared between two
    threads (example after this list):
        bind ... shards by-thread
  - given that I didn't notice this and that we haven't received other
    reports yet, it's possible that it only affects the kqueue poller.
    If you can reproduce this on a test machine, it would be nice to
    start with "nokqueue" in the global section (see below). This will
    switch to poll() instead, which is much less scalable, but it is a
    different code path and behaves comparably on other operating
    systems.
  - it is possible that I botched one of my backports and that it
    triggers a bug with your setup. If you're able to reproduce this
    out of production, or if you're OK with trying 2.7, it would be
    useful to test the latest 2.7 instead. If it works fine there, it
    will definitely indicate that a difference remains between the two
    in this area.
  - in order to figure out which functions are called, it would be nice
    to pre-establish a connection to the stats socket before the
    reload, then issue "show tasks;show fd" there once the CPU goes
    through the roof. If you've built with -DDEBUG_TASK, it can be even
    more helpful to also send "set profiling tasks on", wait 10s or so,
    then send "show profiling tasks". It will return the number of
    calls per task and per calling point as well as the time spent
    there. Usually this helps identify the event that is preventing the
    process from sleeping (see the example sequence after this list).
  - it's also possible that something in your config triggers one of
    the bugs above and that it's related not to this specific poller
    but to the changes themselves, so if nothing above helps, I'll then
    ask you for a stripped-down version of your config that is
    sufficient to reproduce the issue, and I'll test it on a tiny
    FreeBSD machine I have on my desk. But I'd rather do this once the
    issue is sufficiently narrowed down.
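To give a rough idea of the first test, assuming nothing else needs to
change in your global section (the nbthread value is just taken from
your report), it would look like this:
    global
        nbthread 36
        tune.idle-pool.shared off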
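For the sharding test, it should only be a matter of appending the
option to your existing bind lines; the frontend name and address below
are purely illustrative:
    frontend fe_test
        bind :8080 shards by-thread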
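The kqueue test is a one-line change in the global section:
    global
        nokqueue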
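For the stats socket, assuming it lives at /var/run/haproxy.sock
(adjust the path to your setup) and that socat is available, the
sequence could look like the one below; the "prompt" command keeps the
connection open in interactive mode so that it still talks to the old
process after the reload:
    $ socat readline UNIX-CONNECT:/var/run/haproxy.sock
    prompt
    > show tasks;show fd
    > set profiling tasks on
    (wait ~10 seconds)
    > show profiling tasks
If you need to rebuild with the debug option, adding -DDEBUG_TASK to
the DEBUG variable of your usual make command should be sufficient.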
Thanks!
Willy