On 6/12/25 12:15 PM, Jeff Layton wrote: > On Thu, 2025-06-12 at 12:05 -0400, Chuck Lever wrote: >> On 6/12/25 11:57 AM, Jeff Layton wrote: >>> On Tue, 2025-05-27 at 20:12 -0400, Jeff Layton wrote: >>>> The old nfsdfs interface for starting a server with multiple pools >>>> handles the special case of a single entry array passed down from >>>> userland by distributing the threads over every NUMA node. >>>> >>>> The netlink control interface however constructs an array of length >>>> nfsd_nrpools() and fills any unprovided slots with 0's. This behavior >>>> defeats the special casing that the old interface relies on. >>>> >>>> Change nfsd_nl_threads_set_doit() to pass down the array from userland >>>> as-is. >>>> >>>> Fixes: 7f5c330b2620 ("nfsd: allow passing in array of thread counts via >>>> netlink") >>>> Reported-by: Mike Snitzer <snit...@kernel.org> >>>> Closes: https://lore.kernel.org/linux-nfs/adc-ftnzhjalw...@kernel.org/ >>>> Signed-off-by: Jeff Layton <jlay...@kernel.org> >>>> --- >>>> fs/nfsd/nfsctl.c | 5 ++--- >>>> 1 file changed, 2 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c >>>> index >>>> ac265d6fde35df4e02b955050f5b0ef22e6e519c..22101e08c3e80350668e94c395058bc228b08e64 >>>> 100644 >>>> --- a/fs/nfsd/nfsctl.c >>>> +++ b/fs/nfsd/nfsctl.c >>>> @@ -1611,7 +1611,7 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff >>>> *skb, >>>> */ >>>> int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info) >>>> { >>>> - int *nthreads, count = 0, nrpools, i, ret = -EOPNOTSUPP, rem; >>>> + int *nthreads, nrpools = 0, i, ret = -EOPNOTSUPP, rem; >>>> struct net *net = genl_info_net(info); >>>> struct nfsd_net *nn = net_generic(net, nfsd_net_id); >>>> const struct nlattr *attr; >>>> @@ -1623,12 +1623,11 @@ int nfsd_nl_threads_set_doit(struct sk_buff *skb, >>>> struct genl_info *info) >>>> /* count number of SERVER_THREADS values */ >>>> nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) { >>>> if (nla_type(attr) == NFSD_A_SERVER_THREADS) >>>> - count++; >>>> + nrpools++; >>>> } >>>> >>>> mutex_lock(&nfsd_mutex); >>>> >>>> - nrpools = max(count, nfsd_nrpools(net)); >>>> nthreads = kcalloc(nrpools, sizeof(int), GFP_KERNEL); >>>> if (!nthreads) { >>>> ret = -ENOMEM; >>> >>> I noticed that this didn't go in to the recent merge window. >>> >>> This patch fixes a rather nasty regression when you try to start the >>> server on a NUMA-capable box. >> >> The NFSD netlink interface is not broadly used yet, is it? >> > > It is. RHEL10 shipped with it, for instance and it's been in Fedora for > a while.
RHEL 10 is shiny and new, and Fedora is bleeding edge. It's not likely either of these are deployed in production environments yet. Just sayin that in this case, the Bayesian filter leans towards waiting a full dev cycle. >> Since this one came in late during the 6.16 dev cycle and the Fixes: tag >> references a commit that is already in released kernels, I put in the >> "next merge window" pile. On it's own it doesn't look urgent to me. >> > > I'd really like to see this go in soon and to stable. If you want me to > respin the changelog, I can. It's not a crash, but it manifests as lost > RPCs that just hang. It took me quite a while to figure out what was > going on, and I'd prefer that we not put users through that. If someone can confirm that it is effective, I'll add it to nfsd-fixes. -- Chuck Lever