Hi,

> On Nov 22, 2017, at 1:34 AM, Jorgen Lundman <[email protected]> wrote:

…
> Then from time to time it goes crazy: load goes over 50, nfsd threads
> drop to about 120, and all NFS clients spew messages regarding NR_BAD_SEQID and
> NFS4ERR_STALE.

Jorgen, could you decrease the number of NFS server threads to 256 and check 
the behaviour again?
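For reference, a sketch of how the nfsd thread cap can be changed on illumos via the NFS server properties (assuming the stock sharectl(1M)/svcadm(1M) interfaces; confirm on your system):

```shell
# Show the current NFS server settings, including the
# "servers" property (maximum number of nfsd threads).
sharectl get nfs

# Cap the maximum number of nfsd worker threads at 256.
sharectl set -p servers=256 nfs

# Restart the NFS server so the new cap is applied.
svcadm restart svc:/network/nfs/server:default
```

Restarting the service will briefly interrupt clients, so this is best done in a maintenance window.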


> Sometimes it recovers, sometimes it reboots. It has been armed with dump
> now, in case it crashes again.
> 

Please look at the fmdump output. It should show stack traces even if you didn't 
save a crash dump.
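For example (a sketch; see fmdump(1M) for the exact options on your release):

```shell
# List fault events recorded by the fault manager.
fmdump

# Verbose dump of the error log, which can include panic and
# stack information captured even when no crash dump was saved.
fmdump -eV
```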

Garrett, can id_alloc() failure or arena exhaustion lead to a reboot/crash by 
design?
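One way to check the arena state is with the mdb(1) kernel debugger; a sketch using the stock dcmds:

```shell
# Summarize all vmem arenas: in-use vs. total, plus succeeded
# and failed allocation counts per arena. A non-zero FAIL
# column on the id arena would point at exhaustion.
echo "::vmem" | mdb -k

# Kernel memory allocator statistics, for comparison.
echo "::kmastat" | mdb -k
```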
There are also two NUMA nodes, which can slow down mutex handling if the 
processes performing mutex lock/unlock are spread across both nodes. Is there a 
way to bind a process and its allocations to a single NUMA node?
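On illumos the NUMA abstraction is locality groups (lgroups); a sketch of pinning a process to one node, assuming the stock lgrpinfo(1) and plgrp(1) tools (the lgroup id and pid below are placeholders):

```shell
# Show the lgroup (NUMA node) hierarchy and which CPUs and
# memory belong to each node.
lgrpinfo

# Show the home lgroup and affinities of a process.
plgrp <pid>

# Set a strong affinity for lgroup 1 on all LWPs of <pid>;
# with the default next-touch placement policy this also
# steers the process's memory allocations toward that node.
plgrp -A 1/strong <pid>
```

pbind(1M) can additionally bind the process to specific CPUs if lgroup affinity alone is not enough.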

———
Vitaliy Gusev


> On 22 Nov 2017, at 19:31, Garrett D'Amore <[email protected]> wrote:
> 
> I was going to say the same thing. I suspect that the id space is exhausted 
> or nearly so. The code is apparently spending a lot of time in cv_wait, and 
> that should not happen unless the arena is exhausted. 
> 
> The load profile for 48 cores is such that if every core has a runnable 
> thread, the load should sit around 48. A perfectly utilized system will have 
> load average == cores. 
> 
> An outstanding question might be why there are so many runnable threads, but 
> note that many software systems try to scale worker thread counts to match 
> core count. (This is not always a good thing for performance, but it is common 
> practice nevertheless.)
> On Wed, Nov 22, 2017 at 7:38 AM Pavel Zakharov <[email protected]> wrote:
> Hi Jorgen,
> 
> I took a quick look at your flamegraph and at the code, and what we are seeing 
> here looks like lock contention rather than a memory issue.
> 
> It seems like the problem is in id_alloc(), which uses the vmem framework to 
> allocate unique ids.
> In particular, vmem_nextfit_alloc() is the one responsible for the slowness, 
> as its operation is single-threaded.
> I’m somewhat confused by its implementation, but my hunch is that it doesn’t 
> scale well to 48 CPUs.
> 
> It would be interesting to see what the vmem arena backing that space_id_t 
> resource looks like.
> 
> Regards,
> Pavel

------------------------------------------
illumos-discuss
Archives: 
https://illumos.topicbox.com/groups/discuss/discussions/T1f149f6156a80f52-M51c0d29df022c901899b816a