Hi Terry,

Following on from what Stephen mentioned, when you hit an AVX2 instruction there is a warm-up latency while the CPU powers on the upper half of the 256-bit lanes. It's normally around 10 µs, so it possibly doesn't account for everything you're seeing:
https://software.intel.com/en-us/forums/intel-isa-extensions/topic/710248

Also, with RT threads that never yield you should add nosoftlockup to your bootline to prevent the kernel assuming your thread has locked up.

Some things to look into:

1. Are you using no_hz mode on the kernel bootline?
2. Have you disabled RCU callbacks on your CPUs with rcu_nocbs on the kernel bootline?
3. Have you manually balanced IRQs to move them off your isolated CPUs?

The clear_page_erms suggests it could be memory housekeeping like zone reclaiming or transparent_hugepages. Have you disabled these?

-Richard.

> -----Original Message-----
> From: Tim Shearer [mailto:[email protected]]
> Sent: 20 April 2018 03:00
> To: [email protected]; [email protected]
> Subject: Re: [dpdk-users] Linux forcibly descheduling isolated thread
> on isolated cpu running DPDK rx under load
>
> Hi Terry,
>
> Without digging into this too much, it looks like the kernel is context
> switching out to do a clear_page call, so I wonder if one of your other
> threads is doing something memory related that's triggering this
> behaviour.
>
> Tim
> ________________________________
> From: users <[email protected]> on behalf of
> [email protected] <[email protected]>
> Sent: Thursday, April 19, 2018 11:43:32 AM
> To: [email protected]
> Subject: [dpdk-users] Linux forcibly descheduling isolated thread on
> isolated cpu running DPDK rx under load
>
> Hi there,
> I wondered if anyone had come across this particular problem regarding
> linux scheduling, or rather what appears to be a forced descheduling
> effect.
> I'm running on standard vanilla Ubuntu 17-10 using kernel
> 4.13.0-36-generic.
> Local Timer interrupts are therefore enabled....
> I'm running a dual CPU Xeon E5-2623v4 system. I have cpu 2 on the first
> NUMA node (CPU 0) isolated for DPDK receive. I have an Intel X550 card
> attached to NUMA 0.
> What I'm doing is running my DPDK receive thread on the isolated core
> (2) and changing the scheduling for this thread to SCHED_FIFO and
> priority 98.
> Most of the time this works really well. However, I'm running this DPDK
> thread inside a larger application - there are probably 40 threads
> inside this process at default priority.
> What I'm seeing is, when the application is under load, the DPDK
> receive thread is forcibly descheduled (observed with pidstat -p <PID>
> -w and seeing the non-voluntary counts spike ) and the core appears to
> go idle, sometimes for up to 1400uS.
> This is obviously a problem....
> Running "perf" to sample activity on this isolated core only, I see the
> following entries.
> 0.90% swapper [kernel.kallsyms] [k] cpu_idle_poll
> 0.60% lcore-slave-2 [kernel.kallsyms] [k] clear_page_erms
> i.e - it has gone idle and 1.5% of the processing time has gone
> elsewhere - which ties in pretty well with my ~1400uS deschedule
> observation.
> In normal operation I do not see this effect.
> I've checked the code - it appears to go idle in the middle of some
> AVX2 data processing code - there are no system calls taken, it just
> goes idle.
> Does anyone have any ideas ?
> Many thanks
> Terry
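
P.S. In case it helps with the bootline and memory-housekeeping questions at the top of this reply, here is a rough Python sketch that reports what the running kernel was actually booted with. It is only a sketch under some assumptions: I have taken "no_hz mode" to mean the nohz_full= parameter and added isolcpus to the list, the helper names are made up for illustration, and the naive whitespace split will mis-handle quoted bootline values that contain spaces.

```python
"""Sketch: report bootline and memory-housekeeping settings for an isolated core."""
from pathlib import Path

# Assumed checklist. nohz_full is one reading of "no_hz mode", and isolcpus
# is included because the thread describes the core as isolated.
WANTED = ["isolcpus", "nohz_full", "rcu_nocbs", "nosoftlockup"]


def parse_cmdline(text):
    """Split a kernel command line into {name: value}; bare flags map to ''."""
    params = {}
    for token in text.split():
        name, _, value = token.partition("=")
        params[name] = value
    return params


def missing_params(params, wanted):
    """Return the wanted parameter names that are absent from the bootline."""
    return [name for name in wanted if name not in params]


def active_choice(text):
    """Return the bracketed entry from a line such as 'always madvise [never]'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def read_text(path):
    """Read a small proc/sysfs file, returning None if it is unavailable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    params = parse_cmdline(read_text("/proc/cmdline") or "")
    print("missing bootline parameters:", missing_params(params, WANTED))

    thp = read_text("/sys/kernel/mm/transparent_hugepage/enabled")
    print("transparent hugepages:", active_choice(thp) if thp else "unavailable")

    # 0 means zone reclaim is off; None means the file could not be read.
    print("zone_reclaim_mode:", read_text("/proc/sys/vm/zone_reclaim_mode"))
```

A parameter being present only shows it was passed at boot; whether its CPU list actually covers core 2 still needs checking by eye, as does the IRQ affinity under /proc/irq/<N>/smp_affinity_list.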
