Hi Frederic.

On 1/16/26 8:27 PM, Frederic Weisbecker wrote:
> I forgot to mention I haven't yet tested CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
> (s390 and powerpc).

Thanks.


tl;dr

I ran this on PowerNV (non-virtualized) with 144 CPUs, using the config below
(default options).
The patch *breaks* the CPU idle stats most of the time: the idle values are wrong.


Detailed info:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

In the config I have this:
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
# CONFIG_BSD_PROCESS_ACCT is not set

+++++++++

When the system is fully idle, I see this:

06:44:26 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
06:44:27 AM  all    0.01    0.00    0.01    0.00   57.20    0.00    0.00    0.00    0.00   42.79
06:44:28 AM  all    0.02    0.00    0.03    0.00   55.73    0.00    0.00    0.00    0.00   44.22
06:44:29 AM  all    0.01    0.00    0.00    0.00   56.23    0.00    0.00    0.00    0.00   43.77

- Seeing 50%+ in irq time, which is clearly wrong.
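
For reference, the percentages above are derived from deltas of the aggregate
"cpu" line in /proc/stat. Below is a minimal Python sketch of that computation,
assuming the standard field order (user nice system idle iowait irq softirq
steal). The helper names are hypothetical, and it is simplified compared to
what mpstat does (for example, it ignores the guest columns).

```python
# Sketch only: helper names are hypothetical, not taken from sysstat or the kernel.
# Assumes the aggregate "cpu" line of /proc/stat with its first eight counters.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line into a dict of cumulative tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, (int(v) for v in parts[1:1 + len(FIELDS)])))


def read_cpu_counters(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


def percentages(before, after):
    """Turn two counter snapshots into per-field percentages of elapsed ticks."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        # No ticks elapsed between the snapshots; avoid dividing by zero.
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}
```

Taking two read_cpu_counters() snapshots about one second apart and passing
them to percentages() should give numbers comparable to one mpstat row.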

+++++++++
When running stress-ng --cpu=72 (expectation is 50% idle time)
06:48:12 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
06:48:13 AM  all   49.98    0.00    0.01    0.00   15.81    0.00    0.00    0.00    0.00   34.20
06:48:14 AM  all   49.93    0.00    0.00    0.00   15.15    0.00    0.00    0.00    0.00   34.91
06:48:15 AM  all   49.99    0.00    0.01    0.00   15.29    0.00    0.00    0.00    0.00   34.72

- Wrong values again. 50% is expected idle time.
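
As a quick sanity check on the stress-ng numbers above, here is a small Python
sketch (plain arithmetic on the reported values, not a diagnosis; the names are
hypothetical). With 72 of 144 CPUs busy the expected idle share is 50%, and in
each sample %idle plus %irq comes out close to that, which would be consistent
with idle time being reported as irq time.

```python
# Sketch only: arithmetic on the mpstat rows quoted above; names are hypothetical.
TOTAL_CPUS = 144
BUSY_CPUS = 72  # stress-ng --cpu=72

# (%idle, %irq) pairs copied from the three stress-ng samples.
STRESS_SAMPLES = [(34.20, 15.81), (34.91, 15.15), (34.72, 15.29)]


def expected_idle_pct(total_cpus, busy_cpus):
    """Idle share expected when busy_cpus of total_cpus are fully loaded."""
    return 100.0 * (total_cpus - busy_cpus) / total_cpus


def idle_plus_irq(samples):
    """Sum %idle and %irq for each sample row."""
    return [idle + irq for idle, irq in samples]


expected = expected_idle_pct(TOTAL_CPUS, BUSY_CPUS)
for (idle, irq), combined in zip(STRESS_SAMPLES, idle_plus_irq(STRESS_SAMPLES)):
    print(f"idle={idle:.2f} irq={irq:.2f} idle+irq={combined:.2f} expected={expected:.2f}")
```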

+++++++++
The system is idle again.
06:48:46 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
06:48:47 AM  all    0.00    0.00    0.00    0.00   63.93    0.00    0.00    0.00    0.00   36.07
06:48:48 AM  all    0.02    0.00    0.00    0.00   63.78    0.01    0.00    0.00    0.00   36.18
06:48:49 AM  all    0.00    0.00    0.00    0.00   63.77    0.00    0.00    0.00    0.00   36.23

- Wrong values again. irq increased further.

+++++++++

I have also seen the warnings below.
WARNING: kernel/time/tick-sched.c:1353 at tick_nohz_idle_exit
[    T0] WARNING: kernel/time/tick-sched.c:1353 at tick_nohz_idle_exit+0x148/0x150, CPU#4: swapper/4/0
[    T0] Modules linked in: vmx_crypto gf128mul
[    T0] CPU: 4 UID: 0 PID: 0 Comm: swapper/4 Tainted: G        W           6.19.0-rc5-00683-gbe7e8f3d5116 #61 PREEMPT(full)
[    T0] Tainted: [W]=WARN
[    T0] Hardware name: 0000000000000000 POWER9 0x4e1202 opal:v7.1 PowerNV
[    T0] NIP [c0000000002c8210] tick_nohz_idle_exit+0x148/0x150
[    T0] LR [c00000000022f10c] do_idle+0x1dc/0x328


WARNING: kernel/time/tick-sched.c:1274 at tick_nohz_get_sleep_length
[    T0] NIP [c0000000002c7fc0] tick_nohz_get_sleep_length+0x108/0x110
[    T0] LR [c000000000ca1548] menu_select+0x3c0/0x7b4
[    T0] Call Trace:
[    T0] [c000000003197e10] [c000000003197e50] 0xc000000003197e50 (unreliable)
[    T0] [c000000003197e50] [c000000000ca1548] menu_select+0x3c0/0x7b4
[    T0] [c000000003197ed0] [c000000000c9f120] cpuidle_select+0x34/0x48
[    T0] [c000000003197ef0] [c00000000022f184] do_idle+0x254/0x328


+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

I went back to baseline to confirm the original behaviour.
(d613f96096e4) Merge timers/vdso into tip/master

07:02:17 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
07:02:18 AM  all    0.01    0.00    0.01    0.01    1.19    0.00    0.00    0.00    0.00   98.77
07:02:19 AM  all    0.01    0.00    0.01    0.00    0.84    0.00    0.00    0.00    0.00   99.14
07:02:20 AM  all    0.00    0.00    0.01    0.00    0.99    0.00    0.00    0.00    0.00   99.00
07:02:21 AM  all    0.01    0.00    0.00    0.00    0.83    0.00    0.00    0.00    0.00   99.16

This is working as expected.



PS: Initial data. I haven't gone through the series yet.
