On Mon, 21 Jun 2021 at 11:39, Odin Ugedal <o...@uged.al> wrote: > > man. 21. jun. 2021 kl. 08:33 skrev Sachin Sant <sach...@linux.vnet.ibm.com>: > > > > While running LTP tests (cfs_bandwidth01) against 5.13.0-rc7 kernel on a > > powerpc box > > following warning is seen > > > > [ 6611.331827] ------------[ cut here ]------------ > > [ 6611.331855] rq->tmp_alone_branch != &rq->leaf_cfs_rq_list > > [ 6611.331862] WARNING: CPU: 8 PID: 0 at kernel/sched/fair.c:401 > > unthrottle_cfs_rq+0x4cc/0x590 > > [ 6611.331883] Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache > > netfs tun brd overlay vfat fat btrfs blake2b_generic xor zstd_compress > > raid6_pq xfs loop sctp ip6_udp_tunnel udp_tunnel libcrc32c dm_mod bonding > > rfkill sunrpc pseries_rng xts vmx_crypto sch_fq_codel ip_tables ext4 > > mbcache jbd2 sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp fuse > > [last unloaded: init_module] > > [ 6611.331957] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G OE > > 5.13.0-rc6-gcba5e97280f5 #1 > > [ 6611.331968] NIP: c0000000001b7aac LR: c0000000001b7aa8 CTR: > > c000000000722d30 > > [ 6611.331976] REGS: c00000000274f3a0 TRAP: 0700 Tainted: G OE > > (5.13.0-rc6-gcba5e97280f5) > > [ 6611.331985] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 48000224 > > XER: 00000005 > > [ 6611.332002] CFAR: c00000000014ca20 IRQMASK: 1 > > [ 6611.332002] GPR00: c0000000001b7aa8 c00000000274f640 c000000001abaf00 > > 000000000000002d > > [ 6611.332002] GPR04: 00000000ffff7fff c00000000274f300 0000000000000027 > > c000000efdb07e08 > > [ 6611.332002] GPR08: 0000000000000023 0000000000000001 0000000000000027 > > c000000001976680 > > [ 6611.332002] GPR12: 0000000000000000 c000000effc0be80 c000000ef07b3f90 > > 000000001eefe200 > > [ 6611.332002] GPR16: 0000000000000000 0000000000000000 0000000000000000 > > 0000000000000000 > > [ 6611.332002] GPR20: 0000000000000001 c000000000fa6c08 c000000000fa6030 > > 0000000000000001 > > [ 6611.332002] GPR24: 0000000000000000 0000000000000000 c000000efde12380 > > 0000000000000001 > > [ 6611.332002] GPR28: 0000000000000001 0000000000000000 c000000efde12400 > > 0000000000000000 > > [ 6611.332094] NIP [c0000000001b7aac] unthrottle_cfs_rq+0x4cc/0x590 > > [ 6611.332104] LR [c0000000001b7aa8] unthrottle_cfs_rq+0x4c8/0x590 > > [ 6611.332113] Call Trace: > > [ 6611.332116] [c00000000274f640] [c0000000001b7aa8] > > unthrottle_cfs_rq+0x4c8/0x590 (unreliable) > > [ 6611.332128] [c00000000274f6e0] [c0000000001b7e38] > > distribute_cfs_runtime+0x1d8/0x280 > > [ 6611.332139] [c00000000274f7b0] [c0000000001b81d0] > > sched_cfs_period_timer+0x140/0x330 > > [ 6611.332149] [c00000000274f870] [c00000000022a03c] > > __hrtimer_run_queues+0x17c/0x380 > > [ 6611.332158] [c00000000274f8f0] [c00000000022ac68] > > hrtimer_interrupt+0x128/0x2f0 > > [ 6611.332168] [c00000000274f9a0] [c00000000002940c] > > timer_interrupt+0x13c/0x370 > > [ 6611.332179] [c00000000274fa00] [c000000000009c04] > > decrementer_common_virt+0x1a4/0x1b0 > > [ 6611.332189] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x24 > > [ 6611.332199] NIP: c0000000000f6af8 LR: c000000000a05f68 CTR: > > 0000000000000000 > > [ 6611.332206] REGS: c00000000274fa70 TRAP: 0900 Tainted: G OE > > (5.13.0-rc6-gcba5e97280f5) > > [ 6611.332214] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> > > CR: 28000224 XER: 00000000 > > [ 6611.332234] CFAR: 0000000000000c00 IRQMASK: 0 > > [ 6611.332234] GPR00: 0000000000000000 c00000000274fd10 c000000001abaf00 > > 0000000000000000 > > [ 6611.332234] GPR04: 00000000000000c0 0000000000000080 0001a91c68b80fa1 > > 00000000000003dc > > [ 6611.332234] GPR08: 000000000001f400 0000000000000001 0000000000000000 > > 0000000000000000 > > [ 6611.332234] GPR12: 0000000000000000 c000000effc0be80 c000000ef07b3f90 > > 000000001eefe200 > > [ 6611.332234] GPR16: 0000000000000000 0000000000000000 0000000000000000 > > 0000000000000000 > > [ 6611.332234] GPR20: 0000000000000001 0000000000000002 0000000000000010 > > c0000000019fe2f8 > > [ 6611.332234] GPR24: 0000000000000001 00000603517d757e 0000000000000000 > > 0000000000000000 > > [ 6611.332234] GPR28: 0000000000000001 0000000000000000 c000000001231f90 > > c000000001231f98 > > [ 6611.332323] NIP [c0000000000f6af8] plpar_hcall_norets_notrace+0x18/0x24 > > [ 6611.332332] LR [c000000000a05f68] check_and_cede_processor+0x48/0x60 > > [ 6611.332340] --- interrupt: 900 > > [ 6611.332345] [c00000000274fd10] [c000000efdb92380] 0xc000000efdb92380 > > (unreliable) > > [ 6611.332355] [c00000000274fd70] [c000000000a063bc] > > dedicated_cede_loop+0x9c/0x1b0 > > [ 6611.332364] [c00000000274fdc0] [c000000000a02b04] > > cpuidle_enter_state+0x2e4/0x4e0 > > [ 6611.332375] [c00000000274fe20] [c000000000a02da0] cpuidle_enter+0x50/0x70 > > [ 6611.332385] [c00000000274fe60] [c0000000001a883c] call_cpuidle+0x4c/0x80 > > [ 6611.332393] [c00000000274fe80] [c0000000001a8ee0] do_idle+0x380/0x3e0 > > [ 6611.332402] [c00000000274ff00] [c0000000001a91bc] > > cpu_startup_entry+0x3c/0x40 > > [ 6611.332411] [c00000000274ff30] [c000000000063ff8] > > start_secondary+0x298/0x2b0 > > [ 6611.332421] [c00000000274ff90] [c00000000000c754] > > start_secondary_prolog+0x10/0x14 > > [ 6611.332430] Instruction dump: > > [ 6611.332435] 4bfffc44 3d22fff6 8929f328 2f890000 409efea4 39200001 > > 3d42fff6 3c62ff4f > > [ 6611.332451] 3863bcd8 992af328 4bf94f15 60000000 <0fe00000> 4bfffe80 > > 7f6407b4 7f43d378 > > [ 6611.332466] ---[ end trace 1346f865cd1cae91 ]— > > > > 5.13.0-rc6 was good. Bisect points to following patch > > > > commit a7b359fc6a37 > > sched/fair: Correctly insert cfs_rq's to list on unthrottle > > > > The test runs to completion(without this warning) if the patch is reverted. > > > > Thanks > > -Sachin > > > > Hi, > > Thanks for the report! I have a theory about what is possibly causing > this, so I will try to reproduce it and see if my assumptions are > correct.
This means that a child's load was not null and it was inserted whereas parent's load was null. This should not happen unless the propagation failed somewhere > > > Odin