On 26/02/2020 05:40, Amol Grover wrote:
On Tue, Feb 25, 2020 at 05:47:54PM +0000, John Garry wrote:
Hi Corey,

I was just wondering are all the changes to fix this issue applied yet?

I could not help but notice this change:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/char/ipmi?h=next-20200225&id=ef0129a4d06a669a0a4d49654d2130ff4343a262

I tested this on top of 5.6-rc3, and I still a warn of suspicious RCU usage
from the IPMI driver.

Thanks,
John


I'm booting an allmodconfig kernel on my arm64 system from recent Linus'
master branch, and I see this:

[  188.241001][    T1] IPMI message handler: version 39.2
[  188.247331][    T1] ipmi device interface
[  188.256271][    T1]
[  188.258541][    T1] =============================
[  188.263388][    T1] WARNING: suspicious RCU usage
[  188.268236][    T1] 5.4.0+ #1416 Not tainted
[  188.272645][    T1] -----------------------------
[  188.277496][    T1] drivers/char/ipmi/ipmi_msghandler.c:744 RCU-list
traversed in non-reader section!!

Hi John,

Hi Amol,


Are you referring to this suspicious RCU usage warning that you're
getting on 5.6-rc3?

Yes, I see these warnings on 5.6-rc3; they look like the same issues I saw on earlier kernel versions:

Is this the only warning that you're getting?

I see 2x warnings on 5.6-rc3:


[   29.772408] =============================
[   29.776863] WARNING: suspicious RCU usage
[   29.780915] 5.6.0-rc3-00001-g907305ae6618-dirty #1755 Not tainted
[   29.787046] -----------------------------
[ 29.791100] drivers/char/ipmi/ipmi_msghandler.c:744 RCU-list traversed in non-reader section!!
[   29.799753]
other info that might help us debug this:

[   29.807801]
rcu_scheduler_active = 2, debug_locks = 1
[   29.814372] 2 locks held by swapper/0/1:
[ 29.818331] #0: ffffa00013e1ae60 (smi_watchers_mutex){+.+.}, at: ipmi_smi_watcher_register+0x64/0x220 [ 29.827696] #1: ffffa000190fd7a8 (&ipmi_interfaces_srcu){....}, at: ipmi_smi_watcher_register+0x80/0x220
[   29.837314]
stack backtrace:
[ 29.841712] CPU: 24 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc3-00001-g907305ae6618-dirty #1755 [ 29.850495] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[   29.859016] Call trace:
[   29.861463]  dump_backtrace+0x0/0x298
[   29.865124]  show_stack+0x14/0x20
[   29.868441]  dump_stack+0x118/0x190
[   29.871931]  lockdep_rcu_suspicious+0xe0/0x120
[   29.876373]  ipmi_smi_watcher_register+0x1e4/0x220
[   29.881166]  init_ipmi_devintf+0xbc/0x108
[   29.885174]  do_one_initcall+0xf0/0x608
[   29.889011]  kernel_init_freeable+0x31c/0x384
[   29.893367]  kernel_init+0x10/0x118
[   29.896854]  ret_from_fork+0x10/0x18



[   30.212615] =============================
[   30.216710] WARNING: suspicious RCU usage
[   30.220785] 5.6.0-rc3-00001-g907305ae6618-dirty #1755 Not tainted
[   30.226923] -----------------------------
[ 30.230972] drivers/char/ipmi/ipmi_msghandler.c:3453 RCU-list traversed in non-reader section!!
[   30.239706]
other info that might help us debug this:

[   30.247747]
rcu_scheduler_active = 2, debug_locks = 1
[   30.254312] 2 locks held by swapper/0/1:
[ 30.258276] #0: ffffa00013e1e960 (smi_infos_lock){+.+.}, at: init_ipmi_si+0x64/0x1dc [ 30.266172] #1: ffffa00013e1adc0 (ipmi_interfaces_mutex){+.+.}, at: ipmi_add_smi+0x5ac/0xa30
[   30.274758]
stack backtrace:
[ 30.279155] CPU: 25 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc3-00001-g907305ae6618-dirty #1755 [ 30.287938] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019
[   30.296459] Call trace:
[   30.298907]  dump_backtrace+0x0/0x298
[   30.302568]  show_stack+0x14/0x20
[   30.305884]  dump_stack+0x118/0x190
[   30.309372]  lockdep_rcu_suspicious+0xe0/0x120
[   30.313815]  ipmi_add_smi+0x85c/0xa30
[   30.317477]  try_smi_init+0x9b4/0xd78
[   30.321138]  init_ipmi_si+0x98/0x1dc
[   30.324713]  do_one_initcall+0xf0/0x608
[   30.328549]  kernel_init_freeable+0x31c/0x384
[   30.332906]  kernel_init+0x10/0x118
[   30.336393]  ret_from_fork+0x10/0x18


I have put a full kernel log here:
https://pastebin.com/7yBzBt1y

Patch "drivers: char: ipmi: ipmi_msghandler: Pass lockdep expression to RCU lists" looks to have eliminated the 2nd warning (drivers/char/ipmi/ipmi_msghandler.c:3453 RCU-list traversed in non-reader section!!), above.

I can
see a few usages of list_for_each_entry_rcu() under SRCU read lock
which, if I'm not mistaken, might all throw these warnings as no
srcu_read_lock_held() is used. I'll be happy to take a look at them. Let
me know.


Thanks,
John

Thanks
Amol

[  188.286971][    T1]
[  188.286971][    T1] other info that might help us debug this:
[  188.286971][    T1]
[  188.297235][    T1]
[  188.297235][    T1] rcu_scheduler_active = 2, debug_locks = 1
[  188.305314][    T1] 2 locks held by swapper/0/1:
[  188.310072][    T1]  #0: ffffffd0149687f0 (smi_watchers_mutex){+.+.},
at: ipmi_smi_watcher_register+0x5c/0x248
[  188.320295][    T1]  #1: ffffffd016c9aff8
(&ipmi_interfaces_srcu){....}, at: ipmi_smi_watcher_register+0xa8/0x248
[  188.330771][    T1]
[  188.330771][    T1] stack backtrace:
[  188.336671][    T1] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.4.0+
#1416
[  188.343604][    T1] Hardware name: Huawei Taishan 2280 /D05, BIOS
Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
[  188.353506][    T1] Call trace:
[  188.356688][    T1]  dump_backtrace+0x0/0x3a0
[  188.361089][    T1]  show_stack+0x28/0x38
[  188.365142][    T1]  dump_stack+0x1ac/0x23c
[  188.369371][    T1]  lockdep_rcu_suspicious+0x100/0x118
[  188.374646][    T1]  ipmi_smi_watcher_register+0x134/0x248
[  188.380185][    T1]  init_ipmi_devintf+0xec/0x140
[  188.384936][    T1]  do_one_initcall+0x2f8/0x710
[  188.389600][    T1]  kernel_init_freeable+0x304/0x3dc
[  188.394701][    T1]  kernel_init+0x20/0x180
[  188.398928][    T1]  ret_from_fork+0x10/0x18
[  188.403333][    T1] ipmi_si: IPMI System Interface driver
[  188.411646][    T1] ipmi_si hisi-lpc-ipmi.5.auto: ipmi_platform:
probing via ACPI
[  188.421958][    T1] ipmi_si hisi-lpc-ipmi.5.auto: IRQ index 0 not found
[  188.428737][    T1] ipmi_si hisi-lpc-ipmi.5.auto: ipmi_platform: [io
0xffc0e3-0xffc0e7] regsize 1 spacing 1 irq 0
[  188.439386][    T1] ipmi_si: Adding ACPI-specified bt state machine
[  188.448341][    T1] kobject: 'wakeup' ((____ptrval____)):
kobject_release, parent (____ptrval____) (delayed 1000)
[  188.458824][    T1] kobject: 'wakeup20' ((____ptrval____)):
kobject_release, parent (____ptrval____) (delayed 250)
[  188.471778][    T1] ipmi_si hisi-lpc-ipmi.5.auto: ipmi_platform:
probing via ACPI
[  188.481621][    T1] ipmi_si hisi-lpc-ipmi.5.auto: IRQ index 0 not found
[  188.488390][    T1] ipmi_si hisi-lpc-ipmi.5.auto: ipmi_platform: [io
0xffc0e3-0xffc0e7] regsize 1 spacing 1 irq 0
[  188.498936][    T1] ipmi_si: Adding ACPI-specified bt state machine
[  188.507082][    T1] ipmi_si: Trying ACPI-specified bt state machine
at i/o address 0xffc0e3, slave address 0x0, irq 0
[  188.524402][    T1] ipmi_si hisi-lpc-ipmi.5.auto: bt cap response too
short: 3
[  188.531809][    T1] ipmi_si hisi-lpc-ipmi.5.auto: using default values
[  188.538483][    T1] ipmi_si hisi-lpc-ipmi.5.auto: req2rsp=5 secs
retries=2
[  188.572794][    T1]
[  188.575080][    T1] =============================
[  188.579930][    T1] WARNING: suspicious RCU usage
[  188.584797][    T1] 5.4.0+ #1416 Not tainted
[  188.589206][    T1] -----------------------------
[  188.594055][    T1] drivers/char/ipmi/ipmi_msghandler.c:3453 RCU-list
traversed in non-reader section!!
[  188.603617][    T1]
[  188.603617][    T1] other info that might help us debug this:
[  188.603617][    T1]
[  188.613884][    T1]
[  188.613884][    T1] rcu_scheduler_active = 2, debug_locks = 1
[  188.621945][    T1] 2 locks held by swapper/0/1:
[  188.626704][    T1]  #0: ffffffd01496c8f0 (smi_infos_lock){+.+.}, at:
init_ipmi_si+0x78/0x24c
[  188.635436][    T1]  #1: ffffffd0149689b0
(ipmi_interfaces_mutex){+.+.}, at: ipmi_add_smi+0x5e4/0xae0
[  188.644864][    T1]
[  188.644864][    T1] stack backtrace:
[  188.650765][    T1] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.4.0+
#1416
[  188.657698][    T1] Hardware name: Huawei Taishan 2280 /D05, BIOS
Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
[  188.667598][    T1] Call trace:
[  188.670777][    T1]  dump_backtrace+0x0/0x3a0
[  188.675178][    T1]  show_stack+0x28/0x38
[  188.679230][    T1]  dump_stack+0x1ac/0x23c
[  188.683458][    T1]  lockdep_rcu_suspicious+0x100/0x118
[  188.688726][    T1]  ipmi_add_smi+0x630/0xae0
[  188.688745][    T1]  try_smi_init+0xdbc/0xf18
[  188.688759][    T1]  init_ipmi_si+0x14c/0x24c
[  188.688773][    T1]  do_one_initcall+0x2f8/0x710
[  188.688787][    T1]  kernel_init_freeable+0x304/0x3dc
[  188.688802][    T1]  kernel_init+0x20/0x180
[  188.688815][    T1]  ret_from_fork+0x10/0x18
[  188.731333][    T1] ipmi_si hisi-lpc-ipmi.5.auto: IPMI message
handler: Found new BMC (man_id: 0x0007db, prod_id: 0x0001, dev_id: 0x01)
[  188.935301][    T1] ipmi_si hisi-lpc-ipmi.5.auto: IPMI bt interface
initialized
[  188.942797][    T1] ipmi_ssif: IPMI SSIF Interface driver

In the code we seem to be using srcu read lock instead of rcu read lock
for list iteration protection, but that's a quick analysis. And there is
also a mutex locked in ipmi_smi_watcher_register() [didn't check the
other case], so I wonder if rcu list iterator variant is required at
all.

Apologies if this has been reported previously. I see that Anders
(cc'ed) has been reporting some of these sorts of issues.

Thanks,
John




_______________________________________________
Openipmi-developer mailing list
Openipmi-developer@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openipmi-developer

Reply via email to