On Sat, 16 May 2020 at 23:54, David Miller <da...@davemloft.net> wrote:
>
> From: Xulin Sun <xulin....@windriver.com>
> Date: Fri, 15 May 2020 11:18:13 +0800
>
> > BUG: sleeping function called from invalid context at
> > drivers/net/ethernet/mscc/ocelot.c:59
> > in_atomic(): 1, irqs_disabled(): 0, pid: 3778, name: ifconfig
> > INFO: lockdep is turned off.
> > Preemption disabled at:
> > [<ffff2b163c83b78c>] dev_set_rx_mode+0x24/0x40
> > Hardware name: LS1028A RDB Board (DT)
> > Call trace:
> > dump_backtrace+0x0/0x140
> > show_stack+0x24/0x30
> > dump_stack+0xc4/0x10c
> > ___might_sleep+0x194/0x230
> > __might_sleep+0x58/0x90
> > ocelot_mact_forget+0x74/0xf8
> > ocelot_mc_unsync+0x2c/0x38
> > __hw_addr_sync_dev+0x6c/0x130
> > ocelot_set_rx_mode+0x8c/0xa0
>
> Vladimir states that this call chain is not possible in mainline.
>
> I'm not applying this.

(The essence of the problem is legitimate, though.)

There are 2 specific things I don't like:

- The problem is claimed to reproduce on "LS1028A RDB Board (DT)",
  which does not call ocelot_set_rx_mode. So the patch claims to fix a
  problem for which only Xulin is in a position to decide whether it
  is the right solution or not.
- On ocelot, it _looks_ like this is indeed a problem, introduced in
  639c1b2625af ("net: mscc: ocelot: Register poll timeout should be
  wall time not attempts"). But there was no attempt to bring it up
  with the author of that patch, who very clearly expressed that he is
  working on hardware where the polling timeout is on the order of
  milliseconds, and the timeout for the driver is currently set at
  100 ms. I'm not sure that it is desirable to spin in atomic context
  for 100 ms.

Thanks,
-Vladimir