Re: [PATCH] brcmfmac: stop watchdog before detach and free everything
Hi Andy The problem seems really easy to solve: diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c index 412a05b..ba60b151 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c @@ -4227,13 +4227,7 @@ struct brcmf_sdio *brcmf_sdio_probe(struct brcmf_sdio_dev *sdiodev) timer_setup(>timer, brcmf_sdio_watchdog, 0); /* Initialize watchdog thread */ init_completion(>watchdog_wait); - bus->watchdog_tsk = kthread_run(brcmf_sdio_watchdog_thread, - bus, "brcmf_wdog/%s", - dev_name(>func1->dev)); - if (IS_ERR(bus->watchdog_tsk)) { - pr_warn("brcmf_watchdog thread failed to start\n"); - bus->watchdog_tsk = NULL; - } + /* Initialize DPC thread */ bus->dpc_triggered = false; bus->dpc_running = false; @@ -4281,6 +4275,14 @@ struct brcmf_sdio *brcmf_sdio_probe(struct brcmf_sdio_dev *sdiodev) goto fail; } + bus->watchdog_tsk = kthread_run(brcmf_sdio_watchdog_thread, + bus, "brcmf_wdog/%s", + dev_name(>func1->dev)); + if (IS_ERR(bus->watchdog_tsk)) { + pr_warn("brcmf_watchdog thread failed to start\n"); + bus->watchdog_tsk = NULL; + } + return bus; Just look here ret = brcmf_fw_get_firmwares(sdiodev->dev, fwreq, brcmf_sdio_firmware_callback); if (ret != 0) { brcmf_err("async firmware request failed: %d\n", ret); kfree(fwreq); goto fail; } bus->watchdog_tsk = kthread_run(brcmf_sdio_watchdog_thread, bus, "brcmf_wdog/%s", dev_name(>func1->dev)); if (IS_ERR(bus->watchdog_tsk)) { pr_warn("brcmf_watchdog thread failed to start\n"); bus->watchdog_tsk = NULL; } return bus; fail: brcmf_sdio_remove(bus); return NULL; On Mon, May 28, 2018 at 5:29 PM, Michael Nazzareno Trimarchiwrote: > Hi > > On Mon, May 28, 2018 at 5:25 PM, Andy Shevchenko > wrote: >> On Mon, May 28, 2018 at 12:54 PM, Michael Nazzareno Trimarchi >> wrote: >>> Hi Arend >>> >>> On Mon, May 28, 2018 at 11:51 AM, Arend van Spriel >>> wrote: On 5/28/2018 9:50 AM, Michael Trimarchi wrote: > > Watchdog need to be stopped in brcmf_sdio_remove to avoid > i > The system is going down NOW! > [ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual > address 02f8 > Sent SIGTERM to all processes > [ 1348.121412] Mem abort info: > [ 1348.126962] ESR = 0x9604 > [ 1348.130023] Exception class = DABT (current EL), IL = 32 bits > [ 1348.135948] SET = 0, FnV = 0 > [ 1348.138997] EA = 0, S1PTW = 0 > [ 1348.142154] Data abort info: > [ 1348.145045] ISV = 0, ISS = 0x0004 > [ 1348.148884] CM = 0, WnR = 0 > [ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval) > [ 1348.158475] [02f8] pgd= > [ 1348.163364] Internal error: Oops: 9604 [#1] PREEMPT SMP > [ 1348.168927] Modules linked in: ipv6 > [ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted > 4.17.0-rc5-next-20180517 #18 > [ 1348.180757] Hardware name: Amarula A64-Relic (DT) > [ 1348.185455] pstate: 6005 (nZCv daif -PAN -UAO) > [ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20 > [ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290 Hi Michael, Thanks for the patch. In normal scenario the callstack looks like this: brcmf_sdio_remove() -> brcmf_detach() -> brcmf_bus_stop() -> brcmf_sdio_bus_stop() In brcmf_sdio_bus_stop() the watchdog is terminated. So in what scenario did you encounter this null pointer deref? >>> >>> Is this happen even when there is not wifi firmware? >>> boot without any firmware in the filesystem and then trigger a reboot >> >> Something like the above I had noticed for a long (couple of kernel >> releases?) time, but wasn't a big priority to me. >> Though, I can test this on my side. >> >> P.S. I think rmmod or echo > unbind will trigger that as well. >> > > Right now the module is compiled in the kernel. I can dig down tonight > on this if needed > > Michael > >> -- >> With Best Regards, >> Andy Shevchenko > > > > -- > | Michael Nazzareno Trimarchi Amarula Solutions BV | > | COO - Founder
Re: [PATCH] brcmfmac: stop watchdog before detach and free everything
Hi On Mon, May 28, 2018 at 5:25 PM, Andy Shevchenkowrote: > On Mon, May 28, 2018 at 12:54 PM, Michael Nazzareno Trimarchi > wrote: >> Hi Arend >> >> On Mon, May 28, 2018 at 11:51 AM, Arend van Spriel >> wrote: >>> On 5/28/2018 9:50 AM, Michael Trimarchi wrote: Watchdog need to be stopped in brcmf_sdio_remove to avoid i The system is going down NOW! [ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual address 02f8 Sent SIGTERM to all processes [ 1348.121412] Mem abort info: [ 1348.126962] ESR = 0x9604 [ 1348.130023] Exception class = DABT (current EL), IL = 32 bits [ 1348.135948] SET = 0, FnV = 0 [ 1348.138997] EA = 0, S1PTW = 0 [ 1348.142154] Data abort info: [ 1348.145045] ISV = 0, ISS = 0x0004 [ 1348.148884] CM = 0, WnR = 0 [ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval) [ 1348.158475] [02f8] pgd= [ 1348.163364] Internal error: Oops: 9604 [#1] PREEMPT SMP [ 1348.168927] Modules linked in: ipv6 [ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted 4.17.0-rc5-next-20180517 #18 [ 1348.180757] Hardware name: Amarula A64-Relic (DT) [ 1348.185455] pstate: 6005 (nZCv daif -PAN -UAO) [ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20 [ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290 >>> >>> >>> Hi Michael, >>> >>> Thanks for the patch. In normal scenario the callstack looks like this: >>> >>> brcmf_sdio_remove() >>> -> brcmf_detach() >>> -> brcmf_bus_stop() >>> -> brcmf_sdio_bus_stop() >>> >>> In brcmf_sdio_bus_stop() the watchdog is terminated. So in what scenario did >>> you encounter this null pointer deref? >> >> Is this happen even when there is not wifi firmware? >> boot without any firmware in the filesystem and then trigger a reboot > > Something like the above I had noticed for a long (couple of kernel > releases?) time, but wasn't a big priority to me. > Though, I can test this on my side. > > P.S. I think rmmod or echo > unbind will trigger that as well. > Right now the module is compiled in the kernel. I can dig down tonight on this if needed Michael > -- > With Best Regards, > Andy Shevchenko -- | Michael Nazzareno Trimarchi Amarula Solutions BV | | COO - Founder Cruquiuskade 47 | | +31(0)851119172 Amsterdam 1018 AM NL | | [`as] http://www.amarulasolutions.com |
Re: [PATCH] brcmfmac: stop watchdog before detach and free everything
On Mon, May 28, 2018 at 12:54 PM, Michael Nazzareno Trimarchiwrote: > Hi Arend > > On Mon, May 28, 2018 at 11:51 AM, Arend van Spriel > wrote: >> On 5/28/2018 9:50 AM, Michael Trimarchi wrote: >>> >>> Watchdog need to be stopped in brcmf_sdio_remove to avoid >>> i >>> The system is going down NOW! >>> [ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual >>> address 02f8 >>> Sent SIGTERM to all processes >>> [ 1348.121412] Mem abort info: >>> [ 1348.126962] ESR = 0x9604 >>> [ 1348.130023] Exception class = DABT (current EL), IL = 32 bits >>> [ 1348.135948] SET = 0, FnV = 0 >>> [ 1348.138997] EA = 0, S1PTW = 0 >>> [ 1348.142154] Data abort info: >>> [ 1348.145045] ISV = 0, ISS = 0x0004 >>> [ 1348.148884] CM = 0, WnR = 0 >>> [ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval) >>> [ 1348.158475] [02f8] pgd= >>> [ 1348.163364] Internal error: Oops: 9604 [#1] PREEMPT SMP >>> [ 1348.168927] Modules linked in: ipv6 >>> [ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted >>> 4.17.0-rc5-next-20180517 #18 >>> [ 1348.180757] Hardware name: Amarula A64-Relic (DT) >>> [ 1348.185455] pstate: 6005 (nZCv daif -PAN -UAO) >>> [ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20 >>> [ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290 >> >> >> Hi Michael, >> >> Thanks for the patch. In normal scenario the callstack looks like this: >> >> brcmf_sdio_remove() >> -> brcmf_detach() >> -> brcmf_bus_stop() >> -> brcmf_sdio_bus_stop() >> >> In brcmf_sdio_bus_stop() the watchdog is terminated. So in what scenario did >> you encounter this null pointer deref? > > Is this happen even when there is not wifi firmware? > boot without any firmware in the filesystem and then trigger a reboot Something like the above I had noticed for a long (couple of kernel releases?) time, but wasn't a big priority to me. Though, I can test this on my side. P.S. I think rmmod or echo > unbind will trigger that as well. -- With Best Regards, Andy Shevchenko
Re: [PATCH] brcmfmac: stop watchdog before detach and free everything
Hi Arend On Mon, May 28, 2018 at 11:51 AM, Arend van Sprielwrote: > On 5/28/2018 9:50 AM, Michael Trimarchi wrote: >> >> Watchdog need to be stopped in brcmf_sdio_remove to avoid >> i >> The system is going down NOW! >> [ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual >> address 02f8 >> Sent SIGTERM to all processes >> [ 1348.121412] Mem abort info: >> [ 1348.126962] ESR = 0x9604 >> [ 1348.130023] Exception class = DABT (current EL), IL = 32 bits >> [ 1348.135948] SET = 0, FnV = 0 >> [ 1348.138997] EA = 0, S1PTW = 0 >> [ 1348.142154] Data abort info: >> [ 1348.145045] ISV = 0, ISS = 0x0004 >> [ 1348.148884] CM = 0, WnR = 0 >> [ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval) >> [ 1348.158475] [02f8] pgd= >> [ 1348.163364] Internal error: Oops: 9604 [#1] PREEMPT SMP >> [ 1348.168927] Modules linked in: ipv6 >> [ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted >> 4.17.0-rc5-next-20180517 #18 >> [ 1348.180757] Hardware name: Amarula A64-Relic (DT) >> [ 1348.185455] pstate: 6005 (nZCv daif -PAN -UAO) >> [ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20 >> [ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290 > > > Hi Michael, > > Thanks for the patch. In normal scenario the callstack looks like this: > > brcmf_sdio_remove() > -> brcmf_detach() > -> brcmf_bus_stop() > -> brcmf_sdio_bus_stop() > > In brcmf_sdio_bus_stop() the watchdog is terminated. So in what scenario did > you encounter this null pointer deref? Is this happen even when there is not wifi firmware? boot without any firmware in the filesystem and then trigger a reboot Michael > > Regards, > Arend -- | Michael Nazzareno Trimarchi Amarula Solutions BV | | COO - Founder Cruquiuskade 47 | | +31(0)851119172 Amsterdam 1018 AM NL | | [`as] http://www.amarulasolutions.com |
Re: [PATCH] brcmfmac: stop watchdog before detach and free everything
On 5/28/2018 9:50 AM, Michael Trimarchi wrote: Watchdog need to be stopped in brcmf_sdio_remove to avoid i The system is going down NOW! [ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual address 02f8 Sent SIGTERM to all processes [ 1348.121412] Mem abort info: [ 1348.126962] ESR = 0x9604 [ 1348.130023] Exception class = DABT (current EL), IL = 32 bits [ 1348.135948] SET = 0, FnV = 0 [ 1348.138997] EA = 0, S1PTW = 0 [ 1348.142154] Data abort info: [ 1348.145045] ISV = 0, ISS = 0x0004 [ 1348.148884] CM = 0, WnR = 0 [ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval) [ 1348.158475] [02f8] pgd= [ 1348.163364] Internal error: Oops: 9604 [#1] PREEMPT SMP [ 1348.168927] Modules linked in: ipv6 [ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted 4.17.0-rc5-next-20180517 #18 [ 1348.180757] Hardware name: Amarula A64-Relic (DT) [ 1348.185455] pstate: 6005 (nZCv daif -PAN -UAO) [ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20 [ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290 Hi Michael, Thanks for the patch. In normal scenario the callstack looks like this: brcmf_sdio_remove() -> brcmf_detach() -> brcmf_bus_stop() -> brcmf_sdio_bus_stop() In brcmf_sdio_bus_stop() the watchdog is terminated. So in what scenario did you encounter this null pointer deref? Regards, Arend
[PATCH] brcmfmac: stop watchdog before detach and free everything
Watchdog need to be stopped in brcmf_sdio_remove to avoid i The system is going down NOW! [ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual address 02f8 Sent SIGTERM to all processes [ 1348.121412] Mem abort info: [ 1348.126962] ESR = 0x9604 [ 1348.130023] Exception class = DABT (current EL), IL = 32 bits [ 1348.135948] SET = 0, FnV = 0 [ 1348.138997] EA = 0, S1PTW = 0 [ 1348.142154] Data abort info: [ 1348.145045] ISV = 0, ISS = 0x0004 [ 1348.148884] CM = 0, WnR = 0 [ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval) [ 1348.158475] [02f8] pgd= [ 1348.163364] Internal error: Oops: 9604 [#1] PREEMPT SMP [ 1348.168927] Modules linked in: ipv6 [ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted 4.17.0-rc5-next-20180517 #18 [ 1348.180757] Hardware name: Amarula A64-Relic (DT) [ 1348.185455] pstate: 6005 (nZCv daif -PAN -UAO) [ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20 [ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290 [ 1348.200253] sp : 0b85be30 [ 1348.203561] x29: 0b85be30 x28: [ 1348.208868] x27: 0b6cb918 x26: 80003b990638 [ 1348.214176] x25: 087b1a20 x24: 80003b94f800 [ 1348.219483] x23: 08e620c8 x22: 08f0b660 [ 1348.224790] x21: 08c6a858 x20: fe00 [ 1348.230097] x19: 80003b94f800 x18: 0001 [ 1348.235404] x17: ab2e8a74 x16: 080d7de8 [ 1348.240711] x15: x14: 0400 [ 1348.246018] x13: 0400 x12: 0001 [ 1348.251324] x11: 02c4 x10: 0a10 [ 1348.256631] x9 : 0b85bc40 x8 : 80003be11870 [ 1348.261937] x7 : 80003dfc7308 x6 : 00078ff08b55 [ 1348.267243] x5 : 0139e1058400 x4 : [ 1348.272550] x3 : dead0100 x2 : 958f2788d6618100 [ 1348.277856] x1 : fe00 x0 : Signed-off-by: Michael Trimarchi--- drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c index 412a05b..061f69d 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c @@ -4294,6 +4294,13 @@ void brcmf_sdio_remove(struct brcmf_sdio *bus) brcmf_dbg(TRACE, "Enter\n"); if (bus) { + /* Stop watchdog task */ + if (bus->watchdog_tsk) { + send_sig(SIGTERM, bus->watchdog_tsk, 1); + kthread_stop(bus->watchdog_tsk); + bus->watchdog_tsk = NULL; + } + /* De-register interrupt handler */ brcmf_sdiod_intr_unregister(bus->sdiodev); -- 2.7.4