Re: [PATCH] brcmfmac: stop watchdog before detach and free everything

2018-05-28 Thread Michael Nazzareno Trimarchi
Hi Andy

The problem seems really easy to solve:

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
index 412a05b..ba60b151 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
@@ -4227,13 +4227,7 @@ struct brcmf_sdio *brcmf_sdio_probe(struct
brcmf_sdio_dev *sdiodev)
timer_setup(>timer, brcmf_sdio_watchdog, 0);
/* Initialize watchdog thread */
init_completion(>watchdog_wait);
-   bus->watchdog_tsk = kthread_run(brcmf_sdio_watchdog_thread,
-   bus, "brcmf_wdog/%s",
-   dev_name(>func1->dev));
-   if (IS_ERR(bus->watchdog_tsk)) {
-   pr_warn("brcmf_watchdog thread failed to start\n");
-   bus->watchdog_tsk = NULL;
-   }
+
/* Initialize DPC thread */
bus->dpc_triggered = false;
bus->dpc_running = false;
@@ -4281,6 +4275,14 @@ struct brcmf_sdio *brcmf_sdio_probe(struct
brcmf_sdio_dev *sdiodev)
goto fail;
}

+   bus->watchdog_tsk = kthread_run(brcmf_sdio_watchdog_thread,
+   bus, "brcmf_wdog/%s",
+   dev_name(>func1->dev));
+   if (IS_ERR(bus->watchdog_tsk)) {
+   pr_warn("brcmf_watchdog thread failed to start\n");
+   bus->watchdog_tsk = NULL;
+   }
+
return bus;

Just look here

ret = brcmf_fw_get_firmwares(sdiodev->dev, fwreq,
 brcmf_sdio_firmware_callback);
if (ret != 0) {
brcmf_err("async firmware request failed: %d\n", ret);
kfree(fwreq);
goto fail;
}

bus->watchdog_tsk = kthread_run(brcmf_sdio_watchdog_thread,
bus, "brcmf_wdog/%s",
dev_name(>func1->dev));
if (IS_ERR(bus->watchdog_tsk)) {
pr_warn("brcmf_watchdog thread failed to start\n");
bus->watchdog_tsk = NULL;
}

return bus;

fail:
brcmf_sdio_remove(bus);
return NULL;


On Mon, May 28, 2018 at 5:29 PM, Michael Nazzareno Trimarchi
 wrote:
> Hi
>
> On Mon, May 28, 2018 at 5:25 PM, Andy Shevchenko
>  wrote:
>> On Mon, May 28, 2018 at 12:54 PM, Michael Nazzareno Trimarchi
>>  wrote:
>>> Hi Arend
>>>
>>> On Mon, May 28, 2018 at 11:51 AM, Arend van Spriel
>>>  wrote:
 On 5/28/2018 9:50 AM, Michael Trimarchi wrote:
>
> Watchdog need to be stopped in brcmf_sdio_remove to avoid
> i
> The system is going down NOW!
> [ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual
> address 02f8
> Sent SIGTERM to all processes
> [ 1348.121412] Mem abort info:
> [ 1348.126962]   ESR = 0x9604
> [ 1348.130023]   Exception class = DABT (current EL), IL = 32 bits
> [ 1348.135948]   SET = 0, FnV = 0
> [ 1348.138997]   EA = 0, S1PTW = 0
> [ 1348.142154] Data abort info:
> [ 1348.145045]   ISV = 0, ISS = 0x0004
> [ 1348.148884]   CM = 0, WnR = 0
> [ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
> [ 1348.158475] [02f8] pgd=
> [ 1348.163364] Internal error: Oops: 9604 [#1] PREEMPT SMP
> [ 1348.168927] Modules linked in: ipv6
> [ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted
> 4.17.0-rc5-next-20180517 #18
> [ 1348.180757] Hardware name: Amarula A64-Relic (DT)
> [ 1348.185455] pstate: 6005 (nZCv daif -PAN -UAO)
> [ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20
> [ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290


 Hi Michael,

 Thanks for the patch. In normal scenario the callstack looks like this:

 brcmf_sdio_remove()
 -> brcmf_detach()
 -> brcmf_bus_stop()
 -> brcmf_sdio_bus_stop()

 In brcmf_sdio_bus_stop() the watchdog is terminated. So in what scenario 
 did
 you encounter this null pointer deref?
>>>
>>> Is this happen even when there is not wifi firmware?
>>> boot without any firmware in the filesystem and then trigger a reboot
>>
>> Something like the above I had noticed for a long (couple of kernel
>> releases?) time, but wasn't a big priority to me.
>> Though, I can test this on my side.
>>
>> P.S. I think rmmod or echo > unbind will trigger that as well.
>>
>
> Right now the module is compiled in the kernel. I can dig down tonight
> on this if needed
>
> Michael
>
>> --
>> With Best Regards,
>> Andy Shevchenko
>
>
>
> --
> | Michael Nazzareno Trimarchi Amarula Solutions BV |
> | COO  -  Founder

Re: [PATCH] brcmfmac: stop watchdog before detach and free everything

2018-05-28 Thread Michael Nazzareno Trimarchi
Hi

On Mon, May 28, 2018 at 5:25 PM, Andy Shevchenko
 wrote:
> On Mon, May 28, 2018 at 12:54 PM, Michael Nazzareno Trimarchi
>  wrote:
>> Hi Arend
>>
>> On Mon, May 28, 2018 at 11:51 AM, Arend van Spriel
>>  wrote:
>>> On 5/28/2018 9:50 AM, Michael Trimarchi wrote:

 Watchdog need to be stopped in brcmf_sdio_remove to avoid
 i
 The system is going down NOW!
 [ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual
 address 02f8
 Sent SIGTERM to all processes
 [ 1348.121412] Mem abort info:
 [ 1348.126962]   ESR = 0x9604
 [ 1348.130023]   Exception class = DABT (current EL), IL = 32 bits
 [ 1348.135948]   SET = 0, FnV = 0
 [ 1348.138997]   EA = 0, S1PTW = 0
 [ 1348.142154] Data abort info:
 [ 1348.145045]   ISV = 0, ISS = 0x0004
 [ 1348.148884]   CM = 0, WnR = 0
 [ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
 [ 1348.158475] [02f8] pgd=
 [ 1348.163364] Internal error: Oops: 9604 [#1] PREEMPT SMP
 [ 1348.168927] Modules linked in: ipv6
 [ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted
 4.17.0-rc5-next-20180517 #18
 [ 1348.180757] Hardware name: Amarula A64-Relic (DT)
 [ 1348.185455] pstate: 6005 (nZCv daif -PAN -UAO)
 [ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20
 [ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290
>>>
>>>
>>> Hi Michael,
>>>
>>> Thanks for the patch. In normal scenario the callstack looks like this:
>>>
>>> brcmf_sdio_remove()
>>> -> brcmf_detach()
>>> -> brcmf_bus_stop()
>>> -> brcmf_sdio_bus_stop()
>>>
>>> In brcmf_sdio_bus_stop() the watchdog is terminated. So in what scenario did
>>> you encounter this null pointer deref?
>>
>> Is this happen even when there is not wifi firmware?
>> boot without any firmware in the filesystem and then trigger a reboot
>
> Something like the above I had noticed for a long (couple of kernel
> releases?) time, but wasn't a big priority to me.
> Though, I can test this on my side.
>
> P.S. I think rmmod or echo > unbind will trigger that as well.
>

Right now the module is compiled in the kernel. I can dig down tonight
on this if needed

Michael

> --
> With Best Regards,
> Andy Shevchenko



-- 
| Michael Nazzareno Trimarchi Amarula Solutions BV |
| COO  -  Founder  Cruquiuskade 47 |
| +31(0)851119172 Amsterdam 1018 AM NL |
|  [`as] http://www.amarulasolutions.com   |


Re: [PATCH] brcmfmac: stop watchdog before detach and free everything

2018-05-28 Thread Andy Shevchenko
On Mon, May 28, 2018 at 12:54 PM, Michael Nazzareno Trimarchi
 wrote:
> Hi Arend
>
> On Mon, May 28, 2018 at 11:51 AM, Arend van Spriel
>  wrote:
>> On 5/28/2018 9:50 AM, Michael Trimarchi wrote:
>>>
>>> Watchdog need to be stopped in brcmf_sdio_remove to avoid
>>> i
>>> The system is going down NOW!
>>> [ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual
>>> address 02f8
>>> Sent SIGTERM to all processes
>>> [ 1348.121412] Mem abort info:
>>> [ 1348.126962]   ESR = 0x9604
>>> [ 1348.130023]   Exception class = DABT (current EL), IL = 32 bits
>>> [ 1348.135948]   SET = 0, FnV = 0
>>> [ 1348.138997]   EA = 0, S1PTW = 0
>>> [ 1348.142154] Data abort info:
>>> [ 1348.145045]   ISV = 0, ISS = 0x0004
>>> [ 1348.148884]   CM = 0, WnR = 0
>>> [ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
>>> [ 1348.158475] [02f8] pgd=
>>> [ 1348.163364] Internal error: Oops: 9604 [#1] PREEMPT SMP
>>> [ 1348.168927] Modules linked in: ipv6
>>> [ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted
>>> 4.17.0-rc5-next-20180517 #18
>>> [ 1348.180757] Hardware name: Amarula A64-Relic (DT)
>>> [ 1348.185455] pstate: 6005 (nZCv daif -PAN -UAO)
>>> [ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20
>>> [ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290
>>
>>
>> Hi Michael,
>>
>> Thanks for the patch. In normal scenario the callstack looks like this:
>>
>> brcmf_sdio_remove()
>> -> brcmf_detach()
>> -> brcmf_bus_stop()
>> -> brcmf_sdio_bus_stop()
>>
>> In brcmf_sdio_bus_stop() the watchdog is terminated. So in what scenario did
>> you encounter this null pointer deref?
>
> Is this happen even when there is not wifi firmware?
> boot without any firmware in the filesystem and then trigger a reboot

Something like the above I had noticed for a long (couple of kernel
releases?) time, but wasn't a big priority to me.
Though, I can test this on my side.

P.S. I think rmmod or echo > unbind will trigger that as well.

-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH] brcmfmac: stop watchdog before detach and free everything

2018-05-28 Thread Michael Nazzareno Trimarchi
Hi Arend

On Mon, May 28, 2018 at 11:51 AM, Arend van Spriel
 wrote:
> On 5/28/2018 9:50 AM, Michael Trimarchi wrote:
>>
>> Watchdog need to be stopped in brcmf_sdio_remove to avoid
>> i
>> The system is going down NOW!
>> [ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual
>> address 02f8
>> Sent SIGTERM to all processes
>> [ 1348.121412] Mem abort info:
>> [ 1348.126962]   ESR = 0x9604
>> [ 1348.130023]   Exception class = DABT (current EL), IL = 32 bits
>> [ 1348.135948]   SET = 0, FnV = 0
>> [ 1348.138997]   EA = 0, S1PTW = 0
>> [ 1348.142154] Data abort info:
>> [ 1348.145045]   ISV = 0, ISS = 0x0004
>> [ 1348.148884]   CM = 0, WnR = 0
>> [ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
>> [ 1348.158475] [02f8] pgd=
>> [ 1348.163364] Internal error: Oops: 9604 [#1] PREEMPT SMP
>> [ 1348.168927] Modules linked in: ipv6
>> [ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted
>> 4.17.0-rc5-next-20180517 #18
>> [ 1348.180757] Hardware name: Amarula A64-Relic (DT)
>> [ 1348.185455] pstate: 6005 (nZCv daif -PAN -UAO)
>> [ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20
>> [ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290
>
>
> Hi Michael,
>
> Thanks for the patch. In normal scenario the callstack looks like this:
>
> brcmf_sdio_remove()
> -> brcmf_detach()
> -> brcmf_bus_stop()
> -> brcmf_sdio_bus_stop()
>
> In brcmf_sdio_bus_stop() the watchdog is terminated. So in what scenario did
> you encounter this null pointer deref?

Is this happen even when there is not wifi firmware?
boot without any firmware in the filesystem and then trigger a reboot

Michael

>
> Regards,
> Arend



-- 
| Michael Nazzareno Trimarchi Amarula Solutions BV |
| COO  -  Founder  Cruquiuskade 47 |
| +31(0)851119172 Amsterdam 1018 AM NL |
|  [`as] http://www.amarulasolutions.com   |


Re: [PATCH] brcmfmac: stop watchdog before detach and free everything

2018-05-28 Thread Arend van Spriel

On 5/28/2018 9:50 AM, Michael Trimarchi wrote:

Watchdog need to be stopped in brcmf_sdio_remove to avoid
i
The system is going down NOW!
[ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual 
address 02f8
Sent SIGTERM to all processes
[ 1348.121412] Mem abort info:
[ 1348.126962]   ESR = 0x9604
[ 1348.130023]   Exception class = DABT (current EL), IL = 32 bits
[ 1348.135948]   SET = 0, FnV = 0
[ 1348.138997]   EA = 0, S1PTW = 0
[ 1348.142154] Data abort info:
[ 1348.145045]   ISV = 0, ISS = 0x0004
[ 1348.148884]   CM = 0, WnR = 0
[ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
[ 1348.158475] [02f8] pgd=
[ 1348.163364] Internal error: Oops: 9604 [#1] PREEMPT SMP
[ 1348.168927] Modules linked in: ipv6
[ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted 
4.17.0-rc5-next-20180517 #18
[ 1348.180757] Hardware name: Amarula A64-Relic (DT)
[ 1348.185455] pstate: 6005 (nZCv daif -PAN -UAO)
[ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20
[ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290


Hi Michael,

Thanks for the patch. In normal scenario the callstack looks like this:

brcmf_sdio_remove()
-> brcmf_detach()
-> brcmf_bus_stop()
-> brcmf_sdio_bus_stop()

In brcmf_sdio_bus_stop() the watchdog is terminated. So in what scenario 
did you encounter this null pointer deref?


Regards,
Arend


[PATCH] brcmfmac: stop watchdog before detach and free everything

2018-05-28 Thread Michael Trimarchi
Watchdog need to be stopped in brcmf_sdio_remove to avoid
i
The system is going down NOW!
[ 1348.110759] Unable to handle kernel NULL pointer dereference at virtual 
address 02f8
Sent SIGTERM to all processes
[ 1348.121412] Mem abort info:
[ 1348.126962]   ESR = 0x9604
[ 1348.130023]   Exception class = DABT (current EL), IL = 32 bits
[ 1348.135948]   SET = 0, FnV = 0
[ 1348.138997]   EA = 0, S1PTW = 0
[ 1348.142154] Data abort info:
[ 1348.145045]   ISV = 0, ISS = 0x0004
[ 1348.148884]   CM = 0, WnR = 0
[ 1348.151861] user pgtable: 4k pages, 48-bit VAs, pgdp = (ptrval)
[ 1348.158475] [02f8] pgd=
[ 1348.163364] Internal error: Oops: 9604 [#1] PREEMPT SMP
[ 1348.168927] Modules linked in: ipv6
[ 1348.172421] CPU: 3 PID: 1421 Comm: brcmf_wdog/mmc0 Not tainted 
4.17.0-rc5-next-20180517 #18
[ 1348.180757] Hardware name: Amarula A64-Relic (DT)
[ 1348.185455] pstate: 6005 (nZCv daif -PAN -UAO)
[ 1348.190251] pc : brcmf_sdiod_freezer_count+0x0/0x20
[ 1348.195124] lr : brcmf_sdio_watchdog_thread+0x64/0x290
[ 1348.200253] sp : 0b85be30
[ 1348.203561] x29: 0b85be30 x28: 
[ 1348.208868] x27: 0b6cb918 x26: 80003b990638
[ 1348.214176] x25: 087b1a20 x24: 80003b94f800
[ 1348.219483] x23: 08e620c8 x22: 08f0b660
[ 1348.224790] x21: 08c6a858 x20: fe00
[ 1348.230097] x19: 80003b94f800 x18: 0001
[ 1348.235404] x17: ab2e8a74 x16: 080d7de8
[ 1348.240711] x15:  x14: 0400
[ 1348.246018] x13: 0400 x12: 0001
[ 1348.251324] x11: 02c4 x10: 0a10
[ 1348.256631] x9 : 0b85bc40 x8 : 80003be11870
[ 1348.261937] x7 : 80003dfc7308 x6 : 00078ff08b55
[ 1348.267243] x5 : 0139e1058400 x4 : 
[ 1348.272550] x3 : dead0100 x2 : 958f2788d6618100
[ 1348.277856] x1 : fe00 x0 : 

Signed-off-by: Michael Trimarchi 
---
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c 
b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
index 412a05b..061f69d 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
@@ -4294,6 +4294,13 @@ void brcmf_sdio_remove(struct brcmf_sdio *bus)
brcmf_dbg(TRACE, "Enter\n");
 
if (bus) {
+   /* Stop watchdog task */
+   if (bus->watchdog_tsk) {
+   send_sig(SIGTERM, bus->watchdog_tsk, 1);
+   kthread_stop(bus->watchdog_tsk);
+   bus->watchdog_tsk = NULL;
+   }
+
/* De-register interrupt handler */
brcmf_sdiod_intr_unregister(bus->sdiodev);
 
-- 
2.7.4