Re: armmmci rmmod causes hung tasks

2020-12-01 Thread Ulf Hansson
On Mon, 30 Nov 2020 at 18:20, Martin DEVERA  wrote:
>
> On 11/30/20 4:08 PM, Ulf Hansson wrote:
> > On Sun, 29 Nov 2020 at 19:20, Martin DEVERA  wrote:
> >> Hello,
> >>
> >> on STM32MP1 with almost vanilla 5.7.7 in single CPU mode. Pair of
> >> modprobe armmmci ; rmmod armmmci
> >>
> >> causes rmmod and kworker to hang. I should note that no MMC is detected
> >> on the board (SDIO device on MMC bus is not responding).
> >> On another board (where SDIO is responding) rmmod works.
> >>
> >> It seems as another manifestation of https://lkml.org/lkml/2019/8/27/945
> >>
> >> Thanks.
> >>
> >> INFO: task kworker/0:1:12 blocked for more than 368 seconds.
> >> Not tainted 5.7.7kdb-3-g10397828596c-dirty #224
> >> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> kworker/0:1 D012  2 0x
> >> Workqueue: events_freezable mmc_rescan
> >> (__schedule) from (schedule+0x5b/0x90)
> >> (schedule) from (schedule_timeout+0x1b/0xa0)
> >> (schedule_timeout) from (__wait_for_common+0x7d/0xdc)
> >> (__wait_for_common) from (mmc_wait_for_req_done+0x1b/0x8c)
> >> (mmc_wait_for_req_done) from (mmc_wait_for_cmd+0x4d/0x68)
> >> (mmc_wait_for_cmd) from (mmc_io_rw_direct_host+0x87/0xc8)
> >> (mmc_io_rw_direct_host) from (sdio_reset+0x3b/0x58)
> >> (sdio_reset) from (mmc_rescan+0x15d/0x1d4)
> >> (mmc_rescan) from (process_one_work+0xdd/0x168)
> >> (process_one_work) from (worker_thread+0x17d/0x1ec)
> >> (worker_thread) from (kthread+0x9b/0xa4)
> >> (kthread) from (ret_from_fork+0x11/0x28)
> > It looks like the worker thread, which runs mmc_rescan() to try to
> > detect the SDIO card is hanging. Exactly why, I don't know.
> >
> > Could be a misconfigured clock, pinctrl or a power domain being
> > suddenly gated...
>
> I turned some logging on (see below), IIUC pl18x is starting CMD52 with arg
> SDIO_CCCR_ABORT read and it got IRQ later along with response. Then sending
> again SDIO_CCCR_ABORT write but no IRQ comes back.
>
> [  135.810802] mmc0: mmc_rescan_try_freq: trying to init card at 40 Hz
> [  135.810832] mmc0: starting CMD52 arg 0c00 flags 0195
> [  135.810862] mmci-pl18x 48004000.sdmmc: op 34 arg 0c00 flags 0195
> [  135.811155] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 0040
> [  135.811178] mmc0: req done (CMD52): 0:   
> 
> [  135.811202] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 
> [  135.816487] mmc0: starting CMD52 arg 8c08 flags 0195
> [  135.816506] mmci-pl18x 48004000.sdmmc: op 34 arg 8c08 flags 0195
> [  172.150614] random: crng init done
> [  172.150642] random: 6 urandom warning(s) missed due to ratelimiting
> [  173.290565] INFO: task kworker/0:0:5 blocked for more than 20 seconds.
>
> Here is the same system, only with different (working) SDIO device on
> the same bus:
>
> [  495.654596] mmc0: mmc_rescan_try_freq: trying to init card at 40 Hz
> [  495.654628] mmc0: starting CMD52 arg 0c00 flags 0195
> [  495.654658] mmci-pl18x 48004000.sdmmc: op 34 arg 0c00 flags 0195
> [  495.654996] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 0004
> [  495.655017] mmc0: req done (CMD52): -110:   
> 
> [  495.655042] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 
> [  495.660201] mmc0: starting CMD52 arg 8c08 flags 0195
> [  495.660222] mmci-pl18x 48004000.sdmmc: op 34 arg 8c08 flags 0195
> [  495.660549] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 0004
> [  495.660567] mmc0: req done (CMD52): -110:   
> 
> [  495.660591] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 
>
> Should it be expected, that invalid (probably non-responding) device on
> the SDIO bus
> causes it to be locked up forever ?
> Or is it bug in pl18x driver not handling the error correctly ?

Honestly, it's hard to tell without further debugging. It looks like a
bug in pl18x driver, but perhaps also on the SDIO device side.

Kind regards
Uffe


Re: armmmci rmmod causes hung tasks

2020-11-30 Thread Martin DEVERA

On 11/30/20 4:08 PM, Ulf Hansson wrote:

On Sun, 29 Nov 2020 at 19:20, Martin DEVERA  wrote:

Hello,

on STM32MP1 with almost vanilla 5.7.7 in single CPU mode. Pair of
modprobe armmmci ; rmmod armmmci

causes rmmod and kworker to hang. I should note that no MMC is detected
on the board (SDIO device on MMC bus is not responding).
On another board (where SDIO is responding) rmmod works.

It seems as another manifestation of https://lkml.org/lkml/2019/8/27/945

Thanks.

INFO: task kworker/0:1:12 blocked for more than 368 seconds.
Not tainted 5.7.7kdb-3-g10397828596c-dirty #224
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/0:1 D012  2 0x
Workqueue: events_freezable mmc_rescan
(__schedule) from (schedule+0x5b/0x90)
(schedule) from (schedule_timeout+0x1b/0xa0)
(schedule_timeout) from (__wait_for_common+0x7d/0xdc)
(__wait_for_common) from (mmc_wait_for_req_done+0x1b/0x8c)
(mmc_wait_for_req_done) from (mmc_wait_for_cmd+0x4d/0x68)
(mmc_wait_for_cmd) from (mmc_io_rw_direct_host+0x87/0xc8)
(mmc_io_rw_direct_host) from (sdio_reset+0x3b/0x58)
(sdio_reset) from (mmc_rescan+0x15d/0x1d4)
(mmc_rescan) from (process_one_work+0xdd/0x168)
(process_one_work) from (worker_thread+0x17d/0x1ec)
(worker_thread) from (kthread+0x9b/0xa4)
(kthread) from (ret_from_fork+0x11/0x28)

It looks like the worker thread, which runs mmc_rescan() to try to
detect the SDIO card is hanging. Exactly why, I don't know.

Could be a misconfigured clock, pinctrl or a power domain being
suddenly gated...


I turned some logging on (see below), IIUC pl18x is starting CMD52 with arg
SDIO_CCCR_ABORT read and it got IRQ later along with response. Then sending
again SDIO_CCCR_ABORT write but no IRQ comes back.

[  135.810802] mmc0: mmc_rescan_try_freq: trying to init card at 40 Hz
[  135.810832] mmc0: starting CMD52 arg 0c00 flags 0195
[  135.810862] mmci-pl18x 48004000.sdmmc: op 34 arg 0c00 flags 0195
[  135.811155] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 0040
[  135.811178] mmc0: req done (CMD52): 0:    


[  135.811202] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 
[  135.816487] mmc0: starting CMD52 arg 8c08 flags 0195
[  135.816506] mmci-pl18x 48004000.sdmmc: op 34 arg 8c08 flags 0195
[  172.150614] random: crng init done
[  172.150642] random: 6 urandom warning(s) missed due to ratelimiting
[  173.290565] INFO: task kworker/0:0:5 blocked for more than 20 seconds.

Here is the same system, only with different (working) SDIO device on 
the same bus:


[  495.654596] mmc0: mmc_rescan_try_freq: trying to init card at 40 Hz
[  495.654628] mmc0: starting CMD52 arg 0c00 flags 0195
[  495.654658] mmci-pl18x 48004000.sdmmc: op 34 arg 0c00 flags 0195
[  495.654996] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 0004
[  495.655017] mmc0: req done (CMD52): -110:    


[  495.655042] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 
[  495.660201] mmc0: starting CMD52 arg 8c08 flags 0195
[  495.660222] mmci-pl18x 48004000.sdmmc: op 34 arg 8c08 flags 0195
[  495.660549] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 0004
[  495.660567] mmc0: req done (CMD52): -110:    


[  495.660591] mmci-pl18x 48004000.sdmmc: irq0 (data+cmd) 

Should it be expected, that invalid (probably non-responding) device on 
the SDIO bus

causes it to be locked up forever ?
Or is it bug in pl18x driver not handling the error correctly ?

best regards,
Martin



Re: armmmci rmmod causes hung tasks

2020-11-30 Thread Ulf Hansson
On Sun, 29 Nov 2020 at 19:20, Martin DEVERA  wrote:
>
> Hello,
>
> on STM32MP1 with almost vanilla 5.7.7 in single CPU mode. Pair of
> modprobe armmmci ; rmmod armmmci
>
> causes rmmod and kworker to hang. I should note that no MMC is detected
> on the board (SDIO device on MMC bus is not responding).
> On another board (where SDIO is responding) rmmod works.
>
> It seems as another manifestation of https://lkml.org/lkml/2019/8/27/945
>
> Thanks.
>
> INFO: task kworker/0:1:12 blocked for more than 368 seconds.
>Not tainted 5.7.7kdb-3-g10397828596c-dirty #224
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kworker/0:1 D012  2 0x
> Workqueue: events_freezable mmc_rescan
> (__schedule) from (schedule+0x5b/0x90)
> (schedule) from (schedule_timeout+0x1b/0xa0)
> (schedule_timeout) from (__wait_for_common+0x7d/0xdc)
> (__wait_for_common) from (mmc_wait_for_req_done+0x1b/0x8c)
> (mmc_wait_for_req_done) from (mmc_wait_for_cmd+0x4d/0x68)
> (mmc_wait_for_cmd) from (mmc_io_rw_direct_host+0x87/0xc8)
> (mmc_io_rw_direct_host) from (sdio_reset+0x3b/0x58)
> (sdio_reset) from (mmc_rescan+0x15d/0x1d4)
> (mmc_rescan) from (process_one_work+0xdd/0x168)
> (process_one_work) from (worker_thread+0x17d/0x1ec)
> (worker_thread) from (kthread+0x9b/0xa4)
> (kthread) from (ret_from_fork+0x11/0x28)

It looks like the worker thread, which runs mmc_rescan() to try to
detect the SDIO card is hanging. Exactly why, I don't know.

Could be a misconfigured clock, pinctrl or a power domain being
suddenly gated...

>
> INFO: task rmmod:308 blocked for more than 368 seconds.
>Not tainted 5.7.7kdb-3-g10397828596c-dirty #224
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> rmmod   D0   308222 0x
> (__schedule) from (schedule+0x5b/0x90)
> (schedule) from (schedule_timeout+0x1b/0xa0)
> (schedule_timeout) from (__wait_for_common+0x7d/0xdc)
> (__wait_for_common) from (__flush_work+0xb3/0x110)
> (__flush_work) from (__cancel_work_timer+0x97/0xf0)
> (__cancel_work_timer) from (mmc_stop_host+0x29/0x88)
> (mmc_stop_host) from (mmc_remove_host+0x9/0x22)
> (mmc_remove_host) from (mmci_remove+0x19/0x92 [armmmci])
> (mmci_remove [armmmci]) from (amba_remove+0x13/0x42)
> (amba_remove) from (device_release_driver_internal+0x89/0xe4)
> (device_release_driver_internal) from (driver_detach+0x57/0x60)
> (driver_detach) from (bus_remove_driver+0x53/0x84)
> (bus_remove_driver) from (mmci_driver_exit+0x9/0x11dc [armmmci])
> (mmci_driver_exit [armmmci]) from (sys_delete_module+0x115/0x154)
> (sys_delete_module) from (ret_fast_syscall+0x1/0x5a)
>

When unbinding the mmc host, mmc_stop_host() needs to make sure that
mmc_rescan() becomes disabled. This is to prevent a card from being
detected/removed at same time as when the host is being removed.

In any case, when we call...

cancel_delayed_work_sync(>detect);

...it seems like we end up waiting for the currently running
mmc_rescan() work to be completed.

Kind regards
Uffe


armmmci rmmod causes hung tasks

2020-11-29 Thread Martin DEVERA

Hello,

on STM32MP1 with almost vanilla 5.7.7 in single CPU mode. Pair of
modprobe armmmci ; rmmod armmmci

causes rmmod and kworker to hang. I should note that no MMC is detected
on the board (SDIO device on MMC bus is not responding).
On another board (where SDIO is responding) rmmod works.

It seems as another manifestation of https://lkml.org/lkml/2019/8/27/945

Thanks.

INFO: task kworker/0:1:12 blocked for more than 368 seconds.
  Not tainted 5.7.7kdb-3-g10397828596c-dirty #224
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/0:1 D    0    12  2 0x
Workqueue: events_freezable mmc_rescan
(__schedule) from (schedule+0x5b/0x90)
(schedule) from (schedule_timeout+0x1b/0xa0)
(schedule_timeout) from (__wait_for_common+0x7d/0xdc)
(__wait_for_common) from (mmc_wait_for_req_done+0x1b/0x8c)
(mmc_wait_for_req_done) from (mmc_wait_for_cmd+0x4d/0x68)
(mmc_wait_for_cmd) from (mmc_io_rw_direct_host+0x87/0xc8)
(mmc_io_rw_direct_host) from (sdio_reset+0x3b/0x58)
(sdio_reset) from (mmc_rescan+0x15d/0x1d4)
(mmc_rescan) from (process_one_work+0xdd/0x168)
(process_one_work) from (worker_thread+0x17d/0x1ec)
(worker_thread) from (kthread+0x9b/0xa4)
(kthread) from (ret_from_fork+0x11/0x28)

INFO: task rmmod:308 blocked for more than 368 seconds.
  Not tainted 5.7.7kdb-3-g10397828596c-dirty #224
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rmmod   D    0   308    222 0x
(__schedule) from (schedule+0x5b/0x90)
(schedule) from (schedule_timeout+0x1b/0xa0)
(schedule_timeout) from (__wait_for_common+0x7d/0xdc)
(__wait_for_common) from (__flush_work+0xb3/0x110)
(__flush_work) from (__cancel_work_timer+0x97/0xf0)
(__cancel_work_timer) from (mmc_stop_host+0x29/0x88)
(mmc_stop_host) from (mmc_remove_host+0x9/0x22)
(mmc_remove_host) from (mmci_remove+0x19/0x92 [armmmci])
(mmci_remove [armmmci]) from (amba_remove+0x13/0x42)
(amba_remove) from (device_release_driver_internal+0x89/0xe4)
(device_release_driver_internal) from (driver_detach+0x57/0x60)
(driver_detach) from (bus_remove_driver+0x53/0x84)
(bus_remove_driver) from (mmci_driver_exit+0x9/0x11dc [armmmci])
(mmci_driver_exit [armmmci]) from (sys_delete_module+0x115/0x154)
(sys_delete_module) from (ret_fast_syscall+0x1/0x5a)