Re: [PATCH] mmc: block: Prevent new req entering queue while freeing up the queue

2020-11-11 Thread Veerabhadrarao Badiganti



On 11/3/2020 8:55 PM, Ulf Hansson wrote:

> On Wed, 28 Oct 2020 at 12:20, Veerabhadrarao Badiganti
>  wrote:
>>
>> The commit bbdc74dc19e0 ("mmc: block: Prevent new req entering queue
>> after its cleanup") has introduced this change but it got moved after
>> del_gendisk() with commit 57678e5a3d51 ("mmc: block: Delete gendisk
>> before cleaning up the request queue").
>
> This isn't the first time we have spotted errors in this path. Seems
> like a difficult path to get correct. :-)
>
>>
>> It is blocking reboot with the call stack below.
>>
>> INFO: task reboot:3086 blocked for more than 122 seconds.
>>  __schedule
>>  schedule
>>  schedule_timeout
>>  io_schedule_timeout
>>  do_wait_for_common
>>  wait_for_completion_io
>>  submit_bio_wait
>>  blkdev_issue_flush
>>  ext4_sync_fs
>>  __sync_filesystem
>>  sync_filesystem
>>  fsync_bdev
>>  invalidate_partition
>>  del_gendisk
>>  mmc_blk_remove_req
>>  mmc_blk_remove
>>  mmc_bus_remove
>>  device_release_driver_internal
>>  device_release_driver
>>  bus_remove_device
>>  device_del
>>  mmc_remove_card
>>  mmc_remove
>>  mmc_stop_host
>>  mmc_remove_host
>>  sdhci_remove_host
>>  sdhci_msm_remove
>
> Why do you call sdhci_msm_remove() from the shutdown callback? What
> specific operations do you need to run in the shutdown path for sdhci
> msm?

The memory team suggested that I add a shutdown callback to de-register
gracefully from the SMMU driver during reboot/shutdown, so I tried
adding one.

During shutdown I just need to ensure that the controller is no longer
active and no longer accessing memory related to any requests, and to
de-register from the SMMU driver.


> The important part should be to do a graceful shutdown of the card
> (and the block device) - is there anything else?
>
> Or you are just using the shutdown callback as a simple way to trigger
> this problem? Could unbinding the driver trigger the same issue?
>
>>  sdhci_msm_shutdown
>>  platform_drv_shutdown
>>  device_shutdown
>>  kernel_restart_prepare
>>  kernel_restart
>>
>> So bringing this change back.
>>
>> Signed-off-by: Veerabhadrarao Badiganti
>> ---
>>
>> I'm observing this issue 100% of the time with shutdown callback added to
>> sdhci-msm driver.
>> I'm trying on 5.4 kernel with ChromeOS.
>>
>> Please let me know if this can be fixed in a better way.
>
> I don't know yet, but I will have a closer look. Let's also see if
> Adrian has some thoughts.
>
> Kind regards
> Uffe
>
>> ---
>>
>>  drivers/mmc/core/block.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
>> index 8d3df0be0355..76dbb2b8a13b 100644
>> --- a/drivers/mmc/core/block.c
>> +++ b/drivers/mmc/core/block.c
>> @@ -2627,6 +2627,7 @@ static void mmc_blk_remove_req(struct mmc_blk_data *md)
>>  * from being accepted.
>>  */
>> card = md->queue.card;
>> +   blk_set_queue_dying(md->queue.queue);
>> if (md->disk->flags & GENHD_FL_UP) {
>> device_remove_file(disk_to_dev(md->disk), &md->force_ro);
>> if ((md->area_type & MMC_BLK_DATA_AREA_BOOT) &&
>> --
>> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.,
>> is a member of Code Aurora Forum, a Linux Foundation Collaborative Project



Re: [PATCH] mmc: block: Prevent new req entering queue while freeing up the queue

2020-11-04 Thread Adrian Hunter
On 3/11/20 5:25 pm, Ulf Hansson wrote:
> On Wed, 28 Oct 2020 at 12:20, Veerabhadrarao Badiganti
>  wrote:
>>
>> The commit bbdc74dc19e0 ("mmc: block: Prevent new req entering queue
>> after its cleanup") has introduced this change but it got moved after
>> del_gendisk() with commit 57678e5a3d51 ("mmc: block: Delete gendisk
>> before cleaning up the request queue").
> 
> This isn't the first time we have spotted errors in this path. Seems
> like a difficult path to get correct. :-)
> 
>>
>> It is blocking reboot with the call stack below.
>>
>> INFO: task reboot:3086 blocked for more than 122 seconds.
>>  __schedule
>>  schedule
>>  schedule_timeout
>>  io_schedule_timeout
>>  do_wait_for_common
>>  wait_for_completion_io
>>  submit_bio_wait
>>  blkdev_issue_flush
>>  ext4_sync_fs
>>  __sync_filesystem
>>  sync_filesystem
>>  fsync_bdev
>>  invalidate_partition
>>  del_gendisk
>>  mmc_blk_remove_req
>>  mmc_blk_remove
>>  mmc_bus_remove
>>  device_release_driver_internal
>>  device_release_driver
>>  bus_remove_device
>>  device_del
>>  mmc_remove_card
>>  mmc_remove
>>  mmc_stop_host
>>  mmc_remove_host
>>  sdhci_remove_host
>>  sdhci_msm_remove
> 
> Why do you call sdhci_msm_remove() from the shutdown callback? What
> specific operations do you need to run in the shutdown path for sdhci
> msm?

Yes, the problem is that upper layers, like the mmc block driver, have
already shut down, so doing operations like remove will get into deadlocks.

> 
> The important part should be to do a graceful shutdown of the card
> (and the block device) - is there anything else?
> 
> Or you are just using the shutdown callback as a simple way to trigger
> this problem? Could unbinding the driver trigger the same issue?
> 
>>  sdhci_msm_shutdown
>>  platform_drv_shutdown
>>  device_shutdown
>>  kernel_restart_prepare
>>  kernel_restart
>>
>> So bringing this change back.
>>
>> Signed-off-by: Veerabhadrarao Badiganti 
>> ---
>>
>> I'm observing this issue 100% of the time with shutdown callback added to 
>> sdhci-msm driver.
>> I'm trying on 5.4 kernel with ChromeOS.
>>
>> Please let me know if this can be fixed in a better way.
> 
> I don't know yet, but I will have a closer look. Let's also see if
> Adrian has some thoughts.
> 
> Kind regards
> Uffe
> 
>> ---
>>
>>  drivers/mmc/core/block.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
>> index 8d3df0be0355..76dbb2b8a13b 100644
>> --- a/drivers/mmc/core/block.c
>> +++ b/drivers/mmc/core/block.c
>> @@ -2627,6 +2627,7 @@ static void mmc_blk_remove_req(struct mmc_blk_data *md)
>>  * from being accepted.
>>  */
>> card = md->queue.card;
>> +   blk_set_queue_dying(md->queue.queue);
>> if (md->disk->flags & GENHD_FL_UP) {
>> device_remove_file(disk_to_dev(md->disk), &md->force_ro);
>> if ((md->area_type & MMC_BLK_DATA_AREA_BOOT) &&
>> --
>> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, 
>> Inc., is a member of Code Aurora Forum, a Linux Foundation Collaborative 
>> Project
>>



Re: [PATCH] mmc: block: Prevent new req entering queue while freeing up the queue

2020-11-03 Thread Ulf Hansson
On Wed, 28 Oct 2020 at 12:20, Veerabhadrarao Badiganti
 wrote:
>
> The commit bbdc74dc19e0 ("mmc: block: Prevent new req entering queue
> after its cleanup") has introduced this change but it got moved after
> del_gendisk() with commit 57678e5a3d51 ("mmc: block: Delete gendisk
> before cleaning up the request queue").

This isn't the first time we have spotted errors in this path. Seems
like a difficult path to get correct. :-)

>
> It is blocking reboot with the call stack below.
>
> INFO: task reboot:3086 blocked for more than 122 seconds.
>  __schedule
>  schedule
>  schedule_timeout
>  io_schedule_timeout
>  do_wait_for_common
>  wait_for_completion_io
>  submit_bio_wait
>  blkdev_issue_flush
>  ext4_sync_fs
>  __sync_filesystem
>  sync_filesystem
>  fsync_bdev
>  invalidate_partition
>  del_gendisk
>  mmc_blk_remove_req
>  mmc_blk_remove
>  mmc_bus_remove
>  device_release_driver_internal
>  device_release_driver
>  bus_remove_device
>  device_del
>  mmc_remove_card
>  mmc_remove
>  mmc_stop_host
>  mmc_remove_host
>  sdhci_remove_host
>  sdhci_msm_remove

Why do you call sdhci_msm_remove() from the shutdown callback? What
specific operations do you need to run in the shutdown path for sdhci
msm?

The important part should be to do a graceful shutdown of the card
(and the block device) - is there anything else?

Or you are just using the shutdown callback as a simple way to trigger
this problem? Could unbinding the driver trigger the same issue?

>  sdhci_msm_shutdown
>  platform_drv_shutdown
>  device_shutdown
>  kernel_restart_prepare
>  kernel_restart
>
> So bringing this change back.
>
> Signed-off-by: Veerabhadrarao Badiganti 
> ---
>
> I'm observing this issue 100% of the time with shutdown callback added to 
> sdhci-msm driver.
> I'm trying on 5.4 kernel with ChromeOS.
>
> Please let me know if this can be fixed in a better way.

I don't know yet, but I will have a closer look. Let's also see if
Adrian has some thoughts.

Kind regards
Uffe

> ---
>
>  drivers/mmc/core/block.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> index 8d3df0be0355..76dbb2b8a13b 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -2627,6 +2627,7 @@ static void mmc_blk_remove_req(struct mmc_blk_data *md)
>  * from being accepted.
>  */
> card = md->queue.card;
> +   blk_set_queue_dying(md->queue.queue);
> if (md->disk->flags & GENHD_FL_UP) {
> device_remove_file(disk_to_dev(md->disk), &md->force_ro);
> if ((md->area_type & MMC_BLK_DATA_AREA_BOOT) &&
> --
> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, 
> Inc., is a member of Code Aurora Forum, a Linux Foundation Collaborative 
> Project
>


Re: [PATCH] mmc: block: Prevent new req entering queue while freeing up the queue

2020-11-03 Thread Veerabhadrarao Badiganti

Hi Ulf, Adrian,

Gentle reminder. Can you share your comments on this issue and change?

Thanks

On 10/28/2020 4:49 PM, Veerabhadrarao Badiganti wrote:

The commit bbdc74dc19e0 ("mmc: block: Prevent new req entering queue
after its cleanup") has introduced this change but it got moved after
del_gendisk() with commit 57678e5a3d51 ("mmc: block: Delete gendisk
before cleaning up the request queue").

It is blocking reboot with the call stack below.

INFO: task reboot:3086 blocked for more than 122 seconds.
  __schedule
  schedule
  schedule_timeout
  io_schedule_timeout
  do_wait_for_common
  wait_for_completion_io
  submit_bio_wait
  blkdev_issue_flush
  ext4_sync_fs
  __sync_filesystem
  sync_filesystem
  fsync_bdev
  invalidate_partition
  del_gendisk
  mmc_blk_remove_req
  mmc_blk_remove
  mmc_bus_remove
  device_release_driver_internal
  device_release_driver
  bus_remove_device
  device_del
  mmc_remove_card
  mmc_remove
  mmc_stop_host
  mmc_remove_host
  sdhci_remove_host
  sdhci_msm_remove
  sdhci_msm_shutdown
  platform_drv_shutdown
  device_shutdown
  kernel_restart_prepare
  kernel_restart

So bringing this change back.

Signed-off-by: Veerabhadrarao Badiganti 
---

I'm observing this issue 100% of the time with shutdown callback added to 
sdhci-msm driver.
I'm trying on 5.4 kernel with ChromeOS.

Please let me know if this can be fixed in a better way.
---

  drivers/mmc/core/block.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 8d3df0be0355..76dbb2b8a13b 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -2627,6 +2627,7 @@ static void mmc_blk_remove_req(struct mmc_blk_data *md)
 * from being accepted.
 */
card = md->queue.card;
+   blk_set_queue_dying(md->queue.queue);
if (md->disk->flags & GENHD_FL_UP) {
device_remove_file(disk_to_dev(md->disk), &md->force_ro);
if ((md->area_type & MMC_BLK_DATA_AREA_BOOT) &&


[PATCH] mmc: block: Prevent new req entering queue while freeing up the queue

2020-10-28 Thread Veerabhadrarao Badiganti
Commit bbdc74dc19e0 ("mmc: block: Prevent new req entering queue
after its cleanup") introduced this change, but commit 57678e5a3d51
("mmc: block: Delete gendisk before cleaning up the request queue")
moved it after del_gendisk().

This blocks reboot with the call stack below.

INFO: task reboot:3086 blocked for more than 122 seconds.
 __schedule
 schedule
 schedule_timeout
 io_schedule_timeout
 do_wait_for_common
 wait_for_completion_io
 submit_bio_wait
 blkdev_issue_flush
 ext4_sync_fs
 __sync_filesystem
 sync_filesystem
 fsync_bdev
 invalidate_partition
 del_gendisk
 mmc_blk_remove_req
 mmc_blk_remove
 mmc_bus_remove
 device_release_driver_internal
 device_release_driver
 bus_remove_device
 device_del
 mmc_remove_card
 mmc_remove
 mmc_stop_host
 mmc_remove_host
 sdhci_remove_host
 sdhci_msm_remove
 sdhci_msm_shutdown
 platform_drv_shutdown
 device_shutdown
 kernel_restart_prepare
 kernel_restart

So bring this change back, marking the queue dying before del_gendisk()
is called.

Signed-off-by: Veerabhadrarao Badiganti 
---

I'm observing this issue 100% of the time with a shutdown callback added
to the sdhci-msm driver.
I'm testing on a 5.4 kernel with ChromeOS.

Please let me know if this can be fixed in a better way.
---

 drivers/mmc/core/block.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 8d3df0be0355..76dbb2b8a13b 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -2627,6 +2627,7 @@ static void mmc_blk_remove_req(struct mmc_blk_data *md)
 		 * from being accepted.
 		 */
 		card = md->queue.card;
+		blk_set_queue_dying(md->queue.queue);
 		if (md->disk->flags & GENHD_FL_UP) {
 			device_remove_file(disk_to_dev(md->disk), &md->force_ro);
 			if ((md->area_type & MMC_BLK_DATA_AREA_BOOT) &&
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., 
is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
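
As a rough illustration of the ordering this patch restores, the userspace C sketch below models the teardown. It is a toy, not kernel code: `toy_queue`, `toy_set_queue_dying()`, `toy_submit_flush()`, `toy_del_gendisk()`, `toy_remove()` and `TOY_WOULD_HANG` are all hypothetical names. The `can_service` flag encodes an assumption, namely that at this point in shutdown nothing is left to complete a newly submitted flush.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a request queue; not the kernel's struct request_queue. */
struct toy_queue {
	bool dying;        /* teardown has started: new requests must fail */
	bool can_service;  /* assumption: false once the layers below stopped */
	int completed;     /* number of requests that finished normally */
};

/* Sentinel for "the submitter would block forever" (the hung-task case). */
#define TOY_WOULD_HANG 1

/* Hypothetical analogue of blk_set_queue_dying(): reject new requests. */
void toy_set_queue_dying(struct toy_queue *q)
{
	q->dying = true;
}

/* Submit a flush and report what would happen to the caller. */
int toy_submit_flush(struct toy_queue *q)
{
	if (q->dying)
		return -ENODEV;         /* rejected on entry, caller returns at once */
	if (!q->can_service)
		return TOY_WOULD_HANG;  /* accepted, but nothing will complete it */
	q->completed++;
	return 0;
}

/* Models the filesystem sync that del_gendisk() triggers in the trace. */
int toy_del_gendisk(struct toy_queue *q)
{
	return toy_submit_flush(q);
}

/* Teardown with or without marking the queue dying before del_gendisk(). */
int toy_remove(struct toy_queue *q, bool mark_dying_first)
{
	int ret;

	if (mark_dying_first)
		toy_set_queue_dying(q);
	ret = toy_del_gendisk(q);
	toy_set_queue_dying(q);     /* queue cleanup leaves it dying either way */
	return ret;
}
```

With `can_service` false, `toy_remove(&q, false)` returns `TOY_WOULD_HANG`, which corresponds to the blocked reboot task in the trace, while `toy_remove(&q, true)` returns `-ENODEV` right away. Whether failing that flush is acceptable for data integrity is a separate question that this sketch does not try to answer.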