Re: System hung in I/O when booting with sd card
Hi Bart, On 2018/4/12 17:05, Shawn Lin wrote: Hi Bart, On 2018/4/12 9:54, Bart Van Assche wrote: On Thu, 2018-04-12 at 09:48 +0800, Shawn Lin wrote: I ran into 2 times that my system hung here when booting with a ext4 sd card. No sure how to reproduce it but it seems doesn't matter with the ext4 as I see it again with a vfat sd card, this morning with Linus' master branch. Is it a known issue or any idea how to debug this? Please verify whether the following patch resolves this: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-linus=37f9579f4c31a6d698dbf3016d7bf132f9288d30 Thanks for your patch! I set up some devices to test reboot this morning against Linus' master branch. Two devices out of 5, were already hang for that without your patch, but the other 5 are still work with your patch. Will update the result tomorrow morning. I think your patch solve this. Thanks. Tested-by: Shawn Lin <shawn@rock-chips.com> Thanks, Bart.
Re: System hung in I/O when booting with sd card
Hi Bart, On 2018/4/12 9:54, Bart Van Assche wrote: On Thu, 2018-04-12 at 09:48 +0800, Shawn Lin wrote: I ran into 2 times that my system hung here when booting with a ext4 sd card. No sure how to reproduce it but it seems doesn't matter with the ext4 as I see it again with a vfat sd card, this morning with Linus' master branch. Is it a known issue or any idea how to debug this? Please verify whether the following patch resolves this: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-linus=37f9579f4c31a6d698dbf3016d7bf132f9288d30 Thanks for your patch! I set up some devices to test reboot this morning against Linus' master branch. Two devices out of 5, were already hang for that without your patch, but the other 5 are still work with your patch. Will update the result tomorrow morning. Thanks, Bart.
System hung in I/O when booting with sd card
Hi, I ran into 2 times that my system hung here when booting with a ext4 sd card. No sure how to reproduce it but it seems doesn't matter with the ext4 as I see it again with a vfat sd card, this morning with Linus' master branch. Is it a known issue or any idea how to debug this? [ 110.877652] Unable to handle kernel NULL pointer dereference at virtual address 00c8 [ 110.878360] Mem abort info: [ 110.878607] ESR = 0x9606 [ 110.878876] Exception class = DABT (current EL), IL = 32 bits [ 110.879391] SET = 0, FnV = 0 [ 110.879659] EA = 0, S1PTW = 0 [ 110.879957] Data abort info: [ 110.880210] ISV = 0, ISS = 0x0006 [ 110.880545] CM = 0, WnR = 0 [ 110.880807] user pgtable: 4k pages, 48-bit VAs, pgdp = ed0ed92e [ 110.881382] [00c8] pgd=f015e003, pud=f0173003, pmd= [ 110.882143] Internal error: Oops: 9606 [#1] PREEMPT SMP [ 110.882629] Modules linked in: [ 110.882902] CPU: 4 PID: 1777 Comm: ls Not tainted 4.16.0-next-20180410-00013-geabffa6-dirty #315 [ 110.883665] Hardware name: Excavator-RK3399 Board (DT) [ 110.884114] pstate: 4005 (nZcv daif -PAN -UAO) [ 110.884542] pc : percpu_counter_add_batch+0x2c/0x100 [ 110.884977] lr : generic_make_request_checks+0x214/0x478 [ 110.885440] sp : 0c403960 [ 110.885730] x29: 0c403960 x28: 8000efcba6e0 [ 110.886196] x27: 08a81000 x26: 082643f8 [ 110.886662] x25: 08263828 x24: 3000 [ 110.887127] x23: 1000 x22: [ 110.887591] x21: 1000 x20: 00a8 [ 110.888056] x19: 8000f0d7dc00 x18: 0001 [ 110.888521] x17: 004ab8a8 x16: 08241628 [ 110.888986] x15: ff80 x14: 7e0003bbaf80 [ 110.889451] x13: 00c5 x12: 09069b88 [ 110.889916] x11: 0040 x10: 8000efc57dc0 [ 110.890380] x9 : 8000efc57e90 x8 : 8000f0d3b138 [ 110.890844] x7 : x6 : 8000f0d7dc88 [ 110.891309] x5 : 0e56 x4 : 0001 [ 110.891774] x3 : 8000f0cab800 x2 : 3fff [ 110.892239] x1 : 8000edf3 x0 : 00a8 [ 110.892705] Process ls (pid: 1777, stack limit = 0xb3fb6733) [ 110.893258] Call trace: [ 110.893475] percpu_counter_add_batch+0x2c/0x100 [ 110.893879] generic_make_request_checks+0x214/0x478 [ 110.894313] generic_make_request+0x34/0x260 [ 110.894686] submit_bio+0xcc/0x1b0 [ 110.894988] ll_rw_block+0xc0/0x100 [ 110.895295] ext4_bread+0x74/0xc0 [ 110.895587] __ext4_read_dirblock+0x3c/0x2d8 [ 110.895960] htree_dirblock_to_tree+0x70/0x1d8 [ 110.896349] ext4_htree_fill_tree+0xa4/0x2c8 [ 110.896723] ext4_readdir+0x5f4/0x7f0 [ 110.897046] iterate_dir+0x9c/0x1a8 [ 110.897351] ksys_getdents64+0x8c/0x168 [ 110.897687] sys_getdents64+0xc/0x18 [ 110.898001] el0_svc_naked+0x30/0x34 [ 110.898315] Code: b9401064 11000484 b9001064 d538d081 (f9401000) [ 110.898848] ---[ end trace 3b0d37baa3bb2fb3 ]--- [ 110.899260] note: ls[1777] exited with preempt_count 1 [ 110.899852] WARNING: CPU: 4 PID: 1777 at kernel/rcu/tree_plugin.h:330 rcu_note_context_switch+0x30/0x3b8 [ 110.900676] Modules linked in: Se[gmentation fault 110.900947 [root@rockchip:/]# ] CPU: 4 PID: 1777 Comm: ls Tainted: G D 4.16.0-next-20180410-00013-geabffa6-dirty #315 [ 110.902121] Hardware name: Excavator-RK3399 Board (DT) [ 110.902570] pstate: 2085 (nzCv daIf -PAN -UAO) [ 110.902990] pc : rcu_note_context_switch+0x30/0x3b8 [ 110.903417] lr : rcu_note_context_switch+0x1c/0x3b8 [ 110.903841] sp : 0c403480 [ 110.904132] x29: 0c403480 x28: 0c403820 [ 110.904597] x27: x26: 8000f0cab800 [ 110.905062] x25: 080fdaa8 x24: 09069000 [ 110.905527] x23: 09042000 x22: 8000f0cab800 [ 110.905992] x21: 8000f6f7ed00 x20: [ 110.906456] x19: 8000f0cab800 x18: [ 110.906921] x17: 004ab8a8 x16: 08241628 [ 110.907386] x15: x14: 0400 [ 110.907851] x13: 091b6200 x12: 0004 [ 110.908316] x11: 0019c85d4000 x10: 0400 [ 110.908781] x9 : 0004 x8 : 8000f02e6a00 [ 110.909246] x7 : 8000f6f7f760 x6 : 090ba86f [ 110.909711] x5 : x4 : [ 110.910176] x3 : 8000edf3 x2 : 0004 [ 110.910641] x1 : 0904fa98 x0 : 0001 [ 110.911106] Call trace: [ 110.911322] rcu_note_context_switch+0x30/0x3b8 [ 110.911722] __schedule+0x90/0x600 [ 110.912022] do_task_dead+0x40/0x48 [ 110.912328] do_exit+0x6c8/0x9b8 [ 110.912613] die+0x1cc/0x1f8 [ 110.912868] __do_kernel_fault+0xa4/0xf8 [ 110.913212] do_page_fault+0x1f0/0x428 [ 110.913541] do_translation_fault+0x5c/0x68 [ 110.913908] do_mem_abort+0x54/0xd8 [ 110.914215] el1_da+0x20/0x80 [ 110.914476]
[PATCH] mmc: block: Delete gendisk before cleaning up the request queue
dd if=/dev/urandom of=/dev/mmcblk1 bs=4k count=1 with a SD card hotplug during transfer reports a warning below introduced by commit a063057d7c73 ("block: Fix a race between request queue removal and the block cgroup controller"). So we should now remove the disk, partition and bdi sysfs attributes before cleaning up the request queue associated with the disk. [ 410.331226] mmc1: card 59b4 removed [ 410.348583] WARNING: CPU: 0 PID: 5 at block/blk-core.c:785 blk_cleanup_queue+0x138/0x140 [ 410.349294] Modules linked in: [ 410.349570] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 4.16.0-rc6-next-20180321-4-gc2ad6a7 #263 [ 410.350363] Hardware name: Excavator-RK3399 Board (DT) [ 410.350819] Workqueue: events_freezable mmc_rescan [ 410.351242] pstate: 6005 (nZCv daif -PAN -UAO) [ 410.351663] pc : blk_cleanup_queue+0x138/0x140 [ 410.352054] lr : blk_cleanup_queue+0xac/0x140 [ 410.352436] sp : 092cbb90 [ 410.352727] x29: 092cbb90 x28: [ 410.353195] x27: 8000f6f23030 x26: 0904e610 [ 410.353662] x25: 8000f17cc808 x24: 8000f1038200 [ 410.354128] x23: 0060 x22: [ 410.354595] x21: 8000f11748d8 x20: 8000f1038200 [ 410.355061] x19: 8000f1174200 x18: 936347d8 [ 410.355528] x17: 935b93c0 x16: 081263f8 [ 410.355994] x15: x14: 0400 [ 410.356461] x13: 0001 x12: 0001 [ 410.356927] x11: 0040 x10: 8000f2400028 [ 410.357393] x9 : 8000f2400040 x8 : [ 410.357860] x7 : 8000f6f3a340 x6 : 8000f6f3a340 [ 410.358326] x5 : 8000f240 x4 : 8000f6f3a340 [ 410.358792] x3 : x2 : 39c1333e45670800 [ 410.359259] x1 : x0 : 0003 [ 410.359726] Call trace: [ 410.359943] blk_cleanup_queue+0x138/0x140 [ 410.360305] mmc_cleanup_queue+0x2c/0x48 [ 410.360652] mmc_blk_remove_req+0x1c/0x98 [ 410.361005] mmc_blk_remove+0x180/0x1c0 [ 410.361343] mmc_bus_remove+0x1c/0x28 [ 410.361670] device_release_driver_internal+0x154/0x1f0 [ 410.362128] device_release_driver+0x14/0x20 [ 410.362504] bus_remove_device+0xc8/0x108 [ 410.362858] device_del+0x120/0x350 [ 410.363167] mmc_remove_card+0x5c/0xb8 [ 410.363498] mmc_sd_detect+0x40/0x78 [ 410.363813] mmc_rescan+0x19c/0x368 [ 410.364123] process_one_work+0x1ac/0x318 [ 410.364477] worker_thread+0x50/0x450 [ 410.364801] kthread+0xf8/0x128 [ 410.365081] ret_from_fork+0x10/0x18 [ 410.365395] ---[ end trace 268e87a46c28968c ]--- Signed-off-by: Shawn Lin <shawn@rock-chips.com> --- drivers/mmc/core/block.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c index a2b9c25..02485e3 100644 --- a/drivers/mmc/core/block.c +++ b/drivers/mmc/core/block.c @@ -2659,7 +2659,6 @@ static void mmc_blk_remove_req(struct mmc_blk_data *md) * from being accepted. */ card = md->queue.card; - mmc_cleanup_queue(>queue); if (md->disk->flags & GENHD_FL_UP) { device_remove_file(disk_to_dev(md->disk), >force_ro); if ((md->area_type & MMC_BLK_DATA_AREA_BOOT) && @@ -2669,6 +2668,7 @@ static void mmc_blk_remove_req(struct mmc_blk_data *md) del_gendisk(md->disk); } + mmc_cleanup_queue(>queue); mmc_blk_put(md); } } -- 1.9.1
Re: [PATCH 1/2 v6] mmc: block: Convert RPMB to a character device
On 2017/9/28 3:45, Linus Walleij wrote: On Wed, Sep 27, 2017 at 1:35 PM, Adrian Hunterwrote: BUG when removing, fixed by reverting this patch. [ 346.548512] mmc1: card 0001 removed [ 346.552782] BUG: unable to handle kernel NULL pointer dereference at 0070 How did you achieve this? I need to reproduce it. RPMB is only available on eMMC cards and I don't have a removable eMMC card since by definition they are soldered on. Does it happen on an ordinary MMC card without RPMB? I'm sorry if I'm ignorant of something very obvious here... I will read the code and see if I'm missing something obvious. Haven't closely look at this patch, but btw, don't you need put_device(>dev) for the error path of mmc_blk_alloc_rpmb_part and also do that for mmc_blk_remove_rpmb_part? And don't you need to check rpmb->md->usage when doing mmc_blk_remove_rpmb_part ? Yours, Linus Walleij
Re: [PATCH 1/5] mmc: block: Move duplicate check
Hi, On 2017/6/15 20:12, Linus Walleij wrote: mmc_blk_ioctl() calls either mmc_blk_ioctl_cmd() or mmc_blk_ioctl_multi_cmd() and each of these make the same check. Factor it into a new helper function, call it on both branches of the switch() statement and save a chunk of duplicate code. Cc: Shawn Lin <shawn@rock-chips.com> Signed-off-by: Linus Walleij <linus.wall...@linaro.org> --- ChangeLog v1->v2: - We need to check the block device only if an actual well-known ioctl() is coming in, on the path of the switch() statments, only those branches that handle actual ioctl()s. Create a new helper function to check the block device and call that. Reviewed-by: Shawn Lin <shawn@rock-chips.com> --- drivers/mmc/core/block.c | 36 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c index 1ce6012ce3c1..d1b824e65590 100644 --- a/drivers/mmc/core/block.c +++ b/drivers/mmc/core/block.c @@ -566,14 +566,6 @@ static int mmc_blk_ioctl_cmd(struct block_device *bdev, int err = 0, ioc_err = 0; struct request *req; - /* -* The caller must have CAP_SYS_RAWIO, and must be calling this on the -* whole block device, not on a partition. This prevents overspray -* between sibling partitions. -*/ - if ((!capable(CAP_SYS_RAWIO)) || (bdev != bdev->bd_contains)) - return -EPERM; - idata = mmc_blk_ioctl_copy_from_user(ic_ptr); if (IS_ERR(idata)) return PTR_ERR(idata); @@ -626,14 +618,6 @@ static int mmc_blk_ioctl_multi_cmd(struct block_device *bdev, __u64 num_of_cmds; struct request *req; - /* -* The caller must have CAP_SYS_RAWIO, and must be calling this on the -* whole block device, not on a partition. This prevents overspray -* between sibling partitions. -*/ - if ((!capable(CAP_SYS_RAWIO)) || (bdev != bdev->bd_contains)) - return -EPERM; - if (copy_from_user(_of_cmds, >num_of_cmds, sizeof(num_of_cmds))) return -EFAULT; @@ -697,14 +681,34 @@ static int mmc_blk_ioctl_multi_cmd(struct block_device *bdev, return ioc_err ? ioc_err : err; } +static int mmc_blk_check_blkdev(struct block_device *bdev) +{ + /* +* The caller must have CAP_SYS_RAWIO, and must be calling this on the +* whole block device, not on a partition. This prevents overspray +* between sibling partitions. +*/ + if ((!capable(CAP_SYS_RAWIO)) || (bdev != bdev->bd_contains)) + return -EPERM; + return 0; +} + static int mmc_blk_ioctl(struct block_device *bdev, fmode_t mode, unsigned int cmd, unsigned long arg) { + int ret; + switch (cmd) { case MMC_IOC_CMD: + ret = mmc_blk_check_blkdev(bdev); + if (ret) + return ret; return mmc_blk_ioctl_cmd(bdev, (struct mmc_ioc_cmd __user *)arg); case MMC_IOC_MULTI_CMD: + ret = mmc_blk_check_blkdev(bdev); + if (ret) + return ret; return mmc_blk_ioctl_multi_cmd(bdev, (struct mmc_ioc_multi_cmd __user *)arg); default:
Re: [PATCH v4] mmc: sdio: check the buffer address for sdio API
Hi Russell, On 2017/2/15 3:34, Russell King - ARM Linux wrote: On Tue, Feb 14, 2017 at 09:18:43AM -0700, Jens Axboe wrote: The current situation seems like a bit of a mess. Why don't you have two entry points, one for DMA and one for PIO. If the caller doesn't know if he can use DMA, he'd better call the PIO variant. Either that, or audit all callers and ensure they do the right thing wrt having a dma capable buffer. It really shouldn't matter. MMC interfaces are just like USB - you have a host controller, which interfaces what is a multi-lane serial bus to the system. The SDIO card shouldn't care one bit whether the host controller is using DMA or PIO. However, I think that the DMA vs PIO thing is actually misleading here, that's really not the issue at all. Looking at the oops, I see it uses sdio_memcpy_toio(). Tracing that code leads me to here: for_each_sg(data.sg, sg_ptr, data.sg_len, i) { sg_set_page(sg_ptr, virt_to_page(buf + (i * seg_size)), min(seg_size, left_size), offset_in_page(buf + (i * seg_size))); so the buffer that is passed into sdio_memcpy_toio() gets passed into virt_to_page(). Firstly, the fact that it's passed to virt_to_page() means that "buf" must _only_ _ever_ be a lowmem address. It can't ever be a vmalloc address (virt_to_page() is invalid on anything but lowmem.) Just like certain kernel interfaces, passing pointers to memory of different types from the one intended by the interface produces invalid results, and that seems to be what's happening here. Secondly, it's a scatterlist, and scatterlists can be passed to DMA mapping operations, which also implies that _if_ a host driver decides to use DMA on it, the buffer better be DMA-able. Thirdly, while PIO may work (or even appear to work) because _maybe_ converting a vmalloc address to a ficticious struct page + offset, and then converting that back again _might_ result in hitting the correct memory, but it's not guaranteed to. [1]: If no DMA involved, the host drivers usually use memcpy or readl/writel to transfer the data between MMIO address and buffer coming from the caller. So, is it also not guaranteed when using memcpy or readl to transfer data between MMIO address and vmalloc/heap buffer? I suspect that virt_to_page() + kmap_atomic() is likely to try to dereference a struct page pointer that does not point at a legal entry in the memmap arrays, and result in scribbling over some random part of kernel memory. If that is the fact, so what I am concerned mostly is that by seraching the APIs, sdio_writeb and sdio_readb, under the drivers/net /wireless/, I could see almost all sdio based WLAN drivers passed in a vmalloc area(actually when built as moudle, it should be located in MODULE range which also be included as vmalloc area, no?) or heap buffer. I assume my question[1] above is fine, then thanks to none of the mmc host drivers use DMA for sdio_writeb and sdio_readb since it only request one byte which didn't be fetched from host FIFO and the host controller HW didn't support this kind of request to use DMA(but may be not in the future). Otherwise, it may result in scribbling over some random part of kernel memory. Actually we didn't see that issues before, so I think the actual question should be if the buffer from heap or vmalloc will be used with DMA involved? So every way I look at this, the binary driver that Shawn downloaded is buggy, whether the host controller uses PIO or DMA. So it's really more dangerous that we were/are taking risking of scribbling over some memory belonging to other code even if using PIO if it also not guaranteed to use heap/vmalloc buffer.. I bet if Shawn tries running it against a modern kernel with CONFIG_DEBUG_VIRTUAL enabled, Shawn will get complaints backing up my claim.