Re: BUG: KASAN: use-after-free in scsi_exit_rq
On Tue, 2017-05-02 at 16:41 +0200, Jan Kara wrote: > So I'm also not aware of any particular breakage this would cause. However > logically the freeing of request mempools really belongs to > blk_release_queue() so it seems a bit dumb to move blk_exit_rl() just > because SCSI stores the fact from which slab cache it has allocated the > sense buffer in a structure (shost) that it frees under its hands by the > time blk_release_queue() is called. :-| Hello Jan, My concern when I wrote my previous e-mail was that I didn't want to add a scsi_host_get() / scsi_host_put() pair to the hot path in the SCSI core. But I just realized that scsi_init_rq() and scsi_exit_rq() are not in the hot path so adding a scsi_host_get() / scsi_host_put() pair should work fine. I will post a patch. Bart.
Re: BUG: KASAN: use-after-free in scsi_exit_rq
On Fri 28-04-17 17:46:47, Tejun Heo wrote: > On Fri, Apr 21, 2017 at 09:49:17PM +, Bart Van Assche wrote: > > On Thu, 2017-04-20 at 15:18 -0600, Scott Bauer wrote: > > > [ 642.638860] BUG: KASAN: use-after-free in scsi_exit_rq+0xf3/0x120 at > > > addr 8802b7fedf00 > > > [ 642.639362] Read of size 1 by task rcuos/5/53 > > > [ 642.639713] CPU: 7 PID: 53 Comm: rcuos/6 Not tainted 4.11.0-rc5+ #13 > > > [ 642.640170] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > > > BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org > > > 04/01/2014 > > > [ 642.640923] Call Trace: > > > [ 642.641080] dump_stack+0x63/0x8f > > > [ 642.641289] kasan_object_err+0x21/0x70 > > > [ 642.641531] kasan_report.part.1+0x231/0x500 > > > [ 642.641823] ? scsi_exit_rq+0xf3/0x120 > > > [ 642.642054] ? _raw_spin_unlock_irqrestore+0xe/0x10 > > > [ 642.642353] ? free_percpu+0x1b7/0x340 > > > [ 642.642586] ? put_task_stack+0x117/0x2b0 > > > [ 642.642837] __asan_report_load1_noabort+0x2e/0x30 > > > [ 642.643138] scsi_exit_rq+0xf3/0x120 > > > [ 642.643366] free_request_size+0x44/0x60 > > > [ 642.643614] mempool_destroy.part.6+0x9b/0x150 > > > [ 642.643899] ? kasan_slab_free+0x87/0xb0 > > > [ 642.644152] mempool_destroy+0x13/0x20 > > > [ 642.644394] blk_exit_rl+0x36/0x40 > > > [ 642.644614] blkg_free+0x146/0x200 > > > [ 642.644836] __blkg_release_rcu+0x121/0x220 > > > [ 642.645112] rcu_nocb_kthread+0x61f/0xca0 > > > [ 642.645376] ? get_state_synchronize_rcu+0x20/0x20 > > > [ 642.645690] ? pci_mmcfg_check_reserved+0x110/0x110 > > > [ 642.646011] kthread+0x298/0x390 > > > [ 642.646224] ? get_state_synchronize_rcu+0x20/0x20 > > > [ 642.646535] ? kthread_park+0x160/0x160 > > > [ 642.646787] ret_from_fork+0x2c/0x40 > > > > I'm not familiar with cgroups but seeing this makes me wonder whether it > > would > > be possible to move the blk_exit_rl() calls from blk_release_queue() into > > blk_cleanup_queue()? The SCSI core frees a SCSI host after > > blk_cleanup_queue() > > has finished for all associated SCSI devices. This is why I think that > > calling > > blk_exit_rl() earlier would be sufficient to avoid that scsi_exit_rq() > > dereferences a SCSI host pointer after it has been freed. > > Hmm... I see. Didn't know request put path involved derefing those > structs. The blk_exit_rl() call just replaced open coded > mempool_destroy call, so the destruction order was always like this. > We shouldn't have any request in flight by cleanup, so moving it there > should be fine. So I'm also not aware of any particular breakage this would cause. However logically the freeing of request mempools really belongs to blk_release_queue() so it seems a bit dumb to move blk_exit_rl() just because SCSI stores the fact from which slab cache it has allocated the sense buffer in a structure (shost) that it frees under its hands by the time blk_release_queue() is called. :-| Honza -- Jan Kara SUSE Labs, CR
Re: BUG: KASAN: use-after-free in scsi_exit_rq
(cc'ing Jan) Hello, Bart. On Fri, Apr 21, 2017 at 09:49:17PM +, Bart Van Assche wrote: > On Thu, 2017-04-20 at 15:18 -0600, Scott Bauer wrote: > > [ 642.638860] BUG: KASAN: use-after-free in scsi_exit_rq+0xf3/0x120 at > > addr 8802b7fedf00 > > [ 642.639362] Read of size 1 by task rcuos/5/53 > > [ 642.639713] CPU: 7 PID: 53 Comm: rcuos/6 Not tainted 4.11.0-rc5+ #13 > > [ 642.640170] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 > > [ 642.640923] Call Trace: > > [ 642.641080] dump_stack+0x63/0x8f > > [ 642.641289] kasan_object_err+0x21/0x70 > > [ 642.641531] kasan_report.part.1+0x231/0x500 > > [ 642.641823] ? scsi_exit_rq+0xf3/0x120 > > [ 642.642054] ? _raw_spin_unlock_irqrestore+0xe/0x10 > > [ 642.642353] ? free_percpu+0x1b7/0x340 > > [ 642.642586] ? put_task_stack+0x117/0x2b0 > > [ 642.642837] __asan_report_load1_noabort+0x2e/0x30 > > [ 642.643138] scsi_exit_rq+0xf3/0x120 > > [ 642.643366] free_request_size+0x44/0x60 > > [ 642.643614] mempool_destroy.part.6+0x9b/0x150 > > [ 642.643899] ? kasan_slab_free+0x87/0xb0 > > [ 642.644152] mempool_destroy+0x13/0x20 > > [ 642.644394] blk_exit_rl+0x36/0x40 > > [ 642.644614] blkg_free+0x146/0x200 > > [ 642.644836] __blkg_release_rcu+0x121/0x220 > > [ 642.645112] rcu_nocb_kthread+0x61f/0xca0 > > [ 642.645376] ? get_state_synchronize_rcu+0x20/0x20 > > [ 642.645690] ? pci_mmcfg_check_reserved+0x110/0x110 > > [ 642.646011] kthread+0x298/0x390 > > [ 642.646224] ? get_state_synchronize_rcu+0x20/0x20 > > [ 642.646535] ? kthread_park+0x160/0x160 > > [ 642.646787] ret_from_fork+0x2c/0x40 > > I'm not familiar with cgroups but seeing this makes me wonder whether it would > be possible to move the blk_exit_rl() calls from blk_release_queue() into > blk_cleanup_queue()? The SCSI core frees a SCSI host after blk_cleanup_queue() > has finished for all associated SCSI devices. This is why I think that calling > blk_exit_rl() earlier would be sufficient to avoid that scsi_exit_rq() > dereferences a SCSI host pointer after it has been freed. Hmm... I see. Didn't know request put path involved derefing those structs. The blk_exit_rl() call just replaced open coded mempool_destroy call, so the destruction order was always like this. We shouldn't have any request in flight by cleanup, so moving it there should be fine. Thanks. -- tejun
Re: BUG: KASAN: use-after-free in scsi_exit_rq
On Thu, 2017-04-20 at 15:18 -0600, Scott Bauer wrote: > [ 642.638860] BUG: KASAN: use-after-free in scsi_exit_rq+0xf3/0x120 at addr > 8802b7fedf00 > [ 642.639362] Read of size 1 by task rcuos/5/53 > [ 642.639713] CPU: 7 PID: 53 Comm: rcuos/6 Not tainted 4.11.0-rc5+ #13 > [ 642.640170] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 > [ 642.640923] Call Trace: > [ 642.641080] dump_stack+0x63/0x8f > [ 642.641289] kasan_object_err+0x21/0x70 > [ 642.641531] kasan_report.part.1+0x231/0x500 > [ 642.641823] ? scsi_exit_rq+0xf3/0x120 > [ 642.642054] ? _raw_spin_unlock_irqrestore+0xe/0x10 > [ 642.642353] ? free_percpu+0x1b7/0x340 > [ 642.642586] ? put_task_stack+0x117/0x2b0 > [ 642.642837] __asan_report_load1_noabort+0x2e/0x30 > [ 642.643138] scsi_exit_rq+0xf3/0x120 > [ 642.643366] free_request_size+0x44/0x60 > [ 642.643614] mempool_destroy.part.6+0x9b/0x150 > [ 642.643899] ? kasan_slab_free+0x87/0xb0 > [ 642.644152] mempool_destroy+0x13/0x20 > [ 642.644394] blk_exit_rl+0x36/0x40 > [ 642.644614] blkg_free+0x146/0x200 > [ 642.644836] __blkg_release_rcu+0x121/0x220 > [ 642.645112] rcu_nocb_kthread+0x61f/0xca0 > [ 642.645376] ? get_state_synchronize_rcu+0x20/0x20 > [ 642.645690] ? pci_mmcfg_check_reserved+0x110/0x110 > [ 642.646011] kthread+0x298/0x390 > [ 642.646224] ? get_state_synchronize_rcu+0x20/0x20 > [ 642.646535] ? kthread_park+0x160/0x160 > [ 642.646787] ret_from_fork+0x2c/0x40 I'm not familiar with cgroups but seeing this makes me wonder whether it would be possible to move the blk_exit_rl() calls from blk_release_queue() into blk_cleanup_queue()? The SCSI core frees a SCSI host after blk_cleanup_queue() has finished for all associated SCSI devices. This is why I think that calling blk_exit_rl() earlier would be sufficient to avoid that scsi_exit_rq() dereferences a SCSI host pointer after it has been freed. Bart.
BUG: KASAN: use-after-free in scsi_exit_rq
Hi all, While running xfs test testing some other features scheduled for 4.12 I came across this KASAN dump: [ 638.913813] XFS (nvme0n1): Mounting V5 Filesystem [ 638.917934] XFS (nvme0n1): Ending clean mount [ 639.035070] blk_update_request: I/O error, dev nvme1n1, sector 0 [ 639.071764] XFS (nvme1n1): Mounting V5 Filesystem [ 639.077052] XFS (nvme1n1): Ending clean mount [ 639.110634] XFS (nvme0n1): Unmounting Filesystem [ 639.260132] XFS (nvme0n1): Mounting V5 Filesystem [ 639.264141] XFS (nvme0n1): Ending clean mount [ 639.282112] run fstests generic/108 at 2017-04-20 14:30:05 [ 639.525274] XFS (nvme1n1): Unmounting Filesystem [ 639.570999] scsi host2: scsi_debug: version 1.86 [20160430] [ 639.570999] dev_size_mb=128, opts=0x0, submit_queues=1, statistics=0 [ 639.573698] scsi 2:0:0:0: Direct-Access Linuxscsi_debug 0186 PQ: 0 ANSI: 7 [ 639.595927] sd 2:0:0:0: Attached scsi generic sg1 type 0 [ 639.598290] sd 2:0:0:0: [sda] 262144 512-byte logical blocks: (134 MB/128 MiB) [ 639.599884] sd 2:0:0:0: [sda] Write Protect is off [ 639.600246] sd 2:0:0:0: [sda] Mode Sense: 73 00 10 08 [ 639.602747] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA [ 639.626440] sd 2:0:0:0: [sda] Attached SCSI disk [ 641.264666] XFS (dm-0): Mounting V5 Filesystem [ 641.278227] XFS (dm-0): Ending clean mount [ 641.394865] sd 2:0:0:0: rejecting I/O to offline device [ 641.395353] sd 2:0:0:0: rejecting I/O to offline device [ 641.395903] sd 2:0:0:0: rejecting I/O to offline device [ 641.396362] sd 2:0:0:0: rejecting I/O to offline device [ 641.396896] sd 2:0:0:0: rejecting I/O to offline device [ 641.397347] sd 2:0:0:0: rejecting I/O to offline device [ 641.397888] sd 2:0:0:0: rejecting I/O to offline device [ 641.398358] sd 2:0:0:0: rejecting I/O to offline device [ 641.49] sd 2:0:0:0: rejecting I/O to offline device [ 641.400378] blk_update_request: I/O error, dev sda, sector 0 [ 641.423308] XFS (dm-0): Unmounting Filesystem [ 642.585631] sd 2:0:0:0: [sda] Synchronizing SCSI cache [ 642.637953] == [ 642.638860] BUG: KASAN: use-after-free in scsi_exit_rq+0xf3/0x120 at addr 8802b7fedf00 [ 642.639362] Read of size 1 by task rcuos/5/53 [ 642.639713] CPU: 7 PID: 53 Comm: rcuos/6 Not tainted 4.11.0-rc5+ #13 [ 642.640170] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 642.640923] Call Trace: [ 642.641080] dump_stack+0x63/0x8f [ 642.641289] kasan_object_err+0x21/0x70 [ 642.641531] kasan_report.part.1+0x231/0x500 [ 642.641823] ? scsi_exit_rq+0xf3/0x120 [ 642.642054] ? _raw_spin_unlock_irqrestore+0xe/0x10 [ 642.642353] ? free_percpu+0x1b7/0x340 [ 642.642586] ? put_task_stack+0x117/0x2b0 [ 642.642837] __asan_report_load1_noabort+0x2e/0x30 [ 642.643138] scsi_exit_rq+0xf3/0x120 [ 642.643366] free_request_size+0x44/0x60 [ 642.643614] mempool_destroy.part.6+0x9b/0x150 [ 642.643899] ? kasan_slab_free+0x87/0xb0 [ 642.644152] mempool_destroy+0x13/0x20 [ 642.644394] blk_exit_rl+0x36/0x40 [ 642.644614] blkg_free+0x146/0x200 [ 642.644836] __blkg_release_rcu+0x121/0x220 [ 642.645112] rcu_nocb_kthread+0x61f/0xca0 [ 642.645376] ? get_state_synchronize_rcu+0x20/0x20 [ 642.645690] ? pci_mmcfg_check_reserved+0x110/0x110 [ 642.646011] kthread+0x298/0x390 [ 642.646224] ? get_state_synchronize_rcu+0x20/0x20 [ 642.646535] ? kthread_park+0x160/0x160 [ 642.646787] ret_from_fork+0x2c/0x40 [ 642.647020] Object at 8802b7fedd80, in cache kmalloc-2048 size: 2048 [ 642.647435] Allocated: [ 642.647591] PID = 3992 [ 642.647750] save_stack_trace+0x1b/0x20 [ 642.648000] save_stack+0x46/0xd0 [ 642.648208] kasan_kmalloc+0xad/0xe0 [ 642.648441] __kmalloc+0x134/0x220 [ 642.648664] scsi_host_alloc+0x6b/0x11c0 [ 642.648919] 0xc101d94a [ 642.649124] driver_probe_device+0x49e/0xc60 [ 642.649398] __device_attach_driver+0x1d3/0x2a0 [ 642.649684] bus_for_each_drv+0x11a/0x1d0 [ 642.649946] __device_attach+0x1e1/0x2c0 [ 642.650200] device_initial_probe+0x13/0x20 [ 642.650557] bus_probe_device+0x19b/0x240 [ 642.651029] device_add+0x86d/0x1450 [ 642.651472] device_register+0x1a/0x20 [ 642.651778] 0xc10270ce [ 642.651990] 0xc1048a62 [ 642.652197] do_one_initcall+0xa7/0x250 [ 642.652445] do_init_module+0x1d0/0x55d [ 642.652696] load_module+0x7c9f/0x9850 [ 642.652937] SYSC_finit_module+0x189/0x1c0 [ 642.653200] SyS_finit_module+0xe/0x10 [ 642.653441] entry_SYSCALL_64_fastpath+0x1a/0xa9 [ 642.653735] Freed: [ 642.653875] PID = 4128 [ 642.654304] save_stack_trace+0x1b/0x20 [ 642.654991] save_stack+0x46/0xd0 [ 642.655200] kasan_slab_free+0x71/0xb0 [ 642.655434] kfree+0x8d/0x1b0 [ 642.655624] scsi_host_dev_release+0x2cb/0x430 [ 642.655898] device_release+0x76/0x1e0 [ 642.656141] kobject_release+0x107/0x370