Re: BUG: KASAN: use-after-free in bt_for_each+0x1ea/0x29f

2018-04-05 Thread Bart Van Assche
On Wed, 2018-04-04 at 19:26 -0600, Jens Axboe wrote:
> Leaving the whole trace here, but I'm having a hard time making sense of it.
> It complains about a use-after-free in the inflight iteration, which only
> works on the queue, the requests, and the on-stack mi data. None of these
> should be freed. The allocation and free traces below point at a bio, but a
> bio isn't used in the inflight path at all. Is it possible that KASAN gets
> confused here?
> Not sure what to make of it so far.

Hello Jens,

In the many block layer tests I have run with KASAN enabled I have never seen
anything like this, nor anything that made me wonder about the reliability of
KASAN. Maybe some code outside the block layer core corrupted a request queue
data structure and triggered this weird report?
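
For reference, the in-flight accounting path Jens refers to only touches the
queue, the requests and an on-stack context. Paraphrased from the 4.16-era
code (a sketch, not an exact copy):

struct mq_inflight {
	struct hd_struct *part;
	unsigned int *inflight;
};

static void blk_mq_check_inflight(struct blk_mq_hw_ctx *hctx,
				  struct request *rq, void *priv,
				  bool reserved)
{
	struct mq_inflight *mi = priv;

	/* only the request and the on-stack mq_inflight are dereferenced */
	if (rq->part == mi->part)
		mi->inflight[rq_data_dir(rq)]++;
}

void blk_mq_in_flight(struct request_queue *q, struct hd_struct *part,
		      unsigned int inflight[2])
{
	struct mq_inflight mi = { .part = part, .inflight = inflight, };

	inflight[0] = inflight[1] = 0;
	blk_mq_queue_tag_busy_iter(q, blk_mq_check_inflight, &mi);
}

No bio is dereferenced anywhere in that path, which is what makes the report
so puzzling.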

Bart.





Re: BUG: KASAN: use-after-free in bt_for_each+0x1ea/0x29f

2018-04-04 Thread Jens Axboe
On 4/4/18 5:28 PM, Ming Lei wrote:
> Hi,
> 
> The following warning is observed once when running dbench on NVMe with
> the Linus tree (top commit 642e7fd23353).
> 
> [ 1446.882043] ==================================================================
> [ 1446.886884] BUG: KASAN: use-after-free in bt_for_each+0x1ea/0x29f
> [ 1446.888045] Read of size 8 at addr 880055a60a00 by task dbench/13443
> [ 1446.889660]
> [ 1446.889892] CPU: 1 PID: 13443 Comm: dbench Not tainted 4.16.0_642e7fd23353_master+ #1
> [ 1446.891007] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
> [ 1446.892290] Call Trace:
> [ 1446.892641]  <IRQ>
> [ 1446.892937]  dump_stack+0xf0/0x191
> [ 1446.893600]  ? dma_direct_map_page+0x6f/0x6f
> [ 1446.894425]  ? show_regs_print_info+0xa/0xa
> [ 1446.895247]  ? ext4_writepages+0x196d/0x1e6d
> [ 1446.896063]  ? do_writepages+0x57/0xa3
> [ 1446.896810]  print_address_description+0x6e/0x23b
> [ 1446.897882]  ? bt_for_each+0x1ea/0x29f
> [ 1446.898693]  kasan_report+0x247/0x285
> [ 1446.899484]  bt_for_each+0x1ea/0x29f
> [ 1446.900233]  ? blk_mq_tagset_busy_iter+0xa3/0xa3
> [ 1446.901190]  ? generic_file_buffered_read+0x14b1/0x14b1
> [ 1446.903097]  ? blk_mq_hctx_mark_pending.isra.0+0x5c/0x5c
> [ 1446.904418]  ? bio_free+0x64/0xaa
> [ 1446.905113]  ? debug_lockdep_rcu_enabled+0x26/0x52
> [ 1446.906332]  ? bio_put+0x7a/0x10e
> [ 1446.906811]  ? debug_lockdep_rcu_enabled+0x26/0x52
> [ 1446.907527]  ? blk_mq_hctx_mark_pending.isra.0+0x5c/0x5c
> [ 1446.908334]  blk_mq_queue_tag_busy_iter+0xd0/0xde
> [ 1446.909023]  blk_mq_in_flight+0xb4/0xdb
> [ 1446.909619]  ? blk_mq_exit_hctx+0x190/0x190
> [ 1446.910281]  ? ext4_end_bio+0x25d/0x2a1
> [ 1446.911713]  part_in_flight+0xc0/0x2ac
> [ 1446.912470]  ? ext4_put_io_end_defer+0x277/0x277
> [ 1446.913465]  ? part_dec_in_flight+0x8f/0x8f
> [ 1446.914375]  ? __lock_acquire+0x38/0x8e5
> [ 1446.915182]  ? bio_endio+0x3d9/0x41c
> [ 1446.915936]  ? __rcu_read_unlock+0x134/0x180
> [ 1446.916796]  ? lock_acquire+0x2ba/0x32d
> [ 1446.917570]  ? blk_account_io_done+0xea/0x572
> [ 1446.918424]  part_round_stats+0x167/0x1a3
> [ 1446.919188]  ? part_round_stats_single.isra.1+0xc7/0xc7
> [ 1446.920187]  blk_account_io_done+0x34d/0x572
> [ 1446.921056]  ? blk_update_bidi_request+0x8f/0x8f
> [ 1446.921923]  ? blk_mq_run_hw_queue+0x13d/0x187
> [ 1446.922803]  blk_mq_end_request+0x3f/0xbf
> [ 1446.923631]  nvme_complete_rq+0x305/0x348 [nvme_core]
> [ 1446.924612]  ? nvme_delete_ctrl_sync+0x5c/0x5c [nvme_core]
> [ 1446.925696]  ? nvme_pci_complete_rq+0x1f6/0x20c [nvme]
> [ 1446.926673]  ? kfree+0x21c/0x2ab
> [ 1446.927317]  ? nvme_pci_complete_rq+0x1f6/0x20c [nvme]
> [ 1446.928239]  __blk_mq_complete_request+0x391/0x3ee
> [ 1446.928938]  ? blk_mq_free_request+0x479/0x479
> [ 1446.929588]  ? rcu_read_lock_bh_held+0x3a/0x3a
> [ 1446.930321]  ? enqueue_hrtimer+0x252/0x29a
> [ 1446.930938]  ? do_raw_spin_lock+0xd8/0xd8
> [ 1446.931532]  ? debug_lockdep_rcu_enabled+0x26/0x52
> [ 1446.932425]  blk_mq_complete_request+0x10e/0x159
> [ 1446.933341]  ? hctx_lock+0xe8/0xe8
> [ 1446.933985]  ? lock_contended+0x680/0x680
> [ 1446.934707]  ? lock_downgrade+0x338/0x338
> [ 1446.935463]  nvme_process_cq+0x26a/0x34d [nvme]
> [ 1446.936297]  ? nvme_init_hctx+0xa6/0xa6 [nvme]
> [ 1446.937150]  nvme_irq+0x23/0x51 [nvme]
> [ 1446.937864]  ? nvme_process_cq+0x34d/0x34d [nvme]
> [ 1446.938713]  __handle_irq_event_percpu+0x29d/0x568
> [ 1446.939516]  ? __irq_wake_thread+0x99/0x99
> [ 1446.940241]  ? rcu_user_enter+0x72/0x72
> [ 1446.940978]  ? do_timer+0x25/0x25
> [ 1446.941650]  ? do_raw_spin_unlock+0x146/0x179
> [ 1446.942514]  ? __lock_acquire+0x38/0x8e5
> [ 1446.943305]  ? debug_lockdep_rcu_enabled+0x26/0x52
> [ 1446.944242]  ? lock_acquire+0x32d/0x32d
> [ 1446.944995]  ? lock_contended+0x680/0x680
> [ 1446.945718]  handle_irq_event_percpu+0x7c/0xf7
> [ 1446.946438]  ? __handle_irq_event_percpu+0x568/0x568
> [ 1446.947124]  ? rcu_user_exit+0xa/0xa
> [ 1446.947781]  handle_irq_event+0x53/0x83
> [ 1446.948553]  handle_edge_irq+0x1f2/0x279
> [ 1446.949397]  handle_irq+0x1d8/0x1e9
> [ 1446.950094]  do_IRQ+0x90/0x12d
> [ 1446.950750]  common_interrupt+0xf/0xf
> [ 1446.951507]  </IRQ>
> [ 1446.951953] RIP: 0010:__blk_mq_get_tag+0x201/0x22d
> [ 1446.952894] RSP: 0018:880055b467a0 EFLAGS: 0246 ORIG_RAX: ffdc
> [ 1446.954295] RAX:  RBX: 88005952f648 RCX: 
> [ 1446.955641] RDX: 0259 RSI:  RDI: ed000ab68d06
> [ 1446.956972] RBP: ed000ab68cf6 R08: 0007 R09: 
> [ 1446.958356] R10: ed000a0ec0f2 R11: ed000a0ec0f1 R12: 88007f113978
> [ 1446.959737] R13: 880055b46ce8 R14: dc00 R15: 880058bf60c0
> [ 1446.961184]  ? modules_open+0x5e/0x5e
> [ 1446.961922]  ? blk_mq_unique_tag+0xc5/0xc5
> [ 1446.962748]  ? lock_acquire+0x32d/0x32d
> [ 1446.963534]  ? __rcu_read_unlock+0x134/0x180
> [ 1446.964393]  ? 

Re: BUG: KASAN: use-after-free in scsi_exit_rq

2017-05-02 Thread Bart Van Assche
On Tue, 2017-05-02 at 16:41 +0200, Jan Kara wrote:
> So I'm also not aware of any particular breakage this would cause. However,
> logically the freeing of the request mempools really belongs in
> blk_release_queue(), so it seems a bit dumb to move blk_exit_rl() just
> because SCSI records which slab cache the sense buffer was allocated from
> in a structure (shost) that it frees under its hands by the time
> blk_release_queue() is called. :-|

Hello Jan,

My concern when I wrote my previous e-mail was that I didn't want to add a
scsi_host_get() / scsi_host_put() pair to the hot path in the SCSI core. But
I just realized that scsi_init_rq() and scsi_exit_rq() are not in the hot
path, so adding a scsi_host_get() / scsi_host_put() pair there should work
fine. I will post a patch.
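
Something along these lines, i.e. pinning the host for as long as the
per-request sense buffer exists (a rough sketch of the idea only, not the
patch that will be posted; "shost" stands for however the Scsi_Host is looked
up in these helpers):

static int scsi_init_rq(struct request_queue *q, struct request *rq, gfp_t gfp)
{
	/* ... existing sense-buffer allocation for this request ... */

	scsi_host_get(shost);	/* pin shost so scsi_exit_rq() can still use it */
	return 0;
}

static void scsi_exit_rq(struct request_queue *q, struct request *rq)
{
	/* ... existing sense-buffer freeing, which dereferences shost ... */

	scsi_host_put(shost);	/* drop the reference taken in scsi_init_rq() */
}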

Bart.


Re: BUG: KASAN: use-after-free in scsi_exit_rq

2017-05-02 Thread Jan Kara
On Fri 28-04-17 17:46:47, Tejun Heo wrote:
> On Fri, Apr 21, 2017 at 09:49:17PM +, Bart Van Assche wrote:
> > On Thu, 2017-04-20 at 15:18 -0600, Scott Bauer wrote:
> > > [  642.638860] BUG: KASAN: use-after-free in scsi_exit_rq+0xf3/0x120 at addr 8802b7fedf00
> > > [  642.639362] Read of size 1 by task rcuos/5/53
> > > [  642.639713] CPU: 7 PID: 53 Comm: rcuos/6 Not tainted 4.11.0-rc5+ #13
> > > [  642.640170] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
> > > [  642.640923] Call Trace:
> > > [  642.641080]  dump_stack+0x63/0x8f
> > > [  642.641289]  kasan_object_err+0x21/0x70
> > > [  642.641531]  kasan_report.part.1+0x231/0x500
> > > [  642.641823]  ? scsi_exit_rq+0xf3/0x120
> > > [  642.642054]  ? _raw_spin_unlock_irqrestore+0xe/0x10
> > > [  642.642353]  ? free_percpu+0x1b7/0x340
> > > [  642.642586]  ? put_task_stack+0x117/0x2b0
> > > [  642.642837]  __asan_report_load1_noabort+0x2e/0x30
> > > [  642.643138]  scsi_exit_rq+0xf3/0x120
> > > [  642.643366]  free_request_size+0x44/0x60
> > > [  642.643614]  mempool_destroy.part.6+0x9b/0x150
> > > [  642.643899]  ? kasan_slab_free+0x87/0xb0
> > > [  642.644152]  mempool_destroy+0x13/0x20
> > > [  642.644394]  blk_exit_rl+0x36/0x40
> > > [  642.644614]  blkg_free+0x146/0x200
> > > [  642.644836]  __blkg_release_rcu+0x121/0x220
> > > [  642.645112]  rcu_nocb_kthread+0x61f/0xca0
> > > [  642.645376]  ? get_state_synchronize_rcu+0x20/0x20
> > > [  642.645690]  ? pci_mmcfg_check_reserved+0x110/0x110
> > > [  642.646011]  kthread+0x298/0x390
> > > [  642.646224]  ? get_state_synchronize_rcu+0x20/0x20
> > > [  642.646535]  ? kthread_park+0x160/0x160
> > > [  642.646787]  ret_from_fork+0x2c/0x40
> > 
> > I'm not familiar with cgroups, but seeing this makes me wonder whether it
> > would be possible to move the blk_exit_rl() calls from blk_release_queue()
> > into blk_cleanup_queue()? The SCSI core frees a SCSI host after
> > blk_cleanup_queue() has finished for all associated SCSI devices. This is
> > why I think that calling blk_exit_rl() earlier would be sufficient to
> > prevent scsi_exit_rq() from dereferencing a SCSI host pointer after it has
> > been freed.
> 
> Hmm... I see.  I didn't know the request put path involved dereferencing
> those structs.  The blk_exit_rl() call just replaced an open-coded
> mempool_destroy() call, so the destruction order was always like this.
> We shouldn't have any requests in flight by cleanup time, so moving it
> there should be fine.

So I'm also not aware of any particular breakage this would cause. However,
logically the freeing of the request mempools really belongs in
blk_release_queue(), so it seems a bit dumb to move blk_exit_rl() just
because SCSI records which slab cache the sense buffer was allocated from
in a structure (shost) that it frees under its hands by the time
blk_release_queue() is called. :-|

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: BUG: KASAN: use-after-free in scsi_exit_rq

2017-04-21 Thread Bart Van Assche
On Thu, 2017-04-20 at 15:18 -0600, Scott Bauer wrote:
> [  642.638860] BUG: KASAN: use-after-free in scsi_exit_rq+0xf3/0x120 at addr 8802b7fedf00
> [  642.639362] Read of size 1 by task rcuos/5/53
> [  642.639713] CPU: 7 PID: 53 Comm: rcuos/6 Not tainted 4.11.0-rc5+ #13
> [  642.640170] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
> [  642.640923] Call Trace:
> [  642.641080]  dump_stack+0x63/0x8f
> [  642.641289]  kasan_object_err+0x21/0x70
> [  642.641531]  kasan_report.part.1+0x231/0x500
> [  642.641823]  ? scsi_exit_rq+0xf3/0x120
> [  642.642054]  ? _raw_spin_unlock_irqrestore+0xe/0x10
> [  642.642353]  ? free_percpu+0x1b7/0x340
> [  642.642586]  ? put_task_stack+0x117/0x2b0
> [  642.642837]  __asan_report_load1_noabort+0x2e/0x30
> [  642.643138]  scsi_exit_rq+0xf3/0x120
> [  642.643366]  free_request_size+0x44/0x60
> [  642.643614]  mempool_destroy.part.6+0x9b/0x150
> [  642.643899]  ? kasan_slab_free+0x87/0xb0
> [  642.644152]  mempool_destroy+0x13/0x20
> [  642.644394]  blk_exit_rl+0x36/0x40
> [  642.644614]  blkg_free+0x146/0x200
> [  642.644836]  __blkg_release_rcu+0x121/0x220
> [  642.645112]  rcu_nocb_kthread+0x61f/0xca0
> [  642.645376]  ? get_state_synchronize_rcu+0x20/0x20
> [  642.645690]  ? pci_mmcfg_check_reserved+0x110/0x110
> [  642.646011]  kthread+0x298/0x390
> [  642.646224]  ? get_state_synchronize_rcu+0x20/0x20
> [  642.646535]  ? kthread_park+0x160/0x160
> [  642.646787]  ret_from_fork+0x2c/0x40

I'm not familiar with cgroups, but seeing this makes me wonder whether it
would be possible to move the blk_exit_rl() calls from blk_release_queue()
into blk_cleanup_queue()? The SCSI core frees a SCSI host after
blk_cleanup_queue() has finished for all associated SCSI devices. This is
why I think that calling blk_exit_rl() earlier would be sufficient to
prevent scsi_exit_rq() from dereferencing a SCSI host pointer after it has
been freed.
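
Roughly (a sketch of the idea only, not a tested patch):

void blk_cleanup_queue(struct request_queue *q)
{
	/* ... existing queue draining and teardown ... */

	/*
	 * Sketch: release the request_list mempools here, while the SCSI
	 * host that scsi_exit_rq() dereferences is guaranteed to still
	 * exist, instead of from blk_release_queue() / blkg_free(). The
	 * per-blkg request lists would need the same treatment.
	 */
	blk_exit_rl(&q->root_rl);

	/* ... */
}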

Bart.

Re: BUG: KASAN: Use-after-free

2017-01-24 Thread Matias Bjørling
On 01/24/2017 02:34 PM, Christoph Hellwig wrote:
> On Tue, Jan 24, 2017 at 11:01:42AM +0100, Matias Bjørling wrote:
>> Yup. That fixes it. Should I put together the patch, or will you take
>> care of it?
> 
> I'll send it out.  Of course with proper reporting credits for you.
> 

Awesome. Thanks


Re: BUG: KASAN: Use-after-free

2017-01-24 Thread Christoph Hellwig
On Tue, Jan 24, 2017 at 11:01:42AM +0100, Matias Bjørling wrote:
> Yup. That fixes it. Should I put together the patch, or will you take
> care of it?

I'll send it out.  Of course with proper reporting credits for you.


Re: BUG: KASAN: Use-after-free

2017-01-24 Thread Matias Bjørling
On 01/24/2017 10:52 AM, Christoph Hellwig wrote:
> On Tue, Jan 24, 2017 at 10:32:11AM +0100, Matias Bjørling wrote:
>> (gdb) list *blkdev_direct_IO+0x50c
>> 0x8142ab8c is in blkdev_direct_IO (fs/block_dev.c:401).
>> 396  submit_bio(bio);
>> 397  bio = bio_alloc(GFP_KERNEL, nr_pages);
>> 398  }
>> 399  blk_finish_plug();
>> 400  
>> 401  if (!dio->is_sync)
>> 402  return -EIOCBQUEUED;
>>
>> It looks like the qc = submit_bio() completes the I/O before the
>> blkdev_direct_IO completes, which then leads to the use after free case,
>> when the private dio struct is accessed.
> 
> Ok, the is_sync check here is clearly a bug, we need a local variable
> for is_sync as well for the aio case:
> 
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 5db5d13..3c47614 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -331,7 +331,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
>   struct blk_plug plug;
>   struct blkdev_dio *dio;
>   struct bio *bio;
> - bool is_read = (iov_iter_rw(iter) == READ);
> + bool is_read = (iov_iter_rw(iter) == READ), is_sync;
>   loff_t pos = iocb->ki_pos;
>   blk_qc_t qc = BLK_QC_T_NONE;
>   int ret;
> @@ -344,7 +344,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
>   bio_get(bio); /* extra ref for the completion handler */
>  
>   dio = container_of(bio, struct blkdev_dio, bio);
> - dio->is_sync = is_sync_kiocb(iocb);
> + dio->is_sync = is_sync = is_sync_kiocb(iocb);
>   if (dio->is_sync)
>   dio->waiter = current;
>   else
> @@ -398,7 +398,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
>   }
>   blk_finish_plug();
>  
> - if (!dio->is_sync)
> + if (!is_sync)
>   return -EIOCBQUEUED;
>  
>   for (;;) {
> 

Yup. That fixes it. Should I put together the patch, or will you take
care of it?


Re: BUG: KASAN: Use-after-free

2017-01-24 Thread Christoph Hellwig
On Tue, Jan 24, 2017 at 10:32:11AM +0100, Matias Bjørling wrote:
> *(gdb) list *blkdev_direct_IO+0x50c
> 0x8142ab8c is in blkdev_direct_IO (fs/block_dev.c:401).
> 396   submit_bio(bio);
> 397   bio = bio_alloc(GFP_KERNEL, nr_pages);
> 398   }
> 399   blk_finish_plug();
> 400   
> 401   if (!dio->is_sync)
> 402   return -EIOCBQUEUED;
> 
> It looks like the qc = submit_bio() completes the I/O before the
> blkdev_direct_IO completes, which then leads to the use after free case,
> when the private dio struct is accessed.

Ok, the is_sync check here is clearly a bug, we need a local variable
for is_sync as well for the aio case:

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 5db5d13..3c47614 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -331,7 +331,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
struct blk_plug plug;
struct blkdev_dio *dio;
struct bio *bio;
-   bool is_read = (iov_iter_rw(iter) == READ);
+   bool is_read = (iov_iter_rw(iter) == READ), is_sync;
loff_t pos = iocb->ki_pos;
blk_qc_t qc = BLK_QC_T_NONE;
int ret;
@@ -344,7 +344,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
bio_get(bio); /* extra ref for the completion handler */
 
dio = container_of(bio, struct blkdev_dio, bio);
-   dio->is_sync = is_sync_kiocb(iocb);
+   dio->is_sync = is_sync = is_sync_kiocb(iocb);
if (dio->is_sync)
dio->waiter = current;
else
@@ -398,7 +398,7 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
}
blk_finish_plug();
 
-   if (!dio->is_sync)
+   if (!is_sync)
return -EIOCBQUEUED;
 
for (;;) {


Re: BUG: KASAN: Use-after-free

2017-01-24 Thread Matias Bjørling
On 01/23/2017 06:20 PM, Christoph Hellwig wrote:
> On Mon, Jan 23, 2017 at 05:07:52PM +0100, Matias Bjørling wrote:
>> Hi,
>>
>> I could use some help verifying a use-after-free bug that I am seeing
>> after the new direct I/O work went in.
>>
>> When issuing a direct write I/O using libaio, a bio is referenced in the
>> blkdev_direct_IO path and then put again in the blkdev_bio_end_io path.
>> However, there is a case where the bio is put twice, which causes modules
>> that rely on the bio after blkdev_bio_end_io() to access it after it has
>> been freed.
> 
> Can you reproduce anything like this with a normal block device?

Looks like bcache has something similar with a get/put in it. I'll look
a bit more.

> 
>>
>> The KASAN error report:
>>
>> [   14.645916] ==================================================================
>> [   14.648027] BUG: KASAN: use-after-free in blkdev_direct_IO+0x50c/0x600 at addr 8801ef30ea14
> 
> Can you resolve that address for me, please?
> 

(gdb) list *blkdev_direct_IO+0x50c
0x8142ab8c is in blkdev_direct_IO (fs/block_dev.c:401).
396 submit_bio(bio);
397 bio = bio_alloc(GFP_KERNEL, nr_pages);
398 }
399 blk_finish_plug();
400 
401 if (!dio->is_sync)
402 return -EIOCBQUEUED;

It looks like the qc = submit_bio() completes the I/O before the
blkdev_direct_IO completes, which then leads to the use after free case,
when the private dio struct is accessed.

>> Which boils down to the bio already being freed by the time we hit the
>> module's endio function.
>>
>> The double bio_put() occurs when dio->should_dirty == 0 and dio->is_sync
>> == 0. The flow is as follows:
>>
>> Issuing:
>>   blkdev_direct_IO
>>  get_bio(bio)
> 
> bio refcounting in __blkdev_direct_IO works as follows:
> 
> The first bio is allocated using bio_alloc_bioset to embed the dio structure
> into it.  We then grab a second reference to that bio, and only that one.
> Every other bio just gets its single reference from bio_alloc.
> 
> blkdev_bio_end_io then checks if we hit the last reference on the dio, in
> which case it drops the additional reference on the first bio after
> calling the aio completion handler.  Once that is done it always drops the
> reference for the current bio - either explicitly or through
> bio_check_pages_dirty in the should_dirty case.
> 
>>  submit_io()
>>rrpc make_request(bio)
>>  get_bio(bio)
>> Completion:
>>   blkdev_bio_end_io
>> bio_put(bio)
>> bio_put(bio) - bio is freed
> 
> Yes, and that's how it's intended.
> 
>>   rrpc_end_io
>> bio_put(bio) (use-after-free access)
> 
> Can you check this is the same bio that got submitted and it didn't
> get bounced somewhere?
> 

Yup, the same:

[11.329950] blkdev_direct_io: get_bio (bio=8801f1e7a018) get ref
[11.331557] blkdev_direct_io: (!nr_pages) (bio=8801f1e7a018) submit
[11.333603] rrpc bio_get: (bio=8801f1e7a018) get ref
[11.335004] blkdev_bio_end_io:!dio->is_syn(bio=8801f1e7a018) put ref
[11.335009] rrpc bio_put: (bio=8801f1e7a018) put ref

It could look like the first get_bio() ref is decremented prematurely in
the blkdev_bio_end_io() path, where it should instead have been dropped at
the end of the blkdev_direct_IO() path.


Re: BUG: KASAN: Use-after-free

2017-01-23 Thread Christoph Hellwig
On Mon, Jan 23, 2017 at 05:07:52PM +0100, Matias Bjørling wrote:
> Hi,
> 
> I could use some help verifying a use-after-free bug that I am seeing
> after the new direct I/O work went in.
> 
> When issuing a direct write I/O using libaio, a bio is referenced in the
> blkdev_direct_IO path and then put again in the blkdev_bio_end_io path.
> However, there is a case where the bio is put twice, which causes modules
> that rely on the bio after blkdev_bio_end_io() to access it after it has
> been freed.

Can you reproduce anything like this with a normal block device?

> 
> The KASAN error report:
> 
> [   14.645916] ==================================================================
> [   14.648027] BUG: KASAN: use-after-free in blkdev_direct_IO+0x50c/0x600 at addr 8801ef30ea14

Can you resolve that address for me, please?

> Which boils down to the bio already being freed by the time we hit the
> module's endio function.
> 
> The double bio_put() occurs when dio->should_dirty == 0 and dio->is_sync
> == 0. The flow is as follows:
> 
> Issuing:
>   blkdev_direct_IO
>  get_bio(bio)

bio refcounting in __blkdev_direct_IO works as follows:

The first bio is allocated using bio_alloc_bioset to embed the dio structure
into it.  We then grab a second reference to that bio, and only that one.
Every other bio just gets its single reference from bio_alloc.

blkdev_bio_end_io then checks if we hit the last reference on the dio, in
which case it drops the additional reference on the first bio after
calling the aio completion handler.  Once that is done it always drops the
reference for the current bio - either explicitly or through
bio_check_pages_dirty in the should_dirty case.
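
Schematically (a simplified sketch of that scheme, not the exact
fs/block_dev.c source; the multi-bio bookkeeping is omitted):

static void blkdev_bio_end_io(struct bio *bio)
{
	struct blkdev_dio *dio = bio->bi_private;

	if (atomic_dec_and_test(&dio->ref)) {
		if (!dio->is_sync) {
			/* aio: run the completion, then drop the extra
			 * reference taken on the first (dio-embedding) bio */
			dio->iocb->ki_complete(dio->iocb, dio->size, 0);
			bio_put(&dio->bio);
		} else {
			wake_up_process(dio->waiter);
		}
	}

	/* always drop the current bio's own reference; the should_dirty
	 * case does this via bio_check_pages_dirty() instead */
	bio_put(bio);
}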

>  submit_io()
>rrpc make_request(bio)
>  get_bio(bio)
> Completion:
>   blkdev_bio_end_io
> bio_put(bio)
> bio_put(bio) - bio is freed

Yes, and that's how it's intended.

>   rrpc_end_io
> bio_put(bio) (use-after-free access)

Can you check this is the same bio that got submitted and it didn't
get bounced somewhere?