Re: [PATCH 37/45] drivers: use req op accessor
On Thu, Aug 4, 2016 at 10:46 AM, Christoph Hellwig wrote: > On Wed, Aug 03, 2016 at 07:30:29PM -0500, Shaun Tancheff wrote: >> I think the translation in loop.c is suspicious here: >> >> "if use DIO && not (a flush_flag or discard_flag)" >> should translate to: >> "if use DIO && not ((a flush_flag) || op == discard)" >> >> But in the patch I read: >> "if use DIO && ((not a flush_flag) || op == discard) >> >> Which would have DIO && discards follow the AIO path? > > Indeed. Sorry for missing out on your patch, I just sent a fix > in reply to Dave's other report earlier which is pretty similar to > yours. No worries. I prefer your switch to a an if conditional here. -- Shaun Tancheff
Re: [PATCH 37/45] drivers: use req op accessor
On Wed, Aug 03, 2016 at 07:30:29PM -0500, Shaun Tancheff wrote: > I think the translation in loop.c is suspicious here: > > "if use DIO && not (a flush_flag or discard_flag)" > should translate to: > "if use DIO && not ((a flush_flag) || op == discard)" > > But in the patch I read: > "if use DIO && ((not a flush_flag) || op == discard) > > Which would have DIO && discards follow the AIO path? Indeed. Sorry for missing out on your patch, I just sent a fix in reply to Dave's other report earlier which is pretty similar to yours.
Re: [PATCH 37/45] drivers: use req op accessor
On 08/03/2016 07:30 PM, Shaun Tancheff wrote: > On Wed, Aug 3, 2016 at 6:47 PM, Mike Christie wrote: >> On 08/03/2016 05:33 PM, Ross Zwisler wrote: >>> On Sun, Jun 5, 2016 at 1:32 PM, wrote: From: Mike Christie The req operation REQ_OP is separated from the rq_flag_bits definition. This converts the block layer drivers to use req_op to get the op from the request struct. Signed-off-by: Mike Christie --- drivers/block/loop.c | 6 +++--- drivers/block/mtip32xx/mtip32xx.c | 2 +- drivers/block/nbd.c | 2 +- drivers/block/rbd.c | 4 ++-- drivers/block/xen-blkfront.c | 8 +--- drivers/ide/ide-floppy.c | 2 +- drivers/md/dm.c | 2 +- drivers/mmc/card/block.c | 7 +++ drivers/mmc/card/queue.c | 6 ++ >>> >>> Dave Chinner reported a deadlock with XFS + DAX, which I reproduced >>> and bisected to this commit: >>> >>> commit c2df40dfb8c015211ec55f4b1dd0587f875c7b34 >>> Author: Mike Christie >>> Date: Sun Jun 5 14:32:17 2016 -0500 >>> drivers: use req op accessor >>> >>> Here are the steps to reproduce the deadlock with a BRD ramdisk: >>> >>> mkfs.xfs -f /dev/ram0 >>> mount -o dax /dev/ram0 /mnt/scratch >> >> When using ramdisks, we need the attached patch like in your other bug >> report. I think it will fix some hangs people are seeing. >> >> I do not think that it should cause the failure to run issue you saw >> when doing generic/008 and ext2. >> > > I think the translation in loop.c is suspicious here: > > "if use DIO && not (a flush_flag or discard_flag)" > should translate to: > "if use DIO && not ((a flush_flag) || op == discard)" > > But in the patch I read: > "if use DIO && ((not a flush_flag) || op == discard) > > Which would have DIO && discards follow the AIO path? > > So I would humbly suggest something like the following > (on top of commit c2df40dfb8c015211ec55f4b1dd0587f875c7b34): > [Please excuse the messed up patch format ... gmail eats tabs] > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > index b9b737c..0754d83 100644 > --- a/drivers/block/loop.c > +++ b/drivers/block/loop.c > @@ -1659,8 +1659,9 @@ static int loop_queue_rq(struct blk_mq_hw_ctx *hctx, > if (lo->lo_state != Lo_bound) > return -EIO; > > - if (lo->use_dio && (!(cmd->rq->cmd_flags & REQ_FLUSH) || > - req_op(cmd->rq) == REQ_OP_DISCARD)) > + if (lo->use_dio && !( > + (cmd->rq->cmd_flags & REQ_FLUSH) || > +req_op(cmd->rq) == REQ_OP_DISCARD)) > cmd->use_aio = true; > else > cmd->use_aio = false; > You are right. The translation was bad and your code above is correct. I think we need my patch in the other mail though too, because for the rw_page user case if WB_SYNC_ALL is set, then the IO gets sent down as a read instead of a write.
Re: [PATCH 37/45] drivers: use req op accessor
On Wed, Aug 3, 2016 at 6:47 PM, Mike Christie wrote: > On 08/03/2016 05:33 PM, Ross Zwisler wrote: >> On Sun, Jun 5, 2016 at 1:32 PM, wrote: >>> From: Mike Christie >>> >>> The req operation REQ_OP is separated from the rq_flag_bits >>> definition. This converts the block layer drivers to >>> use req_op to get the op from the request struct. >>> >>> Signed-off-by: Mike Christie >>> --- >>> drivers/block/loop.c | 6 +++--- >>> drivers/block/mtip32xx/mtip32xx.c | 2 +- >>> drivers/block/nbd.c | 2 +- >>> drivers/block/rbd.c | 4 ++-- >>> drivers/block/xen-blkfront.c | 8 +--- >>> drivers/ide/ide-floppy.c | 2 +- >>> drivers/md/dm.c | 2 +- >>> drivers/mmc/card/block.c | 7 +++ >>> drivers/mmc/card/queue.c | 6 ++ >> >> Dave Chinner reported a deadlock with XFS + DAX, which I reproduced >> and bisected to this commit: >> >> commit c2df40dfb8c015211ec55f4b1dd0587f875c7b34 >> Author: Mike Christie >> Date: Sun Jun 5 14:32:17 2016 -0500 >> drivers: use req op accessor >> >> Here are the steps to reproduce the deadlock with a BRD ramdisk: >> >> mkfs.xfs -f /dev/ram0 >> mount -o dax /dev/ram0 /mnt/scratch > > When using ramdisks, we need the attached patch like in your other bug > report. I think it will fix some hangs people are seeing. > > I do not think that it should cause the failure to run issue you saw > when doing generic/008 and ext2. > I think the translation in loop.c is suspicious here: "if use DIO && not (a flush_flag or discard_flag)" should translate to: "if use DIO && not ((a flush_flag) || op == discard)" But in the patch I read: "if use DIO && ((not a flush_flag) || op == discard) Which would have DIO && discards follow the AIO path? So I would humbly suggest something like the following (on top of commit c2df40dfb8c015211ec55f4b1dd0587f875c7b34): [Please excuse the messed up patch format ... gmail eats tabs] diff --git a/drivers/block/loop.c b/drivers/block/loop.c index b9b737c..0754d83 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1659,8 +1659,9 @@ static int loop_queue_rq(struct blk_mq_hw_ctx *hctx, if (lo->lo_state != Lo_bound) return -EIO; - if (lo->use_dio && (!(cmd->rq->cmd_flags & REQ_FLUSH) || - req_op(cmd->rq) == REQ_OP_DISCARD)) + if (lo->use_dio && !( + (cmd->rq->cmd_flags & REQ_FLUSH) || +req_op(cmd->rq) == REQ_OP_DISCARD)) cmd->use_aio = true; else cmd->use_aio = false; -- Shaun Tancheff
Re: [PATCH 37/45] drivers: use req op accessor
On 08/03/2016 05:33 PM, Ross Zwisler wrote: > On Sun, Jun 5, 2016 at 1:32 PM, wrote: >> From: Mike Christie >> >> The req operation REQ_OP is separated from the rq_flag_bits >> definition. This converts the block layer drivers to >> use req_op to get the op from the request struct. >> >> Signed-off-by: Mike Christie >> --- >> drivers/block/loop.c | 6 +++--- >> drivers/block/mtip32xx/mtip32xx.c | 2 +- >> drivers/block/nbd.c | 2 +- >> drivers/block/rbd.c | 4 ++-- >> drivers/block/xen-blkfront.c | 8 +--- >> drivers/ide/ide-floppy.c | 2 +- >> drivers/md/dm.c | 2 +- >> drivers/mmc/card/block.c | 7 +++ >> drivers/mmc/card/queue.c | 6 ++ > > Dave Chinner reported a deadlock with XFS + DAX, which I reproduced > and bisected to this commit: > > commit c2df40dfb8c015211ec55f4b1dd0587f875c7b34 > Author: Mike Christie > Date: Sun Jun 5 14:32:17 2016 -0500 > drivers: use req op accessor > > Here are the steps to reproduce the deadlock with a BRD ramdisk: > > mkfs.xfs -f /dev/ram0 > mount -o dax /dev/ram0 /mnt/scratch When using ramdisks, we need the attached patch like in your other bug report. I think it will fix some hangs people are seeing. I do not think that it should cause the failure to run issue you saw when doing generic/008 and ext2. diff --git a/drivers/block/brd.c b/drivers/block/brd.c index 3022dad..9fbbeba 100644 --- a/drivers/block/brd.c +++ b/drivers/block/brd.c @@ -300,20 +300,20 @@ static void copy_from_brd(void *dst, struct brd_device *brd, * Process a single bvec of a bio. */ static int brd_do_bvec(struct brd_device *brd, struct page *page, - unsigned int len, unsigned int off, int rw, + unsigned int len, unsigned int off, int op, sector_t sector) { void *mem; int err = 0; - if (rw != READ) { + if (op_is_write(op)) { err = copy_to_brd_setup(brd, sector, len); if (err) goto out; } mem = kmap_atomic(page); - if (rw == READ) { + if (!op_is_write(op)) { copy_from_brd(mem + off, brd, sector, len); flush_dcache_page(page); } else { @@ -330,7 +330,6 @@ static blk_qc_t brd_make_request(struct request_queue *q, struct bio *bio) { struct block_device *bdev = bio->bi_bdev; struct brd_device *brd = bdev->bd_disk->private_data; - int rw; struct bio_vec bvec; sector_t sector; struct bvec_iter iter; @@ -347,14 +346,12 @@ static blk_qc_t brd_make_request(struct request_queue *q, struct bio *bio) goto out; } - rw = bio_data_dir(bio); - bio_for_each_segment(bvec, bio, iter) { unsigned int len = bvec.bv_len; int err; err = brd_do_bvec(brd, bvec.bv_page, len, - bvec.bv_offset, rw, sector); + bvec.bv_offset, bio_op(bio), sector); if (err) goto io_error; sector += len >> SECTOR_SHIFT; @@ -369,11 +366,11 @@ io_error: } static int brd_rw_page(struct block_device *bdev, sector_t sector, - struct page *page, int rw) + struct page *page, int op, int op_flags) { struct brd_device *brd = bdev->bd_disk->private_data; - int err = brd_do_bvec(brd, page, PAGE_SIZE, 0, rw, sector); - page_endio(page, rw & WRITE, err); + int err = brd_do_bvec(brd, page, PAGE_SIZE, 0, op, sector); + page_endio(page, op, err); return err; } diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c index 7454cf1..f0e126c 100644 --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -843,15 +843,15 @@ static void zram_bio_discard(struct zram *zram, u32 index, } static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index, - int offset, int rw) + int offset, int op) { unsigned long start_time = jiffies; int ret; - generic_start_io_acct(rw, bvec->bv_len >> SECTOR_SHIFT, + generic_start_io_acct(op, bvec->bv_len >> SECTOR_SHIFT, &zram->disk->part0); - if (rw == READ) { + if (!op_is_write(op)) { atomic64_inc(&zram->stats.num_reads); ret = zram_bvec_read(zram, bvec, index, offset); } else { @@ -859,10 +859,10 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index, ret = zram_bvec_write(zram, bvec, index, offset); } - generic_end_io_acct(rw, &zram->disk->part0, start_time); + generic_end_io_acct(op, &zram->disk->part0, start_time); if (unlikely(ret)) { - if (rw == READ) + if (!op_is_write(op)) atomic64_inc(&zram->stats.failed_reads); else atomic64_inc(&zram->stats.failed_writes); @@ -873,7 +873,7 @@ static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec, u32 index, static void __zram_make_request(struct zram *zram, struct bio *bio) { - int offset, rw; + int offset; u32 index; struct bio_vec bvec; struct bvec_iter iter; @@ -888,7 +888,6 @@ static void __zram_make_request(struct zram *zram, struct bio *bio) return; } - rw = bio_data_dir(bio); bio_for_each_segment(bvec, bio, iter) { int max_transfer_size = PAGE_SIZE - offset; @@ -903,15 +902,18 @@ static v
Re: [PATCH 37/45] drivers: use req op accessor
On Sun, Jun 5, 2016 at 1:32 PM, wrote: > From: Mike Christie > > The req operation REQ_OP is separated from the rq_flag_bits > definition. This converts the block layer drivers to > use req_op to get the op from the request struct. > > Signed-off-by: Mike Christie > --- > drivers/block/loop.c | 6 +++--- > drivers/block/mtip32xx/mtip32xx.c | 2 +- > drivers/block/nbd.c | 2 +- > drivers/block/rbd.c | 4 ++-- > drivers/block/xen-blkfront.c | 8 +--- > drivers/ide/ide-floppy.c | 2 +- > drivers/md/dm.c | 2 +- > drivers/mmc/card/block.c | 7 +++ > drivers/mmc/card/queue.c | 6 ++ Dave Chinner reported a deadlock with XFS + DAX, which I reproduced and bisected to this commit: commit c2df40dfb8c015211ec55f4b1dd0587f875c7b34 Author: Mike Christie Date: Sun Jun 5 14:32:17 2016 -0500 drivers: use req op accessor Here are the steps to reproduce the deadlock with a BRD ramdisk: mkfs.xfs -f /dev/ram0 mount -o dax /dev/ram0 /mnt/scratch xfs_io -f -c "truncate 1g" /mnt/scratch/test.img losetup -f --show /mnt/scratch/test.img mkfs.xfs -f /dev/loop0 At this point the mkfs.xfs deadlocks. Here is the stack trace gathered via "echo w > /proc/sysrq-trigger" and passed through kasan_symbolize.py: brd: module loaded XFS (ram0): DAX enabled. Warning: EXPERIMENTAL, use at your own risk XFS (ram0): Mounting V5 Filesystem XFS (ram0): Ending clean mount sysrq: SysRq : Show Blocked State taskPC stack pid father mkfs.xfsD 88060ae47b38 0 1482 1287 0x 88060ae47b38 79e8 880610fd8d98 880036011a40 8800aa6dcec0 88060ae48000 880610fd8d80 7fff 8800aa6dcec0 024000c0 88060ae47b50 81aca775 Call Trace: [] schedule+0x35/0x80 kernel/sched/core.c:3360 [] schedule_timeout+0x271/0x460 kernel/time/timer.c:1493 [] io_schedule_timeout+0xa4/0x110 kernel/sched/core.c:4969 [< inline >] do_wait_for_common kernel/sched/completion.c:75 [< inline >] __wait_for_common kernel/sched/completion.c:93 [< inline >] wait_for_common_io kernel/sched/completion.c:107 [] wait_for_completion_io+0xdf/0x120 kernel/sched/completion.c:155 [] submit_bio_wait+0x66/0x90 block/bio.c:870 [] blkdev_issue_discard+0x86/0xc0 block/blk-lib.c:115 [] blk_ioctl_discard+0xa3/0xd0 block/ioctl.c:221 [] blkdev_ioctl+0x60a/0x9e0 block/ioctl.c:510 [] block_ioctl+0x43/0x50 fs/block_dev.c:1714 [< inline >] vfs_ioctl fs/ioctl.c:43 [] do_vfs_ioctl+0xa2/0x6a0 fs/ioctl.c:674 [< inline >] SYSC_ioctl fs/ioctl.c:689 [] SyS_ioctl+0x79/0x90 fs/ioctl.c:680 [] entry_SYSCALL_64_fastpath+0x1f/0xbd arch/x86/entry/entry_64.S:207 The line numbers are for the commit above, not for linux/master. This occurs 100% as of this commit, and 0% with the previous commit. This doesn't occur if you don't use DAX, but based on the content of the commit I'm guessing that difference is due to variations in the way the two paths use discard. - Ross
Re: [PATCH 37/45] drivers: use req op accessor
On 06/05/2016 09:32 PM, mchri...@redhat.com wrote: > From: Mike Christie > > The req operation REQ_OP is separated from the rq_flag_bits > definition. This converts the block layer drivers to > use req_op to get the op from the request struct. > > Signed-off-by: Mike Christie > --- > drivers/block/loop.c | 6 +++--- > drivers/block/mtip32xx/mtip32xx.c | 2 +- > drivers/block/nbd.c | 2 +- > drivers/block/rbd.c | 4 ++-- > drivers/block/xen-blkfront.c | 8 +--- > drivers/ide/ide-floppy.c | 2 +- > drivers/md/dm.c | 2 +- > drivers/mmc/card/block.c | 7 +++ > drivers/mmc/card/queue.c | 6 ++ > drivers/mmc/card/queue.h | 5 - > drivers/mtd/mtd_blkdevs.c | 2 +- > drivers/nvme/host/core.c | 2 +- > drivers/nvme/host/nvme.h | 4 ++-- > drivers/scsi/sd.c | 25 - > 14 files changed, 43 insertions(+), 34 deletions(-) > Reviewed-by: Hannes Reinecke Cheers, Hannes -- Dr. Hannes ReineckeTeamlead Storage & Networking h...@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton HRB 21284 (AG Nürnberg)
[PATCH 37/45] drivers: use req op accessor
From: Mike Christie The req operation REQ_OP is separated from the rq_flag_bits definition. This converts the block layer drivers to use req_op to get the op from the request struct. Signed-off-by: Mike Christie --- drivers/block/loop.c | 6 +++--- drivers/block/mtip32xx/mtip32xx.c | 2 +- drivers/block/nbd.c | 2 +- drivers/block/rbd.c | 4 ++-- drivers/block/xen-blkfront.c | 8 +--- drivers/ide/ide-floppy.c | 2 +- drivers/md/dm.c | 2 +- drivers/mmc/card/block.c | 7 +++ drivers/mmc/card/queue.c | 6 ++ drivers/mmc/card/queue.h | 5 - drivers/mtd/mtd_blkdevs.c | 2 +- drivers/nvme/host/core.c | 2 +- drivers/nvme/host/nvme.h | 4 ++-- drivers/scsi/sd.c | 25 - 14 files changed, 43 insertions(+), 34 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index e9f1701..b9b737c 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -544,7 +544,7 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) if (op_is_write(req_op(rq))) { if (rq->cmd_flags & REQ_FLUSH) ret = lo_req_flush(lo, rq); - else if (rq->cmd_flags & REQ_DISCARD) + else if (req_op(rq) == REQ_OP_DISCARD) ret = lo_discard(lo, rq, pos); else if (lo->transfer) ret = lo_write_transfer(lo, rq, pos); @@ -1659,8 +1659,8 @@ static int loop_queue_rq(struct blk_mq_hw_ctx *hctx, if (lo->lo_state != Lo_bound) return -EIO; - if (lo->use_dio && !(cmd->rq->cmd_flags & (REQ_FLUSH | - REQ_DISCARD))) + if (lo->use_dio && (!(cmd->rq->cmd_flags & REQ_FLUSH) || + req_op(cmd->rq) == REQ_OP_DISCARD)) cmd->use_aio = true; else cmd->use_aio = false; diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c index 6053e46..8e3e708 100644 --- a/drivers/block/mtip32xx/mtip32xx.c +++ b/drivers/block/mtip32xx/mtip32xx.c @@ -3765,7 +3765,7 @@ static int mtip_submit_request(struct blk_mq_hw_ctx *hctx, struct request *rq) return -ENODATA; } - if (rq->cmd_flags & REQ_DISCARD) { + if (req_op(rq) == REQ_OP_DISCARD) { int err; err = mtip_send_trim(dd, blk_rq_pos(rq), blk_rq_sectors(rq)); diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 31e73a7..6c2c28d 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -282,7 +282,7 @@ static int nbd_send_req(struct nbd_device *nbd, struct request *req) if (req->cmd_type == REQ_TYPE_DRV_PRIV) type = NBD_CMD_DISC; - else if (req->cmd_flags & REQ_DISCARD) + else if (req_op(req) == REQ_OP_DISCARD) type = NBD_CMD_TRIM; else if (req->cmd_flags & REQ_FLUSH) type = NBD_CMD_FLUSH; diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 81666a5..4506620 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -3286,9 +3286,9 @@ static void rbd_queue_workfn(struct work_struct *work) goto err; } - if (rq->cmd_flags & REQ_DISCARD) + if (req_op(rq) == REQ_OP_DISCARD) op_type = OBJ_OP_DISCARD; - else if (rq->cmd_flags & REQ_WRITE) + else if (req_op(rq) == REQ_OP_WRITE) op_type = OBJ_OP_WRITE; else op_type = OBJ_OP_READ; diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index 52963a2..6fd1601 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -844,7 +844,8 @@ static int blkif_queue_request(struct request *req, struct blkfront_ring_info *r if (unlikely(rinfo->dev_info->connected != BLKIF_STATE_CONNECTED)) return 1; - if (unlikely(req->cmd_flags & (REQ_DISCARD | REQ_SECURE))) + if (unlikely(req_op(req) == REQ_OP_DISCARD || +req->cmd_flags & REQ_SECURE)) return blkif_queue_discard_req(req, rinfo); else return blkif_queue_rw_req(req, rinfo); @@ -2054,8 +2055,9 @@ static int blkif_recover(struct blkfront_info *info) /* * Get the bios in the request so we can re-queue them. */ - if (copy[i].request->cmd_flags & - (REQ_FLUSH | REQ_FUA | REQ_DISCARD | REQ_SECURE)) { + if (copy[i].request->cmd_flags & REQ_FLUSH || + req_op(copy[i].request) == REQ_OP_DISCARD || + copy[i].request->cmd_flags & (REQ_FUA | REQ_SECURE)) { /* * Flush operations don't