Re: [PATCH v4 02/10] block, scsi: Give RQF_PREEMPT back its original meaning
On Mon, Aug 06, 2018 at 01:54:40PM +, Bart Van Assche wrote:
> On Mon, 2018-08-06 at 08:27 +0200, Hannes Reinecke wrote:
> > I am not sure this is going to work for SCSI parallel; we're using the
> > QUIESCE state there to do domain validation, and all commands there are
> > most definitely not PM requests.
> > Can you please validate your patches with eg aic7xxx and SCSI parallel
> > disks?
>
> Hello Hannes,
>
> How about using the RQF_PM flag for SCSI parallel requests instead of
> RQF_PREEMPT? That change will also avoid that RQF_PREEMPT has a double
> meaning.
> Anyway, I will see whether I can drop that change from this series and
> submit that change separately.

One thing I'd like to do is split BLK_MQ_REQ_PREEMPT and RQF_PREEMPT,
that is don't set RQF_PREEMPT automatically when BLK_MQ_REQ_PREEMPT is
passed to blk_get_request, but let the caller set it manually.  After
that RQF_PREEMPT isn't used by the block layer itself at all and can be
moved into ide/scsi specific flags that we can use as we want.
Re: [RFC PATCH 3/3] blk-mq: Remove generation seqeunce
On Thu, Jul 19, 2018 at 10:22:40AM -0600, Keith Busch wrote:
> On Thu, Jul 19, 2018 at 04:04:46PM +, Bart Van Assche wrote:
> > On Thu, 2018-07-19 at 09:56 -0600, Keith Busch wrote:
> > > Even some scsi drivers are susceptible to losing their requests with
> > > the reverted behavior: take virtio-scsi for example, which returns
> > > RESET_TIMER from its timeout handler. With the behavior everyone
> > > seems to want, a natural completion at or around the same time is
> > > lost forever because it was blocked from completion with no way to
> > > recover.
> >
> > The patch I had posted handles a completion that occurs while a timeout
> > is being handled properly. From
> > https://www.mail-archive.com/linux-block@vger.kernel.org/msg22196.html:
>
> Sounds like a win-win to me. How do we get a fix into 4.18 at this part
> of the cycle?

I think that is the most important priority right now.
Re: [RFC PATCH 3/3] blk-mq: Remove generation seqeunce
On Wed, Jul 18, 2018 at 03:17:11PM -0600, Keith Busch wrote:
> And there may be other drivers that don't want their completions
> ignored, so breaking them again is also a step in the wrong direction.
>
> There are not that many blk-mq drivers, so we can go through them all.

I think the point is that SCSI is the biggest user by both the number of
low-level drivers sitting under the midlayer, and also by usage.  We
need to be very careful not to break it.

Note that this doesn't mean that I don't want to eventually move away
from just ignoring completions in timeout state for SCSI.  I'd just
rather revert 4.18 to a clean known state instead of doctoring around
late in the rc phase.

> Most don't even implement .timeout, so they never know that condition
> ever happened. Others always return BLK_EH_RESET_TIMER without doing
> anything else, so the 'new' behavior would have to be better for those,
> too.

And we should never even hit the timeout handler for those as it is
rather pointless (although it looks like we currently do..).
Re: [RFC PATCH 3/3] blk-mq: Remove generation seqeunce
On Wed, Jul 18, 2018 at 09:56:50PM +0200, h...@lst.de wrote:
> On Fri, Jul 13, 2018 at 05:58:08PM -0600, Keith Busch wrote:
> > Of the two you mentioned, yours is preferable IMO. While I appreciate
> > Jianchao's detailed analysis, it's hard to take a proposal seriously
> > that so colourfully calls everyone else "dangerous" while advocating
> > for silently losing requests on purpose.
> >
> > But where's the option that fixes scsi to handle hardware completions
> > concurrently with arbitrary timeout software? Propping up that house of
> > cards can't be the only recourse.
>
> The important bit is that we need to fix this issue quickly. We are
> past -rc5 so I'm rather concerned about anything too complicated.
>
> I'm not even sure SCSI has a problem with multiple completions happening
> at the same time, but it certainly has a problem with bypassing
> blk_mq_complete_request from the EH path.
>
> I think we can solve this properly, but I also think we are way too late
> in the 4.18 cycle to fix it properly. For now I fear we'll just have
> to revert the changes and try again for 4.19 or even 4.20 if we don't
> act quickly enough.

So here is a quick attempt at the revert while also trying to keep nvme
working.  Keith, Bart, Jianchao - does this look reasonable as a 4.18
band aid?

http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/blk-eh-revert
Re: [RFC PATCH 3/3] blk-mq: Remove generation seqeunce
On Fri, Jul 13, 2018 at 05:58:08PM -0600, Keith Busch wrote:
> Of the two you mentioned, yours is preferable IMO. While I appreciate
> Jianchao's detailed analysis, it's hard to take a proposal seriously
> that so colourfully calls everyone else "dangerous" while advocating
> for silently losing requests on purpose.
>
> But where's the option that fixes scsi to handle hardware completions
> concurrently with arbitrary timeout software? Propping up that house of
> cards can't be the only recourse.

The important bit is that we need to fix this issue quickly.  We are
past -rc5 so I'm rather concerned about anything too complicated.

I'm not even sure SCSI has a problem with multiple completions happening
at the same time, but it certainly has a problem with bypassing
blk_mq_complete_request from the EH path.

I think we can solve this properly, but I also think we are way too
late in the 4.18 cycle to fix it properly.  For now I fear we'll just
have to revert the changes and try again for 4.19 or even 4.20 if we
don't act quickly enough.
Re: [PATCH v9 2/2] blk-mq: Rework blk-mq timeout handling again
On Wed, May 16, 2018 at 04:47:54PM +, Bart Van Assche wrote:
> I think your patch changes the order of changing the request state and
> calling mod_timer(). In my patch the request state and the deadline are
> updated first and mod_timer() is called afterwards. I think your patch
> changes the order of these operations into the following:
> (1) __blk_mq_start_request() sets the request deadline.
> (2) __blk_mq_start_request() calls __blk_add_timer() which in turn calls
>     mod_timer().
> (3) __blk_mq_start_request() changes the request state into
>     MQ_RQ_IN_FLIGHT.
>
> In the unlikely event of a significant delay between (2) and (3) it can
> happen that the timer fires and examines and ignores the request because
> its state differs from MQ_RQ_IN_FLIGHT. If the request for which this
> happened times out its timeout will only be handled the next time
> blk_mq_timeout_work() is called. Is this the behavior you intended?

We can move the timer manipulation after the state change easily I
think.  It would make sense to add comments explaining the ordering.
Re: [PATCH v9 2/2] blk-mq: Rework blk-mq timeout handling again
On Wed, May 16, 2018 at 04:17:42PM +, Bart Van Assche wrote:
> There is another reason the deadline is included in the atomic operation,
> namely to handle races between the BLK_EH_RESET_TIMER case in
> blk_mq_rq_timed_out() and blk_mq_complete_request(). I don't think that
> race is addressed properly by your patch. I will see what I can do to
> address that race without using 64-bit atomic operations.

I might be missing something here, so please help me understand what is
missing.  If we restart the timer in blk_mq_rq_timed_out we also bump
the generation at the same time as we reset the deadline in your old
patch.  With this patch we only bump the generation, but that should be
enough to address the rest, or not?
Re: [PATCH] block: Avoid executing a report or reset zones while a queue is frozen
On Tue, Apr 17, 2018 at 05:35:07PM +, Bart Van Assche wrote:
> > Hmm.  I think we need to avoid clearing that data and update it using
> > RCU instead.  Calling blk_queue_enter before submitting bios is
> > something that would make zone reporting very different from any
> > other block layer user.
>
> Hello Christoph,
>
> Please have a look at generic_make_request(). I think that function
> shows that blk_queue_enter() is called anyway before .make_request_fn()
> is called.

Yes, that is the point.  We already call blk_queue_enter in
generic_make_request, which the zone report and reset ops pass through.
Re: [PATCH] block: Ensure that a request queue is dissociated from the cgroup controller
On Thu, Apr 12, 2018 at 06:48:12AM -0700, t...@kernel.org wrote:
> > Which sounds like a very good reason not to use a driver-controlled
> > lock for internals like blkcg.
> >
> > In fact splitting the lock used for synchronizing access to queue
> > fields from the driver-controlled lock used to synchronize I/O
> > in the legacy path is long overdue.
>
> It'd be probably a lot easier to make sure the shared lock doesn't go
> away till all the request_queues using it go away.  The choice is
> between refcnting something which carries the lock and double locking
> in hot paths.  Can't think of many downsides of the former approach.

We've stopped sharing request_queues between different devices a while
ago.  The problem is just that for legacy devices the driver still
controls the lock lifetime, and it might be shorter than the queue
lifetime.

Note that for blk-mq we basically don't use the queue_lock at all, and
certainly not in the hot path.
Re: [PATCH] block: Ensure that a request queue is dissociated from the cgroup controller
On Thu, Apr 12, 2018 at 11:52:11AM +, Bart Van Assche wrote:
> On Thu, 2018-04-12 at 07:34 +0200, Christoph Hellwig wrote:
> > On Wed, Apr 11, 2018 at 07:58:52PM -0600, Bart Van Assche wrote:
> > > Several block drivers call alloc_disk() followed by put_disk() if
> > > something fails before device_add_disk() is called without calling
> > > blk_cleanup_queue(). Make sure that also for this scenario a request
> > > queue is dissociated from the cgroup controller. This patch avoids
> > > that loading the parport_pc, paride and pf drivers triggers the
> > > following kernel crash:
> >
> > Can we move the cleanup to put_disk in general and not just for
> > this case?  Having alloc/free routines pair up generally avoids
> > a lot of confusion.
>
> Hello Christoph,
>
> At least the SCSI ULP drivers drop the last reference to a disk after
> the blk_cleanup_queue() call. As explained in the description of commit
> a063057d7c73, removing a request queue from blkcg must happen before
> blk_cleanup_queue() finishes because a block driver may free the
> request queue spinlock immediately after blk_cleanup_queue() returns.
> So I don't think that we can move the code that removes a request
> queue from blkcg into put_disk(). Another challenge is that some block
> drivers (e.g. skd) clear the disk->queue pointer if device_add_disk()
> has not been called to avoid that put_disk() causes a request queue
> reference count imbalance.

Which sounds like a very good reason not to use a driver-controlled
lock for internals like blkcg.

In fact splitting the lock used for synchronizing access to queue
fields from the driver-controlled lock used to synchronize I/O in the
legacy path is long overdue.
Re: [PATCH v4] blk-mq: Fix race conditions in request timeout handling
On Tue, Apr 10, 2018 at 01:26:39PM +, Bart Van Assche wrote:
> Can you explain why you think that using cmpxchg() is safe in this
> context? The regular completion path and the timeout code can both
> execute e.g. the following code concurrently:
>
> blk_mq_change_rq_state(rq, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE);
>
> That's why I think that we need an atomic compare and exchange instead
> of cmpxchg(). What I found in the Intel Software Developer Manual seems
> to confirm that:

The Linux cmpxchg() helper is supposed to always be atomic, you only
need atomic_cmpxchg and friends if you want to operate on an atomic_t
(same for the _long variant).

In fact if you look at the x86 implementation of the cmpxchg() macro
you'll see that it will use the lock prefix if the kernel has been
built with CONFIG_SMP enabled.
Re: [PATCH v2] block: bio_check_eod() needs to consider partition
Hi Bart,

can you test the version below?

---
From a68a8518158e31d66a0dc4f4e795ca3ceb83752c Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <h...@lst.de>
Date: Tue, 13 Mar 2018 09:27:30 +0100
Subject: block: bio_check_eod() needs to consider partition

bio_check_eod() should check the partition size, not the whole disk, if
bio->bi_partno is non-zero.  Do this by moving the call to
bio_check_eod into blk_partition_remap.

Based on an earlier patch from Jiufei Xue.

Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index")
Reported-by: Jiufei Xue <jiufei@linux.alibaba.com>
Signed-off-by: Christoph Hellwig <h...@lst.de>
---
 block/blk-core.c | 93
 1 file changed, 40 insertions(+), 53 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 6d82c4f7fadd..47ee24611126 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2023,7 +2023,7 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
 	return BLK_QC_T_NONE;
 }
 
-static void handle_bad_sector(struct bio *bio)
+static void handle_bad_sector(struct bio *bio, sector_t maxsector)
 {
 	char b[BDEVNAME_SIZE];
 
@@ -2031,7 +2031,7 @@ static void handle_bad_sector(struct bio *bio)
 	printk(KERN_INFO "%s: rw=%d, want=%Lu, limit=%Lu\n",
 			bio_devname(bio, b), bio->bi_opf,
 			(unsigned long long)bio_end_sector(bio),
-			(long long)get_capacity(bio->bi_disk));
+			(long long)maxsector);
 }
 
 #ifdef CONFIG_FAIL_MAKE_REQUEST
@@ -2092,68 +2092,59 @@ static noinline int should_fail_bio(struct bio *bio)
 }
 ALLOW_ERROR_INJECTION(should_fail_bio, ERRNO);
 
+/*
+ * Check whether this bio extends beyond the end of the device or partition.
+ * This may well happen - the kernel calls bread() without checking the size
+ * of the device, e.g., when mounting a file system.
+ */
+static inline int bio_check_eod(struct bio *bio, sector_t maxsector)
+{
+	unsigned int nr_sectors = bio_sectors(bio);
+
+	if (nr_sectors && maxsector &&
+	    (nr_sectors > maxsector ||
+	     bio->bi_iter.bi_sector > maxsector - nr_sectors)) {
+		handle_bad_sector(bio, maxsector);
+		return -EIO;
+	}
+	return 0;
+}
+
 /*
  * Remap block n of partition p to block n+start(p) of the disk.
  */
 static inline int blk_partition_remap(struct bio *bio)
 {
 	struct hd_struct *p;
-	int ret = 0;
+	int ret = -EIO;
 
 	rcu_read_lock();
 	p = __disk_get_part(bio->bi_disk, bio->bi_partno);
-	if (unlikely(!p || should_fail_request(p, bio->bi_iter.bi_size) ||
-		     bio_check_ro(bio, p))) {
-		ret = -EIO;
+	if (unlikely(!p))
+		goto out;
+	if (unlikely(should_fail_request(p, bio->bi_iter.bi_size)))
+		goto out;
+	if (unlikely(bio_check_ro(bio, p)))
 		goto out;
-	}
 
 	/*
 	 * Zone reset does not include bi_size so bio_sectors() is always 0.
	 * Include a test for the reset op code and perform the remap if needed.
 	 */
-	if (!bio_sectors(bio) && bio_op(bio) != REQ_OP_ZONE_RESET)
-		goto out;
-
-	bio->bi_iter.bi_sector += p->start_sect;
-	bio->bi_partno = 0;
-	trace_block_bio_remap(bio->bi_disk->queue, bio, part_devt(p),
-			      bio->bi_iter.bi_sector - p->start_sect);
-
+	if (bio_sectors(bio) || bio_op(bio) == REQ_OP_ZONE_RESET) {
+		if (bio_check_eod(bio, part_nr_sects_read(p)))
+			goto out;
+		bio->bi_iter.bi_sector += p->start_sect;
+		bio->bi_partno = 0;
+		trace_block_bio_remap(bio->bi_disk->queue, bio, part_devt(p),
+				      bio->bi_iter.bi_sector - p->start_sect);
+	}
+	ret = 0;
 out:
 	rcu_read_unlock();
 	return ret;
 }
 
-/*
- * Check whether this bio extends beyond the end of the device.
- */
-static inline int bio_check_eod(struct bio *bio, unsigned int nr_sectors)
-{
-	sector_t maxsector;
-
-	if (!nr_sectors)
-		return 0;
-
-	/* Test device or partition size, when known. */
-	maxsector = get_capacity(bio->bi_disk);
-	if (maxsector) {
-		sector_t sector = bio->bi_iter.bi_sector;
-
-		if (maxsector < nr_sectors || maxsector - nr_sectors < sector) {
-			/*
-			 * This may well happen - the kernel calls bread()
-			 * without checking the size of the device, e.g., when
-			 * mounting a device.
-			 */
-			handle_bad_sector(bio);
-
Re: [PATCH v2] block: bio_check_eod() needs to consider partition
On Tue, Mar 13, 2018 at 04:25:40PM +, Bart Van Assche wrote:
> On Tue, 2018-03-13 at 07:49 -0700, Jens Axboe wrote:
> > On 3/13/18 2:18 AM, Christoph Hellwig wrote:
> > > bio_check_eod() should check the partition size, not the whole disk,
> > > if bio->bi_partno is non-zero. Do this by moving the call to
> > > bio_check_eod into blk_partition_remap.
> >
> > Applied, thanks.
>
> Hello Christoph and Jens,
>
> I think that this patch introduces a severe regression: with this patch
> applied a VM that I use for kernel testing doesn't boot anymore. If I
> revert this patch that VM boots fine again. This is what I see on the
> serial console with this patch applied:

Ok, please revert this for now.  I'm off for today and will look into
it tomorrow.
Re: [PATCH 2/2] block: fix peeking requests during PM
On Tue, Oct 03, 2017 at 03:13:43PM +, Bart Van Assche wrote:
> > +	switch (rq->q->rpm_status) {
> > +	case RPM_SUSPENDED:
> > +		return false;
> > +	case RPM_ACTIVE:
> > +		return rq->rq_flags & RQF_PM;
> > +	default:
> > +		return true;
> > +	}
> > }
>
> Hello Christoph,
>
> The old code does not process non-PM requests in the RPM_SUSPENDING mode
> nor in the RPM_RESUMING mode. I think the new code processes all
> requests in these two modes. Was that behavior change intended?

Yeah, looks like this version has the RPM_ACTIVE and default cases
switched.  Back to the drawing board.
Re: [PATCH v4 4/7] block: Add the QUEUE_FLAG_PREEMPT_ONLY request queue flag
On Mon, Oct 02, 2017 at 03:56:08PM +, Bart Van Assche wrote:
> Since queue_flag_set() and queue_flag_clear() use non-atomic primitives
> to set and clear flags it is essential to protect calls of these
> functions with the queue lock. Otherwise flag updates could get lost
> due to concurrent queue_flag_set() or queue_flag_clear() calls.
>
> One of the reasons why I introduced this helper function is to keep the
> wake_up_all(&q->mq_freeze_wq) call that is added to this function by a
> later patch in the block layer.

Oh, despite the _unlocked versions existing it doesn't do any atomic
ops.  Yeah, we'll need the queue lock then.  But please split it into
one helper for setting the queue blocked and one for clearing it,
especially if the clear needs to do an additional wake_up.
Re: [PATCH v3 0/6] Make SCSI device suspend and resume work reliably
On Mon, Sep 25, 2017 at 04:17:27PM +, Bart Van Assche wrote:
> This patch series definitely is an alternative for blk-mq/scsi-mq. And
> as you know my approach can be extended easily to the legacy SCSI core
> by adding blk_queue_enter() / blk_queue_exit() calls where necessary in
> the legacy block layer. I have not done this because the bug report was
> against scsi-mq and not against the legacy SCSI core. Additionally,
> since the legacy block layer and SCSI core are on their way out I did
> not want to spend time on modifying these.

For now we'll have to fix both paths to have equal functionality.  Even
if scsi might be on its way out there still are many drivers that use
the legacy request path, and with mmc we'll probably grow another dual
block layer driver soon, although hopefully the transition period will
be much shorter.
Re: [PATCH V3 06/12] scsi: sd_zbc: Rearrange code
On Fri, Sep 15, 2017 at 02:51:03PM +, Bart Van Assche wrote:
> On Fri, 2017-09-15 at 19:06 +0900, Damien Le Moal wrote:
> > Rearrange sd_zbc_setup() to include use_16_for_rw and use_10_for_rw
> > assignments and move the calculation of sdkp->zone_shift together
> > with the assignment of the verified zone_blocks value in
> > sd_zbc_check_zone_size().
>
> Both functions are called from inside
> block_device_operations.revalidate_disk. Isn't that too late to set
> use_1[06]_for_rw? Shouldn't both be set just after the slave_alloc
> callback has been called?

Hmm.  sd_read_capacity already is called from the same path and seems
to work..
Re: [PATCH] block: Call .initialize_rq_fn() also for filesystem requests
Hi Bart,

this isn't just about performance - it's about understandability of the
I/O path.

The legacy request path already has the prep_fn, which is intended for
exactly this sort of prep work, but even that led to a lot of confusion.
For blk-mq we decided to not add it but leave the driver in control.  I
really don't want to move away from that.

The passthrough requests using blk_get_request are a special case as
the caller allocates the requests, stuffs data into them and only then
hands control to the driver, and thus we need some way to initialize
the request before handing control to the driver in this particular
special case.  And if needed we could actually do that with explicit
calls instead of the callback, although you changed it to save a bit of
code.
Re: [PATCH 10/10] nvme: implement multipath access to nvme subsystems
On Wed, Aug 23, 2017 at 06:21:55PM +, Bart Van Assche wrote:
> On Wed, 2017-08-23 at 19:58 +0200, Christoph Hellwig wrote:
> > +static blk_qc_t nvme_make_request(struct request_queue *q, struct bio *bio)
> > +{
> > +	struct nvme_ns_head *head = q->queuedata;
> > +	struct nvme_ns *ns;
> > +	blk_qc_t ret = BLK_QC_T_NONE;
> > +	int srcu_idx;
> > +
> > +	srcu_idx = srcu_read_lock(&head->srcu);
> > +	ns = srcu_dereference(head->current_path, &head->srcu);
> > +	if (unlikely(!ns || ns->ctrl->state != NVME_CTRL_LIVE))
> > +		ns = nvme_find_path(head);
> > +	if (likely(ns)) {
> > +		bio->bi_disk = ns->disk;
> > +		bio->bi_opf |= REQ_FAILFAST_TRANSPORT;
> > +		ret = generic_make_request_fast(bio);
> > +	} else if (!list_empty_careful(&head->list)) {
> > +		printk_ratelimited("no path available - requeing I/O\n");
> > +
> > +		spin_lock_irq(&head->requeue_lock);
> > +		bio_list_add(&head->requeue_list, bio);
> > +		spin_unlock_irq(&head->requeue_lock);
> > +	} else {
> > +		printk_ratelimited("no path - failing I/O\n");
> > +
> > +		bio->bi_status = BLK_STS_IOERR;
> > +		bio_endio(bio);
> > +	}
> > +
> > +	srcu_read_unlock(&head->srcu, srcu_idx);
> > +	return ret;
> > +}
>
> Hello Christoph,
>
> Since generic_make_request_fast() returns BLK_STS_AGAIN for a dying
> path: can the same kind of soft lockups occur with the NVMe
> multipathing code as with the current upstream device mapper
> multipathing code? See e.g.
> "[PATCH 3/7] dm-mpath: Do not lock up a CPU with requeuing activity"
> (https://www.redhat.com/archives/dm-devel/2017-August/msg00124.html).

I suspect the code is not going to hit it because we check the
controller state before trying to queue I/O on the lower queue.  But if
you point me to a good reproducer test case I'd like to check.

Also does the "single queue" case in your mail refer to the old request
code?  nvme only uses blk-mq so it would not hit that.  But either way
I think get_request should be fixed to return BLK_STS_IOERR if the
queue is dying instead of BLK_STS_AGAIN.

> Another question about this code is what will happen if
> generic_make_request_fast() returns BLK_STS_AGAIN and the submit_bio()
> or generic_make_request() caller ignores the return value of the called
> function? A quick grep revealed that there is plenty of code that
> ignores the return value of these last two functions.

generic_make_request and generic_make_request_fast only return the
polling cookie (blk_qc_t), not a block status.

Note that we do not use blk_get_request / blk_mq_alloc_request for the
allocation of the request on the lower device, so unless the caller
passed REQ_NOWAIT and is able to handle BLK_STS_AGAIN we won't ever
return it.
Re: [BUG] BLKZEROOUT on dm-crypt container cause OOM / kernel panic
Does commit 615d22a51c04856efe62af6e1d5b450aaf5cc2c0 "block: Fix __blkdev_issue_zeroout loop" fix the issue for you?
Re: [PATCH 08/19] block: Introduce request_queue.initialize_rq_fn()
On Fri, May 26, 2017 at 11:56:30PM +, Bart Van Assche wrote:
> I have tried to move that call into blk_mq_alloc_request() but that
> resulted in a kernel oops during boot due to scsi_add_cmd_to_list()
> dereferencing scsi_cmnd.device and due to that pointer being invalid.
> I think that pointer was invalid because moving the initialize_rq_fn()
> call into blk_mq_alloc_request() caused request initialization to be
> skipped for the following code path:
>
> submit_bio()
>   -> generic_make_request()
>     -> .make_request_fn == blk_mq_make_request()
>       -> blk_mq_sched_get_request()
>         -> __blk_mq_alloc_request()
>           -> blk_mq_rq_ctx_init()
>
> This is why I would like to keep the .initialize_rq_fn() call in
> blk_mq_rq_ctx_init().

But we don't call scsi_req_init for this path either with the current
code.  So not having the call should be fine as long as you ensure we
still manually initialize everything for the non-passthrough path in
the later patches.  I'll keep an eye on that issue while reviewing the
remaining patches.
Re: [PATCH 05/19] cdrom: Check private request size before attaching to a queue
On Fri, May 26, 2017 at 03:50:42PM +, Bart Van Assche wrote:
> On Fri, 2017-05-26 at 08:08 +0200, Christoph Hellwig wrote:
> > On Thu, May 25, 2017 at 11:43:13AM -0700, Bart Van Assche wrote:
> > > Since the cdrom driver only supports request queues for which
> > > struct scsi_request is the first member of their private request
> > > data, refuse to register block layer queues for which this is
> > > not the case.
> >
> > Hmm.  I think we have a deeper issue there.  The cdrom layer usually
> > sends packet commands through the cdrom ops ->generic_packet
> > method, but one function (cdrom_read_cdda_bpc) seems to directly
> > send a scsi passthrough request.  It is only used if the cdda_method
> > is not CDDA_OLD, which is set if the cdrom_device_info structure
> > does not have a gendisk pointer hanging off it.  It seems like
> > the legacy cdrom drivers fall into that category and would be
> > broken by this change.
>
> Hello Christoph,
>
> How about moving the check into cdrom_read_cdda_bpc()? As far as I can
> see that function is only called if a CDROMREADAUDIO ioctl is
> submitted. Although Documentation/cdrom/cdrom-standard.tex mentions
> that that ioctl is unsupported, applications like cdparanoia use it.

I guess that's fine for now.  In the long run we can probably replace
the current users of the generic_packet method with direct scsi
passthrough requests, but that can be left for later.
Re: [PATCH 06/19] nfsd: Check private request size before submitting a SCSI request
On Thu, May 25, 2017 at 08:19:47PM +, Bart Van Assche wrote:
> On Thu, 2017-05-25 at 14:48 -0400, J . Bruce Fields wrote:
> > On Thu, May 25, 2017 at 11:43:14AM -0700, Bart Van Assche wrote:
> > > Since using scsi_req() is only allowed against request queues for
> > > which struct scsi_request is the first member of their private
> > > request data, refuse to submit SCSI commands against a queue for
> > > which this is not the case.
> >
> > Is it possible we could catch this earlier and avoid giving out the
> > layout in the first place?
>
> Hello Christoph,
>
> According to what I see in commit 8650b8a05850 you are the author of
> this code. Can the blk_queue_scsi_pdu(q) test fail in
> nfsd4_scsi_identify_device()?

If the user explicitly asked for a scsi layout export of a non-scsi
device it can.

> If so, can nfsd4_layout_verify() be modified in such a way that it
> prevents that nfsd4_scsi_proc_getdeviceinfo() is ever called for a
> non-SCSI queue? Can you recommend an approach?

Not easily.  The only thing we could do is an export time check that
would refuse the scsi layout export if the device is not capable.  I
can look into that, but it will take some time so for now I think we
should go ahead with your series.
Re: [PATCH 02/15] scsi/osd: don't save block errors into req_results
On Wed, May 24, 2017 at 04:04:40PM +, Bart Van Assche wrote:
> Are you sure that that code is not necessary? From osd_initiator.c:
>
> static void _put_request(struct request *rq)
> {
> 	/*
> 	 * If osd_finalize_request() was called but the request was not
> 	 * executed through the block layer, then we must release BIOs.
> 	 * TODO: Keep error code in or->async_error. Need to audit all
> 	 * code paths.
> 	 */
> 	if (unlikely(rq->bio))
> 		blk_end_request(rq, -ENOMEM, blk_rq_bytes(rq));
> 	else
> 		blk_put_request(rq);
> }

Which isn't using it at all.  It has a ten year old comment to pass on
some error, but even then ORing two different error types together
would not be very helpful.
Re: [PATCH 02/18] bsg: Check private request size before attaching to a queue
On Sun, May 21, 2017 at 02:33:05PM +, Bart Van Assche wrote:
> Thanks for the review comments. The cover letter should have made it to
> at least the linux-scsi mailing list since it shows up in at least one
> archive of that mailing list:
> https://www.spinics.net/lists/linux-scsi/msg108940.html.

Yes, I see it on the list now.  But it's missing various Ccs that the
actual patches have, including the one to me, which seems a bit broken.
Re: [PATCH 12/15] block: merge blk_types.h into bio.h
On Thu, May 18, 2017 at 02:25:52PM +, Bart Van Assche wrote:
> On Thu, 2017-05-18 at 15:18 +0200, Christoph Hellwig wrote:
> > We've cleaned up our headers sufficiently that we don't need this
> > split anymore.
>
> Hello Christoph,
>
> Request-based drivers need the structure definitions from
> <linux/blkdev.h> and the type definitions from <linux/blk_types.h> but
> do not need the definition of struct bio. Have you considered to remove
> #include <linux/bio.h> from the file include/linux/blkdev.h? Do you
> think that would help to reduce the kernel build time?

Most block drivers end up needing bio.h anyway.  But I can skip this
part of the series for now.
Re: small dm mpath cleanups
On Wed, Apr 26, 2017 at 06:41:27PM +, Bart Van Assche wrote:
> On Wed, 2017-04-26 at 09:40 +0200, Christoph Hellwig wrote:
> > this series has some prep patches for my work to have proper, type
> > checked block error codes.  One fallout of that is that we need to
> > get rid of how dm overloads a few return values with either internal
> > positive error codes or negative errno values.  This series does
> > that, which happens to clean things up a bit, and also allows dm
> > to propagate the actual error code in one case where it currently
> > is dropped on the floor.
>
> Hello Christoph,
>
> Some patches in this series conflict with patches I would like to end
> up in the stable kernel series. If I would rebase my patch series on
> top of your series then that would make it harder to apply my patches
> on the stable kernel trees. Mike and Christoph, please advise how to
> proceed.

Bugfixes always go before cleanups.  I'd be happy to delay and/or
rebase any of my patches as needed.
Re: [PATCH 02/23] block: remove the blk_execute_rq return value
On Wed, Apr 19, 2017 at 09:07:45PM +, Bart Van Assche wrote:
> > +	blk_execute_rq(or->request->q, NULL, or->request, 0);
> > +	error = or->request ? -EIO : 0;
>
> Hello Christoph,
>
> Did you perhaps intend or->request->errors instead of or->request?

Yes, thanks.
Re: blk-mq: remove the error argument to blk_mq_complete_request
On Tue, Apr 18, 2017 at 10:50:18PM +, Bart Van Assche wrote:
> On Tue, 2017-04-18 at 08:52 -0700, Christoph Hellwig wrote:
> > Now that we always have a ->complete callback we can remove the direct
> > call to blk_mq_end_request, as well as the error argument to
> > blk_mq_complete_request.
>
> Hello Christoph,
>
> Please add a runtime check that issues a warning early if a .complete
> callback function is missing.
>
> Are you sure that all blk-mq drivers have a .complete callback? In the
> request-errors branch of your block git tree I found the following:

I think the problem is that my above changelog was wrong - we only
require the complete_rq method if the driver calls the
blk_mq_complete_request function.  Both rbd and ubiblock don't call it
in the tree.  They probably should, but that's work for a separate
series.
Re: block: add a error_count field to struct request
On Tue, Apr 18, 2017 at 10:57:11PM +, Bart Van Assche wrote:
> Both blk-mq and the traditional block layer support a .cmd_size field
> to make the block layer core allocate driver-specific data at the end
> of struct request. Could that mechanism have been used for the
> error_count field?

It could, and that's what I did for the modern drivers.  It would have
been a bit of a pain for these old floppy drivers, though.
Re: scsi: introduce a new result field in struct scsi_request
On Tue, Apr 18, 2017 at 10:34:20PM +, Bart Van Assche wrote: > Did you perhaps intend "req->result" instead of "rq->result"? Yes. > Did you intend "war" or is that perhaps a typo? I'll fix the comment. > > trace_scsi_dispatch_cmd_done(cmd); > > - blk_mq_complete_request(cmd->request, cmd->request->errors); > > + blk_mq_complete_request(cmd->request, 0); > > } > > Why has cmd->request->errors been changed into 0? Because the argument is only used to set req->errors, which we won't look at any more.
Re: block: remove the blk_execute_rq return value
On Tue, Apr 18, 2017 at 10:24:15PM +, Bart Van Assche wrote: > On Tue, 2017-04-18 at 08:52 -0700, Christoph Hellwig wrote: > > diff --git a/drivers/block/paride/pd.c b/drivers/block/paride/pd.c > > index b05e151c9b38..82c6d02193ae 100644 > > --- a/drivers/block/paride/pd.c > > +++ b/drivers/block/paride/pd.c > > @@ -747,7 +747,8 @@ static int pd_special_command(struct pd_unit *disk, > > > > rq->special = func; > > > > - err = blk_execute_rq(disk->gd->queue, disk->gd, rq, 0); > > + blk_execute_rq(disk->gd->queue, disk->gd, rq, 0); > > + err = req->errors ? -EIO : 0; > > > > blk_put_request(rq); > > return err; > > Hello Christoph, > > Sorry that I hadn't noticed this before but shouldn't "req" be changed into > "rq"? Otherwise this patch looks fine to me. If this comment gets addressed > you can add: It should. But it seems no one caught this because the check gets removed later in the series by "pd: remove bogus check for req->errors", so I should just move that patch earlier in the series.
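The call-site pattern being discussed boils down to two lines: call blk_execute_rq() for its side effects, then derive an errno from the request's error count. A minimal userspace sketch of just that pattern (struct request_stub is a made-up stand-in for struct request, carrying only the legacy errors field):

```c
#include <assert.h>
#include <errno.h>

/* Minimal stand-in for struct request; only the field this pattern uses. */
struct request_stub {
	int errors;	/* legacy per-request error count, nonzero on failure */
};

/*
 * Once blk_execute_rq() loses its return value, callers derive an errno
 * from the request itself -- and the variable really is "rq", not "req",
 * which is the typo discussed above.
 */
static int pd_style_result(struct request_stub *rq)
{
	return rq->errors ? -EIO : 0;
}
```

The same two-line idiom appears in several callers touched by the series, which is why a typo in one of them was easy to miss.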
Re: [PATCH 02/25] block: remove the blk_execute_rq return value
On Mon, Apr 17, 2017 at 10:01:09AM -0600, Jens Axboe wrote: > Are you respinning this series for 4.12? Yes, I think I got enough feedback by now to resend it. I'll try to get it out today.
Re: [PATCH 02/25] block: remove the blk_execute_rq return value
On Thu, Apr 13, 2017 at 08:03:22PM +, Bart Van Assche wrote: > That blk_execute_rq() call can only be reached if a few lines above 0 was > assigned to the "error" variable. Since nfsd4_scsi_identify_device() returns > the value of the "error" variable I think -EIO should be assigned to that > variable before the "goto out_put_request" statement is reached. You're right! I'll fix it up.
Re: [PATCH 12/23] sd: handle REQ_UNMAP
On Thu, Mar 30, 2017 at 10:19:55PM -0400, Martin K. Petersen wrote: > > If you manually change the provisioning mode to WS10 on a device that > > must use WRITE SAME (16) to be able to address all blocks you're already > > screwed right now, and with this patch you can screw yourself through > > the WRITE_ZEROES path in addition to the DISCARD path. > > Oh, I see. We only had the LBA sanity check in place for write same, not > for discard. And btw, I'd be happy to add such a check, I'd just rather keep it out of this patch. It might be a good idea if you give it a turn after this series given that you have all the devices that have weird provisioning modes while I don't..
Re: [PATCH 23/23] block: remove the discard_zeroes_data flag
On Thu, Mar 30, 2017 at 11:29:29AM -0400, Martin K. Petersen wrote: > "h...@lst.de" <h...@lst.de> writes: > > > Jens, any opinion? I'd like to remove it too, but I fear it might > > break things. We could deprecate it first with a warning when read > > and then remove it a few releases down the road. > > I know of several apps that check this variable (as opposed to the > ioctl). The above was in reference to both methods..
Re: [PATCH 23/23] block: remove the discard_zeroes_data flag
On Tue, Mar 28, 2017 at 05:00:48PM +, Bart Van Assche wrote: > > It seems to me like the documentation in Documentation/ABI/testing/sysfs-block > and the above code are not in sync. I think the above code will cause reading > from the discard_zeroes_data attribute to return an empty string ("") instead > of "0\n". Thanks, fine with me. > > BTW, my personal preference is to remove this attribute entirely because > keeping > it will cause confusion, no matter how well we document the behavior of this > attribute. Jens, any opinion? I'd like to remove it too, but I fear it might break things. We could deprecate it first with a warning when read and then remove it a few releases down the road.
Re: [PATCH 11/23] block_dev: use blkdev_issue_zerout for hole punches
On Tue, Mar 28, 2017 at 04:50:47PM +, Bart Van Assche wrote: > On Thu, 2017-03-23 at 10:33 -0400, Christoph Hellwig wrote: > > This gets us support for non-discard efficient write of zeroes (e.g. NVMe) > > and preparse for removing the discard_zeroes_data flag. > > Hello Christoph, > > "preparse" probably should have been "prepare"? Yes, fixed.
Re: [PATCH 12/23] sd: handle REQ_UNMAP
On Tue, Mar 28, 2017 at 04:48:55PM +, Bart Van Assche wrote: > > if (sdp->no_write_same) > > return BLKPREP_INVALID; > > if (sdkp->ws16 || sector > 0x || nr_sectors > 0x) > > Users can change the provisioning mode from user space from SD_LBP_WS16 into > SD_LBP_WS10 so I'm not sure it's safe to skip the (sdkp->ws16 || sector > > 0x || nr_sectors > 0x) check if REQ_UNMAP is set. They can, and if the device has too many sectors that will already cause discard to fail, and in this case it will cause write zeroes to fail as well. The intent behind this patch is to keep the behavior the same as the old path that uses discards for zeroing. The logic looks a bit clumsy, but I'd rather keep it as-is.
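For background on why that addressability check exists at all: WRITE SAME(10) carries a 32-bit LBA and a 16-bit number-of-blocks field, so it simply cannot describe every request on a large device; such requests need WRITE SAME(16). An illustrative userspace sketch of the limit check (the constant names are invented for this sketch; the kernel open-codes the limits in sd.c):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Field widths from the WRITE SAME(10) CDB: 32-bit LBA, 16-bit count. */
#define WS10_MAX_LBA	0xffffffffULL
#define WS10_MAX_BLOCKS	0xffffULL

/*
 * Returns true if a zeroing request at @lba for @nr_blocks blocks can
 * be expressed as WRITE SAME(10); otherwise WRITE SAME(16) is needed.
 */
static bool fits_write_same_10(sector_t lba, sector_t nr_blocks)
{
	return lba <= WS10_MAX_LBA && nr_blocks <= WS10_MAX_BLOCKS;
}
```

This is what makes forcing the provisioning mode to WS10 on a large device dangerous: requests beyond these limits cannot be expressed correctly, with or without the patch under discussion.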
Re: [PATCH 01/23] block: renumber REQ_OP_WRITE_ZEROES
On Tue, Mar 28, 2017 at 04:12:46PM +, Bart Van Assche wrote: > Since REQ_OP_WRITE_ZEROES was introduced in kernel v4.10, do we need > "Cc: stable" and "Fixes: a6f0788ec2881" tags for this patch? No. This just works around the way scsi_setup_cmnd sets up the data direction. Before this series it's not an issue because no one used the req_op data direction for setting up the dma direction.
Re: [PATCH 1/7] sd: split sd_setup_discard_cmnd
On Tue, Mar 28, 2017 at 08:05:09AM -0600, ax...@kernel.dk wrote: > > Although I know this is an issue in the existing code and not something > > introduced by you: please consider using logical_to_sectors() instead of > > open-coding this function. Otherwise this patch looks fine to me. > > The downside of doing that is that we still call ilog2() twice, which > sucks. Would be faster to cache ilog2(sector_size) and use that in the > shift calculation. I suspect that gcc is smart enough to optimize it away. That being said, while this looks like a nice cleanup, this patch is just supposed to move code, so I'd rather not add the change here and leave it for a separate submission.
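For reference, the conversion under discussion maps logical blocks to 512-byte sectors by shifting with ilog2(sector_size) - 9, and the suggested micro-optimization is to compute that shift once rather than per call. A standalone sketch (ilog2_u32 is a stand-in for the kernel's ilog2(); the function name mirrors, but is not, the kernel's logical_to_sectors()):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Integer log2 for power-of-two values; stand-in for the kernel's ilog2(). */
static unsigned int ilog2_u32(uint32_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/*
 * Convert a count of logical blocks into 512-byte sectors. Caching
 * "ilog2_u32(sector_size) - 9" per device, as suggested above, would
 * avoid recomputing the log for every request.
 */
static sector_t logical_to_sectors(uint32_t sector_size, sector_t blocks)
{
	unsigned int shift = ilog2_u32(sector_size) - 9; /* 9 == ilog2(512) */

	return blocks << shift;
}
```

For a 4096-byte logical block size the shift is 3, so one block maps to eight 512-byte sectors.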
Re: unify and streamline the blk-mq make_request implementations V2
On Tue, Mar 21, 2017 at 12:40:17PM +, Bart Van Assche wrote: > On Mon, 2017-03-20 at 16:39 -0400, Christoph Hellwig wrote: > > Changes since V1: > > - rebase on top of the recent blk_mq_try_issue_directly changes > > - incorporate comments from Bart > > Hi Christoph, > > It seems to me like none of the three comments I had posted on patch 4/4 > have been addressed. Please have another look at these comments. I had a look but resent the same old series again due to a rebase bug. I'll resend it ASAP with what should address your comments to the previous series as well as this one.
Re: [PATCH 2/4] blk-mq: merge mq and sq make_request instances
On Mon, Mar 13, 2017 at 09:01:08PM +, Bart Van Assche wrote: > > + /* > > +* @request_count may become stale because of schedule > > +* out, so check the list again. > > +*/ > > The above comment was relevant as long as there was a request_count assignment > above blk_mq_sched_get_request(). This patch moves that assignment inside if > (plug && q->nr_hw_queues == 1). Does that mean that the above comment should > be > removed entirely? I don't think so - for the !blk_queue_nomerges cases we still rely on blk_attempt_plug_merge calculating request_count, so the comment still applies.
Re: [PATCH 1/4] blk-mq: remove BLK_MQ_F_DEFER_ISSUE
On Mon, Mar 13, 2017 at 08:52:54PM +, Bart Van Assche wrote: > > - if (((plug && !blk_queue_nomerges(q)) || is_sync) && > > - !(data.hctx->flags & BLK_MQ_F_DEFER_ISSUE)) { > > + if (((plug && !blk_queue_nomerges(q)) || is_sync)) { > > A minor comment: due to this change the outer pair of parentheses > became superfluous. Please consider removing these. The last patch in the series removes the statement in this form. But if I have to respin the series for some reason I'll make sure it gets removed here already.
Re: [GIT PULL] Block pull request for- 4.11-rc1
On Fri, Feb 24, 2017 at 01:22:42PM -0700, Jens Axboe wrote: > Bart, I pushed a fix here: > > http://git.kernel.dk/cgit/linux-block/commit/?h=for-linus=61febef40bfe8ab68259d8545257686e8a0d91d1 Yeah, this looks fine to me. It was broken on blk-mq before, but basically impossible to hit. I wonder if we should have a debug mode where we set requests to a known pattern after they are freed to catch these sort of bugs.
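The debug mode floated above would work much like slab poisoning: on free, overwrite the request with a recognizable byte pattern so that any later dereference reads obvious garbage. A hedged userspace sketch (fake_request and the helper names are invented; the 0x6b value echoes SLUB's POISON_FREE but this is not an existing block-layer facility):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical poison byte, in the spirit of slab's POISON_FREE (0x6b). */
#define RQ_POISON 0x6b

struct fake_request {
	uint64_t tag;
	void *private_data;
};

/*
 * On "free", clobber the whole object with the poison pattern so a
 * use-after-free reads 0x6b6b... instead of plausible stale state,
 * making the bug stand out in a crash dump.
 */
static void poison_request(struct fake_request *rq)
{
	memset(rq, RQ_POISON, sizeof(*rq));
}

static int looks_poisoned(const struct fake_request *rq)
{
	const unsigned char *p = (const unsigned char *)rq;

	for (size_t i = 0; i < sizeof(*rq); i++)
		if (p[i] != RQ_POISON)
			return 0;
	return 1;
}
```

The appeal is exactly the scenario in this thread: a completion racing with free would trip over the pattern immediately instead of being "basically impossible to hit".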
Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")
> I tested today's linux-next (next-20170214) + the 2 patches just now and got a weird result: > sometimes the VM stills hung with a new calltrace (BUG: spinlock bad > magic), but sometimes the VM did boot up despite the new calltrace! > > Attached is the log of a "good" boot. > > It looks we have a memory corruption issue somewhere... Yes. > Actually previously I saw the "BUG: spinlock bad magic" message once, but I > couldn't repro it later, so I didn't mention it to you. Interesting. > > The good news is that now I can repro the "spinlock bad magic" message > every time. > I tried to dig into this by enabling Kernel hacking -> Memory debugging, > but didn't find anything abnormal. > Is it possible that the SCSI layer passes a wrong memory address? It's possible, but this looks like it might be a different issue. A few questions on the dmesg:

[6.208794] sd 2:0:0:0: [storvsc] Sense Key : Illegal Request [current]
[6.209447] sd 2:0:0:0: [storvsc] Add. Sense: Invalid command operation code
[6.210043] sd 3:0:0:0: [storvsc] Sense Key : Illegal Request [current]
[6.210618] sd 3:0:0:0: [storvsc] Add. Sense: Invalid command operation code
[6.212272] sd 2:0:0:0: [storvsc] Sense Key : Illegal Request [current]
[6.212897] sd 2:0:0:0: [storvsc] Add. Sense: Invalid command operation code
[6.213474] sd 3:0:0:0: [storvsc] Sense Key : Illegal Request [current]
[6.214051] sd 3:0:0:0: [storvsc] Add. Sense: Invalid command operation code

I didn't see anything like this in the other logs. Are these messages something usual on HyperV VMs?
[6.358405] XFS (sdb1): Mounting V5 Filesystem
[6.404478] XFS (sdb1): Ending clean mount
[7.535174] BUG: spinlock bad magic on CPU#0, swapper/0/0
[7.536807] lock: host_ts+0x30/0xe1a0 [hv_utils], .magic: , .owner: /-1, .owner_cpu: 0
[7.538436] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170214+ #1
[7.539142] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 04/28/2016
[7.539142] Call Trace:
[7.539142]
[7.539142]  dump_stack+0x63/0x82
[7.539142]  spin_dump+0x78/0xc0
[7.539142]  do_raw_spin_lock+0xfd/0x160
[7.539142]  _raw_spin_lock_irqsave+0x4c/0x60
[7.539142]  ? timesync_onchannelcallback+0x153/0x220 [hv_utils]
[7.539142]  timesync_onchannelcallback+0x153/0x220 [hv_utils]

Can you resolve this address using gdb to a line of code? Once inside gdb do:

l *(timesync_onchannelcallback+0x153)
Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")
On Tue, Feb 14, 2017 at 02:46:41PM +, Dexuan Cui wrote: > > From: h...@lst.de [mailto:h...@lst.de] > > Sent: Tuesday, February 14, 2017 22:29 > > To: Dexuan Cui <de...@microsoft.com> > > Subject: Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock > > when scheduling workqueue elements") > > > > Ok, thanks for testing. Can you try the patch below? It fixes a > > clear problem which was partially papered over before the commit > > you bisected to, although it can't explain why blk-mq still works. > > Still bad luck. :-( > > BTW, I'm using the first "bad" commit (scsi: allocate scsi_cmnd structures as > part of struct request) + the 2 patches you provided today. > > I suppose I don't need to test the 2 patches on the latest linux-next repo. I'd love a test on that repo actually. We had a few other fixes for sense handling since then I think.
Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")
Ok, thanks for testing. Can you try the patch below? It fixes a clear problem which was partially papered over before the commit you bisected to, although it can't explain why blk-mq still works.

From e4a66856fa2d92c0298000de658365f31bea60cd Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <h...@lst.de>
Date: Tue, 14 Feb 2017 15:08:39 +0100
Subject: scsi: always zero sshdr in scsi_normalize_sense

This gives us a clear state even if a command didn't return sense data.

Signed-off-by: Christoph Hellwig <h...@lst.de>
---
 drivers/scsi/scsi_common.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_common.c b/drivers/scsi/scsi_common.c
index b1383a71400e..a75673bb82b3 100644
--- a/drivers/scsi/scsi_common.c
+++ b/drivers/scsi/scsi_common.c
@@ -137,11 +137,11 @@ EXPORT_SYMBOL(int_to_scsilun);
 bool scsi_normalize_sense(const u8 *sense_buffer, int sb_len,
 			  struct scsi_sense_hdr *sshdr)
 {
+	memset(sshdr, 0, sizeof(struct scsi_sense_hdr));
+
 	if (!sense_buffer || !sb_len)
 		return false;
 
-	memset(sshdr, 0, sizeof(struct scsi_sense_hdr));
-
 	sshdr->response_code = (sense_buffer[0] & 0x7f);
 
 	if (!scsi_sense_valid(sshdr))
-- 
2.11.0
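In miniature, the ordering the patch changes looks like this: zero the header before the early return, so a caller whose command produced no sense data sees a defined all-zero header instead of stale stack contents. (struct sense_hdr below is a trimmed, hypothetical stand-in for struct scsi_sense_hdr, and the scsi_sense_valid() check is omitted for brevity.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Trimmed-down stand-in for struct scsi_sense_hdr. */
struct sense_hdr {
	uint8_t response_code;
	uint8_t sense_key;
	uint8_t asc;
	uint8_t ascq;
};

/*
 * Mirrors the patched ordering: memset first, *then* bail out on
 * missing sense data. With the old order, the early return left the
 * caller's header uninitialized.
 */
static bool normalize_sense(const uint8_t *buf, int len, struct sense_hdr *h)
{
	memset(h, 0, sizeof(*h));

	if (!buf || !len)
		return false;

	h->response_code = buf[0] & 0x7f;
	return true;
}
```

Callers that pass a stack-allocated header and only check the return value loosely are exactly the ones the old ordering could bite.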
Re: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")
On Wed, Feb 08, 2017 at 10:43:59AM -0700, Jens Axboe wrote: > I've changed the subject line, this issue has nothing to do with the > issue that Hannes was attempting to fix. Nothing really useful in the thread. Dexuan, can you throw in some prints to see which command times out?
Re: [PATCH 15/18] scsi: allocate scsi_cmnd structures as part of struct request
On Fri, Jan 27, 2017 at 06:39:46PM +, Bart Van Assche wrote: > Why have the scsi_release_buffers() and scsi_put_command(cmd) calls been > moved up? I haven't found an explanation for this change in the patch > description. Because they reference the scsi_cmnd, which are now part of the request and thus freed by blk_finish_request. And yes, I should have mentioned it in the changelog, sorry. > Please also consider to remove the cmd->request->special = NULL assignments > via this patch. Since this patch makes the lifetime of struct scsi_cmnd and > struct request identical these assignments are no longer needed. True. If I had to resend again I would have fixed it up, but it's probably not worth the churn now. > This patch introduces the function scsi_exit_rq(). Having two functions > for the single-queue path that release resources (scsi_release_buffers() > and scsi_exit_rq()) is confusing. Since every scsi_release_buffers() call > is followed by a blk_unprep_request() call, have you considered to move > the scsi_release_buffers() call into scsi_unprep_fn() via an additional > patch? We could have done that. But it's just more change for a code path that I hope won't survive this calendar year. -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: split scsi passthrough fields out of struct request V2
On Fri, Jan 27, 2017 at 09:27:53PM +, Bart Van Assche wrote: > Have you considered to convert all block drivers to the new > approach and to get rid of request.special? If so, do you already > have plans to start working on this? I'm namely wondering whether I > should start working on this myself. Hi Bart, I'd love to have all drivers move off using .special (and thus reducing request size further). I think the general way to do that is to convert them to blk-mq and not using the legacy cmd_size field.
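The cmd_size mechanism that replaces .special can be pictured as a single allocation holding the request plus a driver-sized payload directly behind it, retrieved blk_mq_rq_to_pdu()-style by pointer arithmetic. A toy userspace sketch (all names here are invented for illustration; the real layout is managed by the blk-mq core):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for struct request. */
struct toy_request {
	uint64_t tag;
};

/*
 * The block core sizes one allocation to hold the request plus the
 * driver's per-command payload ("pdu"), as requested via cmd_size.
 */
static struct toy_request *alloc_request(size_t cmd_size)
{
	return calloc(1, sizeof(struct toy_request) + cmd_size);
}

/* blk_mq_rq_to_pdu()-style accessor: the payload sits right behind
 * the request structure, so no .special pointer is needed. */
static void *rq_to_pdu(struct toy_request *rq)
{
	return rq + 1;
}
```

Because request and payload share one lifetime and one allocation, a driver gets its per-command data for free on every request, which is exactly what made the .special pointer redundant.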
Re: split scsi passthrough fields out of struct request V3
On Fri, Jan 27, 2017 at 06:58:53PM +, Bart Van Assche wrote: > Version 3 of the patch with title "block: split scsi_request out of > struct request" (commit 3c30af6ebe12) differs significantly from v2 > of that patch that has been posted on several mailing lists. E.g. v2 > moves __cmd[], cmd and cmd_len from struct request into struct > scsi_request but v3 not. Which version do you want us to review? Hi Bart, I tried to resend the whole updated v3 series, but the mail server stopped accepting mails due to overload. Otherwise it would have included all the patches. Jens instead took the updated version straight from this git branch: http://git.infradead.org/users/hch/block.git/shortlog/refs/heads/block-pc-refactor
Re: [PATCH 14/18] scsi: remove __scsi_alloc_queue
On Fri, Jan 27, 2017 at 05:58:02PM +, Bart Van Assche wrote: > Since __scsi_init_queue() modifies data in the Scsi_Host structure, have you > considered to add the declaration for this function to ? > If you want to keep this declaration in please add a > direct include of that header file to drivers/scsi/scsi_lib.c such that the > declaration remains visible to the compiler if someone would minimize the > number of #include directives in SCSI header files. Feel free to send an incremental patch either way. In the long run I'd really like to kill off __scsi_init_queue and remove the transport BSG queue abuse of SCSI internals, though.
Re: split scsi passthrough fields out of struct request V2
On Thu, Jan 26, 2017 at 11:57:36AM -0700, Jens Axboe wrote: > It's against my for-4.11/block, which you were running under Christoph's > patches. Maybe he's using an older version? In any case, should be > pretty trivial for you to hand apply. Just ensure that .flags is set to > 0 for the common cases, and inherit 'flags' when it is passed in. No, the flush op cleanups you asked for last round create a conflict with your patch. They should be trivial to fix, though. -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html