On Tue, Apr 04, 2017 at 03:56:34PM +, Bart Van Assche wrote:
> > This looks like generic block layer code, why is it in SCSI?
>
> Hello Christoph,
>
> That's an excellent question. I assume that you are fine with moving
> this code to the block layer?
Yes. In fact I wonder if we need the
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> SCSI drivers do care about bip_seed so we must update it accordingly.
>
> Signed-off-by: Dmitry Monakhov
> ---
> block/bio-integrity.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/block/bio-integrity.c
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> Currently if someone tries to advance a bvec beyond its size we simply
> dump WARN_ONCE and continue to iterate beyond bvec array boundaries.
> This simply means that we end up dereferencing/corrupting a random memory
> region.
>
> Sane reaction would
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> Currently ->verify_fn does not work at all because at the moment it is called,
> bio->bi_iter.bi_size == 0, so we do not iterate integrity bvecs at all.
>
> In order to perform verification we need to know original data vector,
> with new bvec rewind API
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> Some ->bi_end_io handlers (for example: pi_verify or decrypt handlers)
> need to know the original data vector, but after a bio traverses the io-stack
> it may be advanced, split and relocated many times so it is hard to guess the
> original iterator. Let's add
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> Signed-off-by: Dmitry Monakhov
> ---
> block/t10-pi.c | 9 +++--
> drivers/scsi/lpfc/lpfc_scsi.c| 5 +++--
> drivers/scsi/qla2xxx/qla_isr.c | 8
> drivers/target/target_core_sbc.c | 2 +-
>
On 04/04/2017 08:56 PM, Dmitry Monakhov wrote:
> Currently all integrity prep hooks are open-coded, and if prepare fails
> we ignore its code and fail the bio with EIO. Let's return the real error to
> the upper layer, so the caller may later react accordingly.
>
> In fact no one wants to use
On Wed 05-04-17 14:33:50, NeilBrown wrote:
>
> When a filesystem is mounted from a loop device, writes are
> throttled by balance_dirty_pages() twice: once when writing
> to the filesystem and once when the loop_handle_cmd() writes
> to the backing file. This double-throttling can trigger
>
On Wed 05-04-17 09:19:27, Michal Hocko wrote:
> On Wed 05-04-17 14:33:50, NeilBrown wrote:
[...]
> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > index 0ecb6461ed81..44b3506fd086 100644
> > --- a/drivers/block/loop.c
> > +++ b/drivers/block/loop.c
> > @@ -852,6 +852,7 @@ static int
Hi,
this series aims to unify the setting and clearing of PF_MEMALLOC, which
prevents recursive reclaim. There are some places that clear the flag
unconditionally from current->flags, which may result in clearing a
pre-existing flag. This already resulted in a bug report that Patch 1 fixes
We now have memalloc_noreclaim_{save,restore} helpers for robust setting and
clearing of PF_MEMALLOC. Let's convert the code which was using the generic
tsk_restore_flags(). No functional change.
Signed-off-by: Vlastimil Babka
Cc: Josef Bacik
Cc: Lee Duncan
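The nesting problem described above can be illustrated with a minimal
user-space sketch. This is only a model: struct task, noreclaim_save(),
noreclaim_restore() and the flag value are hypothetical stand-ins for
current->flags and the memalloc_noreclaim_{save,restore} helpers, not the
kernel code itself.

```c
#include <assert.h>

/* Hypothetical model of a task flag word; the bit value is illustrative. */
#define PF_MEMALLOC 0x0800u

struct task {
	unsigned int flags;
};

/* Set the flag and return its previous state so the caller can restore it. */
static unsigned int noreclaim_save(struct task *t)
{
	unsigned int old = t->flags & PF_MEMALLOC;

	t->flags |= PF_MEMALLOC;
	return old;
}

/* Put the flag back to the saved state instead of clearing it blindly. */
static void noreclaim_restore(struct task *t, unsigned int old)
{
	t->flags = (t->flags & ~PF_MEMALLOC) | old;
}

/*
 * Run an inner section nested inside an outer one and report whether the
 * flag is still set after the inner section has finished.
 */
static unsigned int nested_demo(struct task *t)
{
	unsigned int outer = noreclaim_save(t);
	unsigned int inner = noreclaim_save(t);
	unsigned int after_inner;

	noreclaim_restore(t, inner);
	after_inner = t->flags & PF_MEMALLOC;
	noreclaim_restore(t, outer);
	return after_inner;
}
```

With an unconditional clear in place of noreclaim_restore(), the inner
section would drop the flag that the outer section still relies on.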
On Tue 04-04-17 14:09:51, Thiago Jung Bauermann wrote:
> Hello,
>
> Am Donnerstag, 23. März 2017, 01:36:52 BRT schrieb Jan Kara:
> > this is a series with the remaining patches (on top of 4.11-rc2) to fix
> > several different races and issues I've found when testing device shutdown
> > and
The previous patch has shown that simply setting and clearing PF_MEMALLOC in
current->flags can result in wrongly clearing a pre-existing PF_MEMALLOC flag
and potentially lead to recursive reclaim. Let's introduce helpers that support
proper nesting by saving the previous state of the flag, similar
Nandsim has its own functions set_memalloc() and clear_memalloc() for robust
setting and clearing of PF_MEMALLOC. Replace them by the new generic helpers.
No functional change.
Signed-off-by: Vlastimil Babka
Cc: Boris Brezillon
Cc: Richard
Instead of bloating the generic struct request with it.
Signed-off-by: Christoph Hellwig
---
block/scsi_ioctl.c | 8
drivers/scsi/osd/osd_initiator.c | 2 +-
drivers/scsi/osst.c| 2 +-
drivers/scsi/scsi_error.c | 2 +-
On 04/02/2017 07:41 AM, Sagi Grimberg wrote:
> Like pci and virtio, we add a rdma helper for affinity
> spreading. This achieves optimal mq affinity assignments
> according to the underlying rdma device affinity maps.
Reviewed-by: Jens Axboe
--
Jens Axboe
Don't pass the status explicitly but derive it from the request,
and unwind the complex condition to be more readable.
Signed-off-by: Christoph Hellwig
---
drivers/nvme/host/core.c | 16 +++-
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git
This series fixes a few loose bits in terms of how nvme uses ->retries,
including fixing it for non-PCIe transports. While at it I noticed that
nvme and scsi use the field in entirely different ways, and no other
driver uses it at all. So I decided to move it into the nvme_request and
This way we get the behavior right for the non-PCIe transports.
Signed-off-by: Christoph Hellwig
---
drivers/nvme/host/core.c | 5 +
drivers/nvme/host/pci.c | 4
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/nvme/host/core.c
This avoids fallbacks to explicit zeroing in (__)blkdev_issue_zeroout if
the caller doesn't want them.
Also clean up the convoluted check for the return condition that this
new flag is added to.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
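A minimal user-space sketch of the fallback policy described above. It
assumes a hypothetical in-memory device (struct fake_dev) and a hypothetical
flag name ZERO_NOFALLBACK; the choice of -EOPNOTSUPP as the error is also an
assumption of this sketch, not a statement about the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag: the caller does not want explicit zero writes. */
#define ZERO_NOFALLBACK 0x1u

/* Hypothetical in-memory device used only for illustration. */
struct fake_dev {
	bool has_write_zeroes;	/* device offers a zeroing offload */
	unsigned char *data;
	size_t size;
	int offloaded;		/* how many requests used the offload */
};

static int issue_zeroout(struct fake_dev *d, size_t off, size_t len,
			 unsigned int flags)
{
	if (off > d->size || len > d->size - off)
		return -EINVAL;

	if (d->has_write_zeroes) {
		/* Model the offload: the device zeroes the range itself. */
		memset(d->data + off, 0, len);
		d->offloaded++;
		return 0;
	}

	/* No offload available: only fall back if the caller allows it. */
	if (flags & ZERO_NOFALLBACK)
		return -EOPNOTSUPP;

	/* Fallback: explicitly write zeroes. */
	memset(d->data + off, 0, len);
	return 0;
}
```

A caller that only wants the cheap path passes ZERO_NOFALLBACK and handles
the error itself; everyone else passes 0 and always gets zeroed blocks.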
This gets us support for non-discard efficient write of zeroes (e.g. NVMe)
and prepares for removing the discard_zeroes_data flag.
Also remove a pointless discard support check, which is done in
blkdev_issue_discard already.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K.
But not for the real NVMe Write Zeroes yet, just to get rid of the
discard abuse for zeroing. Also rename the quirk flag to be a bit
more self-explanatory.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
On Wed, 2017-04-05 at 08:07 -0600, Jens Axboe wrote:
> On Mon, Apr 03 2017, Bart Van Assche wrote:
> > The block layer core sets blk_mq_queue_data.list but no block
> > drivers read that member. Hence remove it and also the code that
> > is used to set this member.
>
> Looks fine to me, might as
This driver is for pre-IDE hard disks that are only found in PCs from the
stone age of personal computing, and which we don't support elsewhere
in the kernel these days.
It's also been marked broken forever.
Signed-off-by: Christoph Hellwig
---
drivers/block/Kconfig | 12 -
On Mon, Apr 03 2017, Bart Van Assche wrote:
> The block layer core sets blk_mq_queue_data.list but no block
> drivers read that member. Hence remove it and also the code that
> is used to set this member.
Looks fine to me, might as well kill it. Your patch came through mangled
here though, on
On 04/04/2017 06:31 AM, Jan Kara wrote:
> Writeback throttling does not play well with CFQ since that also tries
> to throttle async writes. As a result async writeback can get starved in
> presence of readers. As an example take a benchmark simulating
> postgreSQL database running over a standard
Copy and paste the REQ_OP_WRITE_SAME code to prepare for implementations
that limit the write zeroes size.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
block/blk-merge.c | 17
Fix up do_region to not allocate a bio_vec for discards. We've
got rid of the discard payload allocated by the caller years ago.
Obviously this wasn't actually harmful given how long it's been
there, but it's still good to avoid the pointless allocation.
Signed-off-by: Christoph Hellwig
Copy & paste from the REQ_OP_WRITE_SAME code.
Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
---
drivers/md/dm-core.h | 1 +
drivers/md/dm-io.c| 8 ++--
drivers/md/dm-linear.c| 1 +
drivers/md/dm-mpath.c |
Make life easy for implementations that need to send a data buffer
to the device (e.g. SCSI) by numbering it as a data out command.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
Turn the existing discard flag into a new BLKDEV_ZERO_UNMAP flag with
similar semantics, but without referring to discard.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
block/blk-lib.c
Split sd_setup_discard_cmnd into one function per provisioning type. While
this creates some very slight duplication of boilerplate code it keeps the
code modular for additions of new provisioning types, and for reusing the
write same functions for the upcoming scsi implementation of the Write
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
drivers/scsi/sd.c | 31 ++-
drivers/scsi/sd_zbc.c | 1 +
2 files changed, 27 insertions(+), 5 deletions(-)
It seems like the code currently passes whatever it was using for writes
to WRITE SAME. Just switch it to WRITE ZEROES, although that doesn't
need any payload.
Untested, and confused by the code, maybe someone who understands it
better than me can help..
Signed-off-by: Christoph Hellwig
This series makes REQ_OP_WRITE_ZEROES the only zeroing offload
supported by the block layer, and switches existing implementations
of REQ_OP_DISCARD that correctly set discard_zeroes_data to it,
removes incorrect discard_zeroes_data, and also switches WRITE SAME
based zeroing in SCSI to this new
On Wed, Apr 05, 2017 at 04:18:56PM +0200, Christoph Hellwig wrote:
> Instead of bloating the generic struct request with it.
>
> Signed-off-by: Christoph Hellwig
> ---
Reviewed-by: Johannes Thumshirn
--
Johannes Thumshirn
On Wed, Apr 05, 2017 at 04:18:55PM +0200, Christoph Hellwig wrote:
> The way NVMe uses this field is entirely different from the older
> SCSI/BLOCK_PC usage, so move it into struct nvme_request.
>
> Also reduce the size of the field to an unsigned char so that we leave space
> for additional
It's identical to discard as hole punches will always leave us with
zeroes on reads.
Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
---
drivers/block/loop.c | 4
1 file changed, 4 insertions(+)
diff --git a/drivers/block/loop.c
Try to use a write same with unmap bit variant if the device supports it
and the caller allows for it.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
drivers/scsi/sd.c | 9 +
1
Just the same as discard if the block size equals the system page size.
Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
---
drivers/block/zram/zram_drv.c | 13 -
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git
If this flag is set, logical provisioning capable devices should
release space for the zeroed blocks if possible; if it is not set,
devices should keep the blocks anchored.
Also remove an out of sync kerneldoc comment for a static function
that would have become even more out of date with this
We'll always use the WRITE ZEROES code for zeroing now.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
block/blk-lib.c | 4
1 file changed, 4 deletions(-)
diff --git
On Wed, Apr 05, 2017 at 04:18:52PM +0200, Christoph Hellwig wrote:
> This way we get the behavior right for the non-PCIe transports.
Could you please share a bit of your mind's inner workings for us mere mortals?
Thanks,
Johannes
--
Johannes Thumshirn
On Wed, Apr 05, 2017 at 04:43:34PM +0200, Johannes Thumshirn wrote:
> On Wed, Apr 05, 2017 at 04:18:52PM +0200, Christoph Hellwig wrote:
> > This way we get the behavior right for the non-PCIe transports.
>
> Could you please share a bit of your mind's inner workings for us mere mortals?
It's
On Wed, Apr 05, 2017 at 11:14:30AM -0400, Keith Busch wrote:
> On Wed, Apr 05, 2017 at 04:18:55PM +0200, Christoph Hellwig wrote:
> > The way NVMe uses this field is entirely different from the older
> > SCSI/BLOCK_PC usage, so move it into struct nvme_request.
> >
> > Also reduce the size of the
From: Long Li
Under heavy I/O, one hardware queue may be unable to dispatch any I/O to the
device layer. This poses a problem with restarting this hardware queue on I/O
finish in blk_mq_sched_restart_queues(), because there is nothing pending that
will finish in future on
> -----Original Message-----
> From: Bart Van Assche [mailto:bart.vanass...@sandisk.com]
> Sent: Wednesday, April 5, 2017 5:32 PM
> To: linux-ker...@vger.kernel.org; linux-block@vger.kernel.org; Long Li
> ; ax...@kernel.dk
> Cc: Stephen Hemminger ;
> -----Original Message-----
> From: KY Srinivasan
> Sent: Wednesday, April 5, 2017 9:21 PM
> To: Bart Van Assche ; linux-
> ker...@vger.kernel.org; linux-block@vger.kernel.org; Long Li
> ; ax...@kernel.dk
> Cc: Stephen Hemminger
On Wed, Apr 05 2017, Michal Hocko wrote:
> On Wed 05-04-17 09:19:27, Michal Hocko wrote:
>> On Wed 05-04-17 14:33:50, NeilBrown wrote:
> [...]
>> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
>> > index 0ecb6461ed81..44b3506fd086 100644
>> > --- a/drivers/block/loop.c
>> > +++
On Thu, 2017-04-06 at 03:38 +, Long Li wrote:
> > -----Original Message-----
> > From: Bart Van Assche [mailto:bart.vanass...@sandisk.com]
> >
> > Please drop this patch. I'm working on a better solution.
>
> Thank you. Looking forward to your patch.
Hello Long,
It would help if you could
On Wed, Apr 05, 2017 at 12:01:29PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval
>
> While dispatching requests, if we fail to get a driver tag, we mark the
> hardware queue as waiting for a tag and put the requests on a
> hctx->dispatch list to be run later when a driver tag
On Wed, 2017-04-05 at 17:16 -0700, Long Li wrote:
> Under heavy I/O, one hardware queue may be unable to dispatch any I/O to the
> device layer. This poses a problem with restarting this hardware queue on I/O
> finish in blk_mq_sched_restart_queues(), because there is nothing pending that
> will
> -----Original Message-----
> From: Bart Van Assche [mailto:bart.vanass...@sandisk.com]
> Sent: Wednesday, April 5, 2017 8:46 PM
> To: linux-ker...@vger.kernel.org; linux-block@vger.kernel.org; Long Li
> ; ax...@kernel.dk
> Cc: Stephen Hemminger ;
Turn the existing discard flag into a new BLKDEV_ZERO_UNMAP flag with
similar semantics, but without referring to discard.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
block/blk-lib.c
Make life easy for implementations that need to send a data buffer
to the device (e.g. SCSI) by numbering it as a data out command.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
Copy & paste from the REQ_OP_WRITE_SAME code.
Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
---
drivers/md/dm-core.h | 1 +
drivers/md/dm-io.c| 8 ++--
drivers/md/dm-linear.c| 1 +
drivers/md/dm-mpath.c |
Split sd_setup_discard_cmnd into one function per provisioning type. While
this creates some very slight duplication of boilerplate code it keeps the
code modular for additions of new provisioning types, and for reusing the
write same functions for the upcoming scsi implementation of the Write
This series makes REQ_OP_WRITE_ZEROES the only zeroing offload
supported by the block layer, and switches existing implementations
of REQ_OP_DISCARD that correctly set discard_zeroes_data to it,
removes incorrect discard_zeroes_data, and also switches WRITE SAME
based zeroing in SCSI to this new
The block layer core sets blk_mq_queue_data.list but no block
drivers read that member. Hence remove it and also the code that
is used to set this member.
Signed-off-by: Bart Van Assche
Reviewed-by: Christoph Hellwig
Reviewed-by: Sagi Grimberg
->retries counts the number of times a command is resubmitted, and should
be cleared the first time we see the command. We currently don't do
that for non-PCIe commands, which is easily fixed by moving the setup
to common code.
Signed-off-by: Christoph Hellwig
---
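The idea of clearing the retry counter in common setup code can be sketched
in user space as follows. struct fake_req, common_setup() and requeue() are
hypothetical names used only for illustration; this is not the driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request: 'prepared' marks that setup already ran once. */
struct fake_req {
	bool prepared;
	unsigned char retries;
};

/*
 * Common setup shared by every transport: clear the counter only the
 * first time the command is seen, so resubmissions keep their count.
 */
static void common_setup(struct fake_req *r)
{
	if (!r->prepared) {
		r->retries = 0;
		r->prepared = true;
	}
}

/* Count one resubmission; refuse once the limit is reached. */
static bool requeue(struct fake_req *r, unsigned char max_retries)
{
	if (r->retries >= max_retries)
		return false;
	r->retries++;
	return true;
}
```

Because the clearing lives in the shared path, a transport cannot forget it
and start a fresh command with a stale count.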
Instead of bloating the generic struct request with it.
Signed-off-by: Christoph Hellwig
Reviewed-by: Johannes Thumshirn
---
block/scsi_ioctl.c | 8
drivers/scsi/osd/osd_initiator.c | 2 +-
drivers/scsi/osst.c| 2 +-
This series fixes a few loose bits in terms of how nvme uses ->retries,
including fixing it for non-PCIe transports. While at it I noticed that
nvme and scsi use the field in entirely different ways, and no other
driver uses it at all. So I decided to move it into the nvme_request and
The way NVMe uses this field is entirely different from the older
SCSI/BLOCK_PC usage, so move it into struct nvme_request.
Also reduce the size of the field to an unsigned char so that we leave
space for additional smaller fields that will appear soon.
Signed-off-by: Christoph Hellwig
Don't pass the status explicitly but derive it from the request,
and unwind the complex condition to be more readable.
Signed-off-by: Christoph Hellwig
Reviewed-by: Johannes Thumshirn
---
drivers/nvme/host/core.c | 16 +++-
1 file changed, 11
Signed-off-by: Christoph Hellwig
Reviewed-by: Johannes Thumshirn
---
drivers/nvme/host/core.c | 3 +--
drivers/nvme/host/nvme.h | 2 --
2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index
It's just an in-driver reimplementation of writing zeroes to the pages,
which fails if the discards aren't page aligned.
Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
---
drivers/block/brd.c | 54 -
It seems like the code currently passes whatever it was using for writes
to WRITE SAME. Just switch it to WRITE ZEROES, although that doesn't
need any payload.
Untested, and confused by the code, maybe someone who understands it
better than me can help..
Signed-off-by: Christoph Hellwig
It's identical to discard as hole punches will always leave us with
zeroes on reads.
Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
---
drivers/block/loop.c | 4
1 file changed, 4 insertions(+)
diff --git a/drivers/block/loop.c
Try to use a write same with unmap bit variant if the device supports it
and the caller allows for it.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
drivers/scsi/sd.c | 9 +
1
drbd always wants its discard wire operations to zero the blocks, so
use blkdev_issue_zeroout with the BLKDEV_ZERO_UNMAP flag instead of
reinventing it poorly.
Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
---
drivers/block/drbd/drbd_debugfs.c | 3
This avoids fallbacks to explicit zeroing in (__)blkdev_issue_zeroout if
the caller doesn't want them.
Also clean up the convoluted check for the return condition that this
new flag is added to.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
But not for the real NVMe Write Zeroes yet, just to get rid of the
discard abuse for zeroing. Also rename the quirk flag to be a bit
more self-explanatory.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
Now that we have REQ_OP_WRITE_ZEROES implemented for all devices that
support efficient zeroing, we can remove the call to blkdev_issue_discard.
This means we only have two ways of zeroing left and can simplify the
code.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K.
We'll always use the WRITE ZEROES code for zeroing now.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
block/blk-lib.c | 4
1 file changed, 4 deletions(-)
diff --git
rsxx only supports discarding on large alignments, so the zeroing code
would always fall back to explicit writing of zeroes.
Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
---
drivers/block/rsxx/dev.c | 1 -
1 file changed, 1 deletion(-)
diff --git
If this flag is set, logical provisioning capable devices should
release space for the zeroed blocks if possible; if it is not set,
devices should keep the blocks anchored.
Also remove an out of sync kerneldoc comment for a static function
that would have become even more out of date with this
mmc only supports discarding on large alignments, so the zeroing code
would always fall back to explicit writing of zeroes.
Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
---
drivers/mmc/core/queue.c | 2 --
1 file changed, 2 deletions(-)
diff --git
This gets us support for non-discard efficient write of zeroes (e.g. NVMe)
and prepares for removing the discard_zeroes_data flag.
Also remove a pointless discard support check, which is done in
blkdev_issue_discard already.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K.
Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can
kill this hack.
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
Documentation/ABI/testing/sysfs-block | 10
From: "Martin K. Petersen"
Separating discards and zeroout operations allows us to remove the LBPRZ
block zeroing constraints from discards and honor the device preferences
for UNMAP commands.
If supported by the device, we'll also choose UNMAP over one of the
WRITE
From: "Martin K. Petersen"
Now that zeroout and discards are distinct operations we need to
separate the policy of choosing the appropriate command. Create a
zeroing_mode which can be one of:
write: Zeroout assist not present, use regular WRITE
On 04/05/2017 09:39 AM, Bart Van Assche wrote:
> The block layer core sets blk_mq_queue_data.list but no block
> drivers read that member. Hence remove it and also the code that
> is used to set this member.
Thanks, this came through fine. Applied for 4.12.
--
Jens Axboe
Fix up do_region to not allocate a bio_vec for discards. We've
got rid of the discard payload allocated by the caller years ago.
Obviously this wasn't actually harmful given how long it's been
there, but it's still good to avoid the pointless allocation.
Signed-off-by: Christoph Hellwig
Signed-off-by: Christoph Hellwig
Reviewed-by: Martin K. Petersen
Reviewed-by: Hannes Reinecke
---
drivers/scsi/sd.c | 31 ++-
drivers/scsi/sd_zbc.c | 1 +
2 files changed, 27 insertions(+), 5 deletions(-)
Christoph Hellwig writes:
> This driver is for pre-IDE hard disks that are only found in PCs from the
> stone age of personal computing, and which we don't support elsewhere
> in the kernel these days.
>
> It's also been marked broken forever.
Reviewed-by: Martin K. Petersen
On Wed, 2017-04-05 at 11:28 -0700, Omar Sandoval wrote:
> From: Omar Sandoval
>
> Trivial cleanup.
>
> Signed-off-by: Omar Sandoval
> ---
> block/blk-mq.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/block/blk-mq.c
Reviewed-by: Sagi Grimberg
Reviewed-by: Sagi Grimberg
Reviewed-by: Sagi Grimberg
Reviewed-by: Sagi Grimberg
On Wed, Apr 05 2017, Christoph Hellwig wrote:
> This series fixes a few loose bits in terms of how nvme uses ->retries,
> including fixing it for non-PCIe transports. While at it I noticed that
> nvme and scsi use the field in entirely different ways, and no other
> driver uses it at all. So I
From: Omar Sandoval
This is v2 of my series from a couple of days ago [1] with one extra fix
and two extra cleanups.
- Patch 1 is the new fix for a hang that Josef reported after trying v1.
- Patches 2-6 are the original series. Patch 5 now has Christoph's and
Sagi's Reviewed-by.
From: Omar Sandoval
In elevator_switch(), if blk_mq_init_sched() fails, we attempt to fall
back to the original scheduler. However, at this point, we've already
torn down the original scheduler's tags, so this causes a crash. Doing
the fallback like the legacy elevator path is
From: Omar Sandoval
Trivial cleanup.
Signed-off-by: Omar Sandoval
---
block/blk-mq.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 71dc8608f3a8..779249a5999b 100644
--- a/block/blk-mq.c
+++
From: Omar Sandoval
If a new hardware queue is added at runtime, we don't allocate scheduler
tags for it, leading to a crash. This hooks up the scheduler framework
to blk_mq_{init,exit}_hctx() to make sure everything gets properly
initialized/freed.
Signed-off-by: Omar Sandoval
From: Omar Sandoval
Schedulers need to be informed when a hardware queue is added or removed
at runtime so they can allocate/free per-hardware queue data. So,
replace the blk_mq_sched_init_hctx_data() helper, which only makes sense
at init time, with .init_hctx() and .exit_hctx()
From: Omar Sandoval
blk_mq_update_nr_hw_queues() used to remap hardware queues, which is the
behavior that drivers expect. However, commit 4e68a011428a changed
blk_mq_queue_reinit() to not remap queues for the case of CPU
hotplugging, inadvertently making
From: Omar Sandoval
Minor cleanup that makes it easier to figure out what's going on in the
driver tag allocation failure path of blk_mq_dispatch_rq_list().
Signed-off-by: Omar Sandoval
---
block/blk-mq.c | 19 +--
1 file changed, 9
From: Omar Sandoval
While dispatching requests, if we fail to get a driver tag, we mark the
hardware queue as waiting for a tag and put the requests on a
hctx->dispatch list to be run later when a driver tag is freed. However,
blk_mq_dispatch_rq_list() may dispatch requests from
From: Omar Sandoval
Schedulers need to be informed when a hardware queue is added or removed
at runtime so they can allocate/free per-hardware queue data. So,
replace the blk_mq_sched_init_hctx_data() helper, which only makes sense
at init time, with .init_hctx() and .exit_hctx()