On Mon, Apr 9, 2018 at 11:35 PM, Oleksandr Natalenko
wrote:
> Did your system hang on smartctl hammering too? Have you got some stack
> traces to compare with mine?
Unfortunately I only had a single hang with no dumps. I haven't been
able to reproduce it since. :(
-Kees
--
Kees Cook
Pixel
Hi.
09.04.2018 22:30, Kees Cook wrote:
echo 1 | tee /sys/block/sd*/queue/nr_requests
I can't get this below "4".
Oops, yeah. It cannot be less than BLKDEV_MIN_RQ (which is 4), so it is
enforced explicitly in queue_requests_store(). It is the same for me.
echo 1 | tee /sys/block/sd*/devic
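For readers following along, the lower bound described above can be pictured with a minimal user-space sketch. It is not the kernel's queue_requests_store() itself: only the constant BLKDEV_MIN_RQ and its value of 4 come from the message, and the helper name clamp_nr_requests() is made up for illustration.

```c
/* Lower bound named in the message above (BLKDEV_MIN_RQ is 4). */
#define BLKDEV_MIN_RQ 4

/*
 * Hypothetical helper, not a kernel function: returns the value that
 * would take effect after the lower bound is applied to a request
 * written to nr_requests.
 */
static unsigned long clamp_nr_requests(unsigned long requested)
{
    if (requested < BLKDEV_MIN_RQ)
        return BLKDEV_MIN_RQ;
    return requested;
}
```

Under this sketch, writing 1 yields an effective value of 4, which matches the "I can't get this below 4" observation.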
On Mon, 9 Apr 2018 17:39:15 +0200
Christoph Hellwig wrote:
> Same numerical value (for now at least), but a much better
> documentation of intent.
>
> Signed-off-by: Christoph Hellwig
> ---
> block/scsi_ioctl.c | 2 +-
> drivers/block/drbd/drbd_bitmap.c | 3 ++-
> drivers/bloc
On Mon, 9 Apr 2018 17:39:16 +0200
Christoph Hellwig wrote:
> blk_get_request is used for pass-through style I/O and thus doesn't
> need GFP_NOIO.
>
> Signed-off-by: Christoph Hellwig
> ---
> block/blk-core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/blk-c
On Mon, 9 Apr 2018 17:39:14 +0200
Christoph Hellwig wrote:
> We just can't do I/O when doing block layer requests allocations,
> so use GFP_NOIO instead of the even more limited __GFP_DIRECT_RECLAIM.
>
> Signed-off-by: Christoph Hellwig
> ---
> block/blk-core.c | 5 ++---
> 1 file changed, 2
On Mon, 9 Apr 2018 17:39:13 +0200
Christoph Hellwig wrote:
> blk_old_get_request already has it at hand, and in blk_queue_bio,
> which is the fast path, it is constant.
>
> Signed-off-by: Christoph Hellwig
> ---
> block/blk-core.c | 14 +++---
> drivers/scsi/scsi_error.c | 4
On Mon, 9 Apr 2018 17:39:12 +0200
Christoph Hellwig wrote:
> Switch everyone to blk_get_request_flags, and then rename
> blk_get_request_flags to blk_get_request.
>
> Signed-off-by: Christoph Hellwig
> ---
> block/blk-core.c | 14 +++---
> block/bsg.c
If a completion occurs after blk_mq_rq_timed_out() has reset
rq->aborted_gstate and the request is again in flight when the timeout
expires then a request will be completed twice: a first time by the
timeout handler and a second time when the regular completion occurs.
Additionally, the blk-mq tim
On Tue, 2018-04-10 at 09:30 +0800, Ming Lei wrote:
> Also is it possible to see queue freed here?
I think the caller should keep a reference on the request queue. Otherwise
we have a much bigger problem than a race between submitting a bio and
removing a request queue from the cgroup controller in
On Mon, Apr 09, 2018 at 10:54:57PM +, Bart Van Assche wrote:
> On Mon, 2018-04-09 at 14:54 +0800, Joseph Qi wrote:
> > The oops happens during generic_make_request_checks(), in
> > blk_throtl_bio() exactly.
> > So if we want to bypass dying queue, we have to check this before
> > generic_make_r
On 4/9/18 5:54 PM, Linus Torvalds wrote:
> On Mon, Apr 9, 2018 at 3:32 PM, Jens Axboe wrote:
>>
>> The resulting min/max and friends would have been trivial to test, but
>> clearly they weren't.
>
> Well, the min/max macros themselves actually were tested in user space by me.
>
> It was the inte
On Mon, Apr 9, 2018 at 3:32 PM, Jens Axboe wrote:
>
> The resulting min/max and friends would have been trivial to test, but
> clearly they weren't.
Well, the min/max macros themselves actually were tested in user space by me.
It was the interaction with the unrelated "min_not_zero()" that wasn'
On Mon, Apr 09, 2018 at 07:43:01PM -0400, Wakko Warner wrote:
> Ming Lei wrote:
> > On Mon, Apr 09, 2018 at 09:30:11PM +, Bart Van Assche wrote:
> > > Hello Ming,
> > >
> > > Can you have a look at this? The start of this e-mail thread is available
> > > at
> > > https://www.mail-archive.com/
Ming Lei wrote:
> On Mon, Apr 09, 2018 at 09:30:11PM +, Bart Van Assche wrote:
> > Hello Ming,
> >
> > Can you have a look at this? The start of this e-mail thread is available at
> > https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg72574.html.
>
> Sure, thanks for your sharing.
>
On Mon, Apr 09, 2018 at 09:30:11PM +, Bart Van Assche wrote:
> On Sun, 2018-04-08 at 12:02 -0400, Wakko Warner wrote:
> > I finished with git bisect. Here's the output:
> > 84c8590646d5b35804bac60eb58b145839b5893e is the first bad commit
> > commit 84c8590646d5b35804bac60eb58b145839b5893e
> >
On Mon, 2018-04-09 at 16:58 -0600, Jens Axboe wrote:
> This ends up being nutty in the generic_make_request() case, where we
> do the exact same enter/exit logic right after. That needs to get unified.
> Maybe move the queue enter into generic_make_request_checks(), and exit
> in the caller?
Hello
On Mon, Apr 9, 2018 at 1:30 PM, Kees Cook wrote:
> Ah! dm-crypt too. I'll see if I can get that added easily to my tests.
Quick update: I added dm-crypt (with XFS on top) and it hung my system
almost immediately. I got no warnings at all, though.
-Kees
--
Kees Cook
Pixel Security
On Mon, Apr 09, 2018 at 06:18:04PM +0800, Li Wang wrote:
> Hi,
>
> I got this BUG_ON() on s390x platform with kernel-v4.16.0.
>
> [1.200196] ------------[ cut here ]------------
> [1.200201] kernel BUG at block/bio.c:1798!
> [1.200228] illegal operation: 0001 ilc:1 [#1] SMP
> [1.2
On 4/9/18 4:38 PM, Kees Cook wrote:
> On Mon, Apr 9, 2018 at 3:32 PM, Jens Axboe wrote:
>> That's bad, for sure, but my worry was bigger than an oops or crash,
>> we could have had corruption due to this.
>>
>> The resulting min/max and friends would have been trivial to test, but
>> clearly they
On 4/9/18 4:54 PM, Bart Van Assche wrote:
> On Mon, 2018-04-09 at 14:54 +0800, Joseph Qi wrote:
>> The oops happens during generic_make_request_checks(), in
>> blk_throtl_bio() exactly.
>> So if we want to bypass dying queue, we have to check this before
>> generic_make_request_checks(), I think.
>
On Mon, 2018-04-09 at 14:54 +0800, Joseph Qi wrote:
> The oops happens during generic_make_request_checks(), in
> blk_throtl_bio() exactly.
> So if we want to bypass dying queue, we have to check this before
> generic_make_request_checks(), I think.
How about something like the patch below?
Thank
On Mon, Apr 9, 2018 at 3:32 PM, Jens Axboe wrote:
> That's bad, for sure, but my worry was bigger than an oops or crash,
> we could have had corruption due to this.
>
> The resulting min/max and friends would have been trivial to test, but
> clearly they weren't.
Yeah, that was bad luck and my fa
On 4/9/18 4:27 PM, Ming Lei wrote:
> On Mon, Apr 09, 2018 at 04:10:17PM -0600, Jens Axboe wrote:
>> On 4/9/18 4:05 PM, Kees Cook wrote:
>>> On Mon, Apr 9, 2018 at 2:56 PM, Jens Axboe wrote:
On 4/9/18 3:26 PM, Jens Axboe wrote:
> On 4/9/18 1:32 PM, Jens Axboe wrote:
>> On 4/9/18 12:38
On Mon, Apr 09, 2018 at 04:10:17PM -0600, Jens Axboe wrote:
> On 4/9/18 4:05 PM, Kees Cook wrote:
> > On Mon, Apr 9, 2018 at 2:56 PM, Jens Axboe wrote:
> >> On 4/9/18 3:26 PM, Jens Axboe wrote:
> >>> On 4/9/18 1:32 PM, Jens Axboe wrote:
> On 4/9/18 12:38 PM, Mike Snitzer wrote:
> > On Mon
On 4/9/18 4:05 PM, Kees Cook wrote:
> On Mon, Apr 9, 2018 at 2:56 PM, Jens Axboe wrote:
>> On 4/9/18 3:26 PM, Jens Axboe wrote:
>>> On 4/9/18 1:32 PM, Jens Axboe wrote:
On 4/9/18 12:38 PM, Mike Snitzer wrote:
> On Mon, Apr 09 2018 at 11:51am -0400,
> Mike Snitzer wrote:
>
>>
(cc'ing Joseph as he worked on the area recently, hi!)
Hello,
On Sat, Apr 07, 2018 at 12:21:48PM +0200, Alexandru Moise wrote:
> The q->id is used as an index within the blkg_tree radix tree.
>
> If the entry is not released before reclaiming the blk_queue_ida's id
> blkcg_init_queue() within a
On Mon, Apr 9, 2018 at 2:56 PM, Jens Axboe wrote:
> On 4/9/18 3:26 PM, Jens Axboe wrote:
>> On 4/9/18 1:32 PM, Jens Axboe wrote:
>>> On 4/9/18 12:38 PM, Mike Snitzer wrote:
On Mon, Apr 09 2018 at 11:51am -0400,
Mike Snitzer wrote:
> On Sun, Apr 08 2018 at 12:00am -0400,
> M
On 4/9/18 3:26 PM, Jens Axboe wrote:
> On 4/9/18 1:32 PM, Jens Axboe wrote:
>> On 4/9/18 12:38 PM, Mike Snitzer wrote:
>>> On Mon, Apr 09 2018 at 11:51am -0400,
>>> Mike Snitzer wrote:
>>>
On Sun, Apr 08 2018 at 12:00am -0400,
Ming Lei wrote:
> Hi,
>
> The following ker
Hello, Bart.
On Mon, Apr 09, 2018 at 09:30:27PM +, Bart Van Assche wrote:
> On Mon, 2018-04-09 at 11:56 -0700, t...@kernel.org wrote:
> > On Mon, Apr 09, 2018 at 05:03:05PM +, Bart Van Assche wrote:
> > > exist today in the blk-mq timeout handling code cannot be fixed completely
> > > usin
On Mon, 2018-04-09 at 11:56 -0700, t...@kernel.org wrote:
> On Mon, Apr 09, 2018 at 05:03:05PM +, Bart Van Assche wrote:
> > exist today in the blk-mq timeout handling code cannot be fixed completely
> > using RCU only.
>
> I really don't think that is that complicated. Let's first confirm
>
On Sun, 2018-04-08 at 12:02 -0400, Wakko Warner wrote:
> I finished with git bisect. Here's the output:
> 84c8590646d5b35804bac60eb58b145839b5893e is the first bad commit
> commit 84c8590646d5b35804bac60eb58b145839b5893e
> Author: Ming Lei
> Date: Fri Nov 11 20:05:32 2016 +0800
>
> target:
On 4/9/18 1:32 PM, Jens Axboe wrote:
> On 4/9/18 12:38 PM, Mike Snitzer wrote:
>> On Mon, Apr 09 2018 at 11:51am -0400,
>> Mike Snitzer wrote:
>>
>>> On Sun, Apr 08 2018 at 12:00am -0400,
>>> Ming Lei wrote:
>>>
Hi,
The following kernel oops(divide error) is triggered when running
On Mon, Apr 9, 2018 at 12:02 PM, Oleksandr Natalenko
wrote:
>
> Hi.
>
> (fancy details for linux-block and BFQ people go below)
>
> 09.04.2018 20:32, Kees Cook wrote:
>>
>> Ah, this detail I didn't have. I've changed my environment to
>>
>> build with:
>>
>> CONFIG_BLK_MQ_PCI=y
>> CONFIG_BLK_MQ_VI
On 4/9/18 12:38 PM, Mike Snitzer wrote:
> On Mon, Apr 09 2018 at 11:51am -0400,
> Mike Snitzer wrote:
>
>> On Sun, Apr 08 2018 at 12:00am -0400,
>> Ming Lei wrote:
>>
>>> Hi,
>>>
>>> The following kernel oops(divide error) is triggered when running
>>> xfstest(generic/347) on ext4.
>>>
>>> [ 44
Hi.
(fancy details for linux-block and BFQ people go below)
09.04.2018 20:32, Kees Cook wrote:
Ah, this detail I didn't have. I've changed my environment to
build with:
CONFIG_BLK_MQ_PCI=y
CONFIG_BLK_MQ_VIRTIO=y
CONFIG_IOSCHED_BFQ=y
boot with scsi_mod.use_blk_mq=1
and select BFQ in the sche
Hello,
On Mon, Apr 09, 2018 at 05:03:05PM +, Bart Van Assche wrote:
> My opinion is not only that the two patches that you posted recently do not
> fix all the races that are fixed by this patch but also that the races that
The race was with the path where the ownership of a timed out request
On Mon, Apr 09 2018 at 11:51am -0400,
Mike Snitzer wrote:
> On Sun, Apr 08 2018 at 12:00am -0400,
> Ming Lei wrote:
>
> > Hi,
> >
> > The following kernel oops(divide error) is triggered when running
> > xfstest(generic/347) on ext4.
> >
> > [ 442.632954] run fstests generic/347 at 2018-04-0
On Sun, Apr 8, 2018 at 12:07 PM, Oleksandr Natalenko
wrote:
> So far, I wasn't able to trigger this with mq-deadline (or without blk-mq).
> Maybe, this has something to do with blk-mq+BFQ re-queuing, or it's just me
> not being persistent enough.
Ah, this detail I didn't have. I've changed my env
On Mon 09-04-18 15:03:45, Bart Van Assche wrote:
> On Mon, 2018-04-09 at 11:00 +0200, Michal Hocko wrote:
> > On Mon 09-04-18 04:46:22, Bart Van Assche wrote:
> > [...]
> > [...]
> > > diff --git a/drivers/ide/ide-pm.c b/drivers/ide/ide-pm.c
> > > index ad8a125defdd..3ddb464b72e6 100644
> > > --- a
On Mon, Apr 09, 2018 at 09:52:03AM -0700, Matthew Wilcox wrote:
> On Mon, Apr 09, 2018 at 05:39:16PM +0200, Christoph Hellwig wrote:
> > blk_get_request is used for pass-through style I/O and thus doesn't need
> > GFP_NOIO.
>
> Obviously GFP_KERNEL is a big improvement over GFP_NOIO! But can we t
On Mon, Apr 09, 2018 at 09:03:54AM -0700, Matthew Wilcox wrote:
> > @@ -499,7 +499,7 @@ int sg_scsi_ioctl(struct request_queue *q, struct
> > gendisk *disk, fmode_t mode,
> > break;
> > }
> >
> > - if (bytes && blk_rq_map_kern(q, rq, buffer, bytes, __GFP_RECLAIM)) {
> > + if
On Mon, 2018-04-09 at 09:47 -0700, Tejun Heo wrote:
> On Sun, Apr 08, 2018 at 10:20:38PM -0700, Bart Van Assche wrote:
> > If a completion occurs after blk_mq_rq_timed_out() has reset
> > rq->aborted_gstate and the request is again in flight when the timeout
> > expires then a request will be compl
On Mon, Apr 09, 2018 at 05:39:16PM +0200, Christoph Hellwig wrote:
> blk_get_request is used for pass-through style I/O and thus doesn't need
> GFP_NOIO.
Obviously GFP_KERNEL is a big improvement over GFP_NOIO! But can we take
it all the way to GFP_USER, if this is always done in the ioctl path
(
Hello, Sagi.
On Mon, Apr 09, 2018 at 11:37:15AM +0300, Sagi Grimberg wrote:
>
> >If a completion occurs after blk_mq_rq_timed_out() has reset
> >rq->aborted_gstate and the request is again in flight when the timeout
> >expires then a request will be completed twice: a first time by the
> >timeout
Hey, Bart.
On Sun, Apr 08, 2018 at 10:20:38PM -0700, Bart Van Assche wrote:
> If a completion occurs after blk_mq_rq_timed_out() has reset
> rq->aborted_gstate and the request is again in flight when the timeout
> expires then a request will be completed twice: a first time by the
> timeout handle
On Mon, Apr 09, 2018 at 05:39:15PM +0200, Christoph Hellwig wrote:
> Same numerical value (for now at least), but a much better documentation
> of intent.
> @@ -499,7 +499,7 @@ int sg_scsi_ioctl(struct request_queue *q, struct gendisk
> *disk, fmode_t mode,
> break;
> }
>
>
On 2018-04-09 02:17 AM, Hannes Reinecke wrote:
On 04/09/2018 04:08 AM, Tim Walker wrote:
On Fri, Apr 6, 2018 at 11:09 AM, Douglas Gilbert wrote:
On 2018-04-06 02:42 AM, Christoph Hellwig wrote:
On Fri, Apr 06, 2018 at 08:24:18AM +0200, Hannes Reinecke wrote:
Ah. Far better.
What about del
Hi.
09.04.2018 11:35, Christoph Hellwig wrote:
I really can't make sense of that report.
Sorry, I have nothing to add there so far, I just see the symptom of
something going wrong in the ioctl code path that is invoked by
smartctl, but I have no idea what's the minimal environment to reprodu
On Sun, Apr 08 2018 at 12:00am -0400,
Ming Lei wrote:
> Hi,
>
> The following kernel oops(divide error) is triggered when running
> xfstest(generic/347) on ext4.
>
> [ 442.632954] run fstests generic/347 at 2018-04-07 18:06:44
> [ 443.839480] divide error: [#1] PREEMPT SMP PTI
> [ 443.8
On 08.04.2018 11:48, Ming Lei wrote:
Hi Jens,
The first two patches fix issues about queue mapping.
The other 6 patches improve queue mapping for blk-mq.
Christian, these patches should fix your issue, so please give them
a test, and the patches can be found in the following tree:
https://g
On 4/9/2018 11:37 AM, Sagi Grimberg wrote:
If a completion occurs after blk_mq_rq_timed_out() has reset
rq->aborted_gstate and the request is again in flight when the timeout
expires then a request will be completed twice: a first time by the
timeout handler and a second time when the regular c
We just can't do I/O when doing block layer requests allocations,
so use GFP_NOIO instead of the even more limited __GFP_DIRECT_RECLAIM.
Signed-off-by: Christoph Hellwig
---
block/blk-core.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/block/blk-core.c b/block/blk-cor
blk_old_get_request already has it at hand, and in blk_queue_bio, which
is the fast path, it is constant.
Signed-off-by: Christoph Hellwig
---
block/blk-core.c | 14 +++---
drivers/scsi/scsi_error.c | 4
2 files changed, 7 insertions(+), 11 deletions(-)
diff --git a/block
blk_get_request is used for pass-through style I/O and thus doesn't need
GFP_NOIO.
Signed-off-by: Christoph Hellwig
---
block/blk-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 432923751551..253a869558f9 100644
--- a/block/blk
Switch everyone to blk_get_request_flags, and then rename
blk_get_request_flags to blk_get_request.
Signed-off-by: Christoph Hellwig
---
block/blk-core.c | 14 +++---
block/bsg.c| 5 ++---
block/scsi_ioctl.c | 8 +++-
driver
Same numerical value (for now at least), but a much better documentation
of intent.
Signed-off-by: Christoph Hellwig
---
block/scsi_ioctl.c | 2 +-
drivers/block/drbd/drbd_bitmap.c | 3 ++-
drivers/block/pktcdvd.c | 2 +-
drivers/ide/ide-tape.c | 2 +-
driver
Always GFP_KERNEL, and keeping it would cause serious complications for
the next change.
Signed-off-by: Christoph Hellwig
---
drivers/scsi/osd/osd_initiator.c | 24 +++-
fs/exofs/ore.c | 10 +-
fs/exofs/super.c | 2 +-
include/scsi/o
Hi all,
this series sorts out the mess around how we use gfp flags in the
block layer get_request interface.
Signed-off-by: Christoph Hellwig
---
block/blk-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index abcb8684ba67..abde22c755ab 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1517,7 +1517,7 @@ static struct request *__get_re
On Mon, 2018-04-09 at 01:26 -0700, Christoph Hellwig wrote:
> On Mon, Apr 09, 2018 at 08:53:49AM +0200, Hannes Reinecke wrote:
> > Why don't you fold the 'flags' argument into the 'gfp_flags', and drop
> > the 'flags' argument completely?
> > Looks a bit pointless to me, having two arguments denoti
On Mon, Apr 09, 2018 at 01:26:50AM -0700, Christoph Hellwig wrote:
> On Mon, Apr 09, 2018 at 08:53:49AM +0200, Hannes Reinecke wrote:
> > Why don't you fold the 'flags' argument into the 'gfp_flags', and drop
> > the 'flags' argument completely?
> > Looks a bit pointless to me, having two arguments
On Mon, 2018-04-09 at 11:00 +0200, Michal Hocko wrote:
> On Mon 09-04-18 04:46:22, Bart Van Assche wrote:
> [...]
> [...]
> > diff --git a/drivers/ide/ide-pm.c b/drivers/ide/ide-pm.c
> > index ad8a125defdd..3ddb464b72e6 100644
> > --- a/drivers/ide/ide-pm.c
> > +++ b/drivers/ide/ide-pm.c
> > @@ -91
On 4/9/18 8:58 AM, Bart Van Assche wrote:
> On Mon, 2018-04-09 at 11:37 +0200, Christoph Hellwig wrote:
>> This looks sensible, but I'm worried about taking a whole spinlock
>> for every request completion, including irq disabling. However it seems
>> like your new updated pattern would fit use of
On 4/8/18 3:48 AM, Ming Lei wrote:
> Hi Jens,
>
> The first two patches fix issues about queue mapping.
>
> The other 6 patches improve queue mapping for blk-mq.
>
> Christian, these patches should fix your issue, so please give them
> a test, and the patches can be found in the following tree:
>
>
On Mon, 2018-04-09 at 11:37 +0200, Christoph Hellwig wrote:
> This looks sensible, but I'm worried about taking a whole spinlock
> for every request completion, including irq disabling. However it seems
> like your new updated pattern would fit use of cmpxchg() very nicely.
Hello Christoph,
Than
On Mon, 2018-04-09 at 11:37 +0300, Sagi Grimberg wrote:
> > If a completion occurs after blk_mq_rq_timed_out() has reset
> > rq->aborted_gstate and the request is again in flight when the timeout
> > expires then a request will be completed twice: a first time by the
> > timeout handler and a secon
On Mon, Apr 09, 2018 at 11:31:37AM +0300, Sagi Grimberg wrote:
>
> > > My device exposes nr_hw_queues which is not higher than num_online_cpus
> > > so I want to connect all hctxs with hope that they will be used.
> >
> > The issue is that CPU online & offline can happen any time, and after
> > b
Looks good,
Reviewed-by: Sagi Grimberg
Looks good,
Reviewed-by: Sagi Grimberg
Looks good,
Reviewed-by: Sagi Grimberg
Looks good,
Reviewed-by: Sagi Grimberg
Looks good,
Reviewed-by: Sagi Grimberg
Looks good,
Reviewed-by: Sagi Grimberg
Looks good,
Reviewed-by: Sagi Grimberg
Looks good,
Reviewed-by: Sagi Grimberg
On 04/08/2018 11:48 AM, Ming Lei wrote:
> Hi Jens,
>
> The first two patches fix issues about queue mapping.
>
> The other 6 patches improve queue mapping for blk-mq.
>
> Christian, these patches should fix your issue, so please give them
> a test, and the patches can be found in the following tree
On Sun, Apr 08, 2018 at 05:48:14PM +0800, Ming Lei wrote:
> Firstly, from commit 4b855ad37194 ("blk-mq: Create hctx for each present CPU"),
> blk-mq doesn't remap queue any more after CPU topo is changed.
>
> Secondly, set->nr_hw_queues can't be bigger than nr_cpu_ids, and now we map
> all possible
On Sun, Apr 08, 2018 at 05:48:13PM +0800, Ming Lei wrote:
> Now the actual meaning of queue mapped is that if there is any online
> CPU mapped to this hctx, so implement blk_mq_hw_queue_mapped() in this
> way.
Reviewed-by: Christoph Hellwig
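The definition quoted above ("mapped" means at least one online CPU is mapped to the hctx) could be sketched roughly as follows. This is an assumption-laden illustration, not the patch: struct hctx_sketch, hw_queue_mapped() and the 64-bit masks are hypothetical stand-ins for the kernel's hctx and cpumask types.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a hardware context: bit N of cpumask is set
 * when CPU N is mapped to this hctx. The kernel uses cpumask types
 * instead of a plain 64-bit word.
 */
struct hctx_sketch {
    uint64_t cpumask;
};

/*
 * A queue counts as mapped when at least one CPU mapped to it is
 * currently online, i.e. the two masks intersect.
 */
static bool hw_queue_mapped(const struct hctx_sketch *hctx,
                            uint64_t online_mask)
{
    return (hctx->cpumask & online_mask) != 0;
}
```

With CPUs 2 and 3 mapped to an hctx, the sketch reports it as mapped while CPU 2 is online and as unmapped when only CPUs 0 and 1 are online.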
On Sun, Apr 08, 2018 at 05:48:12PM +0800, Ming Lei wrote:
> There are several reasons for removing the check:
>
> 1) blk_mq_hw_queue_mapped() returns true always now since each hctx
> may be mapped by one CPU at least
Sounds like we should remove it, then..
The patch looks good:
Reviewed-by: Ch
On Sun, Apr 08, 2018 at 05:48:11PM +0800, Ming Lei wrote:
> No driver uses this interface any more, so remove it.
Looks good,
Reviewed-by: Christoph Hellwig
On Sun, Apr 08, 2018 at 05:48:10PM +0800, Ming Lei wrote:
> This patch introduces helper of blk_mq_hw_queue_first_cpu() for
> figuring out the hctx's first cpu, and code duplication can be
> avoided.
Looks good,
Reviewed-by: Christoph Hellwig
On Sun, Apr 08, 2018 at 05:48:09PM +0800, Ming Lei wrote:
> This patch figures out the final selected CPU, then writes
> it to hctx->next_cpu once, so that other dispatch paths never
> observe an intermediate next cpu.
Looks good,
Reviewed-by: Christoph Hellwig
On Sun, Apr 08, 2018 at 05:48:08PM +0800, Ming Lei wrote:
> From commit 4b855ad37194 ("blk-mq: Create hctx for each present CPU"),
> blk-mq doesn't remap queue after CPU topo is changed, that said when
> some of these offline CPUs become online, they are still mapped to
> hctx 0, then hctx 0 may be
On Sun, Apr 08, 2018 at 05:48:07PM +0800, Ming Lei wrote:
> From commit 20e4d81393196 (blk-mq: simplify queue mapping & schedule
> with each possisble CPU), one hctx can be mapped from all offline CPUs,
> then hctx->next_cpu can be set as wrong.
>
> This patch fixes this issue by making hctx->nex
This looks sensible, but I'm worried about taking a whole spinlock
for every request completion, including irq disabling. However it seems
like your new updated pattern would fit use of cmpxchg() very nicely.
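To make the cmpxchg() suggestion concrete, here is a minimal user-space sketch using C11 atomics in place of the kernel's cmpxchg(). It is not the patch under review: the state names, struct request_sketch and claim_completion() are all hypothetical. The idea shown is only that the regular completion path and the timeout path both attempt the same state transition, and the compare-and-exchange lets exactly one of them succeed without taking a spinlock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request states; not the kernel's actual encoding. */
enum rq_state { RQ_IDLE, RQ_IN_FLIGHT, RQ_COMPLETE };

struct request_sketch {
    _Atomic int state;
};

/*
 * Try to move the request from RQ_IN_FLIGHT to RQ_COMPLETE.
 * Both the completion path and the timeout path would call this;
 * only the caller that wins the compare-and-exchange returns true
 * and may go on to complete the request.
 */
static bool claim_completion(struct request_sketch *rq)
{
    int expected = RQ_IN_FLIGHT;

    return atomic_compare_exchange_strong(&rq->state, &expected,
                                          RQ_COMPLETE);
}
```

The sketch deliberately ignores request reuse (a request that has been recycled and is in flight again when a stale timeout fires), which is part of the race discussed in this thread and would need something like a generation counter folded into the compared value.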
I really can't make sense of that report. And I'm also curious why
you think 17cb960f29c2 should change anything for that code path.
On Fri, Apr 06, 2018 at 01:09:08PM -0400, Douglas Gilbert wrote:
> So you found a document that outlines NVMe's architecture! Could you
> share the url (no marketing BS, please)?
You can always take a look at the actual spec:
http://nvmexpress.org/wp-content/uploads/NVM-Express-1_3a-20171024_rati
Hi Sagi
Sorry for the late response, below patch works, here is the full log:
Thanks for testing!
Now that we isolated the issue, the question is if this fix is correct
given that we are guaranteed that the connect context will run on an
online cpu?
another reference to the patch (we can ma
On 04/09/2018 04:54 PM, Yi Zhang wrote:
On 04/09/2018 04:31 PM, Sagi Grimberg wrote:
My device exposes nr_hw_queues which is not higher than
num_online_cpus
so I want to connect all hctxs with hope that they will be used.
The issue is that CPU online & offline can happen any time, and a
On Mon 09-04-18 04:46:22, Bart Van Assche wrote:
[...]
[...]
> diff --git a/drivers/ide/ide-pm.c b/drivers/ide/ide-pm.c
> index ad8a125defdd..3ddb464b72e6 100644
> --- a/drivers/ide/ide-pm.c
> +++ b/drivers/ide/ide-pm.c
> @@ -91,7 +91,7 @@ int generic_ide_resume(struct device *dev)
>
> mems
On 04/09/2018 04:31 PM, Sagi Grimberg wrote:
My device exposes nr_hw_queues which is not higher than num_online_cpus
so I want to connect all hctxs with hope that they will be used.
The issue is that CPU online & offline can happen any time, and after
blk-mq removes CPU hotplug handler, the
If a completion occurs after blk_mq_rq_timed_out() has reset
rq->aborted_gstate and the request is again in flight when the timeout
expires then a request will be completed twice: a first time by the
timeout handler and a second time when the regular completion occurs.
Additionally, the blk-mq
My device exposes nr_hw_queues which is not higher than num_online_cpus
so I want to connect all hctxs with hope that they will be used.
The issue is that CPU online & offline can happen any time, and after
blk-mq removes CPU hotplug handler, there is no way to remap queue
when CPU topo is cha
On Mon, Apr 09, 2018 at 08:53:49AM +0200, Hannes Reinecke wrote:
> Why don't you fold the 'flags' argument into the 'gfp_flags', and drop
> the 'flags' argument completely?
> Looks a bit pointless to me, having two arguments denoting basically
> the same ...
Wrong way around. gfp_flags doesn't re