Re: [PATCH 4.21 V3] blk-mq: not embed .mq_kobj and ctx->kobj into queue instance

2018-11-20 Thread Jens Axboe
On 11/20/18 9:45 AM, Guenter Roeck wrote:
> On Tue, Nov 20, 2018 at 08:51:50AM +0100, Greg Kroah-Hartman wrote:
>> On Tue, Nov 20, 2018 at 09:44:35AM +0800, Ming Lei wrote:
>>> Even though .mq_kobj, ctx->kobj and q->kobj share same lifetime
>>> from block layer's view, actually they don't because userspace may
>>> grab one kobject anytime via sysfs.
>>>
>>> This patch fixes the issue by the following approach:
>>>
>>> 1) introduce 'struct blk_mq_ctxs' for holding .mq_kobj and managing
>>> all ctxs
>>>
>>> 2) free all allocated ctxs and the 'blk_mq_ctxs' instance in release
>>> handler of .mq_kobj
>>>
>>> 3) grab one ref of .mq_kobj before initializing each ctx->kobj, so that
>>> .mq_kobj is always released after all ctxs are freed.
>>>
>>> This patch fixes kernel panic issue during booting when 
>>> DEBUG_KOBJECT_RELEASE
>>> is enabled.
>>>
>>> Reported-by: Guenter Roeck 
>>> Cc: "jianchao.wang" 
>>> Cc: Guenter Roeck 
>>> Cc: Greg Kroah-Hartman 
>>> Cc: sta...@vger.kernel.org
>>> Signed-off-by: Ming Lei 
>>> ---
>>> V3:
>>> - keep to allocate q->queue_ctx via percpu allocator, so one extra
>>>   pointer reference can be saved for getting ctx
>>> V2:
>>> - allocate 'blk_mq_ctx' inside blk_mq_init_allocated_queue()
>>> - allocate q->mq_kobj directly 
>>
>> Not tested, but seems sane from a kobject point-of-view:
>>
>> Reviewed-by: Greg Kroah-Hartman 
> 
> Tested-by: Guenter Roeck 
> 
> with v4.14.y and v4.19.y.
> 
> The patch is marked for v4.21. I would kindly suggest to not wait for v4.21
> but apply it to v4.20. This would let us enable DEBUG_KOBJECT_RELEASE
> with syzbot on upstream and stable kernels.

I'd very much like to put this into 4.21, and not 4.20, as that's much
less risky. This isn't a new regression anyway, so there's no rush to
put it into 4.20 as far as I'm concerned.

-- 
Jens Axboe



Re: [PATCH 4.21 V3] blk-mq: not embed .mq_kobj and ctx->kobj into queue instance

2018-11-20 Thread Guenter Roeck
On Tue, Nov 20, 2018 at 08:51:50AM +0100, Greg Kroah-Hartman wrote:
> On Tue, Nov 20, 2018 at 09:44:35AM +0800, Ming Lei wrote:
> > Even though .mq_kobj, ctx->kobj and q->kobj share same lifetime
> > from block layer's view, actually they don't because userspace may
> > grab one kobject anytime via sysfs.
> > 
> > This patch fixes the issue by the following approach:
> > 
> > 1) introduce 'struct blk_mq_ctxs' for holding .mq_kobj and managing
> > all ctxs
> > 
> > 2) free all allocated ctxs and the 'blk_mq_ctxs' instance in release
> > handler of .mq_kobj
> > 
> > 3) grab one ref of .mq_kobj before initializing each ctx->kobj, so that
> > .mq_kobj is always released after all ctxs are freed.
> > 
> > This patch fixes kernel panic issue during booting when 
> > DEBUG_KOBJECT_RELEASE
> > is enabled.
> > 
> > Reported-by: Guenter Roeck 
> > Cc: "jianchao.wang" 
> > Cc: Guenter Roeck 
> > Cc: Greg Kroah-Hartman 
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Ming Lei 
> > ---
> > V3:
> > - keep to allocate q->queue_ctx via percpu allocator, so one extra
> >   pointer reference can be saved for getting ctx
> > V2:
> > - allocate 'blk_mq_ctx' inside blk_mq_init_allocated_queue()
> > - allocate q->mq_kobj directly 
> 
> Not tested, but seems sane from a kobject point-of-view:
> 
> Reviewed-by: Greg Kroah-Hartman 

Tested-by: Guenter Roeck 

with v4.14.y and v4.19.y.

The patch is marked for v4.21. I would kindly suggest to not wait for v4.21
but apply it to v4.20. This would let us enable DEBUG_KOBJECT_RELEASE
with syzbot on upstream and stable kernels.

Greg, applying the patch to v4.14.y will require a backport due to a minor
context conflict. I'll send that to you after the patch is available in
mainline.

Thanks,
Guenter


Re: [PATCH 4.21 V3] blk-mq: not embed .mq_kobj and ctx->kobj into queue instance

2018-11-19 Thread Greg Kroah-Hartman
On Tue, Nov 20, 2018 at 09:44:35AM +0800, Ming Lei wrote:
> Even though .mq_kobj, ctx->kobj and q->kobj share same lifetime
> from block layer's view, actually they don't because userspace may
> grab one kobject anytime via sysfs.
> 
> This patch fixes the issue by the following approach:
> 
> 1) introduce 'struct blk_mq_ctxs' for holding .mq_kobj and managing
> all ctxs
> 
> 2) free all allocated ctxs and the 'blk_mq_ctxs' instance in release
> handler of .mq_kobj
> 
> 3) grab one ref of .mq_kobj before initializing each ctx->kobj, so that
> .mq_kobj is always released after all ctxs are freed.
> 
> This patch fixes kernel panic issue during booting when DEBUG_KOBJECT_RELEASE
> is enabled.
> 
> Reported-by: Guenter Roeck 
> Cc: "jianchao.wang" 
> Cc: Guenter Roeck 
> Cc: Greg Kroah-Hartman 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Ming Lei 
> ---
> V3:
>   - keep to allocate q->queue_ctx via percpu allocator, so one extra
> pointer reference can be saved for getting ctx
> V2:
>   - allocate 'blk_mq_ctx' inside blk_mq_init_allocated_queue()
>   - allocate q->mq_kobj directly 

Not tested, but seems sane from a kobject point-of-view:

Reviewed-by: Greg Kroah-Hartman 


[PATCH 4.21 V3] blk-mq: not embed .mq_kobj and ctx->kobj into queue instance

2018-11-19 Thread Ming Lei
Even though .mq_kobj, ctx->kobj and q->kobj share same lifetime
from block layer's view, actually they don't because userspace may
grab one kobject anytime via sysfs.

This patch fixes the issue by the following approach:

1) introduce 'struct blk_mq_ctxs' for holding .mq_kobj and managing
all ctxs

2) free all allocated ctxs and the 'blk_mq_ctxs' instance in release
handler of .mq_kobj

3) grab one ref of .mq_kobj before initializing each ctx->kobj, so that
.mq_kobj is always released after all ctxs are freed.

This patch fixes kernel panic issue during booting when DEBUG_KOBJECT_RELEASE
is enabled.

Reported-by: Guenter Roeck 
Cc: "jianchao.wang" 
Cc: Guenter Roeck 
Cc: Greg Kroah-Hartman 
Cc: sta...@vger.kernel.org
Signed-off-by: Ming Lei 
---
V3:
- keep to allocate q->queue_ctx via percpu allocator, so one extra
  pointer reference can be saved for getting ctx
V2:
- allocate 'blk_mq_ctx' inside blk_mq_init_allocated_queue()
- allocate q->mq_kobj directly 

 block/blk-mq-sysfs.c   | 34 --
 block/blk-mq.c | 39 ---
 block/blk-mq.h |  6 ++
 include/linux/blkdev.h |  2 +-
 4 files changed, 63 insertions(+), 18 deletions(-)

diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c
index 3d25b9c419e9..6efef1f679f0 100644
--- a/block/blk-mq-sysfs.c
+++ b/block/blk-mq-sysfs.c
@@ -15,6 +15,18 @@
 
 static void blk_mq_sysfs_release(struct kobject *kobj)
 {
+   struct blk_mq_ctxs *ctxs = container_of(kobj, struct blk_mq_ctxs, kobj);
+
+   free_percpu(ctxs->queue_ctx);
+   kfree(ctxs);
+}
+
+static void blk_mq_ctx_sysfs_release(struct kobject *kobj)
+{
+   struct blk_mq_ctx *ctx = container_of(kobj, struct blk_mq_ctx, kobj);
+
+   /* ctx->ctxs won't be released until all ctx are freed */
+   kobject_put(>ctxs->kobj);
 }
 
 static void blk_mq_hw_sysfs_release(struct kobject *kobj)
@@ -213,7 +225,7 @@ static struct kobj_type blk_mq_ktype = {
 static struct kobj_type blk_mq_ctx_ktype = {
.sysfs_ops  = _mq_sysfs_ops,
.default_attrs  = default_ctx_attrs,
-   .release= blk_mq_sysfs_release,
+   .release= blk_mq_ctx_sysfs_release,
 };
 
 static struct kobj_type blk_mq_hw_ktype = {
@@ -245,7 +257,7 @@ static int blk_mq_register_hctx(struct blk_mq_hw_ctx *hctx)
if (!hctx->nr_ctx)
return 0;
 
-   ret = kobject_add(>kobj, >mq_kobj, "%u", hctx->queue_num);
+   ret = kobject_add(>kobj, q->mq_kobj, "%u", hctx->queue_num);
if (ret)
return ret;
 
@@ -268,8 +280,8 @@ void blk_mq_unregister_dev(struct device *dev, struct 
request_queue *q)
queue_for_each_hw_ctx(q, hctx, i)
blk_mq_unregister_hctx(hctx);
 
-   kobject_uevent(>mq_kobj, KOBJ_REMOVE);
-   kobject_del(>mq_kobj);
+   kobject_uevent(q->mq_kobj, KOBJ_REMOVE);
+   kobject_del(q->mq_kobj);
kobject_put(>kobj);
 
q->mq_sysfs_init_done = false;
@@ -289,7 +301,7 @@ void blk_mq_sysfs_deinit(struct request_queue *q)
ctx = per_cpu_ptr(q->queue_ctx, cpu);
kobject_put(>kobj);
}
-   kobject_put(>mq_kobj);
+   kobject_put(q->mq_kobj);
 }
 
 void blk_mq_sysfs_init(struct request_queue *q)
@@ -297,10 +309,12 @@ void blk_mq_sysfs_init(struct request_queue *q)
struct blk_mq_ctx *ctx;
int cpu;
 
-   kobject_init(>mq_kobj, _mq_ktype);
+   kobject_init(q->mq_kobj, _mq_ktype);
 
for_each_possible_cpu(cpu) {
ctx = per_cpu_ptr(q->queue_ctx, cpu);
+
+   kobject_get(q->mq_kobj);
kobject_init(>kobj, _mq_ctx_ktype);
}
 }
@@ -313,11 +327,11 @@ int __blk_mq_register_dev(struct device *dev, struct 
request_queue *q)
WARN_ON_ONCE(!q->kobj.parent);
lockdep_assert_held(>sysfs_lock);
 
-   ret = kobject_add(>mq_kobj, kobject_get(>kobj), "%s", "mq");
+   ret = kobject_add(q->mq_kobj, kobject_get(>kobj), "%s", "mq");
if (ret < 0)
goto out;
 
-   kobject_uevent(>mq_kobj, KOBJ_ADD);
+   kobject_uevent(q->mq_kobj, KOBJ_ADD);
 
queue_for_each_hw_ctx(q, hctx, i) {
ret = blk_mq_register_hctx(hctx);
@@ -334,8 +348,8 @@ int __blk_mq_register_dev(struct device *dev, struct 
request_queue *q)
while (--i >= 0)
blk_mq_unregister_hctx(q->queue_hw_ctx[i]);
 
-   kobject_uevent(>mq_kobj, KOBJ_REMOVE);
-   kobject_del(>mq_kobj);
+   kobject_uevent(q->mq_kobj, KOBJ_REMOVE);
+   kobject_del(q->mq_kobj);
kobject_put(>kobj);
return ret;
 }
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 32b246ed44c0..9228f2e9bd40 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2515,6 +2515,34 @@ static void blk_mq_add_queue_tag_set(struct 
blk_mq_tag_set *set,
mutex_unlock(>tag_list_lock);
 }
 
+/* All allocations will be freed in release handler of q->mq_kobj */
+static