Re: [Regression] Linux-Next Merge 25Jul2018 breaks mmc on Tegra.

2018-07-30 Thread Ming Lei
Hi Peter,

Thanks for collecting the log.

On Mon, Jul 30, 2018 at 02:55:42PM -0400, Peter Geis wrote:
> 
> 
> On 07/28/2018 09:37 AM, Ming Lei wrote:

...

> [   10.887209] systemd--112 0.n.1 2411122us : blk_mq_make_request: make
> rq -1
> [   10.890274] kworker/-98  0...1 2411506us : blk_mq_free_request:
> complete: rq -1
> [   10.893313] systemd--107 0...1 2412025us : blk_mq_make_request: make
> rq -1
> [   10.896354] systemd--107 0 2412323us : mmc_mq_queue_rq: queue rq
> -1, 0
> [   10.899388] systemd--107 0 2412327us :
> blk_mq_try_issue_list_directly: issue direct: rq -1, ret 0
> [   10.902463] (direxec-111 1...1 2413829us : blk_mq_make_request: make
> rq -1
> [   10.905513] systemd--114 1...1 2415159us : blk_mq_make_request: make
> rq -1

Above is the most interesting part in the log. MMC sets hw queue depth
as 1, and you are using none scheduler, that means the max number of
in-flight requests should be one, but the above log shows that there may
be 3 in-flight requests.

That seems really weird, but it shouldn't be related with my two patches,
which won't change the tag allocation behaviour at all. However, what matters
may be that the patch speeds up the request dispatch. Maybe one bug
in lib/sbitmap.c block/blk-mq-tag.c.

Unfortunately rq->tag wasn't shown in the log because I forget to dump
it in the debug patch, so could you apply the following new debug patch and
provide us the log again? BTW, please attach the ftrace log in the reply
mail directly, then it may be parsed/looked easily.

>From c9ffdcbb71811b6b450ab83f9f97ae227596ed6a Mon Sep 17 00:00:00 2001
From: Ming Lei 
Date: Tue, 31 Jul 2018 08:31:58 +0800
Subject: [PATCH] mmc-blk-mq-debug: debug info

Signed-off-by: Ming Lei 
---
 block/blk-mq-sched.c |  2 ++
 block/blk-mq.c   |  8 
 drivers/mmc/core/queue.c | 16 
 drivers/mmc/host/sdhci.c |  1 +
 include/linux/blkdev.h   |  1 +
 5 files changed, 28 insertions(+)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index cf9c66c6d35a..6573684b07e5 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -415,6 +415,8 @@ void blk_mq_sched_insert_requests(struct request_queue *q,
blk_mq_try_issue_list_directly(hctx, list);
if (list_empty(list))
return;
+   if (blk_queue_debug(q))
+   trace_printk("issue direct: partial done\n");
}
blk_mq_insert_requests(hctx, ctx, list);
}
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e13bdc2707ce..fe807cd1f9ef 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -488,6 +488,9 @@ void blk_mq_free_request(struct request *rq)
struct blk_mq_ctx *ctx = rq->mq_ctx;
struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
 
+   if (blk_queue_debug(q))
+   trace_printk("complete: rq %d %d\n", rq->internal_tag, rq->tag);
+
if (rq->rq_flags & RQF_ELVPRIV) {
if (e && e->type->ops.mq.finish_request)
e->type->ops.mq.finish_request(rq);
@@ -1793,6 +1796,9 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx 
*hctx,
 
list_del_init(&rq->queuelist);
ret = blk_mq_request_issue_directly(rq);
+   if (blk_queue_debug(rq->q))
+   trace_printk("issue direct: rq %d %d, ret %d\n",
+   rq->internal_tag, rq->tag, (int)ret);
if (ret != BLK_STS_OK) {
if (ret == BLK_STS_RESOURCE ||
ret == BLK_STS_DEV_RESOURCE) {
@@ -1841,6 +1847,8 @@ static blk_qc_t blk_mq_make_request(struct request_queue 
*q, struct bio *bio)
return BLK_QC_T_NONE;
}
 
+if (blk_queue_debug(q))
+trace_printk("make rq %d %d\n", rq->internal_tag, rq->tag);
rq_qos_track(q, rq, bio);
 
cookie = request_to_qc_t(data.hctx, rq);
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 648eb6743ed5..d35f265cd5e0 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -257,6 +257,10 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx 
*hctx,
 
if (mmc_card_removed(mq->card)) {
req->rq_flags |= RQF_QUIET;
+
+   if (blk_queue_debug(q))
+   trace_printk("queue rq %d %d, %s\n",
+   req->internal_tag, req->tag, "io err");
return BLK_STS_IOERR;
}
 
@@ -266,6 +270,9 @@ static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx 
*hctx,
 
if (mq->recovery_needed) {
spin_unlock_irq(q->queue_lock);
+   if (blk_queue_debug(q))
+   trace_printk("queue rq %d %d, %s\n",
+   req->internal_tag, req->tag, 
"resource");
  

[PATCH v4 0/2] Ensure that a request queue is dissociated from the cgroup controller

2018-07-30 Thread Bart Van Assche
Hello Jens,

Several block drivers call alloc_disk() followed by put_disk() if something
fails before device_add_disk() is called without calling blk_cleanup_queue().
Make sure that also for this scenario a request queue is dissociated from the
cgroup controller. This patch avoids that loading the parport_pc, paride and
pf drivers trigger a kernel crash. Since this patch series fixes a regression,
please consider these patches for kernel v4.19.

Thanks,

Bart.

Changes between v3 and v4:
- Added "Cc: stable" tags.

Changes between v2 and v3:
- Avoid code duplication by introducing a new helper function.

Changes between v1 and v2:
- Fixed the build for CONFIG_BLK_CGROUP=n.

Bart Van Assche (2):
  block: Introduce blk_exit_queue()
  block: Ensure that a request queue is dissociated from the cgroup
controller

 block/blk-core.c  | 54 ++-
 block/blk-sysfs.c | 25 ++
 block/blk.h   |  1 +
 3 files changed, 56 insertions(+), 24 deletions(-)

-- 
2.18.0



[PATCH v4 1/2] block: Introduce blk_exit_queue()

2018-07-30 Thread Bart Van Assche
This patch does not change any functionality.

Signed-off-by: Bart Van Assche 
Cc: Christoph Hellwig 
Cc: Ming Lei 
Cc: Omar Sandoval 
Cc: Johannes Thumshirn 
Cc: Alexandru Moise <00moses.alexande...@gmail.com>
Cc: Joseph Qi 
Cc: 
---
 block/blk-core.c | 54 +++-
 block/blk.h  |  1 +
 2 files changed, 31 insertions(+), 24 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 7c65e297a4f1..5fbfa8484c95 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -719,6 +719,35 @@ void blk_set_queue_dying(struct request_queue *q)
 }
 EXPORT_SYMBOL_GPL(blk_set_queue_dying);
 
+/* Unconfigure the I/O scheduler and dissociate from the cgroup controller. */
+void blk_exit_queue(struct request_queue *q)
+{
+   /*
+* Since the I/O scheduler exit code may access cgroup information,
+* perform I/O scheduler exit before disassociating from the block
+* cgroup controller.
+*/
+   if (q->elevator) {
+   ioc_clear_queue(q);
+   elevator_exit(q, q->elevator);
+   q->elevator = NULL;
+   }
+
+   /*
+* Remove all references to @q from the block cgroup controller before
+* restoring @q->queue_lock to avoid that restoring this pointer causes
+* e.g. blkcg_print_blkgs() to crash.
+*/
+   blkcg_exit_queue(q);
+
+   /*
+* Since the cgroup code may dereference the @q->backing_dev_info
+* pointer, only decrease its reference count after having removed the
+* association with the block cgroup controller.
+*/
+   bdi_put(q->backing_dev_info);
+}
+
 /**
  * blk_cleanup_queue - shutdown a request queue
  * @q: request queue to shutdown
@@ -788,30 +817,7 @@ void blk_cleanup_queue(struct request_queue *q)
 */
WARN_ON_ONCE(q->kobj.state_in_sysfs);
 
-   /*
-* Since the I/O scheduler exit code may access cgroup information,
-* perform I/O scheduler exit before disassociating from the block
-* cgroup controller.
-*/
-   if (q->elevator) {
-   ioc_clear_queue(q);
-   elevator_exit(q, q->elevator);
-   q->elevator = NULL;
-   }
-
-   /*
-* Remove all references to @q from the block cgroup controller before
-* restoring @q->queue_lock to avoid that restoring this pointer causes
-* e.g. blkcg_print_blkgs() to crash.
-*/
-   blkcg_exit_queue(q);
-
-   /*
-* Since the cgroup code may dereference the @q->backing_dev_info
-* pointer, only decrease its reference count after having removed the
-* association with the block cgroup controller.
-*/
-   bdi_put(q->backing_dev_info);
+   blk_exit_queue(q);
 
if (q->mq_ops)
blk_mq_free_queue(q);
diff --git a/block/blk.h b/block/blk.h
index 6adae8f94279..aeb8026f4d7b 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -130,6 +130,7 @@ void blk_free_flush_queue(struct blk_flush_queue *q);
 int blk_init_rl(struct request_list *rl, struct request_queue *q,
gfp_t gfp_mask);
 void blk_exit_rl(struct request_queue *q, struct request_list *rl);
+void blk_exit_queue(struct request_queue *q);
 void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
struct bio *bio);
 void blk_queue_bypass_start(struct request_queue *q);
-- 
2.18.0



[PATCH v4 2/2] block: Ensure that a request queue is dissociated from the cgroup controller

2018-07-30 Thread Bart Van Assche
Several block drivers call alloc_disk() followed by put_disk() if
something fails before device_add_disk() is called without calling
blk_cleanup_queue(). Make sure that also for this scenario a request
queue is dissociated from the cgroup controller. This patch avoids
that loading the parport_pc, paride and pf drivers triggers the
following kernel crash:

BUG: KASAN: null-ptr-deref in pi_init+0x42e/0x580 [paride]
Read of size 4 at addr 0008 by task modprobe/744
Call Trace:
dump_stack+0x9a/0xeb
kasan_report+0x139/0x350
pi_init+0x42e/0x580 [paride]
pf_init+0x2bb/0x1000 [pf]
do_one_initcall+0x8e/0x405
do_init_module+0xd9/0x2f2
load_module+0x3ab4/0x4700
SYSC_finit_module+0x176/0x1a0
do_syscall_64+0xee/0x2b0
entry_SYSCALL_64_after_hwframe+0x42/0xb7

Reported-by: Alexandru Moise <00moses.alexande...@gmail.com>
Fixes: a063057d7c73 ("block: Fix a race between request queue removal and the 
block cgroup controller") # v4.17
Signed-off-by: Bart Van Assche 
Tested-by: Alexandru Moise <00moses.alexande...@gmail.com>
Cc: Tejun Heo 
Cc: Christoph Hellwig 
Cc: Ming Lei 
Cc: Johannes Thumshirn 
Cc: Alexandru Moise <00moses.alexande...@gmail.com>
Cc: Joseph Qi 
Cc: 
---
 block/blk-sysfs.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index ca1984ecbdeb..26275d9babcb 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -802,6 +802,31 @@ static void __blk_release_queue(struct work_struct *work)
blk_stat_remove_callback(q, q->poll_cb);
blk_stat_free_callback(q->poll_cb);
 
+   if (!blk_queue_dead(q)) {
+   /*
+* Last reference was dropped without having called
+* blk_cleanup_queue().
+*/
+   WARN_ONCE(blk_queue_init_done(q),
+ "request queue %p has been registered but 
blk_cleanup_queue() has not been called for that queue\n",
+ q);
+   blk_exit_queue(q);
+   }
+
+#ifdef CONFIG_BLK_CGROUP
+   {
+   struct blkcg_gq *blkg;
+
+   rcu_read_lock();
+   blkg = blkg_lookup(&blkcg_root, q);
+   rcu_read_unlock();
+
+   WARN(blkg,
+"request queue %p is being released but it has not yet 
been removed from the blkcg controller\n",
+q);
+   }
+#endif
+
blk_free_queue_stats(q->stats);
 
blk_exit_rl(q, &q->root_rl);
-- 
2.18.0



Re: [PATCH] block: really disable runtime-pm for blk-mq

2018-07-30 Thread Bart Van Assche
On Mon, 2018-07-30 at 20:02 +0800, Ming Lei wrote:
> Runtime PM isn't ready for blk-mq yet, and commit 765e40b675a9 ("block:
> disable runtime-pm for blk-mq") tried to disable it. Unfortunately,
> it can't take effect in that way since user space still can switch
> it on via 'echo auto > /sys/block/sdN/device/power/control'.
> 
> This patch disables runtime-pm for blk-mq really by pm_runtime_disable()
> and fixes all kinds of PM related kernel crash.

Reviewed-by: Bart Van Assche 




Re: [PATCH] block: really disable runtime-pm for blk-mq

2018-07-30 Thread Patrick Steinhardt
On Mon, Jul 30, 2018 at 08:02:19PM +0800, Ming Lei wrote:
> Runtime PM isn't ready for blk-mq yet, and commit 765e40b675a9 ("block:
> disable runtime-pm for blk-mq") tried to disable it. Unfortunately,
> it can't take effect in that way since user space still can switch
> it on via 'echo auto > /sys/block/sdN/device/power/control'.
> 
> This patch disables runtime-pm for blk-mq really by pm_runtime_disable()
> and fixes all kinds of PM related kernel crash.

I can confirm that this patch fixes the kernel panics I've seen.
For what it's worth, the USB hotplugging issues didn't
re-surface, either. Thanks!

> Cc: Christoph Hellwig 
> Cc: Patrick Steinhardt 
> Cc: Bart Van Assche 
> Cc: Tomas Janousek 
> Cc: Przemek Socha 
> Cc: Alan Stern 
> Cc: 
> Signed-off-by: Ming Lei 
> ---
>  block/blk-core.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 03a4ea93a5f3..090b782df129 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -3769,9 +3769,11 @@ EXPORT_SYMBOL(blk_finish_plug);
>   */
>  void blk_pm_runtime_init(struct request_queue *q, struct device *dev)
>  {
> - /* not support for RQF_PM and ->rpm_status in blk-mq yet */
> - if (q->mq_ops)
> + /* Don't enable runtime PM for blk-mq until it is ready */
> + if (q->mq_ops) {
> + pm_runtime_disable(dev);
>   return;
> + }
>  
>   q->dev = dev;
>   q->rpm_status = RPM_ACTIVE;
> -- 
> 2.9.5
> 


signature.asc
Description: PGP signature


Re: [PATCH v5 1/3] block: move ref_tag calculation func to the block layer

2018-07-30 Thread Jens Axboe
On 7/29/18 3:15 PM, Max Gurtovoy wrote:
> Currently this function is implemented in the scsi layer, but it's
> actual place should be the block layer since T10-PI is a general
> data integrity feature that is used in the nvme protocol as well.

Applied 1-3, thanks.

-- 
Jens Axboe



Re: [PATCH][v3] block: don't account for split bio's size in cgroup stats

2018-07-30 Thread Jens Axboe
On 7/30/18 8:10 AM, Josef Bacik wrote:
> We need to check in blkcg_bio_issue_check if the bio is flagged as
> QUEUE_ENTERED, because if it is then we've already accounted for the
> size of the IO in the cgroup stats.  We can still however account for
> the extra IO since it'll be another request.

Applied, thanks.

-- 
Jens Axboe



[PATCH][v3] block: don't account for split bio's size in cgroup stats

2018-07-30 Thread Josef Bacik
We need to check in blkcg_bio_issue_check if the bio is flagged as
QUEUE_ENTERED, because if it is then we've already accounted for the
size of the IO in the cgroup stats.  We can still however account for
the extra IO since it'll be another request.

Reported-by: Tejun Heo 
Signed-off-by: Josef Bacik 
---
v2->v3:
- this one actually compiles

 include/linux/blk-cgroup.h | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 3bed5e02a873..f7b910768306 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -769,8 +769,14 @@ static inline bool blkcg_bio_issue_check(struct 
request_queue *q,
 
if (!throtl) {
blkg = blkg ?: q->root_blkg;
-   blkg_rwstat_add(&blkg->stat_bytes, bio->bi_opf,
-   bio->bi_iter.bi_size);
+   /*
+* If the bio is flagged with BIO_QUEUE_ENTERED it means this
+* is a split bio and we would have already accounted for the
+* size of the bio.
+*/
+   if (!bio_flagged(bio, BIO_QUEUE_ENTERED))
+   blkg_rwstat_add(&blkg->stat_bytes, bio->bi_opf,
+   bio->bi_iter.bi_size);
blkg_rwstat_add(&blkg->stat_ios, bio->bi_opf, 1);
}
 
-- 
2.14.3



[PATCH][v2] block: don't account for split bio's size in cgroup stats

2018-07-30 Thread Josef Bacik
We need to check in blkcg_bio_issue_check if the bio is flagged as
QUEUE_ENTERED, because if it is then we've already accounted for the
size of the IO in the cgroup stats.  We can still however account for
the extra IO since it'll be another request.

Reported-by: Tejun Heo 
Signed-off-by: Josef Bacik 
---
v1->v2:
- added a comment per Jens's suggestion.
- changed it so we only skip the size stat change, we are technically adding
  another request so keep that part of the stats.

 include/linux/blk-cgroup.h | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 3bed5e02a873..4586b183 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -769,8 +769,14 @@ static inline bool blkcg_bio_issue_check(struct 
request_queue *q,
 
if (!throtl) {
blkg = blkg ?: q->root_blkg;
-   blkg_rwstat_add(&blkg->stat_bytes, bio->bi_opf,
-   bio->bi_iter.bi_size);
+   /*
+* If the bio is flagged with BIO_QUEUE_ENTERED it means this
+* is a split bio and we would have already accounted for the
+* size of the bio.
+*/
+   if (!bio_flagged(bio, BIO_QUEUE_ENTERED)) {
+   blkg_rwstat_add(&blkg->stat_bytes, bio->bi_opf,
+   bio->bi_iter.bi_size);
blkg_rwstat_add(&blkg->stat_ios, bio->bi_opf, 1);
}
 
-- 
2.14.3



Re: [PATCH v5 3/3] block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs

2018-07-30 Thread Ming Lei
On Wed, Jul 25, 2018 at 11:15:09PM +0200, Martin Wilck wrote:
> bio_iov_iter_get_pages() currently only adds pages for the
> next non-zero segment from the iov_iter to the bio. That's
> suboptimal for callers, which typically try to pin as many
> pages as fit into the bio. This patch converts the current
> bio_iov_iter_get_pages() into a static helper, and introduces
> a new helper that allocates as many pages as
> 
>  1) fit into the bio,
>  2) are present in the iov_iter,
>  3) and can be pinned by MM.
> 
> Error is returned only if zero pages could be pinned. Because of
> 3), a zero return value doesn't necessarily mean all pages have been
> pinned. Callers that have to pin every page in the iov_iter must still
> call this function in a loop (this is currently the case).
> 
> This change matters most for __blkdev_direct_IO_simple(), which calls
> bio_iov_iter_get_pages() only once. If it obtains less pages than requested,
> it returns a "short write" or "short read", and __generic_file_write_iter()
> falls back to buffered writes, which may lead to data corruption.
> 
> Fixes: 72ecad22d9f1 ("block: support a full bio worth of IO for
>  simplified bdev direct-io")
> Signed-off-by: Martin Wilck 
> ---
>  block/bio.c | 35 ---
>  1 file changed, 32 insertions(+), 3 deletions(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 489a430..925033d 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -903,14 +903,16 @@ int bio_add_page(struct bio *bio, struct page *page,
>  EXPORT_SYMBOL(bio_add_page);
>  
>  /**
> - * bio_iov_iter_get_pages - pin user or kernel pages and add them to a bio
> + * __bio_iov_iter_get_pages - pin user or kernel pages and add them to a bio
>   * @bio: bio to add pages to
>   * @iter: iov iterator describing the region to be mapped
>   *
> - * Pins as many pages from *iter and appends them to @bio's bvec array. The
> + * Pins pages from *iter and appends them to @bio's bvec array. The
>   * pages will have to be released using put_page() when done.
> + * For multi-segment *iter, this function only adds pages from the
> + * the next non-empty segment of the iov iterator.
>   */
> -int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
> +static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  {
>   unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt, idx;
>   struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
> @@ -947,6 +949,33 @@ int bio_iov_iter_get_pages(struct bio *bio, struct 
> iov_iter *iter)
>   iov_iter_advance(iter, size);
>   return 0;
>  }
> +
> +/**
> + * bio_iov_iter_get_pages - pin user or kernel pages and add them to a bio
> + * @bio: bio to add pages to
> + * @iter: iov iterator describing the region to be mapped
> + *
> + * Pins pages from *iter and appends them to @bio's bvec array. The
> + * pages will have to be released using put_page() when done.
> + * The function tries, but does not guarantee, to pin as many pages as
> + * fit into the bio, or are requested in *iter, whatever is smaller.
> + * If MM encounters an error pinning the requested pages, it stops.
> + * Error is returned only if 0 pages could be pinned.
> + */
> +int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
> +{
> + unsigned short orig_vcnt = bio->bi_vcnt;
> +
> + do {
> + int ret = __bio_iov_iter_get_pages(bio, iter);
> +
> + if (unlikely(ret))
> + return bio->bi_vcnt > orig_vcnt ? 0 : ret;
> +
> + } while (iov_iter_count(iter) && !bio_full(bio));

When 'ret' isn't zero, and some partial progress has been made, seems less pages
might be obtained than requested too. Is that something we need to worry about?

Thanks,
Ming


[PATCH] block: really disable runtime-pm for blk-mq

2018-07-30 Thread Ming Lei
Runtime PM isn't ready for blk-mq yet, and commit 765e40b675a9 ("block:
disable runtime-pm for blk-mq") tried to disable it. Unfortunately,
it can't take effect in that way since user space still can switch
it on via 'echo auto > /sys/block/sdN/device/power/control'.

This patch disables runtime-pm for blk-mq really by pm_runtime_disable()
and fixes all kinds of PM related kernel crash.

Cc: Christoph Hellwig 
Cc: Patrick Steinhardt 
Cc: Bart Van Assche 
Cc: Tomas Janousek 
Cc: Przemek Socha 
Cc: Alan Stern 
Cc: 
Signed-off-by: Ming Lei 
---
 block/blk-core.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 03a4ea93a5f3..090b782df129 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -3769,9 +3769,11 @@ EXPORT_SYMBOL(blk_finish_plug);
  */
 void blk_pm_runtime_init(struct request_queue *q, struct device *dev)
 {
-   /* not support for RQF_PM and ->rpm_status in blk-mq yet */
-   if (q->mq_ops)
+   /* Don't enable runtime PM for blk-mq until it is ready */
+   if (q->mq_ops) {
+   pm_runtime_disable(dev);
return;
+   }
 
q->dev = dev;
q->rpm_status = RPM_ACTIVE;
-- 
2.9.5



Re: [PATCH 6/6] virtio-blk: modernize sysfs attribute creation

2018-07-30 Thread Christoph Hellwig
On Mon, Jul 30, 2018 at 09:12:27AM +0200, Hannes Reinecke wrote:
> Use new-style DEVICE_ATTR_RO/DEVICE_ATTR_RW to create the sysfs attributes
> and register the disk with default sysfs attribute groups.
> 
> Signed-off-by: Hannes Reinecke 
> Cc: Michael Tsirkin 
> Cc: Jason Wang 

Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH 5/6] zram: register default groups with device_add_disk()

2018-07-30 Thread Christoph Hellwig
Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH 1/6] genhd: drop 'bool' argument from __device_add_disk()

2018-07-30 Thread Hannes Reinecke
On 07/30/2018 10:56 AM, Christoph Hellwig wrote:
> I really don't see the point for this change.
> 
Okay, I'll be dropping it on the next iteration.

Cheers,

Hannes
-- 
Dr. Hannes ReineckezSeries & Storage
h...@suse.com  +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)


Re: [PATCH 4/6] aoe: use device_add_disk_with_groups()

2018-07-30 Thread Christoph Hellwig
On Mon, Jul 30, 2018 at 09:12:25AM +0200, Hannes Reinecke wrote:
> Use device_add_disk_with_groups() to avoid a race condition with
> udev during startup.
> 
> Signed-off-by: Hannes Reinecke 
> Cc: Ed L. Cachin 

Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH 3/6] nvme: register ns_id attributes as default sysfs groups

2018-07-30 Thread Christoph Hellwig
On Mon, Jul 30, 2018 at 09:12:24AM +0200, Hannes Reinecke wrote:
> We should be registering the ns_id attribute as default sysfs
> attribute groups, otherwise we have a race condition between
> the uevent and the attributes appearing in sysfs.
> 
> Signed-off-by: Hannes Reinecke 
> Cc: Sagi Grimberg 
> Cc: Keith Busch 

Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH 2/6] block: genhd: add 'groups' argument to device_add_disk

2018-07-30 Thread Christoph Hellwig
On Mon, Jul 30, 2018 at 09:12:23AM +0200, Hannes Reinecke wrote:
> Update device_add_disk() to take an 'groups' argument so that
> individual drivers can register a device with additional sysfs
> attributes.
> This avoids race condition the driver would otherwise have if these
> groups were to be created with sysfs_add_groups().
> 
> Signed-off-by: Martin Wilck 
> Signed-off-by: Hannes Reinecke 

The first signoff should always be the author.

Otherwise this looks fine (modulo any context changes due to patch 1):

Reviewed-by: Christoph Hellwig 


Re: [PATCH 1/6] genhd: drop 'bool' argument from __device_add_disk()

2018-07-30 Thread Christoph Hellwig
I really don't see the point for this change.


[PATCH 3/6] nvme: register ns_id attributes as default sysfs groups

2018-07-30 Thread Hannes Reinecke
We should be registering the ns_id attribute as default sysfs
attribute groups, otherwise we have a race condition between
the uevent and the attributes appearing in sysfs.

Signed-off-by: Hannes Reinecke 
Cc: Sagi Grimberg 
Cc: Keith Busch 
---
 drivers/nvme/host/core.c  | 13 ++---
 drivers/nvme/host/multipath.c | 15 ---
 drivers/nvme/host/nvme.h  |  2 +-
 3 files changed, 11 insertions(+), 19 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f1b61ebceaf1..39bdfe806d1b 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2696,6 +2696,11 @@ const struct attribute_group nvme_ns_id_attr_group = {
.is_visible = nvme_ns_id_attrs_are_visible,
 };
 
+const struct attribute_group *nvme_ns_id_attr_groups[] = {
+   &nvme_ns_id_attr_group,
+   NULL,
+};
+
 #define nvme_show_str_function(field)  
\
 static ssize_t  field##_show(struct device *dev,   
\
struct device_attribute *attr, char *buf)   
\
@@ -3061,11 +3066,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, 
unsigned nsid)
 
nvme_get_ctrl(ctrl);
 
-   device_add_disk(ctrl->device, ns->disk, NULL);
-   if (sysfs_create_group(&disk_to_dev(ns->disk)->kobj,
-   &nvme_ns_id_attr_group))
-   pr_warn("%s: failed to create sysfs group for identification\n",
-   ns->disk->disk_name);
+   device_add_disk(ctrl->device, ns->disk, nvme_ns_id_attr_groups);
if (ns->ndev && nvme_nvm_register_sysfs(ns))
pr_warn("%s: failed to register lightnvm sysfs group for 
identification\n",
ns->disk->disk_name);
@@ -3094,8 +3095,6 @@ static void nvme_ns_remove(struct nvme_ns *ns)
 
nvme_fault_inject_fini(ns);
if (ns->disk && ns->disk->flags & GENHD_FL_UP) {
-   sysfs_remove_group(&disk_to_dev(ns->disk)->kobj,
-   &nvme_ns_id_attr_group);
if (ns->ndev)
nvme_nvm_unregister_sysfs(ns);
del_gendisk(ns->disk);
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index e4d786467129..7d3c30324da9 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -281,13 +281,9 @@ static void nvme_mpath_set_live(struct nvme_ns *ns)
if (!head->disk)
return;
 
-   if (!(head->disk->flags & GENHD_FL_UP)) {
-   device_add_disk(&head->subsys->dev, head->disk, NULL);
-   if (sysfs_create_group(&disk_to_dev(head->disk)->kobj,
-   &nvme_ns_id_attr_group))
-   dev_warn(&head->subsys->dev,
-"failed to create id group.\n");
-   }
+   if (!(head->disk->flags & GENHD_FL_UP))
+   device_add_disk(&head->subsys->dev, head->disk,
+   nvme_ns_id_attr_groups);
 
kblockd_schedule_work(&ns->head->requeue_work);
 }
@@ -493,11 +489,8 @@ void nvme_mpath_remove_disk(struct nvme_ns_head *head)
 {
if (!head->disk)
return;
-   if (head->disk->flags & GENHD_FL_UP) {
-   sysfs_remove_group(&disk_to_dev(head->disk)->kobj,
-  &nvme_ns_id_attr_group);
+   if (head->disk->flags & GENHD_FL_UP)
del_gendisk(head->disk);
-   }
blk_set_queue_dying(head->disk->queue);
/* make sure all pending bios are cleaned up */
kblockd_schedule_work(&head->requeue_work);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 8b356f1d941c..7f156367bd18 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -466,7 +466,7 @@ int nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl);
 int nvme_get_log(struct nvme_ctrl *ctrl, u32 nsid, u8 log_page, u8 lsp,
void *log, size_t size, u64 offset);
 
-extern const struct attribute_group nvme_ns_id_attr_group;
+extern const struct attribute_group *nvme_ns_id_attr_groups[];
 extern const struct block_device_operations nvme_ns_head_ops;
 
 #ifdef CONFIG_NVME_MULTIPATH
-- 
2.12.3



[PATCH 2/6] block: genhd: add 'groups' argument to device_add_disk

2018-07-30 Thread Hannes Reinecke
Update device_add_disk() to take an 'groups' argument so that
individual drivers can register a device with additional sysfs
attributes.
This avoids race condition the driver would otherwise have if these
groups were to be created with sysfs_add_groups().

Signed-off-by: Martin Wilck 
Signed-off-by: Hannes Reinecke 
---
 arch/um/drivers/ubd_kern.c  |  2 +-
 block/genhd.c   | 21 +++--
 drivers/block/floppy.c  |  2 +-
 drivers/block/mtip32xx/mtip32xx.c   |  2 +-
 drivers/block/ps3disk.c |  2 +-
 drivers/block/ps3vram.c |  2 +-
 drivers/block/rsxx/dev.c|  2 +-
 drivers/block/skd_main.c|  2 +-
 drivers/block/sunvdc.c  |  2 +-
 drivers/block/virtio_blk.c  |  2 +-
 drivers/block/xen-blkfront.c|  2 +-
 drivers/ide/ide-cd.c|  2 +-
 drivers/ide/ide-gd.c|  2 +-
 drivers/memstick/core/ms_block.c|  2 +-
 drivers/memstick/core/mspro_block.c |  2 +-
 drivers/mmc/core/block.c|  2 +-
 drivers/mtd/mtd_blkdevs.c   |  2 +-
 drivers/nvdimm/blk.c|  2 +-
 drivers/nvdimm/btt.c|  2 +-
 drivers/nvdimm/pmem.c   |  2 +-
 drivers/nvme/host/core.c|  2 +-
 drivers/nvme/host/multipath.c   |  2 +-
 drivers/s390/block/dasd_genhd.c |  2 +-
 drivers/s390/block/dcssblk.c|  2 +-
 drivers/s390/block/scm_blk.c|  2 +-
 drivers/scsi/sd.c   |  2 +-
 drivers/scsi/sr.c   |  2 +-
 include/linux/genhd.h   |  5 +++--
 28 files changed, 44 insertions(+), 34 deletions(-)

diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index 83c470364dfb..6ee4c56032f7 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -891,7 +891,7 @@ static int ubd_disk_register(int major, u64 size, int unit,
 
disk->private_data = &ubd_devs[unit];
disk->queue = ubd_devs[unit].queue;
-   device_add_disk(parent, disk);
+   device_add_disk(parent, disk, NULL);
 
*disk_out = disk;
return 0;
diff --git a/block/genhd.c b/block/genhd.c
index 62aba3d8ae19..500c2e8d6345 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -567,7 +567,8 @@ static int exact_lock(dev_t devt, void *data)
return 0;
 }
 
-static void register_disk(struct device *parent, struct gendisk *disk)
+static void register_disk(struct device *parent, struct gendisk *disk,
+ const struct attribute_group **groups)
 {
struct device *ddev = disk_to_dev(disk);
struct block_device *bdev;
@@ -582,6 +583,10 @@ static void register_disk(struct device *parent, struct 
gendisk *disk)
/* delay uevents, until we scanned partition table */
dev_set_uevent_suppress(ddev, 1);
 
+   if (groups) {
+   WARN_ON(ddev->groups);
+   ddev->groups = groups;
+   }
if (device_add(ddev))
return;
if (!sysfs_deprecated) {
@@ -647,13 +652,15 @@ static void register_disk(struct device *parent, struct 
gendisk *disk)
  * __device_add_disk - add disk information to kernel list
  * @parent: parent device for the disk
  * @disk: per-device partitioning information
+ * @groups: Additional per-device sysfs groups
  *
  * This function registers the partitioning information in @disk
  * with the kernel.
  *
  * FIXME: error handling
  */
-static void __device_add_disk(struct device *parent, struct gendisk *disk)
+static void __device_add_disk(struct device *parent, struct gendisk *disk,
+ const struct attribute_group **groups)
 {
dev_t devt;
int retval;
@@ -696,12 +703,13 @@ static void __device_add_disk(struct device *parent, 
struct gendisk *disk)
blk_register_region(disk_devt(disk), disk->minors, NULL,
exact_match, exact_lock, disk);
}
-   register_disk(parent, disk);
+   register_disk(parent, disk, groups);
 }
 
-void device_add_disk(struct device *parent, struct gendisk *disk)
+void device_add_disk(struct device *parent, struct gendisk *disk,
+const struct attribute_group **groups)
 {
-   __device_add_disk(parent, disk);
+   __device_add_disk(parent, disk, groups);
 
blk_register_queue(disk);
 
@@ -718,7 +726,8 @@ EXPORT_SYMBOL(device_add_disk);
 
 void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk)
 {
-   __device_add_disk(parent, disk);
+   __device_add_disk(parent, disk, NULL);
+
/*
 * Take an extra ref on queue which will be put on disk_release()
 * so that it sticks around as long as @disk is there.
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 48f622728ce6..1bc99e9dfaee 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -4676,7 +4676,7 @@ static int __init do_floppy_init(void)
/* to be 

[PATCH 1/6] genhd: drop 'bool' argument from __device_add_disk()

2018-07-30 Thread Hannes Reinecke
Split off the last part of __device_add_disk() and move it into the
caller.

Signed-off-by: Hannes Reinecke 
---
 block/genhd.c | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 8cc719a37b32..62aba3d8ae19 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -647,15 +647,13 @@ static void register_disk(struct device *parent, struct 
gendisk *disk)
  * __device_add_disk - add disk information to kernel list
  * @parent: parent device for the disk
  * @disk: per-device partitioning information
- * @register_queue: register the queue if set to true
  *
  * This function registers the partitioning information in @disk
  * with the kernel.
  *
  * FIXME: error handling
  */
-static void __device_add_disk(struct device *parent, struct gendisk *disk,
- bool register_queue)
+static void __device_add_disk(struct device *parent, struct gendisk *disk)
 {
dev_t devt;
int retval;
@@ -699,8 +697,13 @@ static void __device_add_disk(struct device *parent, 
struct gendisk *disk,
exact_match, exact_lock, disk);
}
register_disk(parent, disk);
-   if (register_queue)
-   blk_register_queue(disk);
+}
+
+void device_add_disk(struct device *parent, struct gendisk *disk)
+{
+   __device_add_disk(parent, disk);
+
+   blk_register_queue(disk);
 
/*
 * Take an extra ref on queue which will be put on disk_release()
@@ -711,16 +714,19 @@ static void __device_add_disk(struct device *parent, 
struct gendisk *disk,
disk_add_events(disk);
blk_integrity_add(disk);
 }
-
-void device_add_disk(struct device *parent, struct gendisk *disk)
-{
-   __device_add_disk(parent, disk, true);
-}
 EXPORT_SYMBOL(device_add_disk);
 
 void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk)
 {
-   __device_add_disk(parent, disk, false);
+   __device_add_disk(parent, disk);
+   /*
+* Take an extra ref on queue which will be put on disk_release()
+* so that it sticks around as long as @disk is there.
+*/
+   WARN_ON_ONCE(!blk_get_queue(disk->queue));
+
+   disk_add_events(disk);
+   blk_integrity_add(disk);
 }
 EXPORT_SYMBOL(device_add_disk_no_queue_reg);
 
-- 
2.12.3



[PATCH 6/6] virtio-blk: modernize sysfs attribute creation

2018-07-30 Thread Hannes Reinecke
Use new-style DEVICE_ATTR_RO/DEVICE_ATTR_RW to create the sysfs attributes
and register the disk with default sysfs attribute groups.

Signed-off-by: Hannes Reinecke 
Cc: Michael Tsirkin 
Cc: Jason Wang 
---
 drivers/block/virtio_blk.c | 68 ++
 1 file changed, 39 insertions(+), 29 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index fe8056a1..086c6bb12baa 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -351,8 +351,8 @@ static int minor_to_index(int minor)
return minor >> PART_BITS;
 }
 
-static ssize_t virtblk_serial_show(struct device *dev,
-   struct device_attribute *attr, char *buf)
+static ssize_t serial_show(struct device *dev,
+  struct device_attribute *attr, char *buf)
 {
struct gendisk *disk = dev_to_disk(dev);
int err;
@@ -371,7 +371,7 @@ static ssize_t virtblk_serial_show(struct device *dev,
return err;
 }
 
-static DEVICE_ATTR(serial, 0444, virtblk_serial_show, NULL);
+static DEVICE_ATTR_RO(serial);
 
 /* The queue's logical block size must be set before calling this */
 static void virtblk_update_capacity(struct virtio_blk *vblk, bool resize)
@@ -545,8 +545,8 @@ static const char *const virtblk_cache_types[] = {
 };
 
 static ssize_t
-virtblk_cache_type_store(struct device *dev, struct device_attribute *attr,
-const char *buf, size_t count)
+cache_type_store(struct device *dev, struct device_attribute *attr,
+const char *buf, size_t count)
 {
struct gendisk *disk = dev_to_disk(dev);
struct virtio_blk *vblk = disk->private_data;
@@ -564,8 +564,7 @@ virtblk_cache_type_store(struct device *dev, struct 
device_attribute *attr,
 }
 
 static ssize_t
-virtblk_cache_type_show(struct device *dev, struct device_attribute *attr,
-char *buf)
+cache_type_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
struct gendisk *disk = dev_to_disk(dev);
struct virtio_blk *vblk = disk->private_data;
@@ -575,12 +574,38 @@ virtblk_cache_type_show(struct device *dev, struct 
device_attribute *attr,
return snprintf(buf, 40, "%s\n", virtblk_cache_types[writeback]);
 }
 
-static const struct device_attribute dev_attr_cache_type_ro =
-   __ATTR(cache_type, 0444,
-  virtblk_cache_type_show, NULL);
-static const struct device_attribute dev_attr_cache_type_rw =
-   __ATTR(cache_type, 0644,
-  virtblk_cache_type_show, virtblk_cache_type_store);
+static DEVICE_ATTR_RW(cache_type);
+
+static struct attribute *virtblk_attrs[] = {
+   &dev_attr_serial.attr,
+   &dev_attr_cache_type.attr,
+   NULL,
+};
+
+static umode_t virtblk_attrs_are_visible(struct kobject *kobj,
+   struct attribute *a, int n)
+{
+   struct device *dev = container_of(kobj, struct device, kobj);
+   struct gendisk *disk = dev_to_disk(dev);
+   struct virtio_blk *vblk = disk->private_data;
+   struct virtio_device *vdev = vblk->vdev;
+
+   if (a == &dev_attr_cache_type.attr &&
+   !virtio_has_feature(vdev, VIRTIO_BLK_F_CONFIG_WCE))
+   return S_IRUGO;
+
+   return a->mode;
+}
+
+static const struct attribute_group virtblk_attr_group = {
+   .attrs = virtblk_attrs,
+   .is_visible = virtblk_attrs_are_visible,
+};
+
+static const struct attribute_group *virtblk_attr_groups[] = {
+   &virtblk_attr_group,
+   NULL,
+};
 
 static int virtblk_init_request(struct blk_mq_tag_set *set, struct request *rq,
unsigned int hctx_idx, unsigned int numa_node)
@@ -780,24 +805,9 @@ static int virtblk_probe(struct virtio_device *vdev)
virtblk_update_capacity(vblk, false);
virtio_device_ready(vdev);
 
-   device_add_disk(&vdev->dev, vblk->disk, NULL);
-   err = device_create_file(disk_to_dev(vblk->disk), &dev_attr_serial);
-   if (err)
-   goto out_del_disk;
-
-   if (virtio_has_feature(vdev, VIRTIO_BLK_F_CONFIG_WCE))
-   err = device_create_file(disk_to_dev(vblk->disk),
-&dev_attr_cache_type_rw);
-   else
-   err = device_create_file(disk_to_dev(vblk->disk),
-&dev_attr_cache_type_ro);
-   if (err)
-   goto out_del_disk;
+   device_add_disk(&vdev->dev, vblk->disk, virtblk_attr_groups);
return 0;
 
-out_del_disk:
-   del_gendisk(vblk->disk);
-   blk_cleanup_queue(vblk->disk->queue);
 out_free_tags:
blk_mq_free_tag_set(&vblk->tag_set);
 out_put_disk:
-- 
2.12.3



[PATCH 4/6] aoe: use device_add_disk_with_groups()

2018-07-30 Thread Hannes Reinecke
Use device_add_disk_with_groups() to avoid a race condition with
udev during startup.

Signed-off-by: Hannes Reinecke 
Cc: Ed L. Cachin 
---
 drivers/block/aoe/aoe.h|  1 -
 drivers/block/aoe/aoeblk.c | 21 +++--
 drivers/block/aoe/aoedev.c |  1 -
 3 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index c0ebda1283cc..015c68017a1c 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -201,7 +201,6 @@ int aoeblk_init(void);
 void aoeblk_exit(void);
 void aoeblk_gdalloc(void *);
 void aoedisk_rm_debugfs(struct aoedev *d);
-void aoedisk_rm_sysfs(struct aoedev *d);
 
 int aoechr_init(void);
 void aoechr_exit(void);
diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index 429ebb84b592..ff770e7d9e52 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -177,10 +177,15 @@ static struct attribute *aoe_attrs[] = {
NULL,
 };
 
-static const struct attribute_group attr_group = {
+static const struct attribute_group aoe_attr_group = {
.attrs = aoe_attrs,
 };
 
+static const struct attribute_group *aoe_attr_groups[] = {
+   &aoe_attr_group,
+   NULL,
+};
+
 static const struct file_operations aoe_debugfs_fops = {
.open = aoe_debugfs_open,
.read = seq_read,
@@ -220,17 +225,6 @@ aoedisk_rm_debugfs(struct aoedev *d)
 }
 
 static int
-aoedisk_add_sysfs(struct aoedev *d)
-{
-   return sysfs_create_group(&disk_to_dev(d->gd)->kobj, &attr_group);
-}
-void
-aoedisk_rm_sysfs(struct aoedev *d)
-{
-   sysfs_remove_group(&disk_to_dev(d->gd)->kobj, &attr_group);
-}
-
-static int
 aoeblk_open(struct block_device *bdev, fmode_t mode)
 {
struct aoedev *d = bdev->bd_disk->private_data;
@@ -417,8 +411,7 @@ aoeblk_gdalloc(void *vp)
 
spin_unlock_irqrestore(&d->lock, flags);
 
-   add_disk(gd);
-   aoedisk_add_sysfs(d);
+   device_add_disk(NULL, gd, aoe_attr_groups);
aoedisk_add_debugfs(d);
 
spin_lock_irqsave(&d->lock, flags);
diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c
index 697f735b07a4..d92fa1fe3580 100644
--- a/drivers/block/aoe/aoedev.c
+++ b/drivers/block/aoe/aoedev.c
@@ -275,7 +275,6 @@ freedev(struct aoedev *d)
del_timer_sync(&d->timer);
if (d->gd) {
aoedisk_rm_debugfs(d);
-   aoedisk_rm_sysfs(d);
del_gendisk(d->gd);
put_disk(d->gd);
blk_cleanup_queue(d->blkq);
-- 
2.12.3



[PATCH 5/6] zram: register default groups with device_add_disk()

2018-07-30 Thread Hannes Reinecke
Register default sysfs groups during device_add-disk() to avoid a
race condition with udev during startup.

Signed-off-by: Hannes Reinecke 
Cc: Minchan Kim 
Cc: Nitin Gupta 
---
 drivers/block/zram/zram_drv.c | 28 +++-
 1 file changed, 7 insertions(+), 21 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 2907a8156aaf..fc3626cb7fe5 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1618,6 +1618,11 @@ static const struct attribute_group zram_disk_attr_group 
= {
.attrs = zram_disk_attrs,
 };
 
+static const struct attribute_group *zram_disk_attr_groups[] = {
+   &zram_disk_attr_group,
+   NULL,
+};
+
 /*
  * Allocate and initialize new zram device. the function returns
  * '>= 0' device_id upon success, and negative value otherwise.
@@ -1698,24 +1703,14 @@ static int zram_add(void)
 
zram->disk->queue->backing_dev_info->capabilities |=
(BDI_CAP_STABLE_WRITES | BDI_CAP_SYNCHRONOUS_IO);
-   add_disk(zram->disk);
-
-   ret = sysfs_create_group(&disk_to_dev(zram->disk)->kobj,
-   &zram_disk_attr_group);
-   if (ret < 0) {
-   pr_err("Error creating sysfs group for device %d\n",
-   device_id);
-   goto out_free_disk;
-   }
+   device_add_disk(NULL, zram->disk, zram_disk_attr_groups);
+
strlcpy(zram->compressor, default_compressor, sizeof(zram->compressor));
 
zram_debugfs_register(zram);
pr_info("Added device: %s\n", zram->disk->disk_name);
return device_id;
 
-out_free_disk:
-   del_gendisk(zram->disk);
-   put_disk(zram->disk);
 out_free_queue:
blk_cleanup_queue(queue);
 out_free_idr:
@@ -1744,15 +1739,6 @@ static int zram_remove(struct zram *zram)
mutex_unlock(&bdev->bd_mutex);
 
zram_debugfs_unregister(zram);
-   /*
-* Remove sysfs first, so no one will perform a disksize
-* store while we destroy the devices. This also helps during
-* hot_remove -- zram_reset_device() is the last holder of
-* ->init_lock, no later/concurrent disksize_store() or any
-* other sysfs handlers are possible.
-*/
-   sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
-   &zram_disk_attr_group);
 
/* Make sure all the pending I/O are finished */
fsync_bdev(bdev);
-- 
2.12.3



[PATCH 0/6] genhd: register default groups with device_add_disk()

2018-07-30 Thread Hannes Reinecke
device_add_disk() doesn't allow to register with default sysfs groups,
which introduces a race with udev as these groups have to be created after
the uevent has been sent.
This patchset updates device_add_disk() to accept a 'groups' argument to
avoid this race and updates the obvious drivers to use it.
There are some more drivers which might benefit from this (eg loop or md),
but the interface is not straightforward so I haven't attempted it.

As usual, comments and reviews are welcome.

Hannes Reinecke (6):
  genhd: drop 'bool' argument from __device_add_disk()
  block: genhd: add 'groups' argument to device_add_disk
  nvme: register ns_id attributes as default sysfs groups
  aoe: use device_add_disk_with_groups()
  zram: register default groups with device_add_disk()
  virtio-blk: modernize sysfs attribute creation

 arch/um/drivers/ubd_kern.c  |  2 +-
 block/genhd.c   | 39 ++---
 drivers/block/aoe/aoe.h |  1 -
 drivers/block/aoe/aoeblk.c  | 21 
 drivers/block/aoe/aoedev.c  |  1 -
 drivers/block/floppy.c  |  2 +-
 drivers/block/mtip32xx/mtip32xx.c   |  2 +-
 drivers/block/ps3disk.c |  2 +-
 drivers/block/ps3vram.c |  2 +-
 drivers/block/rsxx/dev.c|  2 +-
 drivers/block/skd_main.c|  2 +-
 drivers/block/sunvdc.c  |  2 +-
 drivers/block/virtio_blk.c  | 68 +
 drivers/block/xen-blkfront.c|  2 +-
 drivers/block/zram/zram_drv.c   | 28 ---
 drivers/ide/ide-cd.c|  2 +-
 drivers/ide/ide-gd.c|  2 +-
 drivers/memstick/core/ms_block.c|  2 +-
 drivers/memstick/core/mspro_block.c |  2 +-
 drivers/mmc/core/block.c|  2 +-
 drivers/mtd/mtd_blkdevs.c   |  2 +-
 drivers/nvdimm/blk.c|  2 +-
 drivers/nvdimm/btt.c|  2 +-
 drivers/nvdimm/pmem.c   |  2 +-
 drivers/nvme/host/core.c| 13 ---
 drivers/nvme/host/multipath.c   | 15 +++-
 drivers/nvme/host/nvme.h|  2 +-
 drivers/s390/block/dasd_genhd.c |  2 +-
 drivers/s390/block/dcssblk.c|  2 +-
 drivers/s390/block/scm_blk.c|  2 +-
 drivers/scsi/sd.c   |  2 +-
 drivers/scsi/sr.c   |  2 +-
 include/linux/genhd.h   |  5 +--
 33 files changed, 117 insertions(+), 122 deletions(-)

-- 
2.12.3