[dm-devel] [PATCH 1/2] block: simple improvements for bio->flags

2017-04-06 Thread NeilBrown
The comment for the 'flags' field of 'bio' mentions
"command" which is no longer stored there, and doesn't
mention the bvec pool number, which is.

BIO_RESET_BITS is set in such a way that it would need to be
updated if new bits were added, which is easy to miss.

BVEC_POOL_BITS is larger than needed.  The BVEC_POOL_IDX()
ranges from 0 to 6, so 3 bits are sufficient.

This patch make improvements in each of these areas.

Signed-off-by: NeilBrown 
---
 include/linux/blk_types.h |   22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index d703acb55d0f..ff07d891f38f 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -29,7 +29,7 @@ struct bio {
 * top bits REQ_OP. Use
 * accessors.
 */
-   unsigned short  bi_flags;   /* status, command, etc */
+   unsigned short  bi_flags;   /* status, etc and bvec pool 
number */
unsigned short  bi_ioprio;
 
struct bvec_iterbi_iter;
@@ -102,12 +102,7 @@ struct bio {
 #define BIO_REFFED 8   /* bio has elevated ->bi_cnt */
 #define BIO_THROTTLED  9   /* This bio has already been subjected to
 * throttling rules. Don't do it again. */
-
-/*
- * Flags starting here get preserved by bio_reset() - this includes
- * BVEC_POOL_IDX()
- */
-#define BIO_RESET_BITS 10
+/* See BVEC_POOL_OFFSET below before adding new flags */
 
 /*
  * We support 6 different bvec pools, the last one is magic in that it
@@ -117,13 +112,22 @@ struct bio {
 #define BVEC_POOL_MAX  (BVEC_POOL_NR - 1)
 
 /*
- * Top 4 bits of bio flags indicate the pool the bvecs came from.  We add
+ * Top 3 bits of bio flags indicate the pool the bvecs came from.  We add
  * 1 to the actual index so that 0 indicates that there are no bvecs to be
  * freed.
  */
-#define BVEC_POOL_BITS (4)
+#define BVEC_POOL_BITS (3)
 #define BVEC_POOL_OFFSET   (16 - BVEC_POOL_BITS)
 #define BVEC_POOL_IDX(bio) ((bio)->bi_flags >> BVEC_POOL_OFFSET)
+#if (1<< BVEC_POOL_BITS) < (BVEC_POOL_NR+1)
+# error "BVEC_POOL_BITS is too small"
+#endif
+
+/*
+ * Flags starting here get preserved by bio_reset() - this includes
+ * only BVEC_POOL_IDX()
+ */
+#define BIO_RESET_BITS BVEC_POOL_OFFSET
 
 /*
  * Operations and flags common to the bio and request structures.


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 0/2] Trace completion of all bios

2017-04-06 Thread NeilBrown
Hi Jens,
 I think all objections to this patch have been answered so I'm
 resending.
 I've added a small cleanup patch first which makes some small
 enhancements to the documentation and #defines for the bio->flags
 field.

Thanks,
NeilBrown

---

NeilBrown (2):
  block: simple improvements for bio->flags
  block: trace completion of all bios.


 block/bio.c   |   13 +
 block/blk-core.c  |   10 +-
 drivers/md/dm.c   |1 -
 drivers/md/raid5.c|2 --
 include/linux/blk_types.h |   24 +++-
 5 files changed, 37 insertions(+), 13 deletions(-)

--
Signature

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 07/25] virtio_blk: don't use req->errors

2017-04-06 Thread Johannes Thumshirn
On Thu, Apr 06, 2017 at 05:39:26PM +0200, Christoph Hellwig wrote:
> Remove passing req->errors (which at that point is always 0) to
> blk_mq_complete_requestq, and rely on the virtio status code for the
blk_mq_complete_request ^

> serial number passthrough request.
> 
> Signed-off-by: Christoph Hellwig 
> ---

Otherwise looks good,
Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 04/25] nvme: split nvme status from block req->errors

2017-04-06 Thread Johannes Thumshirn
On Thu, Apr 06, 2017 at 05:39:23PM +0200, Christoph Hellwig wrote:
> We want our own clearly defined error field for NVMe passthrough commands,
> and the request errors field is going away in its current form.
> 
> Just store the status and result field in the nvme_request field from
> hardirq completion context (using a new helper) and then generate a
> Linux errno for the block layer only when we actually need it.
> 
> Because we can't overload the status value with a negative error code
> for cancelled command we now have a flags filed in struct nvme_request
> that contains a bit for this condition.
> 
> Signed-off-by: Christoph Hellwig 
> ---

Looks good,
Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 02/25] block: remove the blk_execute_rq return value

2017-04-06 Thread Johannes Thumshirn
On Thu, Apr 06, 2017 at 05:39:21PM +0200, Christoph Hellwig wrote:
> The function only returns -EIO if rq->errors is non-zero, which is not
> very useful and lets a large number of callers ignore the return value.
> 
> Just let the callers figure out their error themselves.
> 
> Signed-off-by: Christoph Hellwig 
> ---

Looks good,
Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] multipathd: locks itself in udev trigger

2017-04-06 Thread Benjamin Marzinski
On Thu, Apr 06, 2017 at 02:24:07PM +0200, Alban Browaeys wrote:
> Bcache backing partition bcache0 triggers an udev add event that is handled 
> by multipathd.
>  Somewhat the other "bare" paritions sda do not.
> 
> The issue is when this event triggers the thread lock itself since commit
>  c6a18f4541d0a161e2f5fed8c67d9732bf512b37 "fix INIT_REQUESTED_UDEV code" .
> This change in "uev_update_path" moved "uev_add_path(uev, vecs);" under the 
> fast lock (non recursive)
> "lock(>lock);".  As uev_add_path too calls "lock(>lock);" 
> multipathd hangs in this second call
> in the same thread.

Oops. I'll post a fix for this shortly.

Thanks
-Ben
 
> Then "multipathd list paths" or other multipathd commands returns timeout.
> This also postpone systemd shutdown/reboot by a minute while it waits for 
> multipathd service to stop.
>  
> The backtrace was:
> (gdb) t a a bt
> 
> Thread 6 (Thread 0x7f922663c700 (LWP 545)):
> #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
> #1  0x7f9228bd3602 in ?? () from /usr/lib/x86_64-linux-gnu/liburcu.so.4
> #2  0x7f92289ba424 in start_thread (arg=0x7f922663c700) at 
> pthread_create.c:333
> #3  0x7f922825c9bf in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
> 
> Thread 5 (Thread 0x7f9229734700 (LWP 543)):
> #0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
> #1  0x7f92289bcb85 in __GI___pthread_mutex_lock (mutex=0x556fefc43080) at 
> ../nptl/pthread_mutex_lock.c:80
> #2  0x556fedcbe42d in lock (a=0x556fefc43080) at ../libmultipath/lock.h:12
> #3  uev_add_path (vecs=0x556fefc43080, uev=, uev= out>) at main.c:627
> #4  0x556fedcbe9c9 in uev_update_path (uev=0x7f9220001510, 
> vecs=0x556fefc43080) at main.c:998
> #5  0x556fedcbecdb in uev_trigger (uev=0x7f9220001510, 
> trigger_data=0x556fefc43080) at main.c:1146
> #6  0x7f92292091b2 in service_uevq (tmpq=tmpq@entry=0x7f9229733b10) at 
> uevent.c:89
> #7  0x7f9229209280 in uevent_dispatch (uev_trigger=, 
> trigger_data=) at uevent.c:145
> #8  0x556fedcbc2cc in uevqloop (ap=0x556fefc43080) at main.c:1177
> #9  0x7f92289ba424 in start_thread (arg=0x7f9229734700) at 
> pthread_create.c:333
> #10 0x7f922825c9bf in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
> 
> Thread 4 (Thread 0x7f9229745700 (LWP 542)):
> #0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
> #1  0x7f92289bcb85 in __GI___pthread_mutex_lock (mutex=0x556fefc43080) at 
> ../nptl/pthread_mutex_lock.c:80
> #2  0x556fedcbfb45 in lock (a=0x556fefc43080) at ../libmultipath/lock.h:12
> #3  checkerloop (ap=0x556fefc43080) at main.c:1827
> #4  0x7f92289ba424 in start_thread (arg=0x7f9229745700) at 
> pthread_create.c:333
> #5  0x7f922825c9bf in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
> 
> Thread 3 (Thread 0x7f9229810700 (LWP 541)):
> #0  0x7f9228253611 in __GI_ppoll (fds=0x7f92180021e0, nfds=nfds@entry=1, 
> timeout=, timeout@entry=0x556fedecc020 , 
> sigmask=sigmask@entry=0x7f922980fa60) at ../sysdeps/unix/sysv/linux/ppoll.c:39
> #1  0x556fedcc13ba in ppoll (__ss=0x7f922980fa60, 
> __timeout=0x556fedecc020 , __nfds=1, __fds=) at 
> /usr/include/x86_64-linux-gnu/bits/poll2.h:77
> #2  uxsock_listen (uxsock_trigger=0x556fedcbb520 , 
> trigger_data=0x556fefc43080) at uxlsnr.c:204
> #3  0x556fedcbbd5a in uxlsnrloop (ap=0x556fefc43080) at main.c:1239
> #4  0x7f92289ba424 in start_thread (arg=0x7f9229810700) at 
> pthread_create.c:333
> #5  0x7f922825c9bf in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
> 
> Thread 2 (Thread 0x7f9229851700 (LWP 540)):
> #0  0x7f922825354d in poll () at ../sysdeps/unix/syscall-template.S:84
> #1  0x7f9229209f3a in poll (__timeout=, __nfds=1, 
> __fds=0x7f9229850a88) at /usr/include/x86_64-linux-gnu/bits/poll2.h:46
> #2  uevent_listen (udev=0x556fefbec040) at uevent.c:515
> #3  0x556fedcbc235 in ueventloop (ap=0x556fefbec040) at main.c:1166
> #4  0x7f92289ba424 in start_thread (arg=0x7f9229851700) at 
> pthread_create.c:333
> #5  0x7f922825c9bf in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
> 
> Thread 1 (Thread 0x7f9229746f00 (LWP 537)):
> #0  pthread_cond_wait@@GLIBC_2.3.2 () at 
> ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
> #1  0x556fedcc0aba in child (param=) at main.c:2407
> #2  0x556fedcbb0df in main (argc=, argv=0x7fff81f9a0d8) at 
> main.c:2664
> 
> As a local workaround I moved "uev_add_path" in "uev_update_path" back out of 
> the lock umbrella
>  while I keep it under pp->initialized check.
> https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=859157;filename=fix_uev_update_path_udevadd_recursive_lock_deadlock.diff;msg=5
> This change fixes the reboot delay but I have no multipath setup thus cannot 
> detect any regressions.
> 
> -Alban

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 13/25] dm mpath: don't check for req->errors

2017-04-06 Thread Christoph Hellwig
We'll get all proper errors reported through ->end_io and ->errors will
go away soon.

Signed-off-by: Christoph Hellwig 
---
 drivers/md/dm-mpath.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 7f223dbed49f..4aa1b865e66e 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1491,7 +1491,7 @@ static int do_end_io(struct multipath *m, struct request 
*clone,
 */
int r = DM_ENDIO_REQUEUE;
 
-   if (!error && !clone->errors)
+   if (!error)
return 0;   /* I/O complete */
 
if (noretry_error(error))
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 23/25] pd: remove bogus check for req->errors

2017-04-06 Thread Christoph Hellwig
The driver never sets req->errors
---
 drivers/block/paride/pd.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/block/paride/pd.c b/drivers/block/paride/pd.c
index 82c6d02193ae..3b0ab214fe74 100644
--- a/drivers/block/paride/pd.c
+++ b/drivers/block/paride/pd.c
@@ -739,7 +739,6 @@ static int pd_special_command(struct pd_unit *disk,
  enum action (*func)(struct pd_unit *disk))
 {
struct request *rq;
-   int err = 0;
 
rq = blk_get_request(disk->gd->queue, REQ_OP_DRV_IN, __GFP_RECLAIM);
if (IS_ERR(rq))
@@ -748,10 +747,9 @@ static int pd_special_command(struct pd_unit *disk,
rq->special = func;
 
blk_execute_rq(disk->gd->queue, disk->gd, rq, 0);
-   err = req->errors ? -EIO : 0;
 
blk_put_request(rq);
-   return err;
+   return 0;
 }
 
 /* kernel glue structures */
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 19/25] block: add a error_count field to struct request

2017-04-06 Thread Christoph Hellwig
This is for the legacy floppy and ataflop drivers that currently abuse
->errors for this purpose.  It's stashed away in a union to not grow
the struct size, the other fields are either used by modern drivers
for different purposes or the I/O scheduler before queing the I/O
to drivers.

Signed-off-by: Christoph Hellwig 
---
 include/linux/blkdev.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a03acd92ae74..fa75401ea5cc 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -175,6 +175,7 @@ struct request {
struct rb_node rb_node; /* sort/lookup */
struct bio_vec special_vec;
void *completion_data;
+   int error_count; /* for legacy drivers, don't use */
};
 
/*
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 14/25] nbd: don't use req->errors

2017-04-06 Thread Christoph Hellwig
Add a nbd-specific field instead.

Signed-off-by: Christoph Hellwig 
---
 drivers/block/nbd.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 03ae72985c79..4f045fab9659 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -83,6 +83,7 @@ struct nbd_device {
 struct nbd_cmd {
struct nbd_device *nbd;
struct completion send_complete;
+   int status;
 };
 
 #if IS_ENABLED(CONFIG_DEBUG_FS)
@@ -152,16 +153,14 @@ static void nbd_size_set(struct nbd_device *nbd, struct 
block_device *bdev,
nbd_size_update(nbd, bdev);
 }
 
-static void nbd_end_request(struct nbd_cmd *cmd)
+static void nbd_complete_rq(struct request *req)
 {
-   struct nbd_device *nbd = cmd->nbd;
-   struct request *req = blk_mq_rq_from_pdu(cmd);
-   int error = req->errors ? -EIO : 0;
+   struct nbd_cmd *cmd = blk_mq_rq_to_pdu(req);
 
-   dev_dbg(nbd_to_dev(nbd), "request %p: %s\n", cmd,
-   error ? "failed" : "done");
+   dev_dbg(nbd_to_dev(cmd->nbd), "request %p: %s\n", cmd,
+   cmd->status ? "failed" : "done");
 
-   blk_mq_complete_request(req, error);
+   blk_mq_end_request(req, cmd->status);
 }
 
 /*
@@ -193,7 +192,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct 
request *req,
 
dev_err(nbd_to_dev(nbd), "Connection timed out, shutting down 
connection\n");
set_bit(NBD_TIMEDOUT, >runtime_flags);
-   req->errors = -EIO;
+   cmd->status = -EIO;
 
mutex_lock(>config_lock);
sock_shutdown(nbd);
@@ -433,7 +432,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device 
*nbd, int index)
if (ntohl(reply.error)) {
dev_err(disk_to_dev(nbd->disk), "Other side returned error 
(%d)\n",
ntohl(reply.error));
-   req->errors = -EIO;
+   cmd->status = -EIO;
return cmd;
}
 
@@ -449,7 +448,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device 
*nbd, int index)
if (result <= 0) {
dev_err(disk_to_dev(nbd->disk), "Receive data 
failed (result %d)\n",
result);
-   req->errors = -EIO;
+   cmd->status = -EIO;
return cmd;
}
dev_dbg(nbd_to_dev(nbd), "request %p: got %d bytes 
data\n",
@@ -499,7 +498,7 @@ static void recv_work(struct work_struct *work)
break;
}
 
-   nbd_end_request(cmd);
+   blk_mq_complete_request(blk_mq_rq_from_pdu(cmd), 0);
}
 
/*
@@ -519,8 +518,8 @@ static void nbd_clear_req(struct request *req, void *data, 
bool reserved)
if (!blk_mq_request_started(req))
return;
cmd = blk_mq_rq_to_pdu(req);
-   req->errors = -EIO;
-   nbd_end_request(cmd);
+   cmd->status = -EIO;
+   blk_mq_complete_request(req, 0);
 }
 
 static void nbd_clear_que(struct nbd_device *nbd)
@@ -551,7 +550,7 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
return -EINVAL;
}
 
-   req->errors = 0;
+   cmd->status = 0;
 
nsock = nbd->socks[index];
mutex_lock(>tx_lock);
@@ -1037,6 +1036,7 @@ static int nbd_init_request(void *data, struct request 
*rq,
 
 static const struct blk_mq_ops nbd_mq_ops = {
.queue_rq   = nbd_queue_rq,
+   .complete   = nbd_complete_rq,
.init_request   = nbd_init_request,
.timeout= nbd_xmit_timeout,
 };
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] kill req->errors

2017-04-06 Thread Christoph Hellwig
Currently the request structure has an errors field that is used in
various different ways.  The oldest drivers use it as an error count,
blk-mq and the generic timeout code assume that it holds a Linux
errno for block completions, and various drivers use it for internal
status values, often overwriting them with Linux errnos later,
that is unless they are processing passthrough requests in which
case they'll leave their errors in it.

This series kills the ->errors field and replaced it with new fields
in the drivers (or an odd hack in a union in struct request for
two bitrotting old drivers).

Note that this series expects that the patch to remove the mg_disk
driver has been applied already on top of the block for-next tree.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 16/25] xen-blkfront: don't use req->errors

2017-04-06 Thread Christoph Hellwig
xen-blkfron is the last users using rq->errros for passing back error to
blk-mq, and I'd like to get rid of that.  In the longer run the driver
should be moving more of the completion processing into .complete, but
this is the minimal change to move forward for now.

Signed-off-by: Christoph Hellwig 
---
 drivers/block/xen-blkfront.c | 36 +---
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index d137ef8a72be..08172f679a64 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -115,6 +115,15 @@ struct split_bio {
atomic_t pending;
 };
 
+struct blkif_req {
+   int error;
+};
+
+static inline struct blkif_req *blkif_req(struct request *rq)
+{
+   return blk_mq_rq_to_pdu(rq);
+}
+
 static DEFINE_MUTEX(blkfront_mutex);
 static const struct block_device_operations xlvbd_block_fops;
 
@@ -907,8 +916,14 @@ static int blkif_queue_rq(struct blk_mq_hw_ctx *hctx,
return BLK_MQ_RQ_QUEUE_BUSY;
 }
 
+static void blkif_complete_rq(struct request *rq)
+{
+   blk_mq_end_request(rq, blkif_req(rq)->error);
+}
+
 static const struct blk_mq_ops blkfront_mq_ops = {
.queue_rq = blkif_queue_rq,
+   .complete = blkif_complete_rq,
 };
 
 static void blkif_set_queue_limits(struct blkfront_info *info)
@@ -969,7 +984,7 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 
sector_size,
info->tag_set.queue_depth = BLK_RING_SIZE(info);
info->tag_set.numa_node = NUMA_NO_NODE;
info->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE;
-   info->tag_set.cmd_size = 0;
+   info->tag_set.cmd_size = sizeof(struct blkif_req);
info->tag_set.driver_data = info;
 
if (blk_mq_alloc_tag_set(>tag_set))
@@ -1543,7 +1558,6 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
unsigned long flags;
struct blkfront_ring_info *rinfo = (struct blkfront_ring_info *)dev_id;
struct blkfront_info *info = rinfo->dev_info;
-   int error;
 
if (unlikely(info->connected != BLKIF_STATE_CONNECTED))
return IRQ_HANDLED;
@@ -1587,37 +1601,36 @@ static irqreturn_t blkif_interrupt(int irq, void 
*dev_id)
continue;
}
 
-   error = (bret->status == BLKIF_RSP_OKAY) ? 0 : -EIO;
+   blkif_req(req)->error = (bret->status == BLKIF_RSP_OKAY) ? 0 : 
-EIO;
switch (bret->operation) {
case BLKIF_OP_DISCARD:
if (unlikely(bret->status == BLKIF_RSP_EOPNOTSUPP)) {
struct request_queue *rq = info->rq;
printk(KERN_WARNING "blkfront: %s: %s op 
failed\n",
   info->gd->disk_name, 
op_name(bret->operation));
-   error = -EOPNOTSUPP;
+   blkif_req(req)->error = -EOPNOTSUPP;
info->feature_discard = 0;
info->feature_secdiscard = 0;
queue_flag_clear(QUEUE_FLAG_DISCARD, rq);
queue_flag_clear(QUEUE_FLAG_SECERASE, rq);
}
-   blk_mq_complete_request(req, error);
break;
case BLKIF_OP_FLUSH_DISKCACHE:
case BLKIF_OP_WRITE_BARRIER:
if (unlikely(bret->status == BLKIF_RSP_EOPNOTSUPP)) {
printk(KERN_WARNING "blkfront: %s: %s op 
failed\n",
   info->gd->disk_name, 
op_name(bret->operation));
-   error = -EOPNOTSUPP;
+   blkif_req(req)->error = -EOPNOTSUPP;
}
if (unlikely(bret->status == BLKIF_RSP_ERROR &&
 rinfo->shadow[id].req.u.rw.nr_segments == 
0)) {
printk(KERN_WARNING "blkfront: %s: empty %s op 
failed\n",
   info->gd->disk_name, 
op_name(bret->operation));
-   error = -EOPNOTSUPP;
+   blkif_req(req)->error = -EOPNOTSUPP;
}
-   if (unlikely(error)) {
-   if (error == -EOPNOTSUPP)
-   error = 0;
+   if (unlikely(blkif_req(req)->error)) {
+   if (blkif_req(req)->error == -EOPNOTSUPP)
+   blkif_req(req)->error = 0;
info->feature_fua = 0;
info->feature_flush = 0;
xlvbd_flush(info);
@@ -1629,11 +1642,12 @@ static irqreturn_t blkif_interrupt(int irq, void 
*dev_id)
 

[dm-devel] [PATCH 03/25] nvme-fc: fix status code handling in nvme_fc_fcpio_done

2017-04-06 Thread Christoph Hellwig
nvme_complete_async_event expects the little endian status code
including the phase bit, and a new completion handler I plan to
introduce will do so as well.

Change the status variable into the little endian format with the
phase bit used in the NVMe CQE to fix / enable this.

Signed-off-by: Christoph Hellwig 
---
 drivers/nvme/host/fc.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index fc42172c796a..aad7f9c0be32 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1147,7 +1147,7 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
struct nvme_fc_ctrl *ctrl = op->ctrl;
struct nvme_fc_queue *queue = op->queue;
struct nvme_completion *cqe = >rsp_iu.cqe;
-   u16 status = NVME_SC_SUCCESS;
+   __le16 status = cpu_to_le16(NVME_SC_SUCCESS << 1);
 
/*
 * WARNING:
@@ -1182,9 +1182,9 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
sizeof(op->rsp_iu), DMA_FROM_DEVICE);
 
if (atomic_read(>state) == FCPOP_STATE_ABORTED)
-   status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
+   status = cpu_to_le16((NVME_SC_ABORT_REQ | NVME_SC_DNR) << 1);
else if (freq->status)
-   status = NVME_SC_FC_TRANSPORT_ERROR;
+   status = cpu_to_le16(NVME_SC_FC_TRANSPORT_ERROR << 1);
 
/*
 * For the linux implementation, if we have an unsuccesful
@@ -1212,7 +1212,7 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
 */
if (freq->transferred_length !=
be32_to_cpu(op->cmd_iu.data_len)) {
-   status = NVME_SC_FC_TRANSPORT_ERROR;
+   status = cpu_to_le16(NVME_SC_FC_TRANSPORT_ERROR << 1);
goto done;
}
op->nreq.result.u64 = 0;
@@ -1229,15 +1229,15 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
freq->transferred_length ||
 op->rsp_iu.status_code ||
 op->rqno != le16_to_cpu(cqe->command_id))) {
-   status = NVME_SC_FC_TRANSPORT_ERROR;
+   status = cpu_to_le16(NVME_SC_FC_TRANSPORT_ERROR << 1);
goto done;
}
op->nreq.result = cqe->result;
-   status = le16_to_cpu(cqe->status) >> 1;
+   status = cqe->status;
break;
 
default:
-   status = NVME_SC_FC_TRANSPORT_ERROR;
+   status = cpu_to_le16(NVME_SC_FC_TRANSPORT_ERROR << 1);
goto done;
}
 
@@ -1249,7 +1249,7 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
return;
}
 
-   blk_mq_complete_request(rq, status);
+   blk_mq_complete_request(rq, le16_to_cpu(status) >> 1);
 }
 
 static int
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 18/25] blk-mq: simplify __blk_mq_complete_request

2017-04-06 Thread Christoph Hellwig
Merge blk_mq_ipi_complete_request and blk_mq_stat_add into their only
caller.

Signed-off-by: Christoph Hellwig 
---
 block/blk-mq.c | 21 ++---
 1 file changed, 6 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 393f350ebb90..a6e14a3c87ce 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -405,12 +405,17 @@ static void __blk_mq_complete_request_remote(void *data)
rq->q->softirq_done_fn(rq);
 }
 
-static void blk_mq_ipi_complete_request(struct request *rq)
+static void __blk_mq_complete_request(struct request *rq)
 {
struct blk_mq_ctx *ctx = rq->mq_ctx;
bool shared = false;
int cpu;
 
+   if (rq->rq_flags & RQF_STATS) {
+   blk_mq_poll_stats_start(rq->q);
+   blk_stat_add(rq);
+   }
+
if (!test_bit(QUEUE_FLAG_SAME_COMP, >q->queue_flags)) {
rq->q->softirq_done_fn(rq);
return;
@@ -431,20 +436,6 @@ static void blk_mq_ipi_complete_request(struct request *rq)
put_cpu();
 }
 
-static void blk_mq_stat_add(struct request *rq)
-{
-   if (rq->rq_flags & RQF_STATS) {
-   blk_mq_poll_stats_start(rq->q);
-   blk_stat_add(rq);
-   }
-}
-
-static void __blk_mq_complete_request(struct request *rq)
-{
-   blk_mq_stat_add(rq);
-   blk_mq_ipi_complete_request(rq);
-}
-
 /**
  * blk_mq_complete_request - end I/O on a request
  * @rq:the request being processed
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 11/25] null_blk: don't pass always-0 req->errors to blk_mq_complete_request

2017-04-06 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/block/null_blk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
index f93906ff31e8..24ca85a70fd8 100644
--- a/drivers/block/null_blk.c
+++ b/drivers/block/null_blk.c
@@ -281,7 +281,7 @@ static inline void null_handle_cmd(struct nullb_cmd *cmd)
case NULL_IRQ_SOFTIRQ:
switch (queue_mode)  {
case NULL_Q_MQ:
-   blk_mq_complete_request(cmd->rq, cmd->rq->errors);
+   blk_mq_complete_request(cmd->rq, 0);
break;
case NULL_Q_RQ:
blk_complete_request(cmd->rq);
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 22/25] swim3: remove (commented out) printing of req->errors

2017-04-06 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/block/swim3.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/swim3.c b/drivers/block/swim3.c
index 61b3ffa4f458..ba4809c9bdba 100644
--- a/drivers/block/swim3.c
+++ b/drivers/block/swim3.c
@@ -343,8 +343,8 @@ static void start_request(struct floppy_state *fs)
  req->rq_disk->disk_name, req->cmd,
  (long)blk_rq_pos(req), blk_rq_sectors(req),
  bio_data(req->bio));
-   swim3_dbg("   errors=%d current_nr_sectors=%u\n",
- req->errors, blk_rq_cur_sectors(req));
+   swim3_dbg("   current_nr_sectors=%u\n",
+ blk_rq_cur_sectors(req));
 #endif
 
if (blk_rq_pos(req) >= fs->total_secs) {
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 15/25] mtip32xx: add a status field to struct mtip_cmd

2017-04-06 Thread Christoph Hellwig
Instead of using req->errors, which will go away.

Signed-off-by: Christoph Hellwig 
---
 drivers/block/mtip32xx/mtip32xx.c | 16 +---
 drivers/block/mtip32xx/mtip32xx.h |  1 +
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index 30076e7753bc..ec998e80cfaf 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -241,7 +241,8 @@ static void mtip_async_complete(struct mtip_port *port,
 
rq = mtip_rq_from_tag(dd, tag);
 
-   blk_mq_complete_request(rq, status);
+   cmd->status = status;
+   blk_mq_complete_request(rq, 0);
 }
 
 /*
@@ -2910,18 +2911,19 @@ static void mtip_softirq_done_fn(struct request *rq)
if (unlikely(cmd->unaligned))
up(>port->cmd_slot_unal);
 
-   blk_mq_end_request(rq, rq->errors);
+   blk_mq_end_request(rq, cmd->status);
 }
 
 static void mtip_abort_cmd(struct request *req, void *data,
bool reserved)
 {
+   struct mtip_cmd *cmd = blk_mq_rq_to_pdu(req);
struct driver_data *dd = data;
 
dbg_printk(MTIP_DRV_NAME " Aborting request, tag = %d\n", req->tag);
 
clear_bit(req->tag, dd->port->cmds_to_issue);
-   req->errors = -EIO;
+   cmd->status = -EIO;
mtip_softirq_done_fn(req);
 }
 
@@ -3816,7 +3818,6 @@ static int mtip_queue_rq(struct blk_mq_hw_ctx *hctx,
if (likely(!ret))
return BLK_MQ_RQ_QUEUE_OK;
 
-   rq->errors = ret;
return BLK_MQ_RQ_QUEUE_ERROR;
 }
 
@@ -4107,9 +4108,10 @@ static void mtip_no_dev_cleanup(struct request *rq, void 
*data, bool reserv)
struct driver_data *dd = (struct driver_data *)data;
struct mtip_cmd *cmd;
 
-   if (likely(!reserv))
-   blk_mq_complete_request(rq, -ENODEV);
-   else if (test_bit(MTIP_PF_IC_ACTIVE_BIT, >port->flags)) {
+   if (likely(!reserv)) {
+   cmd->status = -ENODEV;
+   blk_mq_complete_request(rq, 0);
+   } else if (test_bit(MTIP_PF_IC_ACTIVE_BIT, >port->flags)) {
 
cmd = mtip_cmd_from_tag(dd, MTIP_TAG_INTERNAL);
if (cmd->comp_func)
diff --git a/drivers/block/mtip32xx/mtip32xx.h 
b/drivers/block/mtip32xx/mtip32xx.h
index 7617888f7944..57b41528a824 100644
--- a/drivers/block/mtip32xx/mtip32xx.h
+++ b/drivers/block/mtip32xx/mtip32xx.h
@@ -352,6 +352,7 @@ struct mtip_cmd {
int retries; /* The number of retries left for this command. */
 
int direction; /* Data transfer direction */
+   int status;
 };
 
 /* Structure used to describe a port. */
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 17/25] blk-mq: remove the error argument to blk_mq_complete_request

2017-04-06 Thread Christoph Hellwig
Now that we always have a ->complete callback we can remove the direct
call to blk_mq_end_request, as well as the error argument to
blk_mq_complete_request.

Signed-off-by: Christoph Hellwig 
---
 block/blk-mq.c| 14 +++---
 drivers/block/loop.c  |  4 ++--
 drivers/block/mtip32xx/mtip32xx.c |  4 ++--
 drivers/block/nbd.c   |  4 ++--
 drivers/block/null_blk.c  |  2 +-
 drivers/block/virtio_blk.c|  2 +-
 drivers/block/xen-blkfront.c  |  2 +-
 drivers/md/dm-rq.c|  2 +-
 drivers/nvme/host/core.c  |  2 +-
 drivers/nvme/host/nvme.h  |  2 +-
 drivers/scsi/scsi_lib.c   |  2 +-
 include/linux/blk-mq.h|  2 +-
 12 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f7cd3208bcdf..393f350ebb90 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -441,14 +441,8 @@ static void blk_mq_stat_add(struct request *rq)
 
 static void __blk_mq_complete_request(struct request *rq)
 {
-   struct request_queue *q = rq->q;
-
blk_mq_stat_add(rq);
-
-   if (!q->softirq_done_fn)
-   blk_mq_end_request(rq, rq->errors);
-   else
-   blk_mq_ipi_complete_request(rq);
+   blk_mq_ipi_complete_request(rq);
 }
 
 /**
@@ -459,16 +453,14 @@ static void __blk_mq_complete_request(struct request *rq)
  * Ends all I/O on a request. It does not handle partial completions.
  * The actual completion happens out-of-order, through a IPI handler.
  **/
-void blk_mq_complete_request(struct request *rq, int error)
+void blk_mq_complete_request(struct request *rq)
 {
struct request_queue *q = rq->q;
 
if (unlikely(blk_should_fake_timeout(q)))
return;
-   if (!blk_mark_rq_complete(rq)) {
-   rq->errors = error;
+   if (!blk_mark_rq_complete(rq))
__blk_mq_complete_request(rq);
-   }
 }
 EXPORT_SYMBOL(blk_mq_complete_request);
 
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 6924ec611a49..7d49d919543a 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -465,7 +465,7 @@ static void lo_rw_aio_complete(struct kiocb *iocb, long 
ret, long ret2)
struct loop_cmd *cmd = container_of(iocb, struct loop_cmd, iocb);
 
cmd->ret = ret;
-   blk_mq_complete_request(cmd->rq, 0);
+   blk_mq_complete_request(cmd->rq);
 }
 
 static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
@@ -1683,7 +1683,7 @@ static void loop_handle_cmd(struct loop_cmd *cmd)
/* complete non-aio request */
if (!cmd->use_aio || ret) {
cmd->ret = ret ? -EIO : 0;
-   blk_mq_complete_request(cmd->rq, 0);
+   blk_mq_complete_request(cmd->rq);
}
 }
 
diff --git a/drivers/block/mtip32xx/mtip32xx.c 
b/drivers/block/mtip32xx/mtip32xx.c
index ec998e80cfaf..be6bbe5ce613 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -242,7 +242,7 @@ static void mtip_async_complete(struct mtip_port *port,
rq = mtip_rq_from_tag(dd, tag);
 
cmd->status = status;
-   blk_mq_complete_request(rq, 0);
+   blk_mq_complete_request(rq);
 }
 
 /*
@@ -4110,7 +4110,7 @@ static void mtip_no_dev_cleanup(struct request *rq, void 
*data, bool reserv)
 
if (likely(!reserv)) {
cmd->status = -ENODEV;
-   blk_mq_complete_request(rq, 0);
+   blk_mq_complete_request(rq);
} else if (test_bit(MTIP_PF_IC_ACTIVE_BIT, >port->flags)) {
 
cmd = mtip_cmd_from_tag(dd, MTIP_TAG_INTERNAL);
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 4f045fab9659..4f78c5a01d11 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -498,7 +498,7 @@ static void recv_work(struct work_struct *work)
break;
}
 
-   blk_mq_complete_request(blk_mq_rq_from_pdu(cmd), 0);
+   blk_mq_complete_request(blk_mq_rq_from_pdu(cmd));
}
 
/*
@@ -519,7 +519,7 @@ static void nbd_clear_req(struct request *req, void *data, 
bool reserved)
return;
cmd = blk_mq_rq_to_pdu(req);
cmd->status = -EIO;
-   blk_mq_complete_request(req, 0);
+   blk_mq_complete_request(req);
 }
 
 static void nbd_clear_que(struct nbd_device *nbd)
diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
index 24ca85a70fd8..c27cccec368b 100644
--- a/drivers/block/null_blk.c
+++ b/drivers/block/null_blk.c
@@ -281,7 +281,7 @@ static inline void null_handle_cmd(struct nullb_cmd *cmd)
case NULL_IRQ_SOFTIRQ:
switch (queue_mode)  {
case NULL_Q_MQ:
-   blk_mq_complete_request(cmd->rq, 0);
+   blk_mq_complete_request(cmd->rq);
break;
case NULL_Q_RQ:
blk_complete_request(cmd->rq);
diff 

[dm-devel] [PATCH 04/25] nvme: split nvme status from block req->errors

2017-04-06 Thread Christoph Hellwig
We want our own clearly defined error field for NVMe passthrough commands,
and the request errors field is going away in its current form.

Just store the status and result field in the nvme_request field from
hardirq completion context (using a new helper) and then generate a
Linux errno for the block layer only when we actually need it.

Because we can't overload the status value with a negative error code
for cancelled command we now have a flags filed in struct nvme_request
that contains a bit for this condition.

Signed-off-by: Christoph Hellwig 
---
 drivers/nvme/host/core.c | 50 +++-
 drivers/nvme/host/fc.c   | 10 -
 drivers/nvme/host/lightnvm.c |  9 +---
 drivers/nvme/host/nvme.h | 33 +
 drivers/nvme/host/pci.c  | 11 +-
 drivers/nvme/host/rdma.c |  5 ++---
 drivers/nvme/target/loop.c   |  7 ++-
 7 files changed, 65 insertions(+), 60 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index dc05f41c3992..b73a86a78379 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -66,11 +66,24 @@ static DEFINE_SPINLOCK(dev_list_lock);
 
 static struct class *nvme_class;
 
+int nvme_error_status(struct request *req)
+{
+   switch (nvme_req(req)->status & 0x7ff) {
+   case NVME_SC_SUCCESS:
+   return 0;
+   case NVME_SC_CAP_EXCEEDED:
+   return -ENOSPC;
+   default:
+   return -EIO;
+   }
+}
+EXPORT_SYMBOL_GPL(nvme_error_status);
+
 static inline bool nvme_req_needs_retry(struct request *req)
 {
if (blk_noretry_request(req))
return false;
-   if (req->errors & NVME_SC_DNR)
+   if (nvme_req(req)->status & NVME_SC_DNR)
return false;
if (jiffies - req->start_time >= req->timeout)
return false;
@@ -81,23 +94,13 @@ static inline bool nvme_req_needs_retry(struct request *req)
 
 void nvme_complete_rq(struct request *req)
 {
-   int error = 0;
-
-   if (unlikely(req->errors)) {
-   if (nvme_req_needs_retry(req)) {
-   nvme_req(req)->retries++;
-   blk_mq_requeue_request(req,
-   !blk_mq_queue_stopped(req->q));
-   return;
-   }
-
-   if (blk_rq_is_passthrough(req))
-   error = req->errors;
-   else
-   error = nvme_error_status(req->errors);
+   if (unlikely(nvme_req(req)->status && nvme_req_needs_retry(req))) {
+   nvme_req(req)->retries++;
+   blk_mq_requeue_request(req, !blk_mq_queue_stopped(req->q));
+   return;
}
 
-   blk_mq_end_request(req, error);
+   blk_mq_end_request(req, nvme_error_status(req));
 }
 EXPORT_SYMBOL_GPL(nvme_complete_rq);
 
@@ -114,7 +117,9 @@ void nvme_cancel_request(struct request *req, void *data, 
bool reserved)
status = NVME_SC_ABORT_REQ;
if (blk_queue_dying(req->q))
status |= NVME_SC_DNR;
-   blk_mq_complete_request(req, status);
+   nvme_req(req)->status = status;
+   blk_mq_complete_request(req, 0);
+
 }
 EXPORT_SYMBOL_GPL(nvme_cancel_request);
 
@@ -357,6 +362,7 @@ int nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
 
if (!(req->rq_flags & RQF_DONTPREP)) {
nvme_req(req)->retries = 0;
+   nvme_req(req)->flags = 0;
req->rq_flags |= RQF_DONTPREP;
}
 
@@ -411,7 +417,10 @@ int __nvme_submit_sync_cmd(struct request_queue *q, struct 
nvme_command *cmd,
blk_execute_rq(req->q, NULL, req, at_head);
if (result)
*result = nvme_req(req)->result;
-   ret = req->errors;
+   if (nvme_req(req)->flags & NVME_REQ_CANCELLED)
+   ret = -EINTR;
+   else
+   ret = nvme_req(req)->status;
  out:
blk_mq_free_request(req);
return ret;
@@ -496,7 +505,10 @@ int __nvme_submit_user_cmd(struct request_queue *q, struct 
nvme_command *cmd,
}
  submit:
blk_execute_rq(req->q, disk, req, 0);
-   ret = req->errors;
+   if (nvme_req(req)->flags & NVME_REQ_CANCELLED)
+   ret = -EINTR;
+   else
+   ret = nvme_req(req)->status;
if (result)
*result = le32_to_cpu(nvme_req(req)->result.u32);
if (meta && !ret && !write) {
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index aad7f9c0be32..450733c8cd24 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1148,6 +1148,7 @@ nvme_fc_fcpio_done(struct nvmefc_fcp_req *req)
struct nvme_fc_queue *queue = op->queue;
struct nvme_completion *cqe = >rsp_iu.cqe;
__le16 status = cpu_to_le16(NVME_SC_SUCCESS << 1);
+   union nvme_result result;
 
/*
 * WARNING:
@@ -1215,7 +1216,7 @@ nvme_fc_fcpio_done(struct 

[dm-devel] [PATCH 09/25] scsi: introduce a new result field in struct scsi_request

2017-04-06 Thread Christoph Hellwig
This passes on the scsi_cmnd result field to users of passthrough
requests.  Currently we abuse req->errors for this purpose, but that
field will go away in its current form.

Note that the old IDE code abuses the errors field in very creative
ways and stores all kinds of different values in it.  I didn't dare
to touch this magic, so the abuses are brought forward 1:1.

Signed-off-by: Christoph Hellwig 
---
 block/bsg-lib.c|  8 
 block/bsg.c| 12 +--
 block/scsi_ioctl.c | 14 ++---
 drivers/block/cciss.c  | 42 +++---
 drivers/block/pktcdvd.c|  2 +-
 drivers/block/virtio_blk.c |  2 +-
 drivers/cdrom/cdrom.c  |  2 +-
 drivers/ide/ide-atapi.c| 10 -
 drivers/ide/ide-cd.c   | 20 +-
 drivers/ide/ide-cd_ioctl.c |  2 +-
 drivers/ide/ide-devsets.c  |  4 ++--
 drivers/ide/ide-dma.c  |  2 +-
 drivers/ide/ide-eh.c   | 36 
 drivers/ide/ide-floppy.c   | 10 -
 drivers/ide/ide-io.c   | 10 -
 drivers/ide/ide-ioctls.c   |  4 ++--
 drivers/ide/ide-park.c |  2 +-
 drivers/ide/ide-pm.c   |  8 
 drivers/ide/ide-tape.c |  4 ++--
 drivers/ide/ide-taskfile.c |  6 +++---
 drivers/scsi/osd/osd_initiator.c   |  2 +-
 drivers/scsi/osst.c|  2 +-
 drivers/scsi/qla2xxx/qla_bsg.c |  6 +++---
 drivers/scsi/scsi_lib.c| 18 +++-
 drivers/scsi/scsi_transport_sas.c  |  2 +-
 drivers/scsi/sg.c  |  2 +-
 drivers/scsi/st.c  |  6 +++---
 drivers/target/target_core_pscsi.c |  2 +-
 fs/nfsd/blocklayout.c  |  4 ++--
 include/linux/ide.h|  2 +-
 include/scsi/scsi_request.h|  1 +
 31 files changed, 122 insertions(+), 125 deletions(-)

diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index cd15f9dbb147..0a23dbba2d30 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -37,7 +37,7 @@ static void bsg_destroy_job(struct kref *kref)
struct bsg_job *job = container_of(kref, struct bsg_job, kref);
struct request *rq = job->req;
 
-   blk_end_request_all(rq, rq->errors);
+   blk_end_request_all(rq, scsi_req(rq)->result);
 
put_device(job->dev);   /* release reference for the request */
 
@@ -74,7 +74,7 @@ void bsg_job_done(struct bsg_job *job, int result,
struct scsi_request *rq = scsi_req(req);
int err;
 
-   err = job->req->errors = result;
+   err = scsi_req(job->req)->result = result;
if (err < 0)
/* we're only returning the result field in the reply */
rq->sense_len = sizeof(u32);
@@ -177,7 +177,7 @@ static int bsg_create_job(struct device *dev, struct 
request *req)
  * @q: request queue to manage
  *
  * On error the create_bsg_job function should return a -Exyz error value
- * that will be set to the req->errors.
+ * that will be set to ->result.
  *
  * Drivers/subsys should pass this to the queue init function.
  */
@@ -201,7 +201,7 @@ static void bsg_request_fn(struct request_queue *q)
 
ret = bsg_create_job(dev, req);
if (ret) {
-   req->errors = ret;
+   scsi_req(req)->result = ret;
blk_end_request_all(req, ret);
spin_lock_irq(q->queue_lock);
continue;
diff --git a/block/bsg.c b/block/bsg.c
index 74835dbf0c47..e54daf9f2ee0 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -391,13 +391,13 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, 
struct sg_io_v4 *hdr,
struct scsi_request *req = scsi_req(rq);
int ret = 0;
 
-   dprintk("rq %p bio %p 0x%x\n", rq, bio, rq->errors);
+   dprintk("rq %p bio %p 0x%x\n", rq, bio, rq->result);
/*
 * fill in all the output members
 */
-   hdr->device_status = rq->errors & 0xff;
-   hdr->transport_status = host_byte(rq->errors);
-   hdr->driver_status = driver_byte(rq->errors);
+   hdr->device_status = req->result & 0xff;
+   hdr->transport_status = host_byte(req->result);
+   hdr->driver_status = driver_byte(req->result);
hdr->info = 0;
if (hdr->device_status || hdr->transport_status || hdr->driver_status)
hdr->info |= SG_INFO_CHECK;
@@ -431,8 +431,8 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, 
struct sg_io_v4 *hdr,
 * just a protocol response (i.e. non negative), that gets
 * processed above.
 */
-   if (!ret && rq->errors < 0)
-   ret = rq->errors;
+   if (!ret && req->result < 0)
+   ret = req->result;
 
blk_rq_unmap_user(bio);
scsi_req_free_cmd(req);
diff --git a/block/scsi_ioctl.c 

[dm-devel] [PATCH 07/25] virtio_blk: don't use req->errors

2017-04-06 Thread Christoph Hellwig
Remove passing req->errors (which at that point is always 0) to
blk_mq_complete_requestq, and rely on the virtio status code for the
serial number passthrough request.

Signed-off-by: Christoph Hellwig 
---
 drivers/block/virtio_blk.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index dbc4e80680b1..8378ad480f77 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -175,19 +175,15 @@ static int virtblk_add_req(struct virtqueue *vq, struct 
virtblk_req *vbr,
 static inline void virtblk_request_done(struct request *req)
 {
struct virtblk_req *vbr = blk_mq_rq_to_pdu(req);
-   int error = virtblk_result(vbr);
 
switch (req_op(req)) {
case REQ_OP_SCSI_IN:
case REQ_OP_SCSI_OUT:
virtblk_scsi_request_done(req);
break;
-   case REQ_OP_DRV_IN:
-   req->errors = (error != 0);
-   break;
}
 
-   blk_mq_end_request(req, error);
+   blk_mq_end_request(req, virtblk_result(vbr));
 }
 
 static void virtblk_done(struct virtqueue *vq)
@@ -205,7 +201,7 @@ static void virtblk_done(struct virtqueue *vq)
while ((vbr = virtqueue_get_buf(vblk->vqs[qid].vq, )) != 
NULL) {
struct request *req = blk_mq_rq_from_pdu(vbr);
 
-   blk_mq_complete_request(req, req->errors);
+   blk_mq_complete_request(req, 0);
req_done = true;
}
if (unlikely(virtqueue_is_broken(vq)))
@@ -311,7 +307,7 @@ static int virtblk_get_id(struct gendisk *disk, char 
*id_str)
goto out;
 
blk_execute_rq(vblk->disk->queue, vblk->disk, req, false);
-   err = req->errors ? -EIO : 0;
+   err = virtblk_result(blk_mq_rq_to_pdu(req));
 out:
blk_put_request(req);
return err;
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 08/25] scsi: fix fast-fail for non-passthrough requests

2017-04-06 Thread Christoph Hellwig
Currently error is always 0 for non-passthrough requests when reaching the
scsi_noretry_cmd check in scsi_io_completion, which effectively disables
all fastfail logic.  Fix this by having a single call to
__scsi_error_from_host_byte at the beginning of the function and always
having a valid error value.

Signed-off-by: Christoph Hellwig 
---
 drivers/scsi/scsi_lib.c | 28 +++-
 1 file changed, 7 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 11972d1075f1..89b4d9e69866 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -779,21 +779,17 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
sense_valid = scsi_command_normalize_sense(cmd, );
if (sense_valid)
sense_deferred = scsi_sense_is_deferred();
+
+   if (!sense_deferred)
+   error = __scsi_error_from_host_byte(cmd, result);
}
 
if (blk_rq_is_passthrough(req)) {
-   if (result) {
-   if (sense_valid) {
-   /*
-* SG_IO wants current and deferred errors
-*/
-   scsi_req(req)->sense_len =
-   min(8 + cmd->sense_buffer[7],
-   SCSI_SENSE_BUFFERSIZE);
-   }
-   if (!sense_deferred)
-   error = __scsi_error_from_host_byte(cmd, 
result);
+   if (result && sense_valid) {
+   scsi_req(req)->sense_len = min(8 + cmd->sense_buffer[7],
+  SCSI_SENSE_BUFFERSIZE);
}
+
/*
 * __scsi_error_from_host_byte may have reset the host_byte
 */
@@ -812,13 +808,6 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
BUG();
return;
}
-   } else if (blk_rq_bytes(req) == 0 && result && !sense_deferred) {
-   /*
-* Flush commands do not transfers any data, and thus cannot use
-* good_bytes != blk_rq_bytes(req) as the signal for an error.
-* This sets the error explicitly for the problem case.
-*/
-   error = __scsi_error_from_host_byte(cmd, result);
}
 
/* no bidi support for !blk_rq_is_passthrough yet */
@@ -848,7 +837,6 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
else if (!(req->rq_flags & RQF_QUIET))
scsi_print_sense(cmd);
result = 0;
-   /* for passthrough error may be set */
error = 0;
}
 
@@ -877,8 +865,6 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
if (result == 0)
goto requeue;
 
-   error = __scsi_error_from_host_byte(cmd, result);
-
if (host_byte(result) == DID_RESET) {
/* Third party bus reset or reset for error recovery
 * reasons.  Just retry the command and see what
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 10/25] loop: zero-fill bio on the submitting cpu

2017-04-06 Thread Christoph Hellwig
In thruth I've just audited which blk-mq drivers don't currently have a
complete callback, but I think this change is at least borderline useful.

Signed-off-by: Christoph Hellwig 
---
 drivers/block/loop.c | 30 ++
 drivers/block/loop.h |  1 +
 2 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index cc981f34e017..6924ec611a49 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -445,32 +445,27 @@ static int lo_req_flush(struct loop_device *lo, struct 
request *rq)
return ret;
 }
 
-static inline void handle_partial_read(struct loop_cmd *cmd, long bytes)
+static void lo_complete_rq(struct request *rq)
 {
-   if (bytes < 0 || op_is_write(req_op(cmd->rq)))
-   return;
+   struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);
 
-   if (unlikely(bytes < blk_rq_bytes(cmd->rq))) {
+   if (unlikely(req_op(cmd->rq) == REQ_OP_READ && cmd->use_aio &&
+cmd->ret >= 0 && cmd->ret < blk_rq_bytes(cmd->rq))) {
struct bio *bio = cmd->rq->bio;
 
-   bio_advance(bio, bytes);
+   bio_advance(bio, cmd->ret);
zero_fill_bio(bio);
}
+
+   blk_mq_end_request(rq, cmd->ret < 0 ? -EIO : 0);
 }
 
 static void lo_rw_aio_complete(struct kiocb *iocb, long ret, long ret2)
 {
struct loop_cmd *cmd = container_of(iocb, struct loop_cmd, iocb);
-   struct request *rq = cmd->rq;
-
-   handle_partial_read(cmd, ret);
 
-   if (ret > 0)
-   ret = 0;
-   else if (ret < 0)
-   ret = -EIO;
-
-   blk_mq_complete_request(rq, ret);
+   cmd->ret = ret;
+   blk_mq_complete_request(cmd->rq, 0);
 }
 
 static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
@@ -1686,8 +1681,10 @@ static void loop_handle_cmd(struct loop_cmd *cmd)
ret = do_req_filebacked(lo, cmd->rq);
  failed:
/* complete non-aio request */
-   if (!cmd->use_aio || ret)
-   blk_mq_complete_request(cmd->rq, ret ? -EIO : 0);
+   if (!cmd->use_aio || ret) {
+   cmd->ret = ret ? -EIO : 0;
+   blk_mq_complete_request(cmd->rq, 0);
+   }
 }
 
 static void loop_queue_work(struct kthread_work *work)
@@ -1713,6 +1710,7 @@ static int loop_init_request(void *data, struct request 
*rq,
 static const struct blk_mq_ops loop_mq_ops = {
.queue_rq   = loop_queue_rq,
.init_request   = loop_init_request,
+   .complete   = lo_complete_rq,
 };
 
 static int loop_add(struct loop_device **l, int i)
diff --git a/drivers/block/loop.h b/drivers/block/loop.h
index fb2237c73e61..fecd3f97ef8c 100644
--- a/drivers/block/loop.h
+++ b/drivers/block/loop.h
@@ -70,6 +70,7 @@ struct loop_cmd {
struct request *rq;
struct list_head list;
bool use_aio;   /* use AIO interface to handle I/O */
+   long ret;
struct kiocb iocb;
 };
 
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 20/25] floppy: switch from req->errors to req->error_count

2017-04-06 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/block/floppy.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index ce102ec47ef2..60d4c7653178 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -2805,8 +2805,10 @@ static int set_next_request(void)
fdc_queue = 0;
if (q) {
current_req = blk_fetch_request(q);
-   if (current_req)
+   if (current_req) {
+   current_req->error_count = 0;
break;
+   }
}
} while (fdc_queue != old_pos);
 
@@ -2866,7 +2868,7 @@ static void redo_fd_request(void)
_floppy = floppy_type + DP->autodetect[DRS->probed_format];
} else
probing = 0;
-   errors = &(current_req->errors);
+   errors = &(current_req->error_count);
tmp = make_raw_rw_request();
if (tmp < 2) {
request_done(tmp);
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 06/25] virtio: fix spelling of virtblk_scsi_request_done

2017-04-06 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/block/virtio_blk.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index eaf99022bdc6..dbc4e80680b1 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -111,7 +111,7 @@ static int virtblk_add_req_scsi(struct virtqueue *vq, 
struct virtblk_req *vbr,
return virtqueue_add_sgs(vq, sgs, num_out, num_in, vbr, GFP_ATOMIC);
 }
 
-static inline void virtblk_scsi_reques_done(struct request *req)
+static inline void virtblk_scsi_request_done(struct request *req)
 {
struct virtblk_req *vbr = blk_mq_rq_to_pdu(req);
struct virtio_blk *vblk = req->q->queuedata;
@@ -144,7 +144,7 @@ static inline int virtblk_add_req_scsi(struct virtqueue *vq,
 {
return -EIO;
 }
-static inline void virtblk_scsi_reques_done(struct request *req)
+static inline void virtblk_scsi_request_done(struct request *req)
 {
 }
 #define virtblk_ioctl  NULL
@@ -180,7 +180,7 @@ static inline void virtblk_request_done(struct request *req)
switch (req_op(req)) {
case REQ_OP_SCSI_IN:
case REQ_OP_SCSI_OUT:
-   virtblk_scsi_reques_done(req);
+   virtblk_scsi_request_done(req);
break;
case REQ_OP_DRV_IN:
req->errors = (error != 0);
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 12/25] dm rq: don't pass irrelevant error code to blk_mq_complete_request

2017-04-06 Thread Christoph Hellwig
dm never uses rq->errors, so there is no need to pass an error argument
to blk_mq_complete_request.

Signed-off-by: Christoph Hellwig 
---
 drivers/md/dm-rq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 6886bf160fb2..e1cea42ec2a6 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -358,7 +358,7 @@ static void dm_complete_request(struct request *rq, int 
error)
if (!rq->q->mq_ops)
blk_complete_request(rq);
else
-   blk_mq_complete_request(rq, error);
+   blk_mq_complete_request(rq, 0);
 }
 
 /*
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] [PATCH 01/25] remove the mg_disk driver

2017-04-06 Thread Christoph Hellwig
This drivers was added in 2008, but as far as a I can tell we never had a
single platform that actually registered resources for the platform driver.

It's also been unmaintained for a long time and apparently has a ATA mode
that can be driven using the IDE/libata subsystem.

Signed-off-by: Christoph Hellwig 
---
 Documentation/blockdev/mflash.txt |   84 ---
 drivers/block/Kconfig |   17 -
 drivers/block/Makefile|1 -
 drivers/block/mg_disk.c   | 1110 -
 include/linux/mg_disk.h   |   45 --
 5 files changed, 1257 deletions(-)
 delete mode 100644 Documentation/blockdev/mflash.txt
 delete mode 100644 drivers/block/mg_disk.c
 delete mode 100644 include/linux/mg_disk.h

diff --git a/Documentation/blockdev/mflash.txt 
b/Documentation/blockdev/mflash.txt
deleted file mode 100644
index f7e050551487..
--- a/Documentation/blockdev/mflash.txt
+++ /dev/null
@@ -1,84 +0,0 @@
-This document describes m[g]flash support in linux.
-
-Contents
-  1. Overview
-  2. Reserved area configuration
-  3. Example of mflash platform driver registration
-
-1. Overview
-
-Mflash and gflash are embedded flash drive. The only difference is mflash is
-MCP(Multi Chip Package) device. These two device operate exactly same way.
-So the rest mflash repersents mflash and gflash altogether.
-
-Internally, mflash has nand flash and other hardware logics and supports
-2 different operation (ATA, IO) modes. ATA mode doesn't need any new
-driver and currently works well under standard IDE subsystem. Actually it's
-one chip SSD. IO mode is ATA-like custom mode for the host that doesn't have
-IDE interface.
-
-Following are brief descriptions about IO mode.
-A. IO mode based on ATA protocol and uses some custom command. (read confirm,
-write confirm)
-B. IO mode uses SRAM bus interface.
-C. IO mode supports 4kB boot area, so host can boot from mflash.
-
-2. Reserved area configuration
-If host boot from mflash, usually needs raw area for boot loader image. All of
-the mflash's block device operation will be taken this value as start offset.
-Note that boot loader's size of reserved area and kernel configuration value
-must be same.
-
-3. Example of mflash platform driver registration
-Working mflash is very straight forward. Adding platform device stuff to board
-configuration file is all. Here is some pseudo example.
-
-static struct mg_drv_data mflash_drv_data = {
-   /* If you want to polling driver set to 1 */
-   .use_polling = 0,
-   /* device attribution */
-   .dev_attr = MG_BOOT_DEV
-};
-
-static struct resource mg_mflash_rsc[] = {
-   /* Base address of mflash */
-   [0] = {
-   .start = 0x0800,
-   .end = 0x0800 + SZ_64K - 1,
-   .flags = IORESOURCE_MEM
-   },
-   /* mflash interrupt pin */
-   [1] = {
-   .start = IRQ_GPIO(84),
-   .end = IRQ_GPIO(84),
-   .flags = IORESOURCE_IRQ
-   },
-   /* mflash reset pin */
-   [2] = {
-   .start = 43,
-   .end = 43,
-   .name = MG_RST_PIN,
-   .flags = IORESOURCE_IO
-   },
-   /* mflash reset-out pin
-* If you use mflash as storage device (i.e. other than MG_BOOT_DEV),
-* should assign this */
-   [3] = {
-   .start = 51,
-   .end = 51,
-   .name = MG_RSTOUT_PIN,
-   .flags = IORESOURCE_IO
-   }
-};
-
-static struct platform_device mflash_dev = {
-   .name = MG_DEV_NAME,
-   .id = -1,
-   .dev = {
-   .platform_data = _drv_data,
-   },
-   .num_resources = ARRAY_SIZE(mg_mflash_rsc),
-   .resource = mg_mflash_rsc
-};
-
-platform_device_register(_dev);
diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index a1c2e816128f..ebe8c1a6195e 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -434,23 +434,6 @@ config ATA_OVER_ETH
This driver provides Support for ATA over Ethernet block
devices like the Coraid EtherDrive (R) Storage Blade.
 
-config MG_DISK
-   tristate "mGine mflash, gflash support"
-   depends on ARM && GPIOLIB
-   help
- mGine mFlash(gFlash) block device driver
-
-config MG_DISK_RES
-   int "Size of reserved area before MBR"
-   depends on MG_DISK
-   default 0
-   help
- Define size of reserved area that usually used for boot. Unit is KB.
- All of the block device operation will be taken this value as start
- offset
- Examples:
-   1024 => 1 MB
-
 config SUNVDC
tristate "Sun Virtual Disk Client support"
depends on SUN_LDOMS
diff --git a/drivers/block/Makefile b/drivers/block/Makefile
index b12c772bbeb3..5ceead8b52d7 100644
--- a/drivers/block/Makefile
+++ b/drivers/block/Makefile
@@ -19,7 +19,6 @@ obj-$(CONFIG_BLK_CPQ_CISS_DA)  += cciss.o
 

[dm-devel] [PATCH 24/25] blktrace: remove the unused block_rq_abort tracepoint

2017-04-06 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 include/trace/events/block.h | 44 ++--
 kernel/trace/blktrace.c  |  9 -
 2 files changed, 10 insertions(+), 43 deletions(-)

diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index a88ed13446ff..99ed69fad041 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -61,7 +61,16 @@ DEFINE_EVENT(block_buffer, block_dirty_buffer,
TP_ARGS(bh)
 );
 
-DECLARE_EVENT_CLASS(block_rq_with_error,
+/**
+ * block_rq_requeue - place block IO request back on a queue
+ * @q: queue holding operation
+ * @rq: block IO operation request
+ *
+ * The block operation request @rq is being placed back into queue
+ * @q.  For some reason the request was not completed and needs to be
+ * put back in the queue.
+ */
+TRACE_EVENT(block_rq_requeue,
 
TP_PROTO(struct request_queue *q, struct request *rq),
 
@@ -94,39 +103,6 @@ DECLARE_EVENT_CLASS(block_rq_with_error,
 );
 
 /**
- * block_rq_abort - abort block operation request
- * @q: queue containing the block operation request
- * @rq: block IO operation request
- *
- * Called immediately after pending block IO operation request @rq in
- * queue @q is aborted. The fields in the operation request @rq
- * can be examined to determine which device and sectors the pending
- * operation would access.
- */
-DEFINE_EVENT(block_rq_with_error, block_rq_abort,
-
-   TP_PROTO(struct request_queue *q, struct request *rq),
-
-   TP_ARGS(q, rq)
-);
-
-/**
- * block_rq_requeue - place block IO request back on a queue
- * @q: queue holding operation
- * @rq: block IO operation request
- *
- * The block operation request @rq is being placed back into queue
- * @q.  For some reason the request was not completed and needs to be
- * put back in the queue.
- */
-DEFINE_EVENT(block_rq_with_error, block_rq_requeue,
-
-   TP_PROTO(struct request_queue *q, struct request *rq),
-
-   TP_ARGS(q, rq)
-);
-
-/**
  * block_rq_complete - block IO operation completed by device driver
  * @q: queue containing the block operation request
  * @rq: block operations request
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index b2058a7f94bd..9f3624dadb09 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -716,12 +716,6 @@ static void blk_add_trace_rq(struct request_queue *q, 
struct request *rq,
rq->cmd_flags, what, rq->errors, 0, NULL);
 }
 
-static void blk_add_trace_rq_abort(void *ignore,
-  struct request_queue *q, struct request *rq)
-{
-   blk_add_trace_rq(q, rq, blk_rq_bytes(rq), BLK_TA_ABORT);
-}
-
 static void blk_add_trace_rq_insert(void *ignore,
struct request_queue *q, struct request *rq)
 {
@@ -974,8 +968,6 @@ static void blk_register_tracepoints(void)
 {
int ret;
 
-   ret = register_trace_block_rq_abort(blk_add_trace_rq_abort, NULL);
-   WARN_ON(ret);
ret = register_trace_block_rq_insert(blk_add_trace_rq_insert, NULL);
WARN_ON(ret);
ret = register_trace_block_rq_issue(blk_add_trace_rq_issue, NULL);
@@ -1028,7 +1020,6 @@ static void blk_unregister_tracepoints(void)
unregister_trace_block_rq_requeue(blk_add_trace_rq_requeue, NULL);
unregister_trace_block_rq_issue(blk_add_trace_rq_issue, NULL);
unregister_trace_block_rq_insert(blk_add_trace_rq_insert, NULL);
-   unregister_trace_block_rq_abort(blk_add_trace_rq_abort, NULL);
 
tracepoint_synchronize_unregister();
 }
-- 
2.11.0

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] dm-crypt IV generation (summary)

2017-04-06 Thread Mike Snitzer
On Thu, Apr 06 2017 at  5:29am -0400,
Herbert Xu  wrote:

> On Fri, Mar 10, 2017 at 02:44:26PM +0100, Ondrej Mosnacek wrote:
> > Hi all,
> > 
> > I was tasked to post a summary the whole dm-crypt IV generation
> > problem and all the suggested solutions along with their drawbacks, so
> > here it goes...
> 
> Thanks for the summary.  It looks good to me.

There were 4 different solutions presented.  In reading each, option 2
looked the most promising.  BUT it requires a tradeoff/decision to be
made, please see my earlier reply where I asked for your (and/or
Milan's) thoughts:
https://www.spinics.net/lists/linux-crypto/msg24586.html

Thanks,
Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


[dm-devel] multipathd: locks itself in udev trigger

2017-04-06 Thread Alban Browaeys
Bcache backing partition bcache0 triggers an udev add event that is handled by 
multipathd.
 Somewhat the other "bare" paritions sda do not.

The issue is when this event triggers the thread lock itself since commit
 c6a18f4541d0a161e2f5fed8c67d9732bf512b37 "fix INIT_REQUESTED_UDEV code" .
This change in "uev_update_path" moved "uev_add_path(uev, vecs);" under the 
fast lock (non recursive)
"lock(>lock);".  As uev_add_path too calls "lock(>lock);" 
multipathd hangs in this second call
in the same thread.

Then "multipathd list paths" or other multipathd commands returns timeout.
This also postpone systemd shutdown/reboot by a minute while it waits for 
multipathd service to stop.
 
The backtrace was:
(gdb) t a a bt

Thread 6 (Thread 0x7f922663c700 (LWP 545)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x7f9228bd3602 in ?? () from /usr/lib/x86_64-linux-gnu/liburcu.so.4
#2  0x7f92289ba424 in start_thread (arg=0x7f922663c700) at 
pthread_create.c:333
#3  0x7f922825c9bf in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:105

Thread 5 (Thread 0x7f9229734700 (LWP 543)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x7f92289bcb85 in __GI___pthread_mutex_lock (mutex=0x556fefc43080) at 
../nptl/pthread_mutex_lock.c:80
#2  0x556fedcbe42d in lock (a=0x556fefc43080) at ../libmultipath/lock.h:12
#3  uev_add_path (vecs=0x556fefc43080, uev=, uev=) at main.c:627
#4  0x556fedcbe9c9 in uev_update_path (uev=0x7f9220001510, 
vecs=0x556fefc43080) at main.c:998
#5  0x556fedcbecdb in uev_trigger (uev=0x7f9220001510, 
trigger_data=0x556fefc43080) at main.c:1146
#6  0x7f92292091b2 in service_uevq (tmpq=tmpq@entry=0x7f9229733b10) at 
uevent.c:89
#7  0x7f9229209280 in uevent_dispatch (uev_trigger=, 
trigger_data=) at uevent.c:145
#8  0x556fedcbc2cc in uevqloop (ap=0x556fefc43080) at main.c:1177
#9  0x7f92289ba424 in start_thread (arg=0x7f9229734700) at 
pthread_create.c:333
#10 0x7f922825c9bf in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:105

Thread 4 (Thread 0x7f9229745700 (LWP 542)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x7f92289bcb85 in __GI___pthread_mutex_lock (mutex=0x556fefc43080) at 
../nptl/pthread_mutex_lock.c:80
#2  0x556fedcbfb45 in lock (a=0x556fefc43080) at ../libmultipath/lock.h:12
#3  checkerloop (ap=0x556fefc43080) at main.c:1827
#4  0x7f92289ba424 in start_thread (arg=0x7f9229745700) at 
pthread_create.c:333
#5  0x7f922825c9bf in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:105

Thread 3 (Thread 0x7f9229810700 (LWP 541)):
#0  0x7f9228253611 in __GI_ppoll (fds=0x7f92180021e0, nfds=nfds@entry=1, 
timeout=, timeout@entry=0x556fedecc020 , 
sigmask=sigmask@entry=0x7f922980fa60) at ../sysdeps/unix/sysv/linux/ppoll.c:39
#1  0x556fedcc13ba in ppoll (__ss=0x7f922980fa60, __timeout=0x556fedecc020 
, __nfds=1, __fds=) at 
/usr/include/x86_64-linux-gnu/bits/poll2.h:77
#2  uxsock_listen (uxsock_trigger=0x556fedcbb520 , 
trigger_data=0x556fefc43080) at uxlsnr.c:204
#3  0x556fedcbbd5a in uxlsnrloop (ap=0x556fefc43080) at main.c:1239
#4  0x7f92289ba424 in start_thread (arg=0x7f9229810700) at 
pthread_create.c:333
#5  0x7f922825c9bf in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:105

Thread 2 (Thread 0x7f9229851700 (LWP 540)):
#0  0x7f922825354d in poll () at ../sysdeps/unix/syscall-template.S:84
#1  0x7f9229209f3a in poll (__timeout=, __nfds=1, 
__fds=0x7f9229850a88) at /usr/include/x86_64-linux-gnu/bits/poll2.h:46
#2  uevent_listen (udev=0x556fefbec040) at uevent.c:515
#3  0x556fedcbc235 in ueventloop (ap=0x556fefbec040) at main.c:1166
#4  0x7f92289ba424 in start_thread (arg=0x7f9229851700) at 
pthread_create.c:333
#5  0x7f922825c9bf in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:105

Thread 1 (Thread 0x7f9229746f00 (LWP 537)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at 
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x556fedcc0aba in child (param=) at main.c:2407
#2  0x556fedcbb0df in main (argc=, argv=0x7fff81f9a0d8) at 
main.c:2664

As a local workaround I moved "uev_add_path" in "uev_update_path" back out of 
the lock umbrella
 while I keep it under pp->initialized check.
https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=859157;filename=fix_uev_update_path_udevadd_recursive_lock_deadlock.diff;msg=5
This change fixes the reboot delay but I have no multipath setup thus cannot 
detect any regressions.

-Alban

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 04/33] multipath: do not check daemon from udev rules

2017-04-06 Thread Martin Wilck
Hi Ben,

thanks for looking into this. I'll respond next week.

Cheers,
Martin

-- 
Dr. Martin Wilck , Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Re: [dm-devel] dm-crypt IV generation (summary)

2017-04-06 Thread Herbert Xu
On Fri, Mar 10, 2017 at 02:44:26PM +0100, Ondrej Mosnacek wrote:
> Hi all,
> 
> I was tasked to post a summary the whole dm-crypt IV generation
> problem and all the suggested solutions along with their drawbacks, so
> here it goes...

Thanks for the summary.  It looks good to me.

Something else to keep mind is the potential to reuse IV generators.

Recently a patch has been proposed for fscrypt that also makes
use of essiv (search for "fscrypt: Add support for AES-128-CBC").
It would be great if we could reuse the same code for both dm-crypt
and fscrypt.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 26/27] scsi: sd: Separate zeroout and discard command choices

2017-04-06 Thread Hannes Reinecke
On 04/05/2017 07:21 PM, Christoph Hellwig wrote:
> From: "Martin K. Petersen" 
> 
> Now that zeroout and discards are distinct operations we need to
> separate the policy of choosing the appropriate command. Create a
> zeroing_mode which can be one of:
> 
> write:Zeroout assist not present, use regular WRITE
> writesame:Allow WRITE SAME(10/16) with a zeroed payload
> writesame_16_unmap:   Allow WRITE SAME(16) with UNMAP
> writesame_10_unmap:   Allow WRITE SAME(10) with UNMAP
> 
> The last two are conditional on the device being thin provisioned with
> LBPRZ=1 and LBPWS=1 or LBPWS10=1 respectively.
> 
> Whether to set the UNMAP bit or not depends on the REQ_NOUNMAP flag. And
> if none of the _unmap variants are supported, regular WRITE SAME will be
> used if the device supports it.
> 
> The zeroout_mode is exported in sysfs and the detected mode for a given
> device can be overridden using the string constants above.
> 
> With this change in place we can now issue WRITE SAME(16) with UNMAP set
> for block zeroing applications that require hard guarantees and
> logical_block_size granularity. And at the same time use the UNMAP
> command with the device's preferred granulary and alignment for discard
> operations.
> 
> Signed-off-by: Martin K. Petersen 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/scsi/sd.c | 56 
> ---
>  drivers/scsi/sd.h |  8 
>  2 files changed, 61 insertions(+), 3 deletions(-)
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel


Re: [dm-devel] [PATCH 27/27] scsi: sd: Remove LBPRZ dependency for discards

2017-04-06 Thread Hannes Reinecke
On 04/05/2017 07:21 PM, Christoph Hellwig wrote:
> From: "Martin K. Petersen" 
> 
> Separating discards and zeroout operations allows us to remove the LBPRZ
> block zeroing constraints from discards and honor the device preferences
> for UNMAP commands.
> 
> If supported by the device, we'll also choose UNMAP over one of the
> WRITE SAME variants for discards.
> 
> Signed-off-by: Martin K. Petersen 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/scsi/sd.c | 25 ++---
>  1 file changed, 6 insertions(+), 19 deletions(-)
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel