Re:

2017-05-02 Thread H.A
With profound love in my heart, I Kindly Oblige your interest to very important 
proposal.. It is Truly Divine and require your utmost attention..

S hlubokou láskou v mém srdci, Laskave jsem prinutit svuj zájem k návrhu .. Je 
velmi duležité, skutecne Divine a vyžadují vaši nejvyšší pozornost.

  Kontaktujte me prímo pres: helenarobert...@gmail.com pro úplné 
podrobnosti.complete.


HELINA .A ROBERTS

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus



[PATCH 1/3] target: Fix compare_and_write_callback handling for non GOOD status

2017-05-02 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

Following the bugfix for handling non SAM_STAT_GOOD COMPARE_AND_WRITE
status during COMMIT phase in commit 9b2792c3da1, the same bug exists
for the READ phase as well.

This would manifest first as a lost SCSI response, and eventual
hung task during fabric driver logout or re-login, as existing
shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref
to reach zero.

To address this bug, compare_and_write_callback() has been changed
to set post_ret = 1 and return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE
as necessary to signal failure status.

Reported-by: Bill Borsari 
Cc: Bill Borsari 
Tested-by: Gary Guo 
Cc: Gary Guo 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_sbc.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index f9250b3..a0ad618 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -507,8 +507,11 @@ static sense_reason_t compare_and_write_callback(struct 
se_cmd *cmd, bool succes
 * been failed with a non-zero SCSI status.
 */
if (cmd->scsi_status) {
-   pr_err("compare_and_write_callback: non zero scsi_status:"
+   pr_debug("compare_and_write_callback: non zero scsi_status:"
" 0x%02x\n", cmd->scsi_status);
+   *post_ret = 1;
+   if (cmd->scsi_status == SAM_STAT_CHECK_CONDITION)
+   ret = TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
goto out;
}
 
-- 
1.9.1



[PATCH 3/3] target: Don't force session reset if queue_depth does not change

2017-05-02 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

Keeping in the idempotent nature of target_core_fabric_configfs.c,
if a queue_depth value is set and it's the same as the existing
value, don't attempt to force session reinstatement.

Reported-by: Raghu Krishnamurthy 
Cc: Raghu Krishnamurthy 
Tested-by: Gary Guo 
Cc: Gary Guo 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/target_core_tpg.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/target/target_core_tpg.c b/drivers/target/target_core_tpg.c
index dfaef4d..310d9e5 100644
--- a/drivers/target/target_core_tpg.c
+++ b/drivers/target/target_core_tpg.c
@@ -398,6 +398,13 @@ int core_tpg_set_initiator_node_queue_depth(
struct se_portal_group *tpg = acl->se_tpg;
 
/*
+* Allow the setting of se_node_acl queue_depth to be idempotent,
+* and not force a session shutdown event if the value is not
+* changing.
+*/
+   if (acl->queue_depth == queue_depth)
+   return 0;
+   /*
 * User has requested to change the queue depth for a Initiator Node.
 * Change the value in the Node's struct se_node_acl, and call
 * target_set_nacl_queue_depth() to set the new queue depth.
-- 
1.9.1



[PATCH 2/3] iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement

2017-05-02 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

While testing modification of per se_node_acl queue_depth forcing
session reinstatement via lio_target_nacl_cmdsn_depth_store() ->
core_tpg_set_initiator_node_queue_depth(), a hung task bug triggered
when changing cmdsn_depth invoked session reinstatement while an iscsi
login was already waiting for session reinstatement to complete.

This can happen when an outstanding se_cmd descriptor is taking a
long time to complete, and session reinstatement from iscsi login
or cmdsn_depth change occurs concurrently.

To address this bug, explicitly set session_fall_back_to_erl0 = 1
when forcing session reinstatement, so session reinstatement is
not attempted if an active session is already being shutdown.

This patch has been tested with two scenarios.  The first when
iscsi login is blocked waiting for iscsi session reinstatement
to complete followed by queue_depth change via configfs, and
second when queue_depth change via configfs us blocked followed
by a iscsi login driven session reinstatement.

Note this patch depends on commit d36ad77f702 to handle multiple
sessions per se_node_acl when changing cmdsn_depth, and for
pre v4.5 kernels will need to be included for stable as well.

Reported-by: Gary Guo 
Tested-by: Gary Guo 
Cc: Gary Guo 
Signed-off-by: Nicholas Bellinger 
---
 drivers/target/iscsi/iscsi_target.c  | 1 +
 drivers/target/iscsi/iscsi_target_configfs.c | 1 +
 drivers/target/iscsi/iscsi_target_login.c| 1 +
 3 files changed, 3 insertions(+)

diff --git a/drivers/target/iscsi/iscsi_target.c 
b/drivers/target/iscsi/iscsi_target.c
index 0f7ade0..26a9bcd 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -4663,6 +4663,7 @@ int iscsit_release_sessions_for_tpg(struct 
iscsi_portal_group *tpg, int force)
continue;
}
atomic_set(&sess->session_reinstatement, 1);
+   atomic_set(&sess->session_fall_back_to_erl0, 1);
spin_unlock(&sess->conn_lock);
 
list_move_tail(&se_sess->sess_list, &free_list);
diff --git a/drivers/target/iscsi/iscsi_target_configfs.c 
b/drivers/target/iscsi/iscsi_target_configfs.c
index 344e844..96d9c73 100644
--- a/drivers/target/iscsi/iscsi_target_configfs.c
+++ b/drivers/target/iscsi/iscsi_target_configfs.c
@@ -1528,6 +1528,7 @@ static void lio_tpg_close_session(struct se_session 
*se_sess)
return;
}
atomic_set(&sess->session_reinstatement, 1);
+   atomic_set(&sess->session_fall_back_to_erl0, 1);
spin_unlock(&sess->conn_lock);
 
iscsit_stop_time2retain_timer(sess);
diff --git a/drivers/target/iscsi/iscsi_target_login.c 
b/drivers/target/iscsi/iscsi_target_login.c
index ad8f301..6623847 100644
--- a/drivers/target/iscsi/iscsi_target_login.c
+++ b/drivers/target/iscsi/iscsi_target_login.c
@@ -208,6 +208,7 @@ int iscsi_check_for_session_reinstatement(struct iscsi_conn 
*conn)
initiatorname_param->value) &&
   (sess_p->sess_ops->SessionType == sessiontype))) {
atomic_set(&sess_p->session_reinstatement, 1);
+   atomic_set(&sess_p->session_fall_back_to_erl0, 1);
spin_unlock(&sess_p->conn_lock);
iscsit_inc_session_usage_count(sess_p);
iscsit_stop_time2retain_timer(sess_p);
-- 
1.9.1



[PATCH 0/3] target: Fixes for v4.12-rc1

2017-05-02 Thread Nicholas A. Bellinger
From: Nicholas Bellinger 

Hi all,

Here are a couple of fixes from the last weeks testing while
continuing longevity and scale out workloads on v4.x target code.

This series contains three patches.  The first is to address
a COMPARE_AND_WRITE se_cmd reference leak where the READ phase
hits a non GOOD status, observed with ESX VAAI hosts when
outstanding READ I/O reaches a point where non SAM_STAT_GOOD
status completions start to happen.

The second addresses a hung task bug observed with iscsi-target
ports while explicitly changing the active per se_node_acl
queue_depth via the existing configfs attribute, if a new iscsi
login was already forcing session reinstatement.

And the third to is avoid forcing an session reinstatement if
queue_depth is changed via configfs, but the value itself has
not changed.

The series has been verified on v4.1.y by DATERA Q/A and
automation teams.

Thank you,

--nab

Nicholas Bellinger (3):
  target: Fix compare_and_write_callback handling for non GOOD status
  iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement
  target: Don't force session reset if queue_depth does not change

 drivers/target/iscsi/iscsi_target.c  | 1 +
 drivers/target/iscsi/iscsi_target_configfs.c | 1 +
 drivers/target/iscsi/iscsi_target_login.c| 1 +
 drivers/target/target_core_sbc.c | 5 -
 drivers/target/target_core_tpg.c | 7 +++
 5 files changed, 14 insertions(+), 1 deletion(-)

-- 
1.9.1



[Bug 176951] boot fails unless acpi=off Acer Travelmate X-349

2017-05-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=176951

Max (700...@gmail.com) changed:

   What|Removed |Added

 CC||700...@gmail.com

--- Comment #23 from Max (700...@gmail.com) ---
I can confirm the bug on my Acer Swift 3 Model SF314-51 (bios 1.08).
I able to install just Mint 18.1. I found that it freezes when:
1. ~4 times of 5 when my machine start ups (boot logo/black screen); 
2. usually when chrome starts; 
3. when brightness changes very fast; 
4. when shut down.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


Re: [PATCH v2] ibmvscsis: Do not send aborted task response

2017-05-02 Thread Nicholas A. Bellinger
On Tue, 2017-05-02 at 13:54 -0500, Bryant G. Ly wrote:
> The driver is sending a response to the actual scsi op that was
> aborted by an abort task TM, while LIO is sending a response to
> the abort task TM.
> 
> ibmvscsis_tgt does not send the response to the client until
> release_cmd time. The reason for this was because if we did it
> at queue_status time, then the client would be free to reuse the
> tag for that command, but we're still using the tag until the
> command is released at release_cmd time, so we chose to delay
> sending the response until then. That then caused this issue, because
> release_cmd is always called, even if queue_status is not.
> 
> SCSI spec says that the initiator that sends the abort task
> TM NEVER gets a response to the aborted op and with the current
> code it will send a response. Thus this fix will remove that response
> if the TAS bit is set.
> 
> Cc:  # v4.8+
> Signed-off-by: Bryant G. Ly 
> Reviewed-by: Tyrel Datwyler 
> ---
>  drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 66 
> ++--
>  1 file changed, 45 insertions(+), 21 deletions(-)

Applied, with a small update to the last sentence of the commit log wrt
to 'if ABORTED && !TAS bit is set'.

Thanks Bryant + Tyrel.



Re: [PATCH] tcmu: Recalculate the tcmu_cmd size to save cmd area memories

2017-05-02 Thread Nicholas A. Bellinger
On Tue, 2017-05-02 at 21:06 -0500, Mike Christie wrote:
> On 05/02/2017 02:54 AM, lixi...@cmss.chinamobile.com wrote:
> > From: Xiubo Li 
> > 
> > For the "struct tcmu_cmd_entry" in cmd area, the minimum size
> > will be sizeof(struct tcmu_cmd_entry) == 112 Bytes. And it could
> > fill about (sizeof(struct rsp) - sizeof(struct req)) /
> > sizeof(struct iovec) == 68 / 16 ~= 4 data regions(iov[4]) by
> > default.
> > 
> > For most tcmu_cmds, the data block indexes allocated from the
> > data area will be continuous. And for the continuous blocks they
> > will be merged into the same region using only one iovec. For
> > the current code, it will always allocates the same number of
> > iovecs with blocks for each tcmu_cmd, and it will wastes much
> > memories.
> > 
> > For example, when the block size is 4K and the DATA_OUT buffer
> > size is 64K, and the regions needed is less than 5(on my
> > environment is almost 99.7%). The current code will allocate
> > about 16 iovecs, and there will be (16 - 4) * sizeof(struct
> > iovec) = 192 Bytes cmd area memories wasted.
> > 
> > Here adds two helpers to calculate the base size and full size
> > of the tcmu_cmd. And will recalculate them again when it make sure
> > how many iovs is needed before insert it to cmd area.
> > 
> > Signed-off-by: Xiubo Li 
> 
> Looks ok to me. Thanks.
> 
> Acked-by: Mike Christie 

Applied.  Thanks Xiubo + MNC.



Re: [PATCH 25/27] block: remove the discard_zeroes_data flag

2017-05-02 Thread Nicholas A. Bellinger
On Tue, 2017-05-02 at 09:23 +0200, h...@lst.de wrote:
> On Tue, May 02, 2017 at 12:16:13AM -0700, Nicholas A. Bellinger wrote:
> > Or, another options is use bdev_write_zeroes_sectors() to determine when
> > dev_attrib->unmap_zeroes_data should be set.
> 
> Yes, that in combination with your patch to use bdev_write_zeroes_sectors
> for zeroing from write same seems like the right fix.

The larger target/iblock conversion patch looks like post v4.12 material
at this point, so to avoid breakage wrt to existing LBPRZ behavior, I'll
plan to push the following patch post -rc1.

diff --git a/drivers/target/target_core_device.c 
b/drivers/target/target_core_device.c
index d2f089c..e7caf78 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -851,7 +851,7 @@ bool target_configure_unmap_from_queue(struct se_dev_attrib 
*attrib,
attrib->unmap_granularity = q->limits.discard_granularity / block_size;
attrib->unmap_granularity_alignment = q->limits.discard_alignment /
block_size;
-   attrib->unmap_zeroes_data = 0;
+   attrib->unmap_zeroes_data = (q->limits.max_write_zeroes_sectors);
return true;
 }
 EXPORT_SYMBOL(target_configure_unmap_from_queue);



Re: [PATCH] tcmu: Recalculate the tcmu_cmd size to save cmd area memories

2017-05-02 Thread Mike Christie
On 05/02/2017 02:54 AM, lixi...@cmss.chinamobile.com wrote:
> From: Xiubo Li 
> 
> For the "struct tcmu_cmd_entry" in cmd area, the minimum size
> will be sizeof(struct tcmu_cmd_entry) == 112 Bytes. And it could
> fill about (sizeof(struct rsp) - sizeof(struct req)) /
> sizeof(struct iovec) == 68 / 16 ~= 4 data regions(iov[4]) by
> default.
> 
> For most tcmu_cmds, the data block indexes allocated from the
> data area will be continuous. And for the continuous blocks they
> will be merged into the same region using only one iovec. For
> the current code, it will always allocates the same number of
> iovecs with blocks for each tcmu_cmd, and it will wastes much
> memories.
> 
> For example, when the block size is 4K and the DATA_OUT buffer
> size is 64K, and the regions needed is less than 5(on my
> environment is almost 99.7%). The current code will allocate
> about 16 iovecs, and there will be (16 - 4) * sizeof(struct
> iovec) = 192 Bytes cmd area memories wasted.
> 
> Here adds two helpers to calculate the base size and full size
> of the tcmu_cmd. And will recalculate them again when it make sure
> how many iovs is needed before insert it to cmd area.
> 
> Signed-off-by: Xiubo Li 

Looks ok to me. Thanks.

Acked-by: Mike Christie 



Re: [PATCH] Avoid that scsi_exit_rq() triggers a use-after-free

2017-05-02 Thread Scott Bauer
On Tue, May 02, 2017 at 10:43:30AM -0700, Bart Van Assche wrote:
> This patch fixes the following KASAN complaint:
> 
> ==
> BUG: KASAN: use-after-free in scsi_exit_rq+0xf3/0x120 at addr 8802b7fedf00
> Read of size 1 by task rcuos/5/53
> CPU: 7 PID: 53 Comm: rcuos/6 Not tainted 4.11.0-rc5+ #13
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0 
> ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
> Call Trace:
>  dump_stack+0x63/0x8f
>  kasan_object_err+0x21/0x70
>  kasan_report.part.1+0x231/0x500
>  __asan_report_load1_noabort+0x2e/0x30
>  scsi_exit_rq+0xf3/0x120
>  free_request_size+0x44/0x60
>  mempool_destroy.part.6+0x9b/0x150
>  mempool_destroy+0x13/0x20
>  blk_exit_rl+0x36/0x40
>  blkg_free+0x146/0x200
>  __blkg_release_rcu+0x121/0x220
>  rcu_nocb_kthread+0x61f/0xca0
>  kthread+0x298/0x390
>  ret_from_fork+0x2c/0x40
> Object at 8802b7fedd80, in cache kmalloc-2048 size: 2048
> Allocated:
> PID = 3992
>  save_stack_trace+0x1b/0x20
>  save_stack+0x46/0xd0
>  kasan_kmalloc+0xad/0xe0
>  __kmalloc+0x134/0x220
>  scsi_host_alloc+0x6b/0x11c0
>  0xc101d94a
>  driver_probe_device+0x49e/0xc60
>  __device_attach_driver+0x1d3/0x2a0
>  bus_for_each_drv+0x11a/0x1d0
>  __device_attach+0x1e1/0x2c0
>  device_initial_probe+0x13/0x20
>  bus_probe_device+0x19b/0x240
>  device_add+0x86d/0x1450
>  device_register+0x1a/0x20
>  0xc10270ce
>  0xc1048a62
>  do_one_initcall+0xa7/0x250
>  do_init_module+0x1d0/0x55d
>  load_module+0x7c9f/0x9850
>  SYSC_finit_module+0x189/0x1c0
>  SyS_finit_module+0xe/0x10
>  entry_SYSCALL_64_fastpath+0x1a/0xa9
> Freed:
> PID = 4128
>  save_stack_trace+0x1b/0x20
>  save_stack+0x46/0xd0
>  kasan_slab_free+0x71/0xb0
>  kfree+0x8d/0x1b0
>  scsi_host_dev_release+0x2cb/0x430
>  device_release+0x76/0x1e0
>  kobject_release+0x107/0x370
>  kobject_put+0x56/0xb0
>  put_device+0x17/0x20
>  scsi_host_put+0x15/0x20
>  0xc101fcd7
>  device_release_driver_internal+0x26a/0x4e0
>  device_release_driver+0x12/0x20
>  bus_remove_device+0x2d0/0x590
>  device_del+0x55b/0x920
>  device_unregister+0x1a/0xa0
>  0xc101e0ca
>  0xc102fccc
>  SyS_delete_module+0x334/0x3e0
>  entry_SYSCALL_64_fastpath+0x1a/0xa9
> Memory state around the buggy address:
>  8802b7fede00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  8802b7fede80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >8802b7fedf00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>^
>  8802b7fedf80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  8802b7fee000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==
> 
> Reported-by: Scott Bauer 
> Fixes: e9c787e65c0c ("scsi: allocate scsi_cmnd structures as part of struct 
> request")
> Signed-off-by: Bart Van Assche 
> Cc: Scott Bauer 
> Cc: Christoph Hellwig 
> Cc: Jan Kara 
> Cc: Hannes Reinecke 
> Cc: 
> ---
>  drivers/scsi/scsi_lib.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 15c9fe766071..d698364df072 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2095,11 +2095,14 @@ static int scsi_init_rq(struct request_queue *q, 
> struct request *rq, gfp_t gfp)
>   struct Scsi_Host *shost = q->rq_alloc_data;
>   struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);
>  
> + if (!scsi_host_get(shost))
> + goto fail;
> +
>   memset(cmd, 0, sizeof(*cmd));
>  
>   cmd->sense_buffer = scsi_alloc_sense_buffer(shost, gfp, NUMA_NO_NODE);
>   if (!cmd->sense_buffer)
> - goto fail;
> + goto put;
>   cmd->req.sense = cmd->sense_buffer;
>  
>   if (scsi_host_get_prot(shost) >= SHOST_DIX_TYPE0_PROTECTION) {
> @@ -2112,6 +2115,8 @@ static int scsi_init_rq(struct request_queue *q, struct 
> request *rq, gfp_t gfp)
>  
>  fail_free_sense:
>   scsi_free_sense_buffer(shost, cmd->sense_buffer);
> +put:
> + scsi_host_put(shost);
>  fail:
>   return -ENOMEM;
>  }
> @@ -2124,6 +2129,7 @@ static void scsi_exit_rq(struct request_queue *q, 
> struct request *rq)
>   if (cmd->prot_sdb)
>   kmem_cache_free(scsi_sdb_cache, cmd->prot_sdb);
>   scsi_free_sense_buffer(shost, cmd->sense_buffer);
> + scsi_host_put(shost);
>  }
>  
>  struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)
> -- 
> 2.12.2
> 


I've applied this on-top of Jens' For-Linus and re-ran the test. I get the 
following scheduling
while atomic BUG() splat:

[   30.686851] run fstests generic/108 at 2017-05-02 16:56:42
[   30.953920] XFS (nvme1n1): Unmounting Filesystem
[   31.020543] scsi host2: scsi_debug: version 1.86 [20160430]
[   31.020543]   dev_size_mb=128, opts=0x0, submit_queues=1, statistics=0
[   31.022341] scsi 2:0:0:0: Direct-Access Linuxscsi_debug   0186 
PQ: 0 ANSI: 7
[   31.0

[PATCH v2] ibmvscsis: Do not send aborted task response

2017-05-02 Thread Bryant G. Ly
The driver is sending a response to the actual scsi op that was
aborted by an abort task TM, while LIO is sending a response to
the abort task TM.

ibmvscsis_tgt does not send the response to the client until
release_cmd time. The reason for this was because if we did it
at queue_status time, then the client would be free to reuse the
tag for that command, but we're still using the tag until the
command is released at release_cmd time, so we chose to delay
sending the response until then. That then caused this issue, because
release_cmd is always called, even if queue_status is not.

SCSI spec says that the initiator that sends the abort task
TM NEVER gets a response to the aborted op and with the current
code it will send a response. Thus this fix will remove that response
if the TAS bit is set.

Cc:  # v4.8+
Signed-off-by: Bryant G. Ly 
Reviewed-by: Tyrel Datwyler 
---
 drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 66 ++--
 1 file changed, 45 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c 
b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
index 4bb5635..d6e62ce 100644
--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -1758,33 +1758,46 @@ static void ibmvscsis_send_messages(struct scsi_info 
*vscsi)
 
if (!(vscsi->flags & RESPONSE_Q_DOWN)) {
list_for_each_entry_safe(cmd, nxt, &vscsi->waiting_rsp, list) {
-   iue = cmd->iue;
-
-   crq->valid = VALID_CMD_RESP_EL;
-   crq->format = cmd->rsp.format;
+   /*
+* If Task Abort Status Bit is set, then dont send a
+* response.
+*/
+   if (cmd->se_cmd.transport_state & CMD_T_ABORTED &&
+   !(cmd->se_cmd.transport_state & CMD_T_TAS)) {
+   list_del(&cmd->list);
+   ibmvscsis_free_cmd_resources(vscsi, cmd);
+   } else {
+   iue = cmd->iue;
 
-   if (cmd->flags & CMD_FAST_FAIL)
-   crq->status = VIOSRP_ADAPTER_FAIL;
+   crq->valid = VALID_CMD_RESP_EL;
+   crq->format = cmd->rsp.format;
 
-   crq->IU_length = cpu_to_be16(cmd->rsp.len);
+   if (cmd->flags & CMD_FAST_FAIL)
+   crq->status = VIOSRP_ADAPTER_FAIL;
 
-   rc = h_send_crq(vscsi->dma_dev->unit_address,
-   be64_to_cpu(msg_hi),
-   be64_to_cpu(cmd->rsp.tag));
+   crq->IU_length = cpu_to_be16(cmd->rsp.len);
 
-   pr_debug("send_messages: cmd %p, tag 0x%llx, rc %ld\n",
-cmd, be64_to_cpu(cmd->rsp.tag), rc);
+   rc = h_send_crq(vscsi->dma_dev->unit_address,
+   be64_to_cpu(msg_hi),
+   be64_to_cpu(cmd->rsp.tag));
 
-   /* if all ok free up the command element resources */
-   if (rc == H_SUCCESS) {
-   /* some movement has occurred */
-   vscsi->rsp_q_timer.timer_pops = 0;
-   list_del(&cmd->list);
+   pr_debug("send_messages: cmd %p, tag 0x%llx, rc 
%ld\n",
+cmd, be64_to_cpu(cmd->rsp.tag), rc);
 
-   ibmvscsis_free_cmd_resources(vscsi, cmd);
-   } else {
-   srp_snd_msg_failed(vscsi, rc);
-   break;
+   /* if all ok free up the command element
+* resources
+*/
+   if (rc == H_SUCCESS) {
+   /* some movement has occurred */
+   vscsi->rsp_q_timer.timer_pops = 0;
+   list_del(&cmd->list);
+
+   ibmvscsis_free_cmd_resources(vscsi,
+cmd);
+   } else {
+   srp_snd_msg_failed(vscsi, rc);
+   break;
+   }
}
}
 
@@ -3581,9 +3594,20 @@ static int ibmvscsis_write_pending(struct se_cmd *se_cmd)
 {
struct ibmvscsis_cmd *cmd = container_of(se_cmd, struct ibmvscsis_cmd,
 se_cmd);
+   struct scsi_info *vs

[PATCH v6 5/5] Make __scsi_remove_device go straight from BLOCKED to DEL

2017-05-02 Thread Bart Van Assche
If a device is blocked, make __scsi_remove_device() cause it to
transition to the DEL state. This means that all the commands
issued in .shutdown() will error in the mid-layer, thus making
the removal proceed without being stopped.

This patch is a slightly modified version of a patch from James
Bottomley. This patch avoids that the following lockup occurs:

Call Trace:
 schedule+0x35/0x80
 schedule_timeout+0x237/0x2d0
 io_schedule_timeout+0xa6/0x110
 wait_for_completion_io+0xa3/0x110
 blk_execute_rq+0xdf/0x120
 scsi_execute+0xce/0x150 [scsi_mod]
 scsi_execute_req_flags+0x8f/0xf0 [scsi_mod]
 sd_sync_cache+0xa9/0x190 [sd_mod]
 sd_shutdown+0x6a/0x100 [sd_mod]
 sd_remove+0x64/0xc0 [sd_mod]
 __device_release_driver+0x8d/0x120
 device_release_driver+0x1e/0x30
 bus_remove_device+0xf9/0x170
 device_del+0x127/0x240
 __scsi_remove_device+0xc1/0xd0 [scsi_mod]
 scsi_forget_host+0x57/0x60 [scsi_mod]
 scsi_remove_host+0x72/0x110 [scsi_mod]
 srp_remove_work+0x8b/0x200 [ib_srp]

Reported-by: Israel Rukshin 
Signed-off-by: Bart Van Assche 
Cc: James Bottomley 
Cc: Israel Rukshin 
Cc: Max Gurtovoy 
Cc: Hannes Reinecke 
Cc: Benjamin Block 
---
 drivers/scsi/scsi_lib.c   |  2 +-
 drivers/scsi/scsi_sysfs.c | 13 +
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index bbce1f1db515..b83dca6b495b 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2618,7 +2618,6 @@ scsi_device_set_state(struct scsi_device *sdev, enum 
scsi_device_state state)
case SDEV_QUIESCE:
case SDEV_OFFLINE:
case SDEV_TRANSPORT_OFFLINE:
-   case SDEV_BLOCK:
break;
default:
goto illegal;
@@ -2632,6 +2631,7 @@ scsi_device_set_state(struct scsi_device *sdev, enum 
scsi_device_state state)
case SDEV_OFFLINE:
case SDEV_TRANSPORT_OFFLINE:
case SDEV_CANCEL:
+   case SDEV_BLOCK:
case SDEV_CREATED_BLOCK:
break;
default:
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index a91537a3abbf..1f243ac16010 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1290,7 +1290,20 @@ void __scsi_remove_device(struct scsi_device *sdev)
 * wait until it has finished before changing the device state.
 */
mutex_lock(&sdev->state_mutex);
+   /*
+* If blocked, we go straight to DEL and restart the queue so
+* any commands issued during driver shutdown (like sync
+* cache) are errored immediately.
+*/
res = scsi_device_set_state(sdev, SDEV_CANCEL);
+   if (res != 0) {
+   res = scsi_device_set_state(sdev, SDEV_DEL);
+   if (res == 0) {
+   scsi_start_queue(sdev);
+   sdev_printk(KERN_DEBUG, sdev,
+   "Changed state from BLOCKED to DEL\n");
+   }
+   }
mutex_unlock(&sdev->state_mutex);
 
if (res != 0)
-- 
2.12.2



[PATCH v6 4/5] Introduce scsi_start_queue()

2017-05-02 Thread Bart Van Assche
This patch does not change any functionality.

Signed-off-by: Bart Van Assche 
Cc: Israel Rukshin 
Cc: Max Gurtovoy 
Cc: Hannes Reinecke 
Cc: Benjamin Block 
---
 drivers/scsi/scsi_lib.c  | 25 +++--
 drivers/scsi/scsi_priv.h |  1 +
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d2854558437d..bbce1f1db515 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -3024,6 +3024,20 @@ static int scsi_internal_device_block(struct scsi_device 
*sdev)
return err;
 }
  
+void scsi_start_queue(struct scsi_device *sdev)
+{
+   struct request_queue *q = sdev->request_queue;
+   unsigned long flags;
+
+   if (q->mq_ops) {
+   blk_mq_start_stopped_hw_queues(q, false);
+   } else {
+   spin_lock_irqsave(q->queue_lock, flags);
+   blk_start_queue(q);
+   spin_unlock_irqrestore(q->queue_lock, flags);
+   }
+}
+
 /**
  * scsi_internal_device_unblock_nowait - resume a device after a block request
  * @sdev:  device to resume
@@ -3042,9 +3056,6 @@ static int scsi_internal_device_block(struct scsi_device 
*sdev)
 int scsi_internal_device_unblock_nowait(struct scsi_device *sdev,
enum scsi_device_state new_state)
 {
-   struct request_queue *q = sdev->request_queue; 
-   unsigned long flags;
-
/*
 * Try to transition the scsi device to SDEV_RUNNING or one of the
 * offlined states and goose the device queue if successful.
@@ -3062,13 +3073,7 @@ int scsi_internal_device_unblock_nowait(struct 
scsi_device *sdev,
 sdev->sdev_state != SDEV_OFFLINE)
return -EINVAL;
 
-   if (q->mq_ops) {
-   blk_mq_start_stopped_hw_queues(q, false);
-   } else {
-   spin_lock_irqsave(q->queue_lock, flags);
-   blk_start_queue(q);
-   spin_unlock_irqrestore(q->queue_lock, flags);
-   }
+   scsi_start_queue(sdev);
 
return 0;
 }
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index f11bd102d6d5..c7629e31a75b 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -89,6 +89,7 @@ extern void scsi_run_host_queues(struct Scsi_Host *shost);
 extern void scsi_requeue_run_queue(struct work_struct *work);
 extern struct request_queue *scsi_alloc_queue(struct scsi_device *sdev);
 extern struct request_queue *scsi_mq_alloc_queue(struct scsi_device *sdev);
+extern void scsi_start_queue(struct scsi_device *sdev);
 extern int scsi_mq_setup_tags(struct Scsi_Host *shost);
 extern void scsi_mq_destroy_tags(struct Scsi_Host *shost);
 extern int scsi_init_queue(void);
-- 
2.12.2



[PATCH v6 2/5] Create two versions of scsi_internal_device_unblock()

2017-05-02 Thread Bart Van Assche
This will make it easier to serialize SCSI device state changes
through a mutex.

Signed-off-by: Bart Van Assche 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Johannes Thumshirn 
Cc: Sreekanth Reddy 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |  4 ++--
 drivers/scsi/scsi_lib.c  | 46 +---
 include/scsi/scsi_device.h   |  4 ++--
 3 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 25e89cfe4417..d671f6e6062c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -2883,7 +2883,7 @@ _scsih_internal_device_unblock(struct scsi_device *sdev,
sdev_printk(KERN_WARNING, sdev, "device_unblock and setting to running, 
"
"handle(0x%04x)\n", sas_device_priv_data->sas_target->handle);
sas_device_priv_data->block = 0;
-   r = scsi_internal_device_unblock(sdev, SDEV_RUNNING);
+   r = scsi_internal_device_unblock_nowait(sdev, SDEV_RUNNING);
if (r == -EINVAL) {
/* The device has been set to SDEV_RUNNING by SD layer during
 * device addition but the request queue is still stopped by
@@ -2902,7 +2902,7 @@ _scsih_internal_device_unblock(struct scsi_device *sdev,
r, sas_device_priv_data->sas_target->handle);
 
sas_device_priv_data->block = 0;
-   r = scsi_internal_device_unblock(sdev, SDEV_RUNNING);
+   r = scsi_internal_device_unblock_nowait(sdev, SDEV_RUNNING);
if (r)
sdev_printk(KERN_WARNING, sdev, "retried device_unblock"
" failed with return(%d) for handle(0x%04x)\n",
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ca20d3702b45..79bb05fa09d5 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -3016,24 +3016,22 @@ static int scsi_internal_device_block(struct 
scsi_device *sdev)
 }
  
 /**
- * scsi_internal_device_unblock - resume a device after a block request
+ * scsi_internal_device_unblock_nowait - resume a device after a block request
  * @sdev:  device to resume
- * @new_state: state to set devices to after unblocking
+ * @new_state: state to set the device to after unblocking
  *
- * Called by scsi lld's or the midlayer to restart the device queue
- * for the previously suspended scsi device.  Called from interrupt or
- * normal process context.
+ * Restart the device queue for a previously suspended SCSI device. Does not
+ * sleep.
  *
- * Returns zero if successful or error if not.
+ * Returns zero if successful or a negative error code upon failure.
  *
- * Notes:   
- * This routine transitions the device to the SDEV_RUNNING state
- * or to one of the offline states (which must be a legal transition)
- * allowing the midlayer to goose the queue for this device.
+ * Notes:
+ * This routine transitions the device to the SDEV_RUNNING state or to one of
+ * the offline states (which must be a legal transition) allowing the midlayer
+ * to goose the queue for this device.
  */
-int
-scsi_internal_device_unblock(struct scsi_device *sdev,
-enum scsi_device_state new_state)
+int scsi_internal_device_unblock_nowait(struct scsi_device *sdev,
+   enum scsi_device_state new_state)
 {
struct request_queue *q = sdev->request_queue; 
unsigned long flags;
@@ -3065,7 +3063,27 @@ scsi_internal_device_unblock(struct scsi_device *sdev,
 
return 0;
 }
-EXPORT_SYMBOL_GPL(scsi_internal_device_unblock);
+EXPORT_SYMBOL_GPL(scsi_internal_device_unblock_nowait);
+
+/**
+ * scsi_internal_device_unblock - resume a device after a block request
+ * @sdev:  device to resume
+ * @new_state: state to set the device to after unblocking
+ *
+ * Restart the device queue for a previously suspended SCSI device. May sleep.
+ *
+ * Returns zero if successful or a negative error code upon failure.
+ *
+ * Notes:
+ * This routine transitions the device to the SDEV_RUNNING state or to one of
+ * the offline states (which must be a legal transition) allowing the midlayer
+ * to goose the queue for this device.
+ */
+static int scsi_internal_device_unblock(struct scsi_device *sdev,
+   enum scsi_device_state new_state)
+{
+   return scsi_internal_device_unblock_nowait(sdev, new_state);
+}
 
 static void
 device_block(struct scsi_device *sdev, void *data)
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index e2f43ae3e264..bb784045ba71 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -473,8 +473,8 @@ static inline int scsi_device_created(struct scsi_device 
*sdev)
 }
 
 int scsi_internal_device_block_nowait(struct scsi_device *sdev);
-int scsi_internal_device_unblock(struct scsi_device *sdev,
-enum scs

[PATCH v6 1/5] Split scsi_internal_device_block()

2017-05-02 Thread Bart Van Assche
Instead of passing a "wait" argument to scsi_internal_device_block(),
split this function into a function that waits and a function that
doesn't wait. This will make it easier to serialize SCSI device state
changes through a mutex.

Signed-off-by: Bart Van Assche 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Johannes Thumshirn 
Cc: Sreekanth Reddy 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |  4 +-
 drivers/scsi/scsi_lib.c  | 73 +++-
 include/scsi/scsi_device.h   |  2 +-
 3 files changed, 50 insertions(+), 29 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 919ba2bb15f1..25e89cfe4417 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -2859,7 +2859,7 @@ _scsih_internal_device_block(struct scsi_device *sdev,
sas_device_priv_data->sas_target->handle);
sas_device_priv_data->block = 1;
 
-   r = scsi_internal_device_block(sdev, false);
+   r = scsi_internal_device_block_nowait(sdev);
if (r == -EINVAL)
sdev_printk(KERN_WARNING, sdev,
"device_block failed with return(%d) for handle(0x%04x)\n",
@@ -2895,7 +2895,7 @@ _scsih_internal_device_unblock(struct scsi_device *sdev,
"performing a block followed by an unblock\n",
r, sas_device_priv_data->sas_target->handle);
sas_device_priv_data->block = 1;
-   r = scsi_internal_device_block(sdev, false);
+   r = scsi_internal_device_block_nowait(sdev);
if (r)
sdev_printk(KERN_WARNING, sdev, "retried device_block "
"failed with return(%d) for handle(0x%04x)\n",
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b7b340c494ab..ca20d3702b45 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2937,28 +2937,20 @@ scsi_target_resume(struct scsi_target *starget)
 EXPORT_SYMBOL(scsi_target_resume);
 
 /**
- * scsi_internal_device_block - internal function to put a device temporarily 
into the SDEV_BLOCK state
- * @sdev:  device to block
- * @wait:  Whether or not to wait until ongoing .queuecommand() /
- * .queue_rq() calls have finished.
+ * scsi_internal_device_block_nowait - try to transition to the SDEV_BLOCK 
state
+ * @sdev: device to block
  *
- * Block request made by scsi lld's to temporarily stop all
- * scsi commands on the specified device. May sleep.
+ * Pause SCSI command processing on the specified device. Does not sleep.
  *
- * Returns zero if successful or error if not
+ * Returns zero if successful or a negative error code upon failure.
  *
- * Notes:   
- * This routine transitions the device to the SDEV_BLOCK state
- * (which must be a legal transition).  When the device is in this
- * state, all commands are deferred until the scsi lld reenables
- * the device with scsi_device_unblock or device_block_tmo fires.
- *
- * To do: avoid that scsi_send_eh_cmnd() calls queuecommand() after
- * scsi_internal_device_block() has blocked a SCSI device and also
- * remove the rport mutex lock and unlock calls from srp_queuecommand().
+ * Notes:
+ * This routine transitions the device to the SDEV_BLOCK state (which must be
+ * a legal transition). When the device is in this state, command processing
+ * is paused until the device leaves the SDEV_BLOCK state. See also
+ * scsi_internal_device_unblock_nowait().
  */
-int
-scsi_internal_device_block(struct scsi_device *sdev, bool wait)
+int scsi_internal_device_block_nowait(struct scsi_device *sdev)
 {
struct request_queue *q = sdev->request_queue;
unsigned long flags;
@@ -2978,21 +2970,50 @@ scsi_internal_device_block(struct scsi_device *sdev, 
bool wait)
 * request queue. 
 */
if (q->mq_ops) {
-   if (wait)
-   blk_mq_quiesce_queue(q);
-   else
-   blk_mq_stop_hw_queues(q);
+   blk_mq_stop_hw_queues(q);
} else {
spin_lock_irqsave(q->queue_lock, flags);
blk_stop_queue(q);
spin_unlock_irqrestore(q->queue_lock, flags);
-   if (wait)
-   scsi_wait_for_queuecommand(sdev);
}
 
return 0;
 }
-EXPORT_SYMBOL_GPL(scsi_internal_device_block);
+EXPORT_SYMBOL_GPL(scsi_internal_device_block_nowait);
+
+/**
+ * scsi_internal_device_block - try to transition to the SDEV_BLOCK state
+ * @sdev: device to block
+ *
+ * Pause SCSI command processing on the specified device and wait until all
+ * ongoing scsi_request_fn() / scsi_queue_rq() calls have finished. May sleep.
+ *
+ * Returns zero if successful or a negative error code upon failure.
+ *
+ * Note:
+ * This routine transitions the device to the SDEV_BLOCK state (which must be
+ * a legal transition). When the device is in th

[PATCH v6 0/5] Avoid that __scsi_remove_device() hangs

2017-05-02 Thread Bart Van Assche
__scsi_remove_device() hangs if it is waiting for the SYNCHRONIZE CACHE
command submitted by the sd driver to finish if the block layer queue is
stopped and does not get restarted. This patch series avoids that that
hang occurs.

Changes compared to v5:
- Reworked the approach. Instead of submitting the SYNCHRONIZE CACHE command
  asynchronously, make __scsi_remove_device() go straight from BLOCKED to DEL.

Changes compared to v4:
- Fixed the deadlock reported by the 0-day bot by changing a blk_put_request()
  call into __blk_put_request().

Changes compared to v3:
- Removed the logging statements from sd_sync_cache_done() that triggered
  a crash due to a use-after-free.

Changes compared to v2:
- Moved the "stop_disk" assignment after the sdkp check in the sd driver.
- Added a completion function for asynchronous SYNCHRONIZE CACHE commands.
- Added "disk" and "done" arguments to scsi_execute_async().

Changes compared to v1:
- Reworked the approach of this patch series.

Bart Van Assche (5):
  Split scsi_internal_device_block()
  Create two versions of scsi_internal_device_unblock()
  Protect SCSI device state changes with a mutex
  Introduce scsi_start_queue()
  Make __scsi_remove_device go straight from BLOCKED to DEL

 drivers/scsi/mpt3sas/mpt3sas_scsih.c |   8 +-
 drivers/scsi/scsi_error.c|   8 +-
 drivers/scsi/scsi_lib.c  | 171 +++
 drivers/scsi/scsi_priv.h |   1 +
 drivers/scsi/scsi_scan.c |  16 ++--
 drivers/scsi/scsi_sysfs.c|  37 +++-
 drivers/scsi/scsi_transport_srp.c|   7 +-
 drivers/scsi/sd.c|   7 +-
 include/scsi/scsi_device.h   |   7 +-
 9 files changed, 181 insertions(+), 81 deletions(-)

-- 
2.12.2



[PATCH v6 3/5] Protect SCSI device state changes with a mutex

2017-05-02 Thread Bart Van Assche
Enable this mechanism for all scsi_target_*block() callers but not
for the scsi_internal_device_unblock() calls from the mpt3sas driver
because that driver can call scsi_internal_device_unblock() from
atomic context.

Signed-off-by: Bart Van Assche 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Johannes Thumshirn 
---
 drivers/scsi/scsi_error.c |  8 +++-
 drivers/scsi/scsi_lib.c   | 27 +--
 drivers/scsi/scsi_scan.c  | 16 +---
 drivers/scsi/scsi_sysfs.c | 24 +++-
 drivers/scsi/scsi_transport_srp.c |  7 ---
 drivers/scsi/sd.c |  7 +--
 include/scsi/scsi_device.h|  1 +
 7 files changed, 66 insertions(+), 24 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index f2cafae150bc..02f5f7f49885 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1696,11 +1696,17 @@ static void scsi_eh_offline_sdevs(struct list_head 
*work_q,
  struct list_head *done_q)
 {
struct scsi_cmnd *scmd, *next;
+   struct scsi_device *sdev;
 
list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
sdev_printk(KERN_INFO, scmd->device, "Device offlined - "
"not ready after error recovery\n");
-   scsi_device_set_state(scmd->device, SDEV_OFFLINE);
+   sdev = scmd->device;
+
+   mutex_lock(&sdev->state_mutex);
+   scsi_device_set_state(sdev, SDEV_OFFLINE);
+   mutex_unlock(&sdev->state_mutex);
+
if (scmd->eh_eflags & SCSI_EH_CANCEL_CMD) {
/*
 * FIXME: Handle lost cmds.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 79bb05fa09d5..d2854558437d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2875,7 +2875,12 @@ static void scsi_wait_for_queuecommand(struct 
scsi_device *sdev)
 int
 scsi_device_quiesce(struct scsi_device *sdev)
 {
-   int err = scsi_device_set_state(sdev, SDEV_QUIESCE);
+   int err;
+
+   mutex_lock(&sdev->state_mutex);
+   err = scsi_device_set_state(sdev, SDEV_QUIESCE);
+   mutex_unlock(&sdev->state_mutex);
+
if (err)
return err;
 
@@ -2903,10 +2908,11 @@ void scsi_device_resume(struct scsi_device *sdev)
 * so assume the state is being managed elsewhere (for example
 * device deleted during suspend)
 */
-   if (sdev->sdev_state != SDEV_QUIESCE ||
-   scsi_device_set_state(sdev, SDEV_RUNNING))
-   return;
-   scsi_run_queue(sdev->request_queue);
+   mutex_lock(&sdev->state_mutex);
+   if (sdev->sdev_state == SDEV_QUIESCE &&
+   scsi_device_set_state(sdev, SDEV_RUNNING) == 0)
+   scsi_run_queue(sdev->request_queue);
+   mutex_unlock(&sdev->state_mutex);
 }
 EXPORT_SYMBOL(scsi_device_resume);
 
@@ -3005,6 +3011,7 @@ static int scsi_internal_device_block(struct scsi_device 
*sdev)
struct request_queue *q = sdev->request_queue;
int err;
 
+   mutex_lock(&sdev->state_mutex);
err = scsi_internal_device_block_nowait(sdev);
if (err == 0) {
if (q->mq_ops)
@@ -3012,6 +3019,8 @@ static int scsi_internal_device_block(struct scsi_device 
*sdev)
else
scsi_wait_for_queuecommand(sdev);
}
+   mutex_unlock(&sdev->state_mutex);
+
return err;
 }
  
@@ -3082,7 +3091,13 @@ EXPORT_SYMBOL_GPL(scsi_internal_device_unblock_nowait);
 static int scsi_internal_device_unblock(struct scsi_device *sdev,
enum scsi_device_state new_state)
 {
-   return scsi_internal_device_unblock_nowait(sdev, new_state);
+   int ret;
+
+   mutex_lock(&sdev->state_mutex);
+   ret = scsi_internal_device_unblock_nowait(sdev, new_state);
+   mutex_unlock(&sdev->state_mutex);
+
+   return ret;
 }
 
 static void
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 6f7128f49c30..e6de4eee97a3 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -231,6 +231,7 @@ static struct scsi_device *scsi_alloc_sdev(struct 
scsi_target *starget,
sdev->id = starget->id;
sdev->lun = lun;
sdev->channel = starget->channel;
+   mutex_init(&sdev->state_mutex);
sdev->sdev_state = SDEV_CREATED;
INIT_LIST_HEAD(&sdev->siblings);
INIT_LIST_HEAD(&sdev->same_target_siblings);
@@ -943,16 +944,17 @@ static int scsi_add_lun(struct scsi_device *sdev, 
unsigned char *inq_result,
 
/* set the device running here so that slave configure
 * may do I/O */
+   mutex_lock(&sdev->state_mutex);
ret = scsi_device_set_state(sdev, SDEV_RUNNING);
-   if (ret) {
+   if (ret)
ret = scsi_device_set_state(sdev, SDEV_BLOCK);
+   mutex_unlock(&sdev->state_mut

[PATCH] scsi_lib: Add #include

2017-05-02 Thread Bart Van Assche
This patch avoids that when building with W=1 the compiler
complains that __scsi_init_queue() has not been declared.
See also commit d48777a633d6 ("scsi: remove __scsi_alloc_queue").

Signed-off-by: Bart Van Assche 
Cc: Christoph Hellwig 
Cc: Hannes Reinecke 
Cc: Johannes Thumshirn 
---
 drivers/scsi/scsi_lib.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index d698364df072..b7b340c494ab 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include  /* __scsi_init_queue() */
 #include 
 
 #include 
-- 
2.12.2



[PATCH] Avoid that scsi_exit_rq() triggers a use-after-free

2017-05-02 Thread Bart Van Assche
This patch fixes the following KASAN complaint:

==
BUG: KASAN: use-after-free in scsi_exit_rq+0xf3/0x120 at addr 8802b7fedf00
Read of size 1 by task rcuos/5/53
CPU: 7 PID: 53 Comm: rcuos/6 Not tainted 4.11.0-rc5+ #13
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0 
ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
 dump_stack+0x63/0x8f
 kasan_object_err+0x21/0x70
 kasan_report.part.1+0x231/0x500
 __asan_report_load1_noabort+0x2e/0x30
 scsi_exit_rq+0xf3/0x120
 free_request_size+0x44/0x60
 mempool_destroy.part.6+0x9b/0x150
 mempool_destroy+0x13/0x20
 blk_exit_rl+0x36/0x40
 blkg_free+0x146/0x200
 __blkg_release_rcu+0x121/0x220
 rcu_nocb_kthread+0x61f/0xca0
 kthread+0x298/0x390
 ret_from_fork+0x2c/0x40
Object at 8802b7fedd80, in cache kmalloc-2048 size: 2048
Allocated:
PID = 3992
 save_stack_trace+0x1b/0x20
 save_stack+0x46/0xd0
 kasan_kmalloc+0xad/0xe0
 __kmalloc+0x134/0x220
 scsi_host_alloc+0x6b/0x11c0
 0xc101d94a
 driver_probe_device+0x49e/0xc60
 __device_attach_driver+0x1d3/0x2a0
 bus_for_each_drv+0x11a/0x1d0
 __device_attach+0x1e1/0x2c0
 device_initial_probe+0x13/0x20
 bus_probe_device+0x19b/0x240
 device_add+0x86d/0x1450
 device_register+0x1a/0x20
 0xc10270ce
 0xc1048a62
 do_one_initcall+0xa7/0x250
 do_init_module+0x1d0/0x55d
 load_module+0x7c9f/0x9850
 SYSC_finit_module+0x189/0x1c0
 SyS_finit_module+0xe/0x10
 entry_SYSCALL_64_fastpath+0x1a/0xa9
Freed:
PID = 4128
 save_stack_trace+0x1b/0x20
 save_stack+0x46/0xd0
 kasan_slab_free+0x71/0xb0
 kfree+0x8d/0x1b0
 scsi_host_dev_release+0x2cb/0x430
 device_release+0x76/0x1e0
 kobject_release+0x107/0x370
 kobject_put+0x56/0xb0
 put_device+0x17/0x20
 scsi_host_put+0x15/0x20
 0xc101fcd7
 device_release_driver_internal+0x26a/0x4e0
 device_release_driver+0x12/0x20
 bus_remove_device+0x2d0/0x590
 device_del+0x55b/0x920
 device_unregister+0x1a/0xa0
 0xc101e0ca
 0xc102fccc
 SyS_delete_module+0x334/0x3e0
 entry_SYSCALL_64_fastpath+0x1a/0xa9
Memory state around the buggy address:
 8802b7fede00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 8802b7fede80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>8802b7fedf00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
   ^
 8802b7fedf80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 8802b7fee000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==

Reported-by: Scott Bauer 
Fixes: e9c787e65c0c ("scsi: allocate scsi_cmnd structures as part of struct 
request")
Signed-off-by: Bart Van Assche 
Cc: Scott Bauer 
Cc: Christoph Hellwig 
Cc: Jan Kara 
Cc: Hannes Reinecke 
Cc: 
---
 drivers/scsi/scsi_lib.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 15c9fe766071..d698364df072 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2095,11 +2095,14 @@ static int scsi_init_rq(struct request_queue *q, struct 
request *rq, gfp_t gfp)
struct Scsi_Host *shost = q->rq_alloc_data;
struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);
 
+   if (!scsi_host_get(shost))
+   goto fail;
+
memset(cmd, 0, sizeof(*cmd));
 
cmd->sense_buffer = scsi_alloc_sense_buffer(shost, gfp, NUMA_NO_NODE);
if (!cmd->sense_buffer)
-   goto fail;
+   goto put;
cmd->req.sense = cmd->sense_buffer;
 
if (scsi_host_get_prot(shost) >= SHOST_DIX_TYPE0_PROTECTION) {
@@ -2112,6 +2115,8 @@ static int scsi_init_rq(struct request_queue *q, struct 
request *rq, gfp_t gfp)
 
 fail_free_sense:
scsi_free_sense_buffer(shost, cmd->sense_buffer);
+put:
+   scsi_host_put(shost);
 fail:
return -ENOMEM;
 }
@@ -2124,6 +2129,7 @@ static void scsi_exit_rq(struct request_queue *q, struct 
request *rq)
if (cmd->prot_sdb)
kmem_cache_free(scsi_sdb_cache, cmd->prot_sdb);
scsi_free_sense_buffer(shost, cmd->sense_buffer);
+   scsi_host_put(shost);
 }
 
 struct request_queue *scsi_alloc_queue(struct scsi_device *sdev)
-- 
2.12.2



Re: BUG: KASAN: use-after-free in scsi_exit_rq

2017-05-02 Thread Bart Van Assche
On Tue, 2017-05-02 at 16:41 +0200, Jan Kara wrote:
> So I'm also not aware of any particular breakage this would cause. However
> logically the freeing of request mempools really belongs to
> blk_release_queue() so it seems a bit dumb to move blk_exit_rl() just
> because SCSI stores the fact from which slab cache it has allocated the
> sense buffer in a structure (shost) that it frees under its hands by the
> time blk_release_queue() is called. :-|

Hello Jan,

My concern when I wrote my previous e-mail was that I didn't want to add a
scsi_host_get() / scsi_host_put() pair to the hot path in the SCSI core. But
I just realized that scsi_init_rq() and scsi_exit_rq() are not in the hot
path so adding a scsi_host_get() / scsi_host_put() pair should work fine. I
will post a patch.

Bart.


Re: BUG: KASAN: use-after-free in scsi_exit_rq

2017-05-02 Thread Jan Kara
On Fri 28-04-17 17:46:47, Tejun Heo wrote:
> On Fri, Apr 21, 2017 at 09:49:17PM +, Bart Van Assche wrote:
> > On Thu, 2017-04-20 at 15:18 -0600, Scott Bauer wrote:
> > > [  642.638860] BUG: KASAN: use-after-free in scsi_exit_rq+0xf3/0x120 at 
> > > addr 8802b7fedf00
> > > [  642.639362] Read of size 1 by task rcuos/5/53
> > > [  642.639713] CPU: 7 PID: 53 Comm: rcuos/6 Not tainted 4.11.0-rc5+ #13
> > > [  642.640170] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
> > > BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 
> > > 04/01/2014
> > > [  642.640923] Call Trace:
> > > [  642.641080]  dump_stack+0x63/0x8f
> > > [  642.641289]  kasan_object_err+0x21/0x70
> > > [  642.641531]  kasan_report.part.1+0x231/0x500
> > > [  642.641823]  ? scsi_exit_rq+0xf3/0x120
> > > [  642.642054]  ? _raw_spin_unlock_irqrestore+0xe/0x10
> > > [  642.642353]  ? free_percpu+0x1b7/0x340
> > > [  642.642586]  ? put_task_stack+0x117/0x2b0
> > > [  642.642837]  __asan_report_load1_noabort+0x2e/0x30
> > > [  642.643138]  scsi_exit_rq+0xf3/0x120
> > > [  642.643366]  free_request_size+0x44/0x60
> > > [  642.643614]  mempool_destroy.part.6+0x9b/0x150
> > > [  642.643899]  ? kasan_slab_free+0x87/0xb0
> > > [  642.644152]  mempool_destroy+0x13/0x20
> > > [  642.644394]  blk_exit_rl+0x36/0x40
> > > [  642.644614]  blkg_free+0x146/0x200
> > > [  642.644836]  __blkg_release_rcu+0x121/0x220
> > > [  642.645112]  rcu_nocb_kthread+0x61f/0xca0
> > > [  642.645376]  ? get_state_synchronize_rcu+0x20/0x20
> > > [  642.645690]  ? pci_mmcfg_check_reserved+0x110/0x110
> > > [  642.646011]  kthread+0x298/0x390
> > > [  642.646224]  ? get_state_synchronize_rcu+0x20/0x20
> > > [  642.646535]  ? kthread_park+0x160/0x160
> > > [  642.646787]  ret_from_fork+0x2c/0x40
> > 
> > I'm not familiar with cgroups but seeing this makes me wonder whether it 
> > would
> > be possible to move the blk_exit_rl() calls from blk_release_queue() into
> > blk_cleanup_queue()? The SCSI core frees a SCSI host after 
> > blk_cleanup_queue()
> > has finished for all associated SCSI devices. This is why I think that 
> > calling
> > blk_exit_rl() earlier would be sufficient to avoid that scsi_exit_rq()
> > dereferences a SCSI host pointer after it has been freed.
> 
> Hmm... I see.  Didn't know request put path involved derefing those
> structs.  The blk_exit_rl() call just replaced open coded
> mempool_destroy call, so the destruction order was always like this.
> We shouldn't have any request in flight by cleanup, so moving it there
> should be fine.

So I'm also not aware of any particular breakage this would cause. However
logically the freeing of request mempools really belongs to
blk_release_queue() so it seems a bit dumb to move blk_exit_rl() just
because SCSI stores the fact from which slab cache it has allocated the
sense buffer in a structure (shost) that it frees under its hands by the
time blk_release_queue() is called. :-|

Honza
-- 
Jan Kara 
SUSE Labs, CR


I NEED YOUR URGENT HELP AND CORPERATION

2017-05-02 Thread IBRAHIM KABORE
Dear Friend,

I am contacting you on a business deal of  $9,500,000.00 Million United States 
Dollars, ready for transfer into your own personal account and if we
make this claim, we will  share it on the ratio of 50% / 50% basis, I would 
like to assure you that it be 100%  risk free and it will be legally backed 
up with government approval.  Once you are  interested to transact this  
business with me, kindly give me your  consent response immediately.

Hoping to hear from you.

My regards,
Mr Ibrahim Kabore
EMAIL,kabreibrahim...@gmail.com


Re: [PATCH, RFC] MAINTAINERS: update OSD entries

2017-05-02 Thread Jeff Layton
On Tue, 2017-05-02 at 09:57 +0200, Christoph Hellwig wrote:
> The open-osd domain doesn't exist anymore, and mails to the list lead
> to really annoying bounced that repeat every day.
> 
> Also the primarydata address for Benny bounces, and while I have a new
> one for him he doesn't seem to be maintaining the OSD code any more.
> 
> Which beggs the question:  should we really leave the Supported status
> in MAINTAINERS given that the code is barely maintained?
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  MAINTAINERS | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1bb06c5f7716..28dd83a1d9e2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -9418,10 +9418,6 @@ F: drivers/net/wireless/intersil/orinoco/
>  
>  OSD LIBRARY and FILESYSTEM
>  M:   Boaz Harrosh 
> -M:   Benny Halevy 
> -L:   osd-...@open-osd.org
> -W:   http://open-osd.org
> -T:   git git://git.open-osd.org/open-osd.git
>  S:   Maintained
>  F:   drivers/scsi/osd/
>  F:   include/scsi/osd_*

Hah, you beat me to it! I was going to spin up a patch for this today.

Acked-by: Jeff Layton 


答复: [PATCH v6 0/2] tcmu: Dynamic growing data area support

2017-05-02 Thread lixiubo
> > Xiubo Li (2):
> >   tcmu: Add dynamic growing data area feature support
> >   tcmu: Add global data block pool support
> >
> >  drivers/target/target_core_user.c | 598
> > ++
> >  1 file changed, 469 insertions(+), 129 deletions(-)
> >
> 
> So based upon the feedback from MNC, this looks OK to merge.
> 
> It looks like there is one bug as reported by MNC introduced by patch 
> #1.
> 

It's actually one enhancement.

> Please go ahead and post an incremental patch to address this at your 
> earliest convenience.
> 

And I will address this later.

Thank,

BRs
Xiubo





[PATCH, RFC] MAINTAINERS: update OSD entries

2017-05-02 Thread Christoph Hellwig
The open-osd domain doesn't exist anymore, and mails to the list lead
to really annoying bounced that repeat every day.

Also the primarydata address for Benny bounces, and while I have a new
one for him he doesn't seem to be maintaining the OSD code any more.

Which beggs the question:  should we really leave the Supported status
in MAINTAINERS given that the code is barely maintained?

Signed-off-by: Christoph Hellwig 
---
 MAINTAINERS | 4 
 1 file changed, 4 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 1bb06c5f7716..28dd83a1d9e2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9418,10 +9418,6 @@ F:   drivers/net/wireless/intersil/orinoco/
 
 OSD LIBRARY and FILESYSTEM
 M: Boaz Harrosh 
-M: Benny Halevy 
-L: osd-...@open-osd.org
-W: http://open-osd.org
-T: git git://git.open-osd.org/open-osd.git
 S: Maintained
 F: drivers/scsi/osd/
 F: include/scsi/osd_*
-- 
2.11.0



[PATCH] tcmu: Recalculate the tcmu_cmd size to save cmd area memories

2017-05-02 Thread lixiubo
From: Xiubo Li 

For the "struct tcmu_cmd_entry" in cmd area, the minimum size
will be sizeof(struct tcmu_cmd_entry) == 112 Bytes. And it could
fill about (sizeof(struct rsp) - sizeof(struct req)) /
sizeof(struct iovec) == 68 / 16 ~= 4 data regions(iov[4]) by
default.

For most tcmu_cmds, the data block indexes allocated from the
data area will be continuous. And for the continuous blocks they
will be merged into the same region using only one iovec. For
the current code, it will always allocates the same number of
iovecs with blocks for each tcmu_cmd, and it will wastes much
memories.

For example, when the block size is 4K and the DATA_OUT buffer
size is 64K, and the regions needed is less than 5(on my
environment is almost 99.7%). The current code will allocate
about 16 iovecs, and there will be (16 - 4) * sizeof(struct
iovec) = 192 Bytes cmd area memories wasted.

Here adds two helpers to calculate the base size and full size
of the tcmu_cmd. And will recalculate them again when it make sure
how many iovs is needed before insert it to cmd area.

Signed-off-by: Xiubo Li 
---
 drivers/target/target_core_user.c | 52 ++-
 1 file changed, 41 insertions(+), 11 deletions(-)

diff --git a/drivers/target/target_core_user.c 
b/drivers/target/target_core_user.c
index 0b29e4f..89b75ce 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -602,6 +602,27 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, 
struct tcmu_cmd *cmd,
return true;
 }
 
+static inline size_t tcmu_cmd_get_base_cmd_size(size_t iov_cnt)
+{
+   return max(offsetof(struct tcmu_cmd_entry, req.iov[iov_cnt]),
+   sizeof(struct tcmu_cmd_entry));
+}
+
+static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd *tcmu_cmd,
+  size_t base_command_size)
+{
+   struct se_cmd *se_cmd = tcmu_cmd->se_cmd;
+   size_t command_size;
+
+   command_size = base_command_size +
+   round_up(scsi_command_size(se_cmd->t_task_cdb),
+   TCMU_OP_ALIGN_SIZE);
+
+   WARN_ON(command_size & (TCMU_OP_ALIGN_SIZE-1));
+
+   return command_size;
+}
+
 static sense_reason_t
 tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
 {
@@ -624,16 +645,16 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, 
struct tcmu_cmd *cmd,
 * Must be a certain minimum size for response sense info, but
 * also may be larger if the iov array is large.
 *
-* We prepare way too many iovs for potential uses here, because it's
-* expensive to tell how many regions are freed in the bitmap
-   */
-   base_command_size = max(offsetof(struct tcmu_cmd_entry,
-   req.iov[tcmu_cmd_get_block_cnt(tcmu_cmd)]),
-   sizeof(struct tcmu_cmd_entry));
-   command_size = base_command_size
-   + round_up(scsi_command_size(se_cmd->t_task_cdb), 
TCMU_OP_ALIGN_SIZE);
-
-   WARN_ON(command_size & (TCMU_OP_ALIGN_SIZE-1));
+* We prepare as many iovs as possbile for potential uses here,
+* because it's expensive to tell how many regions are freed in
+* the bitmap & global data pool, as the size calculated here
+* will only be used to do the checks.
+*
+* The size will be recalculated later as actually needed to save
+* cmd area memories.
+*/
+   base_command_size = tcmu_cmd_get_base_cmd_size(tcmu_cmd->dbi_cnt);
+   command_size = tcmu_cmd_get_cmd_size(tcmu_cmd, base_command_size);
 
mutex_lock(&udev->cmdr_lock);
 
@@ -694,7 +715,6 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, 
struct tcmu_cmd *cmd,
entry = (void *) mb + CMDR_OFF + cmd_head;
tcmu_flush_dcache_range(entry, sizeof(*entry));
tcmu_hdr_set_op(&entry->hdr.len_op, TCMU_OP_CMD);
-   tcmu_hdr_set_len(&entry->hdr.len_op, command_size);
entry->hdr.cmd_id = tcmu_cmd->cmd_id;
entry->hdr.kflags = 0;
entry->hdr.uflags = 0;
@@ -736,6 +756,16 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, 
struct tcmu_cmd *cmd,
entry->req.iov_bidi_cnt = iov_cnt;
}
 
+   /*
+* Recalaulate the command's base size and size according
+* to the actual needs
+*/
+   base_command_size = tcmu_cmd_get_base_cmd_size(entry->req.iov_cnt +
+  entry->req.iov_bidi_cnt);
+   command_size = tcmu_cmd_get_cmd_size(tcmu_cmd, base_command_size);
+
+   tcmu_hdr_set_len(&entry->hdr.len_op, command_size);
+
/* All offsets relative to mb_addr, not start of entry! */
cdb_off = CMDR_OFF + cmd_head + base_command_size;
memcpy((void *) mb + cdb_off, se_cmd->t_task_cdb, 
scsi_command_size(se_cmd->t_task_cdb));
-- 
1.8.3.1





Re: [PATCH 25/27] block: remove the discard_zeroes_data flag

2017-05-02 Thread h...@lst.de
On Tue, May 02, 2017 at 12:16:13AM -0700, Nicholas A. Bellinger wrote:
> Or, another options is use bdev_write_zeroes_sectors() to determine when
> dev_attrib->unmap_zeroes_data should be set.

Yes, that in combination with your patch to use bdev_write_zeroes_sectors
for zeroing from write same seems like the right fix.


Re: [PATCH 25/27] block: remove the discard_zeroes_data flag

2017-05-02 Thread Nicholas A. Bellinger
On Mon, 2017-05-01 at 23:43 -0700, Nicholas A. Bellinger wrote:
> On Mon, 2017-05-01 at 20:45 +, Bart Van Assche wrote:
> > On Wed, 2017-04-05 at 19:21 +0200, Christoph Hellwig wrote:
> > > Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can
> > > kill this hack.
> > > 
> > > Signed-off-by: Christoph Hellwig 
> > > Reviewed-by: Martin K. Petersen 
> > > Reviewed-by: Hannes Reinecke 
> > > [ ... ]
> > > diff --git a/drivers/target/target_core_device.c 
> > > b/drivers/target/target_core_device.c
> > > index c754ae33bf7b..d2f089cfa9ae 100644
> > > --- a/drivers/target/target_core_device.c
> > > +++ b/drivers/target/target_core_device.c
> > > @@ -851,7 +851,7 @@ bool target_configure_unmap_from_queue(struct 
> > > se_dev_attrib *attrib,
> > >   attrib->unmap_granularity = q->limits.discard_granularity / block_size;
> > >   attrib->unmap_granularity_alignment = q->limits.discard_alignment /
> > >   block_size;
> > > - attrib->unmap_zeroes_data = q->limits.discard_zeroes_data;
> > > + attrib->unmap_zeroes_data = 0;
> > >   return true;
> > >  }
> > >  EXPORT_SYMBOL(target_configure_unmap_from_queue);
> > 
> > Hello Christoph,
> > 
> > Sorry that I hadn't noticed this before but I think that this patch
> > introduces a significant performance regressions for LIO users. Before
> > this patch the LBPRZ flag was reported correctly to initiator systems
> > through the thin provisioning VPD. With this patch applied that flag
> > will always be reported as zero, forcing initiators to submit WRITE
> > commands with zeroed data buffers instead of submitting the SCSI UNMAP
> > command to block devices for which discard_zeroes_data was set. From
> > target_core_spc.c:
> > 
> > /* Thin Provisioning VPD */
> > static sense_reason_t spc_emulate_evpd_b2(struct se_cmd *cmd, unsigned char 
> > *buf)
> > {
> > [ ... ]
> > /*
> >  * The unmap_zeroes_data set means that the underlying device supports
> >  * REQ_DISCARD and has the discard_zeroes_data bit set. This satisfies
> >  * the SBC requirements for LBPRZ, meaning that a subsequent read
> >  * will return zeroes after an UNMAP or WRITE SAME (16) to an LBA
> >  * See sbc4r36 6.6.4.
> >  */
> > if (((dev->dev_attrib.emulate_tpu != 0) ||
> >  (dev->dev_attrib.emulate_tpws != 0)) &&
> >  (dev->dev_attrib.unmap_zeroes_data != 0))
> > buf[5] |= 0x04;
> > [ ... ]
> > }
> > 
> 
> According to sd_config_discard(), it's SD_LBP_WS16, SD_LBP_WS10 and
> SD_LBP_ZERO that where ever setting unmap_zeros_data = 1 to begin with.
> For UNMAP, q->limits.discard_zeroes_data was never set.
> 
> That said, it's pretty much implied that supporting DISCARD means
> subsequent READs return zeros, so target_configure_unmap_from_queue()
> should be setting attrib->unmap_zeroes_data = 1, or just dropping it
> all-together.
> 
> Post -rc1, I'll push a patch to do the latter.
> 

Or, another options is use bdev_write_zeroes_sectors() to determine when
dev_attrib->unmap_zeroes_data should be set.