date:20170711

Re: [PATCH v2 01/15] qla2xxx: Combine Active command arrays.

2017-07-11 Thread Madhani, Himanshu

Hi Bart,

> On Jul 11, 2017, at 5:39 PM, Nicholas A. Bellinger wrote:
> 
> On Tue, 2017-07-11 at 23:43 +0000, Bart Van Assche wrote:
>> On Tue, 2017-06-13 at 20:47 -0700, Himanshu Madhani wrote:
>>> typedef struct srb {
>>> +	/*
>>> +	 * Do not move cmd_type field, it needs to
>>> +	 * line up with qla_tgt_cmd->cmd_type
>>> +	 */
>>> +	uint8_t cmd_type;
>>> +	uint8_t pad[3];
>> 
>> Hello Himanshu,
>> 
>> Had I understood correctly that you had promised to rework the command
>> array merging such that a union is used instead of requiring certain
>> fields to be present at certain offsets (see also
>> https://www.spinics.net/lists/target-devel/msg15591.html)?
>> 
> 
> Yeah, as discussed previously it was not a blocker for merge.
> 

I mentioned in the cover letter of my v2 submission that we will submit a
new patch to address your review comment.

Since we are in the merge window right now, I have not posted this patch yet.
As soon as 4.13.0-rc1 is out, I will submit a patch addressing your review
comment.

Thanks,
- Himanshu
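
For illustration, the union-based rework requested in the review could look
roughly like the sketch below. Every name in it (cmd_hdr, srb_sketch,
tgt_cmd_sketch, active_cmd) is hypothetical and not taken from qla2xxx; the
point is only that a shared header embedded as the first member lets the
compiler check the layout instead of a "do not move" comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the qla2xxx layout. */
enum cmd_type_sketch { TYPE_SRB_SKETCH = 0, TYPE_TGT_CMD_SKETCH = 1 };

/* Shared header: both command kinds embed it as their first member. */
struct cmd_hdr {
	uint8_t cmd_type;
};

struct srb_sketch {
	struct cmd_hdr hdr;
	uint32_t handle;
};

struct tgt_cmd_sketch {
	struct cmd_hdr hdr;
	uint64_t tag;
};

/* One slot of a combined active-command array. */
union active_cmd {
	struct cmd_hdr hdr;
	struct srb_sketch srb;
	struct tgt_cmd_sketch tgt;
};

/* The layout assumption is checked at compile time. */
static_assert(offsetof(struct srb_sketch, hdr) == 0, "hdr must be first");
static_assert(offsetof(struct tgt_cmd_sketch, hdr) == 0, "hdr must be first");

/* Read the command type without knowing which member is active. */
static inline uint8_t active_cmd_type(const union active_cmd *c)
{
	return c->hdr.cmd_type;
}
```

With this shape, moving a field in either struct cannot silently break the
lookup, because the header stays a single definition.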

Re: [PATCH 2/6] SCSI: use blk_mq_run_hw_queues() in scsi_kick_queue()

2017-07-11 Thread Ming Lei

On Tue, Jul 11, 2017 at 07:57:53PM +0000, Bart Van Assche wrote:
> On Wed, 2017-07-12 at 02:20 +0800, Ming Lei wrote:
> > Now that SCSI no longer stops the queue, it is not necessary to use
> > blk_mq_start_hw_queues(), so switch to blk_mq_run_hw_queues()
> > instead.
> > 
> > Cc: "James E.J. Bottomley" 
> > Cc: "Martin K. Petersen" 
> > Cc: linux-scsi@vger.kernel.org
> > Signed-off-by: Ming Lei 
> > ---
> >  drivers/scsi/scsi_lib.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index f6097b89d5d3..91d890356b78 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -333,7 +333,7 @@ void scsi_device_unbusy(struct scsi_device *sdev)
> >  static void scsi_kick_queue(struct request_queue *q)
> >  {
> >  	if (q->mq_ops)
> > -		blk_mq_start_hw_queues(q);
> > +		blk_mq_run_hw_queues(q, false);
> >  	else
> >  		blk_run_queue(q);
> >  }
> 
> Hello Ming,
> 
> Now that we have separate flags to represent the "stopped" and "quiesced"
> states, wouldn't it be better not to modify scsi_kick_queue() but instead to
> stop a SCSI hardware queue again if scsi_queue_rq() returns BLK_STS_RESOURCE?
> See also commits 36e3cf273977 ("scsi: Avoid that SCSI queues get stuck") and
> commit 52d7f1b5c2f3 ("blk-mq: Avoid that requeueing starts stopped queues").

As you can see in the following patches, all stop/start queue APIs will
be removed, and the 'stopped' state will become an internal state.

It no longer makes sense to clear 'stopped' for SCSI, since the queue
won't actually be stopped. So we need this change.

-- 
Ming
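
The distinction being argued here can be shown with a toy model. This is a
sketch under stated assumptions, not block-layer code: the toy_* names are
hypothetical, and the model only captures the idea that "start" clears the
stopped state before running, while "run" leaves that state alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are block-layer APIs. */
struct toy_hctx {
	bool stopped;     /* models the per-queue 'stopped' state */
	int pending;      /* requests waiting to be dispatched */
	int dispatched;   /* requests handed to the driver */
};

/* Dispatch pending work unless the queue is stopped. */
static void toy_run_hw_queue(struct toy_hctx *h)
{
	assert(h);
	if (h->stopped)
		return;
	h->dispatched += h->pending;
	h->pending = 0;
}

/* Clear 'stopped' first, then run: this is what restarts a stopped queue. */
static void toy_start_hw_queue(struct toy_hctx *h)
{
	assert(h);
	h->stopped = false;
	toy_run_hw_queue(h);
}
```

If nothing ever sets the stopped flag, the two calls behave the same in this
model, which is the argument for switching to the plain "run" variant.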

Re: [PATCH 02/13] mpt3sas: SGL to PRP Translation for I/Os to NVMe devices

2017-07-11 Thread kbuild test robot

Hi Suganath,

[auto build test ERROR on scsi/for-next]
[also build test ERROR on v4.12 next-20170711]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Suganath-Prabu-S/mpt3sas-Add-nvme-device-support-in-slave-alloc-target-alloc-and-probe/20170711-204831
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
config: x86_64-kexec (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

Note: the linux-review/Suganath-Prabu-S/mpt3sas-Add-nvme-device-support-in-slave-alloc-target-alloc-and-probe/20170711-204831 HEAD 46a9c8fb1d7fe7649aa0eaa925c6653a6fa3047e builds fine.
      It only hurts bisectability.

All errors (new ones prefixed by >>):

   In file included from drivers/scsi/mpt3sas/mpt3sas_base.c:66:0:
>> drivers/scsi/mpt3sas/mpt3sas_base.h:57:26: fatal error: mpi/mpi2_pci.h: No such file or directory
    #include "mpi/mpi2_pci.h"
                             ^
   compilation terminated.

vim +57 drivers/scsi/mpt3sas/mpt3sas_base.h

48  
49  #include "mpi/mpi2_type.h"
50  #include "mpi/mpi2.h"
51  #include "mpi/mpi2_ioc.h"
52  #include "mpi/mpi2_cnfg.h"
53  #include "mpi/mpi2_init.h"
54  #include "mpi/mpi2_raid.h"
55  #include "mpi/mpi2_tool.h"
56  #include "mpi/mpi2_sas.h"
  > 57  #include "mpi/mpi2_pci.h"
58  

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation



Re: [PATCH v3 2/7] libsas: remove unused port_gone_completion

2017-07-11 Thread wangyijing



On 2017/7/11 23:54, John Garry wrote:
> On 10/07/2017 08:06, Yijing Wang wrote:
>> No one uses the port_gone_completion in struct asd_sas_port,
>> clean it out.
> 
> This seems like a reasonable tidy-up patch which could be taken in isolation, 
> having no dependency on the rest of the series.

Yes.

> 
>>
>> Signed-off-by: Yijing Wang 
>> ---
>>  include/scsi/libsas.h | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
>> index c41328d..628f48b 100644
>> --- a/include/scsi/libsas.h
>> +++ b/include/scsi/libsas.h
>> @@ -263,8 +263,6 @@ struct sas_discovery {
>>  /* The port struct is Class:RW, driver:RO */
>>  struct asd_sas_port {
>>  /* private: */
>> -struct completion port_gone_completion;
>> -
>>  struct sas_discovery disc;
>>  struct domain_device *port_dev;
>>  spinlock_t dev_list_lock;
>>

Re: [PATCH v3 1/7] libsas: Use static sas event pool to appease sas event lost

2017-07-11 Thread wangyijing

>> -unsigned long port_events_pending;
>> -unsigned long phy_events_pending;
>> +struct asd_sas_event   port_events[PORT_POOL_SIZE];
>> +struct asd_sas_event   phy_events[PHY_POOL_SIZE];
>>
>>  int error;
> 
> Hi Yijing,
> 
> So now we are creating a static pool of events per PHY/port, instead of 
> having 1 static work struct per event per PHY/port. So, for sure, this avoids 
> the dynamic event issue of system memory exhaustion which we discussed in 
> v1+v2 series. And it seems to possibly remove issue of losing SAS events.
> 
> But how did you determine the pool size for a PHY/port? It would seem to be 5 
> * #phy events or #port events (which is also 5, I figure by coincidence). How 
> does this deal with flutter of >25 events?

There is no special meaning to the pool size. If there is a flutter of more
than 25 events, notifying SAS events will return an error, and what happens
next depends on the LLDD driver.
I hope libsas could do more in this case, but for now that seems a little
difficult. This patch may be an interim fix until we find a perfect solution.

Thanks!
Yijing.

> 
> Thanks,
> John
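
The behaviour described in the reply above (a fixed pool, with an error once
a burst exceeds it) can be sketched as follows. All names are hypothetical
and the pool size of 25 is only the figure mentioned in this thread; the
real libsas structures and sizes are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_POOL_SIZE 25  /* hypothetical; the thread mentions a limit of 25 */

struct sketch_event {
	bool in_use;
	int  type;
};

struct sketch_port {
	struct sketch_event events[SKETCH_POOL_SIZE];
};

/* Take a free slot, or return NULL when the pool is exhausted. */
static struct sketch_event *sketch_event_get(struct sketch_port *p, int type)
{
	for (size_t i = 0; i < SKETCH_POOL_SIZE; i++) {
		if (!p->events[i].in_use) {
			p->events[i].in_use = true;
			p->events[i].type = type;
			return &p->events[i];
		}
	}
	return NULL;
}

/* Give a slot back once its event has been processed. */
static void sketch_event_put(struct sketch_event *ev)
{
	assert(ev && ev->in_use);
	ev->in_use = false;
}

/* Returns 0 on success, -1 when the burst exceeds the pool. */
static int sketch_notify_event(struct sketch_port *p, int type)
{
	return sketch_event_get(p, type) ? 0 : -1;
}
```

No allocation happens on the notify path, so memory use is bounded, at the
cost of pushing the overflow decision to the caller.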

Re: [PATCH v2 01/15] qla2xxx: Combine Active command arrays.

2017-07-11 Thread Nicholas A. Bellinger

On Tue, 2017-07-11 at 23:43 +0000, Bart Van Assche wrote:
> On Tue, 2017-06-13 at 20:47 -0700, Himanshu Madhani wrote:
> >  typedef struct srb {
> > +	/*
> > +	 * Do not move cmd_type field, it needs to
> > +	 * line up with qla_tgt_cmd->cmd_type
> > +	 */
> > +	uint8_t cmd_type;
> > +	uint8_t pad[3];
> 
> Hello Himanshu,
> 
> Had I understood correctly that you had promised to rework the command
> array merging such that a union is used instead of requiring certain
> fields to be present at certain offsets (see also
> https://www.spinics.net/lists/target-devel/msg15591.html)?
> 

Yeah, as discussed previously it was not a blocker for merge.

Re: [PATCH] iscsi-target: Add login_keys_workaround attribute for non RFC initiators

2017-07-11 Thread Nicholas A. Bellinger

On Tue, 2017-07-11 at 23:38 +0000, Bart Van Assche wrote:
> On Fri, 2017-07-07 at 22:24 +0000, Nicholas A. Bellinger wrote:
> > From: Nicholas Bellinger 
> > 
> > This patch re-introduces part of a long standing login workaround that
> > was recently dropped by:
> > 
> >   commit 1c99de981f30b3e7868b8d20ce5479fa1c0fea46
> >   Author: Nicholas Bellinger 
> >   Date:   Sun Apr 2 13:36:44 2017 -0700
> > 
> >   iscsi-target: Drop work-around for legacy GlobalSAN initiator
> > 
> > Namely, the workaround for FirstBurstLength ended up being required by
> > Mellanox Flexboot PXE boot ROMs as reported by Robert.
> > 
> > So this patch re-adds the work-around for FirstBurstLength within
> > iscsi_check_proposer_for_optional_reply(), and makes the key optional
> > to respond when the initiator does not propose, nor respond to it.
> > 
> > Also as requested by Arun, this patch introduces a new TPG attribute
> > named 'login_keys_workaround' that controls the use of both the
> > FirstBurstLength workaround, as well as the two other existing
> > workarounds for gPXE iSCSI boot client.
> > 
> > By default, the workaround is enabled with login_keys_workaround=1,
> > since Mellanox FlexBoot requires it, and Arun has verified the Qlogic
> > MSFT initiator already proposes FirstBurstLength, so it's unaffected
> > by re-adding this part of the original work-around.
> 
> Hello Nick,
> 
> The new configfs attribute ("login_keys_workaround") may confuse users - for
> someone who has not followed this e-mail thread it can take a long time
> before they figure out that they need to set this configfs attribute.

It's enabled by default, so there is nothing a user has to explicitly
change in order for all hosts to 'just work'.

The only reason the attribute was added was by request of Arun, so if
some future initiator doesn't propose the keys controlled by the
work-around, and still attempts to respond, they can at least get
something working w/o a code change.

>  Have
> you considered to let the iSCSI target driver figure out whether or not that
> variable has to be set, e.g. by looking up the initiator IQN in a list?
> 

Given InitiatorName is end user configurable, trying to do a workaround
based on IQN regexes is error prone, at best.

Also, since the FirstBurstLength work-around this patch re-adds had
already been in place for the better part of 8 years, the risk of
interop issues is almost nonexistent.

Re: [PATCH 02/13] mpt3sas: SGL to PRP Translation for I/Os to NVMe devices

2017-07-11 Thread Keith Busch

On Tue, Jul 11, 2017 at 01:55:02AM -0700, Suganath Prabu S wrote:
> +/**
> + * _base_check_pcie_native_sgl - This function is called for PCIe end devices to
> + * determine if the driver needs to build a native SGL.  If so, that native
> + * SGL is built in the special contiguous buffers allocated especially for
> + * PCIe SGL creation.  If the driver will not build a native SGL, return
> + * TRUE and a normal IEEE SGL will be built.  Currently this routine
> + * supports NVMe.
> + * @ioc: per adapter object
> + * @mpi_request: mf request pointer
> + * @smid: system request message index
> + * @scmd: scsi command
> + * @pcie_device: points to the PCIe device's info
> + *
> + * Returns 0 if native SGL was built, 1 if no SGL was built
> + */
> +static int
> +_base_check_pcie_native_sgl(struct MPT3SAS_ADAPTER *ioc,
> + Mpi25SCSIIORequest_t *mpi_request, u16 smid, struct scsi_cmnd *scmd,
> + struct _pcie_device *pcie_device)
> +{



> + /* Return 0, indicating we built a native SGL. */
> + return 1;
> +}

This function doesn't return 0 ever. Not sure why it's here.

Curious about your device, though: if an NVMe native SGL can *not* be
built, does the HBA firmware then buffer it in its local memory before
sending/receiving to/from the host?

And if a native SGL can be built, does the NVMe target DMA directly
to/from host memory, giving a performance boost?
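
To make the return-value point concrete, here is a minimal sketch of a
function whose return statements agree with its documentation. The struct
and the conditions for falling back are hypothetical stand-ins, not the
mpt3sas types or logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the request state; not the mpt3sas types. */
struct sketch_req {
	bool   is_nvme;       /* device can take a native SGL */
	size_t data_len;      /* bytes to transfer */
	bool   native_built;  /* set when the native SGL path ran */
};

/*
 * Returns 0 if a native SGL was built, 1 if the caller must fall back
 * to a normal IEEE SGL.
 */
static int sketch_check_native_sgl(struct sketch_req *req)
{
	assert(req);

	if (!req->is_nvme || req->data_len == 0)
		return 1;	/* no native SGL; caller builds an IEEE SGL */

	req->native_built = true;	/* placeholder for the real build step */

	/* Return 0, indicating we built a native SGL. */
	return 0;
}
```

Both documented outcomes are now reachable, so a caller that tests for 0
actually has a path that produces it.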

Re: [PATCH v2 01/15] qla2xxx: Combine Active command arrays.

2017-07-11 Thread Bart Van Assche

On Tue, 2017-06-13 at 20:47 -0700, Himanshu Madhani wrote:
>  typedef struct srb {
> + /*
> +  * Do not move cmd_type field, it needs to
> +  * line up with qla_tgt_cmd->cmd_type
> +  */
> + uint8_t cmd_type;
> + uint8_t pad[3];

Hello Himanshu,

Had I understood correctly that you had promised to rework the command
array merging such that a union is used instead of requiring certain
fields to be present at certain offsets (see also
https://www.spinics.net/lists/target-devel/msg15591.html)?

Bart.

Re: [PATCH] iscsi-target: Add login_keys_workaround attribute for non RFC initiators

2017-07-11 Thread Bart Van Assche

On Fri, 2017-07-07 at 22:24 +0000, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger 
> 
> This patch re-introduces part of a long standing login workaround that
> was recently dropped by:
> 
>   commit 1c99de981f30b3e7868b8d20ce5479fa1c0fea46
>   Author: Nicholas Bellinger 
>   Date:   Sun Apr 2 13:36:44 2017 -0700
> 
>   iscsi-target: Drop work-around for legacy GlobalSAN initiator
> 
> Namely, the workaround for FirstBurstLength ended up being required by
> Mellanox Flexboot PXE boot ROMs as reported by Robert.
> 
> So this patch re-adds the work-around for FirstBurstLength within
> iscsi_check_proposer_for_optional_reply(), and makes the key optional
> to respond when the initiator does not propose, nor respond to it.
> 
> Also as requested by Arun, this patch introduces a new TPG attribute
> named 'login_keys_workaround' that controls the use of both the
> FirstBurstLength workaround, as well as the two other existing
> workarounds for gPXE iSCSI boot client.
> 
> By default, the workaround is enabled with login_keys_workaround=1,
> since Mellanox FlexBoot requires it, and Arun has verified the Qlogic
> MSFT initiator already proposes FirstBurstLength, so it's unaffected
> by re-adding this part of the original work-around.

Hello Nick,

The new configfs attribute ("login_keys_workaround") may confuse users - for
someone who has not followed this e-mail thread it can take a long time
before they figure out that they need to set this configfs attribute. Have
you considered to let the iSCSI target driver figure out whether or not that
variable has to be set, e.g. by looking up the initiator IQN in a list?

Bart.

Re: [PATCH v2 1/4] scsi: scsi_dh_alua: allow I/O in target port unavailable and standby states

2017-07-11 Thread Mauricio Faria de Oliveira


On 07/11/2017 12:32 PM, Mauricio Faria de Oliveira wrote:

> Also, it seems the Unavailable/Standby states would not be logged
> without a recheck from alua_check_sense(), since the only callers
> of alua_rtpg_queue() are alua_activate() and alua_check[_sense]()


Well, actually it does get logged when activating/switching path
groups, but that shouldn't be the case with only a single path group.


--
Mauricio Faria de Oliveira
IBM Linux Technology Center

Re: [PATCH 2/6] SCSI: use blk_mq_run_hw_queues() in scsi_kick_queue()

2017-07-11 Thread Bart Van Assche

On Wed, 2017-07-12 at 02:20 +0800, Ming Lei wrote:
> Now that SCSI no longer stops the queue, it is not necessary to use
> blk_mq_start_hw_queues(), so switch to blk_mq_run_hw_queues()
> instead.
> 
> Cc: "James E.J. Bottomley" 
> Cc: "Martin K. Petersen" 
> Cc: linux-scsi@vger.kernel.org
> Signed-off-by: Ming Lei 
> ---
>  drivers/scsi/scsi_lib.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index f6097b89d5d3..91d890356b78 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -333,7 +333,7 @@ void scsi_device_unbusy(struct scsi_device *sdev)
>  static void scsi_kick_queue(struct request_queue *q)
>  {
>   if (q->mq_ops)
> - blk_mq_start_hw_queues(q);
> + blk_mq_run_hw_queues(q, false);
>   else
>   blk_run_queue(q);
>  }

Hello Ming,

Now that we have separate flags to represent the "stopped" and "quiesced"
states, wouldn't it be better not to modify scsi_kick_queue() but instead to
stop a SCSI hardware queue again if scsi_queue_rq() returns BLK_STS_RESOURCE?
See also commits 36e3cf273977 ("scsi: Avoid that SCSI queues get stuck") and
commit 52d7f1b5c2f3 ("blk-mq: Avoid that requeueing starts stopped queues").

Bart.

Re: [PATCH v2] scsi: sg: fix SG_DXFER_FROM_DEV transfers

2017-07-11 Thread Douglas Gilbert


On 2017-07-07 04:56 AM, Johannes Thumshirn wrote:

SG_DXFER_FROM_DEV transfers do not necessarily have a dxferp as we set
it to NULL for the old sg_io read/write interface, but must have a length
bigger than 0. This fixes a regression introduced by commit 28676d869bbb
("scsi: sg: check for valid direction before starting the request")

Signed-off-by: Johannes Thumshirn 
Fixes: 28676d869bbb ("scsi: sg: check for valid direction before starting the request")
Reported-by: Chris Clayton 
Tested-by: Chris Clayton 
Cc: Douglas Gilbert 
Reviewed-by: Hannes Reinecke 


Acked-by: Douglas Gilbert 


---
Changes to v1:
* Fix breakage of the sg_io v3 interface, verified using sg_inq

  drivers/scsi/sg.c | 5 ++++-
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 21225d62b0c1..1e82d4128a84 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -758,8 +758,11 @@ static bool sg_is_valid_dxfer(sg_io_hdr_t *hp)
 		if (hp->dxferp || hp->dxfer_len > 0)
 			return false;
 		return true;
-	case SG_DXFER_TO_DEV:
 	case SG_DXFER_FROM_DEV:
+		if (hp->dxfer_len < 0)
+			return false;
+		return true;
+	case SG_DXFER_TO_DEV:
 	case SG_DXFER_TO_FROM_DEV:
 		if (!hp->dxferp || hp->dxfer_len == 0)
 			return false;
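
The three cases touched by the hunk above can be exercised in user space
with a small stand-alone sketch. The enum, struct and function names below
are hypothetical stand-ins for the sg header types, and the length is kept
signed here only so that the "< 0" test from the patch is meaningful; the
field type in the real header is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the sg header types; values and field types are assumptions. */
enum sketch_dxfer {
	SK_DXFER_NONE,
	SK_DXFER_TO_DEV,
	SK_DXFER_FROM_DEV,
	SK_DXFER_TO_FROM_DEV
};

struct sketch_io_hdr {
	enum sketch_dxfer dxfer_direction;
	int   dxfer_len;   /* signed here so the "< 0" check is meaningful */
	void *dxferp;
};

static bool sketch_is_valid_dxfer(const struct sketch_io_hdr *hp)
{
	assert(hp);
	switch (hp->dxfer_direction) {
	case SK_DXFER_NONE:
		/* no data phase: neither a buffer nor a length is allowed */
		return !(hp->dxferp || hp->dxfer_len > 0);
	case SK_DXFER_FROM_DEV:
		/* buffer may be NULL (old read/write interface) */
		return hp->dxfer_len >= 0;
	case SK_DXFER_TO_DEV:
	case SK_DXFER_TO_FROM_DEV:
		/* these directions need both a buffer and a non-zero length */
		return hp->dxferp && hp->dxfer_len != 0;
	default:
		return false;
	}
}
```

The regression fixed by the patch corresponds to the FROM_DEV case with a
NULL buffer, which this sketch accepts.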

[PATCH 2/6] SCSI: use blk_mq_run_hw_queues() in scsi_kick_queue()

2017-07-11 Thread Ming Lei

Now that SCSI no longer stops the queue, it is not necessary to use
blk_mq_start_hw_queues(), so switch to blk_mq_run_hw_queues()
instead.

Cc: "James E.J. Bottomley" 
Cc: "Martin K. Petersen" 
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Ming Lei 
---
 drivers/scsi/scsi_lib.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f6097b89d5d3..91d890356b78 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -333,7 +333,7 @@ void scsi_device_unbusy(struct scsi_device *sdev)
 static void scsi_kick_queue(struct request_queue *q)
 {
 	if (q->mq_ops)
-		blk_mq_start_hw_queues(q);
+		blk_mq_run_hw_queues(q, false);
 	else
 		blk_run_queue(q);
 }
-- 
2.9.4

Re: [PATCH] tcmu: clean up the code and with one small fix

2017-07-11 Thread Nicholas A. Bellinger

On Tue, 2017-07-11 at 18:06 +0800, lixi...@cmss.chinamobile.com wrote:
> From: Xiubo Li 
> 
> Remove useless blank line and code and at the same time add one error
> path to catch the errors.
> 
> Signed-off-by: Xiubo Li 
> ---
>  drivers/target/target_core_user.c | 24 +++++++++++-------------
>  1 file changed, 11 insertions(+), 13 deletions(-)

Applied.

Thanks Xiubo.

Re: [PATCHv2] tcmu: Fix possbile memory leak when recalculating the cmd base size

2017-07-11 Thread Nicholas A. Bellinger

On Tue, 2017-07-11 at 17:59 +0800, lixi...@cmss.chinamobile.com wrote:
> From: Xiubo Li 
> 
> For all the entries allocated from the ring cmd area, the memory is
> something like stack memory, which always retains the old
> data, so entry->req.iov_bidi_cnt may be non-zero.
> 
> In some environments the crash can be reproduced very easily, in others
> not. The following is the crash core trace:
> 
> [  240.143969] CPU: 0 PID: 1285 Comm: iscsi_trx Not tainted
> 4.12.0-rc1+ #3
> [  240.150607] Hardware name: ASUS All Series/H87-PRO, BIOS 2104
> 10/28/2014
> [  240.157331] task: 8807de4f5800 task.stack:
> c900047dc000
> [  240.163270] RIP: 0010:memcpy_erms+0x6/0x10
> [  240.167377] RSP: 0018:c900047dfc68 EFLAGS: 00010202
> [  240.172621] RAX: c9065db85540 RBX: 8807f798 RCX:
> 0010
> [  240.179771] RDX: 0010 RSI: 8807de574fe0 RDI:
> c9065db85540
> [  240.186930] RBP: c900047dfd30 R08: 8807de41b000 R09:
> 
> [  240.194088] R10: 0040 R11: 8807e9b726f0 R12:
> 0006565726b0
> [  240.201246] R13: c90007612ea0 R14: 00065657d540 R15:
> 
> [  240.208397] FS:  ()
> GS:88081fa0()
> knlGS:
> [  240.216510] CS:  0010 DS:  ES:  CR0: 80050033
> [  240.80] CR2: c9065db85540 CR3: 01c0f000 CR4:
> 001406f0
> [  240.229430] Call Trace:
> [  240.231887]  ? tcmu_queue_cmd+0x83c/0xa80
> [  240.235916]  ? target_check_reservation+0xcd/0x6f0
> [  240.240725]  __target_execute_cmd+0x27/0xa0
> [  240.244918]  target_execute_cmd+0x232/0x2c0
> [  240.249124]  ? __local_bh_enable_ip+0x64/0xa0
> [  240.253499]  iscsit_execute_cmd+0x20d/0x270
> [  240.257693]  iscsit_sequence_cmd+0x110/0x190
> [  240.261985]  iscsit_get_rx_pdu+0x360/0xc80
> [  240.267565]  ? iscsi_target_rx_thread+0x54/0xd0
> [  240.273571]  iscsi_target_rx_thread+0x9a/0xd0
> [  240.279413]  kthread+0x113/0x150
> [  240.284120]  ? iscsi_target_tx_thread+0x1e0/0x1e0
> [  240.290297]  ? kthread_create_on_node+0x40/0x40
> [  240.296297]  ret_from_fork+0x2e/0x40
> [  240.301332] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48
> 89 d1 48
> c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48
> 89 f8 48
> 89 d1  a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e
> 40 38
> [  240.321751] RIP: memcpy_erms+0x6/0x10 RSP: c900047dfc68
> [  240.328838] CR2: c9065db85540
> [  240.333667] ---[ end trace b7e5354cfb54d08b ]---
> 
> To fix this, just memset all the entry memory before using it, and
> also to be more readable we adjust the bidi code.
> 
> Fixed: fe25cc34795(tcmu: Recalculate the tcmu_cmd size to save cmd area
>   memories)
> Reported-by: Bryant G. Ly 
> Tested-by: Damien Le Moal 
> Signed-off-by: Xiubo Li 
> ---
>  drivers/target/target_core_user.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 

Applied, with a CC' to v4.12.y and slightly updated patch subject.

Thanks Xiubo, Bryant, Damien and MNC!
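
The class of bug described in the commit message (a reused ring slot that
still carries a stale count) and the memset-based fix can be illustrated
with a much-simplified model. The struct layout and names are hypothetical
and do not reflect the tcmu ABI.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, much-simplified ring entry; not the tcmu ABI. */
struct sketch_entry {
	uint32_t iov_cnt;
	uint32_t iov_bidi_cnt;
};

#define SKETCH_RING_SLOTS 4

struct sketch_ring {
	struct sketch_entry slots[SKETCH_RING_SLOTS];
	unsigned int head;
};

/* Hand out the next slot, zeroed so no field carries over from its last use. */
static struct sketch_entry *sketch_ring_next(struct sketch_ring *r)
{
	struct sketch_entry *e;

	assert(r);
	e = &r->slots[r->head];
	r->head = (r->head + 1) % SKETCH_RING_SLOTS;
	memset(e, 0, sizeof(*e));
	return e;
}
```

Without the memset, a slot written with a non-zero iov_bidi_cnt on one pass
would still hold that value when the ring wraps around to it again.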

Re: [PATCH] scsi: default to scsi-mq

2017-07-11 Thread John Garry


On 11/07/2017 16:46, Bart Van Assche wrote:
> On Tue, 2017-07-11 at 15:14 +0100, John Garry wrote:
>> On 11/07/2017 14:32, Bart Van Assche wrote:
>>> On Tue, 2017-07-11 at 11:22 +0100, John Garry wrote:
>>>> On 10/07/2017 16:50, Bart Van Assche wrote:
>>>>> Since a fix for the performance regression triggered by this patch
>>>>> will be upstream soon (see also
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-linus=32825c45ff8f4cce937ab85b030dc693ceb1aa0a):
>>>>
>>>> FYI, on linux-next 20170711 (which now includes the above patch Bart
>>>> mentioned) we see a large performance regression on hisi_sas (LLDD does
>>>> not config shost for mq).
>>>>
>>>> fio read mode iops goes from ~833K (scsi_mod.use_blk_mq=n on cmdline) to
>>>> ~320K
>>>
>>> Hello John,
>>>
>>> Thanks for the feedback. Is the kernel config with which these measurements
>>> were performed available somewhere?
>>
>> It is the default arm64 defconfig with the following changes:
>> CONFIG_ARM_SMMU_V3=n
>> CONFIG_9P_FS=n
>>
>> We were getting a compile error in the 9p fs code, so disabled it.
>> Turning on the IOMMU drops performance across the board for our
>> platform, so just disabling it for the test.
>
> Hello John,
>
> What block driver controls the block device for which the performance regression
> has been observed? How many hardware queues were created by that block driver
> (see also /sys/block/*/mq/...)?

Hi Bart,

Here's the shost init for our SCSI LLDD:
http://elixir.free-electrons.com/linux/latest/source/drivers/scsi/hisi_sas/hisi_sas_main.c#L1736

So we don't set nr_hw_queues (which would mean = 0), so this should set
shost->tag_set.nr_hw_queues to 1 in scsi_mq_setup_tags().

FWIW, I can confirm the sysfs entry when I get hw access tomorrow.

John

> I'm asking this because the number of hardware queues controls which I/O
> scheduler is selected as default. From block/elevator.c:
>
> 	if (q->mq_ops) {
> 		if (q->nr_hw_queues == 1)
> 			e = elevator_get("mq-deadline", false);
> 		if (!e)
> 			return 0;
> 	} else
> 		e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);
>
> Bart.
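
The selection rule in the block/elevator.c excerpt can be restated as a
small decision function. This is a sketch with hypothetical names; it
ignores the case where looking up "mq-deadline" fails, and it assumes a
blk-mq queue always has at least one hardware queue, as the discussion of
scsi_mq_setup_tags() above suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three outcomes of the quoted snippet. */
enum sketch_sched {
	SKETCH_SCHED_NONE,            /* blk-mq with several hardware queues */
	SKETCH_SCHED_MQ_DEADLINE,     /* blk-mq with exactly one hardware queue */
	SKETCH_SCHED_LEGACY_DEFAULT   /* non-mq path: the configured default */
};

static enum sketch_sched sketch_default_sched(bool uses_mq,
					      unsigned int nr_hw_queues)
{
	/* Assumption of this sketch: an mq queue has >= 1 hardware queue. */
	assert(!uses_mq || nr_hw_queues >= 1);

	if (uses_mq)
		return nr_hw_queues == 1 ? SKETCH_SCHED_MQ_DEADLINE
					 : SKETCH_SCHED_NONE;
	return SKETCH_SCHED_LEGACY_DEFAULT;
}
```

For an LLDD that ends up with one hardware queue, as described for hisi_sas
above, this rule picks mq-deadline once scsi-mq is the default.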

RE: [PATCH 05/13] mpt3sas: Set NVMe device queue depth as 128

2017-07-11 Thread Elliott, Robert (Persistent Memory)

> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> @@ -115,7 +115,7 @@
> 
>  #define MPT3SAS_RAID_MAX_SECTORS 8192
>  #define MPT3SAS_HOST_PAGE_SIZE_4K	12
> -
> +#define MPT3SAS_NVME_QUEUE_DEPTH 128
...
> + /*TODO-right Queue Depth?*/
> + qdepth = MPT3SAS_NVME_QUEUE_DEPTH;
> + ds = "NVMe";

The native NVMe driver is getting a modparam to set that value (rather than
using a #define of 1024) in this patch:
http://lists.infradead.org/pipermail/linux-nvme/2017-July/011734.html

Perhaps this driver should do the same.

---
Robert Elliott, HPE Persistent Memory

Re: [PATCH] tcmu: clean up the code and with one small fix

2017-07-11 Thread Mike Christie

On 07/11/2017 05:06 AM, lixi...@cmss.chinamobile.com wrote:
> From: Xiubo Li 
> 
> Remove useless blank line and code and at the same time add one error
> path to catch the errors.
> 
> Signed-off-by: Xiubo Li 

Thanks.

Reviewed-by: Mike Christie

Re: [PATCHv2] tcmu: Fix possbile memory leak when recalculating the cmd base size

2017-07-11 Thread Mike Christie

On 07/11/2017 04:59 AM, lixi...@cmss.chinamobile.com wrote:
> From: Xiubo Li 
> 
> For all the entries allocated from the ring cmd area, the memory is
> something like stack memory, which always retains the old
> data, so entry->req.iov_bidi_cnt may be non-zero.
> 
> In some environments the crash can be reproduced very easily, in others
> not. The following is the crash core trace:
> 
> [  240.143969] CPU: 0 PID: 1285 Comm: iscsi_trx Not tainted
> 4.12.0-rc1+ #3
> [  240.150607] Hardware name: ASUS All Series/H87-PRO, BIOS 2104
> 10/28/2014
> [  240.157331] task: 8807de4f5800 task.stack:
> c900047dc000
> [  240.163270] RIP: 0010:memcpy_erms+0x6/0x10
> [  240.167377] RSP: 0018:c900047dfc68 EFLAGS: 00010202
> [  240.172621] RAX: c9065db85540 RBX: 8807f798 RCX:
> 0010
> [  240.179771] RDX: 0010 RSI: 8807de574fe0 RDI:
> c9065db85540
> [  240.186930] RBP: c900047dfd30 R08: 8807de41b000 R09:
> 
> [  240.194088] R10: 0040 R11: 8807e9b726f0 R12:
> 0006565726b0
> [  240.201246] R13: c90007612ea0 R14: 00065657d540 R15:
> 
> [  240.208397] FS:  ()
> GS:88081fa0()
> knlGS:
> [  240.216510] CS:  0010 DS:  ES:  CR0: 80050033
> [  240.80] CR2: c9065db85540 CR3: 01c0f000 CR4:
> 001406f0
> [  240.229430] Call Trace:
> [  240.231887]  ? tcmu_queue_cmd+0x83c/0xa80
> [  240.235916]  ? target_check_reservation+0xcd/0x6f0
> [  240.240725]  __target_execute_cmd+0x27/0xa0
> [  240.244918]  target_execute_cmd+0x232/0x2c0
> [  240.249124]  ? __local_bh_enable_ip+0x64/0xa0
> [  240.253499]  iscsit_execute_cmd+0x20d/0x270
> [  240.257693]  iscsit_sequence_cmd+0x110/0x190
> [  240.261985]  iscsit_get_rx_pdu+0x360/0xc80
> [  240.267565]  ? iscsi_target_rx_thread+0x54/0xd0
> [  240.273571]  iscsi_target_rx_thread+0x9a/0xd0
> [  240.279413]  kthread+0x113/0x150
> [  240.284120]  ? iscsi_target_tx_thread+0x1e0/0x1e0
> [  240.290297]  ? kthread_create_on_node+0x40/0x40
> [  240.296297]  ret_from_fork+0x2e/0x40
> [  240.301332] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48
> 89 d1 48
> c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48
> 89 f8 48
> 89 d1  a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e
> 40 38
> [  240.321751] RIP: memcpy_erms+0x6/0x10 RSP: c900047dfc68
> [  240.328838] CR2: c9065db85540
> [  240.333667] ---[ end trace b7e5354cfb54d08b ]---
> 
> To fix this, just memset all the entry memory before using it, and
> also to be more readable we adjust the bidi code.
> 
> Fixed: fe25cc34795(tcmu: Recalculate the tcmu_cmd size to save cmd area
>   memories)
> Reported-by: Bryant G. Ly 
> Tested-by: Damien Le Moal 
> Signed-off-by: Xiubo Li 
> ---

Nice. Thanks.

Reviewed-by: Mike Christie

Re: [PATCH] scsi: hisi_sas: make several const arrays static

2017-07-11 Thread John Garry


On 11/07/2017 13:11, Colin King wrote:

From: Colin Ian King 

Don't populate various tables on the stack but make them static const.
Makes the object code smaller by over 280 bytes:

Before:
   text    data     bss     dec     hex filename
  39887    5080      64   45031    afe7 hisi_sas_v2_hw.o

After:
   text    data     bss     dec     hex filename
  39318    5368      64   44750    aece hisi_sas_v2_hw.o

Signed-off-by: Colin Ian King 
---


Acked-by: John Garry
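
For readers unfamiliar with the change, the difference between the two
forms is sketched below. The table contents and function names are
hypothetical; the hisi_sas arrays themselves are not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Automatic storage: the compiler may emit code to fill it on every call. */
static int sketch_lookup_stack(size_t idx)
{
	const int table[] = { 3, 5, 8, 13 };

	assert(idx < sizeof(table) / sizeof(table[0]));
	return table[idx];
}

/* Static storage: the table lives in read-only data and is never rebuilt. */
static int sketch_lookup_static(size_t idx)
{
	static const int table[] = { 3, 5, 8, 13 };

	assert(idx < sizeof(table) / sizeof(table[0]));
	return table[idx];
}
```

Both functions return the same values; the saving reported in the size
output above comes from dropping the per-call initialisation code, at the
cost of slightly more data.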

Re: [PATCH 02/13] mpt3sas: SGL to PRP Translation for I/Os to NVMe devices

2017-07-11 Thread Keith Busch

On Tue, Jul 11, 2017 at 01:55:02AM -0700, Suganath Prabu S wrote:
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
> index 60fa7b6..cebdd8e 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> @@ -54,6 +54,7 @@
>  #include "mpi/mpi2_raid.h"
>  #include "mpi/mpi2_tool.h"
>  #include "mpi/mpi2_sas.h"
> +#include "mpi/mpi2_pci.h"

Could you adjust the patch order for this series so each patch compiles? Here
in patch 2 you're including a header that's not added until patch 12.

Re: [PATCH] iscsi-target: Reject immediate data underflow larger than SCSI transfer length

2017-07-11 Thread Bart Van Assche

On Tue, 2017-07-11 at 00:22 -0700, Nicholas A. Bellinger wrote:
> So rejecting this case as already done in commit abb85a9b51 is the
> correct approach for >= v4.3.y.

Hello Nic,

I hope that you agree that the current target_cmd_size_check() implementation
is complicated and ugly. Patch 30/33 of the patch series I referred to in my
e-mail removes a significant number of lines of code from that function. So
my patch series not only makes target_cmd_size_check() easier to maintain and
to verify but it makes that function also faster. Hence please reconsider the
approach from my patch series. For patch 30/33, see also
https://www.spinics.net/lists/target-devel/msg15384.html.

Bart.

RE: [PATCH] hpsa: add support for legacy boards

2017-07-11 Thread Meelis Roos

> The 5i controller is probably too old for the hpsa driver to support.
> The hpsa driver is looking for information to determine if the drive is
> online/offline and this information is not available.
> 
> What was the original issue you were having with the cciss driver?

Christoph Hellwig updated the block layer with "block: Make most
scsi_req_init() calls implicit" and at first try, cciss was left without
the needed initialization. This caused an oops in udev probing, but the
system worked. The issue was fixed by Christoph quickly.

But he suggested it might be worth trying the hpsa driver instead of cciss,
with a longer-term goal to move users of cciss over to hpsa if
possible. Now that I have tested it, it seems not all older cards are
supported in hpsa - it's more than IDs and interrupt masks.

-- 
Meelis Roos (mr...@linux.ee)

Re: [PATCH v3 2/7] libsas: remove unused port_gone_completion

2017-07-11 Thread John Garry


On 10/07/2017 08:06, Yijing Wang wrote:

No one uses the port_gone_completion in struct asd_sas_port,
clean it out.


This seems like a reasonable tidy-up patch which could be taken in 
isolation, having no dependency on the rest of the series.




Signed-off-by: Yijing Wang 
---
 include/scsi/libsas.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index c41328d..628f48b 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -263,8 +263,6 @@ struct sas_discovery {
 /* The port struct is Class:RW, driver:RO */
 struct asd_sas_port {
 /* private: */
-	struct completion port_gone_completion;
-
 	struct sas_discovery disc;
 	struct domain_device *port_dev;
 	spinlock_t dev_list_lock;

Re: [PATCHv2] tcmu: Fix possbile memory leak when recalculating the cmd base size

2017-07-11 Thread Bryant G. Ly




From: Xiubo Li 

For all the entries allocated from the ring cmd area, the memory is
something like stack memory, which always retains the old
data, so entry->req.iov_bidi_cnt may be non-zero.

In some environments the crash can be reproduced very easily, in others
not. The following is the crash core trace:

[  240.143969] CPU: 0 PID: 1285 Comm: iscsi_trx Not tainted
4.12.0-rc1+ #3
[  240.150607] Hardware name: ASUS All Series/H87-PRO, BIOS 2104
10/28/2014
[  240.157331] task: 8807de4f5800 task.stack:
c900047dc000
[  240.163270] RIP: 0010:memcpy_erms+0x6/0x10
[  240.167377] RSP: 0018:c900047dfc68 EFLAGS: 00010202
[  240.172621] RAX: c9065db85540 RBX: 8807f798 RCX:
0010
[  240.179771] RDX: 0010 RSI: 8807de574fe0 RDI:
c9065db85540
[  240.186930] RBP: c900047dfd30 R08: 8807de41b000 R09:

[  240.194088] R10: 0040 R11: 8807e9b726f0 R12:
0006565726b0
[  240.201246] R13: c90007612ea0 R14: 00065657d540 R15:

[  240.208397] FS:  ()
GS:88081fa0()
knlGS:
[  240.216510] CS:  0010 DS:  ES:  CR0: 80050033
[  240.80] CR2: c9065db85540 CR3: 01c0f000 CR4:
001406f0
[  240.229430] Call Trace:
[  240.231887]  ? tcmu_queue_cmd+0x83c/0xa80
[  240.235916]  ? target_check_reservation+0xcd/0x6f0
[  240.240725]  __target_execute_cmd+0x27/0xa0
[  240.244918]  target_execute_cmd+0x232/0x2c0
[  240.249124]  ? __local_bh_enable_ip+0x64/0xa0
[  240.253499]  iscsit_execute_cmd+0x20d/0x270
[  240.257693]  iscsit_sequence_cmd+0x110/0x190
[  240.261985]  iscsit_get_rx_pdu+0x360/0xc80
[  240.267565]  ? iscsi_target_rx_thread+0x54/0xd0
[  240.273571]  iscsi_target_rx_thread+0x9a/0xd0
[  240.279413]  kthread+0x113/0x150
[  240.284120]  ? iscsi_target_tx_thread+0x1e0/0x1e0
[  240.290297]  ? kthread_create_on_node+0x40/0x40
[  240.296297]  ret_from_fork+0x2e/0x40
[  240.301332] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48
89 d1 48
c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48
89 f8 48
89 d1  a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e
40 38
[  240.321751] RIP: memcpy_erms+0x6/0x10 RSP: c900047dfc68
[  240.328838] CR2: c9065db85540
[  240.333667] ---[ end trace b7e5354cfb54d08b ]---

To fix this, just memset all the entry memory before using it; also
adjust the bidi code to be more readable.

Fixed: fe25cc34795(tcmu: Recalculate the tcmu_cmd size to save cmd area
memories)
Reported-by: Bryant G. Ly 
Tested-by: Damien Le Moal 
Signed-off-by: Xiubo Li 
---
  drivers/target/target_core_user.c | 12 +---
  1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/target/target_core_user.c 
b/drivers/target/target_core_user.c
index 2f1fa92..3b25ef3 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c


Nice! This has fixed our long standing issue with not being able to boot with 
the global data area support on power.

Tested-by: Bryant G. Ly 

-Bryant
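
The failure mode described in the commit message can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the tcmu code: `struct ring_entry`, `ring_claim_stale()` and `ring_claim_zeroed()` are hypothetical names, and only the field name `iov_bidi_cnt` is borrowed from the patch description. It shows how a slot reused from a ring keeps the previous command's contents unless it is cleared before use.

```c
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 4

/* Hypothetical, simplified stand-in for a command ring entry. */
struct ring_entry {
	uint32_t iov_cnt;
	uint32_t iov_bidi_cnt;
};

static struct ring_entry ring[RING_SLOTS];

/* Hand out a slot without clearing it: stale fields survive reuse. */
struct ring_entry *ring_claim_stale(unsigned int idx)
{
	return &ring[idx % RING_SLOTS];
}

/* Hand out a slot after zeroing it, as the fix does for each entry. */
struct ring_entry *ring_claim_zeroed(unsigned int idx)
{
	struct ring_entry *e = &ring[idx % RING_SLOTS];

	memset(e, 0, sizeof(*e));
	return e;
}
```

With the zeroing variant, a caller that only fills in the fields it needs can no longer pick up a leftover `iov_bidi_cnt` from an earlier command.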

RE: [PATCH v2 11/15] megaraid_sas: Set device queue_depth same as HBA can_queue value in scsi-mq mode

2017-07-11 Thread Kashyap Desai

> -Original Message-
> From: Christoph Hellwig [mailto:h...@lst.de]
> Sent: Tuesday, July 11, 2017 7:28 PM
> To: Shivasharan S
> Cc: linux-scsi@vger.kernel.org; martin.peter...@oracle.com;
> the...@redhat.com; j...@linux.vnet.ibm.com;
> kashyap.de...@broadcom.com; sumit.sax...@broadcom.com;
> h...@suse.com; h...@lst.de
> Subject: Re: [PATCH v2 11/15] megaraid_sas: Set device queue_depth same
as
> HBA can_queue value in scsi-mq mode
>
> On Wed, Jul 05, 2017 at 05:00:25AM -0700, Shivasharan S wrote:
> > Currently driver sets default queue_depth for VDs at 256 and JBODs
> > based on interface type, ie., for SAS JBOD QD will be 64, for SATA
JBOD QD
> will be 32.
> > During performance runs with scsi-mq enabled, we are seeing better
> > results by setting QD same as HBA queue_depth.
>
> Please no scsi-mq specifics.  just do this unconditionally.

Chris - The intent of the mq-specific check is mainly that sequential
workloads on HDDs suffer a penalty due to an mq scheduler issue.
We did this exercise prior to mq-deadline support.

Making a generic change for both non-mq and mq would be good, but some
users may not like to see a regression.
E.g., with QD = 32 for a SATA PD, file system creation may be faster
compared to a large QD. There may be soft merging at the block layer due to
queue depth throttling. Eventually, FS creation goes fast due to IO
merges, but the same will not be true if we change the queue depth logic
(that is, increase the device queue depth to the HBA QD).

We have the choice to completely remove this patch and ask users to adjust
sysfs settings in case of scsi-mq performance issues with HDD sequential
workloads.
With this patch, we want to provide better QD settings as the default from
the driver.

Thanks, Kashyap

Re: [PATCH] scsi: default to scsi-mq

2017-07-11 Thread Bart Van Assche

On Tue, 2017-07-11 at 15:14 +0100, John Garry wrote:
> On 11/07/2017 14:32, Bart Van Assche wrote:
> > On Tue, 2017-07-11 at 11:22 +0100, John Garry wrote:
> > > On 10/07/2017 16:50, Bart Van Assche wrote:
> > > > Since a fix for the performance regression triggered by this patch will 
> > > > be upstream
> > > > soon (see also 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-linus=32825c45ff8f4cce937ab85b030dc693ceb1aa0a):
> > > > 
> > > 
> > > FYI, on linux-next 20170711 (which now includes the above patch Bart
> > > mentioned) we see a large performance regression on hisi_sas (LLDD does
> > > not config shost for mq).
> > > 
> > > fio read mode iops goes from ~833K (scsi_mod.use_blk_mq=n on cmdline) to
> > > ~320K
> > 
> > Hello John,
> > 
> > Thanks for the feedback. Is the kernel config with which these measurements
> > were performed available somewhere?
> 
> It is the default arm64 defconfig with the following changes:
> CONFIG_ARM_SMMU_V3=n
> CONFIG_9P_FS=n
> 
> We were getting a compile error in the 9p fs code, so disabled it. 
> Turning on the IOMMU drops performance across the board for our 
> platform, so just disabling it for the test.

Hello John,

What block driver controls the block device for which the performance regression
has been observed? How many hardware queues were created by that block driver
(see also /sys/block/*/mq/...)? I'm asking this because the number of hardware
queues controls which I/O scheduler is selected as default. From 
block/elevator.c:

	if (q->mq_ops) {
		if (q->nr_hw_queues == 1)
			e = elevator_get("mq-deadline", false);
		if (!e)
			return 0;
	} else
		e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);

Bart.
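
The selection logic quoted from block/elevator.c can be restated as a small standalone model. This is only a sketch under stated assumptions: `struct queue_model` and `default_sched_name()` are hypothetical names, the `config_default` argument stands in for CONFIG_DEFAULT_IOSCHED, and the case where `elevator_get()` itself fails is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two fields the quoted snippet inspects. */
struct queue_model {
	bool uses_mq;              /* models q->mq_ops != NULL */
	unsigned int nr_hw_queues; /* models q->nr_hw_queues */
};

/*
 * Return the name of the scheduler picked by default, or NULL when the
 * quoted code would return without selecting one.
 */
const char *default_sched_name(const struct queue_model *q,
			       const char *config_default)
{
	if (q->uses_mq)
		return q->nr_hw_queues == 1 ? "mq-deadline" : NULL;
	return config_default;
}
```

Read this way, a blk-mq device with a single hardware queue gets mq-deadline, one with several hardware queues gets no scheduler by default, and a legacy device gets the configured default.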

RE: [PATCH] hpsa: add support for legacy boards

2017-07-11 Thread Don Brace

> -Original Message-
> From: mr...@math.ut.ee [mailto:mr...@math.ut.ee] On Behalf Of Meelis
> Roos
> Sent: Tuesday, July 11, 2017 9:26 AM
> To: Hannes Reinecke 
> Cc: Martin K. Petersen ; Christoph Hellwig
> ; James Bottomley
> ; Don Brace
> ; Jens Axboe ; linux-
> s...@vger.kernel.org; Hannes Reinecke 
> Subject: Re: [PATCH] hpsa: add support for legacy boards
> 
> EXTERNAL EMAIL
> 
> 
> > Add support for legacy boards, ensuring to enable the driver for
> > those boards only when 'hpsa_allow_any' is set.
> 
> Applied this patch, made sure I had compiled in hpsa and not cciss to
> avoid any variables from initramfs, and still I get this:
> 
> [4.015080] hpsa :00:04.0: unrecognized board ID: 0x40800e11, ignoring.
> [4.098473] hpsa :00:04.0: Board ID not found
> 
> Boot command line was "root=/dev/sda1 console=ttyS0,9600 ro
> hpsa_allow_any=1" -
> seems correct.
> 
> By looking at the code, I should see "unsupported board ID:" and it
> should work, but I see "unrecognized board ID:" and it does not work.
> 
> Hmm, trying hpsa.hpsa_allow_any=1. Much better:
> 
> [3.891531] hpsa :00:04.0: unsupported board ID: 0x40800e11
> [3.962367] hpsa :00:04.0: unsupported board ID: 0x40800e11
> [4.033493] hpsa :00:04.0: Controller reports max supported commands
> of 0 Using 16 instead. Ensure that firmware is up to date.
> [4.175134] hpsa :00:04.0: Physical aborts not supported
> [4.242931] hpsa :00:04.0: Logical aborts not supported
> [4.309594] hpsa :00:04.0: HP SSD Smart Path aborts not supported
> [4.460889] hpsa :00:04.0: Controller reports max supported commands
> of 0 Using 16 instead. Ensure that firmware is up to date.
> [4.584679] scsi host0: hpsa
> [4.587842] hpsa :00:04.0: report luns requested format 2, got 0
> [4.613215] hpsa :00:04.0: scsi 0:0:0:0: masked Direct-Access 
> COMPAQ
> BD03685A24   PHYS DRV SSDSmartPathCap- En- Exp=0
> [4.613219] hpsa :00:04.0: C0:B1:T0:L0 Volume status is not available
> through vital product data pages.
> [4.613224] hpsa :00:04.0: scsi 0:1:0:0: offline Direct-Access 
> COMPAQ
> LOGICAL VOLUME   RAID-1(+0) SSDSmartPathCap- En- Exp=1
> [4.613229] hpsa :00:04.0: scsi 0:3:0:0: added RAID  COMPAQ
> Smart Array 5i   controller SSDSmartPathCap- En- Exp=1
> [6.187725] scsi 0:3:0:0: RAID  COMPAQ   Smart Array 5i   2.62 
> PQ: 0
> ANSI: 0
> [...]
> [6.726872] VFS: Cannot open root device "sda1" or unknown-block(0,0):
> error -6
> [6.814364] Please append a correct "root=" boot option; here are the
> available partitions:
> [6.914403] Kernel panic - not syncing: VFS: Unable to mount root fs on
> unknown-block(0,0)
> 
> Controller is detected, there is something behind it but no sda is
> detected and no bootup.
> 
> What next?
> 
> And, for readability, we should use something like "Using unsupported
> board ID", not plain "unsupported board ID" - the last one leaves
> assumption that it will not work, although it should.
> 
> --
> Meelis Roos (mr...@linux.ee)

The 5i controller is probably too old for the hpsa driver to support.
The hpsa driver is looking for information to determine if the drive is
online/offline, and this information is not available.

What was the original issue you were having with the cciss driver?

Re: [PATCH v3 1/7] libsas: Use static sas event pool to appease sas event lost

2017-07-11 Thread John Garry


On 10/07/2017 08:06, Yijing Wang wrote:
> Now libsas hotplug work is static: every sas event type has its own
> static work, and the LLDD driver queues the hotplug work into shost->work_q.
> If the LLDD driver posts a burst of hotplug events to libsas, the hotplug
> events may be left pending in the workqueue like
>
> shost->work_q
> new work[PORTE_BYTES_DMAED] --> |[PHYE_LOSS_OF_SIGNAL][PORTE_BYTES_DMAED] -> processing
>                                 |<---wait worker to process>|
> In this case, when a new PORTE_BYTES_DMAED event comes in, libsas tries to
> queue it to shost->work_q, but this work is already pending, so it is lost.
> Finally, libsas deletes the related sas port and sas devices, but the LLDD
> driver expects libsas to add the sas port and devices (last sas event).
>
> This patch uses a static sas event work pool to mitigate this issue; since
> it's a static work pool, it won't exhaust memory.

[ ... ]

> +#define PORT_POOL_SIZE  (PORT_NUM_EVENTS * 5)
> +#define PHY_POOL_SIZE   (PHY_NUM_EVENTS * 5)
> +
>  /* The phy pretty much is controlled by the LLDD.
>   * The class only reads those fields.
>   */
>  struct asd_sas_phy {
>  /* private: */
> -       struct asd_sas_event   port_events[PORT_NUM_EVENTS];
> -       struct asd_sas_event   phy_events[PHY_NUM_EVENTS];
> -
> -       unsigned long port_events_pending;
> -       unsigned long phy_events_pending;
> +       struct asd_sas_event   port_events[PORT_POOL_SIZE];
> +       struct asd_sas_event   phy_events[PHY_POOL_SIZE];
>
>         int error;

Hi Yijing,

So now we are creating a static pool of events per PHY/port, instead of 
having 1 static work struct per event per PHY/port. So, for sure, this 
avoids the dynamic event issue of system memory exhaustion which we 
discussed in v1+v2 series. And it seems to possibly remove issue of 
losing SAS events.


But how did you determine the pool size for a PHY/port? It would seem to 
be 5 * #phy events or #port events (which is also 5, I figure by 
coincidence). How does this deal with flutter of >25 events?


Thanks,
John
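
The question about bursts larger than the pool can be made concrete with a small model. This is a sketch under assumptions, not the libsas implementation: `struct event_slot`, `struct event_pool`, `pool_get()` and `pool_put()` are hypothetical names, and the size of 25 follows the 5 * 5 arithmetic from the review comment above.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_EVENTS 5                 /* event types, per the review comment */
#define POOL_SIZE  (NUM_EVENTS * 5)  /* mirrors the *_POOL_SIZE macros */

/* Hypothetical pool slot; the real struct asd_sas_event carries a work item. */
struct event_slot {
	bool in_use;
	int event;
};

struct event_pool {
	struct event_slot slots[POOL_SIZE];
};

/* Claim a free slot for an event, or return NULL if the pool is exhausted. */
struct event_slot *pool_get(struct event_pool *pool, int event)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (!pool->slots[i].in_use) {
			pool->slots[i].in_use = true;
			pool->slots[i].event = event;
			return &pool->slots[i];
		}
	}
	return NULL;
}

/* Release a slot once its event has been processed. */
void pool_put(struct event_slot *slot)
{
	slot->in_use = false;
}
```

In this model the 26th event of a burst finds no free slot, so the caller still has to decide what to do with it; a static pool bounds memory use but does not by itself guarantee that no event is dropped.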

Re: [PATCH v2 1/4] scsi: scsi_dh_alua: allow I/O in target port unavailable and standby states

2017-07-11 Thread Mauricio Faria de Oliveira


On 07/11/2017 06:18 AM, Hannes Reinecke wrote:
> NACK.
>
> The _whole point_ of having device handlers is to _avoid_ I/O errors
> during booting.
>
> And the ALUA checker is prepared to handle this situation properly.
> The directio checker of course doesn't know about this, but then no-one
> expected the directio checker to work with ALUA.


I lacked that more holistic understanding. Thanks for explaining.

Now, for the sake of logging/debugging...

Any problem with patches 2 and 4?

Also, it seems the Unavailable/Standby states would not be logged
without a recheck from alua_check_sense(), since the only callers
of alua_rtpg_queue() are alua_activate() and alua_check[_sense]()
[the call from alua_check_vpd() is only in the initialization path].

Isn't there a point in scheduling a recheck once those conditions
are found in alua_check_sense() to get them logged? - since valid
path checkers won't go through that function.

(and it occurred to me that the state-change check of patch 3 can
be done there, simpler.)

cheers,

--
Mauricio Faria de Oliveira
IBM Linux Technology Center

Re: [PATCH] hpsa: add support for legacy boards

2017-07-11 Thread Meelis Roos

> Add support for legacy boards, ensuring to enable the driver for
> those boards only when 'hpsa_allow_any' is set.

Applied this patch, made sure I had compiled in hpsa and not cciss to 
avoid any variables from initramfs, and still I get this:

[4.015080] hpsa :00:04.0: unrecognized board ID: 0x40800e11, ignoring.
[4.098473] hpsa :00:04.0: Board ID not found

Boot command line was "root=/dev/sda1 console=ttyS0,9600 ro hpsa_allow_any=1" - 
seems correct.

By looking at the code, I should see "unsupported board ID:" and it 
should work, but I see "unrecognized board ID:" and it does not work.

Hmm, trying hpsa.hpsa_allow_any=1. Much better:

[3.891531] hpsa :00:04.0: unsupported board ID: 0x40800e11
[3.962367] hpsa :00:04.0: unsupported board ID: 0x40800e11
[4.033493] hpsa :00:04.0: Controller reports max supported commands of 
0 Using 16 instead. Ensure that firmware is up to date.
[4.175134] hpsa :00:04.0: Physical aborts not supported
[4.242931] hpsa :00:04.0: Logical aborts not supported
[4.309594] hpsa :00:04.0: HP SSD Smart Path aborts not supported
[4.460889] hpsa :00:04.0: Controller reports max supported commands of 
0 Using 16 instead. Ensure that firmware is up to date.
[4.584679] scsi host0: hpsa
[4.587842] hpsa :00:04.0: report luns requested format 2, got 0
[4.613215] hpsa :00:04.0: scsi 0:0:0:0: masked Direct-Access COMPAQ 
  BD03685A24   PHYS DRV SSDSmartPathCap- En- Exp=0
[4.613219] hpsa :00:04.0: C0:B1:T0:L0 Volume status is not available 
through vital product data pages.
[4.613224] hpsa :00:04.0: scsi 0:1:0:0: offline Direct-Access 
COMPAQ   LOGICAL VOLUME   RAID-1(+0) SSDSmartPathCap- En- Exp=1
[4.613229] hpsa :00:04.0: scsi 0:3:0:0: added RAID  COMPAQ  
 Smart Array 5i   controller SSDSmartPathCap- En- Exp=1
[6.187725] scsi 0:3:0:0: RAID  COMPAQ   Smart Array 5i   2.62 
PQ: 0 ANSI: 0
[...]
[6.726872] VFS: Cannot open root device "sda1" or unknown-block(0,0): error 
-6
[6.814364] Please append a correct "root=" boot option; here are the 
available partitions:
[6.914403] Kernel panic - not syncing: VFS: Unable to mount root fs on 
unknown-block(0,0)

Controller is detected, there is something behind it but no sda is 
detected and no bootup.

What next?

And, for readability, we should use something like "Using unsupported 
board ID", not plain "unsupported board ID" - the last one leaves 
assumption that it will not work, although it should.

-- 
Meelis Roos (mr...@linux.ee)

Re: [PATCH] scsi: default to scsi-mq

2017-07-11 Thread John Garry


On 11/07/2017 14:32, Bart Van Assche wrote:
> On Tue, 2017-07-11 at 11:22 +0100, John Garry wrote:
>> On 10/07/2017 16:50, Bart Van Assche wrote:
>>> Since a fix for the performance regression triggered by this patch will be
>>> upstream soon (see also
>>> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-linus=32825c45ff8f4cce937ab85b030dc693ceb1aa0a):
>>
>> FYI, on linux-next 20170711 (which now includes the above patch Bart
>> mentioned) we see a large performance regression on hisi_sas (LLDD does
>> not config shost for mq).
>>
>> fio read mode iops goes from ~833K (scsi_mod.use_blk_mq=n on cmdline) to
>> ~320K
>
> Hello John,
>
> Thanks for the feedback. Is the kernel config with which these measurements
> were performed available somewhere?

Hi Bart,

It is the default arm64 defconfig with the following changes:
CONFIG_ARM_SMMU_V3=n
CONFIG_9P_FS=n

We were getting a compile error in the 9p fs code, so disabled it. 
Turning on the IOMMU drops performance across the board for our 
platform, so just disabling it for the test.

John

> Bart.





config.tar.gz
Description: application/gzip

[PATCH] scsi: scsi_dh_alua: fix boolreturn.cocci warnings

2017-07-11 Thread kbuild test robot

drivers/scsi/device_handler/scsi_dh_alua.c:594:9-10: WARNING: return of 0/1 in 
function 'alua_rtpg_print_check' with return type bool

 Return statements in functions returning bool should use
 true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci

Fixes: cb809ba2fcbf ("scsi: scsi_dh_alua: do not print RTPG state if it remains 
unavailable/standby")
CC: Mauricio Faria de Oliveira 
Signed-off-by: Fengguang Wu 
---

 scsi_dh_alua.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -591,7 +591,7 @@ static bool alua_rtpg_print_check(int ol
 	case SCSI_ACCESS_STATE_STANDBY:
 		return old_state != new_state;
 	default:
-		return 1;
+		return true;
 	}
 }
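
For reference, the convention flagged by boolreturn.cocci can be shown in a standalone form. This is a sketch with hypothetical names: `enum access_state` and `rtpg_print_check()` are illustrative, the enum values are not the kernel's SCSI_ACCESS_STATE_* constants, and switching on the new state is an assumption since the diff context does not show the switch expression.

```c
#include <stdbool.h>

/* Hypothetical state values; the kernel's SCSI_ACCESS_STATE_* constants differ. */
enum access_state {
	STATE_OPTIMAL,
	STATE_STANDBY,
	STATE_UNAVAILABLE,
};

/* Decide whether a state report should be printed. */
bool rtpg_print_check(enum access_state old_state, enum access_state new_state)
{
	switch (new_state) {
	case STATE_UNAVAILABLE:
	case STATE_STANDBY:
		/* A comparison already yields a bool. */
		return old_state != new_state;
	default:
		return true; /* rather than "return 1" */
	}
}
```

Behaviour is unchanged by the patch; only the spelling of the constant matches the declared return type.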

Re: [PATCH v2 3/4] scsi: scsi_dh_alua: do not print RTPG state if it remains unavailable/standby

2017-07-11 Thread kbuild test robot

Hi Mauricio,

[auto build test WARNING on bvanassche/for-next]
[also build test WARNING on v4.12 next-20170711]
[cannot apply to mkp-scsi/for-next scsi/for-next]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Mauricio-Faria-de-Oliveira/scsi_dh_alua-fix-stuck-I-O-after-unavailable-standby-states/20170711-141350
base:   https://github.com/bvanassche/linux for-next


coccinelle warnings: (new ones prefixed by >>)

>> drivers/scsi/device_handler/scsi_dh_alua.c:594:9-10: WARNING: return of 0/1 
>> in function 'alua_rtpg_print_check' with return type bool

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

Re: [PATCH v2 15/15] megaraid_sas: driver version upgrade

2017-07-11 Thread Tomas Henzl

On 5.7.2017 14:00, Shivasharan S wrote:
> Signed-off-by: Shivasharan S 
> Reviewed-by: Hannes Reinecke 

Reviewed-by: Tomas Henzl 

tomash

Re: [PATCH v2 14/15] megaraid_sas: call megasas_dump_frame with correct IO frame size

2017-07-11 Thread Tomas Henzl

On 5.7.2017 14:00, Shivasharan S wrote:
> Signed-off-by: Kashyap Desai 
> Signed-off-by: Shivasharan S 
> Reviewed-by: Hannes Reinecke 

Reviewed-by: Tomas Henzl 

tomash

Re: [PATCH v2 13/15] megaraid_sas: modified few prints in OCR and IOC INIT path

2017-07-11 Thread Tomas Henzl

On 5.7.2017 14:00, Shivasharan S wrote:
> Signed-off-by: Kashyap Desai 
> Signed-off-by: Shivasharan S 
> Reviewed-by: Hannes Reinecke 

Reviewed-by: Tomas Henzl 

tomash

Re: [PATCH v2 11/15] megaraid_sas: Set device queue_depth same as HBA can_queue value in scsi-mq mode

2017-07-11 Thread Christoph Hellwig

On Wed, Jul 05, 2017 at 05:00:25AM -0700, Shivasharan S wrote:
> Currently driver sets default queue_depth for VDs at 256 and JBODs based on 
> interface type,
> ie., for SAS JBOD QD will be 64, for SATA JBOD QD will be 32.
> During performance runs with scsi-mq enabled, we are seeing better results by
> setting QD same as HBA queue_depth.

Please no scsi-mq specifics.  just do this unconditionally.

Re: [PATCH v2 12/15] megaraid_sas: replace internal FALSE/TRUE definitions with false/true

2017-07-11 Thread Tomas Henzl

On 5.7.2017 14:00, Shivasharan S wrote:
> Signed-off-by: Kashyap Desai 
> Signed-off-by: Shivasharan S 
> Reviewed-by: Hannes Reinecke 

Reviewed-by: Tomas Henzl 

tomash

Re: [PATCH v2 11/15] megaraid_sas: Set device queue_depth same as HBA can_queue value in scsi-mq mode

2017-07-11 Thread Tomas Henzl

On 5.7.2017 14:00, Shivasharan S wrote:
> Currently driver sets default queue_depth for VDs at 256 and JBODs based on 
> interface type,
> ie., for SAS JBOD QD will be 64, for SATA JBOD QD will be 32.
> During performance runs with scsi-mq enabled, we are seeing better results by
> setting QD same as HBA queue_depth.
>
> Signed-off-by: Kashyap Desai 
> Signed-off-by: Shivasharan S 

Reviewed-by: Tomas Henzl 

tomash

Re: [PATCH v2 10/15] megaraid_sas: Return pended IOCTLs with cmd_status MFI_STAT_WRONG_STATE in case adapter is dead

2017-07-11 Thread Tomas Henzl

On 5.7.2017 14:00, Shivasharan S wrote:
> Fix - After a kill adapter, since the cmd_status is not set the
> IOCTLs will be hung in driver resulting in application hang.
> Set cmd_status MFI_STAT_WRONG_STATE when completing pended IOCTLs.
>
> Signed-off-by: Kashyap Desai 
> Signed-off-by: Shivasharan S 
> Cc: sta...@vger.kernel.org
> Reviewed-by: Hannes Reinecke 

Reviewed-by: Tomas Henzl 

tomash

Re: [PATCH] scsi: default to scsi-mq

2017-07-11 Thread Bart Van Assche

On Tue, 2017-07-11 at 11:22 +0100, John Garry wrote:
> On 10/07/2017 16:50, Bart Van Assche wrote:
> > Since a fix for the performance regression triggered by this patch will be 
> > upstream
> > soon (see also 
> > https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-linus=32825c45ff8f4cce937ab85b030dc693ceb1aa0a):
> > 
> 
> FYI, on linux-next 20170711 (which now includes the above patch Bart 
> mentioned) we see a large performance regression on hisi_sas (LLDD does 
> not config shost for mq).
> 
> fio read mode iops goes from ~833K (scsi_mod.use_blk_mq=n on cmdline) to 
> ~320K

Hello John,

Thanks for the feedback. Is the kernel config with which these measurements
were performed available somewhere?

Bart.

Re: [PATCH v2 09/15] megaraid_sas: use vmalloc for crash dump buffers and driver's local RAID map

2017-07-11 Thread Tomas Henzl

On 11.7.2017 12:49, Shivasharan Srikanteshwara wrote:
>> -Original Message-
>> From: Tomas Henzl [mailto:the...@redhat.com]
>> Sent: Monday, July 10, 2017 7:15 PM
>> To: Shivasharan S; linux-scsi@vger.kernel.org
>> Cc: martin.peter...@oracle.com; j...@linux.vnet.ibm.com;
>> kashyap.de...@broadcom.com; sumit.sax...@broadcom.com;
>> h...@suse.com; h...@lst.de
>> Subject: Re: [PATCH v2 09/15] megaraid_sas: use vmalloc for crash dump
>> buffers and driver's local RAID map
>>
>> On 5.7.2017 14:00, Shivasharan S wrote:
>>> Signed-off-by: Kashyap Desai 
>>> Signed-off-by: Shivasharan S 
>>> Reviewed-by: Hannes Reinecke 
>>> ---
>>>  drivers/scsi/megaraid/megaraid_sas.h|   1 -
>>>  drivers/scsi/megaraid/megaraid_sas_base.c   |  12 ++-
>>>  drivers/scsi/megaraid/megaraid_sas_fusion.c | 121
>>> ++--
>>>  3 files changed, 88 insertions(+), 46 deletions(-)
>>>
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas.h
>>> b/drivers/scsi/megaraid/megaraid_sas.h
>>> index 2b209bb..6d9f111 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas.h
>>> +++ b/drivers/scsi/megaraid/megaraid_sas.h
>>> @@ -2115,7 +2115,6 @@ struct megasas_instance {
>>> u32 *crash_dump_buf;
>>> dma_addr_t crash_dump_h;
>>> void *crash_buf[MAX_CRASH_DUMP_SIZE];
>>> -   u32 crash_buf_pages;
>>> unsigned intfw_crash_buffer_size;
>>> unsigned intfw_crash_state;
>>> unsigned intfw_crash_buffer_offset;
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>>> b/drivers/scsi/megaraid/megaraid_sas_base.c
>>> index e490272..c63ef88 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas_base.c
>>> +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>>> @@ -49,6 +49,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>
>>>  #include 
>>>  #include 
>>> @@ -6672,9 +6673,14 @@ static void megasas_detach_one(struct pci_dev
>> *pdev)
>>>   fusion->max_map_sz,
>>>   fusion->ld_map[i],
>>>   fusion->ld_map_phys[i]);
>>> -   if (fusion->ld_drv_map[i])
>>> -   free_pages((ulong)fusion->ld_drv_map[i],
>>> -   fusion->drv_map_pages);
>>> +   if (fusion->ld_drv_map[i]) {
>>> +   if (is_vmalloc_addr(fusion->ld_drv_map[i]))
>>> +   vfree(fusion->ld_drv_map[i]);
>>> +   else
>>> +   free_pages((ulong)fusion-
>>> ld_drv_map[i],
>>> +  fusion->drv_map_pages);
>>> +   }
>>> +
>>> if (fusion->pd_seq_sync[i])
>>> dma_free_coherent(>pdev->dev,
>>> pd_seq_map_sz,
>>> diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>> b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>> index c239762..ff4a3a8 100644
>>> --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>> +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
>>> @@ -1257,6 +1257,80 @@ static int
>>> megasas_create_sg_sense_fusion(struct megasas_instance *instance)  }
>>>
>>>  /**
>>> + * megasas_allocate_raid_maps -Allocate memory for RAID maps
>>> + * @instance:  Adapter soft state
>>> + *
>>> + * return: if success: return 0
>>> + * failed:  return -ENOMEM
>>> + */
>>> +static inline int megasas_allocate_raid_maps(struct megasas_instance
>>> +*instance) {
>>> +   struct fusion_context *fusion;
>>> +   int i = 0;
>>> +
>>> +   fusion = instance->ctrl_context;
>>> +
>>> +   fusion->drv_map_pages = get_order(fusion->drv_map_sz);
>>> +
>>> +   for (i = 0; i < 2; i++) {
>>> +   fusion->ld_map[i] = NULL;
>> Hi, does this assignment^ mean, that you need a fusion->ld_drv_map[0;1] =
>> NULL setting before this for cycle as well or is it just superfluos ?
>>
> Hi Tomas,
> Initializing ld_map[i] = NULL is not necessary but that got carried over
> from
> earlier code. We do not need to set fusion->ld_drv_map[0:1] to NULL here as
> fusion_context is memset to zero during allocation.
>
>>> +
>>> +   fusion->ld_drv_map[i] = (void *)
>>> +   __get_free_pages(__GFP_ZERO | GFP_KERNEL,
>>> +fusion->drv_map_pages);
>> The subject says - 'use vmalloc for ... and driver's local RAID map'
>> in the code here you use vmalloc only if __get_free_pages fails is this
>> intended ?
>> (maybe an explanation in the mail body would be nice)
>>
>> tomash
>>
> I will send out v3 of the series with a more detailed commit description.
> The use of __get_free_pages first for driver's local RAID map is intentional
> as this
> structure is frequently accessed. But we do not

Re: [PATCH] virtio_scsi: always read VPD pages for multiqueue too

2017-07-11 Thread Stefan Hajnoczi

On Wed, Jul 05, 2017 at 10:30:56AM +0200, Paolo Bonzini wrote:
> Multi-queue virtio-scsi uses a different scsi_host_template struct.
> Add the .device_alloc field there, too.
> 
> Fixes: 25d1d50e23275e141e3a3fe06c25a99f4c4bf4e0
> Cc: sta...@vger.kernel.org
> Cc: David Gibson 
> Signed-off-by: Paolo Bonzini 
> ---
>  drivers/scsi/virtio_scsi.c | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Stefan Hajnoczi 



[PATCH] scsi: hisi_sas: make several const arrays static

2017-07-11 Thread Colin King

From: Colin Ian King 

Don't populate various tables on the stack but make them static const.
Makes the object code smaller by over 280 bytes:

Before:
   text    data     bss     dec     hex filename
  39887    5080      64   45031    afe7 hisi_sas_v2_hw.o

After:
   text    data     bss     dec     hex filename
  39318    5368      64   44750    aece hisi_sas_v2_hw.o

Signed-off-by: Colin Ian King 
---
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index 551d103c27f1..2bfea7082e3a 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -1693,7 +1693,7 @@ static int prep_ssp_v2_hw(struct hisi_hba *hisi_hba,
 
 static int parse_trans_tx_err_code_v2_hw(u32 err_msk)
 {
-   const u8 trans_tx_err_code_prio[] = {
+   static const u8 trans_tx_err_code_prio[] = {
TRANS_TX_OPEN_FAIL_WITH_IT_NEXUS_LOSS,
TRANS_TX_ERR_PHY_NOT_ENABLE,
TRANS_TX_OPEN_CNX_ERR_WRONG_DESTINATION,
@@ -1738,7 +1738,7 @@ static int parse_trans_tx_err_code_v2_hw(u32 err_msk)
 
 static int parse_trans_rx_err_code_v2_hw(u32 err_msk)
 {
-   const u8 trans_rx_err_code_prio[] = {
+   static const u8 trans_rx_err_code_prio[] = {
TRANS_RX_ERR_WITH_RXFRAME_CRC_ERR,
TRANS_RX_ERR_WITH_RXFIS_8B10B_DISP_ERR,
TRANS_RX_ERR_WITH_RXFRAME_HAVE_ERRPRM,
@@ -1784,7 +1784,7 @@ static int parse_trans_rx_err_code_v2_hw(u32 err_msk)
 
 static int parse_dma_tx_err_code_v2_hw(u32 err_msk)
 {
-   const u8 dma_tx_err_code_prio[] = {
+   static const u8 dma_tx_err_code_prio[] = {
DMA_TX_UNEXP_XFER_ERR,
DMA_TX_UNEXP_RETRANS_ERR,
DMA_TX_XFER_LEN_OVERFLOW,
@@ -1810,7 +1810,7 @@ static int parse_dma_tx_err_code_v2_hw(u32 err_msk)
 
 static int parse_sipc_rx_err_code_v2_hw(u32 err_msk)
 {
-   const u8 sipc_rx_err_code_prio[] = {
+   static const u8 sipc_rx_err_code_prio[] = {
SIPC_RX_FIS_STATUS_ERR_BIT_VLD,
SIPC_RX_PIO_WRSETUP_STATUS_DRQ_ERR,
SIPC_RX_FIS_STATUS_BSY_BIT_ERR,
@@ -1836,7 +1836,7 @@ static int parse_sipc_rx_err_code_v2_hw(u32 err_msk)
 
 static int parse_dma_rx_err_code_v2_hw(u32 err_msk)
 {
-   const u8 dma_rx_err_code_prio[] = {
+   static const u8 dma_rx_err_code_prio[] = {
DMA_RX_UNKNOWN_FRM_ERR,
DMA_RX_DATA_LEN_OVERFLOW,
DMA_RX_DATA_LEN_UNDERFLOW,
-- 
2.11.0
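
The pattern applied by this patch can be shown in a reduced form. This is a sketch only: `parse_err_code()` and the three error bit names are hypothetical and much smaller than the driver's tables, and returning the table index of the first matching entry is an assumption about how such a priority table is consumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error bit positions, listed below from highest to lowest priority. */
enum { ERR_CRC = 3, ERR_DISPARITY = 0, ERR_OVERFLOW = 5 };

/*
 * Return the index of the highest-priority error whose bit is set in
 * err_msk, or -1 if none is set.
 */
int parse_err_code(uint32_t err_msk)
{
	/*
	 * "static const" lets the table live in read-only data, instead of
	 * the compiler possibly emitting code to populate it on the stack
	 * each time the function is called.
	 */
	static const uint8_t err_code_prio[] = {
		ERR_CRC,
		ERR_DISPARITY,
		ERR_OVERFLOW,
	};
	size_t n = sizeof(err_code_prio) / sizeof(err_code_prio[0]);

	for (size_t i = 0; i < n; i++) {
		if (err_msk & (UINT32_C(1) << err_code_prio[i]))
			return (int)i;
	}
	return -1;
}
```

The lookup result is the same with or without `static`; the difference is where the table is stored, which is what the size figures in the commit message reflect.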

RE: [PATCH v2 09/15] megaraid_sas: use vmalloc for crash dump buffers and driver's local RAID map

2017-07-11 Thread Shivasharan Srikanteshwara

> -Original Message-
> From: Tomas Henzl [mailto:the...@redhat.com]
> Sent: Monday, July 10, 2017 7:15 PM
> To: Shivasharan S; linux-scsi@vger.kernel.org
> Cc: martin.peter...@oracle.com; j...@linux.vnet.ibm.com;
> kashyap.de...@broadcom.com; sumit.sax...@broadcom.com;
> h...@suse.com; h...@lst.de
> Subject: Re: [PATCH v2 09/15] megaraid_sas: use vmalloc for crash dump
> buffers and driver's local RAID map
>
> On 5.7.2017 14:00, Shivasharan S wrote:
> > Signed-off-by: Kashyap Desai 
> > Signed-off-by: Shivasharan S 
> > Reviewed-by: Hannes Reinecke 
> > ---
> >  drivers/scsi/megaraid/megaraid_sas.h|   1 -
> >  drivers/scsi/megaraid/megaraid_sas_base.c   |  12 ++-
> >  drivers/scsi/megaraid/megaraid_sas_fusion.c | 121
> > ++--
> >  3 files changed, 88 insertions(+), 46 deletions(-)
> >
> > diff --git a/drivers/scsi/megaraid/megaraid_sas.h
> > b/drivers/scsi/megaraid/megaraid_sas.h
> > index 2b209bb..6d9f111 100644
> > --- a/drivers/scsi/megaraid/megaraid_sas.h
> > +++ b/drivers/scsi/megaraid/megaraid_sas.h
> > @@ -2115,7 +2115,6 @@ struct megasas_instance {
> > u32 *crash_dump_buf;
> > dma_addr_t crash_dump_h;
> > void *crash_buf[MAX_CRASH_DUMP_SIZE];
> > -   u32 crash_buf_pages;
> > unsigned intfw_crash_buffer_size;
> > unsigned intfw_crash_state;
> > unsigned intfw_crash_buffer_offset;
> > diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
> > b/drivers/scsi/megaraid/megaraid_sas_base.c
> > index e490272..c63ef88 100644
> > --- a/drivers/scsi/megaraid/megaraid_sas_base.c
> > +++ b/drivers/scsi/megaraid/megaraid_sas_base.c
> > @@ -49,6 +49,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #include 
> >  #include 
> > @@ -6672,9 +6673,14 @@ static void megasas_detach_one(struct pci_dev
> *pdev)
> >   fusion->max_map_sz,
> >   fusion->ld_map[i],
> >   fusion->ld_map_phys[i]);
> > -   if (fusion->ld_drv_map[i])
> > -   free_pages((ulong)fusion->ld_drv_map[i],
> > -   fusion->drv_map_pages);
> > +   if (fusion->ld_drv_map[i]) {
> > +   if (is_vmalloc_addr(fusion->ld_drv_map[i]))
> > +   vfree(fusion->ld_drv_map[i]);
> > +   else
> > +   free_pages((ulong)fusion-
> >ld_drv_map[i],
> > +  fusion->drv_map_pages);
> > +   }
> > +
> > if (fusion->pd_seq_sync[i])
> > dma_free_coherent(>pdev->dev,
> > pd_seq_map_sz,
> > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > index c239762..ff4a3a8 100644
> > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > @@ -1257,6 +1257,80 @@ static int
> > megasas_create_sg_sense_fusion(struct megasas_instance *instance)  }
> >
> >  /**
> > + * megasas_allocate_raid_maps -Allocate memory for RAID maps
> > + * @instance:  Adapter soft state
> > + *
> > + * return: if success: return 0
> > + * failed:  return -ENOMEM
> > + */
> > +static inline int megasas_allocate_raid_maps(struct megasas_instance
> > +*instance) {
> > +   struct fusion_context *fusion;
> > +   int i = 0;
> > +
> > +   fusion = instance->ctrl_context;
> > +
> > +   fusion->drv_map_pages = get_order(fusion->drv_map_sz);
> > +
> > +   for (i = 0; i < 2; i++) {
> > +   fusion->ld_map[i] = NULL;
>
> Hi, does this assignment^ mean that you need a fusion->ld_drv_map[0:1] =
> NULL setting before this for loop as well, or is it just superfluous?
>
Hi Tomas,
Initializing ld_map[i] = NULL is not necessary, but it got carried over from
earlier code. We do not need to set fusion->ld_drv_map[0:1] to NULL here, as
fusion_context is memset to zero during allocation.

>
> > +
> > +   fusion->ld_drv_map[i] = (void *)
> > +   __get_free_pages(__GFP_ZERO | GFP_KERNEL,
> > +fusion->drv_map_pages);
>
> The subject says - 'use vmalloc for ... and driver's local RAID map'.
> In the code here you use vmalloc only if __get_free_pages fails; is this
> intended?
> (maybe an explanation in the mail body would be nice)
>
> tomash
>
I will send out v3 of the series with a more detailed commit description.
The use of __get_free_pages first for the driver's local RAID map is
intentional, as this structure is frequently accessed. But we do not want to
fail device probe due to unavailability of contiguous memory.

Thanks,
Shivasharan
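
For readers following the thread, the allocation policy described above (try the contiguous allocator first, fall back to a non-contiguous one so probe does not fail, and free with the matching routine) can be modelled in user space. This is only a sketch under stated assumptions: alloc_contiguous(), alloc_fallback(), struct raid_map_buf and the contiguous_available flag are hypothetical stand-ins for __get_free_pages(), the vmalloc path and is_vmalloc_addr(); none of them are driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space model; these names do not exist in megaraid_sas. */
enum map_origin { MAP_NONE, MAP_CONTIGUOUS, MAP_FALLBACK };

struct raid_map_buf {
	void *ptr;
	enum map_origin origin;	/* plays the role of is_vmalloc_addr() */
};

/* Stand-in for __get_free_pages(): may fail when memory is fragmented. */
static void *alloc_contiguous(size_t size, bool contiguous_available)
{
	return contiguous_available ? calloc(1, size) : NULL;
}

/* Stand-in for the vmalloc-based fallback. */
static void *alloc_fallback(size_t size)
{
	return calloc(1, size);
}

/* Prefer the contiguous allocation, fall back instead of failing. */
static int raid_map_alloc(struct raid_map_buf *buf, size_t size,
			  bool contiguous_available)
{
	buf->ptr = alloc_contiguous(size, contiguous_available);
	buf->origin = MAP_CONTIGUOUS;
	if (!buf->ptr) {
		buf->ptr = alloc_fallback(size);
		buf->origin = MAP_FALLBACK;
	}
	if (!buf->ptr) {
		buf->origin = MAP_NONE;
		return -1;	/* the driver would return -ENOMEM here */
	}
	return 0;
}

/* Free path: the origin decides which free routine the driver would call. */
static void raid_map_free(struct raid_map_buf *buf)
{
	assert(buf->origin != MAP_NONE || buf->ptr == NULL);
	free(buf->ptr);	/* both stand-ins are backed by calloc() */
	buf->ptr = NULL;
	buf->origin = MAP_NONE;
}
```

Calling raid_map_alloc() with contiguous_available set to false exercises the fallback branch, which is the case the patch is meant to survive.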

Re: tgtd CPU 100% problem

2017-07-11 Thread Sagi Grimberg




> Thanks very much for the reply.
> I think it may be better to fail an assertion and exit than to run
> the endless loop after receiving the DEVICE_REMOVAL event.
> Or we can sleep 5 ms to check if the conn->h.state is STATE_FULL.


Note that neither of the CC'd lists is the correct list
for stgt; you should write to s...@lists.wpkg.org

Regarding your observation, I am not sure why you think this is
happening: the work is queued once every 5 seconds and the loop
is finite.
Re: [PATCH] scsi: default to scsi-mq

2017-07-11 Thread John Garry


On 10/07/2017 16:50, Bart Van Assche wrote:

On Fri, 2017-06-16 at 10:27 +0200, Christoph Hellwig wrote:

Remove the SCSI_MQ_DEFAULT config option and default to the blk-mq I/O
path now that we had plenty of testing, and have I/O schedulers for
blk-mq.  The module option to disable the blk-mq path is kept around
for now.

Signed-off-by: Christoph Hellwig <h...@lst.de>
---
 drivers/scsi/Kconfig | 11 ---
 drivers/scsi/scsi.c  |  4 
 2 files changed, 15 deletions(-)

diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 3c52867dfe28..d384f4f86c26 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -47,17 +47,6 @@ config SCSI_NETLINK
default n
depends on NET

-config SCSI_MQ_DEFAULT
-   bool "SCSI: use blk-mq I/O path by default"
-   depends on SCSI
-   ---help---
- This option enables the new blk-mq based I/O path for SCSI
- devices by default.  With the option the scsi_mod.use_blk_mq
- module/boot option defaults to Y, without it to N, but it can
- still be overridden either way.
-
- If unsure say N.
-
 config SCSI_PROC_FS
bool "legacy /proc/scsi/ support"
depends on SCSI && PROC_FS
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 1bf274e3b2b6..3d38c6d463b8 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -800,11 +800,7 @@ MODULE_LICENSE("GPL");
 module_param(scsi_logging_level, int, S_IRUGO|S_IWUSR);
 MODULE_PARM_DESC(scsi_logging_level, "a bit mask of logging levels");

-#ifdef CONFIG_SCSI_MQ_DEFAULT
 bool scsi_use_blk_mq = true;
-#else
-bool scsi_use_blk_mq = false;
-#endif
 module_param_named(use_blk_mq, scsi_use_blk_mq, bool, S_IWUSR | S_IRUGO);

 static int __init init_scsi(void)


Since a fix for the performance regression triggered by this patch will be
upstream soon (see also
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-linus=32825c45ff8f4cce937ab85b030dc693ceb1aa0a):



FYI, on linux-next 20170711 (which now includes the above patch Bart
mentioned) we see a large performance regression on hisi_sas (the LLDD does
not configure the shost for mq).


fio read-mode IOPS drop from ~833K (scsi_mod.use_blk_mq=n on cmdline) to
~320K.


John


Acked-by: Bart Van Assche <bart.vanass...@wdc.com>

[PATCH] tcmu: clean up the code and with one small fix

2017-07-11 Thread lixiubo

From: Xiubo Li 

Remove a useless blank line and redundant code, and at the same time add
an error path to handle the allocation and insertion failures.

Signed-off-by: Xiubo Li 
---
 drivers/target/target_core_user.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/target/target_core_user.c 
b/drivers/target/target_core_user.c
index 3b25ef3..80ee130 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -342,7 +342,6 @@ static inline bool tcmu_get_empty_block(struct tcmu_dev 
*udev,
 
page = radix_tree_lookup(&udev->data_blocks, dbi);
if (!page) {
-
if (atomic_add_return(1, &global_db_count) >
TCMU_GLOBAL_MAX_BLOCKS) {
atomic_dec(&global_db_count);
@@ -352,14 +351,11 @@ static inline bool tcmu_get_empty_block(struct tcmu_dev 
*udev,
/* try to get new page from the mm */
page = alloc_page(GFP_KERNEL);
if (!page)
-   return false;
+   goto err_alloc;
 
ret = radix_tree_insert(&udev->data_blocks, dbi, page);
-   if (ret) {
-   __free_page(page);
-   return false;
-   }
-
+   if (ret)
+   goto err_insert;
}
 
if (dbi > udev->dbi_max)
@@ -369,6 +365,11 @@ static inline bool tcmu_get_empty_block(struct tcmu_dev 
*udev,
tcmu_cmd_set_dbi(tcmu_cmd, dbi);
 
return true;
+err_insert:
+   __free_page(page);
+err_alloc:
+   atomic_dec(&global_db_count);
+   return false;
 }
 
 static bool tcmu_get_empty_blocks(struct tcmu_dev *udev,
@@ -527,7 +528,7 @@ static inline size_t get_block_offset_user(struct tcmu_dev 
*dev,
DATA_BLOCK_SIZE - remaining;
 }
 
-static inline size_t iov_tail(struct tcmu_dev *udev, struct iovec *iov)
+static inline size_t iov_tail(struct iovec *iov)
 {
return (size_t)iov->iov_base + iov->iov_len;
 }
@@ -566,7 +567,7 @@ static int scatter_data_area(struct tcmu_dev *udev,
to += offset;
 
if (*iov_cnt != 0 &&
-   to_offset == iov_tail(udev, *iov)) {
+   to_offset == iov_tail(*iov)) {
(*iov)->iov_len += copy_bytes;
} else {
new_iov(iov, iov_cnt, udev);
@@ -722,10 +723,7 @@ static bool is_ring_space_avail(struct tcmu_dev *udev, 
struct tcmu_cmd *cmd,
}
}
 
-   if (!tcmu_get_empty_blocks(udev, cmd))
-   return false;
-
-   return true;
+   return tcmu_get_empty_blocks(udev, cmd);
 }
 
 static inline size_t tcmu_cmd_get_base_cmd_size(size_t iov_cnt)
-- 
1.8.3.1
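
As an illustration of the error path this patch adds to tcmu_get_empty_block(), the sketch below models the same goto-based unwinding in user space: a counter is taken first, then a page is allocated, then it is inserted, and each failure label undoes only what was already done. Everything here is a hypothetical stand-in (block_count for the global block counter, slots[] for the radix tree, fail_alloc and fail_insert as failure-injection switches) and not the tcmu code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_BLOCKS 4		/* hypothetical limit */

static int block_count;		/* stands in for the global block counter */
static void *slots[MAX_BLOCKS];	/* stands in for the radix tree */

/* Failure injection switches, for illustration only. */
static bool fail_alloc;
static bool fail_insert;

static void *page_alloc(void)
{
	return fail_alloc ? NULL : malloc(64);
}

static int slot_insert(int idx, void *page)
{
	if (fail_insert || idx < 0 || idx >= MAX_BLOCKS || slots[idx])
		return -1;
	slots[idx] = page;
	return 0;
}

static bool get_empty_block(int idx)
{
	void *page;

	/* Reserve a block in the counter before allocating anything. */
	if (++block_count > MAX_BLOCKS) {
		block_count--;
		return false;
	}

	page = page_alloc();
	if (!page)
		goto err_alloc;

	if (slot_insert(idx, page))
		goto err_insert;

	return true;

err_insert:
	free(page);	/* undo the allocation */
err_alloc:
	block_count--;	/* undo the reservation on every failure */
	return false;
}
```

With the labels ordered this way, a failed allocation still releases the reservation, which is the case the earlier bare "return false" missed.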

Re: tgtd CPU 100% problem

2017-07-11 Thread 李春

Thanks very much for the reply.
I think it may be better to fail an assertion and exit than to run
the endless loop after receiving the DEVICE_REMOVAL event.
Or we can sleep 5 ms to check if the conn->h.state is STATE_FULL.

2017-07-11 16:29 GMT+08:00 Sagi Grimberg :
>
>
> On 11/07/17 10:51, 李春 wrote:
>>
>> We have met a problem of tgtd using 100% CPU.
>>
>> The InfiniBand network card was negotiated in eth mode by mistake.
>> After we changed it to ib mode and restarted opensmd to get the correct
>> state (Active), tgtd was using 100% of the CPU, and when we connected to
>> it using tgtadm, tgtadm hung forever.
>>
>> # how to reproduce
>>
>> * tgtd exports a disk through port 3260 of iser
>> * iscsiadm logs in to a target from tgt through infiniband
>>
>> * connectx_port_config sets the mellanox infiniband to eth mode
>> * connectx_port_config sets the mellanox infiniband to ib mode
>> * /etc/init.d/opensmd restart
>> * tgtadm connecting to tgt will hang
>>
>> # error message
>>
>> ```
>> Jul  1 21:32:37 shadow tgtd: iser_handle_rdmacm(1628) Unsupported
>> event:11, RDMA_CM_EVENT_DEVICE_REMOVAL - ignored
>> Jul  1 21:32:37 shadow tgtd: iser_handle_rdmacm(1628) Unsupported
>> event:11, RDMA_CM_EVENT_DEVICE_REMOVAL - ignored
>>
>> Jul  1 21:32:39 shadow tgtd: iser_handle_async_event(3174) dev:mlx4_0
>> HCA evt: local catastrophic error
>
>
> iser code in tgtd does not know how to correctly handle RDMA device
> removal events (and it never did).
>
> The event is generated from the port configuration step while
> tgt-iser is bound to it. Once the device is removed the device
> handle tgt-iser has is essentially unusable, which explains
> the qp creation failures below.
>
> Handling the DEVICE_REMOVAL event is a new feature request.
>
>> Jul  1 21:46:56 shadow tgtd: iser_cm_connect_request(1471)
>> conn:0x1380bf0 cm_id:0x1380950 rdma_create_qp failed, Cannot allocate
>> memory
>> Jul  1 21:46:56 shadow tgtd: iser_cm_connect_request(1520)
>> cm_id:0x1380950 rdma_reject failed, Bad file descriptor
>> Jul  1 21:46:56 shadow tgtd: iser_cm_connect_request(1471)
>> conn:0x1380bf0 cm_id:0x1380950 rdma_create_qp failed, Cannot allocate
>> memory
>
>
> And also tgt-iser cannot even reject the (re)connect request.
>
>> Jul  1 21:46:56 shadow tgtd: iser_cm_connect_request(1520)
>> cm_id:0x1380950 rdma_reject failed, Bad file descriptor
>> ```



-- 
pickup.lichun 李春

[PATCHv2] tcmu: Fix possbile memory leak when recalculating the cmd base size

2017-07-11 Thread lixiubo

From: Xiubo Li 

The entries allocated from the ring cmd area behave like stack memory:
they always retain the old data, so entry->req.iov_bidi_cnt may be
non-zero.

In some environments the crash can be reproduced very easily, in others
not. The following is the crash trace:

[  240.143969] CPU: 0 PID: 1285 Comm: iscsi_trx Not tainted 4.12.0-rc1+ #3
[  240.150607] Hardware name: ASUS All Series/H87-PRO, BIOS 2104 10/28/2014
[  240.157331] task: 8807de4f5800 task.stack: c900047dc000
[  240.163270] RIP: 0010:memcpy_erms+0x6/0x10
[  240.167377] RSP: 0018:c900047dfc68 EFLAGS: 00010202
[  240.172621] RAX: c9065db85540 RBX: 8807f798 RCX: 0010
[  240.179771] RDX: 0010 RSI: 8807de574fe0 RDI: c9065db85540
[  240.186930] RBP: c900047dfd30 R08: 8807de41b000 R09:
[  240.194088] R10: 0040 R11: 8807e9b726f0 R12: 0006565726b0
[  240.201246] R13: c90007612ea0 R14: 00065657d540 R15:
[  240.208397] FS:  () GS:88081fa0() knlGS:
[  240.216510] CS:  0010 DS:  ES:  CR0: 80050033
[  240.80] CR2: c9065db85540 CR3: 01c0f000 CR4: 001406f0
[  240.229430] Call Trace:
[  240.231887]  ? tcmu_queue_cmd+0x83c/0xa80
[  240.235916]  ? target_check_reservation+0xcd/0x6f0
[  240.240725]  __target_execute_cmd+0x27/0xa0
[  240.244918]  target_execute_cmd+0x232/0x2c0
[  240.249124]  ? __local_bh_enable_ip+0x64/0xa0
[  240.253499]  iscsit_execute_cmd+0x20d/0x270
[  240.257693]  iscsit_sequence_cmd+0x110/0x190
[  240.261985]  iscsit_get_rx_pdu+0x360/0xc80
[  240.267565]  ? iscsi_target_rx_thread+0x54/0xd0
[  240.273571]  iscsi_target_rx_thread+0x9a/0xd0
[  240.279413]  kthread+0x113/0x150
[  240.284120]  ? iscsi_target_tx_thread+0x1e0/0x1e0
[  240.290297]  ? kthread_create_on_node+0x40/0x40
[  240.296297]  ret_from_fork+0x2e/0x40
[  240.301332] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1  a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
[  240.321751] RIP: memcpy_erms+0x6/0x10 RSP: c900047dfc68
[  240.328838] CR2: c9065db85540
[  240.333667] ---[ end trace b7e5354cfb54d08b ]---

To fix this, memset the whole entry before using it, and also adjust
the bidi code to be more readable.

Fixed: fe25cc34795(tcmu: Recalculate the tcmu_cmd size to save cmd area
memories)
Reported-by: Bryant G. Ly 
Tested-by: Damien Le Moal 
Signed-off-by: Xiubo Li 
---
 drivers/target/target_core_user.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/target/target_core_user.c 
b/drivers/target/target_core_user.c
index 2f1fa92..3b25ef3 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -563,7 +563,7 @@ static int scatter_data_area(struct tcmu_dev *udev,
to_offset = get_block_offset_user(udev, dbi,
block_remaining);
offset = DATA_BLOCK_SIZE - block_remaining;
-   to = (void *)(unsigned long)to + offset;
+   to += offset;
 
if (*iov_cnt != 0 &&
to_offset == iov_tail(udev, *iov)) {
@@ -636,7 +636,7 @@ static void gather_data_area(struct tcmu_dev *udev, struct 
tcmu_cmd *cmd,
copy_bytes = min_t(size_t, sg_remaining,
block_remaining);
offset = DATA_BLOCK_SIZE - block_remaining;
-   from = (void *)(unsigned long)from + offset;
+   from += offset;
tcmu_flush_dcache_range(from, copy_bytes);
memcpy(to + sg->length - sg_remaining, from,
copy_bytes);
@@ -840,10 +840,9 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd 
*tcmu_cmd,
}
 
entry = (void *) mb + CMDR_OFF + cmd_head;
+   memset(entry, 0, command_size);
tcmu_hdr_set_op(&entry->hdr.len_op, TCMU_OP_CMD);
entry->hdr.cmd_id = tcmu_cmd->cmd_id;
-   entry->hdr.kflags = 0;
-   entry->hdr.uflags = 0;
 
/* Handle allocating space from the data area */
tcmu_cmd_reset_dbi_cur(tcmu_cmd);
@@ -862,11 +861,10 @@ static inline size_t tcmu_cmd_get_cmd_size(struct 
tcmu_cmd *tcmu_cmd,
return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
}
entry->req.iov_cnt = iov_cnt;
-   entry->req.iov_dif_cnt = 0;
 
/* Handle BIDI commands */
+   iov_cnt = 0;
if (se_cmd->se_cmd_flags & SCF_BIDI) {
-   iov_cnt = 0;
iov++;
ret =

Re: [PATCH] tcmu: Fix possbile memory leak when recalculating the cmd base size

2017-07-11 Thread Xiubo Li




On 2017-07-11 17:34, Nicholas A. Bellinger wrote:
> On Tue, 2017-07-11 at 09:24 +, Damien Le Moal wrote:
>> Xiubo,
>>
>> Well done ! This fixed my problem. The ZBC test suite now passes all tests on
>> my target without crashing the kernel.
>>
>> Please see some comments/nitpicks below.
>>
>> Otherwise, please feel free to add my "tested-by"
>
> Great, thanks for the quick turn-around.
>
> Xiubo, please pick-up Damien's extra bits and repost for v2.  Also,
> please update the patch topic + commit log to mention it addresses a
> general protection fault OOPs, so others can find it easily when looking
> through git log.
>
> Damien's OOPs would probably be the best one to include in the commit
> log, since Bryant's was on POWER.  ;)

Sure, I will repost for v2 later.

> Btw, since it's a regression fix related to commit fe25cc34795, I assume
> it should be CC'ed to stable for v4.12.y, right..?

Yes, it should.

Thanks

BRs
Xiubo

Re: [PATCH] tcmu: Fix possbile memory leak when recalculating the cmd base size

2017-07-11 Thread Xiubo Li


Hi Damien,

Good news.

Thanks very much for your help in testing this.


On 2017-07-11 17:24, Damien Le Moal wrote:

Xiubo,

Well done ! This fixed my problem. The ZBC test suite now passes all tests on
my target without crashing the kernel.

Please see some comments/nitpicks below.

Otherwise, please feel free to add my "tested-by"


On Tue, 2017-07-11 at 17:06 +0800, Xiubo Li wrote:

On 2017-07-11 16:05, lixi...@cmss.chinamobile.com wrote:

From: Xiubo Li 

The entries allocated from the ring cmd area behave like the stack:
the memory retains the old data, so entry->req.iov_bidi_cnt may be
non-zero.

To fix this, memset the whole entry before using it, and also adjust
the bidi code to be more readable.

Fixed: fe25cc34795(tcmu: Recalculate the tcmu_cmd size to save cmd area
memories)
Reported-by: Bryant G. Ly 
Signed-off-by: Xiubo Li 
---
   drivers/target/target_core_user.c | 5 +++--
   1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/target/target_core_user.c
b/drivers/target/target_core_user.c
index 2f1fa92..be62c86 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -840,6 +840,7 @@ static inline size_t tcmu_cmd_get_cmd_size(struct
tcmu_cmd *tcmu_cmd,
}
   
   	entry = (void *) mb + CMDR_OFF + cmd_head;

+   memset(entry, 0, command_size);
tcmu_hdr_set_op(&entry->hdr.len_op, TCMU_OP_CMD);
entry->hdr.cmd_id = tcmu_cmd->cmd_id;
entry->hdr.kflags = 0;

The added memset allows removing the 0 assignment here, and the one that
follows this one too.


Yes, it does. Maybe keeping them could make the code more readable?

Thanks,

BRs
Xiubo


@@ -865,8 +866,8 @@ static inline size_t tcmu_cmd_get_cmd_size(struct
tcmu_cmd *tcmu_cmd,
entry->req.iov_dif_cnt = 0;
   
   	/* Handle BIDI commands */

+   iov_cnt = 0;
if (se_cmd->se_cmd_flags & SCF_BIDI) {
-   iov_cnt = 0;
iov++;
ret = scatter_data_area(udev, tcmu_cmd,
se_cmd->t_bidi_data_sg,
@@ -879,8 +880,8 @@ static inline size_t tcmu_cmd_get_cmd_size(struct
tcmu_cmd *tcmu_cmd,
pr_err("tcmu: alloc and scatter bidi data
failed\n");
return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
}
-   entry->req.iov_bidi_cnt = iov_cnt;
}
+   entry->req.iov_bidi_cnt = iov_cnt;
   
   	/*

 * Recalaulate the command's base size and size according

For reference, here is the actual patch I used for testing:

diff --git a/drivers/target/target_core_user.c
b/drivers/target/target_core_user.c
index 2f1fa92..3b25ef3 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -563,7 +563,7 @@ static int scatter_data_area(struct tcmu_dev *udev,
to_offset = get_block_offset_user(udev, dbi,
block_remaining);
offset = DATA_BLOCK_SIZE - block_remaining;
-   to = (void *)(unsigned long)to + offset;
+   to += offset;
  
  			if (*iov_cnt != 0 &&

to_offset == iov_tail(udev, *iov)) {
@@ -636,7 +636,7 @@ static void gather_data_area(struct tcmu_dev *udev, struct
tcmu_cmd *cmd,
copy_bytes = min_t(size_t, sg_remaining,
block_remaining);
offset = DATA_BLOCK_SIZE - block_remaining;
-   from = (void *)(unsigned long)from + offset;
+   from += offset;
tcmu_flush_dcache_range(from, copy_bytes);
memcpy(to + sg->length - sg_remaining, from,
copy_bytes);
@@ -840,10 +840,9 @@ tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
}
  
  	entry = (void *) mb + CMDR_OFF + cmd_head;

+   memset(entry, 0, command_size);
tcmu_hdr_set_op(&entry->hdr.len_op, TCMU_OP_CMD);
entry->hdr.cmd_id = tcmu_cmd->cmd_id;
-   entry->hdr.kflags = 0;
-   entry->hdr.uflags = 0;
  
  	/* Handle allocating space from the data area */

tcmu_cmd_reset_dbi_cur(tcmu_cmd);
@@ -862,11 +861,10 @@ tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
}
entry->req.iov_cnt = iov_cnt;
-   entry->req.iov_dif_cnt = 0;
  
  	/* Handle BIDI commands */

+   iov_cnt = 0;
if (se_cmd->se_cmd_flags & SCF_BIDI) {
-   iov_cnt = 0;
iov++;
ret = scatter_data_area(udev, tcmu_cmd,
se_cmd->t_bidi_data_sg,
@@ -879,8 +877,8 @@ tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
pr_err("tcmu: alloc and scatter bidi

Re: [PATCH] tcmu: Fix possbile memory leak when recalculating the cmd base size

2017-07-11 Thread Nicholas A. Bellinger

On Tue, 2017-07-11 at 09:24 +, Damien Le Moal wrote:
> Xiubo,
> 
> Well done ! This fixed my problem. The ZBC test suite now passes all tests on
> my target without crashing the kernel.
> 
> Please see some comments/nitpicks below.
> 
> Otherwise, please feel free to add my "tested-by"
> 

Great, thanks for the quick turn-around.

Xiubo, please pick-up Damien's extra bits and repost for v2.  Also,
please update the patch topic + commit log to mention it addresses a
general protection fault OOPs, so others can find it easily when looking
through git log.

Damien's OOPs would probably be the best one to include in the commit
log, since Bryant's was on POWER.  ;)

Btw, since it's a regression fix related to commit fe25cc34795, I assume
it should be CC'ed to stable for v4.12.y, right..?

Re: [PATCH] tcmu: Fix possbile memory leak when recalculating the cmd base size

2017-07-11 Thread Damien Le Moal

Xiubo,

Well done! This fixed my problem. The ZBC test suite now passes all tests on
my target without crashing the kernel.

Please see some comments/nitpicks below.

Otherwise, please feel free to add my "tested-by"


On Tue, 2017-07-11 at 17:06 +0800, Xiubo Li wrote:
> On 2017-07-11 16:05, lixi...@cmss.chinamobile.com wrote:
> > From: Xiubo Li 
> > 
> > The entries allocated from the ring cmd area behave like the stack:
> > the memory retains the old data, so entry->req.iov_bidi_cnt may be
> > non-zero.
> > 
> > To fix this, memset the whole entry before using it, and also adjust
> > the bidi code to be more readable.
> > 
> > Fixed: fe25cc34795(tcmu: Recalculate the tcmu_cmd size to save cmd area
> > memories)
> > Reported-by: Bryant G. Ly 
> > Signed-off-by: Xiubo Li 
> > ---
> >   drivers/target/target_core_user.c | 5 +++--
> >   1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/target/target_core_user.c
> > b/drivers/target/target_core_user.c
> > index 2f1fa92..be62c86 100644
> > --- a/drivers/target/target_core_user.c
> > +++ b/drivers/target/target_core_user.c
> > @@ -840,6 +840,7 @@ static inline size_t tcmu_cmd_get_cmd_size(struct
> > tcmu_cmd *tcmu_cmd,
> >     }
> >   
> >     entry = (void *) mb + CMDR_OFF + cmd_head;
> > +   memset(entry, 0, command_size);
> >     tcmu_hdr_set_op(&entry->hdr.len_op, TCMU_OP_CMD);
> >     entry->hdr.cmd_id = tcmu_cmd->cmd_id;
> >     entry->hdr.kflags = 0;

The added memset allows removing the 0 assignment here, and the one that
follows this one too.
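
To make the failure mode concrete, here is a small user-space model of a reused ring slot. It is only a sketch under assumptions: struct ring_entry, ring_slot and queue_cmd() are hypothetical and much reduced compared with the real tcmu entry layout, and the clear_first flag exists only to compare the old and the fixed behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced entry layout; the real tcmu entry has more fields. */
struct ring_entry {
	uint32_t iov_cnt;
	uint32_t iov_bidi_cnt;
};

/* One slot that is reused by every command, like a ring entry. */
static struct ring_entry ring_slot;

/*
 * Fill the slot for a new command. As in the pre-fix code, iov_bidi_cnt
 * is written only for BIDI commands, so without clearing the slot first
 * a non-BIDI command inherits the previous value.
 */
static struct ring_entry *queue_cmd(uint32_t iov_cnt, bool bidi,
				    uint32_t bidi_cnt, bool clear_first)
{
	struct ring_entry *entry = &ring_slot;

	if (clear_first)
		memset(entry, 0, sizeof(*entry));

	entry->iov_cnt = iov_cnt;
	if (bidi)
		entry->iov_bidi_cnt = bidi_cnt;

	return entry;
}
```

Queuing a BIDI command and then a non-BIDI command with clear_first set to false leaves the stale iov_bidi_cnt in place; with clear_first set to true it reads back as zero, which is why the explicit zero assignments become redundant.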

> > @@ -865,8 +866,8 @@ static inline size_t tcmu_cmd_get_cmd_size(struct
> > tcmu_cmd *tcmu_cmd,
> >     entry->req.iov_dif_cnt = 0;
> >   
> >     /* Handle BIDI commands */
> > +   iov_cnt = 0;
> >     if (se_cmd->se_cmd_flags & SCF_BIDI) {
> > -   iov_cnt = 0;
> >     iov++;
> >     ret = scatter_data_area(udev, tcmu_cmd,
> >     se_cmd->t_bidi_data_sg,
> > @@ -879,8 +880,8 @@ static inline size_t tcmu_cmd_get_cmd_size(struct
> > tcmu_cmd *tcmu_cmd,
> >     pr_err("tcmu: alloc and scatter bidi data
> > failed\n");
> >     return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
> >     }
> > -   entry->req.iov_bidi_cnt = iov_cnt;
> >     }
> > +   entry->req.iov_bidi_cnt = iov_cnt;
> >   
> >     /*
> >      * Recalaulate the command's base size and size according

For reference, here is the actual patch I used for testing:

diff --git a/drivers/target/target_core_user.c
b/drivers/target/target_core_user.c
index 2f1fa92..3b25ef3 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -563,7 +563,7 @@ static int scatter_data_area(struct tcmu_dev *udev,
    to_offset = get_block_offset_user(udev, dbi,
    block_remaining);
    offset = DATA_BLOCK_SIZE - block_remaining;
-   to = (void *)(unsigned long)to + offset;
+   to += offset;
 
    if (*iov_cnt != 0 &&
    to_offset == iov_tail(udev, *iov)) {
@@ -636,7 +636,7 @@ static void gather_data_area(struct tcmu_dev *udev, struct
tcmu_cmd *cmd,
    copy_bytes = min_t(size_t, sg_remaining,
    block_remaining);
    offset = DATA_BLOCK_SIZE - block_remaining;
-   from = (void *)(unsigned long)from + offset;
+   from += offset;
    tcmu_flush_dcache_range(from, copy_bytes);
    memcpy(to + sg->length - sg_remaining, from,
    copy_bytes);
@@ -840,10 +840,9 @@ tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
    }
 
    entry = (void *) mb + CMDR_OFF + cmd_head;
+   memset(entry, 0, command_size);
    tcmu_hdr_set_op(&entry->hdr.len_op, TCMU_OP_CMD);
    entry->hdr.cmd_id = tcmu_cmd->cmd_id;
-   entry->hdr.kflags = 0;
-   entry->hdr.uflags = 0;
 
    /* Handle allocating space from the data area */
    tcmu_cmd_reset_dbi_cur(tcmu_cmd);
@@ -862,11 +861,10 @@ tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
    return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
    }
    entry->req.iov_cnt = iov_cnt;
-   entry->req.iov_dif_cnt = 0;
 
    /* Handle BIDI commands */
+   iov_cnt = 0;
    if (se_cmd->se_cmd_flags & SCF_BIDI) {
-   iov_cnt = 0;
    iov++;
    ret = scatter_data_area(udev, tcmu_cmd,
    se_cmd->t_bidi_data_sg,
@@ -879,8 +877,8 @@ tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
    pr_err("tcmu: alloc and scatter bidi data failed\n");
    return

Re: [PATCH] tcmu: Fix possible overflow for memcpy address in iovec

2017-07-11 Thread Xiubo Li




On 2017-07-11 17:17, Damien Le Moal wrote:

Xiubo,

On Tue, 2017-07-11 at 17:04 +0800, Xiubo Li wrote:

diff --git a/drivers/target/target_core_user.c
b/drivers/target/target_core_user.c
index 930800c..86a845a 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -437,7 +437,7 @@ static int scatter_data_area(struct tcmu_dev
*udev,
to_offset = get_block_offset_user(udev,
dbi,
block_remaining);
offset = DATA_BLOCK_SIZE -
block_remaining;
-   to = (void *)(unsigned long)to + offset;
+   to = (void *)((unsigned long)to + offset);

			if (*iov_cnt != 0 &&

to_offset == iov_tail(udev, *iov)) {
@@ -510,7 +510,7 @@ static void gather_data_area(struct tcmu_dev
*udev, struct tcmu_cmd *cmd,
copy_bytes = min_t(size_t, sg_remaining,
block_remaining);
offset = DATA_BLOCK_SIZE -
block_remaining;
-   from = (void *)(unsigned long)from +
offset;
+   from = (void *)((unsigned long)from +
offset);
tcmu_flush_dcache_range(from,
copy_bytes);
memcpy(to + sg->length - sg_remaining,
from,
copy_bytes);

I was just looking at this patch and about to try to see if it fixes my
problem... It cannot hurt. Trying...

Hi Damien,

Please test the other patch; I think that one may fix this.

void * pointer arithmetic is OK and equivalent to unsigned long. So I do not
think this actually fixes anything and could be rewritten more simply as


Yes, it is. So I will just discard this one.
I meant to send the second patch (tcmu: Fix possbile memory leak when
recalculating the cmd base size) but sent this one by mistake.


Actually, arithmetic on void * behaves like arithmetic on char * according to the GNU C manual.

Thanks,
BRs



to += offset;
and

from += offset.

And that compiles without a warning and there are no complaints from sparse.

Cheers.
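
The equivalence discussed above can be checked with a short user-space program. This is a sketch assuming GNU C (gcc or clang in their default modes), where arithmetic on void * is an extension that treats the pointee size as 1; the helper names advance_cast(), advance_orig() and advance_gnu() are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Form proposed by the discarded patch: add on the integer, then cast. */
static void *advance_cast(void *p, size_t offset)
{
	return (void *)((uintptr_t)p + offset);
}

/*
 * Form in the original code. The casts bind tighter than '+', so this is
 * already void * arithmetic (a GNU C extension).
 */
static void *advance_orig(void *p, size_t offset)
{
	return (void *)(uintptr_t)p + offset;
}

/* Simplified form suggested in the review (GNU C extension). */
static void *advance_gnu(void *p, size_t offset)
{
	p += offset;
	return p;
}
```

On a flat address space all three helpers return the same address for the same input, which matches the observation that the cast rewrite does not change behaviour. Strictly conforming ISO C does not allow arithmetic on void *, so a pedantic build would warn about the last two.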

Re: [PATCH] iscsi-target: Add login_keys_workaround attribute for non RFC initiators

2017-07-11 Thread Nicholas A. Bellinger

Hey Robert,

Any chance to test this with your Flexboot PXE setup..?

Please give this a spin ASAP to verify it addresses the regression you
reported earlier, wrt FirstBurstLength not being proposed nor responded
to using Flexboot PXE.

Thank you.

On Fri, 2017-07-07 at 22:24 +, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger 
> 
> This patch re-introduces part of a long standing login workaround that
> was recently dropped by:
> 
>   commit 1c99de981f30b3e7868b8d20ce5479fa1c0fea46
>   Author: Nicholas Bellinger 
>   Date:   Sun Apr 2 13:36:44 2017 -0700
> 
>   iscsi-target: Drop work-around for legacy GlobalSAN initiator
> 
> Namely, the workaround for FirstBurstLength ended up being required by
> Mellanox Flexboot PXE boot ROMs as reported by Robert.
> 
> So this patch re-adds the work-around for FirstBurstLength within
> iscsi_check_proposer_for_optional_reply(), and makes the key optional
> to respond when the initiator does not propose, nor respond to it.
> 
> Also as requested by Arun, this patch introduces a new TPG attribute
> named 'login_keys_workaround' that controls the use of both the
> FirstBurstLength workaround, as well as the two other existing
> workarounds for gPXE iSCSI boot client.
> 
> By default, the workaround is enabled with login_keys_workaround=1,
> since Mellanox FlexBoot requires it, and Arun has verified the Qlogic
> MSFT initiator already proposes FirstBurstLength, so it's unaffected
> by re-adding this part of the original work-around.
> 
> Reported-by: Robert LeBlanc 
> Cc: Robert LeBlanc 
> Cc: Arun Easi 
> Signed-off-by: Nicholas Bellinger 
> ---
>  drivers/target/iscsi/iscsi_target_configfs.c   |  2 ++
>  drivers/target/iscsi/iscsi_target_nego.c   |  6 ++--
>  drivers/target/iscsi/iscsi_target_parameters.c | 41 
> ++
>  drivers/target/iscsi/iscsi_target_parameters.h |  2 +-
>  drivers/target/iscsi/iscsi_target_tpg.c| 19 
>  drivers/target/iscsi/iscsi_target_tpg.h|  1 +
>  include/target/iscsi/iscsi_target_core.h   |  9 ++
>  7 files changed, 64 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/target/iscsi/iscsi_target_configfs.c 
> b/drivers/target/iscsi/iscsi_target_configfs.c
> index 535a8e0..0dd4c45 100644
> --- a/drivers/target/iscsi/iscsi_target_configfs.c
> +++ b/drivers/target/iscsi/iscsi_target_configfs.c
> @@ -781,6 +781,7 @@ static int lio_target_init_nodeacl(struct se_node_acl 
> *se_nacl,
>  DEF_TPG_ATTRIB(t10_pi);
>  DEF_TPG_ATTRIB(fabric_prot_type);
>  DEF_TPG_ATTRIB(tpg_enabled_sendtargets);
> +DEF_TPG_ATTRIB(login_keys_workaround);
>  
>  static struct configfs_attribute *lio_target_tpg_attrib_attrs[] = {
>   &iscsi_tpg_attrib_attr_authentication,
> @@ -796,6 +797,7 @@ static int lio_target_init_nodeacl(struct se_node_acl 
> *se_nacl,
>   &iscsi_tpg_attrib_attr_t10_pi,
>   &iscsi_tpg_attrib_attr_fabric_prot_type,
>   &iscsi_tpg_attrib_attr_tpg_enabled_sendtargets,
> + &iscsi_tpg_attrib_attr_login_keys_workaround,
>   NULL,
>  };
>  
> diff --git a/drivers/target/iscsi/iscsi_target_nego.c 
> b/drivers/target/iscsi/iscsi_target_nego.c
> index 96df63f..7a6751f 100644
> --- a/drivers/target/iscsi/iscsi_target_nego.c
> +++ b/drivers/target/iscsi/iscsi_target_nego.c
> @@ -864,7 +864,8 @@ static int iscsi_target_handle_csg_zero(
>   SENDER_TARGET,
>   login->rsp_buf,
>   &login->rsp_length,
> - conn->param_list);
> + conn->param_list,
> + conn->tpg->tpg_attrib.login_keys_workaround);
>   if (ret < 0)
>   return -1;
>  
> @@ -934,7 +935,8 @@ static int iscsi_target_handle_csg_one(struct iscsi_conn 
> *conn, struct iscsi_log
>   SENDER_TARGET,
>   login->rsp_buf,
>   &login->rsp_length,
> - conn->param_list);
> + conn->param_list,
> + conn->tpg->tpg_attrib.login_keys_workaround);
>   if (ret < 0) {
>   iscsit_tx_login_rsp(conn, ISCSI_STATUS_CLS_INITIATOR_ERR,
>   ISCSI_LOGIN_STATUS_INIT_ERR);
> diff --git a/drivers/target/iscsi/iscsi_target_parameters.c 
> b/drivers/target/iscsi/iscsi_target_parameters.c
> index fce6276..caab104 100644
> --- a/drivers/target/iscsi/iscsi_target_parameters.c
> +++ b/drivers/target/iscsi/iscsi_target_parameters.c
> @@ -765,7 +765,8 @@ static int iscsi_check_for_auth_key(char *key)
>   return 0;
>  }
>  
> -static void iscsi_check_proposer_for_optional_reply(struct iscsi_param 
> *param)
> +static void iscsi_check_proposer_for_optional_reply(struct iscsi_param 
> *param,
> + bool keys_workaround)
>  {
>   if (IS_TYPE_BOOL_AND(param)) {
>   if (!strcmp(param->value, NO))
>

Re: [PATCH v2 1/4] scsi: scsi_dh_alua: allow I/O in target port unavailable and standby states

2017-07-11 Thread Hannes Reinecke

On 07/11/2017 12:47 AM, Mauricio Faria de Oliveira wrote:
> According to SPC-4 (5.15.2.4.5 Unavailable state), the unavailable
> state may (or may not) transition to other states (e.g., microcode
> downloading or hardware error, which may be temporary or permanent).
> 
> But, scsi_dh_alua currently fails I/O requests early on once that
> state occurs (in alua_prep_fn()) preventing path checkers in such
> function path to actually check if I/O still fails or now works.
> 
> And that prevents a path activation (alua_activate()) which could
> update the PG state if it eventually recovered to an active state,
> thus resume I/O. (This is also the case with the standby state.)
> 
> This might cause device-mapper multipath to fail all paths to some
> storage system that moves the controllers to the unavailable state
> for firmware upgrades, and never recover regardless of the storage
> system doing upgrades one controller at a time and get them online.
> 
> Then I/O requests are blocked indefinitely due to queue_if_no_path
> but the underlying individual paths are fully operational, and can
> be verified as such through other function paths (e.g., SG_IO):
> 
> # multipath -l
> mpatha (360050764008100dac100) dm-0 IBM,2145
> size=40G features='2 queue_if_no_path retain_attached_hw_handler'
> hwhandler='1 alua' wp=rw
> |-+- policy='service-time 0' prio=0 status=enabled
> | |- 1:0:1:0 sdf 8:80  failed undef running
> | `- 2:0:1:0 sdn 8:208 failed undef running
> `-+- policy='service-time 0' prio=0 status=enabled
>   |- 1:0:0:0 sdb 8:16  failed undef running
>   `- 2:0:0:0 sdj 8:144 failed undef running
> 
> # strace -e read \
> sg_dd blk_sgio=0 \
> if=/dev/sdj of=/dev/null bs=512 count=1 iflag=direct \
> 2>&1 | grep 512
> read(3, 0x3fff7ba8, 512) = -1 EIO (Input/output error)
> 
> # strace -e ioctl \
> sg_dd blk_sgio=1 \
> if=/dev/sdj of=/dev/null bs=512 count=1 iflag=direct \
> 2>&1 | grep 512
> ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[28, 00, 00, 00,
> 00, 00, 00, 00, 01, 00], <...>) = 0
> 
> So, allow I/O to paths of PGs in unavailable/standby state, so path
> checkers can actually check them.
> 
> Also, schedule a recheck when unavailable/standby state is detected
> (in alua_check_sense()) to update pg->state, and quiet further SCSI
> error messages (in alua_prep_fn()).
> 
> Once a path checker eventually detects a working/active state again,
> the PG state is normally updated on path activation (alua_activate(),
> as it schedules a recheck), thus I/O requests are no longer failed.
> 
> Signed-off-by: Mauricio Faria de Oliveira 
> Reported-by: Naresh Bannoth 
> 
> ---
> v2:
> - also add support for standby state to alua_check_sense(), alua_prep_fn()
>   (Bart Van Assche )
> 
>  drivers/scsi/device_handler/scsi_dh_alua.c | 25 +
>  1 file changed, 25 insertions(+)
> 
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c 
> b/drivers/scsi/device_handler/scsi_dh_alua.c
> index c01b47e5b55a..a1cf3d6aa853 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -431,6 +431,26 @@ static int alua_check_sense(struct scsi_device *sdev,
>   alua_check(sdev, false);
>   return NEEDS_RETRY;
>   }
> + if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x0b) {
> + /*
> +  * LUN Not Accessible - target port in standby state.
> +  *
> +  * Do not retry, so failover to another target port 
> occur.
> +  * Schedule a recheck to update state for other 
> functions.
> +  */
> + alua_check(sdev, true);
> + return SUCCESS;
> + }
> + if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x0c) {
> + /*
> +  * LUN Not Accessible - target port in unavailable 
> state.
> +  *
> +  * Do not retry, so failover to another target port 
> occur.
> +  * Schedule a recheck to update state for other 
> functions.
> +  */
> + alua_check(sdev, true);
> + return SUCCESS;
> + }
>   break;
>   case UNIT_ATTENTION:
>   if (sense_hdr->asc == 0x29 && sense_hdr->ascq == 0x00) {
> @@ -1057,6 +1077,8 @@ static void alua_check(struct scsi_device *sdev, bool 
> force)
>   *
>   * Fail I/O to all paths not in state
>   * active/optimized or active/non-optimized.
> + * Allow I/O to paths in state unavailable/standby
> + * so path checkers can actually check them.
>   */
>  static int alua_prep_fn(struct scsi_device *sdev, struct request

Re: [PATCH] tcmu: Fix possible overflow for memcpy address in iovec

2017-07-11 Thread Damien Le Moal

Xiubo,

On Tue, 2017-07-11 at 17:04 +0800, Xiubo Li wrote:
> > > > > diff --git a/drivers/target/target_core_user.c
> > > > > b/drivers/target/target_core_user.c
> > > > > index 930800c..86a845a 100644
> > > > > --- a/drivers/target/target_core_user.c
> > > > > +++ b/drivers/target/target_core_user.c
> > > > > @@ -437,7 +437,7 @@ static int scatter_data_area(struct tcmu_dev
> > > > > *udev,
> > > > >       to_offset = get_block_offset_user(udev,
> > > > > dbi,
> > > > >       block_remaining);
> > > > >       offset = DATA_BLOCK_SIZE -
> > > > > block_remaining;
> > > > > - to = (void *)(unsigned long)to + offset;
> > > > > + to = (void *)((unsigned long)to + offset);
> > > > >    
> > > > >       if (*iov_cnt != 0 &&
> > > > >       to_offset == iov_tail(udev, *iov)) {
> > > > > @@ -510,7 +510,7 @@ static void gather_data_area(struct tcmu_dev
> > > > > *udev, struct tcmu_cmd *cmd,
> > > > >       copy_bytes = min_t(size_t, sg_remaining,
> > > > >       block_remaining);
> > > > >       offset = DATA_BLOCK_SIZE -
> > > > > block_remaining;
> > > > > - from = (void *)(unsigned long)from +
> > > > > offset;
> > > > > + from = (void *)((unsigned long)from +
> > > > > offset);
> > > > >       tcmu_flush_dcache_range(from,
> > > > > copy_bytes);
> > > > >       memcpy(to + sg->length - sg_remaining,
> > > > > from,
> > > > >       copy_bytes);
> > 
> > I was just looking at this patch and about to try to see if it fixes my
> > problem... It cannot hurt. Trying...
> 
> Hi Damien,
> 
> Please test another patch, I think that one maybe fix this.

void * pointer arithmetic is OK and equivalent to unsigned long. So I do not
think this actually fixes anything and could be rewritten more simply as

to += offset;

and 

from += offset.

And that compiles without a warning and there are no complaints from sparse.
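
A minimal sketch of the point made above, assuming GNU C as used by the kernel, where arithmetic on void * is an extension that advances by bytes. The helper names are hypothetical and exist only for illustration. Because a cast binds more tightly than '+', the line removed by the patch already casts first and then adds, so all three spellings yield the same address:

```c
#include <stddef.h>

/*
 * Form removed by the patch. The cast binds tighter than '+', so this
 * parses as ((void *)(unsigned long)p) + off: GNU C void * arithmetic.
 */
static void *advance_old(void *p, size_t off)
{
	return (void *)(unsigned long)p + off;
}

/* Form added by the patch: integer addition first, then cast back. */
static void *advance_new(void *p, size_t off)
{
	return (void *)((unsigned long)p + off);
}

/* Simplest spelling, as suggested in the reply (GNU C extension). */
static void *advance_simple(void *p, size_t off)
{
	p += off;
	return p;
}
```

Note that ISO C does not define arithmetic on void *, so this relies on the compiler extension, and the unsigned long round trip assumes a pointer fits in an unsigned long, as it does on Linux.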

Cheers.


-- 
Damien Le Moal
Western Digital

Re: [PATCH] scsi: qla2xxx: Off by one in qlt_ctio_to_cmd()

2017-07-11 Thread Nicholas A. Bellinger

On Mon, 2017-07-10 at 11:47 +0300, Dan Carpenter wrote:
> There are "req->num_outstanding_cmds" elements in the
> req->outstanding_cmds[] array so the > here should be >=.
> 
> Signed-off-by: Dan Carpenter 
> 
> diff --git a/drivers/scsi/qla2xxx/qla_target.c 
> b/drivers/scsi/qla2xxx/qla_target.c
> index 6e4794367e0b..ecd1a95511f9 100644
> --- a/drivers/scsi/qla2xxx/qla_target.c
> +++ b/drivers/scsi/qla2xxx/qla_target.c
> @@ -3728,7 +3728,7 @@ static struct qla_tgt_cmd *qlt_ctio_to_cmd(struct 
> scsi_qla_host *vha,
>   h &= QLA_CMD_HANDLE_MASK;
>  
>   if (h != QLA_TGT_NULL_HANDLE) {
> - if (unlikely(h > req->num_outstanding_cmds)) {
> + if (unlikely(h >= req->num_outstanding_cmds)) {
>   ql_dbg(ql_dbg_tgt, vha, 0xe052,
>   "qla_target(%d): Wrong handle %x received\n",
>   vha->vp_idx, handle);

Nice catch.

Reviewed-by: Nicholas Bellinger
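
For readers following along, a small sketch of why the comparison has to be >=. The helper names are hypothetical and the handle is modelled as a plain index; an array with n elements has valid indices 0 through n - 1, so an index equal to n is already out of bounds:

```c
#include <stdbool.h>
#include <stddef.h>

/* Check as fixed by the patch: rejects h == n as well as h > n. */
static bool handle_out_of_range(size_t h, size_t num_outstanding_cmds)
{
	return h >= num_outstanding_cmds;
}

/* Check before the patch: lets h == n through, one past the last element. */
static bool handle_out_of_range_buggy(size_t h, size_t num_outstanding_cmds)
{
	return h > num_outstanding_cmds;
}
```

With an array of 8 entries, index 8 is rejected by the first helper but accepted by the second.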

Re: [PATCH] tcmu: Fix possbile memory leak when recalculating the cmd base size

2017-07-11 Thread Xiubo Li


To Damien,

Please test this; I think it may be helpful.


Thanks,

BRs



On 2017年07月11日 16:05, lixi...@cmss.chinamobile.com wrote:

From: Xiubo Li 

For all the entries allocated from the ring cmd area, the memory
behaves somewhat like a stack: it retains old data, so
entry->req.iov_bidi_cnt may be nonzero.

To fix this, memset the whole entry before using it, and also
adjust the bidi code to make it more readable.

Fixed: fe25cc34795(tcmu: Recalculate the tcmu_cmd size to save cmd area
memories)
Reported-by: Bryant G. Ly 
Signed-off-by: Xiubo Li 
---
  drivers/target/target_core_user.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/target/target_core_user.c 
b/drivers/target/target_core_user.c
index 2f1fa92..be62c86 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -840,6 +840,7 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd 
*tcmu_cmd,
}
  
  	entry = (void *) mb + CMDR_OFF + cmd_head;

+   memset(entry, 0, command_size);
tcmu_hdr_set_op(&entry->hdr.len_op, TCMU_OP_CMD);
entry->hdr.cmd_id = tcmu_cmd->cmd_id;
entry->hdr.kflags = 0;
@@ -865,8 +866,8 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd 
*tcmu_cmd,
entry->req.iov_dif_cnt = 0;
  
  	/* Handle BIDI commands */

+   iov_cnt = 0;
if (se_cmd->se_cmd_flags & SCF_BIDI) {
-   iov_cnt = 0;
iov++;
ret = scatter_data_area(udev, tcmu_cmd,
se_cmd->t_bidi_data_sg,
@@ -879,8 +880,8 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd 
*tcmu_cmd,
pr_err("tcmu: alloc and scatter bidi data failed\n");
return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
}
-   entry->req.iov_bidi_cnt = iov_cnt;
}
+   entry->req.iov_bidi_cnt = iov_cnt;
  
  	/*

 * Recalaulate the command's base size and size according
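
A simplified sketch of the stale-data problem the commit message describes. This is not the tcmu code: struct demo_entry and demo_fill_entry() are hypothetical stand-ins for a ring entry and for the path that fills a non-BIDI command. A reused slot keeps whatever the previous command wrote unless it is cleared first:

```c
#include <string.h>

/* Hypothetical, simplified stand-in for a command ring entry. */
struct demo_entry {
	unsigned int iov_cnt;
	unsigned int iov_bidi_cnt;
};

/* Fill a non-BIDI command into a slot, optionally clearing the slot first. */
static void demo_fill_entry(struct demo_entry *slot, unsigned int iov_cnt,
			    int clear_first)
{
	if (clear_first)
		memset(slot, 0, sizeof(*slot));
	slot->iov_cnt = iov_cnt;
	/* Non-BIDI path: iov_bidi_cnt is deliberately not written here. */
}
```

If the slot was last used by a BIDI command, iov_bidi_cnt keeps its old value when clear_first is 0 and reads as 0 when clear_first is 1.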

Re: [PATCH] tcmu: Fix possible overflow for memcpy address in iovec

2017-07-11 Thread Xiubo Li




diff --git a/drivers/target/target_core_user.c 
b/drivers/target/target_core_user.c
index 930800c..86a845a 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -437,7 +437,7 @@ static int scatter_data_area(struct tcmu_dev *udev,
to_offset = get_block_offset_user(udev, dbi,
block_remaining);
offset = DATA_BLOCK_SIZE - block_remaining;
-   to = (void *)(unsigned long)to + offset;
+   to = (void *)((unsigned long)to + offset);
   
   			if (*iov_cnt != 0 &&

to_offset == iov_tail(udev, *iov)) {
@@ -510,7 +510,7 @@ static void gather_data_area(struct tcmu_dev *udev, struct 
tcmu_cmd *cmd,
copy_bytes = min_t(size_t, sg_remaining,
block_remaining);
offset = DATA_BLOCK_SIZE - block_remaining;
-   from = (void *)(unsigned long)from + offset;
+   from = (void *)((unsigned long)from + offset);
tcmu_flush_dcache_range(from, copy_bytes);
memcpy(to + sg->length - sg_remaining, from,
copy_bytes);

I was just looking at this patch and about to try to see if it fixes my
problem... It cannot hurt. Trying...

Hi Damien,

Please test the other patch; I think that one may fix this.

Thanks,
BRs

Re: [PATCH] tcmu: Fix possible overflow for memcpy address in iovec

2017-07-11 Thread Xiubo Li



On 2017年07月11日 16:41, Nicholas A. Bellinger wrote:

Hey Xiubo,

On Tue, 2017-07-11 at 16:04 +0800, Xiubo Li wrote:

Hi All

Please ignore this patch.

Just my mistake.

Sorry.


Damien (CC'ed) has been observing something similar atop the latest
target-pending/for-next with his user-space ZBC backend:

http://www.spinics.net/lists/target-devel/msg15804.html

Sorry, I missed this thread; I have spent the last few days debugging a
similar issue that Bryant reported to Mike and me weeks ago.


I reproduced one crash bug as Bryant reported, but I could not get
any crash dump message because the kdump service failed to start in my
environment. I sent another patch just after this one and have tested
it for about 17 hours; it works fine in my environment.


Please ignore the current patch; it is nonsense and would
introduce a new bug.


I am not sure my other fix will resolve the crash Damien
reported; I will have a look later.



Just curious, are you going to re-send a different patch to address
this..?

Another fix patch has been sent, please review.



Is there anything that he can test to verify it's the same bug..?

I could not get any crash dump message in my environment, so both my
case and Damien's need more investigation.


Thanks,
BRs
Xiubo



Brs

Xiubo



On 2017年07月11日 15:40, lixi...@cmss.chinamobile.com wrote:

From: Xiubo Li 

The overflow bug already existed before the data area dynamic grow
patches, but because the data area memory was all preallocated, it
mostly did not produce a bad page fault trace.

The dynamic grow patches allocate and map only the blocks that are
needed in the data area, so when the memcpy overflows, the system
dies.

[  367.864705] [c000fc657340] [c00d220c] do_exit+0x79c/0xcf0
[  367.864710] [c000fc657410] [c00249a4] die+0x314/0x470
[  367.864715] [c000fc6574a0] [c005425c] bad_page_fault+0xdc/0x150
[  367.864720] [c000fc657510] [c0008964] handle_page_fault+0x2c/0x30
[  367.864726] --- interrupt: 300 at memcpy_power7+0x20c/0x840
[  367.864726] LR = tcmu_queue_cmd+0x844/0xa80 [target_core_user]
[  367.864732] [c000fc657800] [d88916d8] tcmu_queue_cmd+0x768/0xa80 
[target_core_user] (unreliable)
[  367.864746] [c000fc657940] [d2993184] 
__target_execute_cmd+0x54/0x150 [target_core_mod]
[  367.864758] [c000fc657970] [d2994708] 
transport_generic_new_cmd+0x158/0x2d0 [target_core_mod]
[  367.864770] [c000fc6579f0] [d29948e4] 
transport_handle_cdb_direct+0x64/0xd0 [target_core_mod]
[  367.864783] [c000fc657a60] [d2994af8] 
target_submit_cmd_map_sgls+0x1a8/0x320 [target_core_mod]
[  367.864796] [c000fc657af0] [d2994cb8] 
target_submit_cmd+0x48/0x60 [target_core_mod]
[  367.864803] [c000fc657b90] [d2a54bd0] 
ibmvscsis_scheduler+0x350/0x5c0 [ibmvscsis]
[  367.864808] [c000fc657c50] [c00f1c28] 
process_one_work+0x1e8/0x5b0
[  367.864813] [c000fc657ce0] [c00f2098] worker_thread+0xa8/0x650
[  367.864818] [c000fc657d80] [c00fa864] kthread+0x114/0x140
[  367.864823] [c000fc657e30] [c00098f0] 
ret_from_kernel_thread+0x5c/0x6c
[  367.864827] Instruction dump:
[  367.864829] 6042 7fe3fb78 4bfcd175 6000 4bfffecc 7c0802a6 f8010010 
6000
[  367.864838] 7c0802a6 f8010010 f821ffe1 e9230690  38210020 e8010010 
7c0803a6
[  367.864847] ---[ end trace 8d085df7e65f7d20 ]---
[  367.870358]
[  367.870362] Fixing recursive fault but reboot is needed!
[  388.859695] INFO: rcu_sched detected stalls on CPUs/tasks:
[  388.859717]  16-...: (0 ticks this GP) idle=7e3/140/0 
softirq=12245/12245 fqs=2622
[  388.859722]  (detected by 20, t=5252 jiffies, g=12458, c=12457, q=2904)
[  388.859744] Task dump for CPU 16:
[  388.859747] kworker/16:2D  0  6865 0 0x0800
[  388.859762] Call Trace:
[  388.859768] [c000fc6579a0] [c14ef090] 
sysctl_sched_migration_cost+0x0/0x4 (unreliable)
[  388.859778] [c000fc6579c0] [d8890c1c] tcmu_parse_cdb+0x2c/0x40 
[target_core_user]
[  388.859782] [c000fc6579e0] [c000fc657a60] 0xc000fc657a60

Reported-by: Bryant G. Ly 
Signed-off-by: Xiubo Li 
---
   drivers/target/target_core_user.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/target/target_core_user.c 
b/drivers/target/target_core_user.c
index 930800c..86a845a 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -437,7 +437,7 @@ static int scatter_data_area(struct tcmu_dev *udev,
to_offset = get_block_offset_user(udev, dbi,
block_remaining);
offset = DATA_BLOCK_SIZE - block_remaining;
-   to = (void *)(unsigned long)to + offset;
+

[PATCH 02/13] mpt3sas: SGL to PRP Translation for I/Os to NVMe devices

2017-07-11 Thread Suganath Prabu S

* Added support for translating the SGLs associated with incoming
commands to either IEEE SGLs or NVMe PRPs for NVMe devices.

* Hardware translation of IEEE SGLs to NVMe PRPs has limitations.
If the hardware cannot translate a command, the command goes to the
firmware, which must translate it instead, and that reduces
performance. To avoid this, the driver proactively checks whether
the translation can be done in hardware; if not, the driver performs
the translation itself.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c  |  623 +-
 drivers/scsi/mpt3sas/mpt3sas_base.h  |   43 +++-
 drivers/scsi/mpt3sas/mpt3sas_ctl.c   |1 +
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |   12 +-
 4 files changed, 666 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 18039bb..b67212c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include /* To get host page size per arch */
 #include 
 
 
@@ -1347,6 +1348,502 @@ _base_build_sg(struct MPT3SAS_ADAPTER *ioc, void *psge,
 /* IEEE format sgls */
 
 /**
+ * _base_build_nvme_prp - This function is called for NVMe end devices to build
+ * a native SGL (NVMe PRP). The native SGL is built starting in the first PRP
+ * entry of the NVMe message (PRP1).  If the data buffer is small enough to be
+ * described entirely using PRP1, then PRP2 is not used.  If needed, PRP2 is
+ * used to describe a larger data buffer.  If the data buffer is too large to
+ * describe using the two PRP entries inside the NVMe message, then PRP1
+ * describes the first data memory segment, and PRP2 contains a pointer to a PRP
+ * list located elsewhere in memory to describe the remaining data memory
+ * segments.  The PRP list will be contiguous.
+
+ * The native SGL for NVMe devices is a Physical Region Page (PRP).  A PRP
+ * consists of a list of PRP entries to describe a number of noncontiguous
+ * physical memory segments as a single memory buffer, just as a SGL does.  Note
+ * however, that this function is only used by the IOCTL call, so the memory
+ * given will be guaranteed to be contiguous.  There is no need to translate
+ * non-contiguous SGL into a PRP in this case.  All PRPs will describe
+ * contiguous space that is one page size each.
+ *
+ * Each NVMe message contains two PRP entries.  The first (PRP1) either contains
+ * a PRP list pointer or a PRP element, depending upon the command.  PRP2
+ * contains the second PRP element if the memory being described fits within 2
+ * PRP entries, or a PRP list pointer if the PRP spans more than two entries.
+ *
+ * A PRP list pointer contains the address of a PRP list, structured as a linear
+ * array of PRP entries.  Each PRP entry in this list describes a segment of
+ * physical memory.
+ *
+ * Each 64-bit PRP entry comprises an address and an offset field.  The address
+ * always points at the beginning of a 4KB physical memory page, and the offset
+ * describes where within that 4KB page the memory segment begins.  Only the
+ * first element in a PRP list may contain a non-zero offset, implying that all
+ * memory segments following the first begin at the start of a 4KB page.
+ *
+ * Each PRP element normally describes 4KB of physical memory, with exceptions
+ * for the first and last elements in the list.  If the memory being described
+ * by the list begins at a non-zero offset within the first 4KB page, then the
+ * first PRP element will contain a non-zero offset indicating where the region
+ * begins within the 4KB page.  The last memory segment may end before the end
+ * of the 4KB segment, depending upon the overall size of the memory being
+ * described by the PRP list.
+ *
+ * Since PRP entries lack any indication of size, the overall data buffer length
+ * is used to determine where the end of the data memory buffer is located, and
+ * how many PRP entries are required to describe it.
+ *
+ * @ioc: per adapter object
+ * @smid: system request message index for getting associated SGL
+ * @nvme_encap_request: the NVMe request msg frame pointer
+ * @data_out_dma: physical address for WRITES
+ * @data_out_sz: data xfer size for WRITES
+ * @data_in_dma: physical address for READS
+ * @data_in_sz: data xfer size for READS
+ *
+ * Returns nothing.
+ */
+static void
+_base_build_nvme_prp(struct MPT3SAS_ADAPTER *ioc, u16 smid,
+   Mpi26NVMeEncapsulatedRequest_t *nvme_encap_request,
+   dma_addr_t data_out_dma, size_t data_out_sz, dma_addr_t data_in_dma,
+   size_t data_in_sz)
+{
+   int prp_size = NVME_PRP_SIZE;
+   u64 *prp_entry, *prp1_entry, *prp2_entry, *prp_entry_phys;
+   u64 *prp_page, *prp_page_phys;
+   u32
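
The PRP rules described in the comment above can be sketched as a small calculation. This is an illustration only, not code from the patch: DEMO_PAGE_SIZE, prp_entries_needed() and prp2_is_list_pointer() are hypothetical names, and a fixed 4KB page is assumed as in the comment. Since PRP entries carry no size, the number of entries follows from the starting offset within the first page plus the total length:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed 4KB page, as in the comment */

/*
 * Number of PRP entries needed to describe a buffer. Only the first
 * entry may start at a non-zero offset within its page; every later
 * entry starts on a page boundary.
 */
static size_t prp_entries_needed(uint64_t dma_addr, size_t length)
{
	uint64_t offset = dma_addr & (DEMO_PAGE_SIZE - 1);

	if (length == 0)
		return 0;
	return (size_t)((offset + length + DEMO_PAGE_SIZE - 1) /
			DEMO_PAGE_SIZE);
}

/*
 * With at most two entries, PRP1 and PRP2 hold them directly; with
 * more, PRP2 holds a pointer to a PRP list instead.
 */
static int prp2_is_list_pointer(size_t entries)
{
	return entries > 2;
}
```

For example, a 4096-byte buffer that starts 0x100 bytes into a page needs two entries, because it spills into the following page.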

[PATCH 03/13] mpt3sas: Added support for nvme encapsulated request message.

2017-07-11 Thread Suganath Prabu S

* Mpt3sas driver uses the NVMe Encapsulated Request message to
send an NVMe command to an NVMe device attached to the IOC.

* Normal I/O commands like reads and writes are passed to the
controller as SCSI commands and the controller has the ability
to translate the commands to NVMe equivalent.

* This encapsulated NVMe command is used by applications to send
direct NVMe commands to NVMe drives or for handling unmap where
the translation at controller/firmware level is having
performance issues.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c |   56 +++-
 drivers/scsi/mpt3sas/mpt3sas_base.h |1 +
 drivers/scsi/mpt3sas/mpt3sas_ctl.c  |   81 ++-
 3 files changed, 136 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index b67212c..a64cfce 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -557,6 +557,11 @@ _base_sas_ioc_info(struct MPT3SAS_ADAPTER *ioc, 
MPI2DefaultReply_t *mpi_reply,
frame_sz = sizeof(Mpi2SmpPassthroughRequest_t) + ioc->sge_size;
func_str = "smp_passthru";
break;
+   case MPI2_FUNCTION_NVME_ENCAPSULATED:
+   frame_sz = sizeof(Mpi26NVMeEncapsulatedRequest_t) +
+   ioc->sge_size;
+   func_str = "nvme_encapsulated";
+   break;
default:
frame_sz = 32;
func_str = "unknown";
@@ -985,7 +990,9 @@ _base_interrupt(int irq, void *bus_id)
if (request_desript_type ==
MPI25_RPY_DESCRIPT_FLAGS_FAST_PATH_SCSI_IO_SUCCESS ||
request_desript_type ==
-   MPI2_RPY_DESCRIPT_FLAGS_SCSI_IO_SUCCESS) {
+   MPI2_RPY_DESCRIPT_FLAGS_SCSI_IO_SUCCESS ||
+   request_desript_type ==
+   MPI26_RPY_DESCRIPT_FLAGS_PCIE_ENCAPSULATED_SUCCESS) {
cb_idx = _base_get_cb_idx(ioc, smid);
if ((likely(cb_idx < MPT_MAX_CALLBACKS)) &&
(likely(mpt_callbacks[cb_idx] != NULL))) {
@@ -3079,6 +3086,30 @@ _base_put_smid_hi_priority(struct MPT3SAS_ADAPTER *ioc, 
u16 smid,
 }
 
 /**
+ * _base_put_smid_nvme_encap - send NVMe encapsulated request to
+ *  firmware
+ * @ioc: per adapter object
+ * @smid: system request message index
+ *
+ * Return nothing.
+ */
+static void
+_base_put_smid_nvme_encap(struct MPT3SAS_ADAPTER *ioc, u16 smid)
+{
+   Mpi2RequestDescriptorUnion_t descriptor;
+   u64 *request = (u64 *)&descriptor;
+
+   descriptor.Default.RequestFlags =
+   MPI26_REQ_DESCRIPT_FLAGS_PCIE_ENCAPSULATED;
+   descriptor.Default.MSIxIndex =  _base_get_msix_index(ioc);
+   descriptor.Default.SMID = cpu_to_le16(smid);
+   descriptor.Default.LMID = 0;
+   descriptor.Default.DescriptorTypeDependent = 0;
+   _base_writeq(*request, &ioc->chip->RequestDescriptorPostLow,
+   &ioc->scsi_lookup_lock);
+}
+
+/**
  * _base_put_smid_default - Default, primarily used for config pages
  * @ioc: per adapter object
  * @smid: system request message index
@@ -3169,6 +3200,27 @@ _base_put_smid_hi_priority_atomic(struct MPT3SAS_ADAPTER 
*ioc, u16 smid,
 }
 
 /**
+ * _base_put_smid_nvme_encap_atomic - send NVMe encapsulated request to
+ *   firmware using Atomic Request Descriptor
+ * @ioc: per adapter object
+ * @smid: system request message index
+ *
+ * Return nothing.
+ */
+static void
+_base_put_smid_nvme_encap_atomic(struct MPT3SAS_ADAPTER *ioc, u16 smid)
+{
+   Mpi26AtomicRequestDescriptor_t descriptor;
+   u32 *request = (u32 *)&descriptor;
+
+   descriptor.RequestFlags = MPI26_REQ_DESCRIPT_FLAGS_PCIE_ENCAPSULATED;
+   descriptor.MSIxIndex = _base_get_msix_index(ioc);
+   descriptor.SMID = cpu_to_le16(smid);
+
+   writel(cpu_to_le32(*request), &ioc->chip->AtomicRequestDescriptorPost);
+}
+
+/**
  * _base_put_smid_default - Default, primarily used for config pages
  * use Atomic Request Descriptor
  * @ioc: per adapter object
@@ -6001,11 +6053,13 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc)
ioc->put_smid_scsi_io = &_base_put_smid_scsi_io_atomic;
ioc->put_smid_fast_path = &_base_put_smid_fast_path_atomic;
ioc->put_smid_hi_priority = &_base_put_smid_hi_priority_atomic;
+   ioc->put_smid_nvme_encap = &_base_put_smid_nvme_encap_atomic;
} else {
ioc->put_smid_default = &_base_put_smid_default;
ioc->put_smid_scsi_io = &_base_put_smid_scsi_io;
ioc->put_smid_fast_path = &_base_put_smid_fast_path;
ioc->put_smid_hi_priority = &_base_put_smid_hi_priority;
+   ioc->put_smid_nvme_encap = &_base_put_smid_nvme_encap;
}
 
 
diff --git

[PATCH 05/13] mpt3sas: Set NVMe device queue depth as 128

2017-07-11 Thread Suganath Prabu S

Sets nvme device queue depth, name and displays device capabilities.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.h  |2 +-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |   40 ++
 2 files changed, 41 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h 
b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 26239ec..0f07b16 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -115,7 +115,7 @@
 
 #define MPT3SAS_RAID_MAX_SECTORS   8192
 #define MPT3SAS_HOST_PAGE_SIZE_4K  12
-
+#define MPT3SAS_NVME_QUEUE_DEPTH   128
 #define MPT_NAME_LENGTH            32  /* generic length of strings */
 #define MPT_STRING_LENGTH  64
 
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 2a6a8e6..e4e35c1 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -1962,6 +1962,7 @@ scsih_slave_configure(struct scsi_device *sdev)
struct MPT3SAS_DEVICE *sas_device_priv_data;
struct MPT3SAS_TARGET *sas_target_priv_data;
struct _sas_device *sas_device;
+   struct _pcie_device *pcie_device;
struct _raid_device *raid_device;
unsigned long flags;
int qdepth;
@@ -2092,6 +2093,45 @@ scsih_slave_configure(struct scsi_device *sdev)
}
}
 
+   /* PCIe handling */
+   if (sas_target_priv_data->flags & MPT_TARGET_FLAGS_PCIE_DEVICE) {
+   spin_lock_irqsave(&ioc->pcie_device_lock, flags);
+   pcie_device = __mpt3sas_get_pdev_by_wwid(ioc,
+   sas_device_priv_data->sas_target->sas_address);
+   if (!pcie_device) {
+   spin_unlock_irqrestore(&ioc->pcie_device_lock, flags);
+   dfailprintk(ioc, pr_warn(MPT3SAS_FMT
+   "failure at %s:%d/%s()!\n", ioc->name, __FILE__,
+   __LINE__, __func__));
+   return 1;
+   }
+
+   /*TODO-right Queue Depth?*/
+   qdepth = MPT3SAS_NVME_QUEUE_DEPTH;
+   ds = "NVMe";
+   /*TODO-Add device name when defined*/
+   sdev_printk(KERN_INFO, sdev,
+   "%s: handle(0x%04x), wwid(0x%016llx), port(%d)\n",
+   ds, handle, (unsigned long long)pcie_device->wwid,
+   pcie_device->port_num);
+   if (pcie_device->enclosure_handle != 0)
+   sdev_printk(KERN_INFO, sdev,
+   "%s: enclosure logical id(0x%016llx), slot(%d)\n",
+   ds,
+   (unsigned long long)pcie_device->enclosure_logical_id,
+   pcie_device->slot);
+   if (pcie_device->connector_name[0] != '\0')
+   sdev_printk(KERN_INFO, sdev,
+   "%s: enclosure level(0x%04x),"
+   "connector name( %s)\n", ds,
+   pcie_device->enclosure_level,
+   pcie_device->connector_name);
+   pcie_device_put(pcie_device);
+   spin_unlock_irqrestore(&ioc->pcie_device_lock, flags);
+   scsih_change_queue_depth(sdev, qdepth);
+   return 0;
+   }
+
spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_device = __mpt3sas_get_sdev_by_addr(ioc,
   sas_device_priv_data->sas_target->sas_address);
-- 
1.7.1

[PATCH 04/13] mpt3sas: Handle NVMe PCIe device related events generated from firmware.

2017-07-11 Thread Suganath Prabu S

* The controller firmware sends separate events for NVMe devices and
PCIe switches similar to existing SAS events.

* NVMe device detection, addition and removal are reported by the
firmware through PCIe Topology Change list events.

* The PCIe device state change events are sent when the firmware
detects any abnormal conditions with an NVMe device or switch.

* Enumeration events are sent when the firmware starts and stops
PCIe device enumeration.

* This patch has the code change to handle the events and add/remove
NVMe devices in driver's inventory.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c  |   30 ++-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |  468 +-
 2 files changed, 492 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index a64cfce..09fecd0 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -663,6 +663,26 @@ _base_display_event_data(struct MPT3SAS_ADAPTER *ioc,
case MPI2_EVENT_ACTIVE_CABLE_EXCEPTION:
desc = "Active cable exception";
break;
+   case MPI2_EVENT_PCIE_DEVICE_STATUS_CHANGE:
+   desc = "PCIE Device Status Change";
+   break;
+   case MPI2_EVENT_PCIE_ENUMERATION:
+   {
+   Mpi26EventDataPCIeEnumeration_t *event_data =
+   (Mpi26EventDataPCIeEnumeration_t *)mpi_reply->EventData;
+   pr_info(MPT3SAS_FMT "PCIE Enumeration: (%s)", ioc->name,
+  (event_data->ReasonCode ==
+   MPI26_EVENT_PCIE_ENUM_RC_STARTED) ?
+   "start" : "stop");
+   if (event_data->EnumerationStatus)
+   pr_info("enumeration_status(0x%08x)",
+  le32_to_cpu(event_data->EnumerationStatus));
+   pr_info("\n");
+   return;
+   }
+   case MPI2_EVENT_PCIE_TOPOLOGY_CHANGE_LIST:
+   desc = "PCIE Topology Change List";
+   break;
}
 
if (!desc)
@@ -6187,8 +6207,16 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc)
_base_unmask_events(ioc, MPI2_EVENT_IR_OPERATION_STATUS);
_base_unmask_events(ioc, MPI2_EVENT_LOG_ENTRY_ADDED);
_base_unmask_events(ioc, MPI2_EVENT_TEMP_THRESHOLD);
-   if (ioc->hba_mpi_version_belonged == MPI26_VERSION)
+   if (ioc->hba_mpi_version_belonged == MPI26_VERSION) {
_base_unmask_events(ioc, MPI2_EVENT_ACTIVE_CABLE_EXCEPTION);
+   if (ioc->is_gen35_ioc) {
+   _base_unmask_events(ioc,
+   MPI2_EVENT_PCIE_DEVICE_STATUS_CHANGE);
+   _base_unmask_events(ioc, MPI2_EVENT_PCIE_ENUMERATION);
+   _base_unmask_events(ioc,
+   MPI2_EVENT_PCIE_TOPOLOGY_CHANGE_LIST);
+   }
+   }
 
r = _base_make_ioc_operational(ioc);
if (r)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 45b8d94..2a6a8e6 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3132,8 +3132,6 @@ _scsih_block_io_device(struct MPT3SAS_ADAPTER *ioc, u16 
handle)
struct _sas_device *sas_device;
 
sas_device = mpt3sas_get_sdev_by_handle(ioc, handle);
-   if (!sas_device)
-   return;
 
shost_for_each_device(sdev, ioc->shost) {
sas_device_priv_data = sdev->hostdata;
@@ -3143,7 +3141,7 @@ _scsih_block_io_device(struct MPT3SAS_ADAPTER *ioc, u16 
handle)
continue;
if (sas_device_priv_data->block)
continue;
-   if (sas_device->pend_sas_rphy_add)
+   if (sas_device && sas_device->pend_sas_rphy_add)
continue;
if (sas_device_priv_data->ignore_delay_remove) {
sdev_printk(KERN_INFO, sdev,
@@ -3154,7 +3152,8 @@ _scsih_block_io_device(struct MPT3SAS_ADAPTER *ioc, u16 
handle)
_scsih_internal_device_block(sdev, sas_device_priv_data);
}
 
-   sas_device_put(sas_device);
+   if (sas_device)
+   sas_device_put(sas_device);
 }
 
 /**
@@ -3238,6 +3237,33 @@ _scsih_block_io_to_children_attached_directly(struct 
MPT3SAS_ADAPTER *ioc,
 }
 
 /**
+ * _scsih_block_io_to_pcie_children_attached_directly
+ * @ioc: per adapter object
+ * @event_data: topology change event data
+ *
+ * This routine set sdev state to SDEV_BLOCK for all devices
+ * direct attached during device pull/reconnect.
+ */
+static void
+_scsih_block_io_to_pcie_children_attached_directly(struct MPT3SAS_ADAPTER *ioc,
+   Mpi26EventDataPCIeTopologyChangeList_t *event_data)

[PATCH 07/13] mpt3sas: API's to remove nvme drive from sml

2017-07-11 Thread Suganath Prabu S

The following APIs are added to the NVMe drive removal path:
_scsih_pcie_device_remove
_scsih_pcie_device_remove_by_handle
_scsih_pcie_device_remove_from_sml

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |  190 +-
 1 files changed, 188 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index e52bebe..68aa102 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -73,7 +73,8 @@ static void _scsih_remove_device(struct MPT3SAS_ADAPTER *ioc,
 static int _scsih_add_device(struct MPT3SAS_ADAPTER *ioc, u16 handle,
u8 retry_count, u8 is_pd);
 static int _scsih_pcie_add_device(struct MPT3SAS_ADAPTER *ioc, u16 handle);
-
+static void _scsih_pcie_device_remove_from_sml(struct MPT3SAS_ADAPTER *ioc,
+   struct _pcie_device *pcie_device);
 static u8 _scsih_check_for_pending_tm(struct MPT3SAS_ADAPTER *ioc, u16 smid);
 
 /* global parameters */
@@ -1048,6 +1049,86 @@ mpt3sas_get_pdev_by_handle(struct MPT3SAS_ADAPTER *ioc, 
u16 handle)
 
return pcie_device;
 }
+
+/**
+ * _scsih_pcie_device_remove - remove pcie_device from list.
+ * @ioc: per adapter object
+ * @pcie_device: the pcie_device object
+ * Context: This function will acquire ioc->pcie_device_lock.
+ *
+ * If pcie_device is on the list, remove it and decrement its reference count.
+ */
+static void
+_scsih_pcie_device_remove(struct MPT3SAS_ADAPTER *ioc,
+   struct _pcie_device *pcie_device)
+{
+   unsigned long flags;
+   int was_on_pcie_device_list = 0;
+
+   if (!pcie_device)
+   return;
+   pr_info(MPT3SAS_FMT
+   "removing handle(0x%04x), wwid(0x%016llx)\n",
+   ioc->name, pcie_device->handle,
+   (unsigned long long) pcie_device->wwid);
+   if (pcie_device->enclosure_handle != 0)
+   pr_info(MPT3SAS_FMT
+   "removing enclosure logical id(0x%016llx), slot(%d)\n",
+   ioc->name,
+   (unsigned long long)pcie_device->enclosure_logical_id,
+   pcie_device->slot);
+   if (pcie_device->connector_name[0] != '\0')
+   pr_info(MPT3SAS_FMT
+   "removing enclosure level(0x%04x), connector name( %s)\n",
+   ioc->name, pcie_device->enclosure_level,
+   pcie_device->connector_name);
+
+   spin_lock_irqsave(&ioc->pcie_device_lock, flags);
+   if (!list_empty(&pcie_device->list)) {
+   list_del_init(&pcie_device->list);
+   was_on_pcie_device_list = 1;
+   }
+   spin_unlock_irqrestore(&ioc->pcie_device_lock, flags);
+   if (was_on_pcie_device_list) {
+   kfree(pcie_device->serial_number);
+   pcie_device_put(pcie_device);
+   }
+}
+
+
+/**
+ * _scsih_pcie_device_remove_by_handle - removing pcie device object by handle
+ * @ioc: per adapter object
+ * @handle: device handle
+ *
+ * Return nothing.
+ */
+static void
+_scsih_pcie_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
+{
+   struct _pcie_device *pcie_device;
+   unsigned long flags;
+   int was_on_pcie_device_list = 0;
+
+   if (ioc->shost_recovery)
+   return;
+
+   spin_lock_irqsave(&ioc->pcie_device_lock, flags);
+   pcie_device = __mpt3sas_get_pdev_by_handle(ioc, handle);
+   if (pcie_device) {
+   if (!list_empty(&pcie_device->list)) {
+   list_del_init(&pcie_device->list);
+   was_on_pcie_device_list = 1;
+   pcie_device_put(pcie_device);
+   }
+   }
+   spin_unlock_irqrestore(&ioc->pcie_device_lock, flags);
+   if (was_on_pcie_device_list) {
+   _scsih_pcie_device_remove_from_sml(ioc, pcie_device);
+   pcie_device_put(pcie_device);
+   }
+}
+
 /**
  * _scsih_pcie_device_add - add pcie_device object
  * @ioc: per adapter object
@@ -6630,6 +6711,83 @@ _scsih_check_pcie_access_status(struct MPT3SAS_ADAPTER *ioc, u64 wwid,
(unsigned long long)wwid, handle);
return rc;
 }
+
+/**
+ * _scsih_pcie_device_remove_from_sml -  removing pcie device
+ * from SML and free up associated memory
+ * @ioc: per adapter object
+ * @pcie_device: the pcie_device object
+ *
+ * Return nothing.
+ */
+static void
+_scsih_pcie_device_remove_from_sml(struct MPT3SAS_ADAPTER *ioc,
+   struct _pcie_device *pcie_device)
+{
+   struct MPT3SAS_TARGET *sas_target_priv_data;
+
+   dewtprintk(ioc, pr_info(MPT3SAS_FMT
+   "%s: enter: handle(0x%04x), wwid(0x%016llx)\n", ioc->name, __func__,
+   pcie_device->handle, (unsigned long long)
+   pcie_device->wwid));
+   if (pcie_device->enclosure_handle != 0)
+   dewtprintk(ioc, pr_info(MPT3SAS_FMT
+

[PATCH 09/13] mpt3as: Add-Task-management-debug-info-for-NVMe-drives.

2017-07-11 Thread Suganath Prabu S

Added debug information for NVMe/PCIe drives in the target reset path.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |   86 --
 1 files changed, 72 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 7100ee8..b96da33 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -690,7 +690,7 @@ found_device:
  * This searches for sas_device based on sas_address, then return sas_device
  * object.
  */
-static struct _sas_device *
+struct _sas_device *
 mpt3sas_get_sdev_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
 {
struct _sas_device *sas_device;
@@ -1208,6 +1208,7 @@ _scsih_pcie_device_init_add(struct MPT3SAS_ADAPTER *ioc,
_scsih_determine_boot_device(ioc, pcie_device, PCIE_CHANNEL);
spin_unlock_irqrestore(&ioc->pcie_device_lock, flags);
 }
+
 /**
  * _scsih_raid_device_find_by_id - raid device search
  * @ioc: per adapter object
@@ -2891,6 +2892,7 @@ _scsih_tm_display_info(struct MPT3SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
struct scsi_target *starget = scmd->device->sdev_target;
struct MPT3SAS_TARGET *priv_target = starget->hostdata;
struct _sas_device *sas_device = NULL;
+   struct _pcie_device *pcie_device = NULL;
unsigned long flags;
char *device_str = NULL;
 
@@ -2907,6 +2909,31 @@ _scsih_tm_display_info(struct MPT3SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
"%s handle(0x%04x), %s wwid(0x%016llx)\n",
device_str, priv_target->handle,
device_str, (unsigned long long)priv_target->sas_address);
+
+   } else if (priv_target->flags & MPT_TARGET_FLAGS_PCIE_DEVICE) {
+   spin_lock_irqsave(&ioc->pcie_device_lock, flags);
+   pcie_device = __mpt3sas_get_pdev_from_target(ioc, priv_target);
+   if (pcie_device) {
+   starget_printk(KERN_INFO, starget,
+   "handle(0x%04x), wwid(0x%016llx), port(%d)\n",
+   pcie_device->handle,
+   (unsigned long long)pcie_device->wwid,
+   pcie_device->port_num);
+   if (pcie_device->enclosure_handle != 0)
+   starget_printk(KERN_INFO, starget,
+   "enclosure logical id(0x%016llx), slot(%d)\n",
+   (unsigned long long)
+   pcie_device->enclosure_logical_id,
+   pcie_device->slot);
+   if (pcie_device->connector_name[0] != '\0')
+   starget_printk(KERN_INFO, starget,
+   "enclosure level(0x%04x), connector name( %s)\n",
+   pcie_device->enclosure_level,
+   pcie_device->connector_name);
+   pcie_device_put(pcie_device);
+   }
+   spin_unlock_irqrestore(&ioc->pcie_device_lock, flags);
+
} else {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_device = __mpt3sas_get_sdev_from_target(ioc, priv_target);
@@ -3650,6 +3677,7 @@ _scsih_tm_tr_send(struct MPT3SAS_ADAPTER *ioc, u16 handle)
Mpi2SCSITaskManagementRequest_t *mpi_request;
u16 smid;
struct _sas_device *sas_device = NULL;
+   struct _pcie_device *pcie_device = NULL;
struct MPT3SAS_TARGET *sas_target_priv_data = NULL;
u64 sas_address = 0;
unsigned long flags;
@@ -3692,24 +3720,52 @@ _scsih_tm_tr_send(struct MPT3SAS_ADAPTER *ioc, u16 handle)
sas_address = sas_device->sas_address;
}
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-
+   if (!sas_device) {
+   spin_lock_irqsave(&ioc->pcie_device_lock, flags);
+   pcie_device = __mpt3sas_get_pdev_by_handle(ioc, handle);
+   if (pcie_device && pcie_device->starget &&
+   pcie_device->starget->hostdata) {
+   sas_target_priv_data = pcie_device->starget->hostdata;
+   sas_target_priv_data->deleted = 1;
+   sas_address = pcie_device->wwid;
+   }
+   spin_unlock_irqrestore(&ioc->pcie_device_lock, flags);
+   }
if (sas_target_priv_data) {
dewtprintk(ioc, pr_info(MPT3SAS_FMT
"setting delete flag: handle(0x%04x), sas_addr(0x%016llx)\n",
ioc->name, handle,
(unsigned long long)sas_address));
-   if (sas_device->enclosure_handle != 0)
-   dewtprintk(ioc, pr_info(MPT3SAS_FMT

[PATCH 11/13] mpt3sas: Fix nvme drives checking for tlr.

2017-07-11 Thread Suganath Prabu S

Check for NVMe drives before enabling or checking tlr.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |   22 --
 1 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index b96da33..4d71ef7 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -2013,6 +2013,14 @@ scsih_is_raid(struct device *dev)
return (sdev->channel == RAID_CHANNEL) ? 1 : 0;
 }
 
+static int
+scsih_is_nvme(struct device *dev)
+{
+   struct scsi_device *sdev = to_scsi_device(dev);
+
+   return (sdev->channel == PCIE_CHANNEL) ? 1 : 0;
+}
+
 /**
  * scsih_get_resync - get raid volume resync percent complete
  * @dev the device struct object
@@ -4810,8 +4818,9 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
/* Make sure Device is not raid volume.
 * We do not expose raid functionality to upper layer for warpdrive.
 */
-   if (!ioc->is_warpdrive && !scsih_is_raid(&scmd->device->sdev_gendev)
-   && sas_is_tlr_enabled(scmd->device) && scmd->cmd_len != 32)
+   if (((!ioc->is_warpdrive && !scsih_is_raid(&scmd->device->sdev_gendev))
+   && !scsih_is_nvme(&scmd->device->sdev_gendev))
+   && sas_is_tlr_enabled(scmd->device) && scmd->cmd_len != 32)
mpi_control |= MPI2_SCSIIO_CONTROL_TLR_ON;
 
smid = mpt3sas_base_get_smid_scsiio(ioc, ioc->scsi_io_cb_idx, scmd);
@@ -4856,8 +4865,8 @@ scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd *scmd)
 
raid_device = sas_target_priv_data->raid_device;
if (raid_device && raid_device->direct_io_enabled)
-   mpt3sas_setup_direct_io(ioc, scmd, raid_device, mpi_request,
-   smid);
+   mpt3sas_setup_direct_io(ioc, scmd,
+   raid_device, mpi_request, smid);
 
if (likely(mpi_request->Function == MPI2_FUNCTION_SCSI_IO_REQUEST)) {
if (sas_target_priv_data->flags & MPT_TARGET_FASTPATH_IO) {
@@ -5405,9 +5414,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply)
le32_to_cpu(mpi_reply->ResponseInfo) & 0xFF;
if (!sas_device_priv_data->tlr_snoop_check) {
sas_device_priv_data->tlr_snoop_check++;
-   if (!ioc->is_warpdrive &&
+   if ((!ioc->is_warpdrive &&
!scsih_is_raid(&scmd->device->sdev_gendev) &&
-   sas_is_tlr_enabled(scmd->device) &&
+   !scsih_is_nvme(&scmd->device->sdev_gendev))
+   && sas_is_tlr_enabled(scmd->device) &&
response_code == MPI2_SCSITASKMGMT_RSP_INVALID_FRAME) {
sas_disable_tlr(scmd->device);
sdev_printk(KERN_INFO, scmd->device, "TLR disabled\n");
-- 
1.7.1
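
The combined condition in this patch is easier to check in isolation. What follows is a minimal, self-contained sketch of the same predicate, not the driver code: struct fake_cmd and its fields are hypothetical stand-ins for the state the driver reads from ioc, scmd->device and sas_is_tlr_enabled(), and the channel numbers are assumed for illustration only.

```c
#include <stdbool.h>

/* Channel numbers are assumptions for this sketch, not the driver's values. */
enum { SAS_CHANNEL_DEMO = 0, RAID_CHANNEL_DEMO = 1, PCIE_CHANNEL_DEMO = 2 };

/* Hypothetical flattened view of what the TLR check looks at. */
struct fake_cmd {
	bool is_warpdrive;     /* stands in for ioc->is_warpdrive */
	int channel;           /* stands in for sdev->channel */
	bool tlr_enabled;      /* stands in for sas_is_tlr_enabled() */
	unsigned int cmd_len;  /* stands in for scmd->cmd_len */
};

/* Returns true when the TLR control bit would be requested. */
static bool want_tlr(const struct fake_cmd *c)
{
	if (c->is_warpdrive)
		return false;
	if (c->channel == RAID_CHANNEL_DEMO)   /* RAID volume */
		return false;
	if (c->channel == PCIE_CHANNEL_DEMO)   /* NVMe drive: the new exclusion */
		return false;
	return c->tlr_enabled && c->cmd_len != 32;
}
```

Written as early returns, the sketch makes it plain that the NVMe test is one more exclusion alongside the warpdrive and RAID tests, which is what the extra parentheses in the patch express.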

[PATCH 13/13] mpt3sas: Update mpt3sas driver version.

2017-07-11 Thread Suganath Prabu S

Updated mpt3sas driver version to 15.101.00.00

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.h |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index ea6e607..835d6da 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -74,11 +74,11 @@
 #define MPT3SAS_DRIVER_NAME "mpt3sas"
 #define MPT3SAS_AUTHOR "Avago Technologies "
 #define MPT3SAS_DESCRIPTION "LSI MPT Fusion SAS 3.0 Device Driver"
-#define MPT3SAS_DRIVER_VERSION "15.100.00.00"
+#define MPT3SAS_DRIVER_VERSION "15.101.00.00"
 #define MPT3SAS_MAJOR_VERSION  15
-#define MPT3SAS_MINOR_VERSION  100
+#define MPT3SAS_MINOR_VERSION  101
 #define MPT3SAS_BUILD_VERSION  0
-#define MPT3SAS_RELEASE_VERSION 00
+#define MPT3SAS_RELEASE_VERSION 0
 
 #define MPT2SAS_DRIVER_NAME "mpt2sas"
 #define MPT2SAS_DESCRIPTION "LSI MPT Fusion SAS 2.0 Device Driver"
-- 
1.7.1

[PATCH 08/13] mpt3sas: scan and add nvme device after controller reset

2017-07-11 Thread Suganath Prabu S

After a controller reset, scan and add NVMe devices back to the topology.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |  196 +-
 1 files changed, 191 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 68aa102..7100ee8 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -4867,6 +4867,7 @@ _scsih_scsi_ioc_info(struct MPT3SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
char *desc_scsi_state = ioc->tmp_string;
u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
struct _sas_device *sas_device = NULL;
+   struct _pcie_device *pcie_device = NULL;
struct scsi_target *starget = scmd->device->sdev_target;
struct MPT3SAS_TARGET *priv_target = starget->hostdata;
char *device_str = NULL;
@@ -4999,6 +5000,28 @@ _scsih_scsi_ioc_info(struct MPT3SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
if (priv_target->flags & MPT_TARGET_FLAGS_VOLUME) {
pr_warn(MPT3SAS_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
device_str, (unsigned long long)priv_target->sas_address);
+   } else if (priv_target->flags & MPT_TARGET_FLAGS_PCIE_DEVICE) {
+   pcie_device = mpt3sas_get_pdev_from_target(ioc, priv_target);
+   if (pcie_device) {
+   pr_info(MPT3SAS_FMT "\twwid(0x%016llx), port(%d)\n",
+   ioc->name,
+   (unsigned long long)pcie_device->wwid,
+   pcie_device->port_num);
+   if (pcie_device->enclosure_handle != 0)
+   pr_info(MPT3SAS_FMT
+   "\tenclosure logical id(0x%016llx), "
+   "slot(%d)\n", ioc->name,
+   (unsigned long long)
+   pcie_device->enclosure_logical_id,
+   pcie_device->slot);
+   if (pcie_device->connector_name[0])
+   pr_info(MPT3SAS_FMT
+   "\tenclosure level(0x%04x),"
+   "connector name( %s)\n",
+   ioc->name, pcie_device->enclosure_level,
+   pcie_device->connector_name);
+   pcie_device_put(pcie_device);
+   }
} else {
sas_device = mpt3sas_get_sdev_from_target(ioc, priv_target);
if (sas_device) {
@@ -5045,11 +5068,10 @@ _scsih_scsi_ioc_info(struct MPT3SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
struct sense_info data;
_scsih_normalize_sense(scmd->sense_buffer, &data);
pr_warn(MPT3SAS_FMT
-   "\t[sense_key,asc,ascq]: [0x%02x,0x%02x,0x%02x], count(%d)\n",
-   ioc->name, data.skey,
-   data.asc, data.ascq, le32_to_cpu(mpi_reply->SenseCount));
+ "\t[sense_key,asc,ascq]: [0x%02x,0x%02x,0x%02x], count(%d)\n",
+ ioc->name, data.skey,
+ data.asc, data.ascq, le32_to_cpu(mpi_reply->SenseCount));
}
-
if (scsi_state & MPI2_SCSI_STATE_RESPONSE_INFO_VALID) {
response_info = le32_to_cpu(mpi_reply->ResponseInfo);
response_bytes = (u8 *)&response_info;
@@ -6931,7 +6953,7 @@ _scsih_pcie_add_device(struct MPT3SAS_ADAPTER *ioc, u16 handle)
pcie_device_pg0.AccessStatus))
return 0;
 
-   if (!(_scsih_is_nvme_device(pcie_device_pg0.DeviceInfo)))
+   if (!(_scsih_is_nvme_device(le32_to_cpu(pcie_device_pg0.DeviceInfo))))
return 0;
 
pcie_device = mpt3sas_get_pdev_by_wwid(ioc, wwid);
@@ -8510,6 +8532,130 @@ _scsih_search_responding_sas_devices(struct MPT3SAS_ADAPTER *ioc)
 }
 
 /**
+ * _scsih_mark_responding_pcie_device - mark a pcie_device as responding
+ * @ioc: per adapter object
+ * @pcie_device_pg0: PCIe Device page 0
+ *
+ * After host reset, find out whether devices are still responding.
+ * Used in _scsih_remove_unresponding_devices.
+ *
+ * Return nothing.
+ */
+static void
+_scsih_mark_responding_pcie_device(struct MPT3SAS_ADAPTER *ioc,
+   Mpi26PCIeDevicePage0_t *pcie_device_pg0)
+{
+   struct MPT3SAS_TARGET *sas_target_priv_data = NULL;
+   struct scsi_target *starget;
+   struct _pcie_device *pcie_device;
+   unsigned long flags;
+
+   spin_lock_irqsave(&ioc->pcie_device_lock, flags);
+   list_for_each_entry(pcie_device, &ioc->pcie_device_list, list) {
+   if ((pcie_device->wwid == pcie_device_pg0->WWID) &&
+   (pcie_device->slot == pcie_device_pg0->Slot)) {
+   pcie_device->responding = 1;
+

[PATCH 12/13] mpt3sas: Update MPI Header

2017-07-11 Thread Suganath Prabu S

Update MPI Files for NVMe support

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpi/mpi2.h  |   43 +++-
 drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h |  647 +-
 drivers/scsi/mpt3sas/mpi/mpi2_init.h |   11 +-
 drivers/scsi/mpt3sas/mpi/mpi2_ioc.h  |  331 +-
 drivers/scsi/mpt3sas/mpi/mpi2_pci.h  |  142 
 drivers/scsi/mpt3sas/mpi/mpi2_tool.h |   14 +-
 6 files changed, 1177 insertions(+), 11 deletions(-)
 create mode 100644 drivers/scsi/mpt3sas/mpi/mpi2_pci.h

diff --git a/drivers/scsi/mpt3sas/mpi/mpi2.h b/drivers/scsi/mpt3sas/mpi/mpi2.h
index a9a659f..bc59058 100644
--- a/drivers/scsi/mpt3sas/mpi/mpi2.h
+++ b/drivers/scsi/mpt3sas/mpi/mpi2.h
@@ -8,7 +8,7 @@
  * scatter/gather formats.
  * Creation Date:  June 21, 2006
  *
- * mpi2.h Version:  02.00.42
+ * mpi2.h Version:  02.00.48
  *
  * NOTE: Names (typedefs, defines, etc.) beginning with an MPI25 or Mpi25
  *   prefix are for use only on MPI v2.5 products, and must not be used
@@ -103,6 +103,16 @@
  * 08-25-15  02.00.40  Bumped MPI2_HEADER_VERSION_UNIT.
  * 12-15-15  02.00.41  Bumped MPI_HEADER_VERSION_UNIT
  * 01-01-16  02.00.42  Bumped MPI_HEADER_VERSION_UNIT
+ * 04-05-16  02.00.43  Modified  MPI26_DIAG_BOOT_DEVICE_SELECT defines
+ * to be unique within first 32 characters.
+ * Removed AHCI support.
+ * Removed SOP support.
+ * Bumped MPI2_HEADER_VERSION_UNIT.
+ * 04-10-16  02.00.44  Bumped MPI2_HEADER_VERSION_UNIT.
+ * 07-06-16  02.00.45  Bumped MPI2_HEADER_VERSION_UNIT.
+ * 09-02-16  02.00.46  Bumped MPI2_HEADER_VERSION_UNIT.
+ * 11-23-16  02.00.47  Bumped MPI2_HEADER_VERSION_UNIT.
+ * 02-03-17  02.00.48  Bumped MPI2_HEADER_VERSION_UNIT.
  * --
  */
 
@@ -142,7 +152,7 @@
 #define MPI2_VERSION_02_06 (0x0206)
 
 /*Unit and Dev versioning for this MPI header set */
-#define MPI2_HEADER_VERSION_UNIT(0x2A)
+#define MPI2_HEADER_VERSION_UNIT(0x30)
 #define MPI2_HEADER_VERSION_DEV (0x00)
 #define MPI2_HEADER_VERSION_UNIT_MASK   (0xFF00)
 #define MPI2_HEADER_VERSION_UNIT_SHIFT  (8)
@@ -249,6 +259,12 @@ typedef volatile struct _MPI2_SYSTEM_INTERFACE_REGS {
 #define MPI2_DIAG_BOOT_DEVICE_SELECT_DEFAULT(0x)
 #define MPI2_DIAG_BOOT_DEVICE_SELECT_HCDW   (0x0800)
 
+/* Defines for V7A/V7R HostDiagnostic Register */
+#define MPI26_DIAG_BOOT_DEVICE_SEL_64FLASH  (0x)
+#define MPI26_DIAG_BOOT_DEVICE_SEL_64HCDW   (0x0800)
+#define MPI26_DIAG_BOOT_DEVICE_SEL_32FLASH  (0x1000)
+#define MPI26_DIAG_BOOT_DEVICE_SEL_32HCDW   (0x1800)
+
 #define MPI2_DIAG_CLEAR_FLASH_BAD_SIG   (0x0400)
 #define MPI2_DIAG_FORCE_HCB_ON_RESET(0x0200)
 #define MPI2_DIAG_HCB_MODE  (0x0100)
@@ -367,6 +383,7 @@ typedef struct _MPI2_DEFAULT_REQUEST_DESCRIPTOR {
 #define MPI2_REQ_DESCRIPT_FLAGS_DEFAULT_TYPE(0x08)
 #define MPI2_REQ_DESCRIPT_FLAGS_RAID_ACCELERATOR(0x0A)
 #define MPI25_REQ_DESCRIPT_FLAGS_FAST_PATH_SCSI_IO  (0x0C)
+#define MPI26_REQ_DESCRIPT_FLAGS_PCIE_ENCAPSULATED  (0x10)
 
 #define MPI2_REQ_DESCRIPT_FLAGS_IOC_FIFO_MARKER (0x01)
 
@@ -425,6 +442,13 @@ typedef MPI2_SCSI_IO_REQUEST_DESCRIPTOR
Mpi25FastPathSCSIIORequestDescriptor_t,
*pMpi25FastPathSCSIIORequestDescriptor_t;
 
+/*PCIe Encapsulated Request Descriptor */
+typedef MPI2_SCSI_IO_REQUEST_DESCRIPTOR
+   MPI26_PCIE_ENCAPSULATED_REQUEST_DESCRIPTOR,
+   *PTR_MPI26_PCIE_ENCAPSULATED_REQUEST_DESCRIPTOR,
+   Mpi26PCIeEncapsulatedRequestDescriptor_t,
+   *pMpi26PCIeEncapsulatedRequestDescriptor_t;
+
 /*union of Request Descriptors */
 typedef union _MPI2_REQUEST_DESCRIPTOR_UNION {
MPI2_DEFAULT_REQUEST_DESCRIPTOR Default;
@@ -433,6 +457,7 @@ typedef union _MPI2_REQUEST_DESCRIPTOR_UNION {
MPI2_SCSI_TARGET_REQUEST_DESCRIPTOR SCSITarget;
MPI2_RAID_ACCEL_REQUEST_DESCRIPTOR RAIDAccelerator;
MPI25_FP_SCSI_IO_REQUEST_DESCRIPTOR FastPathSCSIIO;
+   MPI26_PCIE_ENCAPSULATED_REQUEST_DESCRIPTOR PCIeEncapsulated;
U64 Words;
 } MPI2_REQUEST_DESCRIPTOR_UNION,
*PTR_MPI2_REQUEST_DESCRIPTOR_UNION,
@@ -450,6 +475,7 @@ typedef union _MPI2_REQUEST_DESCRIPTOR_UNION {
  *  Atomic SCSI Target Request Descriptor
  *  Atomic RAID Accelerator Request Descriptor
  *  Atomic Fast Path SCSI IO Request Descriptor
+ *  Atomic PCIe Encapsulated Request Descriptor
  */
 
 /*Atomic Request Descriptor */
@@ -487,6 +513,7 @@ typedef struct _MPI2_DEFAULT_REPLY_DESCRIPTOR {
 #define MPI2_RPY_DESCRIPT_FLAGS_TARGET_COMMAND_BUFFER   (0x03)
 #define MPI2_RPY_DESCRIPT_FLAGS_RAID_ACCELERATOR_SUCCESS(0x05)
 #define

[PATCH 10/13] mpt3sas: NVMe drive support for BTDHMAPPING ioctl command and log info

2017-07-11 Thread Suganath Prabu S

* Added debug prints for PCIe devices in the ioctl debug path, which
will be helpful for debugging.
* Added PCIe device support for the BTDHMAPPING ioctl.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.h  |3 +-
 drivers/scsi/mpt3sas/mpt3sas_ctl.c   |   88 --
 drivers/scsi/mpt3sas/mpt3sas_warpdrive.c |2 +-
 3 files changed, 61 insertions(+), 32 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 063977a..ea6e607 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -452,6 +452,7 @@ struct _internal_cmd {
struct completion done;
void*reply;
void*sense;
+   u64 *nvme_error_response;
u16 status;
u16 smid;
 };
@@ -1615,7 +1616,7 @@ void
 mpt3sas_scsi_direct_io_set(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 direct_io);
 void
 mpt3sas_setup_direct_io(struct MPT3SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
-   struct _raid_device *raid_device, Mpi2SCSIIORequest_t *mpi_request,
+   struct _raid_device *raid_device, Mpi25SCSIIORequest_t *mpi_request,
u16 smid);
 
 /* NCQ Prio Handling Check */
diff --git a/drivers/scsi/mpt3sas/mpt3sas_ctl.c b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
index 35e5c30..269c753 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_ctl.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
@@ -79,32 +79,6 @@ enum block_state {
 };
 
 /**
- * _ctl_sas_device_find_by_handle - sas device search
- * @ioc: per adapter object
- * @handle: sas device handle (assigned by firmware)
- * Context: Calling function should acquire ioc->sas_device_lock
- *
- * This searches for sas_device based on sas_address, then return sas_device
- * object.
- */
-static struct _sas_device *
-_ctl_sas_device_find_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
-{
-   struct _sas_device *sas_device, *r;
-
-   r = NULL;
-   list_for_each_entry(sas_device, &ioc->sas_device_list, list) {
-   if (sas_device->handle != handle)
-   continue;
-   r = sas_device;
-   goto out;
-   }
-
- out:
-   return r;
-}
-
-/**
  * _ctl_display_some_debug - debug routine
  * @ioc: per adapter object
  * @smid: system request message index
@@ -229,10 +203,9 @@ _ctl_display_some_debug(struct MPT3SAS_ADAPTER *ioc, u16 smid,
Mpi2SCSIIOReply_t *scsi_reply =
(Mpi2SCSIIOReply_t *)mpi_reply;
struct _sas_device *sas_device = NULL;
-   unsigned long flags;
+   struct _pcie_device *pcie_device = NULL;
 
-   spin_lock_irqsave(&ioc->sas_device_lock, flags);
-   sas_device = _ctl_sas_device_find_by_handle(ioc,
+   sas_device = mpt3sas_get_sdev_by_handle(ioc,
le16_to_cpu(scsi_reply->DevHandle));
if (sas_device) {
pr_warn(MPT3SAS_FMT "\tsas_address(0x%016llx), phy(%d)\n",
@@ -242,8 +215,25 @@ _ctl_display_some_debug(struct MPT3SAS_ADAPTER *ioc, u16 smid,
"\tenclosure_logical_id(0x%016llx), slot(%d)\n",
ioc->name, (unsigned long long)
sas_device->enclosure_logical_id, sas_device->slot);
+   sas_device_put(sas_device);
+   }
+   if (!sas_device) {
+   pcie_device = mpt3sas_get_pdev_by_handle(ioc,
+   le16_to_cpu(scsi_reply->DevHandle));
+   if (pcie_device) {
+   pr_warn(MPT3SAS_FMT
+   "\tWWID(0x%016llx), port(%d)\n", ioc->name,
+   (unsigned long long)pcie_device->wwid,
+   pcie_device->port_num);
+   if (pcie_device->enclosure_handle != 0)
+   pr_warn(MPT3SAS_FMT
+   "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
+   ioc->name, (unsigned long long)
+   pcie_device->enclosure_logical_id,
+   pcie_device->slot);
+   pcie_device_put(pcie_device);
+   }
}
-   spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
if (scsi_reply->SCSIState || scsi_reply->SCSIStatus)
pr_info(MPT3SAS_FMT
"\tscsi_state(0x%02x), scsi_status"
@@ -1375,6 +1365,42 @@ _ctl_btdh_search_sas_device(struct MPT3SAS_ADAPTER *ioc,
 }
 
 /**
+ * _ctl_btdh_search_pcie_device - searching for pcie device
+ * @ioc: per adapter object
+ * @btdh: btdh ioctl payload
+ */
+static int

[PATCH 06/13] mpt3sas: API 's to support NVMe drive addition to SML

2017-07-11 Thread Suganath Prabu S

The functions below are added in various paths to support NVMe
drive addition.

_scsih_pcie_add_device
_scsih_pcie_device_add
_scsih_pcie_device_init_add
_scsih_check_pcie_access_status
_scsih_pcie_check_device

mpt3sas_get_pdev_by_wwid
mpt3sas_get_pdev_by_idchannel
mpt3sas_get_pdev_by_handle

mpt3sas_config_get_pcie_device_pg0
mpt3sas_config_get_pcie_device_pg2

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.h   |   53 +++
 drivers/scsi/mpt3sas/mpt3sas_config.c |  100 ++
 drivers/scsi/mpt3sas/mpt3sas_scsih.c  |  568 -
 3 files changed, 714 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 0f07b16..063977a 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -563,6 +563,49 @@ struct _pcie_device {
u8  *serial_number;
struct kref refcount;
 };
+
+/**
+ * pcie_device_get - Increment the pcie device reference count
+ *
+ * @p: pcie_device object
+ *
+ * When ever this function called it will increment the
+ * reference count of the pcie device for which this function called.
+ *
+ */
+static inline void pcie_device_get(struct _pcie_device *p)
+{
+   kref_get(&p->refcount);
+}
+
+/**
+ * pcie_device_free - Release the pcie device object
+ * @r - kref object
+ *
+ * Free's the pcie device object. It will be called when reference count
+ * reaches to zero.
+ */
+static inline void pcie_device_free(struct kref *r)
+{
+   kfree(container_of(r, struct _pcie_device, refcount));
+}
+
+/**
+ * pcie_device_put - Decrement the pcie device reference count
+ *
+ * @p: pcie_device object
+ *
+ * When ever this function called it will decrement the
+ * reference count of the pcie device for which this function called.
+ *
+ * When the reference count reaches zero, this calls pcie_device_free to free
+ * the pcie_device object.
+ */
+static inline void pcie_device_put(struct _pcie_device *p)
+{
+   kref_put(&p->refcount, pcie_device_free);
+}
+
 /**
  * struct _raid_device - raid volume link list
  * @list: sas device list
@@ -1417,6 +1460,10 @@ struct _sas_device *mpt3sas_get_sdev_by_addr(
 struct MPT3SAS_ADAPTER *ioc, u64 sas_address);
 struct _sas_device *__mpt3sas_get_sdev_by_addr(
 struct MPT3SAS_ADAPTER *ioc, u64 sas_address);
+struct _sas_device *mpt3sas_get_sdev_by_handle(struct MPT3SAS_ADAPTER *ioc,
+   u16 handle);
+struct _pcie_device *mpt3sas_get_pdev_by_handle(struct MPT3SAS_ADAPTER *ioc,
+   u16 handle);
 
 void mpt3sas_port_enable_complete(struct MPT3SAS_ADAPTER *ioc);
 struct _raid_device *
@@ -1455,6 +1502,12 @@ int mpt3sas_config_get_sas_device_pg0(struct MPT3SAS_ADAPTER *ioc,
 int mpt3sas_config_get_sas_device_pg1(struct MPT3SAS_ADAPTER *ioc,
Mpi2ConfigReply_t *mpi_reply, Mpi2SasDevicePage1_t *config_page,
u32 form, u32 handle);
+int mpt3sas_config_get_pcie_device_pg0(struct MPT3SAS_ADAPTER *ioc,
+   Mpi2ConfigReply_t *mpi_reply, Mpi26PCIeDevicePage0_t *config_page,
+   u32 form, u32 handle);
+int mpt3sas_config_get_pcie_device_pg2(struct MPT3SAS_ADAPTER *ioc,
+   Mpi2ConfigReply_t *mpi_reply, Mpi26PCIeDevicePage2_t *config_page,
+   u32 form, u32 handle);
 int mpt3sas_config_get_sas_iounit_pg0(struct MPT3SAS_ADAPTER *ioc,
Mpi2ConfigReply_t *mpi_reply, Mpi2SasIOUnitPage0_t *config_page,
u16 sz);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_config.c b/drivers/scsi/mpt3sas/mpt3sas_config.c
index dd62701..1c747cf 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_config.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_config.c
@@ -150,6 +150,24 @@ _config_display_some_debug(struct MPT3SAS_ADAPTER *ioc, u16 smid,
case MPI2_CONFIG_EXTPAGETYPE_DRIVER_MAPPING:
desc = "driver_mapping";
break;
+   case MPI2_CONFIG_EXTPAGETYPE_SAS_PORT:
+   desc = "sas_port";
+   break;
+   case MPI2_CONFIG_EXTPAGETYPE_EXT_MANUFACTURING:
+   desc = "ext_manufacturing";
+   break;
+   case MPI2_CONFIG_EXTPAGETYPE_PCIE_IO_UNIT:
+   desc = "pcie_io_unit";
+   break;
+   case MPI2_CONFIG_EXTPAGETYPE_PCIE_SWITCH:
+   desc = "pcie_switch";
+   break;
+   case MPI2_CONFIG_EXTPAGETYPE_PCIE_DEVICE:
+   desc = "pcie_device";
+   break;
+   case MPI2_CONFIG_EXTPAGETYPE_PCIE_LINK:
+   desc = "pcie_link";
+   break;
}
break;
}
@@ -1053,6 +1071,88 @@ mpt3sas_config_get_sas_device_pg1(struct MPT3SAS_ADAPTER *ioc,
 }
 
 /**
+ * mpt3sas_config_get_pcie_device_pg0 - obtain pcie device page 0
+ * @ioc:

[PATCH 01/13] mpt3sas: Add nvme device support in slave alloc, target alloc and probe

2017-07-11 Thread Suganath Prabu S

1) Added support for probing PCIe devices and adding NVMe drives to
SML and to the driver's internal list, pcie_device_list.

2) Added support for determining whether an NVMe drive is the boot device.

3) Added NVMe device support to the callback functions scan_finished,
target_alloc, slave_alloc, target_destroy and slave_destroy.

 a) During scan, PCIe devices are probed and added to SML and to the
driver's internal list.

 b) The target_alloc and slave_alloc APIs allocate resources for the
private data (MPT3SAS_TARGET and MPT3SAS_DEVICE), which holds
information such as handle, target_id, etc.

 c) slave_destroy and target_destroy are called when the driver
unregisters or removes a device. They also free the allocated
resources and info.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.h  |   68 +++-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c |  291 ++
 2 files changed, 325 insertions(+), 34 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 099ab4c..60fa7b6 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -357,7 +357,8 @@ struct Mpi2ManufacturingPage11_t {
  * @flags: MPT_TARGET_FLAGS_XXX flags
  * @deleted: target flaged for deletion
  * @tm_busy: target is busy with TM request.
- * @sdev: The sas_device associated with this target
+ * @sas_dev: The sas_device associated with this target
+ * @pcie_dev: The pcie device associated with this target
  */
 struct MPT3SAS_TARGET {
struct scsi_target *starget;
@@ -368,7 +369,8 @@ struct MPT3SAS_TARGET {
u32 flags;
u8  deleted;
u8  tm_busy;
-   struct _sas_device *sdev;
+   struct _sas_device *sas_dev;
+   struct _pcie_device *pcie_dev;
 };
 
 
@@ -508,6 +510,48 @@ static inline void sas_device_put(struct _sas_device *s)
kref_put(&s->refcount, sas_device_free);
 }
 
+/*
+ * struct _pcie_device - attached PCIe device information
+ * @list: pcie device list
+ * @starget: starget object
+ * @wwid: device WWID
+ * @handle: device handle
+ * @device_info: bitfield provides detailed info about the device
+ * @id: target id
+ * @channel: target channel
+ * @slot: slot number
+ * @port_num: port number
+ * @responding: used in _scsih_pcie_device_mark_responding
+ * @fast_path: fast path feature enable bit
+ * @nvme_mdts: MaximumDataTransferSize from PCIe Device Page 2 for
+ * NVMe device only
+ * @enclosure_handle: enclosure handle
+ * @enclosure_logical_id: enclosure logical identifier
+ * @enclosure_level: The level of device's enclosure from the controller
+ * @connector_name: ASCII value of the Connector's name
+ * @serial_number: pointer of serial number string allocated runtime
+ * @refcount: reference count for deletion
+ */
+struct _pcie_device {
+   struct list_head list;
+   struct scsi_target *starget;
+   u64 wwid;
+   u16 handle;
+   u32 device_info;
+   int id;
+   int channel;
+   u16 slot;
+   u8  port_num;
+   u8  responding;
+   u8  fast_path;
+   u32 nvme_mdts;
+   u16 enclosure_handle;
+   u64 enclosure_logical_id;
+   u8  enclosure_level;
+   u8  connector_name[4];
+   u8  *serial_number;
+   struct kref refcount;
+};
 /**
  * struct _raid_device - raid volume link list
  * @list: sas device list
@@ -556,12 +600,13 @@ struct _raid_device {
 
 /**
  * struct _boot_device - boot device info
- * @is_raid: flag to indicate whether this is volume
- * @device: holds pointer for either struct _sas_device or
- * struct _raid_device
+ *
+ * @channel: sas, raid, or pcie channel
+ * @device: holds pointer for struct _sas_device, struct _raid_device or
+ * struct _pcie_device
  */
 struct _boot_device {
-   u8 is_raid;
+   int channel;
void *device;
 };
 
@@ -825,6 +870,8 @@ typedef void (*MPT3SAS_FLUSH_RUNNING_CMDS)(struct MPT3SAS_ADAPTER *ioc);
  * @bars: bitmask of BAR's that must be configured
  * @mask_interrupts: ignore interrupt
  * @dma_mask: used to set the consistent dma mask
+ * @pci_access_mutex: Mutex to synchronize ioctl,sysfs show path and
+ * pci resource handling
  * @fault_reset_work_q_name: fw fault work queue
  * @fault_reset_work_q: ""
  * @fault_reset_work: ""
@@ -888,9 +935,13 @@ typedef void (*MPT3SAS_FLUSH_RUNNING_CMDS)(struct MPT3SAS_ADAPTER *ioc);
  * @sas_device_list: sas device object list
  * @sas_device_init_list: sas device object list (used only at init time)
  * @sas_device_lock:
+ * @pcie_device_list: pcie device object list
+ * @pcie_device_init_list: pcie device object list (used only at init time)
+ * @pcie_device_lock:
  * @io_missing_delay: time for IO completed by fw when PDR enabled
  * @device_missing_delay: time for device missing by fw when PDR enabled
  * @sas_id : used for setting volume

[PATCH 00/13]mpt3sas driver NVMe support:

2017-07-11 Thread Suganath Prabu S

Ventura Series controllers are Tri-mode. The controller and
firmware are capable of supporting NVMe devices and
PCIe switches connected to the controller. This
patch set adds driver-level support for NVMe devices and
PCIe switches.

Suganath Prabu S (13):
  mpt3sas: Add nvme device support in slave alloc, target alloc and
probe
  mpt3sas: SGL to PRP Translation for I/Os to NVMe  devices
  mpt3sas: Added support for nvme encapsulated request message.
  mpt3sas: Handle NVMe PCIe device related events generated
from firmware.
  mpt3sas: Set NVMe device queue depth as 128
  mpt3sas: API 's to support NVMe drive addition to SML
  mpt3sas: API's to remove nvme drive from sml
  mpt3sas: scan and add nvme device after controller reset
  mpt3as: Add-Task-management-debug-info-for-NVMe-drives.
  mpt3sas: NVMe drive support for BTDHMAPPING ioctl command and log
info
  mpt3sas: Fix nvme drives checking for tlr.
  mpt3sas: Update MPI Header
  mpt3sas: Update mpt3sas driver version.

 drivers/scsi/mpt3sas/mpi/mpi2.h  |   43 +-
 drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h |  647 ++-
 drivers/scsi/mpt3sas/mpi/mpi2_init.h |   11 +-
 drivers/scsi/mpt3sas/mpi/mpi2_ioc.h  |  331 ++-
 drivers/scsi/mpt3sas/mpi/mpi2_pci.h  |  142 +++
 drivers/scsi/mpt3sas/mpi/mpi2_tool.h |   14 +-
 drivers/scsi/mpt3sas/mpt3sas_base.c  |  709 +++-
 drivers/scsi/mpt3sas/mpt3sas_base.h  |  176 +++-
 drivers/scsi/mpt3sas/mpt3sas_config.c|  100 ++
 drivers/scsi/mpt3sas/mpt3sas_ctl.c   |  170 +++-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 1871 --
 drivers/scsi/mpt3sas/mpt3sas_warpdrive.c |2 +-
 12 files changed, 4081 insertions(+), 135 deletions(-)
 create mode 100644 drivers/scsi/mpt3sas/mpi/mpi2_pci.h

Thanks,
Suganath Prabu S

Re: [PATCH] tcmu: Fix possible overflow for memcpy address in iovec

2017-07-11 Thread Damien Le Moal

Nicholas,

On 7/11/17 17:41, Nicholas A. Bellinger wrote:
> Hey Xiubo,
> 
> On Tue, 2017-07-11 at 16:04 +0800, Xiubo Li wrote:
>> Hi All
>>
>> Please ignore this patch.
>>
>> Just my mistake.
>>
>> Sorry.
>>
> 
> Damien (CC'ed) has been observing something similar atop the latest
> target-pending/for-next with his user-space ZBC backend:
> 
> http://www.spinics.net/lists/target-devel/msg15804.html
> 
> Just curious, are you going to re-send a different patch to address
> this..?
> 
> Is there anything that he can test to verify it's the same bug..?
> 
>>
>> Brs
>>
>> Xiubo
>>
>>
>>
>> On 2017年07月11日 15:40, lixi...@cmss.chinamobile.com wrote:
>>> From: Xiubo Li 
>>>
>>> Before the data area dynamic grow patches, the overflow bug
>>> already existed, but because the data area memory was all
>>> preallocated, it mostly did not produce any bad page fault
>>> core trace.
>>>
>>> The dynamic grow patches only allocate and map the blocks
>>> needed in the data area, so when the memcpy overflows, the
>>> system dies.
>>>
>>> [  367.864705] [c000fc657340] [c00d220c] do_exit+0x79c/0xcf0
>>> [  367.864710] [c000fc657410] [c00249a4] die+0x314/0x470
>>> [  367.864715] [c000fc6574a0] [c005425c] 
>>> bad_page_fault+0xdc/0x150
>>> [  367.864720] [c000fc657510] [c0008964] 
>>> handle_page_fault+0x2c/0x30
>>> [  367.864726] --- interrupt: 300 at memcpy_power7+0x20c/0x840
>>> [  367.864726] LR = tcmu_queue_cmd+0x844/0xa80 [target_core_user]
>>> [  367.864732] [c000fc657800] [d88916d8] 
>>> tcmu_queue_cmd+0x768/0xa80 [target_core_user] (unreliable)
>>> [  367.864746] [c000fc657940] [d2993184] 
>>> __target_execute_cmd+0x54/0x150 [target_core_mod]
>>> [  367.864758] [c000fc657970] [d2994708] 
>>> transport_generic_new_cmd+0x158/0x2d0 [target_core_mod]
>>> [  367.864770] [c000fc6579f0] [d29948e4] 
>>> transport_handle_cdb_direct+0x64/0xd0 [target_core_mod]
>>> [  367.864783] [c000fc657a60] [d2994af8] 
>>> target_submit_cmd_map_sgls+0x1a8/0x320 [target_core_mod]
>>> [  367.864796] [c000fc657af0] [d2994cb8] 
>>> target_submit_cmd+0x48/0x60 [target_core_mod]
>>> [  367.864803] [c000fc657b90] [d2a54bd0] 
>>> ibmvscsis_scheduler+0x350/0x5c0 [ibmvscsis]
>>> [  367.864808] [c000fc657c50] [c00f1c28] 
>>> process_one_work+0x1e8/0x5b0
>>> [  367.864813] [c000fc657ce0] [c00f2098] 
>>> worker_thread+0xa8/0x650
>>> [  367.864818] [c000fc657d80] [c00fa864] kthread+0x114/0x140
>>> [  367.864823] [c000fc657e30] [c00098f0] 
>>> ret_from_kernel_thread+0x5c/0x6c
>>> [  367.864827] Instruction dump:
>>> [  367.864829] 6042 7fe3fb78 4bfcd175 6000 4bfffecc 7c0802a6 
>>> f8010010 6000
>>> [  367.864838] 7c0802a6 f8010010 f821ffe1 e9230690  38210020 
>>> e8010010 7c0803a6
>>> [  367.864847] ---[ end trace 8d085df7e65f7d20 ]---
>>> [  367.870358]
>>> [  367.870362] Fixing recursive fault but reboot is needed!
>>> [  388.859695] INFO: rcu_sched detected stalls on CPUs/tasks:
>>> [  388.859717]  16-...: (0 ticks this GP) idle=7e3/140/0 
>>> softirq=12245/12245 fqs=2622
>>> [  388.859722]  (detected by 20, t=5252 jiffies, g=12458, c=12457, q=2904)
>>> [  388.859744] Task dump for CPU 16:
>>> [  388.859747] kworker/16:2D  0  6865 0 0x0800
>>> [  388.859762] Call Trace:
>>> [  388.859768] [c000fc6579a0] [c14ef090] 
>>> sysctl_sched_migration_cost+0x0/0x4 (unreliable)
>>> [  388.859778] [c000fc6579c0] [d8890c1c] 
>>> tcmu_parse_cdb+0x2c/0x40 [target_core_user]
>>> [  388.859782] [c000fc6579e0] [c000fc657a60] 0xc000fc657a60
>>>
>>> Reported-by: Bryant G. Ly 
>>> Signed-off-by: Xiubo Li 
>>> ---
>>>   drivers/target/target_core_user.c | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/target/target_core_user.c 
>>> b/drivers/target/target_core_user.c
>>> index 930800c..86a845a 100644
>>> --- a/drivers/target/target_core_user.c
>>> +++ b/drivers/target/target_core_user.c
>>> @@ -437,7 +437,7 @@ static int scatter_data_area(struct tcmu_dev *udev,
>>> to_offset = get_block_offset_user(udev, dbi,
>>> block_remaining);
>>> offset = DATA_BLOCK_SIZE - block_remaining;
>>> -   to = (void *)(unsigned long)to + offset;
>>> +   to = (void *)((unsigned long)to + offset);
>>>   
>>> if (*iov_cnt != 0 &&
>>> to_offset == iov_tail(udev, *iov)) {
>>> @@ -510,7 +510,7 @@ static void gather_data_area(struct tcmu_dev *udev, 
>>> struct tcmu_cmd *cmd,
>>> copy_bytes = min_t(size_t, sg_remaining,
>>> block_remaining);
>>> offset = DATA_BLOCK_SIZE

Re: [PATCH] tcmu: Fix possible overflow for memcpy address in iovec

2017-07-11 Thread Nicholas A. Bellinger

Hey Xiubo,

On Tue, 2017-07-11 at 16:04 +0800, Xiubo Li wrote:
> Hi All
> 
> Please ignore this patch.
> 
> Just my mistake.
> 
> Sorry.
> 

Damien (CC'ed) has been observing something similar atop the latest
target-pending/for-next with his user-space ZBC backend:

http://www.spinics.net/lists/target-devel/msg15804.html

Just curious, are you going to re-send a different patch to address
this..?

Is there anything that he can test to verify it's the same bug..?

> 
> Brs
> 
> Xiubo
> 
> 
> 
> On 2017年07月11日 15:40, lixi...@cmss.chinamobile.com wrote:
> > From: Xiubo Li 
> >
> > Before the data area dynamic grow patches, the overflow bug
> > already existed, but because the data area memory was all
> > preallocated, it mostly did not produce any bad page fault
> > core trace.
> >
> > The dynamic grow patches only allocate and map the blocks
> > needed in the data area, so when the memcpy overflows, the
> > system dies.
> >
> > [  367.864705] [c000fc657340] [c00d220c] do_exit+0x79c/0xcf0
> > [  367.864710] [c000fc657410] [c00249a4] die+0x314/0x470
> > [  367.864715] [c000fc6574a0] [c005425c] 
> > bad_page_fault+0xdc/0x150
> > [  367.864720] [c000fc657510] [c0008964] 
> > handle_page_fault+0x2c/0x30
> > [  367.864726] --- interrupt: 300 at memcpy_power7+0x20c/0x840
> > [  367.864726] LR = tcmu_queue_cmd+0x844/0xa80 [target_core_user]
> > [  367.864732] [c000fc657800] [d88916d8] 
> > tcmu_queue_cmd+0x768/0xa80 [target_core_user] (unreliable)
> > [  367.864746] [c000fc657940] [d2993184] 
> > __target_execute_cmd+0x54/0x150 [target_core_mod]
> > [  367.864758] [c000fc657970] [d2994708] 
> > transport_generic_new_cmd+0x158/0x2d0 [target_core_mod]
> > [  367.864770] [c000fc6579f0] [d29948e4] 
> > transport_handle_cdb_direct+0x64/0xd0 [target_core_mod]
> > [  367.864783] [c000fc657a60] [d2994af8] 
> > target_submit_cmd_map_sgls+0x1a8/0x320 [target_core_mod]
> > [  367.864796] [c000fc657af0] [d2994cb8] 
> > target_submit_cmd+0x48/0x60 [target_core_mod]
> > [  367.864803] [c000fc657b90] [d2a54bd0] 
> > ibmvscsis_scheduler+0x350/0x5c0 [ibmvscsis]
> > [  367.864808] [c000fc657c50] [c00f1c28] 
> > process_one_work+0x1e8/0x5b0
> > [  367.864813] [c000fc657ce0] [c00f2098] 
> > worker_thread+0xa8/0x650
> > [  367.864818] [c000fc657d80] [c00fa864] kthread+0x114/0x140
> > [  367.864823] [c000fc657e30] [c00098f0] 
> > ret_from_kernel_thread+0x5c/0x6c
> > [  367.864827] Instruction dump:
> > [  367.864829] 6042 7fe3fb78 4bfcd175 6000 4bfffecc 7c0802a6 
> > f8010010 6000
> > [  367.864838] 7c0802a6 f8010010 f821ffe1 e9230690  38210020 
> > e8010010 7c0803a6
> > [  367.864847] ---[ end trace 8d085df7e65f7d20 ]---
> > [  367.870358]
> > [  367.870362] Fixing recursive fault but reboot is needed!
> > [  388.859695] INFO: rcu_sched detected stalls on CPUs/tasks:
> > [  388.859717]  16-...: (0 ticks this GP) idle=7e3/140/0 
> > softirq=12245/12245 fqs=2622
> > [  388.859722]  (detected by 20, t=5252 jiffies, g=12458, c=12457, q=2904)
> > [  388.859744] Task dump for CPU 16:
> > [  388.859747] kworker/16:2D  0  6865 0 0x0800
> > [  388.859762] Call Trace:
> > [  388.859768] [c000fc6579a0] [c14ef090] 
> > sysctl_sched_migration_cost+0x0/0x4 (unreliable)
> > [  388.859778] [c000fc6579c0] [d8890c1c] 
> > tcmu_parse_cdb+0x2c/0x40 [target_core_user]
> > [  388.859782] [c000fc6579e0] [c000fc657a60] 0xc000fc657a60
> >
> > Reported-by: Bryant G. Ly 
> > Signed-off-by: Xiubo Li 
> > ---
> >   drivers/target/target_core_user.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/target/target_core_user.c 
> > b/drivers/target/target_core_user.c
> > index 930800c..86a845a 100644
> > --- a/drivers/target/target_core_user.c
> > +++ b/drivers/target/target_core_user.c
> > @@ -437,7 +437,7 @@ static int scatter_data_area(struct tcmu_dev *udev,
> > to_offset = get_block_offset_user(udev, dbi,
> > block_remaining);
> > offset = DATA_BLOCK_SIZE - block_remaining;
> > -   to = (void *)(unsigned long)to + offset;
> > +   to = (void *)((unsigned long)to + offset);
> >   
> > if (*iov_cnt != 0 &&
> > to_offset == iov_tail(udev, *iov)) {
> > @@ -510,7 +510,7 @@ static void gather_data_area(struct tcmu_dev *udev, 
> > struct tcmu_cmd *cmd,
> > copy_bytes = min_t(size_t, sg_remaining,
> > block_remaining);
> > offset = DATA_BLOCK_SIZE - block_remaining;
> > -   from = (void *)(unsigned long)from + offset;
> > +

Re: tgtd CPU 100% problem

2017-07-11 Thread Sagi Grimberg




On 11/07/17 10:51, 李春 wrote:
> We have met a problem where tgtd uses 100% CPU.
>
> The InfiniBand network card was negotiated in eth mode by mistake.
> After we changed it to ib mode and restarted opensmd to get the correct
> State (Active), tgtd started using 100% of the CPU, and when we connect
> to it using tgtadm, tgtadm hangs forever.
>
> # how to repeat
>
> * tgtd exports a disk through port 3260 over iser
> * iscsiadm logs in to a target from tgt through InfiniBand
>
> * connectx_port_config sets the Mellanox InfiniBand card to eth mode
> * connectx_port_config sets the Mellanox InfiniBand card to ib mode
> * /etc/init.d/opensmd restart
> * tgtadm connecting to tgt will hang
>
> # error message
>
> ```
> Jul  1 21:32:37 shadow tgtd: iser_handle_rdmacm(1628) Unsupported
> event:11, RDMA_CM_EVENT_DEVICE_REMOVAL - ignored
> Jul  1 21:32:37 shadow tgtd: iser_handle_rdmacm(1628) Unsupported
> event:11, RDMA_CM_EVENT_DEVICE_REMOVAL - ignored
>
> Jul  1 21:32:39 shadow tgtd: iser_handle_async_event(3174) dev:mlx4_0
> HCA evt: local catastrophic error

The iser code in tgtd does not know how to correctly handle RDMA device
removal events (and it never did).

The event is generated by the port configuration step while
tgt-iser is bound to the device. Once the device is removed, the device
handle that tgt-iser holds is essentially unusable, which explains
the qp creation failures below.

Handling DEVICE_REMOVAL events is a new feature request.

> Jul  1 21:46:56 shadow tgtd: iser_cm_connect_request(1471)
> conn:0x1380bf0 cm_id:0x1380950 rdma_create_qp failed, Cannot allocate
> memory
> Jul  1 21:46:56 shadow tgtd: iser_cm_connect_request(1520)
> cm_id:0x1380950 rdma_reject failed, Bad file descriptor
> Jul  1 21:46:56 shadow tgtd: iser_cm_connect_request(1471)
> conn:0x1380bf0 cm_id:0x1380950 rdma_create_qp failed, Cannot allocate
> memory

Also, tgt-iser cannot even reject the (re)connect request.

> Jul  1 21:46:56 shadow tgtd: iser_cm_connect_request(1520)
> cm_id:0x1380950 rdma_reject failed, Bad file descriptor
> ```

[PATCH] tcmu: Fix possible memory leak when recalculating the cmd base size

2017-07-11 Thread lixiubo

From: Xiubo Li 

For all the entries allocated from the ring cmd area, the memory
behaves something like a stack and retains old data, so
entry->req.iov_bidi_cnt may be non-zero.

To fix this, memset all of the entry memory before using it. Also,
to make the code more readable, adjust the bidi handling.

Fixes: fe25cc34795 ("tcmu: Recalculate the tcmu_cmd size to save cmd area
memories")
Reported-by: Bryant G. Ly 
Signed-off-by: Xiubo Li 
---
 drivers/target/target_core_user.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/target/target_core_user.c 
b/drivers/target/target_core_user.c
index 2f1fa92..be62c86 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -840,6 +840,7 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd 
*tcmu_cmd,
}
 
entry = (void *) mb + CMDR_OFF + cmd_head;
+   memset(entry, 0, command_size);
tcmu_hdr_set_op(>hdr.len_op, TCMU_OP_CMD);
entry->hdr.cmd_id = tcmu_cmd->cmd_id;
entry->hdr.kflags = 0;
@@ -865,8 +866,8 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd 
*tcmu_cmd,
entry->req.iov_dif_cnt = 0;
 
/* Handle BIDI commands */
+   iov_cnt = 0;
if (se_cmd->se_cmd_flags & SCF_BIDI) {
-   iov_cnt = 0;
iov++;
ret = scatter_data_area(udev, tcmu_cmd,
se_cmd->t_bidi_data_sg,
@@ -879,8 +880,8 @@ static inline size_t tcmu_cmd_get_cmd_size(struct tcmu_cmd 
*tcmu_cmd,
pr_err("tcmu: alloc and scatter bidi data failed\n");
return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
}
-   entry->req.iov_bidi_cnt = iov_cnt;
}
+   entry->req.iov_bidi_cnt = iov_cnt;
 
/*
 * Recalaulate the command's base size and size according
-- 
1.8.3.1

Re: [PATCH] tcmu: Fix possible overflow for memcpy address in iovec

2017-07-11 Thread Xiubo Li


Hi All

Please ignore this patch.

Just my mistake.

Sorry.


Brs

Xiubo



On 2017年07月11日 15:40, lixi...@cmss.chinamobile.com wrote:

From: Xiubo Li 

Before the data area dynamic grow patches, the overflow bug
already existed, but because the data area memory was all
preallocated, it mostly did not produce any bad page fault
core trace.

The dynamic grow patches only allocate and map the blocks
needed in the data area, so when the memcpy overflows, the
system dies.

[  367.864705] [c000fc657340] [c00d220c] do_exit+0x79c/0xcf0
[  367.864710] [c000fc657410] [c00249a4] die+0x314/0x470
[  367.864715] [c000fc6574a0] [c005425c] bad_page_fault+0xdc/0x150
[  367.864720] [c000fc657510] [c0008964] handle_page_fault+0x2c/0x30
[  367.864726] --- interrupt: 300 at memcpy_power7+0x20c/0x840
[  367.864726] LR = tcmu_queue_cmd+0x844/0xa80 [target_core_user]
[  367.864732] [c000fc657800] [d88916d8] tcmu_queue_cmd+0x768/0xa80 
[target_core_user] (unreliable)
[  367.864746] [c000fc657940] [d2993184] 
__target_execute_cmd+0x54/0x150 [target_core_mod]
[  367.864758] [c000fc657970] [d2994708] 
transport_generic_new_cmd+0x158/0x2d0 [target_core_mod]
[  367.864770] [c000fc6579f0] [d29948e4] 
transport_handle_cdb_direct+0x64/0xd0 [target_core_mod]
[  367.864783] [c000fc657a60] [d2994af8] 
target_submit_cmd_map_sgls+0x1a8/0x320 [target_core_mod]
[  367.864796] [c000fc657af0] [d2994cb8] 
target_submit_cmd+0x48/0x60 [target_core_mod]
[  367.864803] [c000fc657b90] [d2a54bd0] 
ibmvscsis_scheduler+0x350/0x5c0 [ibmvscsis]
[  367.864808] [c000fc657c50] [c00f1c28] 
process_one_work+0x1e8/0x5b0
[  367.864813] [c000fc657ce0] [c00f2098] worker_thread+0xa8/0x650
[  367.864818] [c000fc657d80] [c00fa864] kthread+0x114/0x140
[  367.864823] [c000fc657e30] [c00098f0] 
ret_from_kernel_thread+0x5c/0x6c
[  367.864827] Instruction dump:
[  367.864829] 6042 7fe3fb78 4bfcd175 6000 4bfffecc 7c0802a6 f8010010 
6000
[  367.864838] 7c0802a6 f8010010 f821ffe1 e9230690  38210020 e8010010 
7c0803a6
[  367.864847] ---[ end trace 8d085df7e65f7d20 ]---
[  367.870358]
[  367.870362] Fixing recursive fault but reboot is needed!
[  388.859695] INFO: rcu_sched detected stalls on CPUs/tasks:
[  388.859717]  16-...: (0 ticks this GP) idle=7e3/140/0 
softirq=12245/12245 fqs=2622
[  388.859722]  (detected by 20, t=5252 jiffies, g=12458, c=12457, q=2904)
[  388.859744] Task dump for CPU 16:
[  388.859747] kworker/16:2D  0  6865 0 0x0800
[  388.859762] Call Trace:
[  388.859768] [c000fc6579a0] [c14ef090] 
sysctl_sched_migration_cost+0x0/0x4 (unreliable)
[  388.859778] [c000fc6579c0] [d8890c1c] tcmu_parse_cdb+0x2c/0x40 
[target_core_user]
[  388.859782] [c000fc6579e0] [c000fc657a60] 0xc000fc657a60

Reported-by: Bryant G. Ly 
Signed-off-by: Xiubo Li 
---
  drivers/target/target_core_user.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/target/target_core_user.c 
b/drivers/target/target_core_user.c
index 930800c..86a845a 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -437,7 +437,7 @@ static int scatter_data_area(struct tcmu_dev *udev,
to_offset = get_block_offset_user(udev, dbi,
block_remaining);
offset = DATA_BLOCK_SIZE - block_remaining;
-   to = (void *)(unsigned long)to + offset;
+   to = (void *)((unsigned long)to + offset);
  
if (*iov_cnt != 0 &&
to_offset == iov_tail(udev, *iov)) {
@@ -510,7 +510,7 @@ static void gather_data_area(struct tcmu_dev *udev, struct 
tcmu_cmd *cmd,
copy_bytes = min_t(size_t, sg_remaining,
block_remaining);
offset = DATA_BLOCK_SIZE - block_remaining;
-   from = (void *)(unsigned long)from + offset;
+   from = (void *)((unsigned long)from + offset);
tcmu_flush_dcache_range(from, copy_bytes);
memcpy(to + sg->length - sg_remaining, from,
copy_bytes);

[PATCH] tcmu: Fix possible overflow for memcpy address in iovec

2017-07-11 Thread lixiubo

From: Xiubo Li 

Before the data area dynamic grow patches, the overflow bug
already existed, but because the data area memory was all
preallocated, it mostly did not produce any bad page fault
core trace.

The dynamic grow patches only allocate and map the blocks
needed in the data area, so when the memcpy overflows, the
system dies.

[  367.864705] [c000fc657340] [c00d220c] do_exit+0x79c/0xcf0
[  367.864710] [c000fc657410] [c00249a4] die+0x314/0x470
[  367.864715] [c000fc6574a0] [c005425c] bad_page_fault+0xdc/0x150
[  367.864720] [c000fc657510] [c0008964] handle_page_fault+0x2c/0x30
[  367.864726] --- interrupt: 300 at memcpy_power7+0x20c/0x840
[  367.864726] LR = tcmu_queue_cmd+0x844/0xa80 [target_core_user]
[  367.864732] [c000fc657800] [d88916d8] tcmu_queue_cmd+0x768/0xa80 
[target_core_user] (unreliable)
[  367.864746] [c000fc657940] [d2993184] 
__target_execute_cmd+0x54/0x150 [target_core_mod]
[  367.864758] [c000fc657970] [d2994708] 
transport_generic_new_cmd+0x158/0x2d0 [target_core_mod]
[  367.864770] [c000fc6579f0] [d29948e4] 
transport_handle_cdb_direct+0x64/0xd0 [target_core_mod]
[  367.864783] [c000fc657a60] [d2994af8] 
target_submit_cmd_map_sgls+0x1a8/0x320 [target_core_mod]
[  367.864796] [c000fc657af0] [d2994cb8] 
target_submit_cmd+0x48/0x60 [target_core_mod]
[  367.864803] [c000fc657b90] [d2a54bd0] 
ibmvscsis_scheduler+0x350/0x5c0 [ibmvscsis]
[  367.864808] [c000fc657c50] [c00f1c28] 
process_one_work+0x1e8/0x5b0
[  367.864813] [c000fc657ce0] [c00f2098] worker_thread+0xa8/0x650
[  367.864818] [c000fc657d80] [c00fa864] kthread+0x114/0x140
[  367.864823] [c000fc657e30] [c00098f0] 
ret_from_kernel_thread+0x5c/0x6c
[  367.864827] Instruction dump:
[  367.864829] 6042 7fe3fb78 4bfcd175 6000 4bfffecc 7c0802a6 f8010010 
6000
[  367.864838] 7c0802a6 f8010010 f821ffe1 e9230690  38210020 e8010010 
7c0803a6
[  367.864847] ---[ end trace 8d085df7e65f7d20 ]---
[  367.870358]
[  367.870362] Fixing recursive fault but reboot is needed!
[  388.859695] INFO: rcu_sched detected stalls on CPUs/tasks:
[  388.859717]  16-...: (0 ticks this GP) idle=7e3/140/0 
softirq=12245/12245 fqs=2622
[  388.859722]  (detected by 20, t=5252 jiffies, g=12458, c=12457, q=2904)
[  388.859744] Task dump for CPU 16:
[  388.859747] kworker/16:2D  0  6865 0 0x0800
[  388.859762] Call Trace:
[  388.859768] [c000fc6579a0] [c14ef090] 
sysctl_sched_migration_cost+0x0/0x4 (unreliable)
[  388.859778] [c000fc6579c0] [d8890c1c] tcmu_parse_cdb+0x2c/0x40 
[target_core_user]
[  388.859782] [c000fc6579e0] [c000fc657a60] 0xc000fc657a60

Reported-by: Bryant G. Ly 
Signed-off-by: Xiubo Li 
---
 drivers/target/target_core_user.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/target/target_core_user.c 
b/drivers/target/target_core_user.c
index 930800c..86a845a 100644
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -437,7 +437,7 @@ static int scatter_data_area(struct tcmu_dev *udev,
to_offset = get_block_offset_user(udev, dbi,
block_remaining);
offset = DATA_BLOCK_SIZE - block_remaining;
-   to = (void *)(unsigned long)to + offset;
+   to = (void *)((unsigned long)to + offset);
 
if (*iov_cnt != 0 &&
to_offset == iov_tail(udev, *iov)) {
@@ -510,7 +510,7 @@ static void gather_data_area(struct tcmu_dev *udev, struct 
tcmu_cmd *cmd,
copy_bytes = min_t(size_t, sg_remaining,
block_remaining);
offset = DATA_BLOCK_SIZE - block_remaining;
-   from = (void *)(unsigned long)from + offset;
+   from = (void *)((unsigned long)from + offset);
tcmu_flush_dcache_range(from, copy_bytes);
memcpy(to + sg->length - sg_remaining, from,
copy_bytes);
-- 
1.8.3.1

tgtd CPU 100% problem

2017-07-11 Thread 李春

We have met a problem where tgtd uses 100% CPU.

The InfiniBand network card was negotiated in eth mode by mistake.
After we changed it to ib mode and restarted opensmd to get the correct
State (Active), tgtd started using 100% of the CPU, and when we connect
to it using tgtadm, tgtadm hangs forever.

# how to repeat

* tgtd exports a disk through port 3260 over iser
* iscsiadm logs in to a target from tgt through InfiniBand

* connectx_port_config sets the Mellanox InfiniBand card to eth mode
* connectx_port_config sets the Mellanox InfiniBand card to ib mode
* /etc/init.d/opensmd restart
* tgtadm connecting to tgt will hang

# error message

```
Jul  1 21:32:37 shadow tgtd: iser_handle_rdmacm(1628) Unsupported
event:11, RDMA_CM_EVENT_DEVICE_REMOVAL - ignored
Jul  1 21:32:37 shadow tgtd: iser_handle_rdmacm(1628) Unsupported
event:11, RDMA_CM_EVENT_DEVICE_REMOVAL - ignored

Jul  1 21:32:39 shadow tgtd: iser_handle_async_event(3174) dev:mlx4_0
HCA evt: local catastrophic error

Jul  1 21:46:56 shadow tgtd: iser_cm_connect_request(1471)
conn:0x1380bf0 cm_id:0x1380950 rdma_create_qp failed, Cannot allocate
memory
Jul  1 21:46:56 shadow tgtd: iser_cm_connect_request(1520)
cm_id:0x1380950 rdma_reject failed, Bad file descriptor
Jul  1 21:46:56 shadow tgtd: iser_cm_connect_request(1471)
conn:0x1380bf0 cm_id:0x1380950 rdma_create_qp failed, Cannot allocate
memory
Jul  1 21:46:56 shadow tgtd: iser_cm_connect_request(1520)
cm_id:0x1380950 rdma_reject failed, Bad file descriptor
```

# what we found

We downloaded the latest 1.0.70 source code and found the problem
in usr/iscsi/iser.c:iser_nop_work_handler()

```
list_for_each_entry(conn, _conn_list, conn_list) {
	if (conn->h.state != STATE_FULL)
		continue;
	task = conn->nop_in_task;
	if (!task)
		continue;
	conn->nop_in_task = NULL;
	iser_send_ping_nop_in(task);
}

```
Because conn->h.state will never be STATE_FULL, it becomes a dead loop
over the first 3 lines.

Can anyone find the reason and fix the problem?

-- 
pickup.li

Re: [PATCH 00/13]mpt3sas driver NVMe support:

2017-07-11 Thread Johannes Thumshirn

On Tue, Jul 11, 2017 at 01:05:29PM +0530, Suganath Prabu Subramani wrote:
> Is there any update on this ?
> Will include  linux-n...@lists.infradead.org in next version of this
> patch submission, if there is any change suggestion.

Can you please re-send with Cc to linux-nvme?

Thanks,
Johannes
-- 
Johannes Thumshirn                                          Storage
jthumsh...@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Re: [PATCH 00/13]mpt3sas driver NVMe support:

2017-07-11 Thread Suganath Prabu Subramani

Is there any update on this?
I will include linux-n...@lists.infradead.org in the next version of this
patch submission if any changes are suggested.

Thanks,
Suganath Prabu S

On Thu, Jun 29, 2017 at 8:01 PM, Johannes Thumshirn  wrote:
> On Thu, Jun 29, 2017 at 07:49:01PM +0530, Suganath Prabu S wrote:
>> Ventura Series controller are Tri-mode. The controller and
>> firmware are capable of supporting NVMe devices and
>> PCIe switches to be connected with the controller. This
>> patch set adds driver level support for NVMe devices and
>> PCIe switches.
>
> Hi Suganath,
> Can you please also Cc linux-n...@lists.infradead.org for NVMe related topics.
>
> Thanks,
> Johannes
> --
> Johannes Thumshirn                                          Storage
> jthumsh...@suse.de                                +49 911 74053 689
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Felix Imendörffer, Jane Smithard, Graham Norton
> HRB 21284 (AG Nürnberg)
> Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Re: [PATCH] iscsi-target: Reject immediate data underflow larger than SCSI transfer length

2017-07-11 Thread Nicholas A. Bellinger

Hi Bart,

On Thu, 2017-06-08 at 23:55 -0700, Nicholas A. Bellinger wrote:
> On Thu, 2017-06-08 at 15:37 +, Bart Van Assche wrote:
> > On Thu, 2017-06-08 at 04:21 +, Nicholas A. Bellinger wrote:
> > > + /*
> > > +  * Check for underflow case where both EDTL and immediate data payload
> > > +  * exceeds what is presented by CDB's TRANSFER LENGTH, and what has
> > > +  * already been set in target_cmd_size_check() as se_cmd->data_length.
> > > +  *
> > > +  * For this special case, fail the command and dump the immediate data
> > > +  * payload.
> > > +  */
> > > + if (cmd->first_burst_len > cmd->se_cmd.data_length) {
> > > + cmd->sense_reason = TCM_INVALID_CDB_FIELD;
> > > + goto after_immediate_data;
> > > + }
> > 
> > A quote from the iSCSI RFC (https://tools.ietf.org/html/rfc5048):
> > 
> >If SPDTL < EDTL for a task, iSCSI Underflow MUST be signaled in the
> >SCSI Response PDU as specified in [RFC3720].  The Residual Count MUST
> >be set to the numerical value of (EDTL - SPDTL).
> > 
> > Sorry but I don't think that sending TCM_INVALID_CDB_FIELD back to the
> > initiator is compliant with the iSCSI RFC.
> 
> Alas, the nuance of what this patch actually does was missed when you
> cut the context.
> 
> First, a bit of history.  LIO has rejected underflow for all WRITEs for
> the first ~12.5 years of RFC-3720, and in the context of iscsi-target
> mode there has never been a single host environment that ever once
> cared.
> 
> Since Roland's patch to allow underflow for control CDBs in v4.3+ opened
> this discussion for control CDBs with a WRITE payload in order to make
> MSFT/FCP cert for PERSISTENT_RESERVE_OUT happy, the question has become
> what control CDB WRITE underflow cases should we allow..?
> 
> The point with this patch is when a host is sending a underflow with a
> iscsi immediate data payload that exceeds SCSI transfer length, it's a
> bogus request with a garbage payload.  It's a garbage payload because
> the SCSI CDB itself obviously doesn't want anything to do it.
> 
> I'm very dubious of any host environment who's trying to do this for any
> CDB, and expects achieve expected results.
> 
> Of course, since v4.3+ normal overflow where SCSI transfer length
> matches the iscsi immediate data payload just works with or without this
> patch.
> 
> So to that extent, I'm going to push this patch as a defensive fix for
> v4.3+, to let those imaginary iscsi host environments know they being
> very, very naughty.
> 
> >  Please note that a fix that is
> > compliant with the iSCSI RFC is present in the following patch series: 
> > [PATCH
> > 00/33] SCSI target driver patches for kernel v4.13, 23 May 2017
> > (https://www.spinics.net/lists/target-devel/msg15370.html).
> 
> So I might still consider this as a v4.13-rc item for control CDB
> underflow, but no way as stable material.
> 
> Also, there is certainly no way I'm going to allow a patch to randomly
> enable underflow/overflow for all WRITE non control CDBs tree-wide
> across all fabric drivers, because 1) no host environments actually care
> about this, and 2) it's still dangerous to do for all fabrics without
> some serious auditing.

After further consideration, I've decided against allowing iscsi-target
underflow with an immediate data payload larger than SCSI transfer
length.

Any host environment that attempts to send an underflow with an immediate
data payload larger than SCSI transfer length, and expects the target to
automatically truncate the immediate data and still return GOOD status,
is completely bogus.  Any host that attempts this is buggy, and needs to be
fixed.

This is because for the last ~12 years of RFC-3720:

  - There has never been a host environment in the wild that exhibits 
this behavior.
  - There has never been a conformance suite which expects this 
behavior.

So rejecting this case as already done in commit abb85a9b51 is the
correct approach for >= v4.3.y.

Of course, the typical underflow scenario which Roland's v4.3.y commit
enabled, underflow where immediate data matches the SCSI transfer length
is supported for control CDBs.

That said, thanks for highlighting this particular corner case, so it
could be fixed in >= v4.3.y.

[PATCH] hpsa: add support for legacy boards

2017-07-11 Thread Hannes Reinecke

Add support for legacy boards, ensuring to enable the driver for
those boards only when 'hpsa_allow_any' is set.

Signed-off-by: Hannes Reinecke 
---
 drivers/scsi/hpsa.c | 35 +--
 drivers/scsi/hpsa.h | 44 
 2 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 8914eab..2cf6ccc 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -148,6 +148,8 @@
{PCI_VENDOR_ID_HP, 0x333f, 0x103c, 0x333f},
{PCI_VENDOR_ID_HP, PCI_ANY_ID,  PCI_ANY_ID, PCI_ANY_ID,
PCI_CLASS_STORAGE_RAID << 8, 0x << 8, 0},
+   {PCI_VENDOR_ID_COMPAQ, PCI_ANY_ID,  PCI_ANY_ID, PCI_ANY_ID,
+   PCI_CLASS_STORAGE_RAID << 8, 0x << 8, 0},
{0,}
 };
 
@@ -158,6 +160,26 @@
  *  access = Address of the struct of function pointers
  */
 static struct board_type products[] = {
+   {0x40700E11, "Smart Array 5300", _access},
+   {0x40800E11, "Smart Array 5i", _access},
+   {0x40820E11, "Smart Array 532", _access},
+   {0x40830E11, "Smart Array 5312", _access},
+   {0x409A0E11, "Smart Array 641", _access},
+   {0x409B0E11, "Smart Array 642", _access},
+   {0x409C0E11, "Smart Array 6400", _access},
+   {0x409D0E11, "Smart Array 6400 EM", _access},
+   {0x40910E11, "Smart Array 6i", _access},
+   {0x3225103C, "Smart Array P600", _access},
+   {0x3223103C, "Smart Array P800", _access},
+   {0x3234103C, "Smart Array P400", _access},
+   {0x3235103C, "Smart Array P400i", _access},
+   {0x3211103C, "Smart Array E200i", _access},
+   {0x3212103C, "Smart Array E200", _access},
+   {0x3213103C, "Smart Array E200i", _access},
+   {0x3214103C, "Smart Array E200i", _access},
+   {0x3215103C, "Smart Array E200i", _access},
+   {0x3237103C, "Smart Array E500", _access},
+   {0x323D103C, "Smart Array P700m", _access},
{0x3241103C, "Smart Array P212", _access},
{0x3243103C, "Smart Array P410", _access},
{0x3245103C, "Smart Array P410i", _access},
@@ -7243,8 +7265,17 @@ static int hpsa_lookup_board_id(struct pci_dev *pdev, 
u32 *board_id)
subsystem_vendor_id;
 
for (i = 0; i < ARRAY_SIZE(products); i++)
-   if (*board_id == products[i].board_id)
-   return i;
+   if (*board_id == products[i].board_id) {
+   if (products[i].access != _access &&
+   products[i].access != _access)
+   return i;
+   if (hpsa_allow_any) {
+   dev_warn(>dev,
+"unsupported board ID: 0x%08x\n",
+*board_id);
+   return i;
+   }
+   }
 
if ((subsystem_vendor_id != PCI_VENDOR_ID_HP &&
subsystem_vendor_id != PCI_VENDOR_ID_COMPAQ) ||
diff --git a/drivers/scsi/hpsa.h b/drivers/scsi/hpsa.h
index 1c49741..e700d2b 100644
--- a/drivers/scsi/hpsa.h
+++ b/drivers/scsi/hpsa.h
@@ -447,6 +447,25 @@ static void SA5_intr_mask(struct ctlr_info *h, unsigned 
long val)
}
 }
 
+/*
+ *  This card is the opposite of the other cards.
+ *   0 turns interrupts on...
+ *   0x04 turns them off...
+ */
+static void SA5B_intr_mask(struct ctlr_info *h, unsigned long val)
+{
+   if (val) { /* Turn interrupts on */
+   h->interrupts_enabled = 1;
+   writel(0, h->vaddr + SA5_REPLY_INTR_MASK_OFFSET);
+   (void) readl(h->vaddr + SA5_REPLY_INTR_MASK_OFFSET);
+   } else { /* Turn them off */
+   h->interrupts_enabled = 0;
+   writel(SA5B_INTR_OFF,
+  h->vaddr + SA5_REPLY_INTR_MASK_OFFSET);
+   (void) readl(h->vaddr + SA5_REPLY_INTR_MASK_OFFSET);
+   }
+}
+
 static void SA5_performant_intr_mask(struct ctlr_info *h, unsigned long val)
 {
if (val) { /* turn on interrupts */
@@ -549,6 +568,16 @@ static bool SA5_ioaccel_mode1_intr_pending(struct 
ctlr_info *h)
true : false;
 }
 
+/*
+ *  Returns true if an interrupt is pending..
+ */
+static bool SA5B_intr_pending(struct ctlr_info *h)
+{
+   unsigned long register_value  =
+   readl(h->vaddr + SA5_INTR_STATUS);
+   return (register_value & SA5B_INTR_PENDING);
+}
+
 #define IOACCEL_MODE1_REPLY_QUEUE_INDEX  0x1A0
 #define IOACCEL_MODE1_PRODUCER_INDEX 0x1B8
 #define IOACCEL_MODE1_CONSUMER_INDEX 0x1BC
@@ -587,6 +616,21 @@ static unsigned long SA5_ioaccel_mode1_completed(struct 
ctlr_info *h, u8 q)
.command_completed = SA5_completed,
 };
 
+/* Duplicate entry of the above to mark unsupported boards */
+static struct access_method SA5A_access = {
+   .submit_command = SA5_submit_command,
+   .set_intr_mask = SA5_intr_mask,
+
