date:20170213

[PATCH] qla2xxx: Fix warnings reported by static checker.

2017-02-13 Thread Himanshu Madhani

From: Quinn Tran 

Fix following warning reported by static checker

drivers/scsi/qla2xxx/qla_init.c:3910 qla2x00_alloc_fcport()
warn: use 'flags' here instead of GFP_XXX?

Fixes: b79414e ("qla2xxx: Add framework for async fabric discovery")
Reported-by: Dan Carpenter 
Signed-off-by: Quinn Tran 
Signed-off-by: Himanshu Madhani 
Cc: Dan Carpenter 
---
 drivers/scsi/qla2xxx/qla_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index ea6ddcf..1598e84 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -3878,7 +3878,7 @@ static void qla2xxx_nvram_wwn_from_ofw(scsi_qla_host_t 
*vha, nvram_t *nv)
 
fcport->ct_desc.ct_sns = dma_alloc_coherent(>hw->pdev->dev,
sizeof(struct ct_sns_pkt), >ct_desc.ct_sns_dma,
-   GFP_ATOMIC);
+   flags);
fcport->disc_state = DSC_DELETED;
fcport->fw_login_state = DSC_LS_PORT_UNAVAIL;
fcport->deleted = QLA_SESS_DELETED;
-- 
1.8.3.1

RE: [PATCH] scsi: aacraid: fix information leak on hbainfo.driver_name

2017-02-13 Thread Raghava Aditya Renukunta

Hello Colin,


> -Original Message-
> From: Colin King [mailto:colin.k...@canonical.com]
> Sent: Tuesday, February 7, 2017 5:55 AM
> To: dl-esc-Aacraid Linux Driver ; James E . J .
> Bottomley ; Martin K . Petersen
> ; linux-scsi@vger.kernel.org
> Cc: kernel-janit...@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: [PATCH] scsi: aacraid: fix information leak on hbainfo.driver_name
> 
> EXTERNAL EMAIL
> 
> 
> From: Colin Ian King 
> 
> The driver_name field is not initialized and hence information
> on the stack is being leaked to userspace on the copy_to_user.
> Fix this.

I am curious, do you mean that the user will be able to retrieve garbage stack  
values from the variables that were not set (driver_name etc)? .
 If so how is it a security threat?

Regards,
Raghava Aditya

> Signed-off-by: Colin Ian King 
> ---
>  drivers/scsi/aacraid/commctrl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/aacraid/commctrl.c b/drivers/scsi/aacraid/commctrl.c
> index 614842a..eb48d0a 100644
> --- a/drivers/scsi/aacraid/commctrl.c
> +++ b/drivers/scsi/aacraid/commctrl.c
> @@ -1015,7 +1015,7 @@ static int aac_get_pci_info(struct aac_dev* dev,
> void __user *arg)
> 
>  static int aac_get_hba_info(struct aac_dev *dev, void __user *arg)
>  {
> -   struct aac_hba_info hbainfo;
> +   struct aac_hba_info hbainfo = { 0 };
> 
> hbainfo.adapter_number  = (u8) dev->id;
> hbainfo.system_io_bus_number= dev->pdev->bus->number;
> --
> 2.10.2

Re: [PATCH] scsi: lpfc: Add shutdown method for kexec

2017-02-13 Thread Brian King

On 02/12/2017 07:01 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2017-02-13 at 13:21 +1300, Eric W. Biederman wrote:
>>> Good point, at the very least we should call remove if shutdown doesn't
>>> exist. Eric: could we make the changes Ben suggests?
>>
>> Definitely.  That was the original design of the kexec interface
>> but people were worried about calling remove during reboot might
>> introduce regressions on the reboot functionality.  So shutdown was
>> added to be remove without the cleaning up the kernel data structures.
> 
> Right. Part of the problem was also that remove was dependent on 
> CONFIG_HOTPLUG
> though that's no longer the case anymore.
> 
> The problem is that a bunch of drivers either don't have a shutdown or
> worse, have one that actually shuts the HW down rather than "idle" it
> which puts it in a state that the new kernel can't deal with.

If we do transition to use remove rather than shutdown, I think we want
some way for a device driver to know whether we are doing kexec or not.
A RAID adapter with a write cache is going to want to flush its write
cache on a PCI hotplug remove, but for a kexec, its going to want to skip
that so the kexec is faster. Today, since kexec looks like a reboot,
rather than a shutdown, we can skip the flush on a reboot, since its
technically not needed there either.

-Brian

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center

Re: [PATCH] scsi: lpfc: Add shutdown method for kexec

2017-02-13 Thread Benjamin Herrenschmidt

On Tue, 2017-02-14 at 15:45 +1300, Eric W. Biederman wrote:
> The only difference ever that should exist between shutdown and remove
> is do you clean up kernel data structures.  The shutdown method is
> allowed to skip the cleanup up kernel data structures that the remove
> method needs to make.
> 
> Assuming the kernel does not have corrupted data structures I don't see
> any practical reason for distinguishing between the two.  There may be
> some real world gotchas we have to deal with but semantically I don't
> see any concerns.

We had historical additions to shutdown actually shutting things down,
like spinning platters down to park disk heads, switching devices
into D3 on some systems, etc... that remove never had (and we don't
want for kexec).

Cheers,
Ben.

[PATCH] qedf: fix ifnullfree.cocci warnings

2017-02-13 Thread kbuild test robot

drivers/scsi/qedf/qedf_main.c:2742:2-7: WARNING: NULL check before freeing 
functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb 
is not needed. Maybe consider reorganizing relevant code to avoid passing NULL 
values.

 NULL check before some freeing functions is not needed.

 Based on checkpatch warning
 "kfree(NULL) is safe this check is probably not required"
 and kfreeaddr.cocci by Julia Lawall.

Generated by: scripts/coccinelle/free/ifnullfree.cocci

CC: Dupuis, Chad 
Signed-off-by: Fengguang Wu 
---

 qedf_main.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

--- a/drivers/scsi/qedf/qedf_main.c
+++ b/drivers/scsi/qedf/qedf_main.c
@@ -2738,8 +2738,7 @@ static void qedf_free_fcoe_pf_param(stru
 
qedf_free_global_queues(qedf);
 
-   if (qedf->global_queues)
-   kfree(qedf->global_queues);
+   kfree(qedf->global_queues);
 }
 
 /*

Re: [PATCH V4 2/2] qedf: Add QLogic FastLinQ offload FCoE driver framework.

2017-02-13 Thread kbuild test robot

Hi Chad,

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Dupuis-Chad/qed-Add-support-for-hardware-offloaded-FCoE/20170214-040339


coccinelle warnings: (new ones prefixed by >>)

>> drivers/scsi/qedf/qedf_main.c:2742:2-7: WARNING: NULL check before freeing 
>> functions like kfree, debugfs_remove, debugfs_remove_recursive or 
>> usb_free_urb is not needed. Maybe consider reorganizing relevant code to 
>> avoid passing NULL values.

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation

Re: [PATCH] scsi: lpfc: Add shutdown method for kexec

2017-02-13 Thread Benjamin Herrenschmidt

On Mon, 2017-02-13 at 15:57 -0600, Brian King wrote:
> If we do transition to use remove rather than shutdown, I think we
> want
> some way for a device driver to know whether we are doing kexec or
> not.
> A RAID adapter with a write cache is going to want to flush its write
> cache on a PCI hotplug remove, but for a kexec, its going to want to
> skip
> that so the kexec is faster. Today, since kexec looks like a reboot,
> rather than a shutdown, we can skip the flush on a reboot, since its
> technically not needed there either.

What happens if a non-flushed adapter gets a PERST ?

I wouldn't trust that 'don't have to flush' magic ...

Cheers,
Ben.

Re: [PATCH 1/1] mpt3sas: Ignore unaligned completion length for ZBC_IN

2017-02-13 Thread Damien Le Moal

Bart,

On 2/14/17 03:57, Bart Van Assche wrote:
> On Mon, 2017-02-13 at 14:11 +0900, Damien Le Moal wrote:
>> The ZBC_IN command (REPORT ZONES) reply length is always a multiple of
>> 64B and thus may not be aligned on the device LBA size.
>> For this command, retry due to the unaligned completion length is
>> incorrect so do not check alignment of the reply length.
>>
>> Signed-off-by: Damien Le Moal 
>> ---
>>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
>> b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> index 0b5b423..f04b872 100644
>> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
>> @@ -4723,8 +4723,9 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, 
>> u8 msix_index, u32 reply)
>>   * then scsi-ml does not need to handle this misbehavior.
>>   */
>>  sector_sz = scmd->device->sector_size;
>> -if (unlikely(scmd->request->cmd_type == REQ_TYPE_FS && sector_sz &&
>> - xfer_cnt % sector_sz)) {
>> +if (scmd->request->cmd_type == REQ_TYPE_FS &&
>> +scmd->cmnd[0] != ZBC_IN &&
>> +sector_sz && xfer_cnt % sector_sz) {
>>  sdev_printk(KERN_INFO, scmd->device,
>>  "unaligned partial completion avoided (xfer_cnt=%u, 
>> sector_sz=%u)\n",
>>  xfer_cnt, sector_sz);
> 
> Hello Damien,
> 
> What software generated a ZBC_IN request with type REQ_TYPE_FS? Shouldn't
> all ZBC_IN requests be assigned type REQ_TYPE_BLOCK_PC?

Any application that issues a BLKREPORTZONE ioctl (e.g. mkfs.f2fs or
libzbc tools) or kernel component calling blkdev_report_zones (e.g.
f2fs) will generate one or more REQ_OP_ZONE_REPORT BIO which translate
into a REQ_TYPE_FS requests for the command ZBC_IN. That command does
not operate on LBAs. The transfer length of the request can be any
length larger than 64B and the reply length will be at least 64B and
aligned on a multiple of 64.

The patch "mpt3sas: Force request partial completion alignment" applied
last week forgot this case, and frankly, I did too despite having looked
at it.

Martin,

We really need this fix to go into 4.10. Without it, the REPORT
ZONES/ZBC_IN command can be retried forever by the HBA.

Best regards.

-- 
Damien Le Moal, Ph.D.
Sr. Manager, System Software Research Group,
Western Digital Corporation
damien.lem...@wdc.com
(+81) 0466-98-3593 (ext. 513593)
1 kirihara-cho, Fujisawa,
Kanagawa, 252-0888 Japan
www.wdc.com, www.hgst.com

Re: [PATCH 1/1] mpt3sas: Ignore unaligned completion length for ZBC_IN

2017-02-13 Thread Bart Van Assche

On Tue, 2017-02-14 at 12:45 +0900, Damien Le Moal wrote:
> On 2/14/17 03:57, Bart Van Assche wrote:
> > On Mon, 2017-02-13 at 14:11 +0900, Damien Le Moal wrote:
> > > The ZBC_IN command (REPORT ZONES) reply length is always a multiple of
> > > 64B and thus may not be aligned on the device LBA size.
> > > For this command, retry due to the unaligned completion length is
> > > incorrect so do not check alignment of the reply length.
> > > 
> > > Signed-off-by: Damien Le Moal 
> > > ---
> > >  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 5 +++--
> > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
> > > b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > > index 0b5b423..f04b872 100644
> > > --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > > +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > > @@ -4723,8 +4723,9 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 
> > > smid, u8 msix_index, u32 reply)
> > >* then scsi-ml does not need to handle this misbehavior.
> > >*/
> > >   sector_sz = scmd->device->sector_size;
> > > - if (unlikely(scmd->request->cmd_type == REQ_TYPE_FS && sector_sz &&
> > > -  xfer_cnt % sector_sz)) {
> > > + if (scmd->request->cmd_type == REQ_TYPE_FS &&
> > > + scmd->cmnd[0] != ZBC_IN &&
> > > + sector_sz && xfer_cnt % sector_sz) {
> > >   sdev_printk(KERN_INFO, scmd->device,
> > >   "unaligned partial completion avoided (xfer_cnt=%u, 
> > > sector_sz=%u)\n",
> > >   xfer_cnt, sector_sz);
> > 
> > What software generated a ZBC_IN request with type REQ_TYPE_FS? Shouldn't
> > all ZBC_IN requests be assigned type REQ_TYPE_BLOCK_PC?
> 
> Any application that issues a BLKREPORTZONE ioctl (e.g. mkfs.f2fs or
> libzbc tools) or kernel component calling blkdev_report_zones (e.g.
> f2fs) will generate one or more REQ_OP_ZONE_REPORT BIO which translate
> into a REQ_TYPE_FS requests for the command ZBC_IN. That command does
> not operate on LBAs. The transfer length of the request can be any
> length larger than 64B and the reply length will be at least 64B and
> aligned on a multiple of 64.
> 
> The patch "mpt3sas: Force request partial completion alignment" applied
> last week forgot this case, and frankly, I did too despite having looked
> at it.

Hello Damien,

How about modifying the block layer core such that it sets REQ_TYPE_BLOCK_PC
for REQ_OP_ZONE_REPORT requests? Would that be a valid alternative to the
above patch?

Thanks,

Bart.

Re: [PATCH] scsi: lpfc: Add shutdown method for kexec

2017-02-13 Thread Eric W. Biederman

Brian King  writes:

> On 02/12/2017 07:01 PM, Benjamin Herrenschmidt wrote:
>> On Mon, 2017-02-13 at 13:21 +1300, Eric W. Biederman wrote:
 Good point, at the very least we should call remove if shutdown doesn't
 exist. Eric: could we make the changes Ben suggests?
>>>
>>> Definitely.  That was the original design of the kexec interface
>>> but people were worried about calling remove during reboot might
>>> introduce regressions on the reboot functionality.  So shutdown was
>>> added to be remove without the cleaning up the kernel data structures.
>> 
>> Right. Part of the problem was also that remove was dependent on 
>> CONFIG_HOTPLUG
>> though that's no longer the case anymore.
>> 
>> The problem is that a bunch of drivers either don't have a shutdown or
>> worse, have one that actually shuts the HW down rather than "idle" it
>> which puts it in a state that the new kernel can't deal with.
>
> If we do transition to use remove rather than shutdown, I think we want
> some way for a device driver to know whether we are doing kexec or not.
> A RAID adapter with a write cache is going to want to flush its write
> cache on a PCI hotplug remove, but for a kexec, its going to want to skip
> that so the kexec is faster. Today, since kexec looks like a reboot,
> rather than a shutdown, we can skip the flush on a reboot, since its
> technically not needed there either.

This doesn't make any sense.

With a PCI hotplug remove you can't do anything because by the time you
get the event the hardware is gone.  I think there are optional
interlocks but nothing as good as the old 3.5" floppy disk button, or
the button on CD drives.

The only difference ever that should exist between shutdown and remove
is do you clean up kernel data structures.  The shutdown method is
allowed to skip the cleanup up kernel data structures that the remove
method needs to make.

Assuming the kernel does not have corrupted data structures I don't see
any practical reason for distinguishing between the two.  There may be
some real world gotchas we have to deal with but semantically I don't
see any concerns.

Eric

Re: [PATCH] cdrom: Make device operations read-only

2017-02-13 Thread David Miller

From: Kees Cook 
Date: Mon, 13 Feb 2017 16:25:26 -0800

> diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
> index 9cbd217bc0c9..ab9232e1e16f 100644
> --- a/drivers/ide/ide-cd.c
> +++ b/drivers/ide/ide-cd.c
> @@ -1166,7 +1166,7 @@ void ide_cdrom_update_speed(ide_drive_t *drive, u8 *buf)
>CDC_CD_RW | CDC_DVD | CDC_DVD_R | CDC_DVD_RAM | CDC_GENERIC_PACKET | \
>CDC_MO_DRIVE | CDC_MRW | CDC_MRW_W | CDC_RAM)
>  
> -static struct cdrom_device_ops ide_cdrom_dops = {
> +static const struct cdrom_device_ops ide_cdrom_dops = {
>   .open   = ide_cdrom_open_real,
>   .release= ide_cdrom_release_real,
>   .drive_status   = ide_cdrom_drive_status,

Acked-by: David S. Miller

[PATCH] cdrom: Make device operations read-only

2017-02-13 Thread Kees Cook

Since function tables are a common target for attackers, it's best to keep
them in read-only memory. As such, this makes the CDROM device ops tables
const. This drops additionally n_minors, since it isn't used meaningfully,
and sets the only user of cdrom_dummy_generic_packet explicitly so the
variables can all be const.

Inspired by similar changes in grsecurity/PaX.

Signed-off-by: Kees Cook 
---
 Documentation/cdrom/cdrom-standard.tex |  9 +-
 drivers/block/paride/pcd.c |  2 +-
 drivers/cdrom/cdrom.c  | 58 --
 drivers/cdrom/gdrom.c  |  4 +--
 drivers/ide/ide-cd.c   |  2 +-
 drivers/scsi/sr.c  |  2 +-
 include/linux/cdrom.h  |  5 +--
 7 files changed, 37 insertions(+), 45 deletions(-)

diff --git a/Documentation/cdrom/cdrom-standard.tex 
b/Documentation/cdrom/cdrom-standard.tex
index c06233fe52ac..8f85b0e41046 100644
--- a/Documentation/cdrom/cdrom-standard.tex
+++ b/Documentation/cdrom/cdrom-standard.tex
@@ -249,7 +249,6 @@ struct& cdrom_device_ops\ \{ \hidewidth\cr
 unsigned\ long);\cr
 \noalign{\medskip}
   \ int& capability;& capability flags \cr
-  & n_minors;& number of active minor devices \cr
 \};\cr
 }
 $$
@@ -258,13 +257,7 @@ it should add a function pointer to this $struct$. When a 
particular
 function is not implemented, however, this $struct$ should contain a
 NULL instead. The $capability$ flags specify the capabilities of the
 \cdrom\ hardware and/or low-level \cdrom\ driver when a \cdrom\ drive
-is registered with the \UCD. The value $n_minors$ should be a positive
-value indicating the number of minor devices that are supported by
-the low-level device driver, normally~1. Although these two variables
-are `informative' rather than `operational,' they are included in
-$cdrom_device_ops$ because they describe the capability of the {\em
-driver\/} rather than the {\em drive}. Nomenclature has always been
-difficult in computer programming.
+is registered with the \UCD.
 
 Note that most functions have fewer parameters than their
 $blkdev_fops$ counterparts. This is because very little of the
diff --git a/drivers/block/paride/pcd.c b/drivers/block/paride/pcd.c
index 5fd2d0e25567..10aed84244f5 100644
--- a/drivers/block/paride/pcd.c
+++ b/drivers/block/paride/pcd.c
@@ -273,7 +273,7 @@ static const struct block_device_operations pcd_bdops = {
.check_events   = pcd_block_check_events,
 };
 
-static struct cdrom_device_ops pcd_dops = {
+static const struct cdrom_device_ops pcd_dops = {
.open   = pcd_open,
.release= pcd_release,
.drive_status   = pcd_drive_status,
diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index 59cca72647a6..bbbd3caa927c 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -342,8 +342,8 @@ static void cdrom_sysctl_register(void);
 
 static LIST_HEAD(cdrom_list);
 
-static int cdrom_dummy_generic_packet(struct cdrom_device_info *cdi,
- struct packet_command *cgc)
+int cdrom_dummy_generic_packet(struct cdrom_device_info *cdi,
+  struct packet_command *cgc)
 {
if (cgc->sense) {
cgc->sense->sense_key = 0x05;
@@ -354,6 +354,7 @@ static int cdrom_dummy_generic_packet(struct 
cdrom_device_info *cdi,
cgc->stat = -EIO;
return -EIO;
 }
+EXPORT_SYMBOL(cdrom_dummy_generic_packet);
 
 static int cdrom_flush_cache(struct cdrom_device_info *cdi)
 {
@@ -371,7 +372,7 @@ static int cdrom_flush_cache(struct cdrom_device_info *cdi)
 static int cdrom_get_disc_info(struct cdrom_device_info *cdi,
   disc_information *di)
 {
-   struct cdrom_device_ops *cdo = cdi->ops;
+   const struct cdrom_device_ops *cdo = cdi->ops;
struct packet_command cgc;
int ret, buflen;
 
@@ -586,7 +587,7 @@ static int cdrom_mrw_set_lba_space(struct cdrom_device_info 
*cdi, int space)
 int register_cdrom(struct cdrom_device_info *cdi)
 {
static char banner_printed;
-   struct cdrom_device_ops *cdo = cdi->ops;
+   const struct cdrom_device_ops *cdo = cdi->ops;
int *change_capability = (int *)>capability; /* hack */
 
cd_dbg(CD_OPEN, "entering register_cdrom\n");
@@ -610,7 +611,6 @@ int register_cdrom(struct cdrom_device_info *cdi)
ENSURE(reset, CDC_RESET);
ENSURE(generic_packet, CDC_GENERIC_PACKET);
cdi->mc_flags = 0;
-   cdo->n_minors = 0;
cdi->options = CDO_USE_FFLAGS;
 
if (autoclose == 1 && CDROM_CAN(CDC_CLOSE_TRAY))
@@ -630,8 +630,7 @@ int register_cdrom(struct cdrom_device_info *cdi)
else
cdi->cdda_method = CDDA_OLD;
 
-   if (!cdo->generic_packet)
-   cdo->generic_packet = cdrom_dummy_generic_packet;
+   WARN_ON(!cdo->generic_packet);
 
cd_dbg(CD_REG_UNREG, "drive \"/dev/%s\" registered\n", cdi->name);

Re: [PATCH 1/1] mpt3sas: Ignore unaligned completion length for ZBC_IN

2017-02-13 Thread h...@lst.de

On Tue, Feb 14, 2017 at 02:21:37PM +0900, Damien Le Moal wrote:
> > I think we want to keep the knowledge of which requests have a request size
> > that should be a multiple of the logical block size in the block layer core
> > or in the SCSI core but not in the mpt3sas driver. But I'm not sure what the
> > best approach is to do that. Should we use the request type, should we add a
> > new request attribute or should we add a new function?
> 
> I agree. But the mpt3sas patch that introduced the problem is to solve
> problems with a buggy hardware in the first place... A quirck of some
> sort. So do we really need such a big change just the report zones
> exception ? (all other REQ_TYPE_FS commands either have no payload or
> the payload size is LBA aligned)
> 
> Martin, James, Hannes,
> 
> Please advise ! 4.10 is introducing zoned block device support, and from
> the first release that will be broken with mpt3sas HBAs if we do not
> patch this. Any preferred approach ?

I suspect we'll need to move the workaround to the SD driver.  While it's
an mpt bug, sd is the party that knows which kind of requests were
generated, so it's where we can fix up the length in the done callback.

Re: [PATCH 1/1] mpt3sas: Ignore unaligned completion length for ZBC_IN

2017-02-13 Thread Damien Le Moal

Bart,

On 2/14/17 12:59, Bart Van Assche wrote:
>>> What software generated a ZBC_IN request with type REQ_TYPE_FS? Shouldn't
>>> all ZBC_IN requests be assigned type REQ_TYPE_BLOCK_PC?
>>
>> Any application that issues a BLKREPORTZONE ioctl (e.g. mkfs.f2fs or
>> libzbc tools) or kernel component calling blkdev_report_zones (e.g.
>> f2fs) will generate one or more REQ_OP_ZONE_REPORT BIO which translate
>> into a REQ_TYPE_FS requests for the command ZBC_IN. That command does
>> not operate on LBAs. The transfer length of the request can be any
>> length larger than 64B and the reply length will be at least 64B and
>> aligned on a multiple of 64.
>>
>> The patch "mpt3sas: Force request partial completion alignment" applied
>> last week forgot this case, and frankly, I did too despite having looked
>> at it.
> 
> Hello Damien,
> 
> How about modifying the block layer core such that it sets REQ_TYPE_BLOCK_PC
> for REQ_OP_ZONE_REPORT requests? Would that be a valid alternative to the
> above patch?

I think so. But my understanding of REQ_TYPE_BLOCK_PC is that it is the
type for requests issued internally (scsi_execute) of from things like
the SG driver, so in essence, all requests not derived from a BIO... Is
this correct ? If yes, then setting the BLOCK_TYPE_PC for
REQ_OP_ZONE_REPORT (and REQ_OP_ZONE_RESET while at it) would break this
model, wouldn't it ?

Best regards.

-- 
Damien Le Moal, Ph.D.
Sr. Manager, System Software Research Group,
Western Digital Corporation
damien.lem...@wdc.com
(+81) 0466-98-3593 (ext. 513593)
1 kirihara-cho, Fujisawa,
Kanagawa, 252-0888 Japan
www.wdc.com, www.hgst.com

Re: [PATCH 1/1] mpt3sas: Ignore unaligned completion length for ZBC_IN

2017-02-13 Thread Damien Le Moal

Christoph,

On 2/14/17 15:18, h...@lst.de wrote:
> On Tue, Feb 14, 2017 at 02:21:37PM +0900, Damien Le Moal wrote:
>>> I think we want to keep the knowledge of which requests have a request size
>>> that should be a multiple of the logical block size in the block layer core
>>> or in the SCSI core but not in the mpt3sas driver. But I'm not sure what the
>>> best approach is to do that. Should we use the request type, should we add a
>>> new request attribute or should we add a new function?
>>
>> I agree. But the mpt3sas patch that introduced the problem is to solve
>> problems with a buggy hardware in the first place... A quirck of some
>> sort. So do we really need such a big change just the report zones
>> exception ? (all other REQ_TYPE_FS commands either have no payload or
>> the payload size is LBA aligned)
>>
>> Martin, James, Hannes,
>>
>> Please advise ! 4.10 is introducing zoned block device support, and from
>> the first release that will be broken with mpt3sas HBAs if we do not
>> patch this. Any preferred approach ?
> 
> I suspect we'll need to move the workaround to the SD driver.  While it's
> an mpt bug, sd is the party that knows which kind of requests were
> generated, so it's where we can fix up the length in the done callback.

This makes sense. sd_done already sets report zone command resid to 0,
always. So it make sense to have the mpt3sas check after that, written
in a generic manner. I will send a patch shortly.

Best regards.


-- 
Damien Le Moal, Ph.D.
Sr. Manager, System Software Research Group,
Western Digital Corporation
damien.lem...@wdc.com
(+81) 0466-98-3593 (ext. 513593)
1 kirihara-cho, Fujisawa,
Kanagawa, 252-0888 Japan
www.wdc.com, www.hgst.com

[bug report] scsi: ufs-qcom: dump additional testbus registers

2017-02-13 Thread Dan Carpenter

Hello Venkat Gopalakrishnan,

The patch 9c46b8676271: "scsi: ufs-qcom: dump additional testbus
registers" from Feb 3, 2017, leads to the following static checker
warning:

drivers/scsi/ufs/ufs-qcom.c:1531 ufs_qcom_testbus_cfg_is_ok()
warn: impossible condition '(host->testbus.select_minor > 255) => 
(0-255 > 255)'

drivers/scsi/ufs/ufs-qcom.c
  1517  static bool ufs_qcom_testbus_cfg_is_ok(struct ufs_qcom_host *host)
  1518  {
  1519  if (host->testbus.select_major >= TSTBUS_MAX) {
  1520  dev_err(host->hba->dev,
  1521  "%s: UFS_CFG1[TEST_BUS_SEL} may not equal 
0x%05X\n",
  1522  __func__, host->testbus.select_major);
  1523  return false;
  1524  }
  1525  
  1526  /*
  1527   * Not performing check for each individual select_major
  1528   * mappings of select_minor, since there is no harm in
  1529   * configuring a non-existent select_minor
  1530   */
  1531  if (host->testbus.select_minor > 0xFF) {
^

It might make sense to keep this check.  I don't know.  But it's
confusing that 0xFF is a magic number.  Better to make it a define.

  1532  dev_err(host->hba->dev,
  1533  "%s: 0x%05X is not a legal testbus option\n",
  1534  __func__, host->testbus.select_minor);
  1535  return false;
  1536  }
  1537  
  1538  return true;
  1539  }

regards,
dan carpenter

Re: [PATCH RFC] target/user: Add double ring buffers support.

2017-02-13 Thread Xiubo Li




On 2017年02月14日 01:42, Andy Grover wrote:

On 02/12/2017 05:38 PM, Xiubo Li wrote:

On 2017年02月11日 02:02, Andy Grover wrote:

1. increase the region mmap()ed by userspace, TCMU_RING_SIZE, from 1MB
to 1GB or larger
2. Don't vmalloc() the whole thing, instead vmalloc for the cmd ring
portion, and dynamically alloc pages for the data area as needed and
map them into the data area.

Should the data area mapping is dynamical or add one new API to
userspace to trigger it ?


I would recommend having the kernel make the decision to allocate/map 
more memory, or not (i.e. wait for existing I/O to complete and free 
pages) because this avoids the need for a new API, and also the kernel 
may better decide how to allocate total pages used across multiple 
TCMU devices.



Yes, agreed.



4. Implement an algorithm to keep allocated pages mapped into the data
area for reuse, and maybe a heuristic to keep extreme burstiness from
over-allocating pages


Does a little like slab ?


It's definitely becoming less of a storage issue and more of a 
page-management issue. It might make sense to look at mm code and ask 
mm experts for their advice, yep.



Sure, i will have a deep investigate about this.

Thanks very much.

BRs
Xiubo



This should allow TCMU to allocate more data area as needed, not waste
memory for slower devices, and avoid userspace ABI changes. Could we
prototype this approach and see if it is workable?


I will do more investigate about this.


Thank you for working to improve this area!

Regards -- Andy

Re: [PATCH 1/1] mpt3sas: Ignore unaligned completion length for ZBC_IN

2017-02-13 Thread Bart Van Assche

On Tue, 2017-02-14 at 13:42 +0900, Damien Le Moal wrote:
> I think so. But my understanding of REQ_TYPE_BLOCK_PC is that it is the
> type for requests issued internally (scsi_execute) of from things like
> the SG driver, so in essence, all requests not derived from a BIO... Is
> this correct ? If yes, then setting the BLOCK_TYPE_PC for
> REQ_OP_ZONE_REPORT (and REQ_OP_ZONE_RESET while at it) would break this
> model, wouldn't it ?

Hello Damien,

I think we want to keep the knowledge of which requests have a request size
that should be a multiple of the logical block size in the block layer core
or in the SCSI core but not in the mpt3sas driver. But I'm not sure what the
best approach is to do that. Should we use the request type, should we add a
new request attribute or should we add a new function?

Bart.

Re: [PATCH 1/1] mpt3sas: Ignore unaligned completion length for ZBC_IN

2017-02-13 Thread Damien Le Moal

Bart,

On 2/14/17 14:11, Bart Van Assche wrote:
> On Tue, 2017-02-14 at 13:42 +0900, Damien Le Moal wrote:
>> I think so. But my understanding of REQ_TYPE_BLOCK_PC is that it is the
>> type for requests issued internally (scsi_execute) of from things like
>> the SG driver, so in essence, all requests not derived from a BIO... Is
>> this correct ? If yes, then setting the BLOCK_TYPE_PC for
>> REQ_OP_ZONE_REPORT (and REQ_OP_ZONE_RESET while at it) would break this
>> model, wouldn't it ?
> 
> Hello Damien,
> 
> I think we want to keep the knowledge of which requests have a request size
> that should be a multiple of the logical block size in the block layer core
> or in the SCSI core but not in the mpt3sas driver. But I'm not sure what the
> best approach is to do that. Should we use the request type, should we add a
> new request attribute or should we add a new function?

I agree. But the mpt3sas patch that introduced the problem is to solve
problems with a buggy hardware in the first place... A quirck of some
sort. So do we really need such a big change just the report zones
exception ? (all other REQ_TYPE_FS commands either have no payload or
the payload size is LBA aligned)

Martin, James, Hannes,

Please advise ! 4.10 is introducing zoned block device support, and from
the first release that will be broken with mpt3sas HBAs if we do not
patch this. Any preferred approach ?

Best regards.

-- 
Damien Le Moal, Ph.D.
Sr. Manager, System Software Research Group,
Western Digital Corporation
damien.lem...@wdc.com
(+81) 0466-98-3593 (ext. 513593)
1 kirihara-cho, Fujisawa,
Kanagawa, 252-0888 Japan
www.wdc.com, www.hgst.com

Re: [PATCH RFC] target/user: Add double ring buffers support.

2017-02-13 Thread Xiubo Li




Yes I think it's now clear we need more buffer space to avoid
bottlenecks for high iops. The initial design kept it simple with the
1MB vmalloc'd space but anticipated greater would be needed. It should
not be necessary to change userspace or the TCMU ABI to handle growing
the buffer for fast devices:

1. increase the region mmap()ed by userspace, TCMU_RING_SIZE, from 1MB
to 1GB or larger

For the cmd area, set the size to fixed 512M, and data area's size to
fixed 1G, is that okay ?


512M seems a little big for the cmd area... use of the cmd ring should 
be much smaller than the data area. Each cmd has a minimum size, but 
then to describe an additional page in the iovec[] should be just one 
struct iovec (16 bytes?) for each 4K page.




The struct tcmu_cmd_entry {} size is fixed 44 bytes without iovec[], and 
the size of struct iovec[N] is about 16 bytes * N.


The cmd entry size will be [44B, N *16 + 44B], and the data size will be 
[0, N * 4096], so the ratio of sizeof(cmd entry): sizeof(entry datas) == 
(N * 16 + 44)Bytes : (N * 4096)Bytes == (N * 16)/(N * 4096) + 
44/(N*4096) == 1/256 + 11/(N * 1024).
When N is bigger, the ratio will be smaller. If N >= 1, the ratio will 
be [15/1024, 4/1024).


So, that's right, 512M is a little bigger than actually needed, for the 
safe ratio i think 16/1024 is enough, if the data area is fixed 1G, so 
the cmd area could be 16M+.

Or cmd area(64M) + data area(960M) == 1G ?



3. Upgrade the current fixed-size bitmap-based tracking of data area
to handle the new scheme

The Radix tree will be used to keep the block's index(0 ~
1G/DATA_BLOCK_SIZE) and physical page mapping relations. Each leaf is
one data block(the size is DATA_BLOCK_SIZE).

For non-leaf nodes, use the radix tags[0][SLOTs] to indicate wether
slot[SLOTs]'s branch has free(reused the old one or NULL leafs) block
leafs or not.
This could speed the search of the free blocks in data area.


Yes I think radix tree is a good place to start.


Okay.

Thanks,

BRs
Xiubo



Regards -- Andy

Re: [bug report] scsi: aacraid: Added support for hotplug

2017-02-13 Thread Dan Carpenter

On Mon, Feb 13, 2017 at 07:39:15PM +, Raghava Aditya Renukunta wrote:
> Hi Don,
> 
> > -Original Message-
> > From: Dan Carpenter [mailto:dan.carpen...@oracle.com]
> > Sent: Monday, February 13, 2017 10:47 AM
> > To: Raghava Aditya Renukunta
> > 
> > Cc: linux-scsi@vger.kernel.org
> > Subject: [bug report] scsi: aacraid: Added support for hotplug
> > 
> > EXTERNAL EMAIL
> > 
> > 
> > Hello Raghava Aditya Renukunta,
> > 
> > The patch 6223a39fe6fb: "scsi: aacraid: Added support for hotplug"
> > from Feb 2, 2017, leads to the following static checker warning:
> > 
> > drivers/scsi/aacraid/commsup.c:2243 aac_process_events()
> > error: double unlock 'spin_lock:t_lock'
> > 
> > drivers/scsi/aacraid/commsup.c
> >   2130  spin_lock_irqsave(t_lock, flags);
> > 
> >   2131
> >   2132  while (!list_empty(&(dev->queues-
> > >queue[HostNormCmdQueue].cmdq))) {
> >   2133  struct list_head *entry;
> >   2134  struct aac_aifcmd *aifcmd;
> >   2135  unsigned int  num;
> >   2136  struct hw_fib **hw_fib_pool, **hw_fib_p;
> >   2137  struct fib **fib_pool, **fib_p;
> >   2138
> >   2139  set_current_state(TASK_RUNNING);
> >   2140
> >   2141  entry = dev->queues-
> > >queue[HostNormCmdQueue].cmdq.next;
> >   2142  list_del(entry);
> >   2143
> >   2144  t_lock = dev->queues->queue[HostNormCmdQueue].lock;
> >   2145  spin_unlock_irqrestore(t_lock, flags);
> > ^
> >   2146
> >   2147  fib = list_entry(entry, struct fib, fiblink);
> >   2148  hw_fib = fib->hw_fib_va;
> >   2149  if (dev->sa_firmware) {
> >   2150  /* Thor AIF */
> >   2151  aac_handle_sa_aif(dev, fib);
> >   2152  aac_fib_adapter_complete(fib, 
> > (u16)sizeof(u32));
> >   2153  continue;
> > 
> > The locking isn't right here.  We should re-take the spinlock before
> > continuing.
> 
> The intention here is to protect the retrieval of entry from the queues.
> Or do you mean that we should just protect the whole while loop with one 
> spin_lock (t_lock)?
> 

This is a static checker warning that says we call
spin_unlock_irqrestore(t_lock, flags); at the end of the loop but
sometimes we're not holding the lock.

This is a Smatch warning and it doesn't handle loops correctly.  It
should also warn that on line 2145 we might not be holding the lock
either but it misses that bug.

There is no way this continue is correct with regards to locking.

regards,
dan carpenter

> Thank you,
> Raghava Aditya
> 
> >   2154  }
> >   2155  /*
> >   2156   *  We will process the FIB here or pass it to a
> >   2157   *  worker thread that is TBD. We Really can't
> >   2158   *  do anything at this point since we don't 
> > have
> >   2159   *  anything defined for this thread to do.
> >   2160   */
> > 
> > [ snip ]
> > 
> >   2221  free_mem:
> >     /* Free up the remaining resources */
> >   2223  hw_fib_p = hw_fib_pool;
> >   2224  fib_p = fib_pool;
> >   2225  while (hw_fib_p < _fib_pool[num]) {
> >   2226  kfree(*hw_fib_p);
> >   2227  kfree(*fib_p);
> >   2228  ++fib_p;
> >   2229  ++hw_fib_p;
> >   2230  }
> >   2231  kfree(fib_pool);
> >   2232  free_hw_fib_pool:
> >   2233  kfree(hw_fib_pool);
> >   2234  free_fib:
> >   2235  kfree(fib);
> >   2236  t_lock = dev->queues->queue[HostNormCmdQueue].lock;
> >   2237  spin_lock_irqsave(t_lock, flags);
> >   2238  }
> >   2239  /*
> >   2240   *  There are no more AIF's
> >   2241   */
> >   2242  t_lock = dev->queues->queue[HostNormCmdQueue].lock;
> >   2243  spin_unlock_irqrestore(t_lock, flags);
> > 
> > Otherwise it is a double unlock bug.
> > 
> >   2244  }
> > 
> > 
> > regards,
> > dan carpenter

[PATCH] return valid data buffer length in scsi_bufflen() API using RQF_SPECIAL_PAYLOAD

2017-02-13 Thread Kashyap Desai

Regression due to  commit f9d03f96b988002027d4b28ea1b7a24729a4c9b5
block: improve handling of the magic discard payload

 and  HBA FW encounter FW fault in DMA operation
while creating File system on SSDs.
Below CDB cause FW fault.
CDB: Write same(16) 93 08 00 00 00 00 00 00 00 00 00 00 80 00 00 00

Root cause is SCSI buffer length and DMA buffer length miss match for
WRITE SAME command.

Fix - return valid data buffer length in scsi_bufflen() API using
RQF_SPECIAL_PAYLOAD

Signed-off-by: Kashyap Desai 
---
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index 9fc1aec..1f796fc 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -180,7 +180,8 @@ static inline struct scatterlist *scsi_sglist(struct
scsi_cmnd *cmd)

 static inline unsigned scsi_bufflen(struct scsi_cmnd *cmd)
 {
-   return cmd->sdb.length;
+   return (cmd->request->rq_flags & RQF_SPECIAL_PAYLOAD) ?
+   cmd->request->special_vec.bv_len :
cmd->sdb.length;
 }

 static inline void scsi_set_resid(struct scsi_cmnd *cmd, int resid)

Re: [PATCH v3 10/16] lpfc: NVME Target: Base modifications

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> NVME Target: Base modifications
> 
> This set of patches adds the base modifications for NVME target support
> 
> The base modifications consist of:
> - Additional module parameters or configuration tuning
> - Enablement of configuration mode for NVME target. Ties into the
>   queueing model put into place by the initiator basemods patches.
> - Target-specific buffer pools, dma pools, sgl pools
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 11/16] lpfc: NVME Target: Receive buffer updates

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> NVME Target: Receive buffer updates
> 
> Allocates buffer pools and configures adapter interfaces to handle
> receive buffer (asynchronous FCP CMD ius, first burst data)
> from the adapter. Splits by protocol, etc.
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 

I probably do _not_ want to know why 32G FC-NVMe needs to know about
FCFs ...

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 03/16] lpfc: minor code cleanups

2017-02-13 Thread Johannes Thumshirn

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> This contains code cleanups that were in the prior patch set.
> This allows better review of real changes later.
> 
> minor code cleanups:
>  fix indentation, punctuation, line length
>  addition/reduction of whitespace
>  remove unneeded parens, braces
>  lpfc_debugfs_nodelist_data: print as u64 rather than byte by byte
>  covert printk(KERN_ERR to pr_err
>  small print string deltas
>  use num_present_cpus() rather than count them
>  comment updates
>  rctl/type names moved to module variable, not on stack
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
> ---

Reviewed-by: Johannes Thumshirn 


-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Re: [PATCH v3 05/16] lpfc: refactor debugfs queue dump routines

2017-02-13 Thread Johannes Thumshirn

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> Create common wq, cq, eq, rq dump functions
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
> ---

Looks good,
Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Re: [PATCH v3 15/16] lpfc: Update copyrights

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> Update copyrights to 2017 for all files touched in this patch set
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 04/16] lpfc: refactor debugfs queue prints

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> Create common wq, cq, eq, rq print functions
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
> ---
>  drivers/scsi/lpfc/lpfc.h |   1 +
>  drivers/scsi/lpfc/lpfc_debugfs.c | 564 
> ---
>  2 files changed, 228 insertions(+), 337 deletions(-)
> 
One should look at converting it to seq_file(), seeing that there are
provisions for overflow already.
But that can be done in a later patch.

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 05/16] lpfc: refactor debugfs queue dump routines

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> Create common wq, cq, eq, rq dump functions
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 16/16] lpfc: Update lpfc version to 11.2.0.7

2017-02-13 Thread Johannes Thumshirn

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> Update lpfc version to 11.2.0.7
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
> ---

Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Re: [PATCH v3 15/16] lpfc: Update copyrights

2017-02-13 Thread Johannes Thumshirn

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> Update copyrights to 2017 for all files touched in this patch set
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
> ---

Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Re: [PATCH v3 04/16] lpfc: refactor debugfs queue prints

2017-02-13 Thread Johannes Thumshirn

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> Create common wq, cq, eq, rq print functions
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
> ---

Looks good,
Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn  Storage
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Re: [PATCH v3 02/16] lpfc: use pci_irq_alloc_vectors and pci_irq_free_vectors

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> From: Christoph Hellwig 
> 
> This avoids having to store the msix_entries array and simpliefies the
> shutdown and cleanup path a lot.
> 
> Signed-off-by: Christoph Hellwig 
> Signed-off-by: James Smart 
> Reviewed-by: Johannes Thumshirn 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 03/16] lpfc: minor code cleanups

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> This contains code cleanups that were in the prior patch set.
> This allows better review of real changes later.
> 
> minor code cleanups:
>  fix indentation, punctuation, line length
>  addition/reduction of whitespace
>  remove unneeded parens, braces
>  lpfc_debugfs_nodelist_data: print as u64 rather than byte by byte
>  covert printk(KERN_ERR to pr_err
>  small print string deltas
>  use num_present_cpus() rather than count them
>  comment updates
>  rctl/type names moved to module variable, not on stack
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 07/16] lpfc: NVME Initiator: Merge into FC discovery

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> NVME Initiator: Merge into FC discovery
> 
> Adds NVME PRLI support and Nameserver registrations and Queries for NVME
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
> ---
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 06/16] lpfc: NVME Initiator: Base modifications

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> NVME Initiator: Base modifications
> 
> This patch adds base modifications for NVME initiator support.
> 
> The base modifications consist of:
> - Formal split of SLI3 rings from SLI-4 WQs (sometimes referred to as
>   rings as well) as implementation now widely varies between the two.
> - Addition of configuration modes:
>SCSI initiator only; NVME initiator only; NVME target only; and
>SCSI and NVME initiator.
>The configuration mode drives overall adapter configuration,
>offloads enabled, and resource splits.
>NVME support is only available on SLI-4 devices and newer fw.
> - Implements the following based on configuration mode:
>   - Exchange resources are split by protocol; Obviously, if only
>  1 mode, then no split occurs. Default is 50/50. module attribute
>  allows tuning.
>   - Pools and config parameters are separated per-protocol
>   - Each protocol has it's own set of queues, but share interrupt
> vectors.
>  SCSI:
>SLI3 devices have few queues and the original style of queue
>  allocation remains.
>SLI4 devices piggy back on an "io-channel" concept that
>  eventually needs to merge with scsi-mq/blk-mq support (it is
>underway).  For now, the paradigm continues as it existed
>prior. io channel allocates N msix and N WQs (N=4 default)
>and either round robins or uses cpu # modulo N for scheduling.
>A bunch of module parameters allow the configuration to be
>tuned.
>  NVME (initiator):
>Allocates an msix per cpu (or whatever pci_alloc_irq_vectors
>  gets)
>Allocates a WQ per cpu, and maps the WQs to msix on a WQ #
>  modulo msix vector count basis.
>Module parameters exist to cap/control the config if desired.
>   - Each protocol has its own buffer and dma pools.
> 
> I apologize for the size of the patch.
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
> 
Apologies accepted :-)

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 08/16] lpfc: NVME Initiator: bind to nvme_fc api

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> NVME Initiator: Tie in to NVME Fabrics nvme_fc LLDD initiator api
> 
> Adds the routines to:
> - register and deregister the FC port as a nvme-fc initiator localport
> - register and deregister remote FC ports as a nvme-fc remoteport
> - binding of nvme queues to adapter WQs
> - send/perform NVME LS's
> - send/perform NVME FCP initiator io operations
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 09/16] lpfc: NVME Initiator: Add debugfs support

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> NVME Initiator: Add debugfs support
> 
> Adds debugfs snippets to cover the new NVME initiator functionality
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 13/16] lpfc: NVME Target: bind to nvmet_fc api

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> NVME Target: Tie in to NVME Fabrics nvmet_fc LLDD target api
> 
> Adds the routines to:
> - register and deregister the FC port as a nvmet-fc targetport
> - binding of nvme queues to adapter WQs
> - receipt and passing of NVME LS's to transport, sending transport response
> - receipt of NVME FCP CMD IUs, processing FCP target io data transmission
>   commands; transmission of FCP io response
> - Abort operations for tgt io exchanges
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 14/16] lpfc: NVME Target: Add debugfs support

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> NVME Target: Add debugfs support
> 
> Adds debugfs snippets to cover the new NVME target functionality
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 12/16] lpfc: NVME Target: Merge into FC discovery

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> NVME Target: Merge into FC discovery
> 
> Adds NVME PRLI handling and Nameserver registrations for NVME
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH v3 16/16] lpfc: Update lpfc version to 11.2.0.7

2017-02-13 Thread Hannes Reinecke

On 02/12/2017 10:52 PM, James Smart wrote:
> 
> Update lpfc version to 11.2.0.7
> 
> Signed-off-by: Dick Kennedy 
> Signed-off-by: James Smart 
> ---
>  drivers/scsi/lpfc/lpfc_version.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/lpfc/lpfc_version.h 
> b/drivers/scsi/lpfc/lpfc_version.h
> index 0d93535..86c6c9b 100644
> --- a/drivers/scsi/lpfc/lpfc_version.h
> +++ b/drivers/scsi/lpfc/lpfc_version.h
> @@ -20,7 +20,7 @@
>   * included with this package. *
>   ***/
>  
> -#define LPFC_DRIVER_VERSION "11.2.0.4"
> +#define LPFC_DRIVER_VERSION "11.2.0.7"
>  #define LPFC_DRIVER_NAME "lpfc"
>  
>  /* Used for SLI 2/3 */
> 
Hmm. One would have thought that a major rework like this would warrant
an increased major number.
But that's probably just me being totally ignorant of any company
policies :-)

Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Re: [PATCH RFC] target/user: Add double ring buffers support.

2017-02-13 Thread Andy Grover


On 02/13/2017 01:50 AM, Xiubo Li wrote:

There is one new scheme in my mind:



Yes I think it's now clear we need more buffer space to avoid
bottlenecks for high iops. The initial design kept it simple with the
1MB vmalloc'd space but anticipated greater would be needed. It should
not be necessary to change userspace or the TCMU ABI to handle growing
the buffer for fast devices:

1. increase the region mmap()ed by userspace, TCMU_RING_SIZE, from 1MB
to 1GB or larger

For the cmd area, set the size to fixed 512M, and data area's size to
fixed 1G, is that okay ?


512M seems a little big for the cmd area... use of the cmd ring should 
be much smaller than the data area. Each cmd has a minimum size, but 
then to describe an additional page in the iovec[] should be just one 
struct iovec (16 bytes?) for each 4K page.



3. Upgrade the current fixed-size bitmap-based tracking of data area
to handle the new scheme

The Radix tree will be used to keep the block's index(0 ~
1G/DATA_BLOCK_SIZE) and physical page mapping relations. Each leaf is
one data block(the size is DATA_BLOCK_SIZE).

For non-leaf nodes, use the radix tags[0][SLOTs] to indicate wether
slot[SLOTs]'s branch has free(reused the old one or NULL leafs) block
leafs or not.
This could speed the search of the free blocks in data area.


Yes I think radix tree is a good place to start.

Regards -- Andy

Re: [PATCH RFC] target/user: Add double ring buffers support.

2017-02-13 Thread Andy Grover


On 02/12/2017 05:38 PM, Xiubo Li wrote:

On 2017年02月11日 02:02, Andy Grover wrote:

1. increase the region mmap()ed by userspace, TCMU_RING_SIZE, from 1MB
to 1GB or larger
2. Don't vmalloc() the whole thing, instead vmalloc for the cmd ring
portion, and dynamically alloc pages for the data area as needed and
map them into the data area.

Should the data area mapping is dynamical or add one new API to
userspace to trigger it ?


I would recommend having the kernel make the decision to allocate/map 
more memory, or not (i.e. wait for existing I/O to complete and free 
pages) because this avoids the need for a new API, and also the kernel 
may better decide how to allocate total pages used across multiple TCMU 
devices.



4. Implement an algorithm to keep allocated pages mapped into the data
area for reuse, and maybe a heuristic to keep extreme burstiness from
over-allocating pages


Does a little like slab ?


It's definitely becoming less of a storage issue and more of a 
page-management issue. It might make sense to look at mm code and ask mm 
experts for their advice, yep.



This should allow TCMU to allocate more data area as needed, not waste
memory for slower devices, and avoid userspace ABI changes. Could we
prototype this approach and see if it is workable?


I will do more investigate about this.


Thank you for working to improve this area!

Regards -- Andy

Re: [PATCH RFC] target/user: Add double ring buffers support.

2017-02-13 Thread Xiubo Li


Hi Andy,

There is one new scheme in my mind:



Yes I think it's now clear we need more buffer space to avoid 
bottlenecks for high iops. The initial design kept it simple with the 
1MB vmalloc'd space but anticipated greater would be needed. It should 
not be necessary to change userspace or the TCMU ABI to handle growing 
the buffer for fast devices:


1. increase the region mmap()ed by userspace, TCMU_RING_SIZE, from 1MB 
to 1GB or larger
For the cmd area, set the size to fixed 512M, and data area's size to 
fixed 1G, is that okay ?


2. Don't vmalloc() the whole thing, instead vmalloc for the cmd ring 
portion, and dynamically alloc pages for the data area as needed and 
map them into the data area.
TCMU will just vmalloc() the 512M cmd area, and let the data area memory 
allocate and map later when needed.
The userspace runner will mmap() all the (512M + 1G) when initialising, 
and will return 1.5G virtual address space. The cmd area will be mapped 
to actual physical addresses here, while the data area will be mapped in 
page fault hook when using



3. Upgrade the current fixed-size bitmap-based tracking of data area 
to handle the new scheme
The Radix tree will be used to keep the block's index(0 ~ 
1G/DATA_BLOCK_SIZE) and physical page mapping relations. Each leaf is 
one data block(the size is DATA_BLOCK_SIZE).


For non-leaf nodes, use the radix tags[0][SLOTs] to indicate wether 
slot[SLOTs]'s branch has free(reused the old one or NULL leafs) block 
leafs or not.

This could speed the search of the free blocks in data area.


4. Implement an algorithm to keep allocated pages mapped into the data 
area for reuse, and maybe a heuristic to keep extreme burstiness from 
over-allocating pages



For leaf nodes:
if one leaf node is exist and its tags[0][SLOTs] = 0 meaning that some 
older cmds have already touched the block leafs and then "freed" it, if 
so, we will reuse it.
if the leaf node is exist and its tag[0][SLOTs] = 1 meaning that this is 
still used.
if the leaf node is non-exist, this is the first time to touch this 
leaf, will allocate memory and then insert it here setting the 
tag[0][SLOTs] to 1.


This should allow TCMU to allocate more data area as needed, not waste 
memory for slower devices, and avoid userspace ABI changes. Could we 
prototype this approach and see if it is workable?



For slower devices, it will save memory.
Could also avoid changing the userspace ABI.

Thanks,

BRs
Xiubo




Thanks -- Regards -- Andy

RE: [bug report] scsi: aacraid: Added support for hotplug

2017-02-13 Thread Raghava Aditya Renukunta

Hi Don,

> -Original Message-
> From: Dan Carpenter [mailto:dan.carpen...@oracle.com]
> Sent: Monday, February 13, 2017 10:47 AM
> To: Raghava Aditya Renukunta
> 
> Cc: linux-scsi@vger.kernel.org
> Subject: [bug report] scsi: aacraid: Added support for hotplug
> 
> EXTERNAL EMAIL
> 
> 
> Hello Raghava Aditya Renukunta,
> 
> The patch 6223a39fe6fb: "scsi: aacraid: Added support for hotplug"
> from Feb 2, 2017, leads to the following static checker warning:
> 
> drivers/scsi/aacraid/commsup.c:2243 aac_process_events()
> error: double unlock 'spin_lock:t_lock'
> 
> drivers/scsi/aacraid/commsup.c
>   2130  spin_lock_irqsave(t_lock, flags);
> 
>   2131
>   2132  while (!list_empty(&(dev->queues-
> >queue[HostNormCmdQueue].cmdq))) {
>   2133  struct list_head *entry;
>   2134  struct aac_aifcmd *aifcmd;
>   2135  unsigned int  num;
>   2136  struct hw_fib **hw_fib_pool, **hw_fib_p;
>   2137  struct fib **fib_pool, **fib_p;
>   2138
>   2139  set_current_state(TASK_RUNNING);
>   2140
>   2141  entry = dev->queues-
> >queue[HostNormCmdQueue].cmdq.next;
>   2142  list_del(entry);
>   2143
>   2144  t_lock = dev->queues->queue[HostNormCmdQueue].lock;
>   2145  spin_unlock_irqrestore(t_lock, flags);
> ^
>   2146
>   2147  fib = list_entry(entry, struct fib, fiblink);
>   2148  hw_fib = fib->hw_fib_va;
>   2149  if (dev->sa_firmware) {
>   2150  /* Thor AIF */
>   2151  aac_handle_sa_aif(dev, fib);
>   2152  aac_fib_adapter_complete(fib, 
> (u16)sizeof(u32));
>   2153  continue;
> 
> The locking isn't right here.  We should re-take the spinlock before
> continuing.

The intention here is to protect the retrieval of entry from the queues.
Or do you mean that we should just protect the whole while loop with one 
spin_lock (t_lock)?

Thank you,
Raghava Aditya

>   2154  }
>   2155  /*
>   2156   *  We will process the FIB here or pass it to a
>   2157   *  worker thread that is TBD. We Really can't
>   2158   *  do anything at this point since we don't have
>   2159   *  anything defined for this thread to do.
>   2160   */
> 
> [ snip ]
> 
>   2221  free_mem:
>     /* Free up the remaining resources */
>   2223  hw_fib_p = hw_fib_pool;
>   2224  fib_p = fib_pool;
>   2225  while (hw_fib_p < _fib_pool[num]) {
>   2226  kfree(*hw_fib_p);
>   2227  kfree(*fib_p);
>   2228  ++fib_p;
>   2229  ++hw_fib_p;
>   2230  }
>   2231  kfree(fib_pool);
>   2232  free_hw_fib_pool:
>   2233  kfree(hw_fib_pool);
>   2234  free_fib:
>   2235  kfree(fib);
>   2236  t_lock = dev->queues->queue[HostNormCmdQueue].lock;
>   2237  spin_lock_irqsave(t_lock, flags);
>   2238  }
>   2239  /*
>   2240   *  There are no more AIF's
>   2241   */
>   2242  t_lock = dev->queues->queue[HostNormCmdQueue].lock;
>   2243  spin_unlock_irqrestore(t_lock, flags);
> 
> Otherwise it is a double unlock bug.
> 
>   2244  }
> 
> 
> regards,
> dan carpenter

[bug report] scsi: aacraid: Include HBA direct interface

2017-02-13 Thread Dan Carpenter

Hello Raghava Aditya Renukunta,

The patch 423400e64d37: "scsi: aacraid: Include HBA direct interface"
from Feb 2, 2017, leads to the following static checker warning:

drivers/scsi/aacraid/commsup.c:762 aac_hba_send()
error: double unlock 'sem:>event_wait'

drivers/scsi/aacraid/commsup.c
   757  if (wait) {
   758  spin_unlock_irqrestore(>event_lock, flags);
   759  /* Only set for first known interruptable command */

This comment just confuses me...  What is a "known interruptable command?"

   760  if (down_interruptible(>event_wait)) {

down_interruptible() return -EINTR on failure so this test looks
reversed.  It should probably be:

if (!down_interruptible(>event_wait)) {


   761  fibptr->done = 2;
   762  up(>event_wait);
   763  }
   764  spin_lock_irqsave(>event_lock, flags);
   765  if ((fibptr->done == 0) || (fibptr->done == 2)) {
   766  fibptr->done = 2; /* Tell interrupt we aborted 
*/
   767  spin_unlock_irqrestore(>event_lock, 
flags);
   768  return -ERESTARTSYS;
   769  }
   770  spin_unlock_irqrestore(>event_lock, flags);
   771  WARN_ON(fibptr->done == 0);
   772  
   773  if (unlikely(fibptr->flags & 
FIB_CONTEXT_FLAG_TIMED_OUT))
   774  return -ETIMEDOUT;
   775  
   776  return 0;
   777  }

regards,
dan carpenter

Re: [PATCH 1/1] mpt3sas: Ignore unaligned completion length for ZBC_IN

2017-02-13 Thread Bart Van Assche

On Mon, 2017-02-13 at 14:11 +0900, Damien Le Moal wrote:
> The ZBC_IN command (REPORT ZONES) reply length is always a multiple of
> 64B and thus may not be aligned on the device LBA size.
> For this command, retry due to the unaligned completion length is
> incorrect so do not check alignment of the reply length.
> 
> Signed-off-by: Damien Le Moal 
> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
> b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 0b5b423..f04b872 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -4723,8 +4723,9 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, 
> u8 msix_index, u32 reply)
>* then scsi-ml does not need to handle this misbehavior.
>*/
>   sector_sz = scmd->device->sector_size;
> - if (unlikely(scmd->request->cmd_type == REQ_TYPE_FS && sector_sz &&
> -  xfer_cnt % sector_sz)) {
> + if (scmd->request->cmd_type == REQ_TYPE_FS &&
> + scmd->cmnd[0] != ZBC_IN &&
> + sector_sz && xfer_cnt % sector_sz) {
>   sdev_printk(KERN_INFO, scmd->device,
>   "unaligned partial completion avoided (xfer_cnt=%u, 
> sector_sz=%u)\n",
>   xfer_cnt, sector_sz);

Hello Damien,

What software generated a ZBC_IN request with type REQ_TYPE_FS? Shouldn't
all ZBC_IN requests be assigned type REQ_TYPE_BLOCK_PC?

Thanks,

Bart.

[bug report] scsi: aacraid: Added support for hotplug

2017-02-13 Thread Dan Carpenter

Hello Raghava Aditya Renukunta,

The patch 6223a39fe6fb: "scsi: aacraid: Added support for hotplug"
from Feb 2, 2017, leads to the following static checker warning:

drivers/scsi/aacraid/commsup.c:2243 aac_process_events()
error: double unlock 'spin_lock:t_lock'

drivers/scsi/aacraid/commsup.c
  2130  spin_lock_irqsave(t_lock, flags);

  2131  
  2132  while 
(!list_empty(&(dev->queues->queue[HostNormCmdQueue].cmdq))) {
  2133  struct list_head *entry;
  2134  struct aac_aifcmd *aifcmd;
  2135  unsigned int  num;
  2136  struct hw_fib **hw_fib_pool, **hw_fib_p;
  2137  struct fib **fib_pool, **fib_p;
  2138  
  2139  set_current_state(TASK_RUNNING);
  2140  
  2141  entry = dev->queues->queue[HostNormCmdQueue].cmdq.next;
  2142  list_del(entry);
  2143  
  2144  t_lock = dev->queues->queue[HostNormCmdQueue].lock;
  2145  spin_unlock_irqrestore(t_lock, flags);
^
  2146  
  2147  fib = list_entry(entry, struct fib, fiblink);
  2148  hw_fib = fib->hw_fib_va;
  2149  if (dev->sa_firmware) {
  2150  /* Thor AIF */
  2151  aac_handle_sa_aif(dev, fib);
  2152  aac_fib_adapter_complete(fib, (u16)sizeof(u32));
  2153  continue;

The locking isn't right here.  We should re-take the spinlock before
continuing.

  2154  }
  2155  /*
  2156   *  We will process the FIB here or pass it to a
  2157   *  worker thread that is TBD. We Really can't
  2158   *  do anything at this point since we don't have
  2159   *  anything defined for this thread to do.
  2160   */

[ snip ]

  2221  free_mem:
    /* Free up the remaining resources */
  2223  hw_fib_p = hw_fib_pool;
  2224  fib_p = fib_pool;
  2225  while (hw_fib_p < _fib_pool[num]) {
  2226  kfree(*hw_fib_p);
  2227  kfree(*fib_p);
  2228  ++fib_p;
  2229  ++hw_fib_p;
  2230  }
  2231  kfree(fib_pool);
  2232  free_hw_fib_pool:
  2233  kfree(hw_fib_pool);
  2234  free_fib:
  2235  kfree(fib);
  2236  t_lock = dev->queues->queue[HostNormCmdQueue].lock;
  2237  spin_lock_irqsave(t_lock, flags);
  2238  }
  2239  /*
  2240   *  There are no more AIF's
  2241   */
  2242  t_lock = dev->queues->queue[HostNormCmdQueue].lock;
  2243  spin_unlock_irqrestore(t_lock, flags);

Otherwise it is a double unlock bug.

  2244  }


regards,
dan carpenter

[PATCH V4 net-next 1/2] qed: Add support for hardware offloaded FCoE.

2017-02-13 Thread Dupuis, Chad

From: Arun Easi 

This adds the backbone required for the various HW initalizations
which are necessary for the FCoE driver (qedf) for QLogic FastLinQ
4 line of adapters - FW notification, resource initializations, etc.

Signed-off-by: Arun Easi 
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/Kconfig   |3 +
 drivers/net/ethernet/qlogic/qed/Makefile  |1 +
 drivers/net/ethernet/qlogic/qed/qed.h |   11 +
 drivers/net/ethernet/qlogic/qed/qed_cxt.c |   98 +-
 drivers/net/ethernet/qlogic/qed/qed_cxt.h |3 +
 drivers/net/ethernet/qlogic/qed/qed_dcbx.c|   13 +-
 drivers/net/ethernet/qlogic/qed/qed_dcbx.h|5 +-
 drivers/net/ethernet/qlogic/qed/qed_dev.c |  205 -
 drivers/net/ethernet/qlogic/qed/qed_dev_api.h |   42 +
 drivers/net/ethernet/qlogic/qed/qed_fcoe.c| 1014 +
 drivers/net/ethernet/qlogic/qed/qed_fcoe.h|   76 ++
 drivers/net/ethernet/qlogic/qed/qed_hsi.h |  781 +++-
 drivers/net/ethernet/qlogic/qed/qed_hw.c  |3 +
 drivers/net/ethernet/qlogic/qed/qed_ll2.c |   25 +
 drivers/net/ethernet/qlogic/qed/qed_ll2.h |2 +-
 drivers/net/ethernet/qlogic/qed/qed_main.c|7 +
 drivers/net/ethernet/qlogic/qed/qed_mcp.c |3 +
 drivers/net/ethernet/qlogic/qed/qed_mcp.h |1 +
 drivers/net/ethernet/qlogic/qed/qed_reg_addr.h|8 +
 drivers/net/ethernet/qlogic/qed/qed_sp.h  |4 +
 drivers/net/ethernet/qlogic/qed/qed_sp_commands.c |3 +
 include/linux/qed/common_hsi.h|   10 +-
 include/linux/qed/fcoe_common.h   |  715 +++
 include/linux/qed/qed_fcoe_if.h   |  145 +++
 include/linux/qed/qed_if.h|   41 +-
 25 files changed, 3200 insertions(+), 19 deletions(-)
 create mode 100644 drivers/net/ethernet/qlogic/qed/qed_fcoe.c
 create mode 100644 drivers/net/ethernet/qlogic/qed/qed_fcoe.h
 create mode 100644 include/linux/qed/fcoe_common.h
 create mode 100644 include/linux/qed/qed_fcoe_if.h

diff --git a/drivers/net/ethernet/qlogic/Kconfig 
b/drivers/net/ethernet/qlogic/Kconfig
index 3cfd105..737b303 100644
--- a/drivers/net/ethernet/qlogic/Kconfig
+++ b/drivers/net/ethernet/qlogic/Kconfig
@@ -113,4 +113,7 @@ config QED_RDMA
 config QED_ISCSI
bool
 
+config QED_FCOE
+   bool
+
 endif # NET_VENDOR_QLOGIC
diff --git a/drivers/net/ethernet/qlogic/qed/Makefile 
b/drivers/net/ethernet/qlogic/qed/Makefile
index 729e437..e234083 100644
--- a/drivers/net/ethernet/qlogic/qed/Makefile
+++ b/drivers/net/ethernet/qlogic/qed/Makefile
@@ -7,3 +7,4 @@ qed-$(CONFIG_QED_SRIOV) += qed_sriov.o qed_vf.o
 qed-$(CONFIG_QED_LL2) += qed_ll2.o
 qed-$(CONFIG_QED_RDMA) += qed_roce.o
 qed-$(CONFIG_QED_ISCSI) += qed_iscsi.o qed_ooo.o
+qed-$(CONFIG_QED_FCOE) += qed_fcoe.o
diff --git a/drivers/net/ethernet/qlogic/qed/qed.h 
b/drivers/net/ethernet/qlogic/qed/qed.h
index 1f61cf3..08f2885 100644
--- a/drivers/net/ethernet/qlogic/qed/qed.h
+++ b/drivers/net/ethernet/qlogic/qed/qed.h
@@ -60,6 +60,7 @@
 #define QED_WFQ_UNIT   100
 
 #define ISCSI_BDQ_ID(_port_id) (_port_id)
+#define FCOE_BDQ_ID(_port_id) ((_port_id) + 2)
 #define QED_WID_SIZE(1024)
 #define QED_PF_DEMS_SIZE(4)
 
@@ -167,6 +168,7 @@ struct qed_tunn_update_params {
  */
 enum qed_pci_personality {
QED_PCI_ETH,
+   QED_PCI_FCOE,
QED_PCI_ISCSI,
QED_PCI_ETH_ROCE,
QED_PCI_DEFAULT /* default in shmem */
@@ -204,6 +206,7 @@ enum QED_FEATURE {
QED_VF,
QED_RDMA_CNQ,
QED_VF_L2_QUE,
+   QED_FCOE_CQ,
QED_MAX_FEATURES,
 };
 
@@ -221,6 +224,7 @@ enum QED_PORT_MODE {
 
 enum qed_dev_cap {
QED_DEV_CAP_ETH,
+   QED_DEV_CAP_FCOE,
QED_DEV_CAP_ISCSI,
QED_DEV_CAP_ROCE,
 };
@@ -255,6 +259,10 @@ struct qed_hw_info {
u32 part_num[4];
 
unsigned char   hw_mac_addr[ETH_ALEN];
+   u64 node_wwn;
+   u64 port_wwn;
+
+   u16 num_fcoe_conns;
 
struct qed_igu_info *p_igu_info;
 
@@ -410,6 +418,7 @@ struct qed_hwfn {
struct qed_ooo_info *p_ooo_info;
struct qed_rdma_info*p_rdma_info;
struct qed_iscsi_info   *p_iscsi_info;
+   struct qed_fcoe_info*p_fcoe_info;
struct qed_pf_paramspf_params;
 
bool b_rdma_enabled_in_prs;
@@ -618,11 +627,13 @@ struct qed_dev {
 
u8  protocol;
 #define IS_QED_ETH_IF(cdev) ((cdev)->protocol == QED_PROTOCOL_ETH)
+#define IS_QED_FCOE_IF(cdev)((cdev)->protocol == QED_PROTOCOL_FCOE)
 
/* Callbacks to protocol driver */
union {
struct qed_common_cb_ops*common;

[PATCH V4 0/2] Add QLogic FastLinQ FCoE (qedf) driver

2017-02-13 Thread Dupuis, Chad

From: "Dupuis, Chad" 

Dave, please apply the qed patch to net-next at your earliest convenience.
Martin, the qed patch needs to be applied first as the qedf patch is dependent
on the FCoE bits in the first qed driver patch.

This series introduces the hardware offload FCoE initiator driver for the
41000 Series Converged Network Adapters (579xx chip) by Cavium. The overall
driver design includes a common module ('qed') and protocol specific
dependent modules ('qedf' for FCoE).

This driver uses the kernel components of libfc and libfcoe as is and does not
make use of the open-fcoe user space components.  Therefore, no changes will 
need to be
made to any open-fcoe components.

The 'qed' common module, under drivers/net/ethernet/qlogic/qed/, is
enhanced with functionality required for FCoE support.

Changes from V3 -> V4

- Minor update to banner text in qed_fcoe.c|h files
- Fix kbuild robot error on 32-bit systems in qedf
- Remove unneeded double memcpy for offloaded ELS commands in qedf

Changes from V2 -> V3

- Fix uninitialized variables reported by kbuild robot in qedf
- Remove superfluous comments from qedf.h
- Introduce new qedf_ctx flag to different stopping I/O for debug purposes.
- Don't take lport->disc.disc_mutex when restarting an rport.
- Remove extra whitespace in qedf_hsi.h

Changes from V1 -> V2

Changes in qed:
- Fix compiler warning when CONFIG_DCB is not set.

Fixes in qedf:
- Add qedf to scsi directory Makefile.
- Updates to convert LightL2 and I/O processing kthreads to workqueues.

Changes from RFC -> V1

- Squash qedf patches to one patch now that the initial review has taken place
- Convert qedf to use hotplug state machine
- Return via va_end to match corresponding va_start in logging functions
- Convert qedf_ctx offloaded port list to a RCU list so searches do not need
  to make use of spinlocks.  Also eliminates the need to fcport conn_id's.
- Use IS_ERR(fp) in qedf_flogi_resp() instead of checking individual FC_EX_* 
errors.
- Remove scsi_block_target when executing TMF request.
- Checkpatch fixes in the qed and qedf patches

Arun Easi (1):
  qed: Add support for hardware offloaded FCoE.

Dupuis, Chad (1):
  qedf: Add QLogic FastLinQ offload FCoE driver framework.

 MAINTAINERS   |6 +
 drivers/net/ethernet/qlogic/Kconfig   |3 +
 drivers/net/ethernet/qlogic/qed/Makefile  |1 +
 drivers/net/ethernet/qlogic/qed/qed.h |   11 +
 drivers/net/ethernet/qlogic/qed/qed_cxt.c |   98 +-
 drivers/net/ethernet/qlogic/qed/qed_cxt.h |3 +
 drivers/net/ethernet/qlogic/qed/qed_dcbx.c|   13 +-
 drivers/net/ethernet/qlogic/qed/qed_dcbx.h|5 +-
 drivers/net/ethernet/qlogic/qed/qed_dev.c |  205 +-
 drivers/net/ethernet/qlogic/qed/qed_dev_api.h |   42 +
 drivers/net/ethernet/qlogic/qed/qed_fcoe.c| 1014 +++
 drivers/net/ethernet/qlogic/qed/qed_fcoe.h|   76 +
 drivers/net/ethernet/qlogic/qed/qed_hsi.h |  781 -
 drivers/net/ethernet/qlogic/qed/qed_hw.c  |3 +
 drivers/net/ethernet/qlogic/qed/qed_ll2.c |   25 +
 drivers/net/ethernet/qlogic/qed/qed_ll2.h |2 +-
 drivers/net/ethernet/qlogic/qed/qed_main.c|7 +
 drivers/net/ethernet/qlogic/qed/qed_mcp.c |3 +
 drivers/net/ethernet/qlogic/qed/qed_mcp.h |1 +
 drivers/net/ethernet/qlogic/qed/qed_reg_addr.h|8 +
 drivers/net/ethernet/qlogic/qed/qed_sp.h  |4 +
 drivers/net/ethernet/qlogic/qed/qed_sp_commands.c |3 +
 drivers/scsi/Kconfig  |1 +
 drivers/scsi/Makefile |1 +
 drivers/scsi/qedf/Kconfig |   11 +
 drivers/scsi/qedf/Makefile|5 +
 drivers/scsi/qedf/qedf.h  |  545 
 drivers/scsi/qedf/qedf_attr.c |  165 +
 drivers/scsi/qedf/qedf_dbg.c  |  195 ++
 drivers/scsi/qedf/qedf_dbg.h  |  154 +
 drivers/scsi/qedf/qedf_debugfs.c  |  460 +++
 drivers/scsi/qedf/qedf_els.c  |  949 ++
 drivers/scsi/qedf/qedf_fip.c  |  269 ++
 drivers/scsi/qedf/qedf_hsi.h  |  422 +++
 drivers/scsi/qedf/qedf_io.c   | 2282 ++
 drivers/scsi/qedf/qedf_main.c | 3336 +
 drivers/scsi/qedf/qedf_version.h  |   15 +
 include/linux/qed/common_hsi.h|   10 +-
 include/linux/qed/fcoe_common.h   |  715 +
 include/linux/qed/qed_fcoe_if.h   |  145 +
 include/linux/qed/qed_if.h|   41 +-
 41 files changed, 12016 insertions(+), 19 deletions(-)
 create mode 100644 drivers/net/ethernet/qlogic/qed/qed_fcoe.c
 create mode 100644 drivers/net/ethernet/qlogic/qed/qed_fcoe.h
 create mode 100644

Re: [PATCH v3 00/39] megaraid_sas: Updates for scsi-next

2017-02-13 Thread Martin K. Petersen

> "Shivasharan" == Shivasharan S  
> writes:

Shivasharan> Changes in v3: Patch 21: Fix to pass rctx_g35 pointer to
Shivasharan> set/get_num_sge() Move all v2 changelog descriptions beyond
Shivasharan> actual commit message body

Applied to 4.11/scsi-queue.

Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering

Re: [PATCH 00/10] mpt3sas: full mq support

2017-02-13 Thread Hannes Reinecke

On 02/13/2017 07:15 AM, Sreekanth Reddy wrote:
> On Fri, Feb 10, 2017 at 12:29 PM, Hannes Reinecke  wrote:
>> On 02/10/2017 05:43 AM, Sreekanth Reddy wrote:
>>> On Thu, Feb 9, 2017 at 6:42 PM, Hannes Reinecke  wrote:
 On 02/09/2017 02:03 PM, Sreekanth Reddy wrote:
>> [ .. ]
>
>
> Hannes,
>
> I have created a md raid0 with 4 SAS SSD drives using below command,
> #mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sdg /dev/sdh
> /dev/sdi /dev/sdj
>
> And here is 'mdadm --detail /dev/md0' command output,
> --
> /dev/md0:
> Version : 1.2
>   Creation Time : Thu Feb  9 14:38:47 2017
>  Raid Level : raid0
>  Array Size : 780918784 (744.74 GiB 799.66 GB)
>Raid Devices : 4
>   Total Devices : 4
> Persistence : Superblock is persistent
>
> Update Time : Thu Feb  9 14:38:47 2017
>   State : clean
>  Active Devices : 4
> Working Devices : 4
>  Failed Devices : 0
>   Spare Devices : 0
>
>  Chunk Size : 512K
>
>Name : host_name
>UUID : b63f9da7:b7de9a25:6a46ca00:42214e22
>  Events : 0
>
> Number   Major   Minor   RaidDevice State
>0   8   960  active sync   /dev/sdg
>1   8  1121  active sync   /dev/sdh
>2   8  1442  active sync   /dev/sdj
>3   8  1283  active sync   /dev/sdi
> --
>
> Then I have used below fio profile to run 4K sequence read operations
> with nr_hw_queues=1 driver and with nr_hw_queues=24 driver (as my
> system has two numa node and each with 12 cpus).
> -
> global]
> ioengine=libaio
> group_reporting
> direct=1
> rw=read
> bs=4k
> allow_mounted_write=0
> iodepth=128
> runtime=150s
>
> [job1]
> filename=/dev/md0
> -
>
> Here are the fio results when nr_hw_queues=1 (i.e. single request
> queue) with various number of job counts
> 1JOB 4k read  : io=213268MB, bw=1421.8MB/s, iops=363975, runt=150001msec
> 2JOBs 4k read : io=309605MB, bw=2064.2MB/s, iops=528389, runt=150001msec
> 4JOBs 4k read : io=281001MB, bw=1873.4MB/s, iops=479569, runt=150002msec
> 8JOBs 4k read : io=236297MB, bw=1575.2MB/s, iops=403236, runt=150016msec
>
> Here are the fio results when nr_hw_queues=24 (i.e. multiple request
> queue) with various number of job counts
> 1JOB 4k read   : io=95194MB, bw=649852KB/s, iops=162463, runt=150001msec
> 2JOBs 4k read : io=189343MB, bw=1262.3MB/s, iops=323142, runt=150001msec
> 4JOBs 4k read : io=314832MB, bw=2098.9MB/s, iops=537309, runt=150001msec
> 8JOBs 4k read : io=277015MB, bw=1846.8MB/s, iops=472769, runt=150001msec
>
> Here we can see that on less number of jobs count, single request
> queue (nr_hw_queues=1) is giving more IOPs than multi request
> queues(nr_hw_queues=24).
>
> Can you please share your fio profile, so that I can try same thing on
> my system.
>
 Have you tried with the latest git update from Jens for-4.11/block (or
 for-4.11/next) branch?
>>>
>>> I am using below git repo,
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/mkp/scsi.git/log/?h=4.11/scsi-queue
>>>
>>> Today I will try with Jens for-4.11/block.
>>>
>> By all means, do.
>>
 I've found that using the mq-deadline scheduler has a noticeable
 performance boost.

 The fio job I'm using is essentially the same; you just should make sure
 to specify a 'numjob=' statement in there.
 Otherwise fio will just use a single CPU, which of course leads to
 averse effects in the multiqueue case.
>>>
>>> Yes I am providing 'numjob=' on fio command line as shown below,
>>>
>>> # fio md_fio_profile --numjobs=8 --output=fio_results.txt
>>>
>> Still, it looks as if you'd be using less jobs than you have CPUs.
>> Which means you'll be running into a tag starvation scenario on those
>> CPUs, especially for the small blocksizes.
>> What are the results if you set 'numjobs' to the number of CPUs?
>>
> 
> Hannes,
> 
> Tried on Jens for-4.11/block kernel repo and also set each block PD's
> scheduler as 'mq-deadline', and here is my results for 4K SR on md0
> (raid0 with 4 drives). I have 24 CPUs and so tried even with setting
> numjobs=24.
> 
> fio results when nr_hw_queues=1 (i.e. single request queue) with
> various number of job counts
> 
> 4k read when numjobs=1 : io=215553MB, bw=1437.9MB/s, iops=367874,
> runt=150001msec
>

Re: [PATCH] return valid data buffer length in scsi_bufflen() API using RQF_SPECIAL_PAYLOAD

2017-02-13 Thread Christoph Hellwig

Hi Kashyap,

can you try 4.10-rc8?  It has "scsi: use blk_rq_payload_bytes" which
initializeѕ sdb->length to blk_rq_payload_bytes(), which should fix
your issue.

54 matches

Mail list logo