Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

2019-01-16 Thread John Garry

On 16/01/2019 02:54, Martin K. Petersen wrote:


Hi John,



Hi Martin,


So in this case I think that accessor functions are actually better
because they allow us to print a big fat warning when you twiddle
something you shouldn't post-initialization. So that's something I think
we could--and should--improve.


Sure, this is an alternative, but I would rather make it obvious when
these parameters should be set so that this would not be required.


I would like to have a mechanism in place that warns if you twiddle
things that break assumptions made at host registration time.


Yes, something more robust would be good.

That's not

a scenario the old registration interface was designed to handle.

I am not sure I agree with your assertion that setting these masks in
the struct prior to scsi_add_host() makes this ordering requirement much
more obvious.

It is not like you are passing in a list of parameters and

then receiving a separately instantiated immutable scsi_host object. You
are performing an operation on something you already have and own.

That's why I commented that the current intersection between
compile-time static host template, dynamically discovered
pre-registration scsi_host parameters, and the registered runtime
scsi_host struct is somewhat messy.

Btw. I have no attachment to the prot wrappers whatsoever. The reason
they exist is that the SCSI integrity support was optional. And
therefore we had stub functions so things could be compiled without any
of the integrity fields being present in the various SCSI structs.


I never noticed stubs for setting/getting 
Scsi_host.prot_{capabilities,guard_type}


So I

don't have any problem killing the wrappers except they may actually be
handy for regressions like this one where you could #error if the driver
writer violates the ordering requirement.



We set many Scsi_host parameters without such safeguarding, and I don't 
know what's special about these protection-related members.


Thanks,
John




Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

2019-01-15 Thread Martin K. Petersen


Hi John,

>> So in this case I think that accessor functions are actually better
>> because they allow us to print a big fat warning when you twiddle
>> something you shouldn't post-initialization. So that's something I think
>> we could--and should--improve.
>>
> Sure, this is an alternative, but I would rather make it obvious when
> these parameters should be set so that this would not be required.

I would like to have a mechanism in place that warns if you twiddle
things that break assumptions made at host registration time. That's not
a scenario the old registration interface was designed to handle.

I am not sure I agree with your assertion that setting these masks in
the struct prior to scsi_add_host() makes this ordering requirement much
more obvious. It is not like you are passing in a list of parameters and
then receiving a separately instantiated immutable scsi_host object. You
are performing an operation on something you already have and own.

That's why I commented that the current intersection between
compile-time static host template, dynamically discovered
pre-registration scsi_host parameters, and the registered runtime
scsi_host struct is somewhat messy.

Btw. I have no attachment to the prot wrappers whatsoever. The reason
they exist is that the SCSI integrity support was optional. And
therefore we had stub functions so things could be compiled without any
of the integrity fields being present in the various SCSI structs. So I
don't have any problem killing the wrappers except they may actually be
handy for regressions like this one where you could #error if the driver
writer violates the ordering requirement.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

2019-01-14 Thread John Garry

On 12/01/2019 02:34, Martin K. Petersen wrote:


John,


So how about just drop these APIs and let the user set the shost
protection parameters directly, like other shost parameters,


The protection interfaces here obviously predate the block layer
allocation changes that made this particular issue pop up.


which should make it a bit clearer when these should be set,
i.e. before scsi_add_host()?


In general, I am not so keen on the somewhat messy intersection between
the host parameters and the host template. The static host templates
made lots of sense in the days of Seagate ST01 and fixed hardware
capabilities.  But reality is that most modern controllers have to query
firmware interfaces to figure out what their actual capabilities are.


Hi Martin,

I am not suggested setting the parameters via scsi host template, but 
rather dynamically (as we currently do) but just drop the set helper 
functions, like:


shost->max_channel = 1;
shost->max_cmd_len = 16;

...

if (hisi_hba->prot_mask) {
dev_info(dev, "Registering for DIF/DIX prot_mask=0x%x\n",
 prot_mask);
-scsi_host_set_prot(hisi_hba->shost, prot_mask);
+shost->prot_capabilities = prot_mask;
}

rc = scsi_add_host(shost, dev);
if (rc)
goto err_out_ha;

rc = sas_register_ha(sha);
if (rc)
goto err_out_register_ha;

I find that it is not crystal clear when scsi_host_set_prot() and 
scsi_host_set_guard() should be called, but not so for setting the shost 
parameters directly, which is common.




So in this case I think that accessor functions are actually better
because they allow us to print a big fat warning when you twiddle
something you shouldn't post-initialization. So that's something I think
we could--and should--improve.



Sure, this is an alternative, but I would rather make it obvious when 
these parameters should be set so that this would not be required.


Thanks,
John




Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

2019-01-11 Thread Martin K. Petersen


John,

> So how about just drop these APIs and let the user set the shost
> protection parameters directly, like other shost parameters,

The protection interfaces here obviously predate the block layer
allocation changes that made this particular issue pop up.

> which should make it a bit clearer when these should be set,
> i.e. before scsi_add_host()?

In general, I am not so keen on the somewhat messy intersection between
the host parameters and the host template. The static host templates
made lots of sense in the days of Seagate ST01 and fixed hardware
capabilities.  But reality is that most modern controllers have to query
firmware interfaces to figure out what their actual capabilities are.

So in this case I think that accessor functions are actually better
because they allow us to print a big fat warning when you twiddle
something you shouldn't post-initialization. So that's something I think
we could--and should--improve.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

2019-01-10 Thread John Garry

On 09/01/2019 18:41, Christoph Hellwig wrote:

This looks good.  I wonder if there is any good way to prevent other
drivers from picking up this bug byt using a better interface, but
that should not delay your fix.

.



I noticed that hisi_sas has this same problem but I forgot to fix it.

So how about just drop these APIs and let the user set the shost 
protection parameters directly, like other shost parameters, which 
should make it a bit clearer when these should be set, i.e. before 
scsi_add_host()?


John




Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

2019-01-09 Thread Christoph Hellwig
This looks good.  I wonder if there is any good way to prevent other
drivers from picking up this bug byt using a better interface, but
that should not delay your fix.


Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

2019-01-08 Thread Martin K. Petersen


Logan,

> To prevent this, the calls to scsi_host_set_prot() are moved into
> isci_host_alloc() before the call to scsi_add_host(). Out of caution,
> also move the similar call to scsi_host_set_guard().

Applied to 5.0/scsi-fixes. Thanks much!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

2019-01-08 Thread Jens Axboe
On 1/8/19 1:50 PM, Logan Gunthorpe wrote:
> scsi_mq_setup_tags(), which is called by scsi_add_host(), calculates
> the command size to allocate based on the prot_capabilities. In the
> isci driver, scsi_host_set_prot() is called after scsi_add_host()
> so the command size gets calculated to be smaller than it needs to be.
> Eventually, scsi_mq_init_request() locates the 'prot_sdb' after the command
> assuming it was sized correctly and a buffer overrun may occur.
> 
> However, seeing blk_mq_alloc_rqs() rounds up to the nearest cache line
> size, the mistake can go unnoticed.
> 
> The bug was noticed after the struct request size was reduced by
> commit 9d037ad707ed ("block: remove req->timeout_list")
> 
> Which likely reduced the allocated space for the request by an entire
> cache line, enough that the overflow could be hit and it caused a panic,
> on boot, at:
> 
>   RIP: 0010:t10_pi_complete+0x77/0x1c0
>   Call Trace:
> 
> sd_done+0xf5/0x340
> scsi_finish_command+0xc3/0x120
> blk_done_softirq+0x83/0xb0
> __do_softirq+0xa1/0x2e6
> irq_exit+0xbc/0xd0
> call_function_single_interrupt+0xf/0x20
> 
> 
> sd_done() would call scsi_prot_sg_count() which reads the number of
> entities in 'prot_sdb', but seeing 'prot_sdb' is located after the end of
> the allocated space it reads a garbage number and erroneously calls
> t10_pi_complete().
> 
> To prevent this, the calls to scsi_host_set_prot() are moved into
> isci_host_alloc() before the call to scsi_add_host(). Out of caution,
> also move the similar call to scsi_host_set_guard().

Nice work!

Reviewed-by: Jens Axboe 

-- 
Jens Axboe



Re: [PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

2019-01-08 Thread Jeff Moyer
Logan Gunthorpe  writes:

> scsi_mq_setup_tags(), which is called by scsi_add_host(), calculates
> the command size to allocate based on the prot_capabilities. In the
> isci driver, scsi_host_set_prot() is called after scsi_add_host()
> so the command size gets calculated to be smaller than it needs to be.
> Eventually, scsi_mq_init_request() locates the 'prot_sdb' after the command
> assuming it was sized correctly and a buffer overrun may occur.

[...]

> To prevent this, the calls to scsi_host_set_prot() are moved into
> isci_host_alloc() before the call to scsi_add_host(). Out of caution,
> also move the similar call to scsi_host_set_guard().
>
> Fixes: 3d2d75254915 ("[SCSI] isci: T10 DIF support")
> Link: 
> http://lkml.kernel.org/r/da851333-eadd-163a-8c78-e1f4ec5ec...@deltatee.com
> Signed-off-by: Logan Gunthorpe 
> Cc: Intel SCU Linux support 
> Cc: Artur Paszkiewicz 
> Cc: "James E.J. Bottomley" 
> Cc: "Martin K. Petersen" 
> Cc: Christoph Hellwig 
> Cc: Jens Axboe 
> Cc: Jeff Moyer 

Nice job, and excellent commit message.  We'll need a similar patch for
lpfc.

Reviewed-by: Jeff Moyer 

> ---
>  drivers/scsi/isci/init.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c
> index 68b90c4f79a3..1727d0c71b12 100644
> --- a/drivers/scsi/isci/init.c
> +++ b/drivers/scsi/isci/init.c
> @@ -576,6 +576,13 @@ static struct isci_host *isci_host_alloc(struct pci_dev 
> *pdev, int id)
>   shost->max_lun = ~0;
>   shost->max_cmd_len = MAX_COMMAND_SIZE;
>  
> + /* turn on DIF support */
> + scsi_host_set_prot(shost,
> +SHOST_DIF_TYPE1_PROTECTION |
> +SHOST_DIF_TYPE2_PROTECTION |
> +SHOST_DIF_TYPE3_PROTECTION);
> + scsi_host_set_guard(shost, SHOST_DIX_GUARD_CRC);
> +
>   err = scsi_add_host(shost, >dev);
>   if (err)
>   goto err_shost;
> @@ -663,13 +670,6 @@ static int isci_pci_probe(struct pci_dev *pdev, const 
> struct pci_device_id *id)
>   goto err_host_alloc;
>   }
>   pci_info->hosts[i] = h;
> -
> - /* turn on DIF support */
> - scsi_host_set_prot(to_shost(h),
> -SHOST_DIF_TYPE1_PROTECTION |
> -SHOST_DIF_TYPE2_PROTECTION |
> -SHOST_DIF_TYPE3_PROTECTION);
> - scsi_host_set_guard(to_shost(h), SHOST_DIX_GUARD_CRC);
>   }
>  
>   err = isci_setup_interrupts(pdev);


[PATCH] scsi: isci: initialize shost fully before calling scsi_add_host()

2019-01-08 Thread Logan Gunthorpe
scsi_mq_setup_tags(), which is called by scsi_add_host(), calculates
the command size to allocate based on the prot_capabilities. In the
isci driver, scsi_host_set_prot() is called after scsi_add_host()
so the command size gets calculated to be smaller than it needs to be.
Eventually, scsi_mq_init_request() locates the 'prot_sdb' after the command
assuming it was sized correctly and a buffer overrun may occur.

However, seeing blk_mq_alloc_rqs() rounds up to the nearest cache line
size, the mistake can go unnoticed.

The bug was noticed after the struct request size was reduced by
commit 9d037ad707ed ("block: remove req->timeout_list")

Which likely reduced the allocated space for the request by an entire
cache line, enough that the overflow could be hit and it caused a panic,
on boot, at:

  RIP: 0010:t10_pi_complete+0x77/0x1c0
  Call Trace:

sd_done+0xf5/0x340
scsi_finish_command+0xc3/0x120
blk_done_softirq+0x83/0xb0
__do_softirq+0xa1/0x2e6
irq_exit+0xbc/0xd0
call_function_single_interrupt+0xf/0x20


sd_done() would call scsi_prot_sg_count() which reads the number of
entities in 'prot_sdb', but seeing 'prot_sdb' is located after the end of
the allocated space it reads a garbage number and erroneously calls
t10_pi_complete().

To prevent this, the calls to scsi_host_set_prot() are moved into
isci_host_alloc() before the call to scsi_add_host(). Out of caution,
also move the similar call to scsi_host_set_guard().

Fixes: 3d2d75254915 ("[SCSI] isci: T10 DIF support")
Link: http://lkml.kernel.org/r/da851333-eadd-163a-8c78-e1f4ec5ec...@deltatee.com
Signed-off-by: Logan Gunthorpe 
Cc: Intel SCU Linux support 
Cc: Artur Paszkiewicz 
Cc: "James E.J. Bottomley" 
Cc: "Martin K. Petersen" 
Cc: Christoph Hellwig 
Cc: Jens Axboe 
Cc: Jeff Moyer 
---
 drivers/scsi/isci/init.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c
index 68b90c4f79a3..1727d0c71b12 100644
--- a/drivers/scsi/isci/init.c
+++ b/drivers/scsi/isci/init.c
@@ -576,6 +576,13 @@ static struct isci_host *isci_host_alloc(struct pci_dev 
*pdev, int id)
shost->max_lun = ~0;
shost->max_cmd_len = MAX_COMMAND_SIZE;
 
+   /* turn on DIF support */
+   scsi_host_set_prot(shost,
+  SHOST_DIF_TYPE1_PROTECTION |
+  SHOST_DIF_TYPE2_PROTECTION |
+  SHOST_DIF_TYPE3_PROTECTION);
+   scsi_host_set_guard(shost, SHOST_DIX_GUARD_CRC);
+
err = scsi_add_host(shost, >dev);
if (err)
goto err_shost;
@@ -663,13 +670,6 @@ static int isci_pci_probe(struct pci_dev *pdev, const 
struct pci_device_id *id)
goto err_host_alloc;
}
pci_info->hosts[i] = h;
-
-   /* turn on DIF support */
-   scsi_host_set_prot(to_shost(h),
-  SHOST_DIF_TYPE1_PROTECTION |
-  SHOST_DIF_TYPE2_PROTECTION |
-  SHOST_DIF_TYPE3_PROTECTION);
-   scsi_host_set_guard(to_shost(h), SHOST_DIX_GUARD_CRC);
}
 
err = isci_setup_interrupts(pdev);
-- 
2.19.0