Re: Regression 4.17-rc1: SSD doesn properly resume causing system hang (NULL pointer dereference)

2018-04-24 Thread Paul Menzel

Dear Bart,


Am 24.04.2018 um 23:17 schrieb Bart Van Assche:

On Tue, 2018-04-24 at 23:04 +0200, Paul Menzel wrote:

I applied your change, and rebuilt the Linux kernel. Unfortunately, it
looks like, it didn’t make a difference.


In that case I don't know what is causing the failure. Can you run a bisect
to determine which commit introduced this regression?


With `scsi_mod.use_blk_mq=n` the system resumes fine, so for to me 
unknown reasons, that Kconfig option get selected in my Linux kernel 
configuration. I remember having similar issues when this was enabled by 
default in Linux 4.13-rc?, so it was just a configuration problem and 
not a regression. Unfortunately, the Linux configuration files are not 
under version control, so I cannot check, but it is probably my fault.


Sorry for the noise, and please tell me, what I can do to get the option 
working on this old device.



Kind regards,

Paul


Re: [PATCH 2/5] ide: kill ide_toggle_bounce

2018-04-24 Thread Jens Axboe
On 4/24/18 12:16 PM, Christoph Hellwig wrote:
> ide_toggle_bounce did select various strange block bounce limits, including
> not bouncing at all as soon as an iommu is present in the system.  Given
> that the dma_map routines now handle any required bounce buffering except
> for ISA DMA, and the ide code already must handle either ISA DMA or highmem
> at least for iommu equipped systems we can get rid of the block layer
> bounce limit setting entirely.

Pretty sure I was the one to add this code, when highmem page IO was
enabled about two decades ago...

Outside of DMA, the issue was that the PIO code could not handle
highmem. That's not the case anymore, so this should be fine.

Reviewed-by: Jens Axboe 

-- 
Jens Axboe



Re: [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status

2018-04-24 Thread Omar Sandoval
On Mon, Apr 23, 2018 at 02:25:03PM +0200, Steffen Maier wrote:
> 
> On 04/19/2018 10:18 PM, Omar Sandoval wrote:
> > On Thu, Apr 19, 2018 at 01:44:41PM -0600, Jens Axboe wrote:
> >> On 4/19/18 1:41 PM, Bart Van Assche wrote:
> >>> On Thu, 2018-04-19 at 12:13 -0700, Omar Sandoval wrote:
>  On Thu, Apr 19, 2018 at 11:53:30AM -0700, Omar Sandoval wrote:
> > Thanks for the test! Applied.
> 
>  Side note, it's unfortunate that this test takes 180 seconds to run only
>  because we have to wait for the command timeout. We should be able to
>  export request_queue->rq_timeout writeable in sysfs. Would you be
>  interested in doing that?
> >>>
> >>> Hello Omar,
> >>>
> >>> Is this perhaps what you are looking for?
> >>> # ls -l /sys/class/scsi_device/*/*/timeout
> >>> -rw-r--r-- 1 root root 4096 Apr 19 08:52 
> >>> /sys/class/scsi_device/2:0:0:0/device/timeout
> >>> -rw-r--r-- 1 root root 4096 Apr 19 12:39 
> >>> /sys/class/scsi_device/8:0:0:1/device/timeout
> >>
> >> We should have it generically available though, not just for SCSI. In
> >> retrospect, it should have been under queue/ from the start, now we'll
> >> end up with duplicate entries for SCSI.
> > 
> > For the sake of this test, I just decreased the timeout through SCSI.
> 
> Great idea.
> 
> > echo 5 > "/sys/block/${SCSI_DEBUG_DEVICES[0]}/device/timeout"
> 
> However, the timeout should be sufficiently larger than scsi_debug/delay,
> in order not to run into the command timeout.
> It may be unfortunate that scsi_debug/delay uses jiffies as unit and
> can thus differ in a range of an order of magnitude for different kernel 
> configs.
> 
> > # delay to reduce response repetition: around 1..10sec depending on HZ
> > echo 1000 > /sys/bus/pseudo/drivers/scsi_debug/delay
> 
> On s390, we typically have HZ=100, so 1000 jiffies are 10 seconds.

Good catch, I just switched this to use ndelay in nanoseconds instead of
delay.

> We can increase the sdev cmd timeout or decrease the scsi_debug/delay.
> 100 instead of 1000 for scsi_debug/delay worked for me;
> but for some reason the loop checking for busy did not work (any more?)
> causing an unexpected test case error:
> 
> > # ./check scsi/004
> > scsi/004 (ensure repeated TASK SET FULL results in EIO on timing out 
> > command) [failed]
> > runtime  31.892s  ...  31.720s
> > --- tests/scsi/004.out  2018-04-16 11:47:19.105931872 +0200
> > +++ results/nodev/scsi/004.out.bad  2018-04-23 14:07:33.615445253 
> > +0200
> > @@ -1,3 +1,3 @@
> >  Running scsi/004
> > -Input/output error
> > +modprobe: FATAL: Module scsi_debug is in use.
> >  Test complete
> 
> so I added another sleep hack:
> 
>  # dd closing SCSI disk causes implicit TUR also being delayed once
> +# sleep over time window where READ was done and TUR not yet queued
> +sleep 2
>  while grep -q -F "in_use_bm BUSY:" 
> "/proc/scsi/scsi_debug/${SCSI_DEBUG_HOSTS[0]}"; do
> 
> What do you think?

I've been hitting this on and off on all of the scsi-debug tests for
awhile, and I can't figure out where the lingering reference comes from.
I don't think it's related, but I'll look into it.


Re: [PATCH 3/3] scsi: ufs: Use freq table with devfreq

2018-04-24 Thread Bjorn Andersson
On Tue 24 Apr 15:08 PDT 2018, Subhash Jadavani wrote:

> On 2018-04-23 17:20, Bjorn Andersson wrote:
> > devfreq requires that the client operates on actual frequencies, not
> > only 0 and UMAX_INT and as such UFS brok with the introduction of
> > f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency").
> > 
> > This patch registers the frequencies of the first clock with devfreq and
> > use these to determine if we're trying to step up or down.
> > 
[..]
> 
> Looks good to me.
> Reviewed-by: Subhash Jadavani 
> 

Thanks Subhash. Unfortunately I need to respin this to register the opp
table based on our freq table, so there will be a v2 of this soon.

Regards,
Bjorn


Re: [PATCH 3/3] scsi: ufs: Use freq table with devfreq

2018-04-24 Thread Subhash Jadavani

On 2018-04-23 17:20, Bjorn Andersson wrote:

devfreq requires that the client operates on actual frequencies, not
only 0 and UMAX_INT and as such UFS brok with the introduction of
f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency").

This patch registers the frequencies of the first clock with devfreq 
and

use these to determine if we're trying to step up or down.

Signed-off-by: Bjorn Andersson 
---
 drivers/scsi/ufs/ufshcd.c | 39 ---
 1 file changed, 32 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 2253f24309ec..07b1f3c7bd2d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1168,16 +1168,13 @@ static int ufshcd_devfreq_target(struct device 
*dev,

struct ufs_hba *hba = dev_get_drvdata(dev);
ktime_t start;
bool scale_up, sched_clk_scaling_suspend_work = false;
+   struct list_head *clk_list = >clk_list_head;
+   struct ufs_clk_info *clki;
unsigned long irq_flags;

if (!ufshcd_is_clkscaling_supported(hba))
return -EINVAL;

-   if ((*freq > 0) && (*freq < UINT_MAX)) {
-   dev_err(hba->dev, "%s: invalid freq = %lu\n", __func__, *freq);
-   return -EINVAL;
-   }
-
spin_lock_irqsave(hba->host->host_lock, irq_flags);
if (ufshcd_eh_in_progress(hba)) {
spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
@@ -1187,7 +1184,13 @@ static int ufshcd_devfreq_target(struct device 
*dev,

if (!hba->clk_scaling.active_reqs)
sched_clk_scaling_suspend_work = true;

-   scale_up = (*freq == UINT_MAX) ? true : false;
+   if (list_empty(clk_list)) {
+   spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
+   goto out;
+   }
+
+	clki = list_first_entry(>clk_list_head, struct ufs_clk_info, 
list);

+   scale_up = (*freq == clki->max_freq) ? true : false;
if (!ufshcd_is_devfreq_scaling_required(hba, scale_up)) {
spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
ret = 0;
@@ -1257,11 +1260,33 @@ static struct devfreq_dev_profile
ufs_devfreq_profile = {

 static int ufshcd_devfreq_init(struct ufs_hba *hba)
 {
+   struct devfreq_dev_profile *profile;
+   struct list_head *clk_list = >clk_list_head;
+   struct ufs_clk_info *clki;
struct devfreq *devfreq;
int ret;

+   /* Skip devfreq if we don't have any clocks in the list */
+   if (list_empty(clk_list))
+   return 0;
+
+   profile = devm_kmemdup(hba->dev, _devfreq_profile,
+  sizeof(ufs_devfreq_profile), GFP_KERNEL);
+   if (!profile)
+   return -ENOMEM;
+
+   profile->max_state = 2;
+   profile->freq_table = devm_kcalloc(hba->dev, profile->max_state,
+  sizeof(unsigned long), GFP_KERNEL);
+   if (!profile->freq_table)
+   return -ENOMEM;
+
+	clki = list_first_entry(>clk_list_head, struct ufs_clk_info, 
list);

+   profile->freq_table[0] = clki->min_freq;
+   profile->freq_table[1] = clki->max_freq;
+
devfreq = devm_devfreq_add_device(hba->dev,
-   _devfreq_profile,
+   profile,
"simple_ondemand",
NULL);
if (IS_ERR(devfreq)) {


Looks good to me.
Reviewed-by: Subhash Jadavani 

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: Regression 4.17-rc1: SSD doesn properly resume causing system hang (NULL pointer dereference)

2018-04-24 Thread Bart Van Assche
On Tue, 2018-04-24 at 23:04 +0200, Paul Menzel wrote:
> I applied your change, and rebuilt the Linux kernel. Unfortunately, it 
> looks like, it didn’t make a difference.

In that case I don't know what is causing the failure. Can you run a bisect
to determine which commit introduced this regression?

Thanks,

Bart.




Re: [PATCH 2/3] scsi: ufs: Extract devfreq registration

2018-04-24 Thread Subhash Jadavani

On 2018-04-23 17:20, Bjorn Andersson wrote:

Failing to register with devfreq leaves hba->devfreq assigned, which
causes the error path to dereference the ERR_PTR(). Rather than bolting
on more conditionals, move the call of devm_devfreq_add_device() into
it's own function and only update hba->devfreq once it's successfully
registered.

The subsequent patch builds upon this to make UFS actually work again,
as it's been broken since f1d981eaecf8 ("PM / devfreq: Use the 
available

min/max frequency")

Signed-off-by: Bjorn Andersson 
---
 drivers/scsi/ufs/ufshcd.c | 31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 8f22a980b1a7..2253f24309ec 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1255,6 +1255,26 @@ static struct devfreq_dev_profile 
ufs_devfreq_profile = {

.get_dev_status = ufshcd_devfreq_get_dev_status,
 };

+static int ufshcd_devfreq_init(struct ufs_hba *hba)
+{
+   struct devfreq *devfreq;
+   int ret;
+
+   devfreq = devm_devfreq_add_device(hba->dev,
+   _devfreq_profile,
+   "simple_ondemand",
+   NULL);
+   if (IS_ERR(devfreq)) {
+   ret = PTR_ERR(devfreq);
+   dev_err(hba->dev, "Unable to register with devfreq %d\n", ret);
+   return ret;
+   }
+
+   hba->devfreq = devfreq;
+
+   return 0;
+}
+
 static void __ufshcd_suspend_clkscaling(struct ufs_hba *hba)
 {
unsigned long flags;
@@ -6399,16 +6419,9 @@ static int ufshcd_probe_hba(struct ufs_hba *hba)
sizeof(struct ufs_pa_layer_attr));
hba->clk_scaling.saved_pwr_info.is_valid = true;
if (!hba->devfreq) {
-   hba->devfreq = devm_devfreq_add_device(hba->dev,
-   _devfreq_profile,
-   "simple_ondemand",
-   NULL);
-   if (IS_ERR(hba->devfreq)) {
-   ret = PTR_ERR(hba->devfreq);
-   dev_err(hba->dev, "Unable to register with 
devfreq %d\n",
-   ret);
+   ret = ufshcd_devfreq_init(hba);
+   if (ret)
goto out;
-   }
}
hba->clk_scaling.is_allowed = true;
}


Looks good to me.
Reviewed-by: Subhash Jadavani 

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

2018-04-24 Thread Bjorn Andersson
On Mon 23 Apr 23:09 PDT 2018, MyungJoo Ham wrote:

> >On Mon 23 Apr 19:48 PDT 2018, Chanwoo Choi wrote:
> >
> >> Hi,
> >> 
> >> On 2018??? 04??? 24??? 09:20, Bjorn Andersson wrote:
> >> > The code in devfreq_add_device() handles the case where a freq_table is
> >> > passed by the client, but then requests min and max frequences from
> >> > the, in this case absent, opp tables.
> >> > 
> >> > Read the min and max frequencies from the frequency table, which has
> >> > been built from the opp table if one exists, instead of querying the
> >> > opp table.
> >> > 
> >> > Signed-off-by: Bjorn Andersson 
> >> > ---
> >> > 
> >> > An alternative approach is to clarify in the devfreq code that it's not
> >> > possible to pass a freq_table and then in patch 3 create an opp table 
> >> > for the
> >> > device in runtime; although the error handling of this becomes 
> >> > non-trivial.
> >> > 
> >> > Transitioning the UFSHCD to use opp tables directly is hindered by the 
> >> > fact
> >> > that the Qualcomm UFS hardware has two different clocks that needs to be
> >> > running at different rates, so we would need a way to describe the two 
> >> > rates in
> >> > the opp table. (And would force us to change the DT binding)
> >> > 
> >> >  drivers/devfreq/devfreq.c | 22 --
> >> >  1 file changed, 4 insertions(+), 18 deletions(-)
> >> > 
> >> > diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> >> > index fe2af6aa88fc..086ced50a13d 100644
> >> > --- a/drivers/devfreq/devfreq.c
> >> > +++ b/drivers/devfreq/devfreq.c
> >> > @@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct 
> >> > device *dev)
> >> >  
> >> >  static unsigned long find_available_min_freq(struct devfreq *devfreq)
> >> >  {
> >> > -struct dev_pm_opp *opp;
> >> > -unsigned long min_freq = 0;
> >> > -
> >> > -opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, _freq);
> >> > -if (IS_ERR(opp))
> >> > -min_freq = 0;
> >> > -else
> >> > -dev_pm_opp_put(opp);
> >> > +struct devfreq_dev_profile *profile = devfreq->profile;
> >> >  
> >> > -return min_freq;
> >> > +return profile->freq_table[0];
> >> 
> >> It is wrong. The thermal framework support the devfreq-cooling device
> >> which uses the dev_pm_opp_enable/disable().
> >> 
> >
> >Okay, that makes sense. So rather than registering a custom freq_table I
> >should register the min and max frequency using dev_pm_opp_add().
> >
> >> In order to find the correct available min frequency,
> >> the devfreq have to use the OPP function instead of using the first entry
> >> of the freq_table array.
> >> 
> >
> >Based on this there seems to be room for cleaning out the freq_table
> >from devfreq, to reduce the confusion. I will review this further.
> 
> Could you please check if the bug suffering you gets resolved by
> replacing 0 with ULONG_MAX in the function find_available_max_freq?
> 
> -max_freq = 0;
> +max_freq = ULONG_MAX;
> 
> Even if you are not using OPP, these functions should provide somewhat
> "compatible" values.
> 

I also need to make set_freq_table() handle the case where there is no
opp table and change a min_freq of 0 from being treated as an error.

With this I think we're back at supporting using devfreq without
specifying any available frequencies. I am however uncertain if this
should be considered valid use of devfreq.

Regards,
Bjorn


Re: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

2018-04-24 Thread Bjorn Andersson
On Tue 24 Apr 00:26 PDT 2018, Chanwoo Choi wrote:

> Hi,
> 
> On 2018??? 04??? 24??? 14:29, Bjorn Andersson wrote:
> > On Mon 23 Apr 19:48 PDT 2018, Chanwoo Choi wrote:
> > 
> >> Hi,
> >>
> >> On 2018??? 04??? 24??? 09:20, Bjorn Andersson wrote:
> >>> The code in devfreq_add_device() handles the case where a freq_table is
> >>> passed by the client, but then requests min and max frequences from
> >>> the, in this case absent, opp tables.
> >>>
> >>> Read the min and max frequencies from the frequency table, which has
> >>> been built from the opp table if one exists, instead of querying the
> >>> opp table.
> >>>
> >>> Signed-off-by: Bjorn Andersson 
> >>> ---
> >>>
> >>> An alternative approach is to clarify in the devfreq code that it's not
> >>> possible to pass a freq_table and then in patch 3 create an opp table for 
> >>> the
> >>> device in runtime; although the error handling of this becomes 
> >>> non-trivial.
> >>>
> >>> Transitioning the UFSHCD to use opp tables directly is hindered by the 
> >>> fact
> >>> that the Qualcomm UFS hardware has two different clocks that needs to be
> >>> running at different rates, so we would need a way to describe the two 
> >>> rates in
> >>> the opp table. (And would force us to change the DT binding)
> >>>
> >>>  drivers/devfreq/devfreq.c | 22 --
> >>>  1 file changed, 4 insertions(+), 18 deletions(-)
> >>>
> >>> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> >>> index fe2af6aa88fc..086ced50a13d 100644
> >>> --- a/drivers/devfreq/devfreq.c
> >>> +++ b/drivers/devfreq/devfreq.c
> >>> @@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct 
> >>> device *dev)
> >>>  
> >>>  static unsigned long find_available_min_freq(struct devfreq *devfreq)
> >>>  {
> >>> - struct dev_pm_opp *opp;
> >>> - unsigned long min_freq = 0;
> >>> -
> >>> - opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, _freq);
> >>> - if (IS_ERR(opp))
> >>> - min_freq = 0;
> >>> - else
> >>> - dev_pm_opp_put(opp);
> >>> + struct devfreq_dev_profile *profile = devfreq->profile;
> >>>  
> >>> - return min_freq;
> >>> + return profile->freq_table[0];
> >>
> >> It is wrong. The thermal framework support the devfreq-cooling device
> >> which uses the dev_pm_opp_enable/disable().
> >>
> > 
> > Okay, that makes sense. So rather than registering a custom freq_table I
> > should register the min and max frequency using dev_pm_opp_add().
> 
> Thanks.
> 
> > 
> >> In order to find the correct available min frequency,
> >> the devfreq have to use the OPP function instead of using the first entry
> >> of the freq_table array.
> >>
> > 
> > Based on this there seems to be room for cleaning out the freq_table
> > from devfreq, to reduce the confusion. I will review this further.
> 
> Actually, devfreq must need to have the freq_table[] array. But, freq_table[]
> array should be handled in the devfreq core. Now, the devfreq device drivers 
> can
> touch the freq_table. I think it is not good.
> 
> There is a reason why we have to maintain the freq_table[] as the internal 
> variable.
> OPP doesn't provide the OPP API which get the all registered frequencies.
> If devfreq-cooling device disables the specific frequency by using 
> dev_pm_oppdisable(),
> the user of OPP interface can not get the disabled frequency list.
> So, I maintain the freq_table even if using the OPP interface.
> 

Thanks for the clarification, I see some possibilities for improving
this but it makes sense.

> And, devfreq-cooling device uses the freq_table directly because
> released MALi driver from ARM initializes the freq_table list
> directly.
> 

Forgive me if I misunderstand this, but isn't this exactly what I'm
trying to do in patch 3? Which stopped working back in v4.15-rc1, with
the introduction of f1d981eaecf8 ("PM / devfreq: Use the available
min/max frequency").

> I have no any objection for refactoring. Just I'm sharing the issue
> and current status.
> 

Thanks for sharing the current status and helping me understand how to
properly use devfreq.

Regards,
Bjorn


Re: Regression 4.17-rc1: SSD doesn properly resume causing system hang (NULL pointer dereference)

2018-04-24 Thread Bart Van Assche
On Tue, 2018-04-24 at 19:37 +0200, Paul Menzel wrote:
> On 04/24/18 19:31, Bart Van Assche wrote:
> Here it is, pasted as citation, as otherwise Thunderbird would wrap the 
> line.
> 
> > (gdb) disas blk_set_runtime_active
> > Dump of assembler code for function blk_set_runtime_active:
> >0xc1518610 <+0>: call   0xc106ac9c <__fentry__>
> >0xc1518615 <+5>: push   %ebp
> >0xc1518616 <+6>: mov%esp,%ebp
> >0xc1518618 <+8>: sub$0x14,%esp
> >0xc151861b <+11>:mov%ebx,-0xc(%ebp)
> >0xc151861e <+14>:mov%eax,%ebx
> >0xc1518620 <+16>:mov%gs:0x14,%eax
> >0xc1518626 <+22>:mov%eax,-0x10(%ebp)
> >0xc1518629 <+25>:xor%eax,%eax
> >0xc151862b <+27>:test   %ebx,%ebx
> >0xc151862d <+29>:mov%esi,-0x8(%ebp)
> >0xc1518630 <+32>:mov%edi,-0x4(%ebp)
> >0xc1518633 <+35>:je 0xc15186b3 
> >0xc1518635 <+37>:mov0xfc(%ebx),%eax
> >0xc151863b <+43>:call   0xc1a4b920 <_raw_spin_lock_irq>
> >0xc1518640 <+48>:mov0x150(%ebx),%esi
> >0xc1518646 <+54>:xor%eax,%eax
> >0xc1518648 <+56>:mov0xc1ca7d20,%edi
> >0xc151864e <+62>:mov%eax,0x154(%ebx)
> >0xc1518654 <+68>:cmp$0xff0c,%esi
> >0xc151865a <+74>:mov%edi,-0x14(%ebp)
> >0xc151865d <+77>:je 0xc15186a5 
> >0xc151865f <+79>:mov%edi,0xf4(%esi)

The e-mail at the start of this e-mail thread shows that %esi == NULL at
the time of the crash and also that the crash occurred at offset 79 (0x4f)
in this function. I think that means that the crash occurred in the following
code: pm_request_autosuspend(q->dev) and also that this means that q->dev ==
NULL. Can you test the (untested) patch below?

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 57cae47ab1c2..b029a94a1e66 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3272,7 +3272,6 @@ static void sd_probe_async(struct work_struct *work)
gd->events |= DISK_EVENT_MEDIA_CHANGE;
}
 
-   blk_pm_runtime_init(sdp->request_queue, dev);
device_add_disk(dev, gd);
if (sdkp->capacity)
sd_dif_config_host(sdkp);
@@ -3390,6 +3389,8 @@ static int sd_probe(struct device *dev)
get_device(dev);
dev_set_drvdata(dev, sdkp);
 
+   blk_pm_runtime_init(sdp->request_queue, dev);
+
get_device(>dev); /* prevent release before sd_probe_async() */
WARN_ON_ONCE(!queue_work(system_unbound_wq, >probe_work));
 

Thanks,

Bart.

Re: [PATCH 5/5] PCI: remove PCI_DMA_BUS_IS_PHYS

2018-04-24 Thread Palmer Dabbelt

On Tue, 24 Apr 2018 11:16:25 PDT (-0700), Christoph Hellwig wrote:

This was used by the ide, scsi and networking code in the past to
determine if they should bounce payloads.  Now that the dma mapping
always have to support dma to all physical memory (thanks to swiotlb
for non-iommu systems) there is no need to this crude hack any more.

Signed-off-by: Christoph Hellwig 
---
[...]
diff --git a/arch/riscv/include/asm/pci.h b/arch/riscv/include/asm/pci.h
index 0f2fc9ef20fc..b3638c505728 100644
--- a/arch/riscv/include/asm/pci.h
+++ b/arch/riscv/include/asm/pci.h
@@ -26,9 +26,6 @@
 /* RISC-V shim does not initialize PCI bus */
 #define pcibios_assign_all_busses() 1

-/* We do not have an IOMMU */
-#define PCI_DMA_BUS_IS_PHYS 1
-
 extern int isa_dma_bridge_buggy;

 #ifdef CONFIG_PCI


Thanks!

Acked-by: Palmer Dabbelt  (For the RISC-V change)


[PATCH 2/5] ide: kill ide_toggle_bounce

2018-04-24 Thread Christoph Hellwig
ide_toggle_bounce did select various strange block bounce limits, including
not bouncing at all as soon as an iommu is present in the system.  Given
that the dma_map routines now handle any required bounce buffering except
for ISA DMA, and the ide code already must handle either ISA DMA or highmem
at least for iommu equipped systems we can get rid of the block layer
bounce limit setting entirely.

Signed-off-by: Christoph Hellwig 
---
 drivers/ide/ide-dma.c   |  2 --
 drivers/ide/ide-lib.c   | 26 --
 drivers/ide/ide-probe.c |  3 ---
 include/linux/ide.h |  2 --
 4 files changed, 33 deletions(-)

diff --git a/drivers/ide/ide-dma.c b/drivers/ide/ide-dma.c
index 54d4d78ca46a..6f344654ef22 100644
--- a/drivers/ide/ide-dma.c
+++ b/drivers/ide/ide-dma.c
@@ -180,7 +180,6 @@ EXPORT_SYMBOL_GPL(ide_dma_unmap_sg);
 void ide_dma_off_quietly(ide_drive_t *drive)
 {
drive->dev_flags &= ~IDE_DFLAG_USING_DMA;
-   ide_toggle_bounce(drive, 0);
 
drive->hwif->dma_ops->dma_host_set(drive, 0);
 }
@@ -211,7 +210,6 @@ EXPORT_SYMBOL(ide_dma_off);
 void ide_dma_on(ide_drive_t *drive)
 {
drive->dev_flags |= IDE_DFLAG_USING_DMA;
-   ide_toggle_bounce(drive, 1);
 
drive->hwif->dma_ops->dma_host_set(drive, 1);
 }
diff --git a/drivers/ide/ide-lib.c b/drivers/ide/ide-lib.c
index e1180fa46196..78cb79eddc8b 100644
--- a/drivers/ide/ide-lib.c
+++ b/drivers/ide/ide-lib.c
@@ -6,32 +6,6 @@
 #include 
 #include 
 
-/**
- * ide_toggle_bounce   -   handle bounce buffering
- * @drive: drive to update
- * @on: on/off boolean
- *
- * Enable or disable bounce buffering for the device. Drives move
- * between PIO and DMA and that changes the rules we need.
- */
-
-void ide_toggle_bounce(ide_drive_t *drive, int on)
-{
-   u64 addr = BLK_BOUNCE_HIGH; /* dma64_addr_t */
-
-   if (!PCI_DMA_BUS_IS_PHYS) {
-   addr = BLK_BOUNCE_ANY;
-   } else if (on && drive->media == ide_disk) {
-   struct device *dev = drive->hwif->dev;
-
-   if (dev && dev->dma_mask)
-   addr = *dev->dma_mask;
-   }
-
-   if (drive->queue)
-   blk_queue_bounce_limit(drive->queue, addr);
-}
-
 u64 ide_get_lba_addr(struct ide_cmd *cmd, int lba48)
 {
struct ide_taskfile *tf = >tf;
diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c
index 2019e66eada7..8d8ed036ca0a 100644
--- a/drivers/ide/ide-probe.c
+++ b/drivers/ide/ide-probe.c
@@ -805,9 +805,6 @@ static int ide_init_queue(ide_drive_t *drive)
/* assign drive queue */
drive->queue = q;
 
-   /* needs drive->queue to be set */
-   ide_toggle_bounce(drive, 1);
-
return 0;
 }
 
diff --git a/include/linux/ide.h b/include/linux/ide.h
index ca9d34feb572..11f0dd03a4b4 100644
--- a/include/linux/ide.h
+++ b/include/linux/ide.h
@@ -1508,8 +1508,6 @@ static inline void ide_set_hwifdata (ide_hwif_t * hwif, 
void *data)
hwif->hwif_data = data;
 }
 
-extern void ide_toggle_bounce(ide_drive_t *drive, int on);
-
 u64 ide_get_lba_addr(struct ide_cmd *, int);
 u8 ide_dump_status(ide_drive_t *, const char *, u8);
 
-- 
2.17.0



remove PCI_DMA_BUS_IS_PHYS V2

2018-04-24 Thread Christoph Hellwig
Hi all,

this series tries to get rid of the global and PCI_DMA_BUS_IS_PHYS flag,
which causes the block layer and networking code to bounce buffer memory
above the dma mask in some cases.  It is a leftover from i386 + highmem
days and is obsolete now that we have swiotlb or iommus so that the
dma ops implementations can always (minus the ISA DMA case which
will require further attention) handle memory passed to them.

Changes since V1:
 - dropped all patches not strictly required to remove
   PCI_DMA_BUS_IS_PHYS, those will be resent separately


[PATCH 1/5] scsi: reduce use of block bounce buffers

2018-04-24 Thread Christoph Hellwig
We can rely on the dma-mapping code to handle any DMA limits that is
bigger than the ISA DMA mask for us (either using an iommu or swiotlb),
so remove setting the block layer bounce limit for anything but the
unchecked_isa_dma case, or the bouncing for highmem pages.

Signed-off-by: Christoph Hellwig 
---
 drivers/scsi/scsi_lib.c | 24 ++--
 1 file changed, 2 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index e9b4f279d29c..e0b614c0b1e6 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2149,27 +2149,6 @@ static int scsi_map_queues(struct blk_mq_tag_set *set)
return blk_mq_map_queues(set);
 }
 
-static u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
-{
-   struct device *host_dev;
-   u64 bounce_limit = 0x;
-
-   if (shost->unchecked_isa_dma)
-   return BLK_BOUNCE_ISA;
-   /*
-* Platforms with virtual-DMA translation
-* hardware have no practical limit.
-*/
-   if (!PCI_DMA_BUS_IS_PHYS)
-   return BLK_BOUNCE_ANY;
-
-   host_dev = scsi_get_device(shost);
-   if (host_dev && host_dev->dma_mask)
-   bounce_limit = (u64)dma_max_pfn(host_dev) << PAGE_SHIFT;
-
-   return bounce_limit;
-}
-
 void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
 {
struct device *dev = shost->dma_dev;
@@ -2189,7 +2168,8 @@ void __scsi_init_queue(struct Scsi_Host *shost, struct 
request_queue *q)
}
 
blk_queue_max_hw_sectors(q, shost->max_sectors);
-   blk_queue_bounce_limit(q, scsi_calculate_bounce_limit(shost));
+   if (shost->unchecked_isa_dma)
+   blk_queue_bounce_limit(q, BLK_BOUNCE_ISA);
blk_queue_segment_boundary(q, shost->dma_boundary);
dma_set_seg_boundary(dev, shost->dma_boundary);
 
-- 
2.17.0



[PATCH 5/5] PCI: remove PCI_DMA_BUS_IS_PHYS

2018-04-24 Thread Christoph Hellwig
This was used by the ide, scsi and networking code in the past to
determine if they should bounce payloads.  Now that the dma mapping
always have to support dma to all physical memory (thanks to swiotlb
for non-iommu systems) there is no need to this crude hack any more.

Signed-off-by: Christoph Hellwig 
---
 arch/alpha/include/asm/pci.h  |  5 -
 arch/arc/include/asm/pci.h|  6 --
 arch/arm/include/asm/pci.h|  7 ---
 arch/arm64/include/asm/pci.h  |  5 -
 arch/h8300/include/asm/pci.h  |  2 --
 arch/hexagon/kernel/dma.c |  1 -
 arch/ia64/hp/common/sba_iommu.c   |  3 ---
 arch/ia64/include/asm/pci.h   | 17 -
 arch/ia64/kernel/setup.c  | 12 
 arch/ia64/sn/kernel/io_common.c   |  5 -
 arch/m68k/include/asm/pci.h   |  6 --
 arch/microblaze/include/asm/pci.h |  6 --
 arch/mips/include/asm/pci.h   |  7 ---
 arch/parisc/include/asm/pci.h | 23 ---
 arch/parisc/kernel/setup.c|  5 -
 arch/powerpc/include/asm/pci.h| 18 --
 arch/riscv/include/asm/pci.h  |  3 ---
 arch/s390/include/asm/pci.h   |  2 --
 arch/s390/pci/pci_dma.c   |  2 --
 arch/sh/include/asm/pci.h |  6 --
 arch/sh/kernel/dma-nommu.c|  1 -
 arch/sparc/include/asm/pci_32.h   |  4 
 arch/sparc/include/asm/pci_64.h   |  6 --
 arch/x86/include/asm/pci.h|  3 ---
 arch/xtensa/include/asm/pci.h |  2 --
 drivers/parisc/ccio-dma.c |  2 --
 drivers/parisc/sba_iommu.c|  2 --
 include/asm-generic/pci.h |  8 
 include/linux/dma-mapping.h   |  1 -
 lib/dma-direct.c  |  1 -
 tools/virtio/linux/dma-mapping.h  |  2 --
 31 files changed, 173 deletions(-)

diff --git a/arch/alpha/include/asm/pci.h b/arch/alpha/include/asm/pci.h
index b9ec55351924..cf6bc1e64d66 100644
--- a/arch/alpha/include/asm/pci.h
+++ b/arch/alpha/include/asm/pci.h
@@ -56,11 +56,6 @@ struct pci_controller {
 
 /* IOMMU controls.  */
 
-/* The PCI address space does not equal the physical memory address space.
-   The networking and block device layers use this boolean for bounce buffer
-   decisions.  */
-#define PCI_DMA_BUS_IS_PHYS  0
-
 /* TODO: integrate with include/asm-generic/pci.h ? */
 static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
 {
diff --git a/arch/arc/include/asm/pci.h b/arch/arc/include/asm/pci.h
index ba56c23c1b20..4ff53c041c64 100644
--- a/arch/arc/include/asm/pci.h
+++ b/arch/arc/include/asm/pci.h
@@ -16,12 +16,6 @@
 #define PCIBIOS_MIN_MEM 0x10
 
 #define pcibios_assign_all_busses()1
-/*
- * The PCI address space does equal the physical memory address space.
- * The networking and block device layers use this boolean for bounce
- * buffer decisions.
- */
-#define PCI_DMA_BUS_IS_PHYS1
 
 #endif /* __KERNEL__ */
 
diff --git a/arch/arm/include/asm/pci.h b/arch/arm/include/asm/pci.h
index 1f0de808d111..0abd389cf0ec 100644
--- a/arch/arm/include/asm/pci.h
+++ b/arch/arm/include/asm/pci.h
@@ -19,13 +19,6 @@ static inline int pci_proc_domain(struct pci_bus *bus)
 }
 #endif /* CONFIG_PCI_DOMAINS */
 
-/*
- * The PCI address space does equal the physical memory address space.
- * The networking and block device layers use this boolean for bounce
- * buffer decisions.
- */
-#define PCI_DMA_BUS_IS_PHYS (1)
-
 #define HAVE_PCI_MMAP
 #define ARCH_GENERIC_PCI_MMAP_RESOURCE
 
diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
index 8747f7c5e0e7..9e690686e8aa 100644
--- a/arch/arm64/include/asm/pci.h
+++ b/arch/arm64/include/asm/pci.h
@@ -18,11 +18,6 @@
 #define pcibios_assign_all_busses() \
(pci_has_flag(PCI_REASSIGN_ALL_BUS))
 
-/*
- * PCI address space differs from physical memory address space
- */
-#define PCI_DMA_BUS_IS_PHYS(0)
-
 #define ARCH_GENERIC_PCI_MMAP_RESOURCE 1
 
 extern int isa_dma_bridge_buggy;
diff --git a/arch/h8300/include/asm/pci.h b/arch/h8300/include/asm/pci.h
index 7c9e55d62215..d4d345a52092 100644
--- a/arch/h8300/include/asm/pci.h
+++ b/arch/h8300/include/asm/pci.h
@@ -15,6 +15,4 @@ static inline void pcibios_penalize_isa_irq(int irq, int 
active)
/* We don't do dynamic PCI IRQ allocation */
 }
 
-#define PCI_DMA_BUS_IS_PHYS(1)
-
 #endif /* _ASM_H8300_PCI_H */
diff --git a/arch/hexagon/kernel/dma.c b/arch/hexagon/kernel/dma.c
index ad8347c29dcf..77459df34e2e 100644
--- a/arch/hexagon/kernel/dma.c
+++ b/arch/hexagon/kernel/dma.c
@@ -208,7 +208,6 @@ const struct dma_map_ops hexagon_dma_ops = {
.sync_single_for_cpu = hexagon_sync_single_for_cpu,
.sync_single_for_device = hexagon_sync_single_for_device,
.mapping_error  = hexagon_mapping_error,
-   .is_phys= 1,
 };
 
 void __init hexagon_dma_init(void)
diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index aec4a3354abe..6f05aba9012f 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ 

[PATCH 4/5] net: remove the PCI_DMA_BUS_IS_PHYS check in illegal_highdma

2018-04-24 Thread Christoph Hellwig
These days the dma mapping routines must be able to handle any address
supported by the device, be that using an iommu, or swiotlb if none is
supported.  With that the PCI_DMA_BUS_IS_PHYS check in illegal_highdma
is not needed and can be removed.

Signed-off-by: Christoph Hellwig 
---
 net/core/dev.c | 20 +---
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index af0558b00c6c..060256cbf4f3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2884,11 +2884,7 @@ void netdev_rx_csum_fault(struct net_device *dev)
 EXPORT_SYMBOL(netdev_rx_csum_fault);
 #endif
 
-/* Actually, we should eliminate this check as soon as we know, that:
- * 1. IOMMU is present and allows to map all the memory.
- * 2. No high memory really exists on this machine.
- */
-
+/* XXX: check that highmem exists at all on the given machine. */
 static int illegal_highdma(struct net_device *dev, struct sk_buff *skb)
 {
 #ifdef CONFIG_HIGHMEM
@@ -2902,20 +2898,6 @@ static int illegal_highdma(struct net_device *dev, 
struct sk_buff *skb)
return 1;
}
}
-
-   if (PCI_DMA_BUS_IS_PHYS) {
-   struct device *pdev = dev->dev.parent;
-
-   if (!pdev)
-   return 0;
-   for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
-   skb_frag_t *frag = _shinfo(skb)->frags[i];
-   dma_addr_t addr = page_to_phys(skb_frag_page(frag));
-
-   if (!pdev->dma_mask || addr + PAGE_SIZE - 1 > 
*pdev->dma_mask)
-   return 1;
-   }
-   }
 #endif
return 0;
 }
-- 
2.17.0



[PATCH 3/5] ide: remove the PCI_DMA_BUS_IS_PHYS check

2018-04-24 Thread Christoph Hellwig
We now have ways to deal with drainage in the block layer, and libata has
been using it for ages.  We also want to get rid of PCI_DMA_BUS_IS_PHYS
now, so just reduce the PCI transfer size for ide - anyone who cares for
performance on PCI controllers should have switched to libata long ago.

Signed-off-by: Christoph Hellwig 
---
 drivers/ide/ide-probe.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c
index 8d8ed036ca0a..56d7bc228cb3 100644
--- a/drivers/ide/ide-probe.c
+++ b/drivers/ide/ide-probe.c
@@ -796,8 +796,7 @@ static int ide_init_queue(ide_drive_t *drive)
 * This will be fixed once we teach pci_map_sg() about our boundary
 * requirements, hopefully soon. *FIXME*
 */
-   if (!PCI_DMA_BUS_IS_PHYS)
-   max_sg_entries >>= 1;
+   max_sg_entries >>= 1;
 #endif /* CONFIG_PCI */
 
blk_queue_max_segments(q, max_sg_entries);
-- 
2.17.0



Re: Regression 4.17-rc1: SSD doesn properly resume causing system hang (NULL pointer dereference)

2018-04-24 Thread Paul Menzel

Dear Bart,


On 04/24/18 19:31, Bart Van Assche wrote:

On Tue, 2018-04-24 at 19:10 +0200, Paul Menzel wrote:

Please find the configuration file attached. The log only has
`initcall_debug no_console_suspend` added.


What I was looking for in the .config is the following:
CONFIG_SCSI_MQ_DEFAULT=y

Can you also provide the disassembly output for blk_set_runtime_active,
e.g. by loading vmlinux into gdb and by running the command "disas
blk_set_runtime_active"?


Here it is, pasted as citation, as otherwise Thunderbird would wrap the 
line.



(gdb) disas blk_set_runtime_active
Dump of assembler code for function blk_set_runtime_active:
   0xc1518610 <+0>:   call   0xc106ac9c <__fentry__>
   0xc1518615 <+5>:   push   %ebp
   0xc1518616 <+6>:   mov%esp,%ebp
   0xc1518618 <+8>:   sub$0x14,%esp
   0xc151861b <+11>:  mov%ebx,-0xc(%ebp)
   0xc151861e <+14>:  mov%eax,%ebx
   0xc1518620 <+16>:  mov%gs:0x14,%eax
   0xc1518626 <+22>:  mov%eax,-0x10(%ebp)
   0xc1518629 <+25>:  xor%eax,%eax
   0xc151862b <+27>:  test   %ebx,%ebx
   0xc151862d <+29>:  mov%esi,-0x8(%ebp)
   0xc1518630 <+32>:  mov%edi,-0x4(%ebp)
   0xc1518633 <+35>:  je 0xc15186b3 
   0xc1518635 <+37>:  mov0xfc(%ebx),%eax
   0xc151863b <+43>:  call   0xc1a4b920 <_raw_spin_lock_irq>
   0xc1518640 <+48>:  mov0x150(%ebx),%esi
   0xc1518646 <+54>:  xor%eax,%eax
   0xc1518648 <+56>:  mov0xc1ca7d20,%edi
   0xc151864e <+62>:  mov%eax,0x154(%ebx)
   0xc1518654 <+68>:  cmp$0xff0c,%esi
   0xc151865a <+74>:  mov%edi,-0x14(%ebp)
   0xc151865d <+77>:  je 0xc15186a5 
   0xc151865f <+79>:  mov%edi,0xf4(%esi)
   0xc1518665 <+85>:  mov$0x9,%edx
   0xc151866a <+90>:  mov0x150(%ebx),%eax
   0xc1518670 <+96>:  call   0xc175ab80 <__pm_runtime_suspend>
   0xc1518675 <+101>: mov0xfc(%ebx),%eax
   0xc151867b <+107>: call   *0xc1ce2918
   0xc1518681 <+113>: call   *0xc1ce2888
   0xc1518687 <+119>: mov-0x10(%ebp),%eax
   0xc151868a <+122>: xor%gs:0x14,%eax
   0xc1518691 <+129>: jne0xc15186a0 
   0xc1518693 <+131>: mov-0xc(%ebp),%ebx
   0xc1518696 <+134>: mov-0x8(%ebp),%esi
   0xc1518699 <+137>: mov-0x4(%ebp),%edi
   0xc151869c <+140>: mov%ebp,%esp
   0xc151869e <+142>: pop%ebp
   0xc151869f <+143>:	ret
   0xc15186a0 <+144>:	call   0xc108c6c0 <__stack_chk_fail>

   0xc15186a5 <+149>: xor%edx,%edx
   0xc15186a7 <+151>: mov$0xc1ee14b4,%eax
   0xc15186ac <+156>: call   0xc15bb7f0 <__ubsan_handle_type_mismatch>
   0xc15186b1 <+161>: jmp0xc151865f 
   0xc15186b3 <+163>: xor%edx,%edx
   0xc15186b5 <+165>: mov$0xc1ee14cc,%eax
   0xc15186ba <+170>: call   0xc15bb7f0 <__ubsan_handle_type_mismatch>
   0xc15186bf <+175>: jmp0xc1518635 
End of assembler dump.



Kind regards,

Paul


PS: By the way, your mailer stripped the full names of my first message, 
and replace the “names” with the email address.




smime.p7s
Description: S/MIME Cryptographic Signature


Re: Regression 4.17-rc1: SSD doesn properly resume causing system hang (NULL pointer dereference)

2018-04-24 Thread Bart Van Assche
On Tue, 2018-04-24 at 19:10 +0200, Paul Menzel wrote:
> Please find the configuration file attached. The log only has 
> `initcall_debug no_console_suspend` added.

What I was looking for in the .config is the following:
CONFIG_SCSI_MQ_DEFAULT=y

Can you also provide the disassembly output for blk_set_runtime_active,
e.g. by loading vmlinux into gdb and by running the command "disas
blk_set_runtime_active"?

Thanks,

Bart.





Re: Regression 4.17-rc1: SSD doesnʼt properly resume causing system hang (NULL pointer dereference)

2018-04-24 Thread Bart Van Assche
On Tue, 2018-04-24 at 18:14 +0200, Paul Menzel wrote:
> Since Linux 4.17-rc1, resume from ACPI on the Lenovo X60 fails because 
> of a NULL pointer dereference. As the drive doesn’t come back, messages 
> can be seen on the display, but the system cannot be controlled anymore, 
> and has to be forcefully shut down.

Are you using scsi-mq or the legacy SCSI core? If you are not sure how to
answer this question, please share the kernel config and kernel command
line parameters.

Thanks,

Bart.




Re: simplify procfs code for seq_file instances

2018-04-24 Thread Christoph Hellwig
On Tue, Apr 24, 2018 at 08:19:16AM -0700, Andrew Morton wrote:
> > > I want to ask if it is time to start using poorman function overloading
> > > with _b_c_e(). There are millions of allocation functions for example,
> > > all slightly difference, and people will add more. Seeing /proc interfaces
> > > doubled like this is painful.
> > 
> > Function overloading is totally unacceptable.
> > 
> > And I very much disagree with a tradeoff that keeps 5000 lines of 
> > code vs a few new helpers.
> 
> OK, the curiosity and suspense are killing me.  What the heck is
> "function overloading with _b_c_e()"?

The way I understood Alexey was to use have a proc_create macro
that can take different ops types.  Although the short cut for
__builtin_types_compatible_p would be _b_t_c or similar, so maybe
I misunderstood him.


Re: simplify procfs code for seq_file instances

2018-04-24 Thread Andrew Morton
On Tue, 24 Apr 2018 16:23:04 +0200 Christoph Hellwig  wrote:

> On Thu, Apr 19, 2018 at 09:57:50PM +0300, Alexey Dobriyan wrote:
> > > git://git.infradead.org/users/hch/misc.git proc_create
> > 
> > 
> > I want to ask if it is time to start using poorman function overloading
> > with _b_c_e(). There are millions of allocation functions for example,
> > all slightly difference, and people will add more. Seeing /proc interfaces
> > doubled like this is painful.
> 
> Function overloading is totally unacceptable.
> 
> And I very much disagree with a tradeoff that keeps 5000 lines of 
> code vs a few new helpers.

OK, the curiosity and suspense are killing me.  What the heck is
"function overloading with _b_c_e()"?


Re: [PATCH 02/39] proc: introduce proc_create_seq{,_data}

2018-04-24 Thread Christoph Hellwig
On Thu, Apr 19, 2018 at 09:41:06PM +0300, Alexey Dobriyan wrote:
> Should be oopsable.
> Once proc_create_data() returns, entry is live, ->open can be called.

Ok, switching to opencoding proc_create_data instead.


Re: [PATCH 03/39] proc: introduce proc_create_seq_private

2018-04-24 Thread Christoph Hellwig
On Thu, Apr 19, 2018 at 09:50:27PM +0300, Alexey Dobriyan wrote:
> On Thu, Apr 19, 2018 at 02:41:04PM +0200, Christoph Hellwig wrote:
> > Variant of proc_create_data that directly take a struct seq_operations
> 
> > --- a/fs/proc/internal.h
> > +++ b/fs/proc/internal.h
> > @@ -45,6 +45,7 @@ struct proc_dir_entry {
> > const struct inode_operations *proc_iops;
> > const struct file_operations *proc_fops;
> > const struct seq_operations *seq_ops;
> > +   size_t state_size;
> 
> "unsigned int" please.
> 
> Where have you seen 4GB priv states?

We're passing the result of sizeof, which happens to be a size_t.
But if it makes you happy I can switch to unsigned int.


Re: simplify procfs code for seq_file instances

2018-04-24 Thread Christoph Hellwig
On Thu, Apr 19, 2018 at 09:57:50PM +0300, Alexey Dobriyan wrote:
> > git://git.infradead.org/users/hch/misc.git proc_create
> 
> 
> I want to ask if it is time to start using poorman function overloading
> with _b_c_e(). There are millions of allocation functions for example,
> all slightly difference, and people will add more. Seeing /proc interfaces
> doubled like this is painful.

Function overloading is totally unacceptable.

And I very much disagree with a tradeoff that keeps 5000 lines of 
code vs a few new helpers.


Re: [PATCH 03/39] proc: introduce proc_create_seq_private

2018-04-24 Thread Christoph Hellwig
On Thu, Apr 19, 2018 at 05:18:18PM +0300, Dan Carpenter wrote:
> > -static const struct file_operations cio_ignore_proc_fops = {
> > -   .open= cio_ignore_proc_open,
> > -   .read= seq_read,
> > -   .llseek  = seq_lseek,
> > -   .release = seq_release_private,
> > -   .write   = cio_ignore_write,
>
> The cio_ignore_write() function isn't used any more so compilers will
> complain.

No compiler in the buildboot farm complained, but neverless this


Re: [PATCH 16/39] ipmi: simplify procfs code

2018-04-24 Thread Christoph Hellwig
On Thu, Apr 19, 2018 at 10:29:29AM -0500, Corey Minyard wrote:
> On 04/19/2018 07:41 AM, Christoph Hellwig wrote:
>> Use remove_proc_subtree to remove the whole subtree on cleanup instead
>> of a hand rolled list of proc entries, unwind the registration loop into
>> individual calls.  Switch to use proc_create_single to further simplify
>> the code.
>
> I'm yanking all the proc code out of the IPMI driver in 3.18.  So this is 
> probably
> not necessary.

Ok, I'll drop this patch.


Re: [PATCH 26/39] rtc/proc: switch to proc_create_single_data

2018-04-24 Thread Christoph Hellwig
On Thu, Apr 19, 2018 at 03:10:27PM +0200, Alexandre Belloni wrote:
> On 19/04/2018 14:41:27+0200, Christoph Hellwig wrote:
> > And stop trying to get a reference on the submodule, procfs code deals
> > with release after and unloaded module and thus removed proc entry.
> > 
> 
> Are you sure about that? The rtc module is not the one adding the procfs
> file so I'm not sure how the procfs code can handle it.

The proc file is removed from this call chain:

  _exit (module_exit handler)
-> rtc_device_unregister
  -> rtc_proc_del_device
-> remove_proc_entry

remove_proc_entry takes care of waiting for currently active file
operation instances and makes sure every new operation never calls
into the actual proc file ops.  Same behavior as in RTC exists all
over the kernel.


[PATCH] mptlan: fix mpt_lan_sdu_send()'s return type

2018-04-24 Thread Luc Van Oostenryck
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.

Fix this by returning 'netdev_tx_t' in this driver too.

Signed-off-by: Luc Van Oostenryck 
---
 drivers/message/fusion/mptlan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/message/fusion/mptlan.c b/drivers/message/fusion/mptlan.c
index 55dd71bbd..4cbed4d06 100644
--- a/drivers/message/fusion/mptlan.c
+++ b/drivers/message/fusion/mptlan.c
@@ -670,7 +670,7 @@ mpt_lan_send_reply(struct net_device *dev, LANSendReply_t 
*pSendRep)
 }
 
 /*=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=*/
-static int
+static netdev_tx_t
 mpt_lan_sdu_send (struct sk_buff *skb, struct net_device *dev)
 {
struct mpt_lan_priv *priv = netdev_priv(dev);
-- 
2.17.0



Re: [PATCH v9 2/5] dt-bindings: scsi: ufs: add document for hisi-ufs

2018-04-24 Thread Rob Herring
On Tue, Apr 17, 2018 at 10:08:11PM +0800, Li Wei wrote:
> add ufs node document for Hisilicon.
> 
> Signed-off-by: Li Wei 
> ---
>  Documentation/devicetree/bindings/ufs/ufs-hisi.txt | 29 
> ++
>  .../devicetree/bindings/ufs/ufshcd-pltfrm.txt  | 10 +---
>  2 files changed, 36 insertions(+), 3 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/ufs/ufs-hisi.txt
> 
> diff --git a/Documentation/devicetree/bindings/ufs/ufs-hisi.txt 
> b/Documentation/devicetree/bindings/ufs/ufs-hisi.txt
> new file mode 100644
> index ..d49ab7d8f31d
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/ufs/ufs-hisi.txt
> @@ -0,0 +1,29 @@
> +* Hisilicon Universal Flash Storage (UFS) Host Controller
> +
> +UFS nodes are defined to describe on-chip UFS hardware macro.
> +Each UFS Host Controller should have its own node.
> +
> +Required properties:
> +- compatible: compatible list, contains one of the following -
> + "hisilicon,hi3660-ufs", "jedec,ufs-1.1" 
> for hisi ufs
> + host controller present on Hi36xx 
> chipset.
> +- reg   : should contain UFS register address space & UFS SYS 
> CTRL register address,
> +- interrupt-parent  : interrupt device
> +- interrupts: interrupt number
> +- resets: reset node register, the "arst" corresponds to reset 
> the APB/AXI bus.

arst belongs in reset-names.

> +- reset-names   : describe reset node register

What happened to clocks? You still have to list which ones apply even if 
documented in the common binding.

> +
> +Example:
> +
> + ufs: ufs@ff3b {
> + compatible = "hisilicon,hi3660-ufs", "jedec,ufs-1.1";
> + /* 0: HCI standard */
> + /* 1: UFS SYS CTRL */
> + reg = <0x0 0xff3b 0x0 0x1000>,
> + <0x0 0xff3b1000 0x0 0x1000>;
> + interrupt-parent = <>;
> + interrupts = ;
> + /* offset: 0x84; bit: 7  */
> + resets = <_rst 0x84 7>;
> + reset-names = "arst";
> + };
> diff --git a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt 
> b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
> index c39dfef76a18..adcfb79f63f5 100644
> --- a/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
> +++ b/Documentation/devicetree/bindings/ufs/ufshcd-pltfrm.txt
> @@ -41,6 +41,8 @@ Optional properties:
>  -lanes-per-direction : number of lanes available per direction - either 1 or 
> 2.
> Note that it is assume same number of lanes is used 
> both
> directions at once. If not specified, default is 2 
> lanes per direction.
> +- resets: reset node register, the "rst" corresponds to reset 
> the whole UFS IP.
> +- reset-names   : describe reset node register

Does your controller have 1 or 2 resets? There's no point in adding this 
here if it doesn't apply to your controller.


>  Note: If above properties are not defined it can be assumed that the supply
>  regulators or clocks are always on.
> @@ -61,9 +63,11 @@ Example:
>   vccq-max-microamp = 20;
>   vccq2-max-microamp = 20;
>  
> - clocks = < 0>, < 0>, < 0>;
> - clock-names = "core_clk", "ref_clk", "iface_clk";
> - freq-table-hz = <1 2>, <0 0>, <0 0>;
> + clocks = < 0>, < 0>, < 0>, < 0>;
> + clock-names = "core_clk", "ref_clk", "phy_clk", "iface_clk";
> + freq-table-hz = <1 2>, <0 0>, <0 0>, <0 0>;
> + resets = < 0 1>;
> + reset-names = "rst";
>   phys = <>;
>   phy-names = "ufsphy";
>   };
> -- 
> 2.15.0
> 


Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)

2018-04-24 Thread Hannes Reinecke

On 04/24/2018 11:09 AM, Steffen Maier wrote:


On 11/04/2016 05:35 PM, Martin K. Petersen wrote:

"Hannes" == Hannes Reinecke  writes:


Hannes> Checking with SAT-3 (section 6.2.4: Commands the SATL queues
Hannes> internally) the implemented behaviour is standards conformant,
Hannes> although the standard also allows for returning 'TASK SET FULL'
Hannes> or 'BUSY' in these cases.  Doing so would nicely solve this
Hannes> issue.

I agree with Hannes that it would be appropriate for the SATL to report
busy when it makes an non-queued command queueable.


Wouldn't this potentially still cause problems if the secure erase takes 
longer than max_retries * scmd_tmo. I.e. the command timing out by 
default after 180 seconds as in 
https://www.spinics.net/lists/linux-block/msg24837.html ?


The fix approach here seems to also handle this gracefully.


Well, yes, of course the command will be terminated after it timed out.
But typically secure erase is invoked from userspace via sg ioctls, and 
it's in the responsibility of the application to set the correct timeout.


Cheers,

Hannes


[PATCH v3 07/14] mpt3sas: Increase event log buffer to support 24 port HBA's.

2018-04-24 Thread Chaitra P B
For 24 port HBA's events generated by IOC are more in certain cases and
the current circular buffer may be overwritten.Hence increased the event
log buffer to accommodate more events.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_ctl.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_ctl.h 
b/drivers/scsi/mpt3sas/mpt3sas_ctl.h
index a44046c..18b46fa 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_ctl.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_ctl.h
@@ -184,7 +184,7 @@ struct mpt3_ioctl_iocinfo {
 
 
 /* number of event log entries */
-#define MPT3SAS_CTL_EVENT_LOG_SIZE (50)
+#define MPT3SAS_CTL_EVENT_LOG_SIZE (200)
 
 /**
  * struct mpt3_ioctl_eventquery - query event count and type
-- 
1.8.3.1



[PATCH v3 00/14] mpt3sas: Enhancements and Defect fixes.

2018-04-24 Thread Chaitra P B
Chaitra P B (14):
  mpt3sas: Bug fix for big endian systems.
  mpt3sas: Pre-allocate RDPQ Array at driver boot time.
  mpt3sas: Lockless access for chain buffers.
  mpt3sas: Optimize I/O memory consumption in driver.
  mpt3sas: Enhanced handling of Sense Buffer.
  mpt3sas: Added support for SAS Device Discovery Error Event.
  mpt3sas: Increase event log buffer to support 24 port HBA's.
  mpt3sas: Allow processing of events during driver unload.
  mpt3sas: Cache enclosure pages during enclosure add.
  mpt3sas: Report Firmware Package Version from HBA Driver.
  mpt3sas: Update MPI Headers
  mpt3sas: For NVME device, issue a protocol level reset instead of
hot reset and use TM timeout value exposed in PCIe Device Page  2.
  mpt3sas: fix possible memory leak.
  mpt3sas: Update driver version "25.100.00.00"

 drivers/scsi/mpt3sas/mpi/mpi2.h  |   9 +-
 drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h |  30 +-
 drivers/scsi/mpt3sas/mpi/mpi2_init.h |   2 +-
 drivers/scsi/mpt3sas/mpi/mpi2_ioc.h  |   7 +-
 drivers/scsi/mpt3sas/mpt3sas_base.c  | 475 +++---
 drivers/scsi/mpt3sas/mpt3sas_base.h  |  60 +++-
 drivers/scsi/mpt3sas/mpt3sas_ctl.c   |  33 ++-
 drivers/scsi/mpt3sas/mpt3sas_ctl.h   |   2 +-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 491 +--
 drivers/scsi/mpt3sas/mpt3sas_warpdrive.c |   3 +-
 10 files changed, 816 insertions(+), 296 deletions(-)

-- 
1.8.3.1



[PATCH v3 10/14] mpt3sas: Report Firmware Package Version from HBA Driver.

2018-04-24 Thread Chaitra P B
Added function _base_display_fwpkg_version, which sends FWUpload
request to pull FW package version from FW Image Header.
Now driver prints FW package version in addition to FW
version if the PackageVersion is valid.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 109 +++-
 drivers/scsi/mpt3sas/mpt3sas_base.h |   1 +
 2 files changed, 108 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index bcd16e4..febd858 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -3825,6 +3825,105 @@ _base_display_OEMs_branding(struct MPT3SAS_ADAPTER *ioc)
 }
 
 /**
+ * _base_display_fwpkg_version - sends FWUpload request to pull FWPkg
+ * version from FW Image Header.
+ * @ioc: per adapter object
+ *
+ * Returns 0 for success, non-zero for failure.
+ */
+   static int
+_base_display_fwpkg_version(struct MPT3SAS_ADAPTER *ioc)
+{
+   Mpi2FWImageHeader_t *FWImgHdr;
+   Mpi25FWUploadRequest_t *mpi_request;
+   Mpi2FWUploadReply_t mpi_reply;
+   int r = 0;
+   void *fwpkg_data = NULL;
+   dma_addr_t fwpkg_data_dma;
+   u16 smid, ioc_status;
+   size_t data_length;
+
+   dinitprintk(ioc, pr_info(MPT3SAS_FMT "%s\n", ioc->name,
+   __func__));
+
+   if (ioc->base_cmds.status & MPT3_CMD_PENDING) {
+   pr_err(MPT3SAS_FMT "%s: internal command already in use\n",
+   ioc->name, __func__);
+   return -EAGAIN;
+   }
+
+   data_length = sizeof(Mpi2FWImageHeader_t);
+   fwpkg_data = pci_alloc_consistent(ioc->pdev, data_length,
+   _data_dma);
+   if (!fwpkg_data) {
+   pr_err(MPT3SAS_FMT "failure at %s:%d/%s()!\n",
+   ioc->name, __FILE__, __LINE__, __func__);
+   return -ENOMEM;
+   }
+
+   smid = mpt3sas_base_get_smid(ioc, ioc->base_cb_idx);
+   if (!smid) {
+   pr_err(MPT3SAS_FMT "%s: failed obtaining a smid\n",
+   ioc->name, __func__);
+   r = -EAGAIN;
+   goto out;
+   }
+
+   ioc->base_cmds.status = MPT3_CMD_PENDING;
+   mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
+   ioc->base_cmds.smid = smid;
+   memset(mpi_request, 0, sizeof(Mpi25FWUploadRequest_t));
+   mpi_request->Function = MPI2_FUNCTION_FW_UPLOAD;
+   mpi_request->ImageType = MPI2_FW_UPLOAD_ITYPE_FW_FLASH;
+   mpi_request->ImageSize = cpu_to_le32(data_length);
+   ioc->build_sg(ioc, _request->SGL, 0, 0, fwpkg_data_dma,
+   data_length);
+   init_completion(>base_cmds.done);
+   mpt3sas_base_put_smid_default(ioc, smid);
+   /* Wait for 15 seconds */
+   wait_for_completion_timeout(>base_cmds.done,
+   FW_IMG_HDR_READ_TIMEOUT*HZ);
+   pr_info(MPT3SAS_FMT "%s: complete\n",
+   ioc->name, __func__);
+   if (!(ioc->base_cmds.status & MPT3_CMD_COMPLETE)) {
+   pr_err(MPT3SAS_FMT "%s: timeout\n",
+   ioc->name, __func__);
+   _debug_dump_mf(mpi_request,
+   sizeof(Mpi25FWUploadRequest_t)/4);
+   r = -ETIME;
+   } else {
+   memset(_reply, 0, sizeof(Mpi2FWUploadReply_t));
+   if (ioc->base_cmds.status & MPT3_CMD_REPLY_VALID) {
+   memcpy(_reply, ioc->base_cmds.reply,
+   sizeof(Mpi2FWUploadReply_t));
+   ioc_status = le16_to_cpu(mpi_reply.IOCStatus) &
+   MPI2_IOCSTATUS_MASK;
+   if (ioc_status == MPI2_IOCSTATUS_SUCCESS) {
+   FWImgHdr = (Mpi2FWImageHeader_t *)fwpkg_data;
+   if (FWImgHdr->PackageVersion.Word) {
+   pr_info(MPT3SAS_FMT "FW Package Version"
+   "(%02d.%02d.%02d.%02d)\n",
+   ioc->name,
+   FWImgHdr->PackageVersion.Struct.Major,
+   FWImgHdr->PackageVersion.Struct.Minor,
+   FWImgHdr->PackageVersion.Struct.Unit,
+   FWImgHdr->PackageVersion.Struct.Dev);
+   }
+   } else {
+   _debug_dump_mf(_reply,
+   sizeof(Mpi2FWUploadReply_t)/4);
+   }
+   }
+   }
+   ioc->base_cmds.status = MPT3_CMD_NOT_USED;
+out:
+   if (fwpkg_data)
+   

[PATCH v3 13/14] mpt3sas: fix possible memory leak.

2018-04-24 Thread Chaitra P B
In ioctl exit path driver refers ioc_list to free memory associated with
diag buffers and event_log pointer used to save events by driver.
If ctl_exit() func is called after unregistering driver, then ioc_list will
be empty and hence driver will not be able to free the allocated memory
which in turn causes memory leak.
So call ctl_exit() function before unregistering mpt3sas driver.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 8ba72a2..8fc2922 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -11276,10 +11276,10 @@ _mpt3sas_exit(void)
pr_info("mpt3sas version %s unloading\n",
MPT3SAS_DRIVER_VERSION);
 
-   pci_unregister_driver(_driver);
-
mpt3sas_ctl_exit(hbas_to_enumerate);
 
+   pci_unregister_driver(_driver);
+
scsih_exit();
 }
 
-- 
1.8.3.1



[PATCH v3 14/14] mpt3sas: Update driver version "25.100.00.00"

2018-04-24 Thread Chaitra P B
Update driver version to match OOB/internal driver version.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h 
b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 43d66c5..f2debe1 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -74,8 +74,8 @@
 #define MPT3SAS_DRIVER_NAME"mpt3sas"
 #define MPT3SAS_AUTHOR "Avago Technologies "
 #define MPT3SAS_DESCRIPTION"LSI MPT Fusion SAS 3.0 Device Driver"
-#define MPT3SAS_DRIVER_VERSION "17.100.00.00"
-#define MPT3SAS_MAJOR_VERSION  17
+#define MPT3SAS_DRIVER_VERSION "25.100.00.00"
+#define MPT3SAS_MAJOR_VERSION  25
 #define MPT3SAS_MINOR_VERSION  100
 #define MPT3SAS_BUILD_VERSION  0
 #define MPT3SAS_RELEASE_VERSION00
-- 
1.8.3.1



[PATCH v3 02/14] mpt3sas: Pre-allocate RDPQ Array at driver boot time.

2018-04-24 Thread Chaitra P B
Instead of allocating RDPQ array (This stores the address's of each RDPQ
pools) at run time, now it will be allocated once during driver load time
and same will be reused during host reset operation also (instead of
allocating & freeing this buffer on the fly during every host reset
operation) and then freed during driver unload.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 57 +++--
 drivers/scsi/mpt3sas/mpt3sas_base.h |  3 ++
 2 files changed, 38 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 9c00185..b23a2eb 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -4159,7 +4159,14 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc)
}
} while (ioc->rdpq_array_enable &&
   (++i < ioc->reply_queue_count));
-
+   if (ioc->reply_post_free_array &&
+   ioc->rdpq_array_enable) {
+   dma_pool_free(ioc->reply_post_free_array_dma_pool,
+   ioc->reply_post_free_array,
+   ioc->reply_post_free_array_dma);
+   ioc->reply_post_free_array = NULL;
+   }
+   dma_pool_destroy(ioc->reply_post_free_array_dma_pool);
dma_pool_destroy(ioc->reply_post_free_dma_pool);
kfree(ioc->reply_post);
}
@@ -4209,7 +4216,7 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
struct mpt3sas_facts *facts;
u16 max_sge_elements;
u16 chains_needed_per_io;
-   u32 sz, total_sz, reply_post_free_sz;
+   u32 sz, total_sz, reply_post_free_sz, reply_post_free_array_sz;
u32 retry_sz;
u16 max_request_credit, nvme_blocks_needed;
unsigned short sg_tablesize;
@@ -4681,6 +4688,28 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
ioc->name, (unsigned long long)ioc->reply_free_dma));
total_sz += sz;
 
+   if (ioc->rdpq_array_enable) {
+   reply_post_free_array_sz = ioc->reply_queue_count *
+   sizeof(Mpi2IOCInitRDPQArrayEntry);
+   ioc->reply_post_free_array_dma_pool =
+   dma_pool_create("reply_post_free_array pool",
+   >pdev->dev, reply_post_free_array_sz, 16, 0);
+   if (!ioc->reply_post_free_array_dma_pool) {
+   dinitprintk(ioc,
+   pr_info(MPT3SAS_FMT "reply_post_free_array pool: "
+   "dma_pool_create failed\n", ioc->name));
+   goto out;
+   }
+   ioc->reply_post_free_array =
+   dma_pool_alloc(ioc->reply_post_free_array_dma_pool,
+   GFP_KERNEL, >reply_post_free_array_dma);
+   if (!ioc->reply_post_free_array) {
+   dinitprintk(ioc,
+   pr_info(MPT3SAS_FMT "reply_post_free_array pool: "
+   "dma_pool_alloc failed\n", ioc->name));
+   goto out;
+   }
+   }
ioc->config_page_sz = 512;
ioc->config_page = pci_alloc_consistent(ioc->pdev,
ioc->config_page_sz, >config_page_dma);
@@ -5487,8 +5516,6 @@ _base_send_ioc_init(struct MPT3SAS_ADAPTER *ioc)
ktime_t current_time;
u16 ioc_status;
u32 reply_post_free_array_sz = 0;
-   Mpi2IOCInitRDPQArrayEntry *reply_post_free_array = NULL;
-   dma_addr_t reply_post_free_array_dma;
 
dinitprintk(ioc, pr_info(MPT3SAS_FMT "%s\n", ioc->name,
__func__));
@@ -5522,23 +5549,14 @@ _base_send_ioc_init(struct MPT3SAS_ADAPTER *ioc)
if (ioc->rdpq_array_enable) {
reply_post_free_array_sz = ioc->reply_queue_count *
sizeof(Mpi2IOCInitRDPQArrayEntry);
-   reply_post_free_array = pci_alloc_consistent(ioc->pdev,
-   reply_post_free_array_sz, _post_free_array_dma);
-   if (!reply_post_free_array) {
-   pr_err(MPT3SAS_FMT
-   "reply_post_free_array: pci_alloc_consistent failed\n",
-   ioc->name);
-   r = -ENOMEM;
-   goto out;
-   }
-   memset(reply_post_free_array, 0, reply_post_free_array_sz);
+   memset(ioc->reply_post_free_array, 0, reply_post_free_array_sz);
for (i = 0; i < ioc->reply_queue_count; i++)
-   reply_post_free_array[i].RDPQBaseAddress =
+   ioc->reply_post_free_array[i].RDPQBaseAddress =
cpu_to_le64(

[PATCH v3 04/14] mpt3sas: Optimize I/O memory consumption in driver.

2018-04-24 Thread Chaitra P B
For every IO, memory of PAGE size is allocated for handling NVMe native
PRPS. And in addition to that for every IO (chains need per IO * chain
buffer size, e.g. 38 * 128byte) amount of memory is allocated for chain
buffers.

However, at any point of time; the IO request can be for NVMe target device
(where PRP's page is used for framing PRP's) or can be for SCSI target
device (where chain buffers are used for framing chain SGE's). This patch
modifies the driver to reuse same pre-allocated PRP page buffers as a chain
buffer for IO's targeted for SCSI target devices. No need to allocate
separate buffers for chain SGE's buffers.

Suppose if the number of chain buffers need for IO doesn't fit in the PRP
Page size then driver maintain's separate buffers for those extra chain
buffers that exceeds the PRP page size. For example consider PRP page size
as 4K and chain buffer size as 128 bytes, then number of chain buffers that
can fit in PRP page is 4096/128 => 32. if the number of chain buffer need
per IO exceeds 32; for example consider number of chains need per IO is 36
then for remaining 4 chain buffer's driver allocates them individual.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 80 +++--
 1 file changed, 51 insertions(+), 29 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 438da7c..415e7f0 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -4188,7 +4188,8 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc)
kfree(ioc->internal_lookup);
if (ioc->chain_lookup) {
for (i = 0; i < ioc->scsiio_depth; i++) {
-   for (j = 0; j < ioc->chains_needed_per_io; j++) {
+   for (j = ioc->chains_per_prp_buffer;
+   j < ioc->chains_needed_per_io; j++) {
ct = >chain_lookup[i].chains_per_smid[j];
if (ct && ct->chain_buffer)
dma_pool_free(ioc->chain_dma_pool,
@@ -4506,7 +4507,7 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
ioc->chain_lookup = kzalloc(sz, GFP_KERNEL);
if (!ioc->chain_lookup) {
pr_err(MPT3SAS_FMT "chain_lookup: __get_free_pages "
-   "failed\n", ioc->name);
+   "failed\n", ioc->name);
goto out;
}
 
@@ -4520,33 +4521,6 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
}
}
 
-   ioc->chain_dma_pool = dma_pool_create("chain pool", >pdev->dev,
-   ioc->chain_segment_sz, 16, 0);
-   if (!ioc->chain_dma_pool) {
-   pr_err(MPT3SAS_FMT "chain_dma_pool: dma_pool_create failed\n",
-   ioc->name);
-   goto out;
-   }
-   for (i = 0; i < ioc->scsiio_depth; i++) {
-   for (j = 0; j < ioc->chains_needed_per_io; j++) {
-   ct = >chain_lookup[i].chains_per_smid[j];
-   ct->chain_buffer = dma_pool_alloc(
-   ioc->chain_dma_pool , GFP_KERNEL,
-   >chain_buffer_dma);
-   if (!ct->chain_buffer) {
-   pr_err(MPT3SAS_FMT "chain_lookup: "
-   " pci_pool_alloc failed\n", ioc->name);
-   goto out;
-   }
-   }
-   total_sz += ioc->chain_segment_sz;
-   }
-
-   dinitprintk(ioc, pr_info(MPT3SAS_FMT
-   "chain pool depth(%d), frame_size(%d), pool_size(%d kB)\n",
-   ioc->name, ioc->chain_depth, ioc->chain_segment_sz,
-   ((ioc->chain_depth *  ioc->chain_segment_sz))/1024));
-
/* initialize hi-priority queue smid's */
ioc->hpr_lookup = kcalloc(ioc->hi_priority_depth,
sizeof(struct request_tracker), GFP_KERNEL);
@@ -4587,6 +4561,7 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
 * be required for NVMe PRP's, only each set of NVMe blocks will be
 * contiguous, so a new set is allocated for each possible I/O.
 */
+   ioc->chains_per_prp_buffer = 0;
if (ioc->facts.ProtocolFlags & MPI2_IOCFACTS_PROTOCOL_NVME_DEVICES) {
nvme_blocks_needed =
(ioc->shost->sg_tablesize * NVME_PRP_SIZE) - 1;
@@ -4609,6 +4584,11 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
ioc->name);
goto out;
}
+
+   ioc->chains_per_prp_buffer = sz/ioc->chain_segment_sz;
+   ioc->chains_per_prp_buffer = min(ioc->chains_per_prp_buffer,
+   ioc->chains_needed_per_io);
+

[PATCH v3 06/14] mpt3sas: Added support for SAS Device Discovery Error Event.

2018-04-24 Thread Chaitra P B
The SAS Device Discovery Error Event is sent to the host when
discovery for a particular device is failed during discovery,
even after maximum retries by the IOC.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c  |  4 
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 42 
 2 files changed, 46 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 0b44ff7..93bf710 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -1030,6 +1030,9 @@ _base_display_event_data(struct MPT3SAS_ADAPTER *ioc,
case MPI2_EVENT_ACTIVE_CABLE_EXCEPTION:
desc = "Cable Event";
break;
+   case MPI2_EVENT_SAS_DEVICE_DISCOVERY_ERROR:
+   desc = "SAS Device Discovery Error";
+   break;
case MPI2_EVENT_PCIE_DEVICE_STATUS_CHANGE:
desc = "PCIE Device Status Change";
break;
@@ -6596,6 +6599,7 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc)
_base_unmask_events(ioc, MPI2_EVENT_LOG_ENTRY_ADDED);
_base_unmask_events(ioc, MPI2_EVENT_TEMP_THRESHOLD);
_base_unmask_events(ioc, MPI2_EVENT_ACTIVE_CABLE_EXCEPTION);
+   _base_unmask_events(ioc, MPI2_EVENT_SAS_DEVICE_DISCOVERY_ERROR);
if (ioc->hba_mpi_version_belonged == MPI26_VERSION) {
if (ioc->is_gen35_ioc) {
_base_unmask_events(ioc,
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index c9cce65..796bf33 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -7524,6 +7524,44 @@ _scsih_sas_discovery_event(struct MPT3SAS_ADAPTER *ioc,
 }
 
 /**
+ * _scsih_sas_device_discovery_error_event - display SAS device discovery error
+ * events
+ * @ioc: per adapter object
+ * @fw_event: The fw_event_work object
+ * Context: user.
+ *
+ * Return nothing.
+ */
+static void
+_scsih_sas_device_discovery_error_event(struct MPT3SAS_ADAPTER *ioc,
+   struct fw_event_work *fw_event)
+{
+   Mpi25EventDataSasDeviceDiscoveryError_t *event_data =
+   (Mpi25EventDataSasDeviceDiscoveryError_t *)fw_event->event_data;
+
+   switch (event_data->ReasonCode) {
+   case MPI25_EVENT_SAS_DISC_ERR_SMP_FAILED:
+   pr_warn(MPT3SAS_FMT "SMP command sent to the expander"
+   "(handle:0x%04x, sas_address:0x%016llx,"
+   "physical_port:0x%02x) has failed",
+   ioc->name, le16_to_cpu(event_data->DevHandle),
+   (unsigned long long)le64_to_cpu(event_data->SASAddress),
+   event_data->PhysicalPort);
+   break;
+   case MPI25_EVENT_SAS_DISC_ERR_SMP_TIMEOUT:
+   pr_warn(MPT3SAS_FMT "SMP command sent to the expander"
+   "(handle:0x%04x, sas_address:0x%016llx,"
+   "physical_port:0x%02x) has timed out",
+   ioc->name, le16_to_cpu(event_data->DevHandle),
+   (unsigned long long)le64_to_cpu(event_data->SASAddress),
+   event_data->PhysicalPort);
+   break;
+   default:
+   break;
+   }
+}
+
+/**
  * _scsih_pcie_enumeration_event - handle enumeration events
  * @ioc: per adapter object
  * @fw_event: The fw_event_work object
@@ -9350,6 +9388,9 @@ _mpt3sas_fw_work(struct MPT3SAS_ADAPTER *ioc, struct 
fw_event_work *fw_event)
case MPI2_EVENT_SAS_DISCOVERY:
_scsih_sas_discovery_event(ioc, fw_event);
break;
+   case MPI2_EVENT_SAS_DEVICE_DISCOVERY_ERROR:
+   _scsih_sas_device_discovery_error_event(ioc, fw_event);
+   break;
case MPI2_EVENT_SAS_BROADCAST_PRIMITIVE:
_scsih_sas_broadcast_primitive_event(ioc, fw_event);
break;
@@ -9534,6 +9575,7 @@ mpt3sas_scsih_event_callback(struct MPT3SAS_ADAPTER *ioc, 
u8 msix_index,
case MPI2_EVENT_SAS_DEVICE_STATUS_CHANGE:
case MPI2_EVENT_IR_OPERATION_STATUS:
case MPI2_EVENT_SAS_DISCOVERY:
+   case MPI2_EVENT_SAS_DEVICE_DISCOVERY_ERROR:
case MPI2_EVENT_SAS_ENCL_DEVICE_STATUS_CHANGE:
case MPI2_EVENT_IR_PHYSICAL_DISK:
case MPI2_EVENT_PCIE_ENUMERATION:
-- 
1.8.3.1



[PATCH v3 05/14] mpt3sas: Enhanced handling of Sense Buffer.

2018-04-24 Thread Chaitra P B
Enhanced DMA allocation for Sense Buffer, if the allocation does not fit
within same 4GB.Introduced is_MSB_are_same function to check if allocted
buffer within 4GB range or not.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 56 +
 1 file changed, 56 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 415e7f0..0b44ff7 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -4205,6 +4205,31 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc)
 }
 
 /**
+ * is_MSB_are_same - checks whether all reply queues in a set are
+ * having same upper 32bits in their base memory address.
+ * @reply_pool_start_address: Base address of a reply queue set
+ * @pool_sz: Size of single Reply Descriptor Post Queues pool size
+ *
+ * Returns 1 if reply queues in a set have a same upper 32bits
+ * in their base memory address,
+ * else 0
+ */
+
+static int
+is_MSB_are_same(long reply_pool_start_address, u32 pool_sz)
+{
+   long reply_pool_end_address;
+
+   reply_pool_end_address = reply_pool_start_address + pool_sz;
+
+   if (upper_32_bits(reply_pool_start_address) ==
+   upper_32_bits(reply_pool_end_address))
+   return 1;
+   else
+   return 0;
+}
+
+/**
  * _base_allocate_memory_pools - allocate start of day memory pools
  * @ioc: per adapter object
  *
@@ -4664,6 +4689,37 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
ioc->name);
goto out;
}
+   /* sense buffer requires to be in same 4 gb region.
+* Below function will check the same.
+* In case of failure, new pci pool will be created with updated
+* alignment. Older allocation and pool will be destroyed.
+* Alignment will be used such a way that next allocation if
+* success, will always meet same 4gb region requirement.
+* Actual requirement is not alignment, but we need start and end of
+* DMA address must have same upper 32 bit address.
+*/
+   if (!is_MSB_are_same((long)ioc->sense, sz)) {
+   //Release Sense pool & Reallocate
+   dma_pool_free(ioc->sense_dma_pool, ioc->sense, ioc->sense_dma);
+   dma_pool_destroy(ioc->sense_dma_pool);
+   ioc->sense = NULL;
+
+   ioc->sense_dma_pool =
+   dma_pool_create("sense pool", >pdev->dev, sz,
+   roundup_pow_of_two(sz), 0);
+   if (!ioc->sense_dma_pool) {
+   pr_err(MPT3SAS_FMT "sense pool: pci_pool_create 
failed\n",
+   ioc->name);
+   goto out;
+   }
+   ioc->sense = dma_pool_alloc(ioc->sense_dma_pool, GFP_KERNEL,
+   >sense_dma);
+   if (!ioc->sense) {
+   pr_err(MPT3SAS_FMT "sense pool: pci_pool_alloc 
failed\n",
+   ioc->name);
+   goto out;
+   }
+   }
dinitprintk(ioc, pr_info(MPT3SAS_FMT
"sense pool(0x%p): depth(%d), element_size(%d), pool_size"
"(%d kB)\n", ioc->name, ioc->sense, ioc->scsiio_depth,
-- 
1.8.3.1



[PATCH v3 01/14] mpt3sas: Bug fix for big endian systems.

2018-04-24 Thread Chaitra P B
This patch fixes bug for big endian systems.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpi/mpi2_init.h |  2 +-
 drivers/scsi/mpt3sas/mpt3sas_base.c  | 55 -
 drivers/scsi/mpt3sas/mpt3sas_base.h  |  4 +--
 drivers/scsi/mpt3sas/mpt3sas_ctl.c   | 11 +++---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 59 +++-
 drivers/scsi/mpt3sas/mpt3sas_warpdrive.c |  3 +-
 6 files changed, 71 insertions(+), 63 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpi/mpi2_init.h 
b/drivers/scsi/mpt3sas/mpi/mpi2_init.h
index 948a3ba..6213ce6 100644
--- a/drivers/scsi/mpt3sas/mpi/mpi2_init.h
+++ b/drivers/scsi/mpt3sas/mpi/mpi2_init.h
@@ -75,7 +75,7 @@
 
 typedef struct _MPI2_SCSI_IO_CDB_EEDP32 {
U8 CDB[20]; /*0x00 */
-   U32 PrimaryReferenceTag;/*0x14 */
+   __be32 PrimaryReferenceTag; /*0x14 */
U16 PrimaryApplicationTag;  /*0x18 */
U16 PrimaryApplicationTagMask;  /*0x1A */
U32 TransferLength; /*0x1C */
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 0a0e7aa..9c00185 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -394,13 +394,14 @@ static void _clone_sg_entries(struct MPT3SAS_ADAPTER *ioc,
buff_ptr_phys = buffer_iomem_phys;
WARN_ON(buff_ptr_phys > U32_MAX);
 
-   if (sgel->FlagsLength &
+   if (le32_to_cpu(sgel->FlagsLength) &
(MPI2_SGE_FLAGS_HOST_TO_IOC << MPI2_SGE_FLAGS_SHIFT))
is_write = 1;
 
for (i = 0; i < MPT_MIN_PHYS_SEGMENTS + ioc->facts.MaxChainDepth; i++) {
 
-   sgl_flags = (sgel->FlagsLength >> MPI2_SGE_FLAGS_SHIFT);
+   sgl_flags =
+   (le32_to_cpu(sgel->FlagsLength) >> MPI2_SGE_FLAGS_SHIFT);
 
switch (sgl_flags & MPI2_SGE_FLAGS_ELEMENT_MASK) {
case MPI2_SGE_FLAGS_CHAIN_ELEMENT:
@@ -411,7 +412,7 @@ static void _clone_sg_entries(struct MPT3SAS_ADAPTER *ioc,
 */
sgel_next =
_base_get_chain_buffer_dma_to_chain_buffer(ioc,
-   sgel->Address);
+   le32_to_cpu(sgel->Address));
if (sgel_next == NULL)
return;
/*
@@ -426,7 +427,8 @@ static void _clone_sg_entries(struct MPT3SAS_ADAPTER *ioc,
dst_addr_phys = _base_get_chain_phys(ioc,
smid, sge_chain_count);
WARN_ON(dst_addr_phys > U32_MAX);
-   sgel->Address = (u32)dst_addr_phys;
+   sgel->Address =
+   cpu_to_le32(lower_32_bits(dst_addr_phys));
sgel = sgel_next;
sge_chain_count++;
break;
@@ -435,22 +437,28 @@ static void _clone_sg_entries(struct MPT3SAS_ADAPTER *ioc,
if (is_scsiio_req) {
_base_clone_to_sys_mem(buff_ptr,
sg_virt(sg_scmd),
-   (sgel->FlagsLength & 0x00ff));
+   (le32_to_cpu(sgel->FlagsLength) &
+   0x00ff));
/*
 * FIXME: this relies on a a zero
 * PCI mem_offset.
 */
-   sgel->Address = (u32)buff_ptr_phys;
+   sgel->Address =
+   cpu_to_le32((u32)buff_ptr_phys);
} else {
_base_clone_to_sys_mem(buff_ptr,
ioc->config_vaddr,
-   (sgel->FlagsLength & 0x00ff));
-   sgel->Address = (u32)buff_ptr_phys;
+   (le32_to_cpu(sgel->FlagsLength) &
+   0x00ff));
+   sgel->Address =
+   cpu_to_le32((u32)buff_ptr_phys);
}
}
-   buff_ptr += (sgel->FlagsLength & 0x00ff);
-   buff_ptr_phys += (sgel->FlagsLength & 0x00ff);
-   if ((sgel->FlagsLength &
+   buff_ptr += (le32_to_cpu(sgel->FlagsLength) &
+   

[PATCH v3 12/14] mpt3sas: For NVME device, issue a protocol level reset instead of hot reset and use TM timeout value exposed in PCIe Device Page 2.

2018-04-24 Thread Chaitra P B
1)Manufacturing Page 11 contains parameters to control
internal firmware behavior. Based on AddlFlags2 field
FW/Driver behaviour can be changed, (flag tm_custom_handling
is used for this)

a) For PCIe device, protocol level reset should be used if
flag tm_custom_handling is 0.
Since Abort Task Set, LUN reset and Target reset will result
in a protocol level reset. Drivers should issue only one type
of this reset, if that fails then it should escalate to a controller
reset (diag reset/OCR).
b) If the driver has control over the TM reset timeout value, then
driver should use the value exposed in PCIe Device Page 2 for pcie device
(field ControllerResetTO).

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c  | 13 ++
 drivers/scsi/mpt3sas/mpt3sas_base.h  | 26 ++--
 drivers/scsi/mpt3sas/mpt3sas_ctl.c   | 22 +--
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 76 ++--
 4 files changed, 116 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index febd858..600c717 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -4139,6 +4139,7 @@ _base_static_config_pages(struct MPT3SAS_ADAPTER *ioc)
Mpi2ConfigReply_t mpi_reply;
u32 iounit_pg1_flags;
 
+   ioc->nvme_abort_timeout = 30;
mpt3sas_config_get_manufacturing_pg0(ioc, _reply, >manu_pg0);
if (ioc->ir_firmware)
mpt3sas_config_get_manufacturing_pg10(ioc, _reply,
@@ -4157,6 +4158,18 @@ _base_static_config_pages(struct MPT3SAS_ADAPTER *ioc)
mpt3sas_config_set_manufacturing_pg11(ioc, _reply,
>manu_pg11);
}
+   if (ioc->manu_pg11.AddlFlags2 & NVME_TASK_MNGT_CUSTOM_MASK)
+   ioc->tm_custom_handling = 1;
+   else {
+   ioc->tm_custom_handling = 0;
+   if (ioc->manu_pg11.NVMeAbortTO < NVME_TASK_ABORT_MIN_TIMEOUT)
+   ioc->nvme_abort_timeout = NVME_TASK_ABORT_MIN_TIMEOUT;
+   else if (ioc->manu_pg11.NVMeAbortTO >
+   NVME_TASK_ABORT_MAX_TIMEOUT)
+   ioc->nvme_abort_timeout = NVME_TASK_ABORT_MAX_TIMEOUT;
+   else
+   ioc->nvme_abort_timeout = ioc->manu_pg11.NVMeAbortTO;
+   }
 
mpt3sas_config_get_bios_pg2(ioc, _reply, >bios_pg2);
mpt3sas_config_get_bios_pg3(ioc, _reply, >bios_pg3);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h 
b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 49333b9..43d66c5 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -146,8 +146,12 @@
 #defineNVME_CMD_PRP1_OFFSET24  /* PRP1 offset in NVMe 
cmd */
 #defineNVME_CMD_PRP2_OFFSET32  /* PRP2 offset in NVMe 
cmd */
 #defineNVME_ERROR_RESPONSE_SIZE16  /* Max NVME Error 
Response */
+#define NVME_TASK_ABORT_MIN_TIMEOUT6
+#define NVME_TASK_ABORT_MAX_TIMEOUT60
+#define NVME_TASK_MNGT_CUSTOM_MASK (0x0010)
 #defineNVME_PRP_PAGE_SIZE  4096/* Page size */
 
+
 /*
  * reset phases
  */
@@ -363,7 +367,15 @@ struct Mpi2ManufacturingPage11_t {
u8  EEDPTagMode;/* 09h */
u8  Reserved3;  /* 0Ah */
u8  Reserved4;  /* 0Bh */
-   __le32  Reserved5[23];  /* 0Ch-60h*/
+   __le32  Reserved5[8];   /* 0Ch-2Ch */
+   u16 AddlFlags2; /* 2Ch */
+   u8  AddlFlags3; /* 2Eh */
+   u8  Reserved6;  /* 2Fh */
+   __le32  Reserved7[7];   /* 30h - 4Bh */
+   u8  NVMeAbortTO;/* 4Ch */
+   u8  Reserved8;  /* 4Dh */
+   u16 Reserved9;  /* 4Eh */
+   __le32  Reserved10[4];  /* 50h - 60h */
 };
 
 /**
@@ -573,6 +585,7 @@ struct _pcie_device {
u8  enclosure_level;
u8  connector_name[4];
u8  *serial_number;
+   u8  reset_timeout;
struct kref refcount;
 };
 /**
@@ -1211,6 +1224,10 @@ struct MPT3SAS_ADAPTER {
void*event_log;
u32 event_masks[MPI2_EVENT_NOTIFY_EVENTMASK_WORDS];
 
+   u8  tm_custom_handling;
+   u8  nvme_abort_timeout;
+
+
/* static config pages */
struct mpt3sas_facts facts;
struct mpt3sas_port_facts *pfacts;
@@ -1470,10 +1487,11 @@ u8 mpt3sas_scsih_event_callback(struct MPT3SAS_ADAPTER 
*ioc, u8 msix_index,
u32 reply);
 void mpt3sas_scsih_reset_handler(struct MPT3SAS_ADAPTER *ioc, int reset_phase);
 
-int mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 

[PATCH v3 09/14] mpt3sas: Cache enclosure pages during enclosure add.

2018-04-24 Thread Chaitra P B
In function _scsih_add_device,
for each device connected to an enclosure, driver reads the
enclosure page(To get details like enclosure handle,
enclosure logical ID, enclosure level etc.)

With this patch, instead of reading enclosure page everytime,
driver maintains a list for enclosure device(During enclosure
add event, enclosure device is added to the list and removed
from the list on delete events) and uses the enclosure page
from the list.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c  |  22 +++
 drivers/scsi/mpt3sas/mpt3sas_base.h  |  14 ++
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 296 +++
 3 files changed, 236 insertions(+), 96 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 93bf710..bcd16e4 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -4087,6 +4087,27 @@ _base_static_config_pages(struct MPT3SAS_ADAPTER *ioc)
 }
 
 /**
+ * mpt3sas_free_enclosure_list - release memory
+ * @ioc: per adapter object
+ *
+ * Free memory allocated during encloure add.
+ *
+ * Return nothing.
+ */
+void
+mpt3sas_free_enclosure_list(struct MPT3SAS_ADAPTER *ioc)
+{
+   struct _enclosure_node *enclosure_dev, *enclosure_dev_next;
+
+   /* Free enclosure list */
+   list_for_each_entry_safe(enclosure_dev,
+   enclosure_dev_next, >enclosure_list, list) {
+   list_del(_dev->list);
+   kfree(enclosure_dev);
+   }
+}
+
+/**
  * _base_release_memory_pools - release memory
  * @ioc: per adapter object
  *
@@ -,6 +6687,7 @@ mpt3sas_base_detach(struct MPT3SAS_ADAPTER *ioc)
mpt3sas_base_stop_watchdog(ioc);
mpt3sas_base_free_resources(ioc);
_base_release_memory_pools(ioc);
+   mpt3sas_free_enclosure_list(ioc);
pci_set_drvdata(ioc->pdev, NULL);
kfree(ioc->cpu_msix_table);
if (ioc->is_warpdrive)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h 
b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 01eb910..ee8ce7e 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -741,6 +741,17 @@ struct _sas_node {
struct list_head sas_port_list;
 };
 
+
+/**
+ * struct _enclosure_node - enclosure information
+ * @list: list of enclosures
+ * @pg0: enclosure pg0;
+ */
+struct _enclosure_node {
+   struct list_head list;
+   Mpi2SasEnclosurePage0_t pg0;
+};
+
 /**
  * enum reset_type - reset state
  * @FORCE_BIG_HAMMER: issue diagnostic reset
@@ -1013,6 +1024,7 @@ typedef void (*MPT3SAS_FLUSH_RUNNING_CMDS)(struct 
MPT3SAS_ADAPTER *ioc);
  * @iounit_pg8: static iounit page 8
  * @sas_hba: sas host object
  * @sas_expander_list: expander object list
+ * @enclosure_list: enclosure object list
  * @sas_node_lock:
  * @sas_device_list: sas device object list
  * @sas_device_init_list: sas device object list (used only at init time)
@@ -1218,6 +1230,7 @@ struct MPT3SAS_ADAPTER {
/* sas hba, expander, and device list */
struct _sas_node sas_hba;
struct list_head sas_expander_list;
+   struct list_head enclosure_list;
spinlock_t  sas_node_lock;
struct list_head sas_device_list;
struct list_head sas_device_init_list;
@@ -1391,6 +1404,7 @@ int mpt3sas_base_attach(struct MPT3SAS_ADAPTER *ioc);
 void mpt3sas_base_detach(struct MPT3SAS_ADAPTER *ioc);
 int mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER *ioc);
 void mpt3sas_base_free_resources(struct MPT3SAS_ADAPTER *ioc);
+void mpt3sas_free_enclosure_list(struct MPT3SAS_ADAPTER *ioc);
 int mpt3sas_base_hard_reset_handler(struct MPT3SAS_ADAPTER *ioc,
enum reset_type type);
 
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 33587a8..bc8e7f0 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -1362,6 +1362,30 @@ mpt3sas_scsih_expander_find_by_handle(struct 
MPT3SAS_ADAPTER *ioc, u16 handle)
 }
 
 /**
+ * mpt3sas_scsih_enclosure_find_by_handle - exclosure device search
+ * @ioc: per adapter object
+ * @handle: enclosure handle (assigned by firmware)
+ * Context: Calling function should acquire ioc->sas_device_lock
+ *
+ * This searches for enclosure device based on handle, then returns the
+ * enclosure object.
+ */
+static struct _enclosure_node *
+mpt3sas_scsih_enclosure_find_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
+{
+   struct _enclosure_node *enclosure_dev, *r;
+
+   r = NULL;
+   list_for_each_entry(enclosure_dev, >enclosure_list, list) {
+   if (le16_to_cpu(enclosure_dev->pg0.EnclosureHandle) != handle)
+   continue;
+   r = enclosure_dev;
+   goto out;
+   }
+out:
+   return r;
+}
+/**
  * mpt3sas_scsih_expander_find_by_sas_address - expander device 

[PATCH v3 03/14] mpt3sas: Lockless access for chain buffers.

2018-04-24 Thread Chaitra P B
Introduces Chain lookup table/tracker and implements accessing chain buffer
using smid.
Removed link list based access of chain buffer which requires lock and
allocated as many chains needed.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 111 +++-
 drivers/scsi/mpt3sas/mpt3sas_base.h |   8 ++-
 2 files changed, 65 insertions(+), 54 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c 
b/drivers/scsi/mpt3sas/mpt3sas_base.c
index b23a2eb..438da7c 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -297,12 +297,15 @@ static void *
 _base_get_chain_buffer_dma_to_chain_buffer(struct MPT3SAS_ADAPTER *ioc,
dma_addr_t chain_buffer_dma)
 {
-   u16 index;
-
-   for (index = 0; index < ioc->chain_depth; index++) {
-   if (ioc->chain_lookup[index].chain_buffer_dma ==
-   chain_buffer_dma)
-   return ioc->chain_lookup[index].chain_buffer;
+   u16 index, j;
+   struct chain_tracker *ct;
+
+   for (index = 0; index < ioc->scsiio_depth; index++) {
+   for (j = 0; j < ioc->chains_needed_per_io; j++) {
+   ct = >chain_lookup[index].chains_per_smid[j];
+   if (ct && ct->chain_buffer_dma == chain_buffer_dma)
+   return ct->chain_buffer;
+   }
}
pr_info(MPT3SAS_FMT
"Provided chain_buffer_dma address is not in the lookup list\n",
@@ -1679,7 +1682,8 @@ _base_add_sg_single_64(void *paddr, u32 flags_length, 
dma_addr_t dma_addr)
  * @ioc: per adapter object
  * @scmd: SCSI commands of the IO request
  *
- * Returns chain tracker(from ioc->free_chain_list)
+ * Returns chain tracker from chain_lookup table using key as
+ * smid and smid's chain_offset.
  */
 static struct chain_tracker *
 _base_get_chain_buffer_tracker(struct MPT3SAS_ADAPTER *ioc,
@@ -1687,20 +1691,15 @@ _base_get_chain_buffer_tracker(struct MPT3SAS_ADAPTER 
*ioc,
 {
struct chain_tracker *chain_req;
struct scsiio_tracker *st = scsi_cmd_priv(scmd);
-   unsigned long flags;
+   u16 smid = st->smid;
+   u8 chain_offset =
+  atomic_read(>chain_lookup[smid - 1].chain_offset);
 
-   spin_lock_irqsave(>scsi_lookup_lock, flags);
-   if (list_empty(>free_chain_list)) {
-   spin_unlock_irqrestore(>scsi_lookup_lock, flags);
-   dfailprintk(ioc, pr_warn(MPT3SAS_FMT
-   "chain buffers not available\n", ioc->name));
+   if (chain_offset == ioc->chains_needed_per_io)
return NULL;
-   }
-   chain_req = list_entry(ioc->free_chain_list.next,
-   struct chain_tracker, tracker_list);
-   list_del_init(_req->tracker_list);
-   list_add_tail(_req->tracker_list, >chain_list);
-   spin_unlock_irqrestore(>scsi_lookup_lock, flags);
+
+   chain_req = >chain_lookup[smid - 1].chains_per_smid[chain_offset];
+   atomic_inc(>chain_lookup[smid - 1].chain_offset);
return chain_req;
 }
 
@@ -3278,13 +3277,7 @@ void mpt3sas_base_clear_st(struct MPT3SAS_ADAPTER *ioc,
return;
st->cb_idx = 0xFF;
st->direct_io = 0;
-   if (!list_empty(>chain_list)) {
-   unsigned long flags;
-
-   spin_lock_irqsave(>scsi_lookup_lock, flags);
-   list_splice_init(>chain_list, >free_chain_list);
-   spin_unlock_irqrestore(>scsi_lookup_lock, flags);
-   }
+   atomic_set(>chain_lookup[st->smid - 1].chain_offset, 0);
 }
 
 /**
@@ -4102,6 +4095,8 @@ static void
 _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc)
 {
int i = 0;
+   int j = 0;
+   struct chain_tracker *ct;
struct reply_post_struct *rps;
 
dexitprintk(ioc, pr_info(MPT3SAS_FMT "%s\n", ioc->name,
@@ -4192,14 +4187,18 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc)
kfree(ioc->hpr_lookup);
kfree(ioc->internal_lookup);
if (ioc->chain_lookup) {
-   for (i = 0; i < ioc->chain_depth; i++) {
-   if (ioc->chain_lookup[i].chain_buffer)
-   dma_pool_free(ioc->chain_dma_pool,
-   ioc->chain_lookup[i].chain_buffer,
-   ioc->chain_lookup[i].chain_buffer_dma);
+   for (i = 0; i < ioc->scsiio_depth; i++) {
+   for (j = 0; j < ioc->chains_needed_per_io; j++) {
+   ct = >chain_lookup[i].chains_per_smid[j];
+   if (ct && ct->chain_buffer)
+   dma_pool_free(ioc->chain_dma_pool,
+   ct->chain_buffer,
+   ct->chain_buffer_dma);

[PATCH v3 11/14] mpt3sas: Update MPI Headers

2018-04-24 Thread Chaitra P B
Update MPI Files to support protocol level reset for NVMe device.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpi/mpi2.h  |  9 ++---
 drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h | 30 --
 drivers/scsi/mpt3sas/mpi/mpi2_ioc.h  |  7 ++-
 3 files changed, 36 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpi/mpi2.h b/drivers/scsi/mpt3sas/mpi/mpi2.h
index b015c30..1e45268 100644
--- a/drivers/scsi/mpt3sas/mpi/mpi2.h
+++ b/drivers/scsi/mpt3sas/mpi/mpi2.h
@@ -9,7 +9,7 @@
  * scatter/gather formats.
  * Creation Date:  June 21, 2006
  *
- * mpi2.h Version:  02.00.48
+ *  mpi2.h Version:  02.00.50
  *
  * NOTE: Names (typedefs, defines, etc.) beginning with an MPI25 or Mpi25
  *   prefix are for use only on MPI v2.5 products, and must not be used
@@ -114,6 +114,8 @@
  * 09-02-16  02.00.46  Bumped MPI2_HEADER_VERSION_UNIT.
  * 11-23-16  02.00.47  Bumped MPI2_HEADER_VERSION_UNIT.
  * 02-03-17  02.00.48  Bumped MPI2_HEADER_VERSION_UNIT.
+ * 06-13-17  02.00.49  Bumped MPI2_HEADER_VERSION_UNIT.
+ * 09-29-17  02.00.50  Bumped MPI2_HEADER_VERSION_UNIT.
  * --
  */
 
@@ -152,8 +154,9 @@
MPI26_VERSION_MINOR)
 #define MPI2_VERSION_02_06 (0x0206)
 
-/*Unit and Dev versioning for this MPI header set */
-#define MPI2_HEADER_VERSION_UNIT(0x30)
+
+/* Unit and Dev versioning for this MPI header set */
+#define MPI2_HEADER_VERSION_UNIT(0x32)
 #define MPI2_HEADER_VERSION_DEV (0x00)
 #define MPI2_HEADER_VERSION_UNIT_MASK   (0xFF00)
 #define MPI2_HEADER_VERSION_UNIT_SHIFT  (8)
diff --git a/drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h 
b/drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h
index 0ad88de..5122920 100644
--- a/drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h
+++ b/drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h
@@ -7,7 +7,7 @@
  * Title:  MPI Configuration messages and pages
  * Creation Date:  November 10, 2006
  *
- *   mpi2_cnfg.h Version:  02.00.40
+ *mpi2_cnfg.h Version:  02.00.42
  *
  * NOTE: Names (typedefs, defines, etc.) beginning with an MPI25 or Mpi25
  *   prefix are for use only on MPI v2.5 products, and must not be used
@@ -219,6 +219,18 @@
  * Added ChassisSlot field to SAS Enclosure Page 0.
  * Added ChassisSlot Valid bit (bit 5) to the Flags field
  * in SAS Enclosure Page 0.
+ * 06-13-17  02.00.41  Added MPI26_MFGPAGE_DEVID_SAS3816 and
+ * MPI26_MFGPAGE_DEVID_SAS3916 defines.
+ * Removed MPI26_MFGPAGE_DEVID_SAS4008 define.
+ * Added MPI26_PCIEIOUNIT1_LINKFLAGS_SRNS_EN define.
+ * Renamed PI26_PCIEIOUNIT1_LINKFLAGS_EN_SRIS to
+ * PI26_PCIEIOUNIT1_LINKFLAGS_SRIS_EN.
+ * Renamed MPI26_PCIEIOUNIT1_LINKFLAGS_DIS_SRIS to
+ * MPI26_PCIEIOUNIT1_LINKFLAGS_DIS_SEPARATE_REFCLK.
+ * 09-29-17  02.00.42  Added ControllerResetTO field to PCIe Device Page 2.
+ * Added NOIOB field to PCIe Device Page 2.
+ * Added MPI26_PCIEDEV2_CAP_DATA_BLK_ALIGN_AND_GRAN to
+ * the Capabilities field of PCIe Device Page 2.
  * --
  */
 
@@ -556,7 +568,8 @@ typedef struct _MPI2_CONFIG_REPLY {
 #define MPI26_MFGPAGE_DEVID_SAS3616 (0x00D1)
 #define MPI26_MFGPAGE_DEVID_SAS3708 (0x00D2)
 
-#define MPI26_MFGPAGE_DEVID_SAS4008 (0x00A1)
+#define MPI26_MFGPAGE_DEVID_SAS3816 (0x00A1)
+#define MPI26_MFGPAGE_DEVID_SAS3916 (0x00A0)
 
 
 /*Manufacturing Page 0 */
@@ -3864,20 +3877,25 @@ typedef struct _MPI26_CONFIG_PAGE_PCIEDEV_0 {
 typedef struct _MPI26_CONFIG_PAGE_PCIEDEV_2 {
MPI2_CONFIG_EXTENDED_PAGE_HEADERHeader; /*0x00 */
U16 DevHandle;  /*0x08 */
-   U16 Reserved1;  /*0x0A */
-   U32 MaximumDataTransferSize;/*0x0C */
+   U8  ControllerResetTO;  /* 0x0A */
+   U8  Reserved1;  /* 0x0B */
+   U32 MaximumDataTransferSize;/*0x0C */
U32 Capabilities;   /*0x10 */
-   U32 Reserved2;  /*0x14 */
+   U16 NOIOB;  /* 0x14 */
+   U16 Reserved2;  /* 0x16 */
 } MPI26_CONFIG_PAGE_PCIEDEV_2, *PTR_MPI26_CONFIG_PAGE_PCIEDEV_2,
Mpi26PCIeDevicePage2_t, *pMpi26PCIeDevicePage2_t;
 
-#define MPI26_PCIEDEVICE2_PAGEVERSION   (0x00)
+#define MPI26_PCIEDEVICE2_PAGEVERSION   (0x01)
 
 /*defines for PCIe Device Page 2 Capabilities field */
+#define MPI26_PCIEDEV2_CAP_DATA_BLK_ALIGN_AND_GRAN (0x0008)
 #define 

[PATCH v3 08/14] mpt3sas: Allow processing of events during driver unload.

2018-04-24 Thread Chaitra P B
Events were not processed during driver unload, hence unloading of driver
doesn't complete when drives are disconnected while unloading of driver.
So don't block events in ISR path, i,e., remove the flag ioc->remove_host
so that events are getting processed during driver unload.
Thus allowing driver unload to complete by processing drive removal events
during driver unload.

Signed-off-by: Chaitra P B 
Signed-off-by: Suganath Prabu S 
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 +---
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 796bf33..33587a8 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -3677,11 +3677,7 @@ _scsih_tm_tr_complete(struct MPT3SAS_ADAPTER *ioc, u16 
smid, u8 msix_index,
u32 ioc_state;
struct _sc_list *delayed_sc;
 
-   if (ioc->remove_host) {
-   dewtprintk(ioc, pr_info(MPT3SAS_FMT
-   "%s: host has been removed\n", __func__, ioc->name));
-   return 1;
-   } else if (ioc->pci_error_recovery) {
+   if (ioc->pci_error_recovery) {
dewtprintk(ioc, pr_info(MPT3SAS_FMT
"%s: host in pci error recovery\n", __func__,
ioc->name));
@@ -3803,8 +3799,7 @@ _scsih_tm_tr_volume_send(struct MPT3SAS_ADAPTER *ioc, u16 
handle)
u16 smid;
struct _tr_list *delayed_tr;
 
-   if (ioc->shost_recovery || ioc->remove_host ||
-   ioc->pci_error_recovery) {
+   if (ioc->pci_error_recovery) {
dewtprintk(ioc, pr_info(MPT3SAS_FMT
"%s: host reset in progress!\n",
__func__, ioc->name));
@@ -3857,8 +3852,7 @@ _scsih_tm_volume_tr_complete(struct MPT3SAS_ADAPTER *ioc, 
u16 smid,
Mpi2SCSITaskManagementReply_t *mpi_reply =
mpt3sas_base_get_reply_virt_addr(ioc, reply);
 
-   if (ioc->shost_recovery || ioc->remove_host ||
-   ioc->pci_error_recovery) {
+   if (ioc->shost_recovery || ioc->pci_error_recovery) {
dewtprintk(ioc, pr_info(MPT3SAS_FMT
"%s: host reset in progress!\n",
__func__, ioc->name));
@@ -9468,8 +9462,8 @@ mpt3sas_scsih_event_callback(struct MPT3SAS_ADAPTER *ioc, 
u8 msix_index,
u16 sz;
Mpi26EventDataActiveCableExcept_t *ActiveCableEventData;
 
-   /* events turned off due to host reset or driver unloading */
-   if (ioc->remove_host || ioc->pci_error_recovery)
+   /* events turned off due to host reset */
+   if (ioc->pci_error_recovery)
return 1;
 
mpi_reply = mpt3sas_base_get_reply_virt_addr(ioc, reply);
-- 
1.8.3.1



Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v2)

2018-04-24 Thread Steffen Maier


On 11/04/2016 05:35 PM, Martin K. Petersen wrote:

"Hannes" == Hannes Reinecke  writes:


Hannes> Checking with SAT-3 (section 6.2.4: Commands the SATL queues
Hannes> internally) the implemented behaviour is standards conformant,
Hannes> although the standard also allows for returning 'TASK SET FULL'
Hannes> or 'BUSY' in these cases.  Doing so would nicely solve this
Hannes> issue.

I agree with Hannes that it would be appropriate for the SATL to report
busy when it makes an non-queued command queueable.


Wouldn't this potentially still cause problems if the secure erase takes 
longer than max_retries * scmd_tmo. I.e. the command timing out by 
default after 180 seconds as in 
https://www.spinics.net/lists/linux-block/msg24837.html ?


The fix approach here seems to also handle this gracefully.

--
Mit freundlichen Grüßen / Kind regards
Steffen Maier

Linux on z Systems Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



Re: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

2018-04-24 Thread Chanwoo Choi
Hi,

On 2018년 04월 24일 14:29, Bjorn Andersson wrote:
> On Mon 23 Apr 19:48 PDT 2018, Chanwoo Choi wrote:
> 
>> Hi,
>>
>> On 2018??? 04??? 24??? 09:20, Bjorn Andersson wrote:
>>> The code in devfreq_add_device() handles the case where a freq_table is
>>> passed by the client, but then requests min and max frequences from
>>> the, in this case absent, opp tables.
>>>
>>> Read the min and max frequencies from the frequency table, which has
>>> been built from the opp table if one exists, instead of querying the
>>> opp table.
>>>
>>> Signed-off-by: Bjorn Andersson 
>>> ---
>>>
>>> An alternative approach is to clarify in the devfreq code that it's not
>>> possible to pass a freq_table and then in patch 3 create an opp table for 
>>> the
>>> device in runtime; although the error handling of this becomes non-trivial.
>>>
>>> Transitioning the UFSHCD to use opp tables directly is hindered by the fact
>>> that the Qualcomm UFS hardware has two different clocks that needs to be
>>> running at different rates, so we would need a way to describe the two 
>>> rates in
>>> the opp table. (And would force us to change the DT binding)
>>>
>>>  drivers/devfreq/devfreq.c | 22 --
>>>  1 file changed, 4 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
>>> index fe2af6aa88fc..086ced50a13d 100644
>>> --- a/drivers/devfreq/devfreq.c
>>> +++ b/drivers/devfreq/devfreq.c
>>> @@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct 
>>> device *dev)
>>>  
>>>  static unsigned long find_available_min_freq(struct devfreq *devfreq)
>>>  {
>>> -   struct dev_pm_opp *opp;
>>> -   unsigned long min_freq = 0;
>>> -
>>> -   opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, _freq);
>>> -   if (IS_ERR(opp))
>>> -   min_freq = 0;
>>> -   else
>>> -   dev_pm_opp_put(opp);
>>> +   struct devfreq_dev_profile *profile = devfreq->profile;
>>>  
>>> -   return min_freq;
>>> +   return profile->freq_table[0];
>>
>> It is wrong. The thermal framework support the devfreq-cooling device
>> which uses the dev_pm_opp_enable/disable().
>>
> 
> Okay, that makes sense. So rather than registering a custom freq_table I
> should register the min and max frequency using dev_pm_opp_add().

Thanks.

> 
>> In order to find the correct available min frequency,
>> the devfreq have to use the OPP function instead of using the first entry
>> of the freq_table array.
>>
> 
> Based on this there seems to be room for cleaning out the freq_table
> from devfreq, to reduce the confusion. I will review this further.

Actually, devfreq must need to have the freq_table[] array. But, freq_table[]
array should be handled in the devfreq core. Now, the devfreq device drivers can
touch the freq_table. I think it is not good.

There is a reason why we have to maintain the freq_table[] as the internal 
variable.
OPP doesn't provide the OPP API which get the all registered frequencies.
If devfreq-cooling device disables the specific frequency by using 
dev_pm_oppdisable(),
the user of OPP interface can not get the disabled frequency list.
So, I maintain the freq_table even if using the OPP interface.

And, devfreq-cooling device uses the freq_table directly because released MALi 
driver
from ARM initializes the freq_table list directly.

I have no any objection for refactoring. Just I'm sharing the issue and current 
status.

> 
> Thanks,
> Bjorn
> 
> 
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v4)

2018-04-24 Thread Greg KH
On Mon, Apr 23, 2018 at 06:28:03PM +, Igor Rybak wrote:
> Hi,
> 
> We are running kernel 4.4.0-22 and the patch below does not seem to be 
> present in the mpt3sas driver. Can you please confirm?

Please update your kernel, this patch was in the 4.4.36 kernel release
which came out December 2, 2016, well over a full year ago.

thanks,

greg k-h


RE: Re: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

2018-04-24 Thread MyungJoo Ham
>On Mon 23 Apr 19:48 PDT 2018, Chanwoo Choi wrote:
>
>> Hi,
>> 
>> On 2018??? 04??? 24??? 09:20, Bjorn Andersson wrote:
>> > The code in devfreq_add_device() handles the case where a freq_table is
>> > passed by the client, but then requests min and max frequences from
>> > the, in this case absent, opp tables.
>> > 
>> > Read the min and max frequencies from the frequency table, which has
>> > been built from the opp table if one exists, instead of querying the
>> > opp table.
>> > 
>> > Signed-off-by: Bjorn Andersson 
>> > ---
>> > 
>> > An alternative approach is to clarify in the devfreq code that it's not
>> > possible to pass a freq_table and then in patch 3 create an opp table for 
>> > the
>> > device in runtime; although the error handling of this becomes non-trivial.
>> > 
>> > Transitioning the UFSHCD to use opp tables directly is hindered by the fact
>> > that the Qualcomm UFS hardware has two different clocks that needs to be
>> > running at different rates, so we would need a way to describe the two 
>> > rates in
>> > the opp table. (And would force us to change the DT binding)
>> > 
>> >  drivers/devfreq/devfreq.c | 22 --
>> >  1 file changed, 4 insertions(+), 18 deletions(-)
>> > 
>> > diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
>> > index fe2af6aa88fc..086ced50a13d 100644
>> > --- a/drivers/devfreq/devfreq.c
>> > +++ b/drivers/devfreq/devfreq.c
>> > @@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct 
>> > device *dev)
>> >  
>> >  static unsigned long find_available_min_freq(struct devfreq *devfreq)
>> >  {
>> > -  struct dev_pm_opp *opp;
>> > -  unsigned long min_freq = 0;
>> > -
>> > -  opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, _freq);
>> > -  if (IS_ERR(opp))
>> > -  min_freq = 0;
>> > -  else
>> > -  dev_pm_opp_put(opp);
>> > +  struct devfreq_dev_profile *profile = devfreq->profile;
>> >  
>> > -  return min_freq;
>> > +  return profile->freq_table[0];
>> 
>> It is wrong. The thermal framework support the devfreq-cooling device
>> which uses the dev_pm_opp_enable/disable().
>> 
>
>Okay, that makes sense. So rather than registering a custom freq_table I
>should register the min and max frequency using dev_pm_opp_add().
>
>> In order to find the correct available min frequency,
>> the devfreq have to use the OPP function instead of using the first entry
>> of the freq_table array.
>> 
>
>Based on this there seems to be room for cleaning out the freq_table
>from devfreq, to reduce the confusion. I will review this further.

Could you please check if the bug suffering you gets resolved by
replacing 0 with ULONG_MAX in the function find_available_max_freq?

-max_freq = 0;
+max_freq = ULONG_MAX;

Even if you are not using OPP, these functions should provide somewhat
"compatible" values.

Cheers,
MyungJoo


>
>Thanks,
>Bjorn
>