Re: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

2018-04-23 Thread Bjorn Andersson
On Mon 23 Apr 19:48 PDT 2018, Chanwoo Choi wrote:

> Hi,
> 
> On 2018??? 04??? 24??? 09:20, Bjorn Andersson wrote:
> > The code in devfreq_add_device() handles the case where a freq_table is
> > passed by the client, but then requests min and max frequences from
> > the, in this case absent, opp tables.
> > 
> > Read the min and max frequencies from the frequency table, which has
> > been built from the opp table if one exists, instead of querying the
> > opp table.
> > 
> > Signed-off-by: Bjorn Andersson 
> > ---
> > 
> > An alternative approach is to clarify in the devfreq code that it's not
> > possible to pass a freq_table and then in patch 3 create an opp table for 
> > the
> > device in runtime; although the error handling of this becomes non-trivial.
> > 
> > Transitioning the UFSHCD to use opp tables directly is hindered by the fact
> > that the Qualcomm UFS hardware has two different clocks that needs to be
> > running at different rates, so we would need a way to describe the two 
> > rates in
> > the opp table. (And would force us to change the DT binding)
> > 
> >  drivers/devfreq/devfreq.c | 22 --
> >  1 file changed, 4 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> > index fe2af6aa88fc..086ced50a13d 100644
> > --- a/drivers/devfreq/devfreq.c
> > +++ b/drivers/devfreq/devfreq.c
> > @@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct 
> > device *dev)
> >  
> >  static unsigned long find_available_min_freq(struct devfreq *devfreq)
> >  {
> > -   struct dev_pm_opp *opp;
> > -   unsigned long min_freq = 0;
> > -
> > -   opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, _freq);
> > -   if (IS_ERR(opp))
> > -   min_freq = 0;
> > -   else
> > -   dev_pm_opp_put(opp);
> > +   struct devfreq_dev_profile *profile = devfreq->profile;
> >  
> > -   return min_freq;
> > +   return profile->freq_table[0];
> 
> It is wrong. The thermal framework support the devfreq-cooling device
> which uses the dev_pm_opp_enable/disable().
> 

Okay, that makes sense. So rather than registering a custom freq_table I
should register the min and max frequency using dev_pm_opp_add().

> In order to find the correct available min frequency,
> the devfreq have to use the OPP function instead of using the first entry
> of the freq_table array.
> 

Based on this there seems to be room for cleaning out the freq_table
from devfreq, to reduce the confusion. I will review this further.

Thanks,
Bjorn


Re: [PATCH 1/3] PM / devfreq: Actually support providing freq_table

2018-04-23 Thread Chanwoo Choi
Hi,

On 2018년 04월 24일 09:20, Bjorn Andersson wrote:
> The code in devfreq_add_device() handles the case where a freq_table is
> passed by the client, but then requests min and max frequences from
> the, in this case absent, opp tables.
> 
> Read the min and max frequencies from the frequency table, which has
> been built from the opp table if one exists, instead of querying the
> opp table.
> 
> Signed-off-by: Bjorn Andersson 
> ---
> 
> An alternative approach is to clarify in the devfreq code that it's not
> possible to pass a freq_table and then in patch 3 create an opp table for the
> device in runtime; although the error handling of this becomes non-trivial.
> 
> Transitioning the UFSHCD to use opp tables directly is hindered by the fact
> that the Qualcomm UFS hardware has two different clocks that needs to be
> running at different rates, so we would need a way to describe the two rates 
> in
> the opp table. (And would force us to change the DT binding)
> 
>  drivers/devfreq/devfreq.c | 22 --
>  1 file changed, 4 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index fe2af6aa88fc..086ced50a13d 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct device 
> *dev)
>  
>  static unsigned long find_available_min_freq(struct devfreq *devfreq)
>  {
> - struct dev_pm_opp *opp;
> - unsigned long min_freq = 0;
> -
> - opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, _freq);
> - if (IS_ERR(opp))
> - min_freq = 0;
> - else
> - dev_pm_opp_put(opp);
> + struct devfreq_dev_profile *profile = devfreq->profile;
>  
> - return min_freq;
> + return profile->freq_table[0];

It is wrong. The thermal framework support the devfreq-cooling device
which uses the dev_pm_opp_enable/disable().

In order to find the correct available min frequency,
the devfreq have to use the OPP function instead of using the first entry
of the freq_table array.

>  }
>  
>  static unsigned long find_available_max_freq(struct devfreq *devfreq)
>  {
> - struct dev_pm_opp *opp;
> - unsigned long max_freq = ULONG_MAX;
> -
> - opp = dev_pm_opp_find_freq_floor(devfreq->dev.parent, _freq);
> - if (IS_ERR(opp))
> - max_freq = 0;
> - else
> - dev_pm_opp_put(opp);
> + struct devfreq_dev_profile *profile = devfreq->profile;
>  
> - return max_freq;
> + return profile->freq_table[profile->max_state - 1];
>  }

ditto.

>  
>  /**
> 


-- 
Best Regards,
Chanwoo Choi
Samsung Electronics


[PATCH 0/3] Fix UFS and devfreq interaction

2018-04-23 Thread Bjorn Andersson
With the introduction of f1d981eaecf8 ("PM / devfreq: Use the available min/max
frequency") the UFS host controller driver (UFSHCD) stopped probing for
platforms that supports frequency scaling, e.g. all modern Qualcomm platforms.

The cause of this was UFSHCD's reliance of not registering any frequencies and
then being called by devfreq to switch between the frequencies 0 and UINT_MAX.

The devfreq code implies that the client is able to pass the frequency table,
instead of relying on opp tables, so the first patch makes this actually work.
The second patch extracts the devfreq registration in the UFSHCD driver, both
to facilitate the third patch and to remove a dereference of an ERR_PTR() in
the case that devfreq registration fails. Finally, the third patch picks the
two frequencies from the freq-table provided in UFSHCD and pass these to
devfreq, as well as map these frequencies back to the step up/down actions.


With this UFS is once again functional on the db820c, and is needed to get UFS
working on SDM845 (both tested).

Bjorn Andersson (3):
  PM / devfreq: Actually support providing freq_table
  scsi: ufs: Extract devfreq registration
  scsi: ufs: Use freq table with devfreq

 drivers/devfreq/devfreq.c | 22 +++
 drivers/scsi/ufs/ufshcd.c | 68 ---
 2 files changed, 57 insertions(+), 33 deletions(-)

-- 
2.16.2



[PATCH 2/3] scsi: ufs: Extract devfreq registration

2018-04-23 Thread Bjorn Andersson
Failing to register with devfreq leaves hba->devfreq assigned, which
causes the error path to dereference the ERR_PTR(). Rather than bolting
on more conditionals, move the call of devm_devfreq_add_device() into
it's own function and only update hba->devfreq once it's successfully
registered.

The subsequent patch builds upon this to make UFS actually work again,
as it's been broken since f1d981eaecf8 ("PM / devfreq: Use the available
min/max frequency")

Signed-off-by: Bjorn Andersson 
---
 drivers/scsi/ufs/ufshcd.c | 31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 8f22a980b1a7..2253f24309ec 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1255,6 +1255,26 @@ static struct devfreq_dev_profile ufs_devfreq_profile = {
.get_dev_status = ufshcd_devfreq_get_dev_status,
 };
 
+static int ufshcd_devfreq_init(struct ufs_hba *hba)
+{
+   struct devfreq *devfreq;
+   int ret;
+
+   devfreq = devm_devfreq_add_device(hba->dev,
+   _devfreq_profile,
+   "simple_ondemand",
+   NULL);
+   if (IS_ERR(devfreq)) {
+   ret = PTR_ERR(devfreq);
+   dev_err(hba->dev, "Unable to register with devfreq %d\n", ret);
+   return ret;
+   }
+
+   hba->devfreq = devfreq;
+
+   return 0;
+}
+
 static void __ufshcd_suspend_clkscaling(struct ufs_hba *hba)
 {
unsigned long flags;
@@ -6399,16 +6419,9 @@ static int ufshcd_probe_hba(struct ufs_hba *hba)
sizeof(struct ufs_pa_layer_attr));
hba->clk_scaling.saved_pwr_info.is_valid = true;
if (!hba->devfreq) {
-   hba->devfreq = devm_devfreq_add_device(hba->dev,
-   _devfreq_profile,
-   "simple_ondemand",
-   NULL);
-   if (IS_ERR(hba->devfreq)) {
-   ret = PTR_ERR(hba->devfreq);
-   dev_err(hba->dev, "Unable to register 
with devfreq %d\n",
-   ret);
+   ret = ufshcd_devfreq_init(hba);
+   if (ret)
goto out;
-   }
}
hba->clk_scaling.is_allowed = true;
}
-- 
2.16.2



[PATCH 3/3] scsi: ufs: Use freq table with devfreq

2018-04-23 Thread Bjorn Andersson
devfreq requires that the client operates on actual frequencies, not
only 0 and UMAX_INT and as such UFS brok with the introduction of
f1d981eaecf8 ("PM / devfreq: Use the available min/max frequency").

This patch registers the frequencies of the first clock with devfreq and
use these to determine if we're trying to step up or down.

Signed-off-by: Bjorn Andersson 
---
 drivers/scsi/ufs/ufshcd.c | 39 ---
 1 file changed, 32 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 2253f24309ec..07b1f3c7bd2d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1168,16 +1168,13 @@ static int ufshcd_devfreq_target(struct device *dev,
struct ufs_hba *hba = dev_get_drvdata(dev);
ktime_t start;
bool scale_up, sched_clk_scaling_suspend_work = false;
+   struct list_head *clk_list = >clk_list_head;
+   struct ufs_clk_info *clki;
unsigned long irq_flags;
 
if (!ufshcd_is_clkscaling_supported(hba))
return -EINVAL;
 
-   if ((*freq > 0) && (*freq < UINT_MAX)) {
-   dev_err(hba->dev, "%s: invalid freq = %lu\n", __func__, *freq);
-   return -EINVAL;
-   }
-
spin_lock_irqsave(hba->host->host_lock, irq_flags);
if (ufshcd_eh_in_progress(hba)) {
spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
@@ -1187,7 +1184,13 @@ static int ufshcd_devfreq_target(struct device *dev,
if (!hba->clk_scaling.active_reqs)
sched_clk_scaling_suspend_work = true;
 
-   scale_up = (*freq == UINT_MAX) ? true : false;
+   if (list_empty(clk_list)) {
+   spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
+   goto out;
+   }
+
+   clki = list_first_entry(>clk_list_head, struct ufs_clk_info, list);
+   scale_up = (*freq == clki->max_freq) ? true : false;
if (!ufshcd_is_devfreq_scaling_required(hba, scale_up)) {
spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
ret = 0;
@@ -1257,11 +1260,33 @@ static struct devfreq_dev_profile ufs_devfreq_profile = 
{
 
 static int ufshcd_devfreq_init(struct ufs_hba *hba)
 {
+   struct devfreq_dev_profile *profile;
+   struct list_head *clk_list = >clk_list_head;
+   struct ufs_clk_info *clki;
struct devfreq *devfreq;
int ret;
 
+   /* Skip devfreq if we don't have any clocks in the list */
+   if (list_empty(clk_list))
+   return 0;
+
+   profile = devm_kmemdup(hba->dev, _devfreq_profile,
+  sizeof(ufs_devfreq_profile), GFP_KERNEL);
+   if (!profile)
+   return -ENOMEM;
+
+   profile->max_state = 2;
+   profile->freq_table = devm_kcalloc(hba->dev, profile->max_state,
+  sizeof(unsigned long), GFP_KERNEL);
+   if (!profile->freq_table)
+   return -ENOMEM;
+
+   clki = list_first_entry(>clk_list_head, struct ufs_clk_info, list);
+   profile->freq_table[0] = clki->min_freq;
+   profile->freq_table[1] = clki->max_freq;
+
devfreq = devm_devfreq_add_device(hba->dev,
-   _devfreq_profile,
+   profile,
"simple_ondemand",
NULL);
if (IS_ERR(devfreq)) {
-- 
2.16.2



[PATCH 1/3] PM / devfreq: Actually support providing freq_table

2018-04-23 Thread Bjorn Andersson
The code in devfreq_add_device() handles the case where a freq_table is
passed by the client, but then requests min and max frequences from
the, in this case absent, opp tables.

Read the min and max frequencies from the frequency table, which has
been built from the opp table if one exists, instead of querying the
opp table.

Signed-off-by: Bjorn Andersson 
---

An alternative approach is to clarify in the devfreq code that it's not
possible to pass a freq_table and then in patch 3 create an opp table for the
device in runtime; although the error handling of this becomes non-trivial.

Transitioning the UFSHCD to use opp tables directly is hindered by the fact
that the Qualcomm UFS hardware has two different clocks that needs to be
running at different rates, so we would need a way to describe the two rates in
the opp table. (And would force us to change the DT binding)

 drivers/devfreq/devfreq.c | 22 --
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index fe2af6aa88fc..086ced50a13d 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -74,30 +74,16 @@ static struct devfreq *find_device_devfreq(struct device 
*dev)
 
 static unsigned long find_available_min_freq(struct devfreq *devfreq)
 {
-   struct dev_pm_opp *opp;
-   unsigned long min_freq = 0;
-
-   opp = dev_pm_opp_find_freq_ceil(devfreq->dev.parent, _freq);
-   if (IS_ERR(opp))
-   min_freq = 0;
-   else
-   dev_pm_opp_put(opp);
+   struct devfreq_dev_profile *profile = devfreq->profile;
 
-   return min_freq;
+   return profile->freq_table[0];
 }
 
 static unsigned long find_available_max_freq(struct devfreq *devfreq)
 {
-   struct dev_pm_opp *opp;
-   unsigned long max_freq = ULONG_MAX;
-
-   opp = dev_pm_opp_find_freq_floor(devfreq->dev.parent, _freq);
-   if (IS_ERR(opp))
-   max_freq = 0;
-   else
-   dev_pm_opp_put(opp);
+   struct devfreq_dev_profile *profile = devfreq->profile;
 
-   return max_freq;
+   return profile->freq_table[profile->max_state - 1];
 }
 
 /**
-- 
2.16.2



[Bug 199435] HPSA + P420i resetting logical Direct-Access never complete

2018-04-23 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199435

--- Comment #15 from Don (don.br...@microsemi.com) ---
(In reply to Anthony Hausman from comment #11)
> The only patch that I'm sure that I have is the "scsi: hpsa: fix selection
> of reply queue" one.
> For the I'm using an out of the box 4.11 kernel. So I'm really not sure that
> the other patches are present.
> 
> 
> Unfortunately, the module does not compile using 4.11.0-14-generic headers.
> 
> # make -C /lib/modules/4.11.0-14-generic/build M=$(pwd)
> --makefile="/root/hpsa-3.4.20-136/hpsa-3.4.20/drivers/scsi/Makefile.alt"
> make: Entering directory '/usr/src/linux-headers-4.11.0-14-generic'
> make -C /lib/modules/4.4.0-96-generic/build
> M=/usr/src/linux-headers-4.11.0-14-generic EXTRA_CFLAGS+=-DKCLASS4A modules
> make[1]: Entering directory '/usr/src/linux-headers-4.4.0-96-generic'
> make[2]: *** No rule to make target 'kernel/bounds.c', needed by
> 'kernel/bounds.s'.  Stop.
> Makefile:1423: recipe for target
> '_module_/usr/src/linux-headers-4.11.0-14-generic' failed
> make[1]: *** [_module_/usr/src/linux-headers-4.11.0-14-generic] Error 2
> make[1]: Leaving directory '/usr/src/linux-headers-4.4.0-96-generic'
> /root/hpsa-3.4.20-136/hpsa-3.4.20/drivers/scsi/Makefile.alt:96: recipe for
> target 'default' failed
> make: *** [default] Error 2
> make: Leaving directory '/usr/src/linux-headers-4.11.0-14-generic'
> 
> But if you tell me the principal problem is using the 4.11 kernel, I can
> upgrade it to use the 4.16.3 kernel.
> 
> If I use it, must I use the out of box 3.4.20-136 hpsa driver or use your
> precedent patch on the last 3.4.20-125?


(In reply to Anthony Hausman from comment #11)
> The only patch that I'm sure that I have is the "scsi: hpsa: fix selection
> of reply queue" one.
> For the I'm using an out of the box 4.11 kernel. So I'm really not sure that
> the other patches are present.
> 
> 
> Unfortunately, the module does not compile using 4.11.0-14-generic headers.
> 
> # make -C /lib/modules/4.11.0-14-generic/build M=$(pwd)
> --makefile="/root/hpsa-3.4.20-136/hpsa-3.4.20/drivers/scsi/Makefile.alt"
> make: Entering directory '/usr/src/linux-headers-4.11.0-14-generic'
> make -C /lib/modules/4.4.0-96-generic/build
> M=/usr/src/linux-headers-4.11.0-14-generic EXTRA_CFLAGS+=-DKCLASS4A modules
> make[1]: Entering directory '/usr/src/linux-headers-4.4.0-96-generic'
> make[2]: *** No rule to make target 'kernel/bounds.c', needed by
> 'kernel/bounds.s'.  Stop.
> Makefile:1423: recipe for target
> '_module_/usr/src/linux-headers-4.11.0-14-generic' failed
> make[1]: *** [_module_/usr/src/linux-headers-4.11.0-14-generic] Error 2
> make[1]: Leaving directory '/usr/src/linux-headers-4.4.0-96-generic'
> /root/hpsa-3.4.20-136/hpsa-3.4.20/drivers/scsi/Makefile.alt:96: recipe for
> target 'default' failed
> make: *** [default] Error 2
> make: Leaving directory '/usr/src/linux-headers-4.11.0-14-generic'
> 
> But if you tell me the principal problem is using the 4.11 kernel, I can
> upgrade it to use the 4.16.3 kernel.
> 
> If I use it, must I use the out of box 3.4.20-136 hpsa driver or use your
> precedent patch on the last 3.4.20-125?

The 4.16.3 driver should be OK to use.

You could not untar the sources I gave you in /tmp and build with make -f
Makefile.alt?

If you copy the source code into the kernel tree, you should be able to do
make modules SUBDIRS=drivers/scsi hpsa.ko

-- 
You are receiving this mail because:
You are the assignee for the bug.


God dag.

2018-04-23 Thread cfengchao
God dag,

Jeg er administrerende direktør i industri- og handelsbanken i Kina
(ICBC). Jeg har et gjensidig forretningsforslag, som refererer til overføring 
av a stor mengde penger til en konto i utlandet, med din hjelp som utlending
partner som mottaker av midlene. Alt om denne transaksjonen
vil bli gjort lovlig uten noen bro av økonomisk myndighet både i min
Land og ditt. Hvis du er interessert, vennligst svar tilbake gjennom min
privat e-post skrevet nedenfor, og jeg vil gi deg mer informasjon og
prosjekt så snart jeg mottar det positive svaret ditt.

Privat e-post: direc...@chengfengchao.com


Med vennlig hilsen,

Daglig leder.

ICBC. Kina


Re: [PATCH] bsg referencing bus driver module

2018-04-23 Thread Anatoliy Glagolev
Thanks, James. The idea of cutting communications with Scsi_Host at
bsg_unregister_queue(..) time and leaving bsg_class_device to
its own fate makes a lot of sense, conceptually. But there are
implementation issues that are difficult to work around.

bsg.c creates bsg_class_device and takes a reference to Scsi_Host
at bsg_register_queue(..) time. The reference is dropped at
bsg_class_device's release(..) function. If the driver implementing
Scsi_Host template is not around we crash.
We could move the reference drop from bsg_class_device's release(..)
function to bsg_unregister_queue(..). That would be a small change in
bsg.c. But bsg.c sets Scsi_Host as the parent of bsg_class_device's
device. We cannot have a device around with a dangling parent.
A device's parent cannot be changed dynamically. Not setting
the device's parent at creation may affect software relying
on bsg_class_device - Scsi_Host child-parent relations.

It looks like I am out of options. Do you have suggestions on
how to work around Scsi_Host being bsg_class_device's parent?



RE: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v4)

2018-04-23 Thread Igor Rybak
Hi,

We are running kernel 4.4.0-22 and the patch below does not seem to be present 
in the mpt3sas driver. Can you please confirm?
As a reminder the patch was related to a Security Erase ATA command that 
requires a very long timeout like 100 minutes or more and the drive retains a 
busy status. And the driver should not try to send other commands or reset the 
drive.

Thanks,

Igor Rybak
CTO
MediaClone Inc
6900 Canby Ave Ste 107
Reseda, CA 91335
USA
+1-818-654-6286

From: Sreekanth Reddy [sreekanth.re...@broadcom.com]
Sent: Thursday, November 10, 2016 8:38 PM
To: Andrey Grodzovsky
Cc: PDL-MPT-FUSIONLINUX; Igor Rybak; Ezra Kohavi; linux-scsi@vger.kernel.org; 
Sathya Prakash; Chaitra P B; Suganath Prabu Subramani; Hannes Reinecke; 
sta...@vger.kernel.org
Subject: Re: [PATCH] [SCSI] mpt3sas: Fix secure erase premature termination (v4)

On Thu, Nov 10, 2016 at 8:05 PM, Andrey Grodzovsky  wrote:
> Problem:
> This is a work around for a bug with LSI Fusion MPT SAS2 when
> pefroming secure erase. Due to the very long time the operation
> takes commands issued during the erase will time out and will trigger
> execution of abort hook. Even though the abort hook is called for
> the specific command which timed out this leads to entire device halt
> (scsi_state terminated) and premature termination of the secured erase.
>
> Fix:
> Set device state to busy while erase in progress to reject any incoming
> commands until the erase is done. The device is blocked any way during
> this time and cannot execute any other command.
> More data and logs can be found here -
> https://drive.google.com/file/d/0B9ocOHYHbbS1Q3VMdkkzeWFkTjg/view
>
> v2: Update according to example patch by Hannes Reinecke to apply
> the blocking logic to any ATA 12/16 command.
>
> v3: Use SCSI commands opcodes definitions instead of value and
> correct identation.
>
> v4: Fix checkpath errors and warning.
>
> Signed-off-by: Andrey Grodzovsky 
> Cc: 
> Cc: Sathya Prakash 
> Cc: Chaitra P B 
> Cc: Suganath Prabu Subramani 
> Cc: Sreekanth Reddy 
> Cc: Hannes Reinecke 
> Cc: 

Acked-by: Sreekanth Reddy 

> ---
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 16 
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c 
> b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..c032319 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -3500,6 +3500,10 @@ _scsih_eedp_error_handling(struct scsi_cmnd *scmd, u16 
> ioc_status)
> SAM_STAT_CHECK_CONDITION;
>  }
>
> +static inline bool ata_12_16_cmd(struct scsi_cmnd *scmd)
> +{
> +   return (scmd->cmnd[0] == ATA_12 || scmd->cmnd[0] == ATA_16);
> +}
>
>  /**
>   * _scsih_qcmd - main scsi request entry point
> @@ -3528,6 +3532,14 @@ _scsih_qcmd(struct Scsi_Host *shost, struct scsi_cmnd 
> *scmd)
> scsi_print_command(scmd);
>  #endif
>
> +   /**
> +   * Lock the device for any subsequent command until
> +   * command is done.
> +   */
> +   if (ata_12_16_cmd(scmd))
> +   scsi_internal_device_block(scmd->device);
> +
> +
> sas_device_priv_data = scmd->device->hostdata;
> if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
> scmd->result = DID_NO_CONNECT << 16;
> @@ -4062,6 +4074,10 @@ _scsih_io_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, 
> u8 msix_index, u32 reply)
> if (scmd == NULL)
> return 1;
>
> +   if (ata_12_16_cmd(scmd))
> +   scsi_internal_device_unblock(scmd->device, SDEV_RUNNING);
> +
> +
> mpi_request = mpt3sas_base_get_msg_frame(ioc, smid);
>
> if (mpi_reply == NULL) {
> --
> 2.1.4
>



Re: [RESEND PATCH v1 2/2] trace: events: block: Add tag in block trace events

2018-04-23 Thread Steven Rostedt
On Mon, 23 Apr 2018 14:43:13 +0200
Steffen Maier  wrote:

> > -   TP_printk("[%s] %d", __entry->comm, __entry->nr_rq)
> > +   TP_printk("[%s] %d %s", __entry->comm, __entry->nr_rq,
> > +  __entry->explicit ? "Sync" : "Async")
> >   );
> > 
> >   /**  
> 
> This entire hunk does not seem related to this patch description.
> Also, I'm not sure trace-cmd and perf et al. could format it accordingly.

You mean the "?:" operation? trace-cmd and perf can handle it fine.
Just look at the trace event irq_handler_exit:

 print fmt: "irq=%d ret=%s", REC->irq, REC->ret ? "handled" : "unhandled"

# trace-cmd record -e irq_handler_exit
# trace-cmd report
  -0 [001] 856960.382767: irq_handler_exit: irq=29 
ret=handled
  -0 [001] 856961.745640: irq_handler_exit: irq=29 
ret=handled
  -0 [001] 856961.865762: irq_handler_exit: irq=29 
ret=handled


-- Steve


Re: [RESEND PATCH v1 1/2] trace: events: scsi: Add tag in SCSI trace events

2018-04-23 Thread Steffen Maier


On 04/17/2018 12:00 PM, Bean Huo (beanhuo) wrote:


#Cat trace
iozone-4055  [000]    665.039276: block_unplug: [iozone] 1 Sync
iozone-4055  [000] ...1   665.039278: block_rq_insert: 8,48 WS 0 () 39604352 + 
128 tag=18 [iozone]
iozone-4055  [000] ...1   665.039280: block_rq_issue: 8,48 WS 0 () 39604352 + 
128 tag=18 [iozone]
iozone-4055  [000] ...1   665.039284: scsi_dispatch_cmd_start: host_no=0 
channel=0 id=0 lun=3 data_sgl=16 prot_sgl=0 prot_op=SCSI_PROT_NORMAL tag=18 
cmnd=(WRITE_10 lba=4950544 txlen=16 protect=0 raw=2a 00 00 4b 8a 10 00 00 10 00)
iozone-4056  [002]    665.039284: block_dirty_buffer: 8,62 sector=44375 
size=4096
-0 [000] d.h2   665.039319: scsi_dispatch_cmd_done: host_no=0 
channel=0 id=0 lun=3 data_sgl=16 prot_sgl=0 prot_op=SCSI_PROT_NORMAL tag=24 
cmnd=(WRITE_10 lba=4944016 txlen=16 protect=0 raw=2a 00 00 4b 70 90 00 00 10 00) 
result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)
-0 [000] d.h3   665.039321: block_rq_complete: 8,48 WS () 39552128 + 
128 tag=24 [0]



iozone-4058  [003]    665.039362: block_bio_remap: 8,48 WS 39568768 + 128 
<- (8,62) 337280
iozone-4058  [003]    665.039364: block_bio_queue: 8,48 WS 39568768 + 128 
[iozone]
iozone-4058  [003] ...1   665.039366: block_getrq: 8,48 WS 39568768 + 128 
[iozone]


I'm not familiar with block/scsi command tagging.

Some block events now would get a tag field.
Some block events would not get a tag field (maybe because for some the 
tag is not (yet) known).


So all block events that belong to the same request would still need to 
be correlated by something like (devt, RWBS, LBA, length) because not 
all have a tag field.



Especially, the ftrace log with tag information, I can easily figure out one 
I/O request between block layer and SCSI.


Provided this is done correctly, I would be in favor of a solution.
Since
v4.11 commit 48b77ad60844 (``block: cleanup tracing'')\newline
v4.11 commit 82ed4db499b8 (``block: split scsi\_request out of struct 
request'')
we don't have the SCSI CDB in block traces (nor in blktrace traditional 
relayfs trace format, nor in ftrace 'blk' tracer binary synthesized 
output [1]) any more (empty Packet Command payload).
Being able to correlate block events with scsi events would indeed be 
very helpful for some cases.


Is a correlation between block and scsi only necessary for these pairs?:

block_rq_issue causes scsi_dispatch_cmd_start, and
scsi_dispatch_cmd_done causes block_rq_complete.

If so, only those two block trace events would need to get a new field?


[1] v2.6.30 commit 08a06b83ff8b (``blkftrace: binary tracing, 
synthesizing old format'')
v2.6.31 commit f3948f8857ef (``blktrace: fix context-info when 
mixed-using blk tracer and trace events'')


--
Mit freundlichen Grüßen / Kind regards
Steffen Maier

Linux on z Systems Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



Re: [RESEND PATCH v1 2/2] trace: events: block: Add tag in block trace events

2018-04-23 Thread Steffen Maier


On 04/16/2018 04:33 PM, Bean Huo (beanhuo) wrote:

Print the request tag along with other information in block trace events
when tracing request , and unplug type (Sync / Async).

Signed-off-by: Bean Huo 
---
  include/trace/events/block.h | 36 +---
  1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index 81b43f5..f8c0b9e 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h



@@ -478,15 +486,18 @@ DECLARE_EVENT_CLASS(block_unplug,

TP_STRUCT__entry(
__field( int,   nr_rq   )
+   __field( bool,  explicit)
__array( char,  comm,   TASK_COMM_LEN   )
),

TP_fast_assign(
__entry->nr_rq = depth;
+   __entry->explicit = explicit;
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),

-   TP_printk("[%s] %d", __entry->comm, __entry->nr_rq)
+   TP_printk("[%s] %d %s", __entry->comm, __entry->nr_rq,
+  __entry->explicit ? "Sync" : "Async")
  );

  /**


This entire hunk does not seem related to this patch description.
Also, I'm not sure trace-cmd and perf et al. could format it accordingly.
See also my patch for this same functionality:
https://www.spinics.net/lists/linux-block/msg24691.html
("[PATCH v2 1/2] tracing/events: block: track and print if unplug was 
explicit or schedule")




--
Mit freundlichen Grüßen / Kind regards
Steffen Maier

Linux on z Systems Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



Re: [PATCH blktests] scsi/004: add regression test for false BLK_STS_OK with non good SAM status

2018-04-23 Thread Steffen Maier

On 04/19/2018 10:18 PM, Omar Sandoval wrote:
> On Thu, Apr 19, 2018 at 01:44:41PM -0600, Jens Axboe wrote:
>> On 4/19/18 1:41 PM, Bart Van Assche wrote:
>>> On Thu, 2018-04-19 at 12:13 -0700, Omar Sandoval wrote:
 On Thu, Apr 19, 2018 at 11:53:30AM -0700, Omar Sandoval wrote:
> Thanks for the test! Applied.

 Side note, it's unfortunate that this test takes 180 seconds to run only
 because we have to wait for the command timeout. We should be able to
 export request_queue->rq_timeout writeable in sysfs. Would you be
 interested in doing that?
>>>
>>> Hello Omar,
>>>
>>> Is this perhaps what you are looking for?
>>> # ls -l /sys/class/scsi_device/*/*/timeout
>>> -rw-r--r-- 1 root root 4096 Apr 19 08:52 
>>> /sys/class/scsi_device/2:0:0:0/device/timeout
>>> -rw-r--r-- 1 root root 4096 Apr 19 12:39 
>>> /sys/class/scsi_device/8:0:0:1/device/timeout
>>
>> We should have it generically available though, not just for SCSI. In
>> retrospect, it should have been under queue/ from the start, now we'll
>> end up with duplicate entries for SCSI.
> 
> For the sake of this test, I just decreased the timeout through SCSI.

Great idea.

>   echo 5 > "/sys/block/${SCSI_DEBUG_DEVICES[0]}/device/timeout"

However, the timeout should be sufficiently larger than scsi_debug/delay,
in order not to run into the command timeout.
It may be unfortunate that scsi_debug/delay uses jiffies as unit and
can thus differ in a range of an order of magnitude for different kernel 
configs.

>   # delay to reduce response repetition: around 1..10sec depending on HZ
>   echo 1000 > /sys/bus/pseudo/drivers/scsi_debug/delay

On s390, we typically have HZ=100, so 1000 jiffies are 10 seconds.

We can increase the sdev cmd timeout or decrease the scsi_debug/delay.
100 instead of 1000 for scsi_debug/delay worked for me;
but for some reason the loop checking for busy did not work (any more?)
causing an unexpected test case error:

> # ./check scsi/004
> scsi/004 (ensure repeated TASK SET FULL results in EIO on timing out command) 
> [failed]
> runtime  31.892s  ...  31.720s
> --- tests/scsi/004.out2018-04-16 11:47:19.105931872 +0200
> +++ results/nodev/scsi/004.out.bad2018-04-23 14:07:33.615445253 
> +0200
> @@ -1,3 +1,3 @@
>  Running scsi/004
> -Input/output error
> +modprobe: FATAL: Module scsi_debug is in use.
>  Test complete

so I added another sleep hack:

 # dd closing SCSI disk causes implicit TUR also being delayed once
+# sleep over time window where READ was done and TUR not yet queued
+sleep 2
 while grep -q -F "in_use_bm BUSY:" 
"/proc/scsi/scsi_debug/${SCSI_DEBUG_HOSTS[0]}"; do

What do you think?

-- 
Mit freundlichen Grüßen / Kind regards
Steffen Maier

Linux on z Systems Development

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



RE: [PATCH v2 01/14] mpt3sas: Bug fix for big endian systems.

2018-04-23 Thread Chaitra Basappa
Martin,
 Please see my replies inline.

Thanks,
 Chaitra

-Original Message-
From: Martin K. Petersen [mailto:martin.peter...@oracle.com]
Sent: Saturday, April 21, 2018 3:52 AM
To: Chaitra P B
Cc: linux-scsi@vger.kernel.org; sathya.prak...@broadcom.com;
sreekanth.re...@broadcom.com; suganath-prabu.subram...@broadcom.com
Subject: Re: [PATCH v2 01/14] mpt3sas: Bug fix for big endian systems.


Chaitra,

A few comments:

> @@ -426,7 +427,7 @@ static void _clone_sg_entries(struct MPT3SAS_ADAPTER
*ioc,
>   dst_addr_phys = _base_get_chain_phys(ioc,
>   smid, sge_chain_count);
>   WARN_ON(dst_addr_phys > U32_MAX);
> - sgel->Address = (u32)dst_addr_phys;
> + sgel->Address = cpu_to_le32((u32)dst_addr_phys);

I tend to prefer lower_32_bits() but that's your choice.
 Accepted, lower_32_bits() can be used here.


> @@ -3040,8 +3047,9 @@ mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER
*ioc)
>   }
>
>   for (i = 0; i < ioc->combined_reply_index_count; i++) {
> - ioc->replyPostRegisterIndex[i] = (resource_size_t
*)
> -  ((u8 *)>chip->Doorbell +
> + ioc->replyPostRegisterIndex[i] =
> + (volatile void __iomem *)
> +  ((u8 __force *)>chip->Doorbell +
>MPI25_SUP_REPLY_POST_HOST_INDEX_OFFSET +
>(i *
MPT3_SUP_REPLY_POST_HOST_INDEX_REG_OFFSET));

Do you really need volatile here? The existing resource_size_t didn't
imply volatile.

Why the double type casts? You've already changed replyPostRegisterIndex
to be 'volatile void __iomem **' in the header file. So why not:

ioc->replyPostRegisterIndex[i] =
>chip->Doorbell +
MPI25_SUP_REPLY_POST_HOST_INDEX_OFFSET +
i *
MPT3_SUP_REPLY_POST_HOST_INDEX_REG_OFFSET;

Also looks like ioc->reply_post_host_index handling a few lines further
down could lose the type casts.
 Accepted, volatile is not really needed. I shall remove
volatile.



> @@ -3386,7 +3394,7 @@ _base_put_smid_mpi_ep_scsi_io(struct
MPT3SAS_ADAPTER *ioc, u16 smid, u16 handle)
>   __le32 *mfp = (__le32 *)mpt3sas_base_get_msg_frame(ioc, smid);
>
>   _clone_sg_entries(ioc, (void *) mfp, smid);
> - mpi_req_iomem = (void *)ioc->chip +
> + mpi_req_iomem = (void __force *)ioc->chip +
>   MPI_FRAME_START_OFFSET + (smid * ioc->request_sz);
>   _base_clone_mpi_to_sys_mem(mpi_req_iomem, (void *)mfp,
>   ioc->request_sz);


Wouldn't it be better to add __iomem to the definition of mpi_req_iomem?
 With this change I still see the below warnings:
warning: cast removes address space of expression
warning: incorrect type in assignment (different address
spaces)
expected void [noderef] *mpi_req_iomem
got void *
warning: incorrect type in argument 1 (different address
spaces)
expected void *dst_iomem
got void [noderef] *mpi_req_iome



> + nvme_encap_request->ErrorResponseBaseAddress =
> + cpu_to_le64(ioc->sense_dma & 0xUL);

upper_32_bits()?
 since upper_32_bits() returns only upper 32 bits. But here after
bitwise & below we are doing bitwise | with dma_address lower 32 bits , so
in this case use of upper_32_bits() will yield wrong address for below
assignment. Hence upper_32_bits() can't be used.

nvme_encap_request->ErrorResponseBaseAddress |=
   cpu_to_le64(le32_to_cpu(
   mpt3sas_base_get_sense_buffer_dma(ioc, smid)));

-- 
Martin K. Petersen  Oracle Linux Engineering


[Bug 199435] HPSA + P420i resetting logical Direct-Access never complete

2018-04-23 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199435

--- Comment #14 from Anthony Hausman (anthonyhaussm...@gmail.com) ---
Indeed I have a charged battery (capacitor) and the writeback-cache enabled.
I run the hp-health component too, I have already try to disable it on the 4.11
kernel and have reproduced the problem of load without it.

The cma related call trace up after the logical drive reset is called.

Right now, I test on a server the kernel 4.16.3-041603-generic with the hpsa
module with the patch to use local work-queue insead of system work-queue.

Right now I didn't reproduce the problem.
I had a disk with bad blocks (before launching a read-only test badblocks
returned a lot of block error) but since I have upgraded the kernel with the
patch hpsa module I have no more error.

I'm still trying to reproduce the problem by launching a badblocks read-only
test on the "ex-faulty" disk.

I'll tell you the result of the test.

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: MegaCli fails to communicate with Raid-Controller

2018-04-23 Thread Volker Schwicking

> On 23. Apr 2018, at 10:28, Kashyap Desai  wrote
>> 
>> Just did that on the “Dell R730xd, MegaRAID SAS-3 3108” and get the
>> following output when the megacli works fine.
>> 
>> ###
>> Apr 23 09:31:37 xh643 kernel: [  368.319092] GD IOV-len: 2048 Apr 23
>> 09:31:37
>> xh643 kernel: [  368.319426] GD IOV-len: 32 Apr 23 09:31:37 xh643 kernel:
>> [
>> 368.319563] GD IOV-len: 320 Apr 23 09:31:37 xh643 kernel: [  368.319698]
>> GD
>> IOV-len: 616 Apr 23 09:31:37 xh643 kernel: [  368.319887] GD IOV-len: 1664
>> Apr 23 09:31:37 xh643 kernel: [  368.320040] GD IOV-len: 32 Apr 23
>> 09:31:37
>> xh643 kernel: [  368.320174] GD IOV-len: 8 … ###
>> 
>> Full output is attached in iov_len_megacli_works.txt, it also contains the
>> output of /proc/buddyinfo which might be important based in my research so
>> far.
> 
> We need similar output whenever there is a dma_alloc_coherent() failure.
> Did you added new prints in failure of dma_alloc_coherent() OR it is generic
> print for all the case ?

Right now its generic for every iteration of the loop

###
6943 for (i = 0; i < ioc->sge_count; i++) {
6944 if (!ioc->sgl[i].iov_len)
6945 continue;
6946
6947 printk("GD IOV-len: %d\n", ioc->sgl[i].iov_len);
6948
6949 kbuff_arr[i] = dma_alloc_coherent(>pdev->dev,
###

I will add the printk to dma_alloc_coherent() as well to see, which request 
actually fails. But i have to be a bit patient since its a production system 
and the customers aren’t to happy about reboots.

I have tried to build live-patch kernel to be more flexible with changes, but 
the built failed with

###
...
/root/kpatch/kpatch-build/create-diff-object: unreconcilable difference
...
###

but thats out of focus for this list and just fyi :-)




signature.asc
Description: Message signed with OpenPGP


RE: MegaCli fails to communicate with Raid-Controller

2018-04-23 Thread Kashyap Desai
>
> Interesting. What is considered old and new? I have a third machine "Dell
> R515, MegaRAID SAS 2108”, is that considered new? Its running the same
> Xen/Kernel/Megacli-versions as the other two, but the error does not
> occur.

Nope this is also old controller. When I say new controller, It is pretty
much active development like SAS3.0 and SAS3.5. Driver level changes related
to DMA mask settings is FW dependent, so we cannot open it for all.

>
> > There can be a two possibilities.
> >
> > 1. This is actual memory allocation failure due to system resource
> > issue.
>
> I have not seen any OOMs on the two machines when/where the SGL-error
> occurs. According to "xl info” and our munin-graphs it all looks ok with a
> couple 100 MiB “free".
>
>
> > 2. IOCLT provided large memory length in iov and dma buffer allocation
> > from below API failed due to large memory chunk requested.
> >
> >kbuff_arr[i] = dma_alloc_coherent(>pdev->dev,
> >ioc->sgl[i].iov_len,
> >_handle,
> > GFP_KERNEL);
> >
> > Can you change driver code *printk* to dump iov_len ? Just to confirm.
>
> Just did that on the “Dell R730xd, MegaRAID SAS-3 3108” and get the
> following output when the megacli works fine.
>
> ###
> Apr 23 09:31:37 xh643 kernel: [  368.319092] GD IOV-len: 2048 Apr 23
> 09:31:37
> xh643 kernel: [  368.319426] GD IOV-len: 32 Apr 23 09:31:37 xh643 kernel:
> [
> 368.319563] GD IOV-len: 320 Apr 23 09:31:37 xh643 kernel: [  368.319698]
> GD
> IOV-len: 616 Apr 23 09:31:37 xh643 kernel: [  368.319887] GD IOV-len: 1664
> Apr 23 09:31:37 xh643 kernel: [  368.320040] GD IOV-len: 32 Apr 23
> 09:31:37
> xh643 kernel: [  368.320174] GD IOV-len: 8 … ###
>
> Full output is attached in iov_len_megacli_works.txt, it also contains the
> output of /proc/buddyinfo which might be important based in my research so
> far.

We need similar output whenever there is a dma_alloc_coherent() failure.
Did you added new prints in failure of dma_alloc_coherent() OR it is generic
print for all the case ?