RE: [PATCH v6 04/10] scsi: ufshpb: Make eviction depends on region's reads

2021-03-24 Thread Avri Altman
> 
> On 2021-03-22 16:10, Avri Altman wrote:
> > In host mode, eviction is considered an extreme measure.
> > verify that the entering region has enough reads, and the exiting
> > region has much less reads.
> >
> > Signed-off-by: Avri Altman 
> > ---
> >  drivers/scsi/ufs/ufshpb.c | 18 +-
> >  1 file changed, 17 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> > index a1519cbb4ce0..5e757220d66a 100644
> > --- a/drivers/scsi/ufs/ufshpb.c
> > +++ b/drivers/scsi/ufs/ufshpb.c
> > @@ -17,6 +17,7 @@
> >  #include "../sd.h"
> >
> >  #define ACTIVATION_THRESHOLD 8 /* 8 IOs */
> > +#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 5) /* 256 IOs */
> >
> >  /* memory management */
> >  static struct kmem_cache *ufshpb_mctx_cache;
> > @@ -1047,6 +1048,13 @@ static struct ufshpb_region
> > *ufshpb_victim_lru_info(struct ufshpb_lu *hpb)
> >   if (ufshpb_check_srgns_issue_state(hpb, rgn))
> >   continue;
> >
> > + /*
> > +  * in host control mode, verify that the exiting region
> > +  * has less reads
> > +  */
> > + if (hpb->is_hcm && rgn->reads > (EVICTION_THRESHOLD >> 1))
> > + continue;
> > +
> >   victim_rgn = rgn;
> >   break;
> >   }
> > @@ -1219,7 +1227,7 @@ static int ufshpb_issue_map_req(struct ufshpb_lu
> > *hpb,
> >
> >  static int ufshpb_add_region(struct ufshpb_lu *hpb, struct
> > ufshpb_region *rgn)
> >  {
> > - struct ufshpb_region *victim_rgn;
> > + struct ufshpb_region *victim_rgn = NULL;
> >   struct victim_select_info *lru_info = &hpb->lru_info;
> >   unsigned long flags;
> >   int ret = 0;
> > @@ -1246,7 +1254,15 @@ static int ufshpb_add_region(struct ufshpb_lu
> > *hpb, struct ufshpb_region *rgn)
> >* It is okay to evict the least recently used region,
> >* because the device could detect this region
> >* by not issuing HPB_READ
> > +  *
> > +  * in host control mode, verify that the entering
> > +  * region has enough reads
> >*/
> > + if (hpb->is_hcm && rgn->reads < EVICTION_THRESHOLD) {
> > + ret = -EACCES;
> > + goto out;
> > + }
> > +
> 
> I cannot understand the logic behind this. A rgn which host chooses to
> activate,
> is in INACTIVE state now, if its rgn->reads < 256, then don't activate
> it.
> Could you please elaborate?
I am quoting the commit log again:
"In host mode, eviction is considered an extreme measure.
verify that the entering region has enough reads, and the exiting
region has much less reads."

This is where the reads counter comes into play, as a comparative index.
Once max-active-regions has been reached, activating a region requires
evicting another region.
But the activation threshold is relatively low, so how do you know that you
will benefit more from the new region than from the one you chose to evict?

Rather than arbitrarily evicting the "first" (LRU) region, as in device mode,
we are looking for a solid reason for the new region to enter, and for the
existing region to leave.
Otherwise, you will find yourself entering and exiting the same region over
and over, just thrashing the active list and creating unnecessary overhead by
repeatedly sending map requests.
For example, say the entering region has 4 reads, but the LRU region has 200,
and its reads keep coming.
Is it the "correct" decision to evict a 200-read region for a 4-read region?
If you do evict this 200-read region, you will soon evict another region to
put it right back, over and over.

On the other hand, we are not hanging on to "cold" regions: we inactivate them
if there are no recent reads to that region - see the patch with the "Cold"
timeout.

I agree that this can be elaborated into more sophisticated policies, which
we tried.
For now, let's go with the simplest one: use thresholds for both the entering
and exiting regions.
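
To make the two-threshold rule concrete, here is a minimal stand-alone sketch.
It is not the driver code: struct region and the helper names are hypothetical
stand-ins, and only the two threshold macros and the comparisons are taken
from the patch.

```c
#include <stdbool.h>

#define ACTIVATION_THRESHOLD 8                          /* 8 IOs */
#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 5)  /* 256 IOs */

/* Hypothetical, simplified stand-in for struct ufshpb_region. */
struct region {
	unsigned int reads;
};

/* The entering region needs at least EVICTION_THRESHOLD reads. */
static bool may_enter(const struct region *entering)
{
	return entering->reads >= EVICTION_THRESHOLD;
}

/* The exiting region may have at most half of EVICTION_THRESHOLD reads. */
static bool may_exit(const struct region *victim)
{
	return victim->reads <= (EVICTION_THRESHOLD >> 1);
}

/* Evict only when both conditions hold. */
static bool should_evict(const struct region *entering,
			 const struct region *victim)
{
	return may_enter(entering) && may_exit(victim);
}
```

With these numbers, the 4-read region from the example above can never push
out the 200-read region: it fails the entry check, and the 200-read region
also fails the exit check.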

Thanks,
Avri
> 
> Thanks,
> Can Guo.
> 
> >   victim_rgn = ufshpb_victim_lru_info(hpb);
> >   if (!victim_rgn) {
> >   dev_warn(&hpb->sdev_ufs_lu->sdev_dev,


Re: [PATCH v5 0/6] KVM: arm64: Add VLPI migration support on GICv4.1

2021-03-24 Thread Shenming Lu
On 2021/3/25 2:19, Marc Zyngier wrote:
> On Mon, 22 Mar 2021 14:01:52 +0800, Shenming Lu wrote:
>> In GICv4.1, migration has been supported except for (directly-injected)
>> VLPI. And GICv4.1 Spec explicitly gives a way to get the VLPI's pending
>> state (which was crucially missing in GICv4.0). So we make VLPI migration
>> capable on GICv4.1 in this series.
>>
>> In order to support VLPI migration, we need to save and restore all
>> required configuration information and pending states of VLPIs. But
>> in fact, the configuration information of VLPIs has already been saved
>> (or will be reallocated on the dst host...) in vgic(kvm) migration.
>> So we only have to migrate the pending states of VLPIs specially.
>>
>> [...]
> 
> Applied to next, thanks!

Thanks a lot again for all the comments and suggestions. :-)

Shenming

> 
> [1/6] irqchip/gic-v3-its: Add a cache invalidation right after vPE unmapping
>   commit: 301beaf19739cb6e640ed44e630e7da993f0ecc8
> [2/6] irqchip/gic-v3-its: Drop the setting of PTZ altogether
>   commit: c21bc068cdbe5613d3319ae171c3f2eb9f321352
> [3/6] KVM: arm64: GICv4.1: Add function to get VLPI state
>   commit: 80317fe4a65375fae668672a1398a0fb73eb9023
> [4/6] KVM: arm64: GICv4.1: Try to save VLPI state in save_pending_tables
>   commit: f66b7b151e00427168409f8c1857970e926b1e27
> [5/6] KVM: arm64: GICv4.1: Restore VLPI pending state to physical side
>   commit: 12df7429213abbfa9632ab7db94f629ec309a58b
> [6/6] KVM: arm64: GICv4.1: Give a chance to save VLPI state
>   commit: 8082d50f4817ff6a7e08f4b7e9b18e5f8bfa290d
> 
> Cheers,
> 
>   M.
> 


[PATCH 2/2] nvmem: qfprom: Add support for fuse blowing on sc7280

2021-03-24 Thread Rajendra Nayak
Handle the differences across LDO voltage needed for blowing fuses,
and the blow timer value, identified using a minor version of 15
on sc7280.

Signed-off-by: Rajendra Nayak 
Signed-off-by: Ravi Kumar Bokka 
---
Applies on top of https://lore.kernel.org/patchwork/patch/1376175/

 drivers/nvmem/qfprom.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/nvmem/qfprom.c b/drivers/nvmem/qfprom.c
index 100d69d..d6d3f24 100644
--- a/drivers/nvmem/qfprom.c
+++ b/drivers/nvmem/qfprom.c
@@ -45,11 +45,13 @@ MODULE_PARM_DESC(read_raw_data, "Read raw instead of corrected data");
  * @qfprom_blow_timer_value: The timer value of qfprom when doing efuse blow.
  * @qfprom_blow_set_freq:The frequency required to set when we start the
  *   fuse blowing.
+ * @qfprom_blow_uV:  LDO voltage to be set when doing efuse blow
  */
 struct qfprom_soc_data {
u32 accel_value;
u32 qfprom_blow_timer_value;
u32 qfprom_blow_set_freq;
+   int qfprom_blow_uV;
 };
 
 /**
@@ -111,6 +113,15 @@ static const struct qfprom_soc_compatible_data sc7180_qfprom = {
.nkeepout = ARRAY_SIZE(sc7180_qfprom_keepout)
 };
 
+static const struct nvmem_keepout sc7280_qfprom_keepout[] = {
+   {.start = 0x128, .end = 0x148},
+   {.start = 0x238, .end = 0x248}
+};
+
+static const struct qfprom_soc_compatible_data sc7280_qfprom = {
+   .keepout = sc7280_qfprom_keepout,
+   .nkeepout = ARRAY_SIZE(sc7280_qfprom_keepout)
+};
 /**
  * qfprom_disable_fuse_blowing() - Undo enabling of fuse blowing.
  * @priv: Our driver data.
@@ -168,6 +179,7 @@ static int qfprom_enable_fuse_blowing(const struct qfprom_priv *priv,
  struct qfprom_touched_values *old)
 {
int ret;
+   int qfprom_blow_uV = priv->soc_data->qfprom_blow_uV;
 
ret = clk_prepare_enable(priv->secclk);
if (ret) {
@@ -187,9 +199,9 @@ static int qfprom_enable_fuse_blowing(const struct qfprom_priv *priv,
 * a rail shared do don't specify a max--regulator constraints
 * will handle.
 */
-   ret = regulator_set_voltage(priv->vcc, 1800000, INT_MAX);
+   ret = regulator_set_voltage(priv->vcc, qfprom_blow_uV, INT_MAX);
if (ret) {
-   dev_err(priv->dev, "Failed to set 1.8 voltage\n");
+   dev_err(priv->dev, "Failed to set %duV\n", qfprom_blow_uV);
goto err_clk_rate_set;
}
 
@@ -311,6 +323,14 @@ static const struct qfprom_soc_data qfprom_7_8_data = {
.accel_value = 0xD10,
.qfprom_blow_timer_value = 25,
.qfprom_blow_set_freq = 4800000,
+   .qfprom_blow_uV = 1800000,
+};
+
+static const struct qfprom_soc_data qfprom_7_15_data = {
+   .accel_value = 0xD08,
+   .qfprom_blow_timer_value = 24,
+   .qfprom_blow_set_freq = 4800000,
+   .qfprom_blow_uV = 1900000,
 };
 
 static int qfprom_probe(struct platform_device *pdev)
@@ -379,6 +399,8 @@ static int qfprom_probe(struct platform_device *pdev)
 
if (major_version == 7 && minor_version == 8)
priv->soc_data = &qfprom_7_8_data;
+   if (major_version == 7 && minor_version == 15)
+   priv->soc_data = &qfprom_7_15_data;
 
priv->vcc = devm_regulator_get(&pdev->dev, "vcc");
if (IS_ERR(priv->vcc))
@@ -405,6 +427,7 @@ static int qfprom_probe(struct platform_device *pdev)
 static const struct of_device_id qfprom_of_match[] = {
{ .compatible = "qcom,qfprom",},
{ .compatible = "qcom,sc7180-qfprom", .data = &sc7180_qfprom},
+   { .compatible = "qcom,sc7280-qfprom", .data = &sc7280_qfprom},
{/* sentinel */},
 };
 MODULE_DEVICE_TABLE(of, qfprom_of_match);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



[PATCH 1/2] dt-bindings: nvmem: Add SoC compatible for sc7280

2021-03-24 Thread Rajendra Nayak
Document SoC compatible for sc7280

Signed-off-by: Rajendra Nayak 
---
 Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml b/Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml
index 992777c..861b205 100644
--- a/Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml
+++ b/Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml
@@ -24,6 +24,7 @@ properties:
   - qcom,msm8998-qfprom
   - qcom,qcs404-qfprom
   - qcom,sc7180-qfprom
+  - qcom,sc7280-qfprom
   - qcom,sdm845-qfprom
   - const: qcom,qfprom
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation



Re: [PATCH v3 9/9] dt-bindings: serial: stm32: add phandle 'bluetooth' to fix dtbs_check warrning

2021-03-24 Thread dillon min
Hi Rob,

Thanks for the suggestion.


On Thu, Mar 25, 2021 at 1:45 AM Rob Herring  wrote:
>
> On Fri, Mar 19, 2021 at 07:13:27PM +0800, dillon min wrote:
> > Hi Alexandre,
> >
> > Thanks for the reply.
> >
> > On Fri, Mar 19, 2021 at 4:38 PM Alexandre TORGUE
> >  wrote:
> > >
> > > Hi Dillon
> > >
> > > On 3/19/21 5:28 AM, dillon min wrote:
> > > > No changes, Just loop lkp in.
> > > >
> > > >
> > > > Hi lkp,
> > > >
> > > > Sorry for the late reply, thanks for your report.
> > > > This patch is to fix the build warning message.
> > > >
> > > > Thanks.
> > > > Regards
> > > >
> > > > On Mon, Mar 15, 2021 at 5:45 PM  wrote:
> > > >>
> > > >> From: dillon min 
> > > >>
> > > >> when run make dtbs_check with 'bluetoothi brcm,bcm43438-bt'
> > > >> dts enabled on stm32h7, there is a warrning popup:
> > > >>
> > >  arch/arm/boot/dts/stm32h750i-art-pi.dt.yaml: serial@40004800: 
> > >  'bluetooth'
> > > >> does not match any of the regexes: 'pinctrl-[0-9]+'
> > > >>
> > > >> to make dtbs_check happy, so add a phandle bluetooth
> > > >>
> > > >> Fixes: 500cdb23d608 ("ARM: dts: stm32: Add STM32H743 MCU and 
> > > >> STM32H743i-EVAL board")
> > > >> Signed-off-by: dillon min 
> > > >> Reported-by: kernel test robot 
> > > >> ---
> > > >>   Documentation/devicetree/bindings/serial/st,stm32-uart.yaml | 5 +
> > > >>   1 file changed, 5 insertions(+)
> > > >>
> > > >> diff --git 
> > > >> a/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml 
> > > >> b/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml
> > > >> index 8631678283f9..5e674840e62d 100644
> > > >> --- a/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml
> > > >> +++ b/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml
> > > >> @@ -50,6 +50,11 @@ properties:
> > > >>   minItems: 1
> > > >>   maxItems: 2
> > > >>
> > > >> +  bluetooth:
> > > >> +type: object
> > > >> +description: |
> > > >> +  phandles to the usart controller and bluetooth
> > > >> +
> > >
> > > Do we really need to add this "generic" property here ? You could test
> > > without the "AditionalProperties:False".
> > Yes, indeed. we have no reason to add a generic 'bluetooth' property
> > into specific soc's interface yaml.
> > I can't just remove "AditionalProperties:False", else make
> > O=../kernel-art/ dtbs dtbs_check will run into
> >
> > /home/fmin/linux/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml:
> > 'oneOf' conditional failed, one must be fixed:
> > 'unevaluatedProperties' is a required property
> > 'additionalProperties' is a required property
> > ...
> >
> > So , i will replace "AditionalProperties:False". with
> > unevaluatedProperties: false, do you agree with this?
>
> This is okay as long as 'serial.yaml' is referenced, but will eventually
> fail if not (unevaluatedProperties isn't actually implemented yet).
>
> > If so, i will send patch v4 later.
>
> Or you can do this:
>
> additionalProperties:
>   type: object
>
> Which means any other property has to be a node.
>
Okay, I just tested your suggestion, and it fixes the dtbs_check warning as well.
I will include it in my next submission, thanks.

Hi Valentin CARON,
Could you help double-check it after I submit v5? Thanks so much.

Regards.

> Rob


Re: [PATCH] cpufreq: dt: check the error returned by dev_pm_opp_of_cpumask_add_table

2021-03-24 Thread quanyang.wang

Hi Viresh,

On 3/25/21 1:24 PM, Viresh Kumar wrote:
> On 25-03-21, 13:15, quanyang.wang wrote:
> > Thank you for pointing it out.  Do you mean that even if
> > dev_pm_opp_of_cpumask_add_table returns
> > an error, dev_pm_opp_get_opp_count may still return count > 0 because
> > someone may call dev_pm_opp_add
> > to add OPP to cpu succcessfully at somewhere else?
>
> Yes.
>
> There are two ways we can add OPPs today:
>
> - Statically via device tree. This is what
>   dev_pm_opp_of_cpumask_add_table() tries to do.
>
> - Dynamically via call to dev_pm_opp_add(), which I described earlier.
>
> What failed here is the static way of adding OPPs, we still need to
> check if OPPs were added dynamically.


Thank you for shedding light on this.

I will send a V2 patch which only checks for the -EPROBE_DEFER return error.
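
Roughly, the decision discussed above could look like the stand-alone sketch
below. It is only an illustration: opp_init_result() is a hypothetical helper,
the two arguments stand in for the results of
dev_pm_opp_of_cpumask_add_table() and dev_pm_opp_get_opp_count(), and
returning -ENODEV for an empty table is an assumption. EPROBE_DEFER is a
kernel-internal errno, so it is defined locally here.

```c
#include <errno.h>

#define EPROBE_DEFER 517 /* kernel-internal value, defined here for the sketch */

/*
 * Hypothetical helper: decide what the probe path should return.
 * add_table_ret: result of the static (device tree) OPP table setup.
 * opp_count:     number of OPPs found afterwards, static or dynamic.
 */
static int opp_init_result(int add_table_ret, int opp_count)
{
	/* Only a deferred probe aborts right away. */
	if (add_table_ret == -EPROBE_DEFER)
		return add_table_ret;

	/* Other errors are tolerated: OPPs may have been added dynamically. */
	if (opp_count <= 0)
		return -ENODEV;

	return 0;
}
```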

Thanks,

Quanyang





Re: [PATCH v6 04/10] scsi: ufshpb: Make eviction depends on region's reads

2021-03-24 Thread Can Guo

On 2021-03-22 16:10, Avri Altman wrote:

In host mode, eviction is considered an extreme measure.
verify that the entering region has enough reads, and the exiting
region has much less reads.

Signed-off-by: Avri Altman 
---
 drivers/scsi/ufs/ufshpb.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index a1519cbb4ce0..5e757220d66a 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -17,6 +17,7 @@
 #include "../sd.h"

 #define ACTIVATION_THRESHOLD 8 /* 8 IOs */
+#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 5) /* 256 IOs */

 /* memory management */
 static struct kmem_cache *ufshpb_mctx_cache;
@@ -1047,6 +1048,13 @@ static struct ufshpb_region
*ufshpb_victim_lru_info(struct ufshpb_lu *hpb)
if (ufshpb_check_srgns_issue_state(hpb, rgn))
continue;

+   /*
+* in host control mode, verify that the exiting region
+* has less reads
+*/
+   if (hpb->is_hcm && rgn->reads > (EVICTION_THRESHOLD >> 1))
+   continue;
+
victim_rgn = rgn;
break;
}
@@ -1219,7 +1227,7 @@ static int ufshpb_issue_map_req(struct ufshpb_lu 
*hpb,


 static int ufshpb_add_region(struct ufshpb_lu *hpb, struct 
ufshpb_region *rgn)

 {
-   struct ufshpb_region *victim_rgn;
+   struct ufshpb_region *victim_rgn = NULL;
struct victim_select_info *lru_info = &hpb->lru_info;
unsigned long flags;
int ret = 0;
@@ -1246,7 +1254,15 @@ static int ufshpb_add_region(struct ufshpb_lu
*hpb, struct ufshpb_region *rgn)
 * It is okay to evict the least recently used region,
 * because the device could detect this region
 * by not issuing HPB_READ
+*
+* in host control mode, verify that the entering
+* region has enough reads
 */
+   if (hpb->is_hcm && rgn->reads < EVICTION_THRESHOLD) {
+   ret = -EACCES;
+   goto out;
+   }
+


I cannot understand the logic behind this. A rgn which the host chooses to
activate is in INACTIVE state now; if its rgn->reads < 256, then it is not
activated.

Could you please elaborate?

Thanks,
Can Guo.


victim_rgn = ufshpb_victim_lru_info(hpb);
if (!victim_rgn) {
dev_warn(&hpb->sdev_ufs_lu->sdev_dev,


Re: [PATCH 2/2] dt-binding: leds: Document leds-multi-gpio bindings

2021-03-24 Thread Vesa Jääskeläinen

Hi,

See below.

On 24.3.2021 9.56, Hermes Zhang wrote:

From: Hermes Zhang 

Document the device tree bindings of the multiple GPIOs LED driver
Documentation/devicetree/bindings/leds/leds-multi-gpio.yaml.

Signed-off-by: Hermes Zhang 
---
  .../bindings/leds/leds-multi-gpio.yaml| 50 +++
  1 file changed, 50 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/leds/leds-multi-gpio.yaml

diff --git a/Documentation/devicetree/bindings/leds/leds-multi-gpio.yaml 
b/Documentation/devicetree/bindings/leds/leds-multi-gpio.yaml
new file mode 100644
index 000000000000..6f2b47487b90
--- /dev/null
+++ b/Documentation/devicetree/bindings/leds/leds-multi-gpio.yaml
@@ -0,0 +1,50 @@
+# SPDX-License-Identifier: GPL-2.0
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/leds/leds-multi-gpio.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Multiple GPIOs LED driver
+
+maintainers:
+  - Hermes Zhang 
+
+description:
+  This will support some LED made of multiple GPIOs and the brightness of the
+  LED could map to different states of the GPIOs.
+
+properties:
+  compatible:
+const: multi-gpio-led
+
+  led-gpios:
+description: Array of one or more GPIOs pins used to control the LED.
+minItems: 1
+maxItems: 8  # Should be enough


We also have a case with multi-color LEDs (which is probably more 
common than multi-intensity LEDs), so I am wondering how both of these could 
co-exist.


From: 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/leds/leds-gpio.yaml?h=v5.12-rc4#n58


led-0 {
gpios = <_pio 0 GPIO_ACTIVE_LOW>;
linux,default-trigger = "disk-activity";
function = LED_FUNCTION_DISK;
};

Now 'gpios' (in the LED context) and 'led-gpios' are very close to each 
other and could easily be confused.


Perhaps this could be something like:

intensity-gpios = ...

or even simplified then just to gpios = <...>


+
+  led-states:
+description: |
+  The array list the supported states here which will map to brightness
+  from 0 to maximum. Each item in the array will present all the GPIOs
+  value by bit.
+$ref: /schemas/types.yaml#/definitions/uint8-array
+minItems: 1
+maxItems: 16 # Should be enough
+
+required:
+  - compatible
+  - led-gpios
+  - led-states
+
+additionalProperties: false
+
+examples:
+  - |
+gpios-led {
+  compatible = "multi-gpio-led";
+
+  led-gpios = < 23 0x1>,
+  < 24 0x1>;
+  led-states = /bits/ 8 <0x00 0x01 0x02 0x03>;
+};
+...



From: 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/leds/leds-lp55xx.yaml?h=v5.12-rc4#n196


There is an example of a multi-color LED configuration there. In the example 
below I used a two-color LED with red and green (which is what we seem 
to have most in use).


Then, if we try to combine these into something like:

# Multi color LED with single GPIO line per color
multi-led@2 {
  compatible = "gpio-leds";
  color = ;
  led@0 {
color = ;
gpios = <_pio 0 GPIO_ACTIVE_LOW>;
  };

  led@1 {
color = ;
gpios = <_pio 1 GPIO_ACTIVE_LOW>;
  };
};

# And with intensity GPIOs:
multi-led@2 {
  compatible = "gpio-leds";
  color = ;

  led@0 {
color = ;
gpios = < 23 0x1>,
< 24 0x1>;
... see below
  };

  led@1 {
color = ;
gpios = < 25 0x1>,
< 26 0x1>;
... see below
  };
};

# And then single GPIO with intensity GPIOs:
led@2 {
  compatible = "gpio-leds";
  gpios = < 23 0x1>,
  < 24 0x1>;
  gpios-brightness-levels = <0 1 2 3>;
};

I changed 'led-states' to 'gpios-brightness-levels' as it describes more 
clearly that this is about brightness and not some other state information.
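
As a rough illustration of how such a levels array could be consumed, the
sketch below maps a brightness value to the level of each GPIO, with bit i of
the selected entry driving GPIO i. The helper name and the clamping of
out-of-range brightness are assumptions made for the example, not part of the
proposed binding or driver.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the level (0 or 1) of GPIO 'gpio_idx' for a
 * given brightness. 'states' holds one bitmask per brightness step and
 * must contain at least one entry.
 */
static int led_gpio_value(const uint8_t *states, size_t nstates,
			  unsigned int brightness, unsigned int gpio_idx)
{
	/* Assumption: brightness beyond the table clamps to the last entry. */
	if (brightness >= nstates)
		brightness = (unsigned int)(nstates - 1);

	return (states[brightness] >> gpio_idx) & 1;
}
```

For the table <0x00 0x01 0x02 0x03>, brightness 2 would drive the first GPIO
low and the second GPIO high.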


How would this sound?

Thanks,
Vesa Jääskeläinen


Re: [PATCH v1 3/3] KEYS: trusted: Introduce support for NXP CAAM-based trusted keys

2021-03-24 Thread Sumit Garg
On Wed, 24 Mar 2021 at 19:37, Ahmad Fatoum  wrote:
>
> Hello Sumit,
>
> On 24.03.21 11:47, Sumit Garg wrote:
> > On Wed, 24 Mar 2021 at 14:56, Ahmad Fatoum  wrote:
> >>
> >> Hello Mimi,
> >>
> >> On 23.03.21 19:07, Mimi Zohar wrote:
> >>> On Tue, 2021-03-23 at 17:35 +0100, Ahmad Fatoum wrote:
>  On 21.03.21 21:48, Horia Geantă wrote:
> > caam has random number generation capabilities, so it's worth using that
> > by implementing .get_random.
> 
>  If the CAAM HWRNG is already seeding the kernel RNG, why not use the 
>  kernel's?
> 
>  Makes for less code duplication IMO.
> >>>
> >>> Using kernel RNG, in general, for trusted keys has been discussed
> >>> before.   Please refer to Dave Safford's detailed explanation for not
> >>> using it [1].
> >>
> >> The argument seems to boil down to:
> >>
> >>  - TPM RNG are known to be of good quality
> >>  - Trusted keys always used it so far
> >>
> >> Both are fine by me for TPMs, but the CAAM backend is new code and neither 
> >> point
> >> really applies.
> >>
> >> get_random_bytes_wait is already used for generating key material 
> >> elsewhere.
> >> Why shouldn't new trusted key backends be able to do the same thing?
> >>
> >
> > Please refer to documented trusted keys behaviour here [1]. New
> > trusted key backends should align to this behaviour and in your case
> > CAAM offers HWRNG so we should be better using that.
>
> Why is it better?
>
> Can you explain what benefit a CAAM user would have if the trusted key
> randomness comes directly out of the CAAM instead of indirectly from
> the kernel entropy pool that is seeded by it?

IMO, user trust in the case of trusted keys comes from the trusted keys
backend, which is CAAM here. If a user doesn't trust CAAM to act as a
reliable RNG source, then CAAM shouldn't be used as a trust source in
the first place.

And I think building a user's trust in the kernel RNG implementation, with
its multiple entropy contributions, is pretty difficult compared with
the CAAM HWRNG implementation.

-Sumit

>
> > Also, do update documentation corresponding to CAAM as a trusted keys 
> > backend.
>
> Yes. The documentation should be updated for CAAM and it should describe
> how the key material is derived. Will do so for v2.
>
> Cheers,
> Ahmad
>
> >
> > [1] 
> > https://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd.git/tree/Documentation/security/keys/trusted-encrypted.rst#n87
> >
> > -Sumit
> >
> >> Cheers,
> >> Ahmad
> >>
> >>>
> >>> thanks,
> >>>
> >>> Mimi
> >>>
> >>> [1]
> >>> https://lore.kernel.org/linux-integrity/bca04d5d9a3b764c9b7405bba4d4a3c035f2a...@alpmbapa12.e2k.ad.ge.com/
> >>>
> >>>
> >>>
> >>
> >> --
> >> Pengutronix e.K.   | |
> >> Steuerwalder Str. 21   | http://www.pengutronix.de/  |
> >> 31137 Hildesheim, Germany  | Phone: +49-5121-206917-0|
> >> Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |
> >
>
> --
> Pengutronix e.K.   | |
> Steuerwalder Str. 21   | http://www.pengutronix.de/  |
> 31137 Hildesheim, Germany  | Phone: +49-5121-206917-0|
> Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |


Re: [PATCH 2/5] cifsd: add server-side procedures for SMB3

2021-03-24 Thread Sebastian Gottschall



Am 23.03.2021 um 08:19 schrieb Dan Carpenter:

On Tue, Mar 23, 2021 at 08:17:47AM +0900, Namjae Jeon wrote:

+
+static int
+compare_oid(unsigned long *oid1, unsigned int oid1len,
+   unsigned long *oid2, unsigned int oid2len) {
+   unsigned int i;
+
+   if (oid1len != oid2len)
+   return 0;
+
+   for (i = 0; i < oid1len; i++) {
+   if (oid1[i] != oid2[i])
+   return 0;
+   }
+   return 1;
+}

Call this oid_eq()?

Why not compare_oid()? This code is come from cifs.
I need clear reason to change both cifs/cifsd...


Boolean functions should tell you what they are testing in the name.
Without any context you can't know what if (compare_oid(one, two)) {
means, but if (oid_equal(one, two)) { is readable.

regards,
dan carpenter

Just a minor comment, but
return !memcmp(oid1, oid2, sizeof(*oid1) * oid1len);
looks much more efficient than this "for" loop (after the length check).
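
For reference, a stand-alone sketch that combines both suggestions from this
thread: a boolean name that says what is tested, and memcmp() in place of the
loop. The name oid_equal() follows the review comment. Sizing the comparison
by the element type and keeping the length check are choices made for this
example.

```c
#include <stdbool.h>
#include <string.h>

/* Return true when both OIDs have the same length and the same elements. */
static bool oid_equal(const unsigned long *oid1, unsigned int oid1len,
		      const unsigned long *oid2, unsigned int oid2len)
{
	if (oid1len != oid2len)
		return false;

	/* Compare oid1len elements, each sizeof(*oid1) bytes wide. */
	return memcmp(oid1, oid2, oid1len * sizeof(*oid1)) == 0;
}
```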






Re: [PATCH] cpufreq: dt: check the error returned by dev_pm_opp_of_cpumask_add_table

2021-03-24 Thread Viresh Kumar
On 25-03-21, 13:15, quanyang.wang wrote:
> Thank you for pointing it out.  Do you mean that even if
> dev_pm_opp_of_cpumask_add_table returns
> 
> an error, dev_pm_opp_get_opp_count may still return count > 0 because
> someone may call dev_pm_opp_add
> 
> to add OPP to cpu succcessfully at somewhere else?

Yes.

There are two ways we can add OPPs today:

- Statically via device tree. This is what
  dev_pm_opp_of_cpumask_add_table() tries to do.

- Dynamically via call to dev_pm_opp_add(), which I described earlier.

What failed here is the static way of adding OPPs, we still need to
check if OPPs were added dynamically.

-- 
viresh


Re: [PATCH] arm64: dts: qcom: sc7280: Add PMIC peripherals for SC7280

2021-03-24 Thread skakit

Hi Matthias,

On 2021-03-22 23:04, Matthias Kaehlcke wrote:

Hi Satya,

On Mon, Mar 22, 2021 at 06:50:47PM +0530, ska...@codeaurora.org wrote:

Hi Matthias,

On 2021-03-13 02:10, Matthias Kaehlcke wrote:
> Hi Satya,
>
> On Thu, Mar 11, 2021 at 04:10:29PM +0530, satya priya wrote:
> > Add PM7325/PM8350C/PMK8350/PMR735A peripherals such as PON,
> > GPIOs, RTC and other PMIC infra modules for SC7280.
> >
> > Signed-off-by: satya priya 
> > ---
> > This patch depends on base DT and board files for SC7280 to merge
> > first
> > https://lore.kernel.org/patchwork/project/lkml/list/?series=487403
> >
> >  arch/arm64/boot/dts/qcom/pm7325.dtsi  |  60 
> >  arch/arm64/boot/dts/qcom/pm8350c.dtsi |  60 
> >  arch/arm64/boot/dts/qcom/pmk8350.dtsi | 104
> > ++
> >  arch/arm64/boot/dts/qcom/pmr735a.dtsi |  60 
> >  arch/arm64/boot/dts/qcom/sc7280.dtsi  |   8 +++
> >  5 files changed, 292 insertions(+)
> >  create mode 100644 arch/arm64/boot/dts/qcom/pm7325.dtsi
> >  create mode 100644 arch/arm64/boot/dts/qcom/pm8350c.dtsi
> >  create mode 100644 arch/arm64/boot/dts/qcom/pmk8350.dtsi
> >  create mode 100644 arch/arm64/boot/dts/qcom/pmr735a.dtsi
> >
> > diff --git a/arch/arm64/boot/dts/qcom/pm7325.dtsi
> > b/arch/arm64/boot/dts/qcom/pm7325.dtsi
> > new file mode 100644
> > index 0000000..393b256
> > --- /dev/null
> > +++ b/arch/arm64/boot/dts/qcom/pm7325.dtsi
> > @@ -0,0 +1,60 @@
>
> ...
>
> > + polling-delay-passive = <100>;
> > + polling-delay = <0>;
>
> Are you sure that no polling delay is needed? How does the thermal
> framework
> detect that the temperatures is >= the passive trip point and that it
> should
> start polling at 'polling-delay-passive' rate?
>

As the temp-alarm has interrupt support, whenever preconfigured threshold
violates it notifies thermal framework, so I think the polling delay is not
needed here.


From the documentation I found it's not clear to me how exactly these
interrupts work. Is a single interrupt triggered when the threshold is
violated or are there periodic (?) interrupts as long as the temperature
is above the stage 0 threshold?

Why is 'polling-delay-passive' passive needed if there are interrupts? Maybe
to detect that the zone should transition from passive to no cooling when the
temperature drops below the stage 0 threshold?


The PMIC TEMP_ALARM peripheral maintains an internal over-temperature 
stage: 0, 1, 2, or 3.  Stage 0 is normal operation below the lowest 
(stage 1) threshold [usually 95 C].  When in stage 1, the temperature is 
between the stage 1 and 2 thresholds [stage 2 threshold is usually 115 
C].  Upon hitting the stage 3 threshold [usually 145 C], the PMIC 
hardware will automatically shut down the system.


The TEMP_ALARM IRQ fires on stage 0 -> 1 and 1 -> 0 transitions.  We 
therefore set polling-delay = <0> since there is no need for software to 
monitor the temperature periodically when operating in stage 0.  Upon 
crossing the stage 1 threshold, SW receives the IRQ and the thermal 
framework hits its first trip, changing the thermal zone to passive mode.  
This then engages the 100 ms polling enabled via 
polling-delay-passive = <100>.  If the temperature keeps climbing and 
passes the stage 2 threshold, the thermal framework hits the second trip 
(which is critical) and it initiates an orderly shutdown.  If the 
temperature drops below the stage 1 threshold, then the thermal framework 
exits passive mode and stops polling.  This approach reduces/eliminates the 
software overhead when not at an elevated temperature.
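
The behaviour described above can be summarised in a small stand-alone model.
This is only a sketch: the helper names are made up, the thresholds are the
"usual" values quoted above, and treating a threshold as reached at >= is an
assumption.

```c
#include <stdbool.h>

/* Typical thresholds quoted above, in degrees C. */
#define STAGE1_THRESHOLD_C 95
#define STAGE2_THRESHOLD_C 115
#define STAGE3_THRESHOLD_C 145

#define POLLING_DELAY_MS         0   /* polling-delay = <0> */
#define POLLING_DELAY_PASSIVE_MS 100 /* polling-delay-passive = <100> */

enum temp_stage { STAGE_0, STAGE_1, STAGE_2, STAGE_3 };

/* Map a temperature to the over-temperature stage (assumes >= semantics). */
static enum temp_stage stage_for(int temp_c)
{
	if (temp_c >= STAGE3_THRESHOLD_C)
		return STAGE_3;
	if (temp_c >= STAGE2_THRESHOLD_C)
		return STAGE_2;
	if (temp_c >= STAGE1_THRESHOLD_C)
		return STAGE_1;
	return STAGE_0;
}

/* Stage 0 is IRQ-driven (no polling); above it the zone polls passively. */
static int polling_interval_ms(enum temp_stage stage)
{
	return stage == STAGE_0 ? POLLING_DELAY_MS : POLLING_DELAY_PASSIVE_MS;
}

/* The second trip is critical: software starts an orderly shutdown. */
static bool is_critical(enum temp_stage stage)
{
	return stage >= STAGE_2;
}
```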


Thanks,
Satya Priya


Re: [PATCH] fs: Improve eventpoll logging to stop indicting timerfd

2021-03-24 Thread Manish Varma
Hi Thomas,

On Mon, Mar 22, 2021 at 2:40 PM Thomas Gleixner  wrote:
>
> Manish,
>
> On Mon, Mar 22 2021 at 10:15, Manish Varma wrote:
> > On Thu, Mar 18, 2021 at 6:04 AM Thomas Gleixner  wrote:
> >> > +static atomic_t instance_count = ATOMIC_INIT(0);
> >>
> >> instance_count is misleading as it does not do any accounting of
> >> instances as the name suggests.
> >>
> >
> > Not sure if I am missing a broader point here, but the objective of this
> > patch is to:
> > A. To help find the process a given timerfd associated with, and
> > B. one step further, if there are multiple fds created by a single
> > process then label each instance using monotonically increasing integer
> > i.e. "instance_count" to help identify each of them separately.
> >
> > So, instance_count in my mind helps with "B", i.e. to keep track and
> > identify each instance of timerfd individually.
>
> I know what you want to do. The point is that instance_count is the
> wrong name as it suggests that it counts instances, and that in most
> cases implies active instances.
>
> It's not a counter, it's a token generator which allows you to create
> unique ids. The fact that it is just incrementing once per created file
> descriptor does not matter. That's just an implementation detail.
>
> Name it something like timerfd_create_id or timerfd_session_id which
> clearly tells that this is not counting any thing. It immediately tells
> the purpose of generating an id.
>
> Naming matters when reading code, really.
>

Noted, and thanks for the clarification!

> >> > + snprintf(file_name_buf, sizeof(file_name_buf), "[timerfd%d:%s]",
> >> > +  instance, task_comm_buf);
> >> > + ufd = anon_inode_getfd(file_name_buf, &timerfd_fops, ctx,
> >> >  O_RDWR | (flags & TFD_SHARED_FCNTL_FLAGS));
> >> >   if (ufd < 0)
> >> >   kfree(ctx);
> >>
> >> I actually wonder, whether this should be part of anon_inode_get*().
> >>
> >
> > I am curious (and open at the same time) if that will be helpful..
> > In the case of timerfd, I could see it adds up value by stuffing more
> > context to the file descriptor name as eventpoll is using the same file
> > descriptor names as wakesource name, and hence the cost of slightly
> > longer file descriptor name justifies. But I don't have a solid reason
> > if this additional cost (of longer file descriptor names) will be
> > helpful in general with other file descriptors.
>
> Obviously you want to make that depend on a flag handed to anon_...().

Unfortunately, changing file descriptor names does not seem to be a viable
option here (more details in my answer in the next section), and
hence changes in anon_...() do not seem to be required.

>
> The point is that there will be the next anonfd usecase which needs
> unique identification at some point. That is going to copy that
> timerfd code and then make it slightly different just because and then
> userspace needs to parse yet another format.
>
> >> Aside of that this is a user space visible change both for eventpoll and
> >> timerfd.
>
> Not when done right.
>
> >> Have you carefully investigated whether there is existing user space
> >> which might depend on the existing naming conventions?
> >>
> > I am not sure how I can confirm that for all userspace, but open for
> > suggestions if you can share some ideas.
> >
> > However, I have verified and can confirm for Android userspace that
> > there is no dependency on existing naming conventions for timerfd and
> > eventpoll wakesource names, if that helps.
>
> Well, there is a world outside Android and you're working for a company
> which should have tools to search for '[timerfd]' usage in a gazillion of
> projects. The obvious primary targets are distros of all sorts. I'm sure
> there are ways to figure this out without doing it manually.
>
> Not that I expect any real dependencies on it, but as always the devil
> is in the details.
>

Right, there is some userspace that depends on the "[timerfd]" string:
https://codesearch.debian.net/search?q=%22%5Btimerfd%5D%22=1

So, modifying file descriptor names, at least for timerfd, will definitely
break those.

With that said, I am now thinking about leaving the file descriptor
names as they are and instead adding the extra information about the
associated process (i.e. the process name, or rather the PID of the
process) along with the token ID directly into the wakesource name, at
the time of creating a new wakesource, i.e. in ep_create_wakeup_source().

So the wakesource names, which are currently "[timerfd]", will be
named something like:
"epollitem:.[timerfd]"

Where N is the number of wakesources created since boot.

This way we can still associate the process with the wakesource
name and also distinguish multiple instances of wakesources using
the integer identifier.
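
As a rough illustration of building such a name, here is a userspace sketch. The helper name format_wakesource_name() and the exact placement of the id and PID in the string are assumptions; the proposal above does not pin down the final format.

```c
#include <stdio.h>

/*
 * Hypothetical helper: field order and separators are assumptions,
 * chosen only to show an id and a PID being folded into the name
 * while the original "[timerfd]" string is kept intact.
 */
static int format_wakesource_name(char *buf, size_t len, unsigned int id,
				  int pid, const char *fd_name)
{
	/* snprintf() returns the untruncated length, so callers can detect truncation. */
	return snprintf(buf, len, "epollitem%u:%d.%s", id, pid, fd_name);
}
```

Because the file descriptor name itself is left untouched, tools that match on "[timerfd]" would keep working under this scheme.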

Please share your thoughts!

> Thanks,
>
> tglx

Thanks,
Manish
--
Manish Varma | Software Engineer | var...@google.com | 650-686-0858


Re: [PATCH] cpufreq: dt: check the error returned by dev_pm_opp_of_cpumask_add_table

2021-03-24 Thread quanyang.wang

Hi Viresh,

On 3/25/21 12:45 PM, Viresh Kumar wrote:

On 25-03-21, 12:31, quanyang.w...@windriver.com wrote:

From: Quanyang Wang 

The function dev_pm_opp_of_cpumask_add_table may return zero or an
error. When it returns an error, this means that no OPP table is
added for the cpumask because _dev_pm_opp_cpumask_remove_table is
called to free all OPPs associated with the cpu devices in the error
label "remove_table". So continuing to run the next function
dev_pm_opp_get_opp_count is meaningless since it always returns the
count value as 0.

There is another reason why we should check the error returned by
dev_pm_opp_of_cpumask_add_table is that it may return -EPROBE_DEFER
which comes from clk_get(dev, NULL) in _update_opp_table_clk. When
the clk for cpu device isn't ready, dt_cpufreq_probe should be deferred
and wait to be called again. But if we ignore the return error of
dev_pm_opp_of_cpumask_add_table, dt_cpufreq_probe will return -ENODEV
because dev_pm_opp_get_opp_count returns the count value as 0,
the cpufreq-dt driver will fail with the error log as below:

[0.724069] cpu cpu0: OPP table can't be empty

Signed-off-by: Quanyang Wang 
---
  drivers/cpufreq/cpufreq-dt.c | 12 +---
  1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index b1e1bdc63b01..f24359f47b1a 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -255,10 +255,16 @@ static int dt_cpufreq_early_init(struct device *dev, int cpu)
 * before updating priv->cpus. Otherwise, we will end up creating
 * duplicate OPPs for the CPUs.
 *
-* OPPs might be populated at runtime, don't check for error here.

As the comment (which you removed) clearly says, the OPPs may be added
at runtime, don't check for error here.

When we say runtime, we mean someone may have called dev_pm_opp_add()
for the devices.


Thank you for pointing it out.  Do you mean that even if
dev_pm_opp_of_cpumask_add_table returns an error,
dev_pm_opp_get_opp_count may still return count > 0 because someone
may have called dev_pm_opp_add to add OPPs to the CPU successfully
somewhere else?

Thanks,

Quanyang




+* We need check the return value here, if it is non-zero, there is
+* need to go on.
 */
-   if (!dev_pm_opp_of_cpumask_add_table(priv->cpus))
-   priv->have_static_opps = true;
+   ret = dev_pm_opp_of_cpumask_add_table(priv->cpus);
+   if (ret) {
+   dev_err(cpu_dev, "Failed to add OPP table for CPUs\n");
+   goto out;
+   }
+
+   priv->have_static_opps = true;
  
  	/*

 * The OPP table must be initialized, statically or dynamically, by this


Re: [PATCH v4] audit: log nftables configuration change events once per table

2021-03-24 Thread kernel test robot
Hi Richard,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on nf/master]
[also build test WARNING on nf-next/master pcmoore-audit/next v5.12-rc4 next-20210324]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Richard-Guy-Briggs/audit-log-nftables-configuration-change-events-once-per-table/20210325-115438
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master
config: arc-allyesconfig (attached as .config)
compiler: arceb-elf-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/e2632994acb2553a22a739b3a876a091d04f446c
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Richard-Guy-Briggs/audit-log-nftables-configuration-change-events-once-per-table/20210325-115438
git checkout e2632994acb2553a22a739b3a876a091d04f446c
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arc

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

>> net/netfilter/nf_tables_api.c:7993:5: warning: no previous prototype for 'nf_tables_commit_audit_alloc' [-Wmissing-prototypes]
7993 | int nf_tables_commit_audit_alloc(struct list_head *adl,
 | ^~~~
>> net/netfilter/nf_tables_api.c:8011:6: warning: no previous prototype for 'nf_tables_commit_audit_collect' [-Wmissing-prototypes]
8011 | void nf_tables_commit_audit_collect(struct list_head *adl,
 |  ^~
>> net/netfilter/nf_tables_api.c:8030:6: warning: no previous prototype for 'nf_tables_commit_audit_log' [-Wmissing-prototypes]
8030 | void nf_tables_commit_audit_log(struct list_head *adl, u32 generation)
 |  ^~


vim +/nf_tables_commit_audit_alloc +7993 net/netfilter/nf_tables_api.c

  7992  
> 7993  int nf_tables_commit_audit_alloc(struct list_head *adl,
  7994   struct nft_table *table)
  7995  {
  7996  struct nft_audit_data *adp;
  7997  
  7998  list_for_each_entry(adp, adl, list) {
  7999  if (adp->table == table)
  8000  return 0;
  8001  }
  8002  adp = kzalloc(sizeof(*adp), GFP_KERNEL);
  8003  if (!adp)
  8004  return -ENOMEM;
  8005  adp->table = table;
  8006  INIT_LIST_HEAD(&adp->list);
  8007  list_add(&adp->list, adl);
  8008  return 0;
  8009  }
  8010  
> 8011  void nf_tables_commit_audit_collect(struct list_head *adl,
  8012  struct nft_table *table, u32 op)
  8013  {
  8014  struct nft_audit_data *adp;
  8015  
  8016  list_for_each_entry(adp, adl, list) {
  8017  if (adp->table == table)
  8018  goto found;
  8019  }
  8020  WARN_ONCE("table=%s not expected in commit list", table->name);
  8021  return;
  8022  found:
  8023  adp->entries++;
  8024  if (!adp->op || adp->op > op)
  8025  adp->op = op;
  8026  }
  8027  
  8028  #define AUNFTABLENAMELEN (NFT_TABLE_MAXNAMELEN + 22)
  8029  
> 8030  void nf_tables_commit_audit_log(struct list_head *adl, u32 generation)
  8031  {
  8032  struct nft_audit_data *adp, *adn;
  8033  char aubuf[AUNFTABLENAMELEN];
  8034  
  8035  list_for_each_entry_safe(adp, adn, adl, list) {
  8036  snprintf(aubuf, AUNFTABLENAMELEN, "%s:%u", adp->table->name,
  8037   generation);
  8038  audit_log_nfcfg(aubuf, adp->table->family, adp->entries,
  8039  nft2audit_op[adp->op], GFP_KERNEL);
  8040  list_del(&adp->list);
  8041  kfree(adp);
  8042  }
  8043  }
  8044  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state

2021-03-24 Thread Liu, Jing2




On 3/25/2021 5:09 AM, Len Brown wrote:

On Tue, Mar 23, 2021 at 11:15 PM Liu, Jing2  wrote:


IMO, the problem with AVX512 state
is that we guaranteed it will be zero for XINUSE=0.
That means we have to write 0's on saves.

Why do "we have to write 0's on saves" when XINUSE=0?

According to the SDM, if XINUSE=0, XSAVES will *not* save the data and
the xstate_bv bit is 0; XSAVE does need to save the state, but the
xstate_bv bit is also 0.

   It would be better
to be able to skip the write -- even if we can't save the space
we can save the data transfer.  (This is what we did for AMX).

With the XFD feature and XFD=1, the XSAVE instruction still has to save
INIT state to the area. So it seems that with XINUSE=0 and XFD=1, the
XSAVE(S) instructions behave the same, and both can help save the data
transfer.

Hi Jing, Good observation!

There are 3 cases.

Hi Len, thanks for your reply.


1. Task context switch save into the context switch buffer.
Here we use XSAVES, and as you point out, XSAVES includes
the compaction optimization feature tracked by XINUSE.
So when AMX is enabled, but clean, XSAVES doesn't write zeros.
Further, it omits the buffer space for AMX in the destination altogether!
However, since XINUSE=1 is possible, we have to *allocate* a buffer
large enough to handle the dirty data for when XSAVES can not
employ that optimization.

Yes, I agree with you about the first case.


2. Entry into user signal handler saves into the user space sigframe.
Here we use XSAVE, and so the hardware will write zeros for XINUSE=0,
and for AVX512, we save neither time nor space.

My understanding is that for application compatibility, we can *not* compact
the destination buffer that user-space sees.  This is because existing code
may have adopted fixed size offsets.  (which is unfortunate).



And so, for AVX512, we both reserve the space, and we write zeros
for clean AVX512 state.
With XSAVE, I think this is true if we assume the AVX512 bits in EDX:EAX
are set to 1, which means XSAVE will write zeros when XINUSE=0. Is this
the same as your assumption?

For AMX, we must still reserve the space, but we are not going to write zeros
for clean state.  We do this in software by checking XINUSE=0, and clearing
the xstate_bv for the XSAVE.  As a result, for XINUSE=0, we can skip
writing the zeros, even though we can't compress the space.

So my understanding is that clearing xstate_bv will not prevent saving
zeros; only leaving the bit clear in the EDX:EAX mask will, given the
following logic. Not sure if this is what you mean. :)

RFBM ← XCR0 AND EDX:EAX; /* bitwise logical AND */
OLD_BV ← XSTATE_BV field from XSAVE header;
...
FOR i ← 2 TO 62
IF RFBM[i] = 1
THEN save XSAVE state component i at offset n from base of XSAVE area;
FI;
ENDFOR;

XSTATE_BV field in XSAVE header ← (OLD_BV AND NOT RFBM) OR (XINUSE AND RFBM);
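
The header update in the pseudocode above can be worked through with plain bit arithmetic. The following is only a sketch of the two formulas quoted from the SDM; the function names are made up for illustration, and nothing here executes XSAVE.

```c
#include <stdint.h>

/* Requested-feature bitmap: XCR0 masked by the EDX:EAX instruction mask. */
static uint64_t xsave_rfbm(uint64_t xcr0, uint64_t edx_eax)
{
	return xcr0 & edx_eax;
}

/*
 * XSTATE_BV after XSAVE: bits outside RFBM keep their old value,
 * bits inside RFBM are replaced by the corresponding XINUSE bits.
 */
static uint64_t xsave_new_xstate_bv(uint64_t old_bv, uint64_t xinuse,
				    uint64_t rfbm)
{
	return (old_bv & ~rfbm) | (xinuse & rfbm);
}
```

For a component that is requested in RFBM but has XINUSE=0, the resulting XSTATE_BV bit is 0 regardless of its old value, while the pseudocode's save loop is still keyed on RFBM alone. A component left out of the EDX:EAX mask is neither written nor changed in XSTATE_BV.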



3. user space always uses fully uncompacted XSAVE buffers.


The reason I'm interested in XINUSE denotation is that it might be helpful
for the XFD MSRs context switch cost during vmexit and vmenter.

As the guest OS may be using XFD, the VMM can not use it for itself.
Rather, the VMM must context switch it when it switches between guests.
(or not expose it to guests at all)


My understanding is that KVM cannot assume whether userspace QEMU uses XFD
or not, so KVM needs to context switch XFD between vcpu threads on
vmexit/vmenter.

That's why I am thinking about detecting XINUSE on vmexit; otherwise, a
wrongly armed IA32_XFD will affect XSAVES/XRSTORS and cause guest AMX
state to be lost.

Thanks,
Jing


cheers,
-Len


cheers,
Len Brown, Intel Open Source Technology Center




[PATCH] usb: typec: Fix a typo

2021-03-24 Thread Bhaskar Chowdhury


s/Acknowlege/Acknowledge/

Signed-off-by: Bhaskar Chowdhury 
---
 drivers/usb/typec/ucsi/ucsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index 244270755ae6..282c3c825c13 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -63,7 +63,7 @@ static int ucsi_read_error(struct ucsi *ucsi)
u16 error;
int ret;

-   /* Acknowlege the command that failed */
+   /* Acknowledge the command that failed */
ret = ucsi_acknowledge_command(ucsi);
if (ret)
return ret;
--
2.30.1



Re: [PATCH v1] usb: dwc3: core: Add shutdown callback for dwc3

2021-03-24 Thread Sandeep Maheswaram



On 3/24/2021 9:01 AM, Stephen Boyd wrote:

Quoting Sandeep Maheswaram (2021-03-23 12:27:32)

This patch adds a shutdown callback to USB DWC core driver to ensure that
it is properly shutdown in reboot/shutdown path. This is required
where SMMU address translation is enabled like on SC7180
SoC and few others. If the hardware is still accessing memory after
SMMU translation is disabled as part of SMMU shutdown callback in
system reboot or shutdown path, then IOVAs(I/O virtual address)
which it was using will go on the bus as the physical addresses which
might result in unknown crashes (NoC/interconnect errors).

Previously this was added in the dwc3 qcom glue driver.
https://patchwork.kernel.org/project/linux-arm-msm/list/?series=382449
However, a kernel panic was observed because the glue driver shutdown is
called after the iommu shutdown. As we are adding the iommu nodes in the
dwc core node in the device tree, adding the shutdown callback in the core
driver seems correct.

Signed-off-by: Sandeep Maheswaram 
---
  drivers/usb/dwc3/core.c | 26 +++---
  1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 94fdbe5..777b2b5 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -1634,11 +1634,9 @@ static int dwc3_probe(struct platform_device *pdev)
 return ret;
  }
  
-static int dwc3_remove(struct platform_device *pdev)

+static void __dwc3_teardown(struct dwc3 *dwc)
  {
-   struct dwc3 *dwc = platform_get_drvdata(pdev);
-
-   pm_runtime_get_sync(&pdev->dev);
+   pm_runtime_get_sync(dwc->dev);
  
 dwc3_debugfs_exit(dwc);

 dwc3_core_exit_mode(dwc);
@@ -1646,19 +1644,32 @@ static int dwc3_remove(struct platform_device *pdev)
 dwc3_core_exit(dwc);
 dwc3_ulpi_exit(dwc);
  
-   pm_runtime_disable(&pdev->dev);
-   pm_runtime_put_noidle(&pdev->dev);
-   pm_runtime_set_suspended(&pdev->dev);
+   pm_runtime_disable(dwc->dev);
+   pm_runtime_put_noidle(dwc->dev);
+   pm_runtime_set_suspended(dwc->dev);
  
 dwc3_free_event_buffers(dwc);

 dwc3_free_scratch_buffers(dwc);
  
 if (dwc->usb_psy)

 power_supply_put(dwc->usb_psy);
+}
+
+static int dwc3_remove(struct platform_device *pdev)
+{
+   struct dwc3 *dwc = platform_get_drvdata(pdev);
+
+   __dwc3_teardown(dwc);
  
 return 0;

  }
  
+static void dwc3_shutdown(struct platform_device *pdev)

+{
+   struct dwc3 *dwc = platform_get_drvdata(pdev);
+
+   __dwc3_teardown(dwc);
+}

Can't this be

static void dwc3_shutdown(struct platform_device *pdev)
{
   dwc3_remove(pdev);
}

and then there's nothing else to change? Basically ignore return value
of dwc3_remove() to make shutdown and remove harmonize. I also wonder if
this is more common than we think and a struct driver flag could be set
to say "call remove for shutdown" and then have driver core swizzle on
that and save some duplicate functions.
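
A small userspace sketch of the "shutdown just calls remove" shape suggested above. The struct definitions are stand-ins that only mimic the two callbacks involved, and the demo_* names are hypothetical; this is not the dwc3 code.

```c
/* Stand-ins for the kernel types; only the callback shape matters here. */
struct platform_device {
	int torn_down;		/* counts how often teardown ran */
};

struct platform_driver {
	int (*remove)(struct platform_device *pdev);
	void (*shutdown)(struct platform_device *pdev);
};

static int demo_remove(struct platform_device *pdev)
{
	pdev->torn_down++;	/* stands in for the real teardown sequence */
	return 0;
}

/* shutdown reuses remove and deliberately drops its return value. */
static void demo_shutdown(struct platform_device *pdev)
{
	(void)demo_remove(pdev);
}

static struct platform_driver demo_driver = {
	.remove   = demo_remove,
	.shutdown = demo_shutdown,
};
```

With this shape there is one teardown path to maintain, at the cost of running the full remove sequence during shutdown.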


I was referring to a similar patch:
https://patchwork.kernel.org/project/linux-usb/patch/20190817174140.6394-1-vice...@gmail.com/



  #ifdef CONFIG_PM
  static int dwc3_core_init_for_resume(struct dwc3 *dwc)
  {
@@ -1976,6 +1987,7 @@ MODULE_DEVICE_TABLE(acpi, dwc3_acpi_match);
  static struct platform_driver dwc3_driver = {
 .probe  = dwc3_probe,
 .remove = dwc3_remove,
+   .shutdown   = dwc3_shutdown,


Re: Re: [PATCH] fuse: Fix a potential double free in virtio_fs_get_tree

2021-03-24 Thread lyl2019



> -----Original Message-----
> From: "Vivek Goyal" 
> Sent: 2021-03-24 01:10:03 (Wednesday)
> To: "Lv Yunlong" 
> Cc: stefa...@redhat.com, mik...@szeredi.hu, 
> virtualizat...@lists.linux-foundation.org, linux-fsde...@vger.kernel.org, 
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] fuse: Fix a potential double free in virtio_fs_get_tree
> 
> On Mon, Mar 22, 2021 at 10:18:31PM -0700, Lv Yunlong wrote:
> > In virtio_fs_get_tree, fm is allocated by kzalloc() and
> > assigned to fsc->s_fs_info by fsc->s_fs_info=fm statement.
> > If the kzalloc() failed, it will goto err directly, so that
> > fsc->s_fs_info must be non-NULL and fm will be freed.
> 
> sget_fc() will either consume fsc->s_fs_info in case a new super
> block is allocated and set fsc->s_fs_info. In that case we don't
> free fc or fm.
> 
> Or, sget_fc() will return with fsc->s_fs_info set in case we already
> found a super block. In that case we need to free fc and fm.
> 
> In case of error from sget_fc(), fc/fm need to be freed first and
> then error needs to be returned to caller.
> 
> if (IS_ERR(sb))
> return PTR_ERR(sb);
> 
> 
> If we allocated a new super block in sget_fc(), then next step is
> to initialize it.
> 
> if (!sb->s_root) {
> err = virtio_fs_fill_super(sb, fsc);
>   }
> 
> If we run into errors here, then fc/fm need to be freed.
> 
> So current code looks fine to me.
> 
> Vivek
> 
> > 
> > But later fm is freed again when virtio_fs_fill_super() failed.
> > I think the statement if (fsc->s_fs_info) {kfree(fm);} is
> > misplaced.
> > 
> > My patch puts this statement in the correct place to avoid
> > double free.
> > 
> > Signed-off-by: Lv Yunlong 
> > ---
> >  fs/fuse/virtio_fs.c | 10 ++
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> > 
> > diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> > index 8868ac31a3c0..727cf436828f 100644
> > --- a/fs/fuse/virtio_fs.c
> > +++ b/fs/fuse/virtio_fs.c
> > @@ -1437,10 +1437,7 @@ static int virtio_fs_get_tree(struct fs_context *fsc)
> >  
> > fsc->s_fs_info = fm;
> > sb = sget_fc(fsc, virtio_fs_test_super, set_anon_super_fc);
> > -   if (fsc->s_fs_info) {
> > -   fuse_conn_put(fc);
> > -   kfree(fm);
> > -   }
> > +
> > if (IS_ERR(sb))
> > return PTR_ERR(sb);
> >  
> > @@ -1457,6 +1454,11 @@ static int virtio_fs_get_tree(struct fs_context *fsc)
> > sb->s_flags |= SB_ACTIVE;
> > }
> >  
> > +   if (fsc->s_fs_info) {
> > +   fuse_conn_put(fc);
> > +   kfree(fm);
> > +   }
> > +
> > WARN_ON(fsc->root);
> > fsc->root = dget(sb->s_root);
> > return 0;
> > -- 
> > 2.25.1
> > 
> > 
> 


Ok, thanks.
It should be a false positive.
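
The ownership rule Vivek describes can be modelled in a few lines of userspace C. This is a sketch only: mock_sget() imitates the described behaviour (a new superblock takes over the private data and clears the context pointer; a reused one leaves it alone), and all names are hypothetical stand-ins rather than the fuse or VFS code.

```c
#include <stdlib.h>

struct ctx { void *s_fs_info; };	/* stand-in for struct fs_context */
struct sb  { void *s_fs_info; };	/* stand-in for struct super_block */

/*
 * Mock of the described behaviour: a newly set up superblock consumes
 * the context's private data and clears the pointer; when an existing
 * superblock is reused, the pointer is left set.
 */
static void mock_sget(struct ctx *c, struct sb *sb, int reuse_existing)
{
	if (reuse_existing)
		return;
	sb->s_fs_info = c->s_fs_info;
	c->s_fs_info = NULL;
}

/* Returns 1 if the caller freed its data, 0 if the sb owns it, -1 on OOM. */
static int get_tree_demo(struct sb *sb, int reuse_existing)
{
	struct ctx c;
	void *fm = malloc(16);

	if (!fm)
		return -1;
	c.s_fs_info = fm;
	mock_sget(&c, sb, reuse_existing);

	/* Pointer still set: nobody took ownership, so free exactly once here. */
	if (c.s_fs_info) {
		free(fm);
		return 1;
	}
	return 0;
}
```

Since the pointer check happens right after the lookup, the data is either freed by the caller or owned by the superblock, never both, which is why no double free arises on the later error path.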

[PATCH] drivers: gpu: drm: Remove repeated declaration

2021-03-24 Thread Wan Jiabing
struct drm_i915_private, struct intel_crtc_state and
struct intel_crtc have been declared before. 
Remove the duplicate.

Signed-off-by: Wan Jiabing 
---
 drivers/gpu/drm/i915/display/intel_crt.h | 1 -
 drivers/gpu/drm/i915/display/intel_display.h | 1 -
 drivers/gpu/drm/i915/display/intel_vrr.h | 1 -
 3 files changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_crt.h b/drivers/gpu/drm/i915/display/intel_crt.h
index 1b3fba359efc..6c5c44600cbd 100644
--- a/drivers/gpu/drm/i915/display/intel_crt.h
+++ b/drivers/gpu/drm/i915/display/intel_crt.h
@@ -11,7 +11,6 @@
 enum pipe;
 struct drm_encoder;
 struct drm_i915_private;
-struct drm_i915_private;
 
 bool intel_crt_port_enabled(struct drm_i915_private *dev_priv,
i915_reg_t adpa_reg, enum pipe *pipe);
diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
index 76f8a805b0a3..29cb6d84ed70 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -48,7 +48,6 @@ struct i915_ggtt_view;
 struct intel_atomic_state;
 struct intel_crtc;
 struct intel_crtc_state;
-struct intel_crtc_state;
 struct intel_digital_port;
 struct intel_dp;
 struct intel_encoder;
diff --git a/drivers/gpu/drm/i915/display/intel_vrr.h b/drivers/gpu/drm/i915/display/intel_vrr.h
index fac01bf4ab50..96f9c9c27ab9 100644
--- a/drivers/gpu/drm/i915/display/intel_vrr.h
+++ b/drivers/gpu/drm/i915/display/intel_vrr.h
@@ -15,7 +15,6 @@ struct intel_crtc;
 struct intel_crtc_state;
 struct intel_dp;
 struct intel_encoder;
-struct intel_crtc;
 
 bool intel_vrr_is_capable(struct drm_connector *connector);
 void intel_vrr_check_modeset(struct intel_atomic_state *state);
-- 
2.25.1



Re: [PATCH] tee: optee: fix build error caused by recent optee tracepoints feature

2021-03-24 Thread Guenter Roeck
On Thu, Mar 25, 2021 at 12:06:01PM +0800, Jisheng Zhang wrote:
> If the kernel is built without "O=dir", the error below is seen:
> 
> In file included from drivers/tee/optee/optee_trace.h:67,
>  from drivers/tee/optee/call.c:18:
> ./include/trace/define_trace.h:95:42: fatal error: ./optee_trace.h: No such file or directory
>95 | #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
>   |  ^
> compilation terminated.
> 
> Fix it by adding the line below to the Makefile:
> CFLAGS_call.o := -I$(src)
> 
> Tested with and without "O=dir", both can build successfully.
> 
> Reported-by: Guenter Roeck 
> Suggested-by: Steven Rostedt 
> Signed-off-by: Jisheng Zhang 

Tested-by: Guenter Roeck 

> ---
>  drivers/tee/optee/Makefile | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/tee/optee/Makefile b/drivers/tee/optee/Makefile
> index 56263ae3b1d7..3aa33ea9e6a6 100644
> --- a/drivers/tee/optee/Makefile
> +++ b/drivers/tee/optee/Makefile
> @@ -6,3 +6,6 @@ optee-objs += rpc.o
>  optee-objs += supp.o
>  optee-objs += shm_pool.o
>  optee-objs += device.o
> +
> +# for tracing framework to find optee_trace.h
> +CFLAGS_call.o := -I$(src)
> -- 
> 2.31.0
> 


[PATCH] ARM: imx: Fix a typo

2021-03-24 Thread Bhaskar Chowdhury


s/confgiured/configured/

Signed-off-by: Bhaskar Chowdhury 
---
 arch/arm/mach-imx/pm-imx5.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-imx/pm-imx5.c b/arch/arm/mach-imx/pm-imx5.c
index e9962b48e30c..2e3af2bc7758 100644
--- a/arch/arm/mach-imx/pm-imx5.c
+++ b/arch/arm/mach-imx/pm-imx5.c
@@ -45,7 +45,7 @@
  * This is also the lowest power state possible without affecting
  * non-cpu parts of the system.  For these reasons, imx5 should default
  * to always using this state for cpu idling.  The PM_SUSPEND_STANDBY also
- * uses this state and needs to take no action when registers remain confgiured
+ * uses this state and needs to take no action when registers remain configured
  * for this state.
  */
 #define IMX5_DEFAULT_CPU_IDLE_STATE WAIT_UNCLOCKED_POWER_OFF
--
2.30.1



RE: [PATCH v9 1/7] smccc: Add HVC call variant with result registers other than 0 thru 3

2021-03-24 Thread Michael Kelley
From: Mark Rutland  Sent: Wednesday, March 24, 2021 9:55 AM
> 
> Hi Michael,
> 
> On Mon, Mar 08, 2021 at 11:57:13AM -0800, Michael Kelley wrote:
> > Hypercalls to Hyper-V on ARM64 may return results in registers other
> > than X0 thru X3, as permitted by the SMCCC spec version 1.2 and later.
> > Accommodate this by adding a variant of arm_smccc_1_1_hvc that allows
> > the caller to specify which 3 registers are returned in addition to X0.
> >
> > Signed-off-by: Michael Kelley 
> > ---
> > There are several ways to support returning results from registers
> > other than X0 thru X3, and Hyper-V usage should be compatible with
> > whatever the maintainers prefer.  What's implemented in this patch
> > may be the most flexible, but it has the downside of not being a
> > true function interface in that args 0 thru 2 must be fixed strings,
> > and not general "C" expressions.
> 
> For the benefit of others here, SMCCCv1.2 allows:
> 
> * SMC64/HVC64 to use all of x1-x17 for both parameters and return values
> * SMC32/HVC32 to use all of r1-r7 for both parameters and return values
> 
> The rationale for this was to make it possible to pass a large number of
> arguments in one call without the hypervisor/firmware needing to access
> the memory of the caller.
> 
> My preference would be to add arm_smccc_1_2_{hvc,smc}() assembly
> functions which read all the permitted argument registers from a struct,
> and write all the permitted result registers to a struct, leaving it to
> callers to set those up and decompose them.
> 
> That way we only have to write one implementation that all callers can
> use, which'll be far easier to maintain. I suspect that in general the
> cost of temporarily bouncing the values through memory will be dominated
> by whatever the hypervisor/firmware is going to do, and if it's not we
> can optimize that away in future.
> 

Thanks for the feedback, and I'm working on implementing this approach.
But I've hit a snag in that gcc limits the "asm" statement to 30 arguments,
which gives us 15 registers as parameters and 15 registers as return
values, instead of the 18 each allowed by SMCCC v1.2.  I will continue
with the 15 register limit for now, unless someone knows a way to exceed
that.  The alternative would be to go to pure assembly language.

I'll post a standalone RFC patch when I have something that works.  My
C pre-processor wizardry is limited, so others will probably know some
tricks that can improve on my first cut.

Michael
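
A sketch of the struct-in, struct-out calling shape Mark describes, written as plain userspace C. The struct layout, the demo_* names and the fake conduit are all assumptions for illustration; a real implementation would move the registers in assembly and issue HVC or SMC.

```c
#include <stdint.h>

#define DEMO_SMCCC_NREGS 18	/* x0 through x17 */

/* Hypothetical layout; the real interface was still under discussion. */
struct demo_smccc_regs {
	uint64_t x[DEMO_SMCCC_NREGS];
};

typedef void (*demo_conduit_fn)(const struct demo_smccc_regs *args,
				struct demo_smccc_regs *res);

/* Fake conduit for experiments: returns every argument register plus one. */
static void demo_fake_hvc(const struct demo_smccc_regs *args,
			  struct demo_smccc_regs *res)
{
	for (int i = 0; i < DEMO_SMCCC_NREGS; i++)
		res->x[i] = args->x[i] + 1;
}

/* Caller side: fill the argument struct, make the call, pick a result. */
static uint64_t demo_call(demo_conduit_fn conduit, uint64_t func_id,
			  uint64_t arg1, int result_reg)
{
	struct demo_smccc_regs args = { .x = { func_id, arg1 } };
	struct demo_smccc_regs res = { .x = { 0 } };

	conduit(&args, &res);
	return res.x[result_reg];
}
```

The caller decides which result registers it cares about after the call, so a hypervisor that returns values outside X0 thru X3 needs no special variant.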


Re: [PATCH 1/2] extcon: extcon-gpio: Log error if work-queue init fails

2021-03-24 Thread Vaittinen, Matti

On Thu, 2021-03-25 at 09:49 +0900, Chanwoo Choi wrote:
> On 3/24/21 6:51 PM, Vaittinen, Matti wrote:
> > Hello Hans, Chanwoo, Greg,
> > 
> > On Wed, 2021-03-24 at 10:25 +0100, Hans de Goede wrote:
> > > Hi,
> > > 
> > > On 3/24/21 10:21 AM, Matti Vaittinen wrote:
> > > > Add error print for probe failure when resource managed work-queue
> > > > initialization fails.
> > > > 
> > > > Signed-off-by: Matti Vaittinen <matti.vaitti...@fi.rohmeurope.com>
> > > > Suggested-by: Chanwoo Choi 
> > > > ---
> > > >  drivers/extcon/extcon-gpio.c | 4 +++-
> > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/extcon/extcon-gpio.c b/drivers/extcon/extcon-gpio.c
> > > > index 4105df74f2b0..8ea2cda8f7f3 100644
> > > > --- a/drivers/extcon/extcon-gpio.c
> > > > +++ b/drivers/extcon/extcon-gpio.c
> > > > @@ -114,8 +114,10 @@ static int gpio_extcon_probe(struct platform_device *pdev)
> > > > return ret;
> > > >  
> > > > ret = devm_delayed_work_autocancel(dev, &data->work, gpio_extcon_work);
> > > > -   if (ret)
> > > > +   if (ret) {
> > > > +   dev_err(dev, "Failed to initialize delayed_work");
> > > > return ret;
> > > > +   }
> > > 
> > > The only ret which we can have here is -ENOMEM and as a rule we don't
> > > log errors for those, because the kernel memory-management code
> > > already complains loudly when this happens.
> > 
> > I know. This is why I originally omitted the print. Besides, if the
> > memory is so low that devres adding fails - then we probably have
> > plenty of other complaints as well... But as Chanwoo maintains the
> > driver and wanted to have the print - I do not have objections to that
> > either. Maybe someone some-day adds another error path to wq
> > initialization in which case seeing it failed could make sense.
> > 
> > > So IMHO this patch should be dropped.
> > Fine for me - as well as keeping it. I have no strong opinion on this.
> 
> If -ENOMEM is handled the same way here, there is no need to add the log,
> as Hans said.
> Thanks, Hans.

So be it :)
Greg, can you just apply the patch 2/2 and drop this one? (There should
be no dependency between these) or do you want me to resend 2/2 alone?

> > Br,
> > Matti
> > 
> 
> 



[PATCH] drivers: gpu: drm: Remove duplicate declaration

2021-03-24 Thread Wan Jiabing
struct dss_device has been declared at 51st line. 
Remove the duplicate.

Signed-off-by: Wan Jiabing 
---
 drivers/gpu/drm/omapdrm/dss/omapdss.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/omapdrm/dss/omapdss.h b/drivers/gpu/drm/omapdrm/dss/omapdss.h
index a40abeafd2e9..2658aadee09a 100644
--- a/drivers/gpu/drm/omapdrm/dss/omapdss.h
+++ b/drivers/gpu/drm/omapdrm/dss/omapdss.h
@@ -52,7 +52,6 @@ struct dss_device;
 struct omap_drm_private;
 struct omap_dss_device;
 struct dispc_device;
-struct dss_device;
 struct dss_lcd_mgr_config;
 struct snd_aes_iec958;
 struct snd_cea_861_aud_if;
-- 
2.25.1



Re: [PATCH 01/13] kconfig: split randconfig setup code into set_randconfig_seed()

2021-03-24 Thread Masahiro Yamada
On Sun, Mar 14, 2021 at 4:48 AM Masahiro Yamada  wrote:
>
> This code is too big to be placed in the switch statement.
>
> Move the code into a new helper function. I slightly refactor the code
> without changing the behavior.
>
> Signed-off-by: Masahiro Yamada 
> ---

All applied to linux-kbuild/kconfig.




>  scripts/kconfig/conf.c | 54 --
>  1 file changed, 31 insertions(+), 23 deletions(-)
>
> diff --git a/scripts/kconfig/conf.c b/scripts/kconfig/conf.c
> index 957d2a0832f7..063c9e7a34c1 100644
> --- a/scripts/kconfig/conf.c
> +++ b/scripts/kconfig/conf.c
> @@ -82,6 +82,36 @@ static void xfgets(char *str, int size, FILE *in)
> printf("%s", str);
>  }
>
> +static void set_randconfig_seed(void)
> +{
> +   unsigned int seed;
> +   char *env;
> +   bool seed_set = false;
> +
> +   env = getenv("KCONFIG_SEED");
> +   if (env && *env) {
> +   char *endp;
> +
> +   seed = strtol(env, &endp, 0);
> +   if (*endp == '\0')
> +   seed_set = true;
> +   }
> +
> +   if (!seed_set) {
> +   struct timeval now;
> +
> +   /*
> +* Use microseconds derived seed, compensate for systems where it may
> +* be zero.
> +*/
> +   gettimeofday(&now, NULL);
> +   seed = (now.tv_sec + 1) * (now.tv_usec + 1);
> +   }
> +
> +   printf("KCONFIG_SEED=0x%X\n", seed);
> +   srand(seed);
> +}
> +
>  static int conf_askvalue(struct symbol *sym, const char *def)
>  {
> if (!sym_has_value(sym))
> @@ -515,30 +545,8 @@ int main(int ac, char **av)
> defconfig_file = optarg;
> break;
> case randconfig:
> -   {
> -   struct timeval now;
> -   unsigned int seed;
> -   char *seed_env;
> -
> -   /*
> -* Use microseconds derived seed,
> -* compensate for systems where it may be zero
> -*/
> -   gettimeofday(&now, NULL);
> -   seed = (unsigned int)((now.tv_sec + 1) * (now.tv_usec + 1));
> -
> -   seed_env = getenv("KCONFIG_SEED");
> -   if( seed_env && *seed_env ) {
> -   char *endp;
> -   int tmp = (int)strtol(seed_env, &endp, 0);
> -   if (*endp == '\0') {
> -   seed = tmp;
> -   }
> -   }
> -   fprintf( stderr, "KCONFIG_SEED=0x%X\n", seed );
> -   srand(seed);
> +   set_randconfig_seed();
> break;
> -   }
> case oldaskconfig:
> case oldconfig:
> case allnoconfig:
> --
> 2.27.0
>


-- 
Best Regards
Masahiro Yamada
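
For reference, the seed-parsing step of the patch quoted above can be tried in isolation. This is a sketch, not the kconfig code: parse_seed() is a hypothetical helper that mirrors the strtol() plus end-pointer check and leaves out the time-based fallback.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a seed string the way the patch parses
 * KCONFIG_SEED. Base 0 lets strtol() accept decimal, 0x-prefixed hex
 * and 0-prefixed octal input.
 */
static bool parse_seed(const char *s, unsigned int *seed)
{
	char *endp;
	long val;

	if (!s || !*s)
		return false;

	val = strtol(s, &endp, 0);
	if (*endp != '\0')	/* trailing characters: not a clean number */
		return false;

	*seed = (unsigned int)val;
	return true;
}
```

A caller would fall back to a time-derived seed whenever this returns false, which matches the seed_set flag in the helper introduced by the patch.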


Re: [PATCH] kconfig: use true and false for bool variable

2021-03-24 Thread Masahiro Yamada
On Mon, Mar 15, 2021 at 3:55 PM Yang Li  wrote:
>
> fixed the following coccicheck:
> ./scripts/kconfig/confdata.c:36:9-10: WARNING: return of 0/1 in function
> 'is_dir' with return type bool
>
> Reported-by: Abaci Robot 
> Signed-off-by: Yang Li 
> ---

Applied. Thanks.




>  scripts/kconfig/confdata.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/scripts/kconfig/confdata.c b/scripts/kconfig/confdata.c
> index 2568dbe..b65b8c3 100644
> --- a/scripts/kconfig/confdata.c
> +++ b/scripts/kconfig/confdata.c
> @@ -33,7 +33,7 @@ static bool is_dir(const char *path)
> struct stat st;
>
> if (stat(path, &st))
> -   return 0;
> +   return false;
>
> return S_ISDIR(st.st_mode);
>  }
> --
> 1.8.3.1
>


-- 
Best Regards
Masahiro Yamada


Re: [PATCH] cpufreq: dt: check the error returned by dev_pm_opp_of_cpumask_add_table

2021-03-24 Thread Viresh Kumar
On 25-03-21, 12:31, quanyang.w...@windriver.com wrote:
> From: Quanyang Wang 
> 
> The function dev_pm_opp_of_cpumask_add_table may return zero or an
> error. When it returns an error, this means that no OPP table is
> added for the cpumask because _dev_pm_opp_cpumask_remove_table is
> called to free all OPPs associated with the cpu devices in the error
> label "remove_table". So continuing to run the next function
> dev_pm_opp_get_opp_count is meaningless since it always return the
> count value as 0.
> 
> There is another reason why we should check the error returned by
> dev_pm_opp_of_cpumask_add_table is that it may return -EPROBE_DEFER
> which comes from clk_get(dev, NULL) in _update_opp_table_clk. When
> the clk for cpu device isn't ready, dt_cpufreq_probe should be deferred
> and wait to be called again. But if we ignore the return error of
> dev_pm_opp_of_cpumask_add_table, dt_cpufreq_probe will return -ENODEV
> because dev_pm_opp_get_opp_count returns the count value as 0,
> the cpufreq-dt driver will fail with the error log as below:
> 
> [0.724069] cpu cpu0: OPP table can't be empty
> 
> Signed-off-by: Quanyang Wang 
> ---
>  drivers/cpufreq/cpufreq-dt.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
> index b1e1bdc63b01..f24359f47b1a 100644
> --- a/drivers/cpufreq/cpufreq-dt.c
> +++ b/drivers/cpufreq/cpufreq-dt.c
> @@ -255,10 +255,16 @@ static int dt_cpufreq_early_init(struct device *dev, int cpu)
>* before updating priv->cpus. Otherwise, we will end up creating
>* duplicate OPPs for the CPUs.
>*
> -  * OPPs might be populated at runtime, don't check for error here.

As the comment (which you removed) clearly says, the OPPs may be added
at runtime, so don't check for an error here.

When we say runtime, we mean someone may have called dev_pm_opp_add()
for the devices.

> +  * We need check the return value here, if it is non-zero, there is
> +  * need to go on.
>*/
> - if (!dev_pm_opp_of_cpumask_add_table(priv->cpus))
> - priv->have_static_opps = true;
> + ret = dev_pm_opp_of_cpumask_add_table(priv->cpus);
> + if (ret) {
> + dev_err(cpu_dev, "Failed to add OPP table for CPUs\n");
> + goto out;
> + }
> +
> + priv->have_static_opps = true;
>  
>   /*
>* The OPP table must be initialized, statically or dynamically, by this

-- 
viresh
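
The two positions in this thread can be put side by side in a small userspace model. It is a sketch under stated assumptions, not driver code: struct fake_priv and early_init_model() are hypothetical stand-ins, and EPROBE_DEFER is defined locally because it is a kernel-internal errno that userspace headers do not provide.

```c
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal errno value, defined here only for the model. */
#define EPROBE_DEFER 517

/* Hypothetical stand-in for the per-CPU state the driver keeps. */
struct fake_priv {
	bool have_static_opps;
	int dynamic_opp_count;	/* OPPs registered at runtime by other code */
};

/*
 * static_ret models the result of parsing the static (device tree)
 * table: 0 on success or a negative errno. static_count is the number
 * of OPPs that parse produced.
 * Returns the total OPP count, or -ENODEV when the table is empty.
 */
int early_init_model(struct fake_priv *priv, int static_ret, int static_count)
{
	int count;

	/* A failed static parse is not fatal here: runtime OPPs may exist. */
	if (!static_ret)
		priv->have_static_opps = true;

	count = priv->dynamic_opp_count +
		(priv->have_static_opps ? static_count : 0);
	if (count <= 0)
		return -ENODEV;	/* corresponds to "OPP table can't be empty" */
	return count;
}
```

With dynamic_opp_count set to 3 and static_ret set to -ENODEV the model still returns 3, which is the case the original comment protects. With no dynamic OPPs and static_ret set to -EPROBE_DEFER it returns -ENODEV, which is the lost-deferral symptom the patch description reports.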


[PATCH V4] kbuild: Add rule to build .dt.yaml files for overlays

2021-03-24 Thread Viresh Kumar
The overlay source files are named with .dtso extension now, add a new
rule to generate .dt.yaml for them.

Reviewed-by: Geert Uytterhoeven 
Tested-by: Geert Uytterhoeven 
Signed-off-by: Viresh Kumar 
---
V4:
- Rebase over Frank's cleanup patch:

  https://lore.kernel.org/lkml/20210324223713.1334666-1-frowand.l...@gmail.com/

- Drop changes to drivers/of/unittest-data/Makefile.
- Drop modifications to the rule that builds .dtbo files (as it is
  already updated by Frank).

 scripts/Makefile.lib | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 814b430b407e..a682869d8f4b 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -376,6 +376,9 @@ endef
 $(obj)/%.dt.yaml: $(src)/%.dts $(DTC) $(DT_TMP_SCHEMA) FORCE
$(call if_changed_rule,dtc,yaml)
 
+$(obj)/%.dt.yaml: $(src)/%.dtso $(DTC) $(DT_TMP_SCHEMA) FORCE
+   $(call if_changed_rule,dtc,yaml)
+
 dtc-tmp = $(subst $(comma),_,$(dot-target).dts.tmp)
 
 # Bzip2
-- 
2.25.0.rc1.19.g042ed3e048af



[PATCH] tools: perf: util: Remove duplicate declaration

2021-03-24 Thread Wan Jiabing
struct evlist has been declared at 10th line.
struct comm has been declared at 15th line.
Remove the duplicate.

Signed-off-by: Wan Jiabing 
---
 tools/perf/util/metricgroup.h  | 1 -
 tools/perf/util/thread-stack.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
index ed1b9392e624..026bbf416c48 100644
--- a/tools/perf/util/metricgroup.h
+++ b/tools/perf/util/metricgroup.h
@@ -9,7 +9,6 @@
 
 struct evlist;
 struct evsel;
-struct evlist;
 struct option;
 struct rblist;
 struct pmu_events_map;
diff --git a/tools/perf/util/thread-stack.h b/tools/perf/util/thread-stack.h
index 3bc47a42af8e..b3cd09beb62f 100644
--- a/tools/perf/util/thread-stack.h
+++ b/tools/perf/util/thread-stack.h
@@ -16,7 +16,6 @@ struct comm;
 struct ip_callchain;
 struct symbol;
 struct dso;
-struct comm;
 struct perf_sample;
 struct addr_location;
 struct call_path;
-- 
2.25.1



[PATCH] Bluetooth: L2CAP: Rudimentary typo fixes

2021-03-24 Thread Bhaskar Chowdhury


s/minium/minimum/
s/procdure/procedure/

Signed-off-by: Bhaskar Chowdhury 
---
 net/bluetooth/l2cap_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index 72c2f5226d67..b38e80a0e819 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -1690,7 +1690,7 @@ static void l2cap_le_conn_ready(struct l2cap_conn *conn)
smp_conn_security(hcon, hcon->pending_sec_level);

/* For LE slave connections, make sure the connection interval
-* is in the range of the minium and maximum interval that has
+* is in the range of the minimum and maximum interval that has
 * been configured for this connection. If not, then trigger
 * the connection update procedure.
 */
@@ -7542,7 +7542,7 @@ static void l2cap_data_channel(struct l2cap_conn *conn, u16 cid,
BT_DBG("chan %p, len %d", chan, skb->len);

/* If we receive data on a fixed channel before the info req/rsp
-* procdure is done simply assume that the channel is supported
+* procedure is done simply assume that the channel is supported
 * and mark it as ready.
 */
if (chan->chan_type == L2CAP_CHAN_FIXED)
--
2.30.1



Re: [RFC] mm: activate access-more-than-once page via NUMA balancing

2021-03-24 Thread Huang, Ying
Hi, Mel,

Thanks for the comments!

Mel Gorman  writes:

> On Wed, Mar 24, 2021 at 04:32:09PM +0800, Huang Ying wrote:
>> One idea behind the LRU page reclaiming algorithm is to put the
>> access-once pages in the inactive list and access-more-than-once pages
>> in the active list.  This is true for the file pages that are accessed
>> via syscall (read()/write(), etc.), but not for the pages accessed via
>> the page tables.  We can only activate them via page reclaim scanning
>> now.  This may cause some problems.  For example, even if there are
>> only hot file pages accessed via the page tables in the inactive list,
>> we will enable the cache trim mode incorrectly to scan only the hot
>> file pages instead of cold anon pages.
>> 
>
> I caution against this patch.
>
> It's non-deterministic for a number of reasons. As it requires NUMA
> balancing to be enabled, the pageout behaviour of a system changes when
> NUMA balancing is active. If this led to pages being artificially and
> inappropriately preserved, NUMA balancing could be disabled for the
> wrong reasons.  It only applies to pages that have no target node so
> memory policies affect which pages are activated differently. Similarly,
> NUMA balancing does not scan all VMAs and some pages may never trap a
> NUMA fault as a result. The timing of when an address space gets scanned
> is driven by the locality of pages and so the timing of page activation
> potentially becomes linked to whether pages are local or need to migrate
> (although not right now for this patch as it only affects pages with a
> target nid of NUMA_NO_NODE). In other words, changes in NUMA balancing
> that affect migration potentially affect the aging rate.  Similarly,
> the activate rate of a process with a single thread and multiple threads
> potentially have different activation rates.
>
> Finally, the NUMA balancing scan algorithm is sub-optimal. It potentially
> scans the entire address space even though only a small number of pages
> are scanned. This is particularly problematic when a process has a lot
> of threads because threads are redundantly scanning the same regions. If
> NUMA balancing ever introduced range tracking of faulted pages to limit
> how much scanning it has to do, it would inadvertently cause a change in
> page activation rate.
>
> NUMA balancing is about page locality, it should not get conflated with
> page aging.

I understand your concerns about binding NUMA balancing and page
reclaim together.  The requirements of page locality and page aging are
different, so the policies need to be different.  This is the wrong part
of the patch.

From another point of view, it's still possible to share some underlying
mechanisms (and code) between them.  That is, scanning the page tables
to make pages inaccessible and capturing the page accesses via the page
fault.  Now this page access information is used for page locality.
Do you think it's a good idea to use this information for page aging
too (but with a different policy, as you pointed out)?

From yet another point of view :-), in the current NUMA balancing
implementation, it's assumed that the node private pages can fit in the
accessing node.  But this may not always be true.  Is it a valid
optimization to migrate the hot private pages first?

Best Regards,
Huang, Ying


[PATCH] cpufreq: dt: check the error returned by dev_pm_opp_of_cpumask_add_table

2021-03-24 Thread quanyang . wang
From: Quanyang Wang 

The function dev_pm_opp_of_cpumask_add_table may return zero or an
error. When it returns an error, this means that no OPP table is
added for the cpumask because _dev_pm_opp_cpumask_remove_table is
called to free all OPPs associated with the cpu devices in the error
label "remove_table". So continuing to run the next function
dev_pm_opp_get_opp_count is meaningless since it always return the
count value as 0.

There is another reason why we should check the error returned by
dev_pm_opp_of_cpumask_add_table is that it may return -EPROBE_DEFER
which comes from clk_get(dev, NULL) in _update_opp_table_clk. When
the clk for cpu device isn't ready, dt_cpufreq_probe should be deferred
and wait to be called again. But if we ignore the return error of
dev_pm_opp_of_cpumask_add_table, dt_cpufreq_probe will return -ENODEV
because dev_pm_opp_get_opp_count returns the count value as 0,
the cpufreq-dt driver will fail with the error log as below:

[0.724069] cpu cpu0: OPP table can't be empty

Signed-off-by: Quanyang Wang 
---
 drivers/cpufreq/cpufreq-dt.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index b1e1bdc63b01..f24359f47b1a 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -255,10 +255,16 @@ static int dt_cpufreq_early_init(struct device *dev, int cpu)
 * before updating priv->cpus. Otherwise, we will end up creating
 * duplicate OPPs for the CPUs.
 *
-* OPPs might be populated at runtime, don't check for error here.
+* We need check the return value here, if it is non-zero, there is
+* need to go on.
 */
-   if (!dev_pm_opp_of_cpumask_add_table(priv->cpus))
-   priv->have_static_opps = true;
+   ret = dev_pm_opp_of_cpumask_add_table(priv->cpus);
+   if (ret) {
+   dev_err(cpu_dev, "Failed to add OPP table for CPUs\n");
+   goto out;
+   }
+
+   priv->have_static_opps = true;
 
/*
 * The OPP table must be initialized, statically or dynamically, by this
-- 
2.25.1



Re: linux-next: manual merge of the opp tree with the v4l-dvb tree

2021-03-24 Thread Viresh Kumar
On 24-03-21, 16:49, Stanimir Varbanov wrote:
> Thanks Stephen!
> 
> On 3/23/21 2:27 AM, Stephen Rothwell wrote:
> > Hi all,
> > 
> > Today's linux-next merge of the opp tree got a conflict in:
> > 
> >   drivers/media/platform/qcom/venus/pm_helpers.c
> > 
> > between commit:
> > 
> >   08b1cf474b7f ("media: venus: core, venc, vdec: Fix probe dependency 
> > error")
> > 
> > from the v4l-dvb tree and commit:
> > 
> >   857219ae4043 ("media: venus: Convert to use resource-managed OPP API")
> > 
> > from the opp tree.
> > 
> > I fixed it up (see below) and can carry the fix as necessary. This
> > is now fixed as far as linux-next is concerned, but any non trivial
> > conflicts should be mentioned to your upstream maintainer when your tree
> > is submitted for merging.  You may also want to consider cooperating
> > with the maintainer of the conflicting tree to minimise any particularly
> > complex conflicts.
> > 
> 
> I don't know what is the best solution here.
> 
> Viresh, Can I take the OPP API changes through media-tree to avoid
> conflicts?

I already suggested something similar earlier, and I was expecting
Thierry to respond to that.. Not sure who should pick those patches.

https://lore.kernel.org/lkml/20210318103250.shjyd66pxw2g2nsd@vireshk-i7/

Can you please respond to this series then ?

-- 
viresh


[PATCH] btrfs: fixed rudimentary typos

2021-03-24 Thread Bhaskar Chowdhury


s/contaning/containing/
s/clearning/clearing/

Signed-off-by: Bhaskar Chowdhury 
---
 fs/btrfs/inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7cdf65be3707..e0c08176bc18 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2784,8 +2784,8 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans,
/*
 * If we dropped an inline extent here, we know the range where it is
 * was not marked with the EXTENT_DELALLOC_NEW bit, so we update the
-* number of bytes only for that range contaning the inline extent.
-* The remaining of the range will be processed when clearning the
+* number of bytes only for that range containing the inline extent.
+* The remaining of the range will be processed when clearing the
 * EXTENT_DELALLOC_BIT bit through the ordered extent completion.
 */
if (file_pos == 0 && !IS_ALIGNED(drop_args.bytes_found, sectorsize)) {
--
2.30.1



Re: [PATCH v2 1/1] dmaengine: dw: Make it dependent to HAS_IOMEM

2021-03-24 Thread Viresh Kumar
On 24-03-21, 16:17, Andy Shevchenko wrote:
> Some architectures do not provide devm_*() APIs. Hence make the driver
> dependent on HAS_IOMEM.
> 
> Fixes: dbde5c2934d1 ("dw_dmac: use devm_* functions to simplify code")
> Reported-by: kernel test robot 
> Signed-off-by: Andy Shevchenko 
> ---
> v2: used proper option (Serge)
>  drivers/dma/dw/Kconfig | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/dma/dw/Kconfig b/drivers/dma/dw/Kconfig
> index e5162690de8f..db25f9b7778c 100644
> --- a/drivers/dma/dw/Kconfig
> +++ b/drivers/dma/dw/Kconfig
> @@ -10,6 +10,7 @@ config DW_DMAC_CORE
>  
>  config DW_DMAC
>   tristate "Synopsys DesignWare AHB DMA platform driver"
> + depends on HAS_IOMEM
>   select DW_DMAC_CORE
>   help
> Support the Synopsys DesignWare AHB DMA controller. This
> @@ -18,6 +19,7 @@ config DW_DMAC
>  config DW_DMAC_PCI
>   tristate "Synopsys DesignWare AHB DMA PCI driver"
>   depends on PCI
> + depends on HAS_IOMEM
>   select DW_DMAC_CORE
>   help
> Support the Synopsys DesignWare AHB DMA controller on the

Acked-by: Viresh Kumar 

-- 
viresh


RE: [PATCH v2 05/15] PCI: xilinx: Convert to MSI domains

2021-03-24 Thread Bharat Kumar Gogada
> Subject: Re: [PATCH v2 05/15] PCI: xilinx: Convert to MSI domains
> 
> On Wed, 24 Mar 2021 13:56:16 +,
> Bharat Kumar Gogada  wrote:
> 
> > > Thanks for that. Can you please try the following patch and let me
> > > know if it helps?
> > >
> > > Thanks,
> > >
> > >   M.
> > >
> > > diff --git a/drivers/pci/controller/pcie-xilinx.c
> > > b/drivers/pci/controller/pcie- xilinx.c index
> > > ad9abf405167..14001febf59a 100644
> > > --- a/drivers/pci/controller/pcie-xilinx.c
> > > +++ b/drivers/pci/controller/pcie-xilinx.c
> > > @@ -194,8 +194,18 @@ static struct pci_ops xilinx_pcie_ops = {
> > >
> > >  /* MSI functions */
> > >
> > > +static void xilinx_msi_top_irq_ack(struct irq_data *d) {
> > > + /*
> > > +  * xilinx_pcie_intr_handler() will have performed the Ack.
> > > +  * Eventually, this should be fixed and the Ack be moved in
> > > +  * the respective callbacks for INTx and MSI.
> > > +  */
> > > +}
> > > +
> > >  static struct irq_chip xilinx_msi_top_chip = {
> > >   .name   = "PCIe MSI",
> > > + .irq_ack= xilinx_msi_top_irq_ack,
> > >  };
> > >
> > >  static int xilinx_msi_set_affinity(struct irq_data *d, const struct
> > > cpumask *mask, bool force) @@ -206,7 +216,7 @@ static int
> > > xilinx_msi_set_affinity(struct irq_data *d, const struct cpumask
> > > *mas  static void xilinx_compose_msi_msg(struct irq_data *data, struct
> msi_msg *msg)  {
> > >   struct xilinx_pcie_port *pcie = irq_data_get_irq_chip_data(data);
> > > - phys_addr_t pa = virt_to_phys(pcie);
> > > + phys_addr_t pa = ALIGN_DOWN(virt_to_phys(pcie), SZ_4K);
> > >
> > >   msg->address_lo = lower_32_bits(pa);
> > >   msg->address_hi = upper_32_bits(pa); @@ -468,7 +478,7 @@ static
> > > int xilinx_pcie_init_irq_domain(struct
> > > xilinx_pcie_port *port)
> > >
> > >   /* Setup MSI */
> > >   if (IS_ENABLED(CONFIG_PCI_MSI)) {
> > > - phys_addr_t pa = virt_to_phys(port);
> > > + phys_addr_t pa = ALIGN_DOWN(virt_to_phys(port), SZ_4K);
> > >
> > >   ret = xilinx_allocate_msi_domains(port);
> > >   if (ret)
> > >
> > Thanks Marc.
> > With above patch now everything works fine, tested a Samsung NVMe SSD.
> > tst~# lspci
> > 00:00.0 PCI bridge: Xilinx Corporation Device 0706
> > 01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
> > NVMe SSD Controller 172Xa/172Xb (rev 01)
> 
> Great, thanks for giving it a shot. Can I take this as a Tested-by:
> tag?
> 
Yes. 

Regards,
Bharat
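
The address handling in the patch tested above rounds the physical address down to a 4 KiB boundary before splitting it into the two 32-bit halves of the MSI message. The arithmetic can be checked in userspace with the sketch below; align_down(), lo32() and hi32() are local illustrations written for this note, not the kernel's ALIGN_DOWN(), lower_32_bits() and upper_32_bits() macros.

```c
#include <stdint.h>

#define SZ_4K 0x1000u

/* Round x down to a multiple of a; a must be a power of two. */
static inline uint64_t align_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

/* Low 32 bits of a 64-bit address. */
static inline uint32_t lo32(uint64_t v)
{
	return (uint32_t)v;
}

/* High 32 bits of a 64-bit address. */
static inline uint32_t hi32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}
```

For example, an address of 0x123456789 is rounded down to 0x123456000, which splits into a high word of 0x1 and a low word of 0x23456000. An address that is already 4 KiB aligned is returned unchanged.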


[PATCH] xtensa: Couple of typo fixes

2021-03-24 Thread Bhaskar Chowdhury


s/contans/contains/
s/desination/destination/

Signed-off-by: Bhaskar Chowdhury 
---
 arch/xtensa/kernel/head.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/xtensa/kernel/head.S b/arch/xtensa/kernel/head.S
index e0c1fac0910f..c74fdaacf4cf 100644
--- a/arch/xtensa/kernel/head.S
+++ b/arch/xtensa/kernel/head.S
@@ -212,7 +212,7 @@ ENTRY(_startup)
 *
 * The linker script used to build the Linux kernel image
 * creates a table located at __boot_reloc_table_start
-* that contans the information what data needs to be unpacked.
+* that contains the information what data needs to be unpacked.
 *
 * Uses a2-a7.
 */
@@ -222,7 +222,7 @@ ENTRY(_startup)

 1: beq a2, a3, 3f  # no more entries?
l32ia4, a2, 0   # start destination (in RAM)
-   l32ia5, a2, 4   # end desination (in RAM)
+   l32ia5, a2, 4   # end destination (in RAM)
l32ia6, a2, 8   # start source (in ROM)
addia2, a2, 12  # next entry
beq a4, a5, 1b  # skip, empty entry
--
2.30.1



[PATCH] tee: optee: fix build error caused by recent optee tracepoints feature

2021-03-24 Thread Jisheng Zhang
If the kernel is built without "O=dir", the following error is seen:

In file included from drivers/tee/optee/optee_trace.h:67,
 from drivers/tee/optee/call.c:18:
./include/trace/define_trace.h:95:42: fatal error: ./optee_trace.h: No such file or directory
   95 | #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
  |  ^
compilation terminated.

Fix it by adding the line below to the Makefile:
CFLAGS_call.o := -I$(src)

Tested with and without "O=dir"; both build successfully.

Reported-by: Guenter Roeck 
Suggested-by: Steven Rostedt 
Signed-off-by: Jisheng Zhang 
---
 drivers/tee/optee/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/tee/optee/Makefile b/drivers/tee/optee/Makefile
index 56263ae3b1d7..3aa33ea9e6a6 100644
--- a/drivers/tee/optee/Makefile
+++ b/drivers/tee/optee/Makefile
@@ -6,3 +6,6 @@ optee-objs += rpc.o
 optee-objs += supp.o
 optee-objs += shm_pool.o
 optee-objs += device.o
+
+# for tracing framework to find optee_trace.h
+CFLAGS_call.o := -I$(src)
-- 
2.31.0



Re: [syzbot] WARNING in firmware_fallback_sysfs

2021-03-24 Thread syzbot
syzbot has found a reproducer for the following issue on:

HEAD commit:20f1b5f9 Add linux-next specific files for 20210324
git tree:   linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1506414ed0
kernel config:  https://syzkaller.appspot.com/x/.config?x=31aa577aa2dca78c
dashboard link: https://syzkaller.appspot.com/bug?extid=95f2e2439b97575ec3c0
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=14e50426d0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1388dfe6d0

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+95f2e2439b97575ec...@syzkaller.appspotmail.com

sysfs group 'power' not found for kobject 'ueagle-atm!eagleI.fw'
WARNING: CPU: 1 PID: 36 at fs/sysfs/group.c:279 sysfs_remove_group+0x126/0x170 fs/sysfs/group.c:279
Modules linked in:
CPU: 1 PID: 36 Comm: kworker/1:1 Not tainted 5.12.0-rc4-next-20210324-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events request_firmware_work_func
RIP: 0010:sysfs_remove_group+0x126/0x170 fs/sysfs/group.c:279
Code: 48 89 d9 49 8b 14 24 48 b8 00 00 00 00 00 fc ff df 48 c1 e9 03 80 3c 01 
00 75 37 48 8b 33 48 c7 c7 e0 7d 7c 89 e8 9d cc d9 06 <0f> 0b eb 98 e8 f1 23 c9 
ff e9 01 ff ff ff 48 89 df e8 e4 23 c9 ff
RSP: 0018:c9e6faa0 EFLAGS: 00010282
RAX:  RBX: 89da8900 RCX: 
RDX: 888011e01c80 RSI: 815c3fd5 RDI: f520001cdf46
RBP:  R08:  R09: 
R10: 815bd77e R11:  R12: 8880276ac008
R13: 89da8ea0 R14: 8880133e6878 R15: 8880133e68c0
FS:  () GS:8880b9d0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f2c3971a0c8 CR3: 1cf2a000 CR4: 001506e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 dpm_sysfs_remove+0x97/0xb0 drivers/base/power/sysfs.c:837
 device_del+0x20c/0xd40 drivers/base/core.c:3398
 fw_load_sysfs_fallback drivers/base/firmware_loader/fallback.c:543 [inline]
 fw_load_from_user_helper drivers/base/firmware_loader/fallback.c:581 [inline]
 firmware_fallback_sysfs+0x9ff/0xe20 drivers/base/firmware_loader/fallback.c:657
 _request_firmware+0xa80/0xe80 drivers/base/firmware_loader/main.c:833
 request_firmware_work_func+0xdd/0x230 drivers/base/firmware_loader/main.c:1079
 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294



Re: [PATCH V2] arm64: dts: qcom: sc7280: Add nodes for eMMC and SD card

2021-03-24 Thread Veerabhadrarao Badiganti



On 3/23/2021 9:41 PM, Doug Anderson wrote:

Hi,

On Sat, Mar 20, 2021 at 11:18 AM Shaik Sajida Bhanu
 wrote:

Add nodes for eMMC and SD card on sc7280.

Signed-off-by: Shaik Sajida Bhanu 

---
This change depends on the patch series below:
https://lore.kernel.org/patchwork/project/lkml/list/?series=488871
https://lore.kernel.org/patchwork/project/lkml/list/?series=489530
https://lore.kernel.org/patchwork/project/lkml/list/?series=488429

Changes since V1:
 - Moved SDHC nodes as suggested by Bjorn Andersson.
 - Dropped "pinconf-" prefix as suggested by Bjorn Andersson.
 - Removed extra newlines as suggested by Konrad Dybcio.
 - Changed sd-cd pin to bias-pull-up in sdc2_off as suggested by
   Veerabhadrarao Badiganti.
 - Added bandwidth votes for eMMC and SD card.
---
  arch/arm64/boot/dts/qcom/sc7280-idp.dts |  25 
  arch/arm64/boot/dts/qcom/sc7280.dtsi| 213 
  2 files changed, 238 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc7280-idp.dts 
b/arch/arm64/boot/dts/qcom/sc7280-idp.dts
index 54d2cb3..4105263 100644
--- a/arch/arm64/boot/dts/qcom/sc7280-idp.dts
+++ b/arch/arm64/boot/dts/qcom/sc7280-idp.dts
@@ -8,6 +8,7 @@
  /dts-v1/;

  #include "sc7280.dtsi"
+#include 

  / {
 model = "Qualcomm Technologies, Inc. sc7280 IDP platform";
@@ -242,6 +243,30 @@
 status = "okay";
  };

+_1 {
+   status = "okay";

When I apply your patch I find that your sort order is wrong. "s"
comes before "u" and after "q" in the alphabet so "sdhc_1" and
"sdhc_2" should sort _after_ "qupv3_id_0" and before "uart5".



+   pinctrl-names = "default", "sleep";
+   pinctrl-0 = <_on>;
+   pinctrl-1 = <_off>;
+
+   vmmc-supply = <_l7b_2p9>;
+   vqmmc-supply = <_l19b_1p8>;
+};
+
+_2 {
+   status = "okay";
+
+   pinctrl-names = "default","sleep";
+   pinctrl-0 = <_on>;
+   pinctrl-1 = <_off>;
+
+   vmmc-supply = <_l9c_2p9>;
+   vqmmc-supply = <_l6c_2p9>;
+
+   cd-gpios = < 91 GPIO_ACTIVE_LOW>;

Where is the pinctrl for the card detect?  Oh, I see it's in
"sdc2_on". Probably would be good to break it out since this is
board-specific. See below.



+};
+
  /* PINCTRL - additions to nodes defined in sc7280.dtsi */

  _uart5_default {
diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi 
b/arch/arm64/boot/dts/qcom/sc7280.dtsi
index 8f6b569..69eb064 100644
--- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
@@ -20,6 +20,11 @@

 chosen { };

+   aliases {
+   mmc1 = _1;
+   mmc2 = _2;
+   };
+
 clocks {
 xo_board: xo-board {
 compatible = "fixed-clock";
@@ -305,6 +310,64 @@
 #power-domain-cells = <1>;
 };

+   sdhc_1: sdhci@7c4000 {
+   compatible = "qcom,sdhci-msm-v5";

Please make the compatible:
   compatible = "qcom,sc7280-sdhci", "qcom,sdhci-msm-v5";

...and add to the bindings. It should be a trivial bindings patch so
not too much trouble.

NOTE: even though the "qcom,sc7280-sdhci" should be in the bindings
and here you _shouldn't_ be adding any code for it. Just let the
fallback compatible string ("qcom,sdhci-msm-v5") do its magic. Adding
the sc7280 specific version is more of a "just in case we need it
later" type thing.



+   reg = <0 0x7c4000 0 0x1000>,
+   <0 0x7c5000 0 0x1000>;
+   reg-names = "hc", "cqhci";
+
+   iommus = <_smmu 0xC0 0x0>;
+   interrupts = ,
+   ;
+   interrupt-names = "hc_irq", "pwr_irq";
+
+   clocks = < GCC_SDCC1_APPS_CLK>,
+   < GCC_SDCC1_AHB_CLK>,
+   < RPMH_CXO_CLK>;
+   clock-names = "core", "iface", "xo";

I'm curious: why is the "xo" clock needed here but not for sc7180?
Actually it's needed even for sc7180. We are making use of this clock in
msm_init_cm_dll(). The default PoR value is also the same as the
calculated value for HS200/HS400/SDR104 modes.

But so as not to rely on default register values, we need this entry.




+   interconnects = <_noc MASTER_SDCC_1 0 _virt 
SLAVE_EBI1 0>,
+   <_noc MASTER_APPSS_PROC 0  
SLAVE_SDCC_1 0>;
+   interconnect-names = "sdhc-ddr","cpu-sdhc";
+   power-domains = < SC7280_CX>;
+   operating-points-v2 = <_opp_table>;
+
+   bus-width = <8>;
+   non-removable;

This was actually a problem on sc7180 too, but you probably don't want
"non-removable" in the SoC file. Board files really should be adding
this. Though the SoC might be designed with the idea that this would
be used for a non-removable 

Re: [PATCH 00/36] [Set 4] Rid W=1 warnings in SCSI

2021-03-24 Thread Martin K. Petersen
On Wed, 17 Mar 2021 09:11:54 +, Lee Jones wrote:

> This set is part of a larger effort attempting to clean-up W=1
> kernel builds, which are currently overwhelmingly riddled with
> niggly little warnings.
> 
> Lee Jones (36):
>   scsi: myrb: Demote non-conformant kernel-doc headers and fix others
>   scsi: ipr: Fix incorrect function names in their headers
>   scsi: mvumi: Fix formatting and doc-rot issues
>   scsi: sd_zbc: Place function name into header
>   scsi: pmcraid: Fix a whole host of kernel-doc issues
>   scsi: sd: Fix function name in header
>   scsi: aic94xx: aic94xx_dump: Correct misspelling of function
> asd_dump_seq_state()
>   scsi: be2iscsi: be_main: Ensure function follows directly after its
> header
>   scsi: dc395x: Fix some function param descriptions
>   scsi: initio: Fix a few kernel-doc misdemeanours
>   scsi: a100u2w: Fix some misnaming and formatting issues
>   scsi: myrs: Add missing ':' to make the kernel-doc checker happy
>   scsi: pmcraid: Correct function name pmcraid_show_adapter_id() in
> header
>   scsi: mpt3sas: mpt3sas_scs: Fix a few kernel-doc issues
>   scsi: be2iscsi: be_main: Demote incomplete/non-conformant kernel-doc
> header
>   scsi: isci: phy: Fix a few different kernel-doc related issues
>   scsi: fnic: fnic_scsi: Demote non-conformant kernel-doc headers
>   scsi: fnic: fnic_fcs: Kernel-doc headers must contain the function
> name
>   scsi: isci: phy: Provide function name and demote non-conforming
> header
>   scsi: isci: request: Fix a myriad of kernel-doc issues
>   scsi: isci: host: Fix bunch of kernel-doc related issues
>   scsi: isci: task: Demote non-conformant header and remove superfluous
> param
>   scsi: isci: remote_node_table: Fix a bunch of kernel-doc misdemeanours
>   scsi: isci: remote_node_context: Fix one function header and demote a
> couple more
>   scsi: isci: port_config: Fix a bunch of doc-rot and demote abuses
>   scsi: isci: remote_device: Fix a bunch of doc-rot issues
>   scsi: isci: request: Fix doc-rot issue relating to 'ireq' param
>   scsi: isci: port: Fix a bunch of kernel-doc issues
>   scsi: isci: remote_node_context: Demote kernel-doc abuse
>   scsi: isci: remote_node_table: Provide some missing params and remove
> others
>   scsi: cxlflash: main: Fix a little do-rot
>   scsi: cxlflash: superpipe: Fix a few misnaming issues
>   scsi: ibmvscsi: Fix a bunch of kernel-doc related issues
>   scsi: ibmvscsi: ibmvfc: Fix a bunch of misdocumentation
>   scsi: ibmvscsi_tgt: ibmvscsi_tgt: Remove duplicate section 'NOTE'
>   scsi: cxlflash: vlun: Fix some misnaming related doc-rot
> 
> [...]

Applied to 5.13/scsi-queue, thanks!

[01/36] scsi: myrb: Demote non-conformant kernel-doc headers and fix others
https://git.kernel.org/mkp/scsi/c/12a1b740f225
[02/36] scsi: ipr: Fix incorrect function names in their headers
https://git.kernel.org/mkp/scsi/c/637b5c3ebc1c
[03/36] scsi: mvumi: Fix formatting and doc-rot issues
https://git.kernel.org/mkp/scsi/c/5ccd626516e1
[04/36] scsi: sd_zbc: Place function name into header
https://git.kernel.org/mkp/scsi/c/59863cb53d80
[05/36] scsi: pmcraid: Fix a whole host of kernel-doc issues
https://git.kernel.org/mkp/scsi/c/3673b7b0007b
[06/36] scsi: sd: Fix function name in header
https://git.kernel.org/mkp/scsi/c/ad907c54e36f
[07/36] scsi: aic94xx: aic94xx_dump: Correct misspelling of function 
asd_dump_seq_state()
https://git.kernel.org/mkp/scsi/c/3e2f4679ea03
[08/36] scsi: be2iscsi: be_main: Ensure function follows directly after its 
header
https://git.kernel.org/mkp/scsi/c/f1d50e8ee5c9
[09/36] scsi: dc395x: Fix some function param descriptions
https://git.kernel.org/mkp/scsi/c/33c8ef953ece
[10/36] scsi: initio: Fix a few kernel-doc misdemeanours
https://git.kernel.org/mkp/scsi/c/100ec495e01e
[11/36] scsi: a100u2w: Fix some misnaming and formatting issues
https://git.kernel.org/mkp/scsi/c/c548a6250627
[12/36] scsi: myrs: Add missing ':' to make the kernel-doc checker happy
https://git.kernel.org/mkp/scsi/c/9eb292eb2ef7
[13/36] scsi: pmcraid: Correct function name pmcraid_show_adapter_id() in header
https://git.kernel.org/mkp/scsi/c/a364a147b1dc
[14/36] scsi: mpt3sas: mpt3sas_scs: Fix a few kernel-doc issues
https://git.kernel.org/mkp/scsi/c/a8d548b0b3ee
[15/36] scsi: be2iscsi: be_main: Demote incomplete/non-conformant kernel-doc 
header
https://git.kernel.org/mkp/scsi/c/a90a8c607570
[16/36] scsi: isci: phy: Fix a few different kernel-doc related issues
https://git.kernel.org/mkp/scsi/c/6af1d9bd9051
[17/36] scsi: fnic: fnic_scsi: Demote non-conformant kernel-doc headers
https://git.kernel.org/mkp/scsi/c/c7eab0704c30
[18/36] scsi: fnic: fnic_fcs: Kernel-doc headers must contain the function name
https://git.kernel.org/mkp/scsi/c/2efd8631d6a5
[19/36] scsi: isci: phy: Provide function name and demote non-conforming header
 

Re: [PATCH v3] scsi: ufs: Tidy up WB configuration code

2021-03-24 Thread Martin K. Petersen
On Thu, 18 Mar 2021 17:55:36 +0800, Yue Hu wrote:

> There are similar code implementations for WB configuration in
> ufshcd_wb_{ctrl, toggle_flush_during_h8, toggle_flush}. We can
> extract the part to create a new helper with a flag parameter to
> reduce code duplication.
> 
> Meanwhile, rename ufshcd_wb_ctrl() to ufshcd_wb_toggle() for better
> readability.
> 
> [...]

Applied to 5.13/scsi-queue, thanks!

[1/1] scsi: ufs: Tidy up WB configuration code
  https://git.kernel.org/mkp/scsi/c/3b5f3c0d0548

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: fnic: Rudimentary spelling fixes throughout the file fnic_trace.c

2021-03-24 Thread Martin K. Petersen
On Wed, 17 Mar 2021 14:52:40 +0530, Bhaskar Chowdhury wrote:

> Rudimentary typo fixes throughout the file.

Applied to 5.13/scsi-queue, thanks!

[1/1] scsi: fnic: Rudimentary spelling fixes throughout the file fnic_trace.c
  https://git.kernel.org/mkp/scsi/c/bcf064bc2a3b

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 0/2] Fix EH race and MQ support

2021-03-24 Thread Martin K. Petersen
On Fri, 19 Mar 2021 14:50:27 -0600, Tyrel Datwyler wrote:

> Changes to the locking pattern protecting the event lists and handling of scsi
> command completion introduced a race where an outstanding command that EH is
> waiting for to complete is no longer identifiable by being on the sent list,
> but instead as a command that is not on the free list. This is a result of
> moving commands to be completed off the sent list to a private list to be
> completed outside the list lock.
> 
> [...]

Applied to 5.12/scsi-fixes, thanks!

[1/2] ibmvfc: fix potential race in ibmvfc_wait_for_ops
  https://git.kernel.org/mkp/scsi/c/8b1c9b202549
[2/2] ibmvfc: make ibmvfc_wait_for_ops MQ aware
  https://git.kernel.org/mkp/scsi/c/62fc2661482b

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] message: fusion: Fix a typo in the file mptbase.h

2021-03-24 Thread Martin K. Petersen
On Wed, 17 Mar 2021 15:42:38 +0530, Bhaskar Chowdhury wrote:

> s/contets/contents/

Applied to 5.13/scsi-queue, thanks!

[1/1] message: fusion: Fix a typo in the file mptbase.h
  https://git.kernel.org/mkp/scsi/c/69a1709e2ec8

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: qedi: fix error return code of qedi_alloc_global_queues()

2021-03-24 Thread Martin K. Petersen
On Sun, 7 Mar 2021 19:30:24 -0800, Jia-Ju Bai wrote:

> When kzalloc() returns NULL to qedi->global_queues[i], no error return
> code of qedi_alloc_global_queues() is assigned.
> To fix this bug, status is assigned with -ENOMEM in this case.
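
The bug class described in this patch can be sketched in user-space C. This is only an illustration under stated assumptions, not the qedi code: `alloc_queues()`, `struct queue` and the `fail_at` parameter are hypothetical, and `calloc()` stands in for `kzalloc()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-queue structure. */
struct queue {
	int id;
};

/*
 * Allocate n queues. fail_at >= 0 simulates an allocation failure at that
 * index (an assumption for the demo; kernel code would call kzalloc()).
 * Returns 0 on success or -ENOMEM on failure.
 */
static int alloc_queues(struct queue **queues, int n, int fail_at)
{
	int status = 0;
	int i;

	assert(queues && n >= 0);

	for (i = 0; i < n; i++) {
		queues[i] = (i == fail_at) ? NULL : calloc(1, sizeof(*queues[i]));
		if (!queues[i]) {
			/* Without this assignment the function would return 0. */
			status = -ENOMEM;
			goto free_queues;
		}
		queues[i]->id = i;
	}
	return 0;

free_queues:
	/* Unwind only the entries that were successfully allocated. */
	while (--i >= 0) {
		free(queues[i]);
		queues[i] = NULL;
	}
	return status;
}
```

If the `status = -ENOMEM` line is dropped, the unwind path still runs but the caller sees success, which is the symptom the commit message describes.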

Applied to 5.12/scsi-fixes, thanks!

[1/1] scsi: qedi: fix error return code of qedi_alloc_global_queues()
  https://git.kernel.org/mkp/scsi/c/f69953837ca5

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: mpt3sas: fix error return code of mpt3sas_base_attach()

2021-03-24 Thread Martin K. Petersen
On Sun, 7 Mar 2021 19:52:41 -0800, Jia-Ju Bai wrote:

> When kzalloc() returns NULL, no error return code of
> mpt3sas_base_attach() is assigned.
> To fix this bug, r is assigned with -ENOMEM in this case.

Applied to 5.12/scsi-fixes, thanks!

[1/1] scsi: mpt3sas: fix error return code of mpt3sas_base_attach()
  https://git.kernel.org/mkp/scsi/c/3401ecf7fc1b

-- 
Martin K. Petersen  Oracle Linux Engineering


[PATCH v4] audit: log nftables configuration change events once per table

2021-03-24 Thread Richard Guy Briggs
Reduce logging of nftables events to a level similar to iptables.
Restore the table field to list the table, adding the generation.

Indicate the op as the most significant operation in the event.

A couple of sample events:

type=PROCTITLE msg=audit(2021-03-18 09:30:49.801:143) : 
proctitle=/usr/bin/python3 -s /usr/sbin/firewalld --nofork --nopid
type=SYSCALL msg=audit(2021-03-18 09:30:49.801:143) : arch=x86_64 
syscall=sendmsg success=yes exit=172 a0=0x6 a1=0x7ffdcfcbe650 a2=0x0 
a3=0x7ffdcfcbd52c items=0 ppid=1 pid=367 auid=unset uid=root gid=root euid=root 
suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset 
comm=firewalld exe=/usr/bin/python3.9 subj=system_u:system_r:firewalld_t:s0 
key=(null)
type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : table=firewalld:2 
family=ipv6 entries=1 op=nft_register_table pid=367 
subj=system_u:system_r:firewalld_t:s0 comm=firewalld
type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : table=firewalld:2 
family=ipv4 entries=1 op=nft_register_table pid=367 
subj=system_u:system_r:firewalld_t:s0 comm=firewalld
type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : table=firewalld:2 
family=inet entries=1 op=nft_register_table pid=367 
subj=system_u:system_r:firewalld_t:s0 comm=firewalld

type=PROCTITLE msg=audit(2021-03-18 09:30:49.839:144) : 
proctitle=/usr/bin/python3 -s /usr/sbin/firewalld --nofork --nopid
type=SYSCALL msg=audit(2021-03-18 09:30:49.839:144) : arch=x86_64 
syscall=sendmsg success=yes exit=22792 a0=0x6 a1=0x7ffdcfcbe650 a2=0x0 
a3=0x7ffdcfcbd52c items=0 ppid=1 pid=367 auid=unset uid=root gid=root euid=root 
suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset 
comm=firewalld exe=/usr/bin/python3.9 subj=system_u:system_r:firewalld_t:s0 
key=(null)
type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : table=firewalld:3 
family=ipv6 entries=30 op=nft_register_chain pid=367 
subj=system_u:system_r:firewalld_t:s0 comm=firewalld
type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : table=firewalld:3 
family=ipv4 entries=30 op=nft_register_chain pid=367 
subj=system_u:system_r:firewalld_t:s0 comm=firewalld
type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : table=firewalld:3 
family=inet entries=165 op=nft_register_chain pid=367 
subj=system_u:system_r:firewalld_t:s0 comm=firewalld

The issue was originally documented in
https://github.com/linux-audit/audit-kernel/issues/124

Signed-off-by: Richard Guy Briggs 
---
Changelog:
v4:
- move nf_tables_commit_audit_log() before nf_tables_commit_release() [fw]
- move nft2audit_op[] from audit.h to nf_tables_api.c

v3:
- fix function braces, reduce parameter scope [pna]
- pre-allocate nft_audit_data per table in step 1, bail on ENOMEM [pna]

v2:
- convert NFT ops to array indices in nft2audit_op[] [ps]
- use linux lists [pna]
- use functions for each of collection and logging of audit data [pna]
---
 net/netfilter/nf_tables_api.c | 187 +++---
 1 file changed, 104 insertions(+), 83 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index c1eb5cdb3033..9c930fe72005 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -66,6 +66,41 @@ static const struct rhashtable_params nft_objname_ht_params = {
 	.automatic_shrinking	= true,
 };
 
+struct nft_audit_data {
+   struct nft_table *table;
+   int entries;
+   int op;
+   struct list_head list;
+};
+
+static const u8 nft2audit_op[NFT_MSG_MAX] = { // enum nf_tables_msg_types
+   [NFT_MSG_NEWTABLE]  = AUDIT_NFT_OP_TABLE_REGISTER,
+   [NFT_MSG_GETTABLE]  = AUDIT_NFT_OP_INVALID,
+   [NFT_MSG_DELTABLE]  = AUDIT_NFT_OP_TABLE_UNREGISTER,
+   [NFT_MSG_NEWCHAIN]  = AUDIT_NFT_OP_CHAIN_REGISTER,
+   [NFT_MSG_GETCHAIN]  = AUDIT_NFT_OP_INVALID,
+   [NFT_MSG_DELCHAIN]  = AUDIT_NFT_OP_CHAIN_UNREGISTER,
+   [NFT_MSG_NEWRULE]   = AUDIT_NFT_OP_RULE_REGISTER,
+   [NFT_MSG_GETRULE]   = AUDIT_NFT_OP_INVALID,
+   [NFT_MSG_DELRULE]   = AUDIT_NFT_OP_RULE_UNREGISTER,
+   [NFT_MSG_NEWSET]= AUDIT_NFT_OP_SET_REGISTER,
+   [NFT_MSG_GETSET]= AUDIT_NFT_OP_INVALID,
+   [NFT_MSG_DELSET]= AUDIT_NFT_OP_SET_UNREGISTER,
+   [NFT_MSG_NEWSETELEM]= AUDIT_NFT_OP_SETELEM_REGISTER,
+   [NFT_MSG_GETSETELEM]= AUDIT_NFT_OP_INVALID,
+   [NFT_MSG_DELSETELEM]= AUDIT_NFT_OP_SETELEM_UNREGISTER,
+   [NFT_MSG_NEWGEN]= AUDIT_NFT_OP_GEN_REGISTER,
+   [NFT_MSG_GETGEN]= AUDIT_NFT_OP_INVALID,
+   [NFT_MSG_TRACE] = AUDIT_NFT_OP_INVALID,
+   [NFT_MSG_NEWOBJ]= AUDIT_NFT_OP_OBJ_REGISTER,
+   [NFT_MSG_GETOBJ]= AUDIT_NFT_OP_INVALID,
+   [NFT_MSG_DELOBJ]= AUDIT_NFT_OP_OBJ_UNREGISTER,
+   [NFT_MSG_GETOBJ_RESET]  = AUDIT_NFT_OP_OBJ_RESET,
+   [NFT_MSG_NEWFLOWTABLE]  = AUDIT_NFT_OP_FLOWTABLE_REGISTER,
+   [NFT_MSG_GETFLOWTABLE]  

[PATCH resend 3/4] nfc: fix memory leak in llcp_sock_connect()

2021-03-24 Thread Xiaoming Ni
In llcp_sock_connect(), kmemdup() allocates memory for
"llcp_sock->service_name". That memory is not released at the sock_unlink
label on the subsequent failure path.
As a result, the memory leaks.

Fixes CVE-2020-25672

Fixes: d646960f7986 ("NFC: Initial LLCP support")
Reported-by: "kiyin(尹亮)" 
Link: https://www.openwall.com/lists/oss-security/2020/11/01/1
Cc:  #v3.3
Signed-off-by: Xiaoming Ni 
---
 net/nfc/llcp_sock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 9e2799ee1595..59172614b249 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -746,6 +746,8 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr,
 
 sock_unlink:
nfc_llcp_sock_unlink(&local->connecting_sockets, sk);
+   kfree(llcp_sock->service_name);
+   llcp_sock->service_name = NULL;
 
 sock_llcp_release:
nfc_llcp_put_ssap(local, llcp_sock->ssap);
-- 
2.27.0
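
The unwind pattern this patch adds can be sketched in user-space C. This is a minimal sketch under stated assumptions: `struct demo_sock`, `demo_connect()` and the `link_ok` flag are hypothetical, `malloc()`/`memcpy()` stand in for `kmemdup()`, and `-ENODEV` is an arbitrary error code chosen for the demo.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical socket state; only the fields relevant to the leak. */
struct demo_sock {
	char *service_name;
	size_t service_name_len;
};

/*
 * Duplicate the name, then run a later step that may fail.
 * link_ok == 0 simulates the failure that jumps to the unwind label.
 */
static int demo_connect(struct demo_sock *s, const char *name, int link_ok)
{
	int ret;

	assert(s && name);

	s->service_name_len = strlen(name);
	s->service_name = malloc(s->service_name_len + 1);
	if (!s->service_name)
		return -ENOMEM;
	memcpy(s->service_name, name, s->service_name_len + 1);

	if (!link_ok) {
		ret = -ENODEV; /* arbitrary error code for the demo */
		goto sock_unlink;
	}
	return 0;

sock_unlink:
	/* Release the duplicate and clear the pointer so it cannot be reused. */
	free(s->service_name);
	s->service_name = NULL;
	return ret;
}
```

Clearing the pointer after freeing it mirrors the two added lines in the diff and keeps a later cleanup path from freeing the same buffer again.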



[PATCH resend 0/4] nfc: fix Resource leakage and endless loop

2021-03-24 Thread Xiaoming Ni
Fix resource leaks and an endless loop in net/nfc/llcp_sock.c,
reported by "kiyin(尹亮)".

Link: https://www.openwall.com/lists/oss-security/2020/11/01/1

Xiaoming Ni (4):
  nfc: fix refcount leak in llcp_sock_bind()
  nfc: fix refcount leak in llcp_sock_connect()
  nfc: fix memory leak in llcp_sock_connect()
  nfc: Avoid endless loops caused by repeated llcp_sock_connect()

 net/nfc/llcp_sock.c | 10 ++
 1 file changed, 10 insertions(+)

-- 
2.27.0



[PATCH resend 1/4] nfc: fix refcount leak in llcp_sock_bind()

2021-03-24 Thread Xiaoming Ni
nfc_llcp_local_get() is invoked in llcp_sock_bind(),
but nfc_llcp_local_put() is not invoked in subsequent failure branches.
As a result, the refcount leaks.
To fix it, call nfc_llcp_local_put() on those failure paths.

Fixes CVE-2020-25670
Fixes: c7aa12252f51 ("NFC: Take a reference on the LLCP local pointer when creating a socket")
Reported-by: "kiyin(尹亮)" 
Link: https://www.openwall.com/lists/oss-security/2020/11/01/1
Cc:  #v3.6
Signed-off-by: Xiaoming Ni 
---
 net/nfc/llcp_sock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index d257ed3b732a..68832ee4b9f8 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -108,11 +108,13 @@ static int llcp_sock_bind(struct socket *sock, struct sockaddr *addr, int alen)
  llcp_sock->service_name_len,
  GFP_KERNEL);
if (!llcp_sock->service_name) {
+   nfc_llcp_local_put(llcp_sock->local);
ret = -ENOMEM;
goto put_dev;
}
llcp_sock->ssap = nfc_llcp_get_sdp_ssap(local, llcp_sock);
if (llcp_sock->ssap == LLCP_SAP_MAX) {
+   nfc_llcp_local_put(llcp_sock->local);
kfree(llcp_sock->service_name);
llcp_sock->service_name = NULL;
ret = -EADDRINUSE;
-- 
2.27.0
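
The get/put imbalance fixed here can be sketched in user-space C. The names `struct demo_local`, `struct demo_bind_sock`, `demo_local_get()`, `demo_local_put()` and `demo_bind()` are hypothetical, and the `name_ok` flag simulates the kmemdup() failure; this is an illustration of the pattern, not the NFC code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical refcounted object standing in for the LLCP local. */
struct demo_local {
	int refcount;
};

struct demo_bind_sock {
	struct demo_local *local;
};

static struct demo_local *demo_local_get(struct demo_local *l)
{
	l->refcount++;
	return l;
}

static void demo_local_put(struct demo_local *l)
{
	assert(l->refcount > 0);
	l->refcount--;
}

/*
 * Take a reference, then run a step that may fail (name_ok == 0).
 * Every failure path after the get must drop the reference again.
 */
static int demo_bind(struct demo_bind_sock *s, struct demo_local *local,
		     int name_ok)
{
	s->local = demo_local_get(local);

	if (!name_ok) {
		demo_local_put(s->local); /* the call the patch adds */
		return -ENOMEM;
	}
	return 0;
}
```

After a failed `demo_bind()` the reference count is back at its starting value; without the put it would grow by one on each failed attempt.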



[PATCH resend 2/4] nfc: fix refcount leak in llcp_sock_connect()

2021-03-24 Thread Xiaoming Ni
nfc_llcp_local_get() is invoked in llcp_sock_connect(),
but nfc_llcp_local_put() is not invoked in subsequent failure branches.
As a result, the refcount leaks.
To fix it, call nfc_llcp_local_put() on those failure paths.

Fixes CVE-2020-25671
Fixes: c7aa12252f51 ("NFC: Take a reference on the LLCP local pointer when creating a socket")
Reported-by: "kiyin(尹亮)" 
Link: https://www.openwall.com/lists/oss-security/2020/11/01/1
Cc:  #v3.6
Signed-off-by: Xiaoming Ni 
---
 net/nfc/llcp_sock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 68832ee4b9f8..9e2799ee1595 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -704,6 +704,7 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr,
llcp_sock->local = nfc_llcp_local_get(local);
llcp_sock->ssap = nfc_llcp_get_local_ssap(local);
if (llcp_sock->ssap == LLCP_SAP_MAX) {
+   nfc_llcp_local_put(llcp_sock->local);
ret = -ENOMEM;
goto put_dev;
}
@@ -748,6 +749,7 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr,
 
 sock_llcp_release:
nfc_llcp_put_ssap(local, llcp_sock->ssap);
+   nfc_llcp_local_put(llcp_sock->local);
 
 put_dev:
nfc_put_device(dev);
-- 
2.27.0



[PATCH resend 4/4] nfc: Avoid endless loops caused by repeated llcp_sock_connect()

2021-03-24 Thread Xiaoming Ni
When sock_wait_state() returns -EINPROGRESS, "sk->sk_state" is
LLCP_CONNECTING. If llcp_sock_connect() is invoked again in this state,
nfc_llcp_sock_link() adds sk to local->connecting_sockets a second time.
sk->sk_node.next then points to itself, which creates an endless loop
and hangs the system.
To fix it, check whether sk->sk_state is LLCP_CONNECTING in
llcp_sock_connect() and return early instead of linking the socket again.

Fixes: b4011239a08e ("NFC: llcp: Fix non blocking sockets connections")
Reported-by: "kiyin(尹亮)" 
Link: https://www.openwall.com/lists/oss-security/2020/11/01/1
Cc:  #v3.11
Signed-off-by: Xiaoming Ni 
---
 net/nfc/llcp_sock.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 59172614b249..a3b46f03 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -673,6 +673,10 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr,
ret = -EISCONN;
goto error;
}
+   if (sk->sk_state == LLCP_CONNECTING) {
+   ret = -EINPROGRESS;
+   goto error;
+   }
 
dev = nfc_get_device(addr->dev_idx);
if (dev == NULL) {
-- 
2.27.0
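
Why a second insertion produces a self-referencing node can be seen in a small user-space sketch. It assumes a head-insertion singly linked list (the hypothetical `demo_list_add_head()`), which is a simplification of the kernel's hlist; `demo_list_count_bounded()` is a hypothetical helper that gives up after a step limit so the demo itself does not spin forever.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node {
	struct demo_node *next;
};

struct demo_list {
	struct demo_node *first;
};

/* Head insertion: the new node points at the old first element. */
static void demo_list_add_head(struct demo_list *l, struct demo_node *n)
{
	n->next = l->first;
	l->first = n;
}

/*
 * Count the nodes, but stop after 'limit' steps.
 * Returns the count, or -1 if the limit was exceeded (a cycle is likely).
 */
static int demo_list_count_bounded(const struct demo_list *l, int limit)
{
	const struct demo_node *n;
	int count = 0;

	assert(limit > 0);

	for (n = l->first; n; n = n->next) {
		if (++count > limit)
			return -1;
	}
	return count;
}
```

Adding the same node twice makes `n->next = l->first` assign the node's own address to its `next` field, so any unbounded walk over the list never terminates. Rejecting the second connect attempt with -EINPROGRESS, as the patch does, avoids the second insertion.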



Re: [PATCH] tee: optee: add invoke_fn tracepoints

2021-03-24 Thread Jisheng Zhang
On Wed, 24 Mar 2021 10:53:13 -0400
Steven Rostedt  wrote:


> 
> On Wed, 24 Mar 2021 07:48:53 -0700
> Guenter Roeck  wrote:
> 
> > On Wed, Mar 24, 2021 at 07:34:07AM -0700, Guenter Roeck wrote:  
> > > On Wed, Feb 10, 2021 at 02:44:09PM +0800, Jisheng Zhang wrote:  
> > > > Add tracepoints to retrieve information about the invoke_fn. This would
> > > > help to measure how many invoke_fn are triggered and how long it takes
> > > > to complete one invoke_fn call.
> > > >
> > > > Signed-off-by: Jisheng Zhang   
> > >
> > > arm64:defconfig:
> > >
> > > make-arm64 -j drivers/tee/optee/call.o
> > > >   CALL    scripts/atomic/check-atomics.sh
> > > >   CALL    scripts/checksyscalls.sh
> > > >   CC      drivers/tee/optee/call.o
> > > In file included from drivers/tee/optee/optee_trace.h:67,
> > >  from drivers/tee/optee/call.c:18:
> > > ./include/trace/define_trace.h:95:42: fatal error: ./optee_trace.h: No 
> > > such file or directory
> > >95 | #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
> > >   |  ^
> > > compilation terminated.

Interesting. I always build the Linux kernel with "O=", so I did not see this
build error, and IIRC we did not receive any lkp robot build error report.

My steps are:

mkdir /tmp/test

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- O=/tmp/test defconfig

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- O=/tmp/test drivers/tee/optee/

Today I built the Linux kernel without "O=..." and reproduced this error.
This is the first time I have seen "O=" change the build behaviour.

I'll send out a patch to fix it.

Thanks

> > >  
> >
> > The problem also affects arm:imx_v6_v7_defconfig.
> >  
> 
> I think it affects everything. The problem is that the
> drivers/tee/optee/Makefile needs to be updated with:
> 
> CFLAGS_call.o := -I$(src)
> 
> otherwise the compiler won't know how to find the path to optee_trace.h.
> 
> This is described in:
> 
>samples/trace_events/Makefile

Thanks, Steven, for pointing this out.



Re: [PATCH 2/2] media: videobuf2: cleanup size argument from attach_dmabuf()

2021-03-24 Thread kernel test robot
Hi Helen,

I love your patch! Yet something to improve:

[auto build test ERROR on linuxtv-media/master]
[also build test ERROR on next-20210324]
[cannot apply to v5.12-rc4]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Helen-Koike/media-videobuf2-use-dmabuf-size-for-length/20210325-082047
base:   git://linuxtv.org/media_tree.git master
config: powerpc64-randconfig-r016-20210325 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 
5d6b4aa80d6df62b924a12af030c5ded868ee4f1)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc64 cross compiling tool for clang build
# apt-get install binutils-powerpc64-linux-gnu
# 
https://github.com/0day-ci/linux/commit/41e2cea31db8378b33e31785aec668a009d1355b
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Helen-Koike/media-videobuf2-use-dmabuf-size-for-length/20210325-082047
git checkout 41e2cea31db8378b33e31785aec668a009d1355b
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross 
ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/media/common/videobuf2/videobuf2-dma-sg.c:631:14: error: use of 
>> undeclared identifier 'dmabuf'; did you mean 'dbuf'?
   buf->size = dmabuf->size;
   ^~
   dbuf
   drivers/media/common/videobuf2/videobuf2-dma-sg.c:608:75: note: 'dbuf' 
declared here
   static void *vb2_dma_sg_attach_dmabuf(struct device *dev, struct dma_buf 
*dbuf,
 ^
   1 error generated.


vim +631 drivers/media/common/videobuf2/videobuf2-dma-sg.c

   607  
   608  static void *vb2_dma_sg_attach_dmabuf(struct device *dev, struct 
dma_buf *dbuf,
   609enum dma_data_direction dma_dir)
   610  {
   611  struct vb2_dma_sg_buf *buf;
   612  struct dma_buf_attachment *dba;
   613  
   614  if (WARN_ON(!dev))
   615  return ERR_PTR(-EINVAL);
   616  
   617  buf = kzalloc(sizeof(*buf), GFP_KERNEL);
   618  if (!buf)
   619  return ERR_PTR(-ENOMEM);
   620  
   621  buf->dev = dev;
   622  /* create attachment for the dmabuf with the user device */
   623  dba = dma_buf_attach(dbuf, buf->dev);
   624  if (IS_ERR(dba)) {
   625  pr_err("failed to attach dmabuf\n");
   626  kfree(buf);
   627  return dba;
   628  }
   629  
   630  buf->dma_dir = dma_dir;
 > 631  buf->size = dmabuf->size;
   632  buf->db_attach = dba;
   633  
   634  return buf;
   635  }
   636  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH 5.10 000/150] 5.10.26-rc3 review

2021-03-24 Thread Florian Fainelli



On 3/24/2021 2:40 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.26 release.
> There are 150 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Fri, 26 Mar 2021 09:33:54 +0000.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.26-rc3.gz
> or in the git tree and branch at:
>   
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-5.10.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

On ARCH_BRCMSTB, using 32-bit and 64-bit ARM kernels:

Tested-by: Florian Fainelli 
-- 
Florian


[PATCH 1/3] ASoC:codec:max98373: Changed amp shutdown register as volatile

2021-03-24 Thread Ryan Lee
The 0x20FF (amp global enable) register was defined as non-volatile,
but it is volatile: overheating or overcurrent can shut the amp down
in hardware.
'regmap_write' compares against the register readback value before writing
to avoid rewriting the same value, and 'regmap_read' returns the cached
value rather than the actual hardware value for a non-volatile register.
When the amp is shut down internally for some reason, the next 'AMP ON'
command can be ignored because regmap thinks the amp is already ON.

Signed-off-by: Ryan Lee 
---
 sound/soc/codecs/max98373-i2c.c | 1 +
 sound/soc/codecs/max98373-sdw.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/sound/soc/codecs/max98373-i2c.c b/sound/soc/codecs/max98373-i2c.c
index 85f6865019d4..ddb6436835d7 100644
--- a/sound/soc/codecs/max98373-i2c.c
+++ b/sound/soc/codecs/max98373-i2c.c
@@ -446,6 +446,7 @@ static bool max98373_volatile_reg(struct device *dev, unsigned int reg)
case MAX98373_R2054_MEAS_ADC_PVDD_CH_READBACK:
case MAX98373_R2055_MEAS_ADC_THERM_CH_READBACK:
case MAX98373_R20B6_BDE_CUR_STATE_READBACK:
+   case MAX98373_R20FF_GLOBAL_SHDN:
case MAX98373_R21FF_REV_ID:
return true;
default:
diff --git a/sound/soc/codecs/max98373-sdw.c b/sound/soc/codecs/max98373-sdw.c
index d8c47667a9ea..f3a12205cd48 100644
--- a/sound/soc/codecs/max98373-sdw.c
+++ b/sound/soc/codecs/max98373-sdw.c
@@ -220,6 +220,7 @@ static bool max98373_volatile_reg(struct device *dev, unsigned int reg)
case MAX98373_R2054_MEAS_ADC_PVDD_CH_READBACK:
case MAX98373_R2055_MEAS_ADC_THERM_CH_READBACK:
case MAX98373_R20B6_BDE_CUR_STATE_READBACK:
+   case MAX98373_R20FF_GLOBAL_SHDN:
case MAX98373_R21FF_REV_ID:
/* SoundWire Control Port Registers */
case MAX98373_R0040_SCP_INIT_STAT_1 ... MAX98373_R0070_SCP_FRAME_CTLR:
-- 
2.17.1



[PATCH 3/3] ASoC:codec:max98373: Added controls for autorestart config

2021-03-24 Thread Ryan Lee
3 new controls are added.
"OVC Autorestart Switch" : controls whether or not the speaker amplifier
automatically re-enables after an overcurrent fault condition.
"THERM Autorestart Switch" : controls whether or not the device
automatically resumes playback when the die temperature recovers from
thermal shutdown.
"CMON Autorestart Switch" : controls whether or not the device
automatically resumes playback when the clock returns after stopping.

The auto restart functions above are enabled by default.

Signed-off-by: Ryan Lee 
---
 sound/soc/codecs/max98373.c | 14 ++
 sound/soc/codecs/max98373.h |  3 +++
 2 files changed, 17 insertions(+)

diff --git a/sound/soc/codecs/max98373.c b/sound/soc/codecs/max98373.c
index 1346a98ce8a1..e14fe98349a5 100644
--- a/sound/soc/codecs/max98373.c
+++ b/sound/soc/codecs/max98373.c
@@ -204,6 +204,15 @@ SOC_SINGLE("Ramp Up Switch", MAX98373_R203F_AMP_DSP_CFG,
MAX98373_AMP_DSP_CFG_RMP_UP_SHIFT, 1, 0),
 SOC_SINGLE("Ramp Down Switch", MAX98373_R203F_AMP_DSP_CFG,
MAX98373_AMP_DSP_CFG_RMP_DN_SHIFT, 1, 0),
+/* Speaker Amplifier Overcurrent Automatic Restart Enable */
+SOC_SINGLE("OVC Autorestart Switch", MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG,
+   MAX98373_OVC_AUTORESTART_SHIFT, 1, 0),
+/* Thermal Shutdown Automatic Restart Enable */
+SOC_SINGLE("THERM Autorestart Switch", MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG,
+   MAX98373_THERM_AUTORESTART_SHIFT, 1, 0),
+/* Clock Monitor Automatic Restart Enable */
+SOC_SINGLE("CMON Autorestart Switch", MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG,
+   MAX98373_CMON_AUTORESTART_SHIFT, 1, 0),
 SOC_SINGLE("CLK Monitor Switch", MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG,
MAX98373_CLOCK_MON_SHIFT, 1, 0),
 SOC_SINGLE("Dither Switch", MAX98373_R203F_AMP_DSP_CFG,
@@ -392,6 +401,11 @@ static int max98373_probe(struct snd_soc_component *component)
MAX98373_R2021_PCM_TX_HIZ_EN_2,
1 << (max98373->i_slot - 8), 0);
 
+   /* enable auto restart function by default */
+   regmap_write(max98373->regmap,
+   MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG,
+   0xF);
+
/* speaker feedback slot configuration */
regmap_write(max98373->regmap,
MAX98373_R2023_PCM_TX_SRC_2,
diff --git a/sound/soc/codecs/max98373.h b/sound/soc/codecs/max98373.h
index 71f5a5228f34..73a2cf69d84a 100644
--- a/sound/soc/codecs/max98373.h
+++ b/sound/soc/codecs/max98373.h
@@ -195,6 +195,9 @@
 #define MAX98373_LIMITER_EN_SHIFT (0)
 
 /* MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG */
+#define MAX98373_OVC_AUTORESTART_SHIFT (3)
+#define MAX98373_THERM_AUTORESTART_SHIFT (2)
+#define MAX98373_CMON_AUTORESTART_SHIFT (1)
 #define MAX98373_CLOCK_MON_SHIFT (0)
 
 /* MAX98373_R20FF_GLOBAL_SHDN */
-- 
2.17.1
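
How the shift definitions relate to the 0xF written in max98373_probe() can be checked with a short sketch. The shift values are copied from the header hunk above; the `DEMO_` prefix and the `demo_set_bit()` helper are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Bit positions copied from the patch (MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG). */
#define DEMO_OVC_AUTORESTART_SHIFT   3
#define DEMO_THERM_AUTORESTART_SHIFT 2
#define DEMO_CMON_AUTORESTART_SHIFT  1
#define DEMO_CLOCK_MON_SHIFT         0

/* Set or clear one single-bit field in a register value. */
static unsigned int demo_set_bit(unsigned int cfg, unsigned int shift, int on)
{
	assert(shift <= DEMO_OVC_AUTORESTART_SHIFT);

	if (on)
		return cfg | (1u << shift);
	return cfg & ~(1u << shift);
}
```

Setting all four bits yields 0xF, which matches the value the probe function writes; clearing a single bit, for example the OVC one, gives 0x7.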



[PATCH 2/3] ASoC:codec:max98373: Added 30ms turn on/off time delay

2021-03-24 Thread Ryan Lee
Amp requires 10 ~ 30ms for the power ON and OFF.
Added 30ms delay for stability.

Signed-off-by: Ryan Lee 
---
 sound/soc/codecs/max98373.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sound/soc/codecs/max98373.c b/sound/soc/codecs/max98373.c
index 746c829312b8..1346a98ce8a1 100644
--- a/sound/soc/codecs/max98373.c
+++ b/sound/soc/codecs/max98373.c
@@ -28,11 +28,13 @@ static int max98373_dac_event(struct snd_soc_dapm_widget *w,
regmap_update_bits(max98373->regmap,
MAX98373_R20FF_GLOBAL_SHDN,
MAX98373_GLOBAL_EN_MASK, 1);
+   usleep_range(30000, 31000);
break;
case SND_SOC_DAPM_POST_PMD:
regmap_update_bits(max98373->regmap,
MAX98373_R20FF_GLOBAL_SHDN,
MAX98373_GLOBAL_EN_MASK, 0);
+   usleep_range(30000, 31000);
max98373->tdm_mode = false;
break;
default:
-- 
2.17.1



Re: [PATCH V2] arm64: dts: qcom: sc7280: Add nodes for eMMC and SD card

2021-03-24 Thread Veerabhadrarao Badiganti



On 3/24/2021 9:58 PM, Stephen Boyd wrote:

Quoting Stephen Boyd (2021-03-24 08:57:33)

Quoting sbh...@codeaurora.org (2021-03-24 08:23:55)

On 2021-03-23 12:31, Stephen Boyd wrote:

Quoting Shaik Sajida Bhanu (2021-03-20 11:17:00)

+
+   bus-width = <8>;
+   non-removable;
+   supports-cqe;
+   no-sd;
+   no-sdio;
+
+   max-frequency = <192000000>;

Is this necessary?

yes, to avoid lower speed modes running with high clock rates.

Is it part of the DT binding? I don't see any mention of it.

Nevermind, found it in mmc-controller.yaml. But I think this is to work
around some problem with the clk driver picking lower speeds than
requested? That has been fixed on the clk driver side (see commit like
148ddaa89d4a "clk: qcom: gcc-sc7180: Use floor ops for the correct sdcc1
clk") so ideally this property can be omitted.

This is a good-to-have DT property.

It aligns clock requests between the mmc core layer and the sdhci-msm
platform driver. For example, for the HS200/HS400 modes of eMMC, the mmc core
layer tries to set the clock to 200 MHz, whereas sdhci-msm expects 192 MHz for
these modes, so we have to rely on the clock driver's floor/ceil values.
With this property, the mmc core layer itself requests 192 MHz.

The same applies to the SD card SDR104 mode: the core layer expects the clock
at 208 MHz, whereas sdhci-msm can operate at a maximum of 202 MHz. With this
property, the core layer requests only 202 MHz for SDR104 mode.

BTW, this helps only for the maximum possible speed modes.
For lower-speed modes (for DDR52) we still need to rely on clock
floor rounding.
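
The effect of the property can be sketched as a simple clamp. This is an illustration only: `demo_clamp_clock()` is a hypothetical helper, and the assumption is that the core caps each requested rate at the maximum frequency taken from the device tree.

```c
#include <assert.h>

/*
 * Cap a requested clock rate at the host maximum (both in Hz).
 * f_max_hz models the value supplied through "max-frequency".
 */
static unsigned int demo_clamp_clock(unsigned int requested_hz,
				     unsigned int f_max_hz)
{
	assert(f_max_hz > 0);

	return requested_hz > f_max_hz ? f_max_hz : requested_hz;
}
```

With a 192 MHz cap, a 200 MHz HS200/HS400 request becomes 192 MHz, while a 52 MHz request passes through unchanged. That is why the lower-speed modes still depend on the clock driver's floor rounding.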



Re: [PATCH] powerpc/asm-offsets: GPR14 is not needed either

2021-03-24 Thread Rashmica Gupta
On Mon, 2021-03-15 at 11:01 +, Christophe Leroy wrote:
> Commit aac6a91fea93 ("powerpc/asm: Remove unused symbols in
> asm-offsets.c") removed GPR15 to GPR31 but kept GPR14,
> probably because it pops up in a couple of comments when doing
> a grep.
> 
> However, it was never used either, so remove it as well.
> 

Looks good to me.

Reviewed-by: Rashmica Gupta 

> Fixes: aac6a91fea93 ("powerpc/asm: Remove unused symbols in asm-
> offsets.c")
> Cc: Rashmica Gupta 
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/kernel/asm-offsets.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/asm-offsets.c
> b/arch/powerpc/kernel/asm-offsets.c
> index f3a662201a9f..4d230c5c7099 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -323,9 +323,6 @@ int main(void)
>   STACK_PT_REGS_OFFSET(GPR11, gpr[11]);
>   STACK_PT_REGS_OFFSET(GPR12, gpr[12]);
>   STACK_PT_REGS_OFFSET(GPR13, gpr[13]);
> -#ifndef CONFIG_PPC64
> - STACK_PT_REGS_OFFSET(GPR14, gpr[14]);
> -#endif /* CONFIG_PPC64 */
>   /*
>* Note: these symbols include _ because they overlap with
> special
>* register names



[question] kernel panic at timerqueue_add+32

2021-03-24 Thread wangjian
On an x86 platform running a 3.10 kernel, we encountered the following
problem. Our analysis so far is described below; we hope you can help.

The kernel panics at timerqueue_add+32. The stack information is as follows.

crash> bt -c 3
PID: 27797  TASK: 9f9e28805f40  CPU: 3   COMMAND: "ipmi_sim"
#0 [9f9ec0ac3dd0] die at ac82f97b
#1 [9f9ec0ac3e00] do_general_protection at acf3211e
#2 [9f9ec0ac3e30] general_protection at acf31718
[exception RIP: timerqueue_add+32]
RIP: acb67340  RSP: 9f9ec0ac3ee0  RFLAGS: 00010006
RAX: 7401f88348078b48  RBX: 9f9ec0ad3fa0  RCX: 
RDX: ac8d4395  RSI: 9f9ec0ad3fa0  RDI: ac8d4395
RBP: 9f9ec0ac3ef0   R8: 00405b31f6958080   R9: 9f9ec0ac3de0
R10: 0002  R11: 9f9ec0ac3de8  R12: ac8d4395
R13: ac8d4385  R14: 0001  R15: 9f9ec0ad3b58
ORIG_RAX:   CS: 0010  SS: 0018
#3 [9f9ec0ac3ef8] enqueue_hrtimer at ac8c32f5
#4 [9f9ec0ac3f20] __hrtimer_run_queues at ac8c3c7d
#5 [9f9ec0ac3f78] hrtimer_interrupt at ac8c41af
#6 [9f9ec0ac3fc0] local_apic_timer_interrupt at ac85aeeb
#7 [9f9ec0ac3fd8] smp_apic_timer_interrupt at acf3f0a3
#8 [9f9ec0ac3ff0] apic_timer_interrupt at acf3b7ba
---  ---
bt: cannot transition from IRQ stack to current process stack:
IRQ stack pointer: 9f9ec0ac3dd0
process stack pointer: 9f708e693df8
   current stack base: 9f9e25764000

We first disassemble timerqueue_add+32 and look at the corresponding source:

crash> dis -l timerqueue_add+32
/usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 52
0xacb67340 :  mov    0x18(%rax),%rsi

39 void timerqueue_add(struct timerqueue_head *head, struct timerqueue_node *node)
40 {
41         struct rb_node **p = &head->head.rb_node;
42         struct rb_node *parent = NULL;
43         struct timerqueue_node  *ptr;
44
45         /* Make sure we don't add nodes that are already added */
46         WARN_ON_ONCE(!RB_EMPTY_NODE(&node->node));
47
48         while (*p) {
49                 parent = *p;
50                 ptr = rb_entry(parent, struct timerqueue_node, node);
51                 if (node->expires.tv64 < ptr->expires.tv64)
52                         p = &(*p)->rb_left; //at here, the p is the invalid address
53                 else
54                         p = &(*p)->rb_right;
55         }
56         rb_link_node(&node->node, parent, p);
57         rb_insert_color(&node->node, &head->head);
58
59         if (!head->next || node->expires.tv64 < head->next->expires.tv64)
60                 head->next = node;
61 }
62 EXPORT_SYMBOL_GPL(timerqueue_add);


Next we disassemble the timerqueue_add function. The relevant part of the
disassembled code is:
crash> dis -l timerqueue_add
/usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 40
0xacb67320 :  push   %rbp
0xacb67321 :  mov    %rsp,%rbp
0xacb67324 :  push   %r12
0xacb67326 :  mov    %rdi,%r12
0xacb67329 :  push   %rbx
/usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 46
0xacb6732a :  cmp    (%rsi),%rsi
/usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 40
0xacb6732d :  mov    %rsi,%rbx
/usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 46
0xacb67330 :  jne    0xacb6739e
/usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 41
0xacb67332 :  mov    %r12,%rdx
/usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 42
0xacb67335 :  xor    %ecx,%ecx
/usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 48
0xacb67337 :  jmp    0xacb67357
0xacb67339 :  nopl   0x0(%rax)
/usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 52
0xacb67340 :  mov    0x18(%rax),%rsi  //rax is the p
0xacb67344 :  cmp    %rsi,0x18(%rbx)
0xacb67348 :  lea    0x8(%rax),%rcx
0xacb6734c :  lea    0x10(%rax),%rdx
0xacb67350 :  cmovge %rcx,%rdx
0xacb67354 :  mov    %rax,%rcx
/usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 48
0xacb67357 :  mov    (%rdx),%rax
0xacb6735a :  test   %rax,%rax
0xacb6735d :  jne    0xacb67340


From the disassembly of timerqueue_add(), rdi holds the first parameter
(struct timerqueue_head *head) and rsi holds the second parameter
(struct timerqueue_node *node).
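
The offsets 0x8, 0x10 and 0x18 in the disassembly can be cross-checked against the structure layout with a user-space mirror. This is a sketch under assumptions: the `demo_` types are hypothetical copies that assume the 3.10 x86_64 layout (`struct rb_node` as parent/color word, then `rb_right`, then `rb_left`, with a 64-bit `expires` following the embedded node), and `demo_find_slot()` only reproduces the descent loop from the source listing. Under that layout, 0x8(%rax) is `rb_right`, 0x10(%rax) is `rb_left` and 0x18(%rax) is `expires`, so rax appears to hold the node pointer loaded from `*p`, and the faulting load appears to be the read of `ptr->expires`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed mirror of struct rb_node (LP64: offsets 0x0, 0x8, 0x10). */
struct demo_rb_node {
	unsigned long parent_color;
	struct demo_rb_node *rb_right;
	struct demo_rb_node *rb_left;
};

/* Assumed mirror of struct timerqueue_node (expires at 0x18 on LP64). */
struct demo_timerqueue_node {
	struct demo_rb_node node;
	int64_t expires;
};

/*
 * Walk down from *root to the empty slot where 'node' would be linked.
 * Stores the parent of that slot in *parent_out and returns the slot.
 */
static struct demo_rb_node **
demo_find_slot(struct demo_rb_node **root,
	       const struct demo_timerqueue_node *node,
	       struct demo_rb_node **parent_out)
{
	struct demo_rb_node **p = root;
	struct demo_rb_node *parent = NULL;

	assert(root && node && parent_out);

	while (*p) {
		const struct demo_timerqueue_node *ptr;

		parent = *p;
		/* 'node' is the first member, so the cast recovers the container. */
		ptr = (const struct demo_timerqueue_node *)parent;
		if (node->expires < ptr->expires)
			p = &(*p)->rb_left;
		else
			p = &(*p)->rb_right;
	}
	*parent_out = parent;
	return p;
}
```

Every iteration dereferences the pointer read from the previous slot, so a single corrupted child pointer in the tree is enough to make the next `ptr->expires` read fault.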


We then examine rdi (ac8d4395) and rsi (9f9ec0ad3fa0) to get the
parameter values. The result is as follows.
crash> struct timerqueue_head -ox
struct timerqueue_head {
 

Re: [PATCH -tip v4 10/12] x86/kprobes: Push a fake return address at kretprobe_trampoline

2021-03-24 Thread Masami Hiramatsu
On Wed, 24 Mar 2021 20:26:13 -0400
Steven Rostedt  wrote:

> On Thu, 25 Mar 2021 08:47:41 +0900
> Masami Hiramatsu  wrote:
> 
> > > I think the REGS and REGS_PARTIAL cases can also be affected by function
> > > graph tracing.  So should they use the generic unwind_recover_ret_addr()
> > > instead of unwind_recover_kretprobe()?  
> > 
> > Yes, but I'm not sure this parameter can be applied.
> > For example, it passed "state->sp - sizeof(unsigned long)" as the
> > address where the return address is stored. Is that the same for ftrace graph too?
> 
> Stack traces on the return side of function graph tracer have never
> worked. It's on my todo list, because that's one of the requirements to
> get right if we ever manage to combine kretprobe and function graph
> tracers together.

OK, then at this point let's just fix the kretprobe side.

Thanks,

> 
> -- Steve


-- 
Masami Hiramatsu 


[RFC] Convert sysv filesystem to use folios exclusively

2021-03-24 Thread Matthew Wilcox


I decided to see what a filesystem free from struct page would look
like.  I chose sysv more-or-less at random; I wanted a relatively simple
filesystem, but I didn't want a toy.  The advantage of sysv is that the
maintainer is quite interested in folios ;-)

$ git grep page fs/sysv
fs/sysv/dir.c:#include 
fs/sysv/dir.c:  if (offset_in_page(diter->pos)) {
fs/sysv/inode.c:.get_link   = page_get_link,
fs/sysv/inode.c:truncate_inode_pages_final(>i_data);
fs/sysv/itree.c:block_truncate_page(inode->i_mapping, inode->i_size, 
get_block);
fs/sysv/itree.c:truncate_pagecache(inode, inode->i_size);
fs/sysv/itree.c:.readpage = sysv_read_folio,
fs/sysv/itree.c:.writepage = sysv_write_folio,
fs/sysv/namei.c:#include 
fs/sysv/namei.c:err = page_symlink(inode, symname, l);

I think those are "acceptable" mentions of pages -- offset_in_page()
is related to kmap(), page_get_link and page_symlink are in the VFS (to
be ported separately), and the others are just the names of the functions.

The big change here is the rewrite of directory iteration.
sysv_delete_entry() (and a couple of other functions) needs to recover
'pos' from the in-memory address and the struct page.  Once we move from
pages to folios, we can't realistically ask where the folio is mapped.
So switch to an iterator based approach which keeps the pos, dirent mapped
address and the struct folio together.  It's actually a nice cleanup:
204 insertions(+), 259 deletions(-).  We could be more tricksy and pass
around the pgoff_t instead of the loff_t, but I'm not really interested
in saving 4 bytes on the stack for 32-bit arches.
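
To make the iterator idea concrete, here is a small userspace sketch of an
iterator that carries the byte position and the current entry pointer
together, so the caller never has to derive one from the other. The toy_*
names, the chunk size and the 16-byte entry layout are assumptions for
illustration only; there is no mapping or reference counting here, unlike
the real patch.

```c
#include <stddef.h>

#define TOY_NAMELEN		14
#define TOY_ENTRIES_PER_CHUNK	4

/* Hypothetical fixed-size directory entry. */
struct toy_dirent {
	unsigned short ino;
	char name[TOY_NAMELEN];
};

#define TOY_CHUNK_BYTES (TOY_ENTRIES_PER_CHUNK * sizeof(struct toy_dirent))

/* A directory stored as an array of fixed-size chunks (stand-in for folios). */
struct toy_dir {
	struct toy_dirent (*chunks)[TOY_ENTRIES_PER_CHUNK];
	size_t size;		/* directory size in bytes */
};

/* Position and entry pointer travel together. */
struct toy_diter {
	size_t pos;			/* byte offset of the current entry */
	struct toy_dirent *entry;	/* NULL before the first call */
};

/* Advance to the next entry; returns 0 on success, -1 at end of directory. */
static int toy_diter_next(struct toy_dir *dir, struct toy_diter *it)
{
	if (it->entry)
		it->pos += sizeof(*it->entry);

	if (it->pos >= dir->size) {
		it->entry = NULL;
		return -1;
	}

	it->entry = &dir->chunks[it->pos / TOY_CHUNK_BYTES]
				[(it->pos % TOY_CHUNK_BYTES) / sizeof(*it->entry)];
	return 0;
}
```

A delete helper built on this can use it->pos directly instead of
reconstructing it from the entry address and the backing page.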

I don't know if this is really how one would do the conversion.
We could easily say "directories never use folios larger than a page"
and that would make everything much simpler, but that wasn't the point
of this exercise.

There's probably bugs here; again that wasn't the point.  The direction
here looks sound -- it should be possible to write a filesystem without
the use of struct page in the future.  This patch won't apply to anything
published; it won't even link for me because I just changed a bunch of
random function types in the header files to prototype this work.

I might submit a patch to do the diter conversion anyway, although I
have no clue how to test the sysv filesystem.  Is there a mkfs for Linux?
I assume there's no support in xfstests for it.

diff --git a/fs/sysv/dir.c b/fs/sysv/dir.c
index 88e38cd8f5c9..df38f53f1385 100644
--- a/fs/sysv/dir.c
+++ b/fs/sysv/dir.c
@@ -28,80 +28,85 @@ const struct file_operations sysv_dir_operations = {
.fsync  = generic_file_fsync,
 };
 
-static inline void dir_put_page(struct page *page)
+void sysv_diter_end(struct sysv_diter *diter)
 {
-   kunmap(page);
-   put_page(page);
+   if (diter->entry) {
+   kunmap_local(diter->entry);
+   put_folio(diter->folio);
+   }
 }
 
-static int dir_commit_chunk(struct page *page, loff_t pos, unsigned len)
+static int sysv_diter_next(struct inode *dir, struct sysv_diter *diter)
 {
-   struct address_space *mapping = page->mapping;
+   struct address_space *mapping = dir->i_mapping;
+   struct folio *folio = diter->folio;
+   size_t offset;
+
+   if (diter->entry) {
+   diter->pos += sizeof(*diter->entry);
+   if (offset_in_page(diter->pos)) {
+   diter->entry++;
+   return 0;
+   }
+   kunmap_local(diter->entry);
+   offset = offset_in_folio(folio, diter->pos);
+   if (offset != 0)
+   goto map;
+   put_folio(folio);
+   }
+   folio = read_mapping_folio(mapping, diter->pos / PAGE_SIZE, NULL);
+   if (IS_ERR(folio)) {
+   diter->pos = round_up(diter->pos, PAGE_SIZE);
+   diter->entry = NULL;
+   return PTR_ERR(folio);
+   }
+   diter->folio = folio;
+   offset = offset_in_folio(folio, diter->pos);
+
+map:
+   diter->entry = kmap_local_folio(folio, offset);
+   return 0;
+}
+
+static int dir_commit_chunk(struct folio *folio, loff_t pos, unsigned len)
+{
+   struct address_space *mapping = folio->mapping;
struct inode *dir = mapping->host;
int err = 0;
 
-   block_write_end(NULL, mapping, pos, len, len, page, NULL);
+   block_write_end(NULL, mapping, pos, len, len, folio, NULL);
if (pos+len > dir->i_size) {
i_size_write(dir, pos+len);
mark_inode_dirty(dir);
}
if (IS_DIRSYNC(dir))
-   err = write_one_page(page);
+   err = write_one_folio(folio);
else
-   unlock_page(page);
+   unlock_folio(folio);
return err;
 }
 
-static struct page * dir_get_page(struct inode *dir, unsigned long n)
-{
-   struct address_space *mapping = dir->i_mapping;
-   

Re: [PATCH v7 0/5] clk: add driver for the SiFive FU740

2021-03-24 Thread Zong Li
On Wed, Mar 24, 2021 at 6:36 PM Andreas Schwab  wrote:
>
> Were you able to reproduce the problem?
>

Hi Andreas,

Sorry, I was unavailable for the past few days and have only just come
back. I will take another look at this. Could you also let me know which
bootloader you used (FSBL or U-boot-SPL)? Thanks.

> Andreas.
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."


Re: [PATCH v3] audit: log nftables configuration change events once per table

2021-03-24 Thread Richard Guy Briggs
On 2021-03-24 12:32, Paul Moore wrote:
> On Tue, Mar 23, 2021 at 4:05 PM Richard Guy Briggs  wrote:
> >
> > Reduce logging of nftables events to a level similar to iptables.
> > Restore the table field to list the table, adding the generation.
> >
> > Indicate the op as the most significant operation in the event.
> >
> > A couple of sample events:
> >
> > type=PROCTITLE msg=audit(2021-03-18 09:30:49.801:143) : 
> > proctitle=/usr/bin/python3 -s /usr/sbin/firewalld --nofork --nopid
> > type=SYSCALL msg=audit(2021-03-18 09:30:49.801:143) : arch=x86_64 
> > syscall=sendmsg success=yes exit=172 a0=0x6 a1=0x7ffdcfcbe650 a2=0x0 
> > a3=0x7ffdcfcbd52c items=0 ppid=1 pid=367 auid=unset uid=root gid=root 
> > > euid=root suid=root fsuid=root egid=root 
> > > sgid=root fsgid=root tty=(none) ses=unset comm=firewalld 
> > exe=/usr/bin/python3.9 subj=system_u:system_r:firewalld_t:s0 key=(null)
> > type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : 
> > table=firewalld:2 family=ipv6 entries=1 op=nft_register_table pid=367 
> > subj=system_u:system_r:firewalld_t:s0 comm=firewalld
> > type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : 
> > table=firewalld:2 family=ipv4 entries=1 op=nft_register_table pid=367 
> > subj=system_u:system_r:firewalld_t:s0 comm=firewalld
> > type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : 
> > table=firewalld:2 family=inet entries=1 op=nft_register_table pid=367 
> > subj=system_u:system_r:firewalld_t:s0 comm=firewalld
> >
> > type=PROCTITLE msg=audit(2021-03-18 09:30:49.839:144) : 
> > proctitle=/usr/bin/python3 -s /usr/sbin/firewalld --nofork --nopid
> > type=SYSCALL msg=audit(2021-03-18 09:30:49.839:144) : arch=x86_64 
> > syscall=sendmsg success=yes exit=22792 a0=0x6 a1=0x7ffdcfcbe650 a2=0x0 
> > a3=0x7ffdcfcbd52c items=0 ppid=1 pid=367 auid=unset uid=root gid=root 
> > > euid=root suid=root fsuid=root egid=root 
> > > sgid=root fsgid=root tty=(none) ses=unset comm=firewalld 
> > exe=/usr/bin/python3.9 subj=system_u:system_r:firewalld_t:s0 key=(null)
> > type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : 
> > table=firewalld:3 family=ipv6 entries=30 op=nft_register_chain pid=367 
> > subj=system_u:system_r:firewalld_t:s0 comm=firewalld
> > type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : 
> > table=firewalld:3 family=ipv4 entries=30 op=nft_register_chain pid=367 
> > subj=system_u:system_r:firewalld_t:s0 comm=firewalld
> > type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : 
> > table=firewalld:3 family=inet entries=165 op=nft_register_chain pid=367 
> > subj=system_u:system_r:firewalld_t:s0 comm=firewalld
> >
> > The issue was originally documented in
> > https://github.com/linux-audit/audit-kernel/issues/124
> >
> > Signed-off-by: Richard Guy Briggs 
> > ---
> > Changelog:
> > v3:
> > - fix function braces, reduce parameter scope
> > - pre-allocate nft_audit_data per table in step 1, bail on ENOMEM
> >
> > v2:
> > > - convert NFT ops to array indices in nft2audit_op[]
> > - use linux lists
> > - use functions for each of collection and logging of audit data
> > ---
> >  include/linux/audit.h |  28 ++
> >  net/netfilter/nf_tables_api.c | 160 --
> >  2 files changed, 105 insertions(+), 83 deletions(-)
> 
> ...
> 
> > diff --git a/include/linux/audit.h b/include/linux/audit.h
> > index 82b7c1116a85..5fafcf4c13de 100644
> > --- a/include/linux/audit.h
> > +++ b/include/linux/audit.h
> > @@ -118,6 +118,34 @@ enum audit_nfcfgop {
> > AUDIT_NFT_OP_INVALID,
> >  };
> >
> > +static const u8 nft2audit_op[NFT_MSG_MAX] = { // enum nf_tables_msg_types
> > +   [NFT_MSG_NEWTABLE]  = AUDIT_NFT_OP_TABLE_REGISTER,
> > +   [NFT_MSG_GETTABLE]  = AUDIT_NFT_OP_INVALID,
> > +   [NFT_MSG_DELTABLE]  = AUDIT_NFT_OP_TABLE_UNREGISTER,
> > +   [NFT_MSG_NEWCHAIN]  = AUDIT_NFT_OP_CHAIN_REGISTER,
> > +   [NFT_MSG_GETCHAIN]  = AUDIT_NFT_OP_INVALID,
> > +   [NFT_MSG_DELCHAIN]  = AUDIT_NFT_OP_CHAIN_UNREGISTER,
> > +   [NFT_MSG_NEWRULE]   = AUDIT_NFT_OP_RULE_REGISTER,
> > +   [NFT_MSG_GETRULE]   = AUDIT_NFT_OP_INVALID,
> > +   [NFT_MSG_DELRULE]   = AUDIT_NFT_OP_RULE_UNREGISTER,
> > +   [NFT_MSG_NEWSET]= AUDIT_NFT_OP_SET_REGISTER,
> > +   [NFT_MSG_GETSET]= AUDIT_NFT_OP_INVALID,
> > +   [NFT_MSG_DELSET]= AUDIT_NFT_OP_SET_UNREGISTER,
> > +   [NFT_MSG_NEWSETELEM]= AUDIT_NFT_OP_SETELEM_REGISTER,
> > +   [NFT_MSG_GETSETELEM]= AUDIT_NFT_OP_INVALID,
> > +   [NFT_MSG_DELSETELEM]= AUDIT_NFT_OP_SETELEM_UNREGISTER,
> > +   [NFT_MSG_NEWGEN]= AUDIT_NFT_OP_GEN_REGISTER,
> > +   [NFT_MSG_GETGEN]= AUDIT_NFT_OP_INVALID,
> > +   [NFT_MSG_TRACE] = AUDIT_NFT_OP_INVALID,
> > +   [NFT_MSG_NEWOBJ]= AUDIT_NFT_OP_OBJ_REGISTER,
> > +   [NFT_MSG_GETOBJ]= AUDIT_NFT_OP_INVALID,
> > +   [NFT_MSG_DELOBJ]= AUDIT_NFT_OP_OBJ_UNREGISTER,
> > +   

[PATCH net v3] net: sched: fix packet stuck problem for lockless qdisc

2021-03-24 Thread Yunsheng Lin
Lockless qdisc has the following concurrency problem:
        cpu0                      cpu1
          .                         .
     q->enqueue                     .
          .                         .
   qdisc_run_begin()                .
          .                         .
     dequeue_skb()                  .
          .                         .
   sch_direct_xmit()                .
          .                         .
          .                    q->enqueue
          .                 qdisc_run_begin()
          .               return and do nothing
          .                         .
    qdisc_run_end()                 .

cpu1 enqueues a skb without calling __qdisc_run() because cpu0
has not released the lock yet and spin_trylock() returns false
for cpu1 in qdisc_run_begin(). cpu0 does not see the skb
enqueued by cpu1 when calling dequeue_skb(), because cpu1 may
enqueue the skb after cpu0 calls dequeue_skb() and before
cpu0 calls qdisc_run_end().

Lockless qdisc has another concurrency problem when
tx_action is involved:

cpu0(serving tx_action)        cpu1                  cpu2
          .                      .                     .
          .                 q->enqueue                 .
          .              qdisc_run_begin()             .
          .                dequeue_skb()               .
          .                      .                q->enqueue
          .                      .                     .
          .              sch_direct_xmit()             .
          .                      .              qdisc_run_begin()
          .                      .            return and do nothing
          .                      .                     .
clear __QDISC_STATE_SCHED        .                     .
    qdisc_run_begin()            .                     .
  return and do nothing          .                     .
          .                      .                     .
          .               qdisc_run_end()              .

This patch fixes the above data race by:
1. Getting the flag before doing spin_trylock().
2. If the first spin_trylock() returns false and the flag was not
   set before the first spin_trylock(), setting the flag and retrying
   spin_trylock(), in case the other CPU does not see the new
   flag after it releases the lock.
3. Rescheduling if the flag is set after the lock is released
   at the end of qdisc_run_end().

For the tx_action case, the flag is also set when cpu1 is at the
end of qdisc_run_end(), so tx_action will be rescheduled
again to dequeue the skb enqueued by cpu2.

To reduce the overhead of the double spin_trylock() and of calling
__netif_schedule(), the flag is only cleared before retrying a
dequeue, when dequeuing returns NULL.
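
The three steps above can be modelled in userspace with C11 atomics. This is
only a sketch under stated assumptions: struct toy_qdisc and the toy_*
helpers are invented for illustration, an atomic_flag stands in for
qdisc->seqlock, a counter stands in for __netif_schedule(), and clearing the
flag on an empty dequeue is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lockless qdisc state. */
struct toy_qdisc {
	atomic_flag seqlock;		/* models qdisc->seqlock */
	atomic_bool need_resched;	/* models __QDISC_STATE_NEED_RESCHEDULE */
	int reschedules;		/* counts modelled __netif_schedule() calls */
};

static void toy_qdisc_init(struct toy_qdisc *q)
{
	atomic_flag_clear(&q->seqlock);
	atomic_init(&q->need_resched, false);
	q->reschedules = 0;
}

static bool toy_trylock(struct toy_qdisc *q)
{
	return !atomic_flag_test_and_set(&q->seqlock);
}

static bool toy_run_begin(struct toy_qdisc *q)
{
	/* Step 1: sample the flag before the first trylock. */
	bool dont_retry = atomic_load(&q->need_resched);

	if (toy_trylock(q))
		return true;

	/* Flag was already set: the lock holder will reschedule for us. */
	if (dont_retry)
		return false;

	/* Step 2: publish the flag, then retry once in case the holder
	 * released the lock before it could observe the flag.
	 */
	atomic_store(&q->need_resched, true);
	return toy_trylock(q);
}

static void toy_run_end(struct toy_qdisc *q)
{
	atomic_flag_clear(&q->seqlock);

	/* Step 3: someone failed to get the lock while we held it. */
	if (atomic_load(&q->need_resched))
		q->reschedules++;
}
```

The point of sampling the flag first is that a CPU which already saw the
flag set can give up after a single failed trylock.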

The performance impact of this patch, tested using pktgen and
dummy netdev with pfifo_fast qdisc attached:

 threads  without+this_patch   with+this_patch      delta
    1        2.61Mpps            2.60Mpps           -0.3%
    2        3.97Mpps            3.82Mpps           -3.7%
    4        5.62Mpps            5.59Mpps           -0.5%
    8        2.78Mpps            2.77Mpps           -0.3%
    16       2.22Mpps            2.22Mpps           -0.0%

Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
Signed-off-by: Yunsheng Lin 
---
V3: fix a compile error and a few comment typo, remove the
__QDISC_STATE_DEACTIVATED checking, and update the
performance data.
V2: Avoid the overhead of fixing the data race as much as
possible.
---
 include/net/sch_generic.h | 38 +-
 net/sched/sch_generic.c   | 12 
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index f7a6e14..e3f46eb 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -36,6 +36,7 @@ struct qdisc_rate_table {
 enum qdisc_state_t {
__QDISC_STATE_SCHED,
__QDISC_STATE_DEACTIVATED,
+   __QDISC_STATE_NEED_RESCHEDULE,
 };
 
 struct qdisc_size_table {
@@ -159,8 +160,38 @@ static inline bool qdisc_is_empty(const struct Qdisc 
*qdisc)
 static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 {
if (qdisc->flags & TCQ_F_NOLOCK) {
+   bool dont_retry = test_bit(__QDISC_STATE_NEED_RESCHEDULE,
+  &qdisc->state);
+
+   if (spin_trylock(&qdisc->seqlock))
+   goto nolock_empty;
+
+   /* If the flag is set before doing the spin_trylock() and
+* the above spin_trylock() return false, it means other cpu
+* holding the lock will do dequeuing for us, or it will see
+* the flag set after releasing lock and reschedule the
+* net_tx_action() to do the dequeuing.
+*/
+   if (dont_retry)
+   return false;
+
+   /* We could do set_bit() before the first spin_trylock(),
+* and avoid doing second spin_trylock() completely, then
+* we could have multi cpus doing the set_bit(). Here use
+* dont_retry to avoid doing the set_bit() and the second
+* spin_trylock(), which has 5% performance improvement 

RE: Re: [PATCH v2 1/3] dt-bindings: imx6q-pcie: add one regulator used to power up pcie phy

2021-03-24 Thread Richard Zhu

> -Original Message-
> From: Lucas Stach 
> Sent: Wednesday, March 24, 2021 5:27 PM
> To: Richard Zhu ; andrew.smir...@gmail.com;
> shawn...@kernel.org; k...@linux.com; bhelg...@google.com;
> ste...@agner.ch; lorenzo.pieral...@arm.com
> Cc: linux-...@vger.kernel.org; dl-linux-imx ;
> linux-arm-ker...@lists.infradead.org; linux-kernel@vger.kernel.org;
> ker...@pengutronix.de
> Subject: Re: [PATCH v2 1/3] dt-bindings: imx6q-pcie: add one regulator
> used to power up pcie phy
> Hi Richard,
> 
> Am Mittwoch, dem 24.03.2021 um 13:34 +0800 schrieb Richard Zhu:
> > Both 1.8v and 3.3v power supplies can be used by i.MX8MQ PCIe PHY.
> > In default, the PCIE_VPH voltage is suggested to be 1.8v refer to data
> > sheet. When PCIE_VPH is supplied by 3.3v in the HW schematic design,
> > the VREG_BYPASS bits of GPR registers should be cleared from default
> > value 1b'1 to 1b'0. Thus, the internal 3v3 to 1v8 translator would be
> > turned on.
> >
> > Signed-off-by: Richard Zhu 
> > ---
> >  Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt
> > b/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt
> > index de4b2baf91e8..3248b7192ced 100644
> > --- a/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt
> > +++ b/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt
> > @@ -38,6 +38,12 @@ Optional properties:
> >The regulator will be enabled when initializing the PCIe host and
> >disabled either as part of the init process or when shutting down the
> >host.
> > +- vph-supply: Should specify the regulator in charge of PCIe PHY power.
> > +  On i.MX8MQ, both 1.8v and 3.3v power supplies can be used by
> > +i.MX8MQ PCIe
> > +  PHY. In default, the PCIE_VPH voltage is suggested to be 1.8v refer
> > +to data
> > +  sheet. When PCIE_VPH is supplied by 3.3v in the HW schematic
> > +design, the
> > +  VREG_BYPASS bits of GPR registers should be cleared from default
> > +value 1b'1
> > +  to 1b'0.
> 
> This description of the internal driver behavior does not belong into a DT
> binding description.
> Instead the binding should describe the function of the regulator exactly. 
> From
> the datasheet I can see that there are actually 3 supplies (VPH, VP, VPTX)
> going into the PCIe PHY, so "regulator in charge of PCIe PHY power" doesn't
> seem like a very accurate description.
[Richard Zhu] Hi Lucas, thanks for your comments.
VP and VPTX are tied together and connected to VDD_PHY_0V9.
Only VPH can be supplied by different voltage power supplies.
So only VPH is specified in the DT binding; it might be used to distinguish
different HW board designs.

How about this description:
- vph-supply: Should specify the regulator in charge of VPH one of the three
  PCIe PHY powers. This regulator can be supplied by both 1.8v and 3.3v voltage
  supplies. Might be used to distinguish different HW board designs.
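
For reference, the driver-side behaviour described in the commit message
(clear VREG_BYPASS when VPH is fed from 3.3v) could be sketched as below.
This is a userspace illustration only: the bit position, the 3.0 V threshold
and the toy_* names are assumptions for the example, not values taken from
the reference manual or the driver.

```c
#include <stdint.h>

/* Hypothetical bit position for VREG_BYPASS inside a GPR register. */
#define TOY_VREG_BYPASS		(1u << 12)

/* Assumed threshold separating a 1.8v supply from a 3.3v supply. */
#define TOY_VPH_3V3_MIN_UV	3000000

/*
 * Return the new GPR value for a given VPH supply voltage in microvolts.
 * With a 3.3v supply the bypass bit is cleared so that the internal
 * 3v3-to-1v8 translator is used; otherwise the register is left unchanged.
 */
static uint32_t toy_apply_vph(uint32_t gpr, int vph_microvolts)
{
	if (vph_microvolts > TOY_VPH_3V3_MIN_UV)
		return gpr & ~TOY_VREG_BYPASS;
	return gpr;
}
```

This keeps the policy in the driver, while the binding only has to name the
supply.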
> 
> Regards,
> Lucas



[PATCH] perf x86 kvm-stat: support to analyze kvm msr

2021-03-24 Thread Li RongQing
From: Lei Zhao 

usage:
- kvm stat
  run a command and gather performance counter statistics

- show the result:
  perf kvm stat report --event=msr

See the msr events:

Analyze events for all VMs, all VCPUs:

MSR Access   Samples  Samples%   Time%   Min Time   Max Time   Avg time

   0x6e0:W     67007    98.17%  98.31%     0.59us    10.69us     0.90us ( +-  0.10% )
   0x830:W      1186     1.74%   1.60%     0.53us   108.34us     0.82us ( +- 11.02% )
    0x3b:R        66     0.10%   0.09%     0.56us     1.26us     0.80us ( +-  3.24% )

Total Samples:68259, Total events handled time:61150.95us.
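
As a quick check of how the rows in the report above can be read, the
following standalone sketch formats an MSR key with the same "%#llx:%s"
pattern used in the patch and recomputes the Samples% column from the raw
counts. The toy_* helper names are made up for this example and are not part
of perf.

```c
#include <stdio.h>

/* Format an MSR key the way the report rows read, e.g. "0x6e0:W". */
static int toy_decode_msr_key(char *buf, size_t len,
			      unsigned long long msr, int is_write)
{
	return snprintf(buf, len, "%#llx:%s", msr, is_write ? "W" : "R");
}

/* Share of one key's samples in the total, as a percentage. */
static double toy_sample_percent(unsigned long samples, unsigned long total)
{
	return total ? 100.0 * (double)samples / (double)total : 0.0;
}
```

For the first row, 67007 of 68259 samples works out to about 98.17%, which
matches the Samples% column.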

Signed-off-by: Li RongQing 
Signed-off-by: Lei Zhao 
---
 tools/perf/arch/x86/util/kvm-stat.c | 46 +
 1 file changed, 46 insertions(+)

diff --git a/tools/perf/arch/x86/util/kvm-stat.c 
b/tools/perf/arch/x86/util/kvm-stat.c
index 072920475b65..c5dd54f6ef5e 100644
--- a/tools/perf/arch/x86/util/kvm-stat.c
+++ b/tools/perf/arch/x86/util/kvm-stat.c
@@ -133,11 +133,56 @@ static struct kvm_events_ops ioport_events = {
.name = "IO Port Access"
 };
 
+ /* The time of emulation msr is from kvm_msr to kvm_entry. */
+static void msr_event_get_key(struct evsel *evsel,
+struct perf_sample *sample,
+struct event_key *key)
+{
+   key->key  = evsel__intval(evsel, sample, "ecx");
+   key->info = evsel__intval(evsel, sample, "write");
+}
+
+static bool msr_event_begin(struct evsel *evsel,
+  struct perf_sample *sample,
+  struct event_key *key)
+{
+   if (!strcmp(evsel->name, "kvm:kvm_msr")) {
+   msr_event_get_key(evsel, sample, key);
+   return true;
+   }
+
+   return false;
+}
+
+static bool msr_event_end(struct evsel *evsel,
+struct perf_sample *sample __maybe_unused,
+struct event_key *key __maybe_unused)
+{
+   return kvm_entry_event(evsel);
+}
+
+static void msr_event_decode_key(struct perf_kvm_stat *kvm __maybe_unused,
+   struct event_key *key,
+   char *decode)
+{
+   scnprintf(decode, decode_str_len, "%#llx:%s",
+ (unsigned long long)key->key,
+ key->info ? "W" : "R");
+}
+
+static struct kvm_events_ops msr_events = {
+   .is_begin_event = msr_event_begin,
+   .is_end_event = msr_event_end,
+   .decode_key = msr_event_decode_key,
+   .name = "MSR Access"
+};
+
 const char *kvm_events_tp[] = {
"kvm:kvm_entry",
"kvm:kvm_exit",
"kvm:kvm_mmio",
"kvm:kvm_pio",
+   "kvm:kvm_msr",
NULL,
 };
 
@@ -145,6 +190,7 @@ struct kvm_reg_events_ops kvm_reg_events_ops[] = {
{ .name = "vmexit", .ops = &exit_events },
{ .name = "mmio", .ops = &mmio_events },
{ .name = "ioport", .ops = &ioport_events },
+   { .name = "msr", .ops = &msr_events },
{ NULL, NULL },
 };
 
-- 
2.17.3



[PATCH] net: Fix a misspell in socket.c

2021-03-24 Thread Lu Wei
s/addres/address

Reported-by: Hulk Robot 
Signed-off-by: Lu Wei 
---
 net/socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/socket.c b/net/socket.c
index 84a8049c2b09..27e3e7d53f8e 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3568,7 +3568,7 @@ EXPORT_SYMBOL(kernel_accept);
  * @addrlen: address length
  * @flags: flags (O_NONBLOCK, ...)
  *
- * For datagram sockets, @addr is the addres to which datagrams are sent
+ * For datagram sockets, @addr is the address to which datagrams are sent
  * by default, and the only address from which datagrams are received.
  * For stream sockets, attempts to connect to @addr.
  * Returns 0 or an error code.
-- 
2.17.1



Re: [PATCH V3] exit: trigger panic when global init has exited

2021-03-24 Thread qianli zhao
>> But,my patch has another purpose,protect some key variables(such
>> as:task->mm,task->nsproxy,etc) to recover init coredump from
>> fulldump,if sub-threads finish do_exit(),

> Yes I know.

> But the purpose of this SIGNAL_GROUP_EXIT check is not clear and not
> documented. That is why I said it should be documented at least in the
> changelog.

Ok.
I will update the changelog as you suggest.

On Thu, Mar 25, 2021 at 2:12 AM, Oleg Nesterov wrote:
>
> Hi,
>
> On 03/23, qianli zhao wrote:
> >
> > Hi,Oleg
> >
> > > You certainly don't understand me :/
> >
> > > Please read my email you quoted below. I didn't mean the current logic.
> > > I meant the logic after your patch which moves atomic_dec_and_test() and
> > > panic() before exit_signals().
> >
> > Sorry, I think I see what you mean now.
> >
> > You mean that after apply my patch,SIGNAL_GROUP_EXIT no longer needs
> > to be tested or avoid zap_pid_ns_processes()->BUG().
> > Yes,your consideration is correct.
>
> OK, great
>
> > But,my patch has another purpose,protect some key variables(such
> > as:task->mm,task->nsproxy,etc) to recover init coredump from
> > fulldump,if sub-threads finish do_exit(),
>
> Yes I know.
>
> But the purpose of this SIGNAL_GROUP_EXIT check is not clear and not
> documented. That is why I said it should be documented at least in the
> changelog.
>
> Oleg.
>


Re: [PATCH] livepatch: klp_send_signal should treat PF_IO_WORKER like PF_KTHREAD

2021-03-24 Thread Joe Lawrence

On 3/24/21 9:48 PM, Dong Kai wrote:

commit 15b2219facad ("kernel: freezer should treat PF_IO_WORKER like
PF_KTHREAD for freezing") is to fix the freezeing issue of IO threads


nit: s/freezeing/freezing


by making the freezer not send them fake signals.

Here live patching consistency model call klp_send_signals to wake up
all tasks by send fake signal to all non-kthread which only check the
PF_KTHREAD flag, so it still send signal to io threads which may lead to
freezeing issue of io threads.

Here we take the same fix action by treating PF_IO_WORKERS as PF_KTHREAD
within klp_send_signal function.

Signed-off-by: Dong Kai 
---
note:
the io threads freeze issue links:
[1] https://lore.kernel.org/io-uring/yegnip43%2f6kfn...@kevinlocke.name/
[2] 
https://lore.kernel.org/io-uring/d7350ce7-17dc-75d7-611b-27ebf2cb5...@kernel.dk/

  kernel/livepatch/transition.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index f6310f848f34..0e1c35c8f4b4 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -358,7 +358,7 @@ static void klp_send_signals(void)
 * Meanwhile the task could migrate itself and the action
 * would be meaningless. It is not serious though.
 */
-   if (task->flags & PF_KTHREAD) {
+   if (task->flags & (PF_KTHREAD | PF_IO_WORKER)) {
/*
 * Wake up a kthread which sleeps interruptedly and
 * still has not been migrated.



(PF_KTHREAD | PF_IO_WORKER) is open coded in so many places that maybe
this is a silly question, but...


If the livepatch code could use fake_signal_wake_up(), we could 
consolidate the pattern in klp_send_signals() with the one in 
freeze_task().  Then there would be only one place for the wake up /
fake signal logic.
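
A minimal sketch of what a single shared decision point might look like,
written as plain userspace C. The toy_* names and the flag values are
placeholders for illustration (the kernel defines the real PF_* flags in its
own headers), and this is not an existing kernel helper.

```c
/* Placeholder flag values, chosen for the example only. */
#define TOY_PF_IO_WORKER	0x00000010u
#define TOY_PF_KTHREAD		0x00200000u

enum toy_wake_action {
	TOY_WAKE_UP_STATE,	/* plain wake-up, no signal involved */
	TOY_FAKE_SIGNAL,	/* userspace task: send the fake signal */
};

/*
 * Decide how to nudge a task based on its flags. Kernel threads and IO
 * workers are treated alike and never get the fake signal.
 */
static enum toy_wake_action toy_pick_wake_action(unsigned int task_flags)
{
	if (task_flags & (TOY_PF_KTHREAD | TOY_PF_IO_WORKER))
		return TOY_WAKE_UP_STATE;
	return TOY_FAKE_SIGNAL;
}
```

With one helper of this shape, adding another "kthread-like" flag later
would only need a change in one place.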


I don't fully understand the differences in the freeze_task() version, 
so I only pose this as a question and not a v2 request.


As it is, this change seems logical to me, so:
Acked-by: Joe Lawrence 

Thanks,

-- Joe



[PATCH v14 4/7] soc: mediatek: SVS: add debug commands

2021-03-24 Thread Roger Lu
The purpose of SVS is to help find the suitable voltages
for DVFS. Therefore, if the SVS bank voltages are suspected to be
wrong, they can be adjusted with the debug commands added by this patch.

Signed-off-by: Roger Lu 
---
 drivers/soc/mediatek/mtk-svs.c | 328 +
 1 file changed, 328 insertions(+)

diff --git a/drivers/soc/mediatek/mtk-svs.c b/drivers/soc/mediatek/mtk-svs.c
index ee3b3989ab88..e36b3abfee03 100644
--- a/drivers/soc/mediatek/mtk-svs.c
+++ b/drivers/soc/mediatek/mtk-svs.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -24,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -60,6 +62,39 @@
 #define SVSB_INTSTS_COMPLETE   0x1
 #define SVSB_INTSTS_CLEAN  0x00ff
 
+#define debug_fops_ro(name)\
+   static int svs_##name##_debug_open(struct inode *inode, \
+  struct file *filp)   \
+   {   \
+   return single_open(filp, svs_##name##_debug_show,   \
+  inode->i_private);   \
+   }   \
+   static const struct file_operations svs_##name##_debug_fops = { \
+   .owner = THIS_MODULE,   \
+   .open = svs_##name##_debug_open,\
+   .read = seq_read,   \
+   .llseek = seq_lseek,\
+   .release = single_release,  \
+   }
+
+#define debug_fops_rw(name)\
+   static int svs_##name##_debug_open(struct inode *inode, \
+  struct file *filp)   \
+   {   \
+   return single_open(filp, svs_##name##_debug_show,   \
+  inode->i_private);   \
+   }   \
+   static const struct file_operations svs_##name##_debug_fops = { \
+   .owner = THIS_MODULE,   \
+   .open = svs_##name##_debug_open,\
+   .read = seq_read,   \
+   .write = svs_##name##_debug_write,  \
+   .llseek = seq_lseek,\
+   .release = single_release,  \
+   }
+
+#define svs_dentry(name)   {__stringify(name), &svs_##name##_debug_fops}
+
 static DEFINE_SPINLOCK(mtk_svs_lock);
 
 /*
@@ -81,6 +116,7 @@ enum svsb_phase {
SVSB_PHASE_INIT01,
SVSB_PHASE_INIT02,
SVSB_PHASE_MON,
+   SVSB_PHASE_NUM,
 };
 
 enum svs_reg_index {
@@ -138,6 +174,7 @@ enum svs_reg_index {
SPARE2,
SPARE3,
THSLPEVEB,
+   SVS_REG_NUM,
 };
 
 static const u32 svs_regs_v2[] = {
@@ -241,6 +278,7 @@ struct thermal_parameter {
  * @opp_volts: signed-off voltages from default opp table
  * @freqs_pct: percent of "opp_freqs / freq_base" for bank init
  * @volts: bank voltages
+ * @reg_data: bank register data of each phase
  * @freq_base: reference frequency for bank init
  * @vboot: voltage request for bank init01 stage only
  * @volt_step: bank voltage step
@@ -259,6 +297,7 @@ struct thermal_parameter {
  * @opp_count: bank opp count
  * @int_st: bank interrupt identification
  * @sw_id: bank software identification
+ * @hw_id: bank hardware identification
  * @ctl0: bank thermal sensor selection
  * @cpu_id: cpu core id for SVS CPU only
  *
@@ -284,6 +323,7 @@ struct svs_bank {
u32 opp_volts[16];
u32 freqs_pct[16];
u32 volts[16];
+   u32 reg_data[SVSB_PHASE_NUM][SVS_REG_NUM];
u32 freq_base;
u32 vboot;
u32 volt_step;
@@ -321,6 +361,7 @@ struct svs_bank {
u32 opp_count;
u32 int_st;
u32 sw_id;
+   u32 hw_id;
u32 ctl0;
u32 cpu_id;
 };
@@ -636,11 +677,15 @@ static void svs_set_bank_phase(struct svs_platform *svsp,
 static inline void svs_init01_isr_handler(struct svs_platform *svsp)
 {
struct svs_bank *svsb = svsp->pbank;
+   enum svs_reg_index rg_i;
 
dev_info(svsb->dev, "%s: VDN74~30:0x%08x~0x%08x, DC:0x%08x\n",
 __func__, svs_readl(svsp, VDESIGN74),
 svs_readl(svsp, VDESIGN30), svs_readl(svsp, DCVALUES));
 
+   for (rg_i = DESCHAR; rg_i < SVS_REG_NUM; rg_i++)
+   svsb->reg_data[SVSB_PHASE_INIT01][rg_i] = svs_readl(svsp, rg_i);
+
svsb->phase = SVSB_PHASE_INIT01;
svsb->dc_voffset_in = 

[PATCH v14 1/7] dt-bindings: soc: mediatek: add mtk svs dt-bindings

2021-03-24 Thread Roger Lu
Document the binding for enabling mtk svs on MediaTek SoC.

Signed-off-by: Roger Lu 
---
 .../bindings/soc/mediatek/mtk-svs.yaml| 84 +++
 1 file changed, 84 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml

diff --git a/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml 
b/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml
new file mode 100644
index ..a855ced410f8
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml
@@ -0,0 +1,84 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/soc/mediatek/mtk-svs.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Mediatek Smart Voltage Scaling (SVS) Device Tree Bindings
+
+maintainers:
+  - Roger Lu 
+  - Matthias Brugger 
+  - Kevin Hilman 
+
+description: |+
+  The SVS engine is a piece of hardware which has several
+  controllers(banks) for calculating suitable voltage to
+  different power domains(CPU/GPU/CCI) according to
+  chip process corner, temperatures and other factors. Then DVFS
+  driver could apply SVS bank voltage to PMIC/Buck.
+
+properties:
+  compatible:
+enum:
+  - mediatek,mt8183-svs
+
+  reg:
+maxItems: 1
+description: Address range of the MTK SVS controller.
+
+  interrupts:
+maxItems: 1
+
+  clocks:
+maxItems: 1
+description: Main clock for MTK SVS controller to work.
+
+  clock-names:
+const: main
+
+  nvmem-cells:
+minItems: 1
+maxItems: 2
+description:
+  Phandle to the calibration data provided by a nvmem device.
+items:
+  - description: SVS efuse for SVS controller
+  - description: Thermal efuse for SVS controller
+
+  nvmem-cell-names:
+items:
+  - const: svs-calibration-data
+  - const: t-calibration-data
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - clock-names
+  - nvmem-cells
+  - nvmem-cell-names
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+#include 
+#include 
+
+soc {
+#address-cells = <2>;
+#size-cells = <2>;
+
+svs@1100b000 {
+compatible = "mediatek,mt8183-svs";
+reg = <0 0x1100b000 0 0x1000>;
+interrupts = ;
+clocks = < CLK_INFRA_THERM>;
+clock-names = "main";
+nvmem-cells = <_calibration>, <_calibration>;
+nvmem-cell-names = "svs-calibration-data", "t-calibration-data";
+};
+};
-- 
2.18.0



[PATCH v14 3/7] soc: mediatek: SVS: introduce MTK SVS engine

2021-03-24 Thread Roger Lu
The Smart Voltage Scaling(SVS) engine is a piece of hardware
which calculates suitable SVS bank voltages to OPP voltage table.
Then, DVFS driver could apply those SVS bank voltages to PMIC/Buck
when receiving OPP_EVENT_ADJUST_VOLTAGE.

Signed-off-by: Roger Lu 
---
 drivers/soc/mediatek/Kconfig   |   10 +
 drivers/soc/mediatek/Makefile  |1 +
 drivers/soc/mediatek/mtk-svs.c | 1702 
 3 files changed, 1713 insertions(+)
 create mode 100644 drivers/soc/mediatek/mtk-svs.c

diff --git a/drivers/soc/mediatek/Kconfig b/drivers/soc/mediatek/Kconfig
index fdd8bc08569e..3c3eedea35f7 100644
--- a/drivers/soc/mediatek/Kconfig
+++ b/drivers/soc/mediatek/Kconfig
@@ -73,4 +73,14 @@ config MTK_MMSYS
  Say yes here to add support for the MediaTek Multimedia
  Subsystem (MMSYS).
 
+config MTK_SVS
+   tristate "MediaTek Smart Voltage Scaling(SVS)"
+   depends on MTK_EFUSE && NVMEM
+   help
+ The Smart Voltage Scaling(SVS) engine is a piece of hardware
+ which has several controllers(banks) for calculating suitable
+ voltage to different power domains(CPU/GPU/CCI) according to
+ chip process corner, temperatures and other factors. Then DVFS
+ driver could apply SVS bank voltage to PMIC/Buck.
+
 endmenu
diff --git a/drivers/soc/mediatek/Makefile b/drivers/soc/mediatek/Makefile
index 90270f8114ed..0e9e703c931a 100644
--- a/drivers/soc/mediatek/Makefile
+++ b/drivers/soc/mediatek/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_MTK_SCPSYS) += mtk-scpsys.o
 obj-$(CONFIG_MTK_SCPSYS_PM_DOMAINS) += mtk-pm-domains.o
 obj-$(CONFIG_MTK_MMSYS) += mtk-mmsys.o
 obj-$(CONFIG_MTK_MMSYS) += mtk-mutex.o
+obj-$(CONFIG_MTK_SVS) += mtk-svs.o
diff --git a/drivers/soc/mediatek/mtk-svs.c b/drivers/soc/mediatek/mtk-svs.c
new file mode 100644
index ..ee3b3989ab88
--- /dev/null
+++ b/drivers/soc/mediatek/mtk-svs.c
@@ -0,0 +1,1702 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 MediaTek Inc.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* svs bank 1-line sw id */
+#define SVSB_CPU_LITTLE   BIT(0)
+#define SVSB_CPU_BIG   BIT(1)
+#define SVSB_CCI   BIT(2)
+#define SVSB_GPU   BIT(3)
+
+/* svs bank mode support */
+#define SVSB_MODE_ALL_DISABLE  0
+#define SVSB_MODE_INIT01   BIT(1)
+#define SVSB_MODE_INIT02   BIT(2)
+#define SVSB_MODE_MON  BIT(3)
+
+/* svs bank init01 condition */
+#define SVSB_INIT01_VOLT_IGNORE    BIT(1)
+#define SVSB_INIT01_VOLT_INC_ONLY  BIT(2)
+#define SVSB_INIT01_CLK_EN BIT(31)
+
+/* svs bank common setting */
+#define SVSB_TZONE_HIGH_TEMP_MAX   U32_MAX
+#define SVSB_RUNCONFIG_DEFAULT 0x8000
+#define SVSB_DC_SIGNED_BIT 0x8000
+#define SVSB_INTEN_INIT0x  0x5f01
+#define SVSB_INTEN_MONVOPEN    0x00ff
+#define SVSB_EN_OFF    0x0
+#define SVSB_EN_MASK   0x7
+#define SVSB_EN_INIT01 0x1
+#define SVSB_EN_INIT02 0x5
+#define SVSB_EN_MON    0x2
+#define SVSB_INTSTS_MONVOP 0x00ff
+#define SVSB_INTSTS_COMPLETE   0x1
+#define SVSB_INTSTS_CLEAN  0x00ff
+
+static DEFINE_SPINLOCK(mtk_svs_lock);
+
+/*
+ * enum svsb_phase - svs bank phase enumeration
+ * @SVSB_PHASE_INIT01: basic init for svs bank
+ * @SVSB_PHASE_INIT02: svs bank can provide voltages
+ * @SVSB_PHASE_MON: svs bank can provide voltages with thermal effect
+ * @SVSB_PHASE_ERROR: svs bank encounters unexpected condition
+ *
+ * Each svs bank has its own independent phase. We enable each svs bank by
+ * running its phases in order. However, when an svs bank encounters an
+ * unexpected condition, it fires an irq (PHASE_ERROR) to inform svs software.
+ *
+ * svs bank general phase-enabled order:
+ * SVSB_PHASE_INIT01 -> SVSB_PHASE_INIT02 -> SVSB_PHASE_MON
+ */
+enum svsb_phase {
+   SVSB_PHASE_ERROR = 0,
+   SVSB_PHASE_INIT01,
+   SVSB_PHASE_INIT02,
+   SVSB_PHASE_MON,
+};
+
+enum svs_reg_index {
+   DESCHAR = 0,
+   TEMPCHAR,
+   DETCHAR,
+   AGECHAR,
+   DCCONFIG,
+   AGECONFIG,
+   FREQPCT30,
+   FREQPCT74,
+   LIMITVALS,
+   VBOOT,
+   DETWINDOW,
+   CONFIG,
+   TSCALCS,
+   RUNCONFIG,
+   SVSEN,
+   INIT2VALS,
+   DCVALUES,
+   AGEVALUES,
+   VOP30,
+   VOP74,
+   TEMP,
+   INTSTS,
+   INTSTSRAW,
+   INTEN,
+   CHKINT,
+   CHKSHIFT,
+   STATUS,
+   VDESIGN30,
+   VDESIGN74,
+   DVT30,
+   DVT74,
+   AGECOUNT,
+   SMSTATE0,
+   SMSTATE1,
+   CTL0,
+   

[PATCH v14 5/7] dt-bindings: soc: mediatek: add mt8192 svs dt-bindings

2021-03-24 Thread Roger Lu
Signed-off-by: Roger Lu 
---
 .../devicetree/bindings/soc/mediatek/mtk-svs.yaml | 8 
 1 file changed, 8 insertions(+)

diff --git a/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml 
b/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml
index a855ced410f8..59342e627b67 100644
--- a/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml
+++ b/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml
@@ -22,6 +22,7 @@ properties:
   compatible:
 enum:
   - mediatek,mt8183-svs
+  - mediatek,mt8192-svs
 
   reg:
 maxItems: 1
@@ -51,6 +52,13 @@ properties:
   - const: svs-calibration-data
   - const: t-calibration-data
 
+  resets:
+maxItems: 1
+
+  reset-names:
+items:
+  - const: svs_rst
+
 required:
   - compatible
   - reg
-- 
2.18.0



[PATCH v14 6/7] arm64: dts: mt8192: add svs device information

2021-03-24 Thread Roger Lu
Add compatible/reg/irq/clock/efuse/reset settings to the svs node.

Signed-off-by: Roger Lu 
---
 arch/arm64/boot/dts/mediatek/mt8192.dtsi | 34 
 1 file changed, 34 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8192.dtsi 
b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
index 2f0b4824a024..f3a339de8992 100644
--- a/arch/arm64/boot/dts/mediatek/mt8192.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
@@ -268,6 +268,14 @@
compatible = "mediatek,mt8192-infracfg", "syscon";
reg = <0 0x10001000 0 0x1000>;
#clock-cells = <1>;
+
+   infracfg_rst: reset-controller {
+   compatible = "mediatek,infra-reset", 
"ti,syscon-reset";
+   #reset-cells = <1>;
+   ti,reset-bits = <
+   0x150 5 0x154 5 0 0 (ASSERT_SET | 
DEASSERT_SET | STATUS_NONE) /* 0: svs */
+   >;
+   };
};
 
pericfg: syscon@10003000 {
@@ -362,6 +370,20 @@
status = "disabled";
};
 
+   svs: svs@1100b000 {
+   compatible = "mediatek,mt8192-svs";
+   reg = <0 0x1100b000 0 0x1000>;
+   interrupts = ;
+   clocks = < CLK_INFRA_THERM>;
+   clock-names = "main";
+   nvmem-cells = <&svs_calibration>,
+ <&lvts_e_data1>;
+   nvmem-cell-names = "svs-calibration-data",
+  "t-calibration-data";
+   resets = <&infracfg_rst 0>;
+   reset-names = "svs_rst";
+   };
+
spi1: spi@1101 {
compatible = "mediatek,mt8192-spi",
 "mediatek,mt6765-spi";
@@ -473,6 +495,18 @@
status = "disable";
};
 
+   efuse: efuse@11c1 {
+   compatible = "mediatek,efuse";
+   reg = <0 0x11c1 0 0x1000>;
+
+   lvts_e_data1: data1 {
+   reg = <0x1C0 0x58>;
+   };
+   svs_calibration: calib@580 {
+   reg = <0x580 0x68>;
+   };
+   };
+
i2c3: i2c3@11cb {
compatible = "mediatek,mt8192-i2c";
reg = <0 0x11cb 0 0x1000>,
-- 
2.18.0



[PATCH v14 7/7] soc: mediatek: SVS: add mt8192 SVS GPU driver

2021-03-24 Thread Roger Lu
Signed-off-by: Roger Lu 
---
 drivers/soc/mediatek/mtk-svs.c | 477 -
 1 file changed, 471 insertions(+), 6 deletions(-)

diff --git a/drivers/soc/mediatek/mtk-svs.c b/drivers/soc/mediatek/mtk-svs.c
index e36b3abfee03..3e152a86d280 100644
--- a/drivers/soc/mediatek/mtk-svs.c
+++ b/drivers/soc/mediatek/mtk-svs.c
@@ -36,6 +36,10 @@
 #define SVSB_CCI   BIT(2)
 #define SVSB_GPU   BIT(3)
 
+/* svs bank 2-line type */
+#define SVSB_LOW   BIT(4)
+#define SVSB_HIGH  BIT(5)
+
 /* svs bank mode support */
 #define SVSB_MODE_ALL_DISABLE  0
 #define SVSB_MODE_INIT01   BIT(1)
@@ -280,6 +284,7 @@ struct thermal_parameter {
  * @volts: bank voltages
  * @reg_data: bank register data of each phase
  * @freq_base: reference frequency for bank init
+ * @turn_freq_base: reference frequency for turn point
  * @vboot: voltage request for bank init01 stage only
  * @volt_step: bank voltage step
  * @volt_base: bank voltage base
@@ -300,6 +305,8 @@ struct thermal_parameter {
  * @hw_id: bank hardware identification
  * @ctl0: bank thermal sensor selection
  * @cpu_id: cpu core id for SVS CPU only
+ * @turn_pt: turn point informs which opp_volt calculated by high/low bank.
+ * @type: bank type to represent it is 2-line (high/low) bank or 1-line bank.
  *
  * Other structure members which are not listed above are svs platform
  * efuse data for bank init
@@ -325,6 +332,7 @@ struct svs_bank {
u32 volts[16];
u32 reg_data[SVSB_PHASE_NUM][SVS_REG_NUM];
u32 freq_base;
+   u32 turn_freq_base;
u32 vboot;
u32 volt_step;
u32 volt_base;
@@ -364,6 +372,8 @@ struct svs_bank {
u32 hw_id;
u32 ctl0;
u32 cpu_id;
+   u32 turn_pt;
+   u32 type;
 };
 
 /*
@@ -441,6 +451,37 @@ static u32 svs_bank_volt_to_opp_volt(u32 svsb_volt, u32 
svsb_volt_step,
return (svsb_volt * svsb_volt_step) + svsb_volt_base;
 }
 
+static u32 svs_opp_volt_to_bank_volt(u32 opp_u_volt, u32 svsb_volt_step,
+u32 svsb_volt_base)
+{
+   return (opp_u_volt - svsb_volt_base) / svsb_volt_step;
+}
+
+static int svs_sync_bank_volts_from_opp(struct svs_bank *svsb)
+{
+   struct dev_pm_opp *opp;
+   u32 i, opp_u_volt;
+
+   for (i = 0; i < svsb->opp_count; i++) {
+   opp = dev_pm_opp_find_freq_exact(svsb->opp_dev,
+svsb->opp_freqs[i],
+true);
+   if (IS_ERR(opp)) {
+   dev_err(svsb->dev, "cannot find freq = %u (%ld)\n",
+   svsb->opp_freqs[i], PTR_ERR(opp));
+   return PTR_ERR(opp);
+   }
+
+   opp_u_volt = dev_pm_opp_get_voltage(opp);
+   svsb->volts[i] = svs_opp_volt_to_bank_volt(opp_u_volt,
+  svsb->volt_step,
+  svsb->volt_base);
+   dev_pm_opp_put(opp);
+   }
+
+   return 0;
+}
+
 static int svs_get_bank_zone_temperature(const char *tzone_name,
 int *tzone_temp)
 {
@@ -456,7 +497,7 @@ static int svs_get_bank_zone_temperature(const char 
*tzone_name,
 static int svs_adjust_pm_opp_volts(struct svs_bank *svsb, bool force_update)
 {
int tzone_temp, ret = -EPERM;
-   u32 i, svsb_volt, opp_volt, temp_offset = 0;
+   u32 i, svsb_volt, opp_volt, temp_offset = 0, opp_start, opp_stop;
 
mutex_lock(&svsb->lock);
 
@@ -470,6 +511,21 @@ static int svs_adjust_pm_opp_volts(struct svs_bank *svsb, 
bool force_update)
goto unlock_mutex;
}
 
+   /*
+* 2-line bank updates its corresponding opp volts.
+* 1-line bank updates all opp volts.
+*/
+   if (svsb->type == SVSB_HIGH) {
+   opp_start = 0;
+   opp_stop = svsb->turn_pt;
+   } else if (svsb->type == SVSB_LOW) {
+   opp_start = svsb->turn_pt;
+   opp_stop = svsb->opp_count;
+   } else {
+   opp_start = 0;
+   opp_stop = svsb->opp_count;
+   }
+
/* Get thermal effect */
if (svsb->phase == SVSB_PHASE_MON) {
if (svsb->temp > svsb->temp_upper_bound &&
@@ -491,10 +547,16 @@ static int svs_adjust_pm_opp_volts(struct svs_bank *svsb, 
bool force_update)
temp_offset += svsb->tzone_high_temp_offset;
else if (tzone_temp <= svsb->tzone_low_temp)
temp_offset += svsb->tzone_low_temp_offset;
+
+   /* 2-line bank takes thermal factor to update all opp volts */
+   if (svsb->type == SVSB_HIGH || svsb->type == SVSB_LOW) {
+   opp_start = 0;
+   opp_stop = svsb->opp_count;
+   }
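
The OPP range selection in the hunks above can be modelled in user space. The following is a minimal sketch, not the driver code: svs_opp_range() and struct opp_range are hypothetical names, and it assumes that the "update all opp volts" override for 2-line banks sits inside the monitor-phase branch, which the truncated hunk suggests but does not confirm.

```c
#include <stdint.h>

/* Bank type flags, mirroring the values added by the patch. */
#define SVSB_LOW   (1u << 4)
#define SVSB_HIGH  (1u << 5)

/* Hypothetical container for the half-open range [start, stop). */
struct opp_range {
	uint32_t start; /* first OPP index to update */
	uint32_t stop;  /* one past the last OPP index to update */
};

/*
 * Hypothetical helper: choose which OPP entries a bank updates.
 * in_mon_phase models (svsb->phase == SVSB_PHASE_MON).
 */
static struct opp_range svs_opp_range(uint32_t type, uint32_t turn_pt,
				      uint32_t opp_count, int in_mon_phase)
{
	/* A 1-line bank updates every OPP entry. */
	struct opp_range r = { 0, opp_count };

	if (type == SVSB_HIGH)
		r.stop = turn_pt;   /* high bank: entries below the turn point */
	else if (type == SVSB_LOW)
		r.start = turn_pt;  /* low bank: entries from the turn point on */

	/* With the thermal factor applied, a 2-line bank updates everything. */
	if (in_mon_phase && (type == SVSB_HIGH || type == SVSB_LOW)) {
		r.start = 0;
		r.stop = opp_count;
	}

	return r;
}
```

With 16 OPP entries and turn_pt = 6, the high bank covers indices 0..5 and the low bank covers 6..15 outside the monitor phase.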

[PATCH v14 2/7] arm64: dts: mt8183: add svs device information

2021-03-24 Thread Roger Lu
Add compatible/reg/irq/clock/efuse settings to the svs node.

Signed-off-by: Roger Lu 
---
 arch/arm64/boot/dts/mediatek/mt8183.dtsi | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi 
b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 80519a145f13..441d617ece43 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -657,6 +657,18 @@
status = "disabled";
};
 
+   svs: svs@1100b000 {
+   compatible = "mediatek,mt8183-svs";
+   reg = <0 0x1100b000 0 0x1000>;
+   interrupts = ;
+   clocks = < CLK_INFRA_THERM>;
+   clock-names = "main";
+   nvmem-cells = <&svs_calibration>,
+ <&thermal_calibration>;
+   nvmem-cell-names = "svs-calibration-data",
+  "t-calibration-data";
+   };
+
pwm0: pwm@1100e000 {
compatible = "mediatek,mt8183-disp-pwm";
reg = <0 0x1100e000 0 0x1000>;
@@ -941,9 +953,15 @@
reg = <0 0x11f1 0 0x1000>;
#address-cells = <1>;
#size-cells = <1>;
+   thermal_calibration: calib@180 {
+   reg = <0x180 0xc>;
+   };
mipi_tx_calibration: calib@190 {
reg = <0x190 0xc>;
};
+   svs_calibration: calib@580 {
+   reg = <0x580 0x64>;
+   };
};
 
u3phy: usb-phy@11f4 {
-- 
2.18.0



[PATCH v14 0/7] soc: mediatek: SVS: introduce MTK SVS

2021-03-24 Thread Roger Lu
1. SVS driver uses OPP adjust event in [1] to update OPP table voltage part.
2. The SVS driver gets the thermal/GPU devices by node [2][3] and the CPU
device by get_cpu_device(). After retrieving each subsys device, the SVS driver
calls device_link_add() to enforce the probe/suspend callback ordering.
3. SVS dts refers to reset controller [4] to help reset SVS HW.

#mt8183 SVS related patches
[1] https://patchwork.kernel.org/patch/11193513/
[2] 
https://patchwork.kernel.org/project/linux-mediatek/patch/20201013102358.22588-2-michael@mediatek.com/
[3] 
https://patchwork.kernel.org/project/linux-mediatek/patch/20200306041345.259332-3-drink...@chromium.org/

#mt8192 SVS related patches
[1] https://patchwork.kernel.org/patch/11193513/
[2] 
https://patchwork.kernel.org/project/linux-mediatek/patch/20201223074944.2061-1-michael@mediatek.com/
[3] https://lore.kernel.org/patchwork/patch/1360551/
[4] 
https://patchwork.kernel.org/project/linux-mediatek/patch/20200817030324.5690-5-crystal@mediatek.com/

changes since v13:
- Fix "mtk-svs.yaml: properties:nvmem-cells:maxItems: False schema does not 
allow 2"
- Remove wrong maintainer "Nishanth Menon "
- When turn_pt = 0, SVS HIGH bank fills FREQPCT74 / FREQPCT30 with 0 and SVS 
controller won't run normally. Therefore, we initialize SVS HIGH bank's 
FREQPCT30 with svsb->freqs_pct[0] to avoid this issue.
- Change SVS GPU opp count back from 14 to 16 because GPU DVFS has a better 
solution

Roger Lu (7):
  [v14,1/7]: dt-bindings: soc: mediatek: add mtk svs dt-bindings
  [v14,2/7]: arm64: dts: mt8183: add svs device information
  [v14,3/7]: soc: mediatek: SVS: introduce MTK SVS engine
  [v14,4/7]: soc: mediatek: SVS: add debug commands
  [v14,5/7]: dt-bindings: soc: mediatek: add mt8192 svs dt-bindings
  [v14,6/7]: arm64: dts: mt8192: add svs device information
  [v14,7/7]: soc: mediatek: SVS: add mt8192 SVS GPU driver

 .../bindings/soc/mediatek/mtk-svs.yaml|   92 +
 arch/arm64/boot/dts/mediatek/mt8183.dtsi  |   18 +
 arch/arm64/boot/dts/mediatek/mt8192.dtsi  |   34 +
 drivers/soc/mediatek/Kconfig  |   10 +
 drivers/soc/mediatek/Makefile |1 +
 drivers/soc/mediatek/mtk-svs.c| 2495 +
 6 files changed, 2650 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml
 create mode 100644 drivers/soc/mediatek/mtk-svs.c
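
For readers following the series, the conversion between SVS bank voltage codes and OPP voltages that patch 7/7 relies on can be sketched in plain C. This only restates the two arithmetic helpers shown in that patch; the function names are shortened here, and the step and base values in the usage note are assumptions for illustration, not values taken from the driver.

```c
#include <stdint.h>

/* Bank code -> OPP voltage: scale by the step, then add the base. */
static uint32_t bank_volt_to_opp_volt(uint32_t bank_volt, uint32_t volt_step,
				      uint32_t volt_base)
{
	return bank_volt * volt_step + volt_base;
}

/*
 * OPP voltage -> bank code: the inverse mapping. Integer division
 * truncates, so a voltage that is not a multiple of the step above the
 * base maps to the next lower code.
 */
static uint32_t opp_volt_to_bank_volt(uint32_t opp_u_volt, uint32_t volt_step,
				      uint32_t volt_base)
{
	return (opp_u_volt - volt_base) / volt_step;
}
```

Assuming a 6250 uV step and a 500000 uV base, code 40 maps to 750000 uV and back to 40, while 753000 uV also maps to 40 because of the truncation.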




Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups

2021-03-24 Thread Namhyung Kim
Hi Song,

Thanks for your review!

On Thu, Mar 25, 2021 at 9:56 AM Song Liu  wrote:
> > On Mar 23, 2021, at 9:21 AM, Namhyung Kim  wrote:
> >
> > As we can run many jobs (in container) on a big machine, we want to
> > measure each job's performance during the run.  To do that, the
> > perf_event can be associated to a cgroup to measure it only.
> >
> > However such cgroup events need to be opened separately and it causes
> > significant overhead in event multiplexing during the context switch
> > as well as resource consumption like in file descriptors and memory
> > footprint.
> >
> > As a cgroup event is basically a cpu event, we can share a single cpu
> > event for multiple cgroups.  All we need is a separate counter (and
> > two timing variables) for each cgroup.  I added a hash table to map
> > from cgroup id to the attached cgroups.
> >
> > With this change, the cpu event needs to calculate a delta of event
> > counter values when the cgroups of current and the next task are
> > different.  And it attributes the delta to the current task's cgroup.
> >
> > This patch adds two new ioctl commands to perf_event for light-weight
> > cgroup event counting (i.e. perf stat).
> >
> > * PERF_EVENT_IOC_ATTACH_CGROUP - it takes a buffer consists of a
> > 64-bit array to attach given cgroups.  The first element is a
> > number of cgroups in the buffer, and the rest is a list of cgroup
> > ids to add a cgroup info to the given event.
> >
> > * PERF_EVENT_IOC_READ_CGROUP - it takes a buffer consists of a 64-bit
> > array to get the event counter values.  The first element is size
> > of the array in byte, and the second element is a cgroup id to
> > read.  The rest is to save the counter value and timings.
> >
> > This attaches all cgroups in a single syscall and I didn't add the
> > DETACH command deliberately to make the implementation simple.  The
> > attached cgroup nodes would be deleted when the file descriptor of the
> > perf_event is closed.
> >
> > Cc: Tejun Heo 
> > Signed-off-by: Namhyung Kim 
> > ---
> > include/linux/perf_event.h  |  22 ++
> > include/uapi/linux/perf_event.h |   2 +
> > kernel/events/core.c| 474 ++--
> > 3 files changed, 471 insertions(+), 27 deletions(-)
> >
> > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> > index 3f7f89ea5e51..2760f3b07534 100644
> > --- a/include/linux/perf_event.h
> > +++ b/include/linux/perf_event.h
> > @@ -771,6 +771,18 @@ struct perf_event {
> >
> > #ifdef CONFIG_CGROUP_PERF
> >   struct perf_cgroup  *cgrp; /* cgroup event is attach to */
> > +
> > + /* to share an event for multiple cgroups */
> > + struct hlist_head   *cgrp_node_hash;
> > + struct perf_cgroup_node *cgrp_node_entries;
> > + int nr_cgrp_nodes;
> > + int cgrp_node_hash_bits;
> > +
> > + struct list_headcgrp_node_entry;
> > +
> > + u64 cgrp_node_count;
> > + u64 cgrp_node_time_enabled;
> > + u64 cgrp_node_time_running;
>
> A comment saying the above values are from previous reading would be helpful.

Sure, will add.

>
> > #endif
> >
> > #ifdef CONFIG_SECURITY
> > @@ -780,6 +792,14 @@ struct perf_event {
> > #endif /* CONFIG_PERF_EVENTS */
> > };
> >
> > +struct perf_cgroup_node {
> > + struct hlist_node   node;
> > + u64 id;
> > + u64 count;
> > + u64 time_enabled;
> > + u64 time_running;
> > + u64 padding[2];
>
> Do we really need the padding? For cache line alignment?

Yeah I was thinking about it.  It seems I need to use the
____cacheline_aligned macro instead.
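
As a rough user-space illustration of alignment by attribute rather than by manual padding, here is a minimal C11 sketch. The struct name, the two pointer members standing in for struct hlist_node, and the 64-byte line size are all assumptions; the kernel macro takes its size from the architecture.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define HYP_CACHELINE 64 /* assumption: 64-byte cache lines */

/*
 * Hypothetical stand-in for struct perf_cgroup_node. Aligning the first
 * member raises the alignment of the whole struct, and the compiler
 * rounds sizeof() up to a multiple of that alignment, so no explicit
 * padding[] member is needed.
 */
struct hyp_cgroup_node {
	alignas(HYP_CACHELINE) void *hash_next; /* models hlist_node.next */
	void **hash_pprev;                      /* models hlist_node.pprev */
	uint64_t id;
	uint64_t count;
	uint64_t time_enabled;
	uint64_t time_running;
};
```

Each element of an array of such nodes then starts on its own cache line regardless of pointer size.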

>
> > +};
> >
> > struct perf_event_groups {
> >   struct rb_root  tree;
> > @@ -843,6 +863,8 @@ struct perf_event_context {
> >   int pin_count;
> > #ifdef CONFIG_CGROUP_PERF
> >   int nr_cgroups;  /* cgroup evts */
> > + struct list_headcgrp_node_list;
> > + struct list_headcgrp_ctx_entry;
> > #endif
> >   void*task_ctx_data; /* pmu specific data 
> > */
> >   struct rcu_head rcu_head;
> > diff --git a/include/uapi/linux/perf_event.h 
> > b/include/uapi/linux/perf_event.h
> > index ad15e40d7f5d..06bc7ab13616 100644
> > --- a/include/uapi/linux/perf_event.h
> > +++ b/include/uapi/linux/perf_event.h
> > @@ -479,6 +479,8 @@ struct perf_event_query_bpf {
> > #define PERF_EVENT_IOC_PAUSE_OUTPUT   _IOW('$', 9, __u32)
> > #define PERF_EVENT_IOC_QUERY_BPF  _IOWR('$', 10, struct 
> > perf_event_query_bpf *)
> > #define PERF_EVENT_IOC_MODIFY_ATTRIBUTES  _IOW('$', 11, 
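
The two buffer layouts described in the quoted commit message can be sketched as follows. This is only a layout illustration under stated assumptions: the helper names are hypothetical, no ioctl is issued, and the READ buffer is assumed to carry three result words (count, time_enabled, time_running) after the size and cgroup id, which matches the fields of struct perf_cgroup_node but is not spelled out in the description.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * ATTACH layout: word 0 is the number of cgroups, followed by that many
 * cgroup ids. Returns a malloc'ed buffer the caller must free, or NULL.
 */
static uint64_t *build_attach_buf(const uint64_t *ids, uint64_t nr)
{
	uint64_t *buf = malloc((nr + 1) * sizeof(*buf));
	uint64_t i;

	if (!buf)
		return NULL;

	buf[0] = nr;
	for (i = 0; i < nr; i++)
		buf[1 + i] = ids[i];

	return buf;
}

/* Assumed READ layout: size in bytes, cgroup id, then three result words. */
#define HYP_READ_BUF_WORDS 5

static void init_read_buf(uint64_t buf[HYP_READ_BUF_WORDS], uint64_t cgroup_id)
{
	buf[0] = HYP_READ_BUF_WORDS * sizeof(uint64_t); /* array size in bytes */
	buf[1] = cgroup_id;                             /* cgroup to read */
	buf[2] = 0;                                     /* counter value (output) */
	buf[3] = 0;                                     /* time enabled (output) */
	buf[4] = 0;                                     /* time running (output) */
}
```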

Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support

2021-03-24 Thread Hugh Dickins
On Wed, 24 Mar 2021, Hugh Dickins wrote:
> On Wed, 24 Mar 2021, Borislav Petkov wrote:
> 
> > Ok,
> > 
> > some more experimenting Babu and I did lead us to:
> > 
> > ---
> > diff --git a/arch/x86/include/asm/tlbflush.h 
> > b/arch/x86/include/asm/tlbflush.h
> > index f5ca15622dc9..259aa4889cad 100644
> > --- a/arch/x86/include/asm/tlbflush.h
> > +++ b/arch/x86/include/asm/tlbflush.h
> > @@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned 
> > long addr)
> >  */
> > if (kaiser_enabled)
> > invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> > +   else
> > +   asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> > +
> > invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> >  }
> > 
> > applied on the guest kernel which fixes the issue. And let me add Hugh
> > who did that PCID stuff at the time. So lemme summarize for Hugh and to
> > ask him nicely to sanity-check me. :-)
> 
> Just a brief interim note to assure you that I'm paying attention,
> but wow, it's a long time since I gave any thought down here!
> Trying to page it all back in...
> 
> I see no harm in your workaround if it works, but it's not as if
> this is a previously untried path: so I'm suspicious how an issue
> here with Globals could have gone unnoticed for so long, and need
> to understand it better.

Right, after looking into it more, I completely agree with you:
the Kaiser series (in both 4.4-stable and 4.9-stable) was simply
wrong to lose that invlpg - fine in the kaiser case when we don't
enable Globals at all, but plain wrong in the !kaiser_enabled case.
One way or another, we have somehow got away with it for three years.

I do agree with Paolo that the PCID_ASID_KERN flush would be better
moved under the "if (kaiser_enabled)" now. (And if this were ongoing
development, I'd want to rewrite the function altogether: but no,
these old stable trees are not the place for that.)
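
To make the intended control flow concrete, here is a small user-space model of the INVPCID_SINGLE path with the PCID_ASID_KERN flush moved under "if (kaiser_enabled)". It is a sketch of the idea discussed above, not the actual -stable fix: the flag names and flush_single_model() are made up, and the function only records which flush primitive would be issued.

```c
#include <stdbool.h>

/* Hypothetical flags naming the flush primitives the model would issue. */
enum {
	FLUSH_INVLPG    = 1 << 0, /* invlpg: also drops Global entries */
	FLUSH_PCID_USER = 1 << 1, /* invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr) */
	FLUSH_PCID_KERN = 1 << 2, /* invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr) */
};

/* Returns the set of flushes issued for one address. */
static unsigned int flush_single_model(bool kaiser_enabled)
{
	unsigned int ops = 0;

	if (kaiser_enabled) {
		/* Split page tables, no Globals: flush both ASIDs by PCID. */
		ops |= FLUSH_PCID_USER;
		ops |= FLUSH_PCID_KERN;
	} else {
		/* Globals may be in use, so a plain invlpg is needed. */
		ops |= FLUSH_INVLPG;
	}

	return ops;
}
```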

Boris, may I leave both -stable fixes to you?
Let me know if you'd prefer me to clean up my mess.

Thanks a lot for tracking this down,
Hugh

> > 
> > Basically, you have an AMD host which supports PCID and INVPCID and you
> > boot on it a 4.9 guest. It explodes like the panic below.
> > 
> > What fixes it is this:
> > 
> > diff --git a/arch/x86/include/asm/tlbflush.h 
> > b/arch/x86/include/asm/tlbflush.h
> > index f5ca15622dc9..259aa4889cad 100644
> > --- a/arch/x86/include/asm/tlbflush.h
> > +++ b/arch/x86/include/asm/tlbflush.h
> > @@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned 
> > long addr)
> >  */
> > if (kaiser_enabled)
> > invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> > +   else
> > +   asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> > +
> > invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> >  }
> > 
> > ---
> > 
> > and the reason why it does, IMHO, is because on AMD, kaiser_enabled is
> > false because AMD is not affected by Meltdown, which means, there's no
> > user/kernel pagetables split.
> > 
> > And that also means, you have global TLB entries which means that if you
> > look at that __native_flush_tlb_single() function, it needs to flush
> > global TLB entries on CPUs with X86_FEATURE_INVPCID_SINGLE by doing an
> > INVLPG in the kaiser_enabled=0 case. Errgo, the above hunk.
> > 
> > But I might be completely off here thus this note...
> > 
> > Thoughts?
> > 
> > Thx.
> > 
> > 
> > [1.235726] [ cut here ]
> > [1.237515] kernel BUG at 
> > /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
> > [1.240926] invalid opcode:  [#1] SMP
> > [1.243301] Modules linked in:
> > [1.244585] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 
> > Debian 4.9.228-1
> > [1.247657] Hardware name: Google Google Compute Engine/Google Compute 
> > Engine, BIOS Google 01/01/2011
> > [1.251249] task: 909363e94040 task.stack: a41bc0194000
> > [1.253519] RIP: 0010:[]  [] 
> > text_poke+0x18c/0x240
> > [1.256593] RSP: 0018:a41bc0197d90  EFLAGS: 00010096
> > [1.258657] RAX: 000f RBX: 01020800 RCX: 
> > feda3203
> > [1.261388] RDX: 178bfbff RSI:  RDI: 
> > ff57a000
> > [1.264168] RBP: 8fbd3eca R08:  R09: 
> > 0003
> > [1.266983] R10: 0003 R11: 0112 R12: 
> > 0001
> > [1.269702] R13: a41bc0197dcf R14: 0286 R15: 
> > ed1c40407500
> > [1.272572] FS:  () GS:90936630() 
> > knlGS:
> > [1.275791] CS:  0010 DS:  ES:  CR0: 80050033
> > [1.278032] CR2:  CR3: 10c08000 CR4: 
> > 003606f0
> > [1.280815] Stack:
> > [1.281630]  8fbd3eca 0005 a41bc0197e03 
> > 8fbd3ecb
> > [1.284660]    8fa2e835 
> > 

[PATCH v2] sched/topology: remove redundant cpumask_and in init_overlap_sched_group

2021-03-24 Thread Barry Song
mask is built in build_balance_mask() by iterating for_each_cpu(i, sg_span),
so it must be a subset of sched_group_span(sg). Although cpumask_first_and()
does not produce a wrong balance cpu, ANDing the mask with the span again is
pointless.

Signed-off-by: Barry Song 
Reviewed-by: Valentin Schneider 
---
 -v2: add reviewed-by of Valentin, thanks!

 kernel/sched/topology.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index f2066d682cd8..d1aec244c027 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -934,7 +934,7 @@ static void init_overlap_sched_group(struct sched_domain 
*sd,
int cpu;
 
build_balance_mask(sd, sg, mask);
-   cpu = cpumask_first_and(sched_group_span(sg), mask);
+   cpu = cpumask_first(mask);
 
sg->sgc = *per_cpu_ptr(sdd->sgc, cpu);
if (atomic_inc_return(&sg->sgc->ref) == 1)
-- 
2.25.1
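
The subset argument in the commit message can be checked with a toy bitmask model. This is a sketch using a 64-bit word in place of struct cpumask; mask_first() and mask_first_and() are hypothetical stand-ins for cpumask_first() and cpumask_first_and().

```c
#include <stdint.h>

/* Index of the lowest set bit, or 64 if the mask is empty. */
static unsigned int mask_first(uint64_t mask)
{
	unsigned int i;

	for (i = 0; i < 64; i++)
		if (mask & (UINT64_C(1) << i))
			return i;

	return 64;
}

/* Index of the lowest bit set in both masks, or 64 if there is none. */
static unsigned int mask_first_and(uint64_t a, uint64_t b)
{
	return mask_first(a & b);
}
```

When mask is a subset of span, (span & mask) == mask, so both helpers return the same index; with span = 0xF0 and mask = 0x60 both give bit 5.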



[PATCH] include: linux: debug_locks: Remove duplicate declaration

2021-03-24 Thread Wan Jiabing
struct task_struct is already declared on line 9. Remove the duplicate.

Signed-off-by: Wan Jiabing 
---
 include/linux/debug_locks.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/debug_locks.h b/include/linux/debug_locks.h
index 2915f56ad421..0b3187a5290d 100644
--- a/include/linux/debug_locks.h
+++ b/include/linux/debug_locks.h
@@ -46,7 +46,6 @@ extern int debug_locks_off(void);
 # define locking_selftest()do { } while (0)
 #endif
 
-struct task_struct;
 
 #ifdef CONFIG_LOCKDEP
 extern void debug_show_all_locks(void);
-- 
2.25.1



RE: [EXT] Re: [PATCH v2 3/3] PCI: imx: clear vreg bypass when pcie vph voltage is 3v3

2021-03-24 Thread Richard Zhu

> -Original Message-
> From: Lucas Stach 
> Sent: Wednesday, March 24, 2021 5:30 PM
> To: Richard Zhu ; andrew.smir...@gmail.com;
> shawn...@kernel.org; k...@linux.com; bhelg...@google.com;
> ste...@agner.ch; lorenzo.pieral...@arm.com
> Cc: linux-...@vger.kernel.org; dl-linux-imx ;
> linux-arm-ker...@lists.infradead.org; linux-kernel@vger.kernel.org;
> ker...@pengutronix.de
> Subject: Re: [PATCH v2 3/3] PCI: imx: clear vreg bypass when pcie vph
> voltage is 3v3
> Am Mittwoch, dem 24.03.2021 um 13:34 +0800 schrieb Richard Zhu:
> > Both 1.8v and 3.3v power supplies can be used by i.MX8MQ PCIe PHY.
> > In default, the PCIE_VPH voltage is suggested to be 1.8v refer to data
> > sheet. When PCIE_VPH is supplied by 3.3v in the HW schematic design,
> > the VREG_BYPASS bits of GPR registers should be cleared from default
> > value 1b'1 to 1b'0. Thus, the internal 3v3 to 1v8 translator would be
> > turned on.
> >
> > Signed-off-by: Richard Zhu 
> > ---
> >  drivers/pci/controller/dwc/pci-imx6.c | 23 +++
> >  1 file changed, 23 insertions(+)
> >
> > diff --git a/drivers/pci/controller/dwc/pci-imx6.c
> > b/drivers/pci/controller/dwc/pci-imx6.c
> > index 853ea8e82952..beca085a9300 100644
> > --- a/drivers/pci/controller/dwc/pci-imx6.c
> > +++ b/drivers/pci/controller/dwc/pci-imx6.c
> > @@ -37,6 +37,7 @@
> >  #define IMX8MQ_GPR_PCIE_REF_USE_PAD  BIT(9)
> >  #define IMX8MQ_GPR_PCIE_CLK_REQ_OVERRIDE_EN  BIT(10)
> >  #define IMX8MQ_GPR_PCIE_CLK_REQ_OVERRIDE BIT(11)
> > +#define IMX8MQ_GPR_PCIE_VREG_BYPASS  BIT(12)
> >  #define IMX8MQ_GPR12_PCIE2_CTRL_DEVICE_TYPE  GENMASK(11, 8)
> >  #define IMX8MQ_PCIE2_BASE_ADDR
> 0x33c0
> >
> > @@ -80,6 +81,7 @@ struct imx6_pcie {
> >   u32 tx_swing_full;
> >   u32 tx_swing_low;
> >   struct regulator*vpcie;
> > + struct regulator*vph;
> >   void __iomem*phy_base;
> >
> >   /* power domain for pcie */
> > @@ -611,6 +613,8 @@ static void imx6_pcie_configure_type(struct
> > imx6_pcie *imx6_pcie)
> >
> >  static void imx6_pcie_init_phy(struct imx6_pcie *imx6_pcie)  {
> > + int phy_uv;
> > +
> No need for this variable...
[Richard Zhu] Thanks, I will remove it later.

> 
> >   switch (imx6_pcie->drvdata->variant) {
> >   case IMX8MQ:
> >   /*
> > @@ -621,6 +625,18 @@ static void imx6_pcie_init_phy(struct imx6_pcie
> *imx6_pcie)
> >  imx6_pcie_grp_offset(imx6_pcie),
> >
> IMX8MQ_GPR_PCIE_REF_USE_PAD,
> >
> IMX8MQ_GPR_PCIE_REF_USE_PAD);
> > + /*
> > +  * Regarding to the datasheet, the PCIE_VPH is suggested
> > +  * to be 1.8V. If the PCIE_VPH is supplied by 3.3V, the
> > +  * VREG_BYPASS should be cleared to zero.
> > +  */
> > + if (imx6_pcie->vph)
> > + phy_uv =
> regulator_get_voltage(imx6_pcie->vph);
> > + if (phy_uv > 300)
> > + regmap_update_bits(imx6_pcie->iomuxc_gpr,
> > +
> imx6_pcie_grp_offset(imx6_pcie),
> > +
> IMX8MQ_GPR_PCIE_VREG_BYPASS,
> > +0);
> 
> ...if you just fold this into a single condition. Right now phy_uv might be 
> used
> uninitialized when the vph-supply is not specified in the DT. Better write 
> this
> as:
> 
> if (imx6_pcie->vph && regulator_get_voltage(imx6_pcie->vph) > 300)
[Richard Zhu] Thanks. I will change it this way.
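
The single-condition form suggested above can be modelled in user space as follows. This is a sketch under stated assumptions: struct fake_regulator and fake_regulator_get_voltage() are stand-ins for the regulator API, and the 3000000 uV threshold is an assumed value chosen for illustration, since the constant in the quoted code is not fully legible here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a regulator handle; voltage in microvolts. */
struct fake_regulator {
	int uV;
};

static int fake_regulator_get_voltage(const struct fake_regulator *r)
{
	return r->uV;
}

/* Assumed threshold separating a 1.8 V supply from a 3.3 V supply. */
#define HYP_VPH_THRESHOLD_UV 3000000

/*
 * Short-circuit evaluation means the voltage is only read when the
 * optional supply exists, so no local variable can be used uninitialized.
 */
static bool should_clear_vreg_bypass(const struct fake_regulator *vph)
{
	return vph && fake_regulator_get_voltage(vph) > HYP_VPH_THRESHOLD_UV;
}
```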
> 
> Regards,
> Lucas
> 
> >   break;
> >   case IMX7D:
> >   regmap_update_bits(imx6_pcie->iomuxc_gpr,
> IOMUXC_GPR12,
> > @@ -1130,6 +1146,13 @@ static int imx6_pcie_probe(struct
> platform_device *pdev)
> >   imx6_pcie->vpcie = NULL;
> >   }
> >
> > + imx6_pcie->vph = devm_regulator_get_optional(&pdev->dev,
> "vph");
> > + if (IS_ERR(imx6_pcie->vph)) {
> > + if (PTR_ERR(imx6_pcie->vph) != -ENODEV)
> > + return PTR_ERR(imx6_pcie->vph);
> > + imx6_pcie->vph = NULL;
> > + }
> > +
> >   platform_set_drvdata(pdev, imx6_pcie);
> >
> >   ret = imx6_pcie_attach_pd(dev);
> 



[PATCH] ext4: Fix bug on in ext4_es_cache_extent as ext4_split_extent_at failed

2021-03-24 Thread Ye Bin
We hit the following BUG_ON:
[130747.323114] kernel BUG at fs/ext4/extents_status.c:762!
[130747.323117] Internal error: Oops - BUG: 0 [#1] SMP
..
[130747.334329] Call trace:
[130747.334553]  ext4_es_cache_extent+0x150/0x168 [ext4]
[130747.334975]  ext4_cache_extents+0x64/0xe8 [ext4]
[130747.335368]  ext4_find_extent+0x300/0x330 [ext4]
[130747.335759]  ext4_ext_map_blocks+0x74/0x1178 [ext4]
[130747.336179]  ext4_map_blocks+0x2f4/0x5f0 [ext4]
[130747.336567]  ext4_mpage_readpages+0x4a8/0x7a8 [ext4]
[130747.336995]  ext4_readpage+0x54/0x100 [ext4]
[130747.337359]  generic_file_buffered_read+0x410/0xae8
[130747.337767]  generic_file_read_iter+0x114/0x190
[130747.338152]  ext4_file_read_iter+0x5c/0x140 [ext4]
[130747.338556]  __vfs_read+0x11c/0x188
[130747.338851]  vfs_read+0x94/0x150
[130747.339110]  ksys_read+0x74/0xf0

If ext4_ext_insert_extent() fails after the new extent has already been
inserted, we currently just restore "ex->ee_len = orig_ex.ee_len". This makes
the two extents overlap and later triggers the BUG_ON when the extent is cached.
So when ext4_ext_insert_extent() fails, do not restore ex->ee_len to the old
value. This may leak blocks, but that can be fixed by fsck later.

Signed-off-by: Ye Bin 
---
 fs/ext4/extents.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 77c84d6f1af6..970eb2dfcc46 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3246,7 +3246,7 @@ static int ext4_split_extent_at(handle_t *handle,
 
goto out;
} else if (err)
-   goto fix_extent_len;
+   goto err;
 
 out:
ext4_ext_show_leaf(inode, path);
@@ -3254,6 +3254,7 @@ static int ext4_split_extent_at(handle_t *handle,
 
 fix_extent_len:
ex->ee_len = orig_ex.ee_len;
+err:
/*
 * Ignore ext4_ext_dirty return value since we are already in error path
 * and err is a non-zero error code.
-- 
2.25.4
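
The overlap described in the commit message can be reproduced with a toy model. This is a sketch, not ext4 code: struct hyp_extent and split_at_model() are hypothetical, extents are reduced to a start block and a length, and the model assumes the second half is already in the tree when the insert reports an error.

```c
#include <stdbool.h>

/* Hypothetical extent: covers blocks [start, start + len). */
struct hyp_extent {
	unsigned int start;
	unsigned int len;
};

/* True if the two extents share at least one block. */
static bool extents_overlap(const struct hyp_extent *a,
			    const struct hyp_extent *b)
{
	return a->start < b->start + b->len && b->start < a->start + a->len;
}

/*
 * Split *ex at block `split`. *ex is shortened and *ex2 receives the
 * second half. insert_err models the return value of the insert step.
 * restore_len selects the old behaviour (jump to fix_extent_len and
 * restore the original length) or the patched one (leave it shortened).
 */
static int split_at_model(struct hyp_extent *ex, struct hyp_extent *ex2,
			  unsigned int split, int insert_err, bool restore_len)
{
	unsigned int orig_len = ex->len;

	ex2->start = split;
	ex2->len = ex->start + orig_len - split;
	ex->len = split - ex->start;

	if (insert_err && restore_len)
		ex->len = orig_len; /* old path: first extent now overlaps ex2 */

	return insert_err;
}
```

Splitting blocks 0..9 at block 4 with a failing insert leaves two overlapping extents on the old path and two disjoint ones on the patched path.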



Re: [PATCH] scsi: bnx2i: make bnx2i_process_iscsi_error simpler and more robust

2021-03-24 Thread Martin K. Petersen


Rasmus,

> Instead of strcpy'ing into a stack buffer, just let additional_notice
> point to a string literal living in .rodata. This is better in a few
> ways:

Applied to 5.13/scsi-staging, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering
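
The pattern described in the quoted patch summary, pointing at a string literal instead of copying into a stack buffer, looks roughly like this. It is a minimal sketch with made-up error codes and messages; nothing here is taken from the bnx2i driver beyond the variable name additional_notice.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error codes, for illustration only. */
enum {
	HYP_ERR_DATA_DIGEST = 1,
	HYP_ERR_PROTOCOL    = 2,
};

/*
 * Format an error line into out. The optional notice is a pointer to a
 * string literal, so there is no intermediate buffer to size or overflow.
 */
static int format_error(char *out, size_t len, int code)
{
	const char *additional_notice = "";

	switch (code) {
	case HYP_ERR_DATA_DIGEST:
		additional_notice = " (data digest)";
		break;
	case HYP_ERR_PROTOCOL:
		additional_notice = " (protocol)";
		break;
	default:
		break; /* unknown codes get no extra notice */
	}

	return snprintf(out, len, "iscsi error %d%s", code, additional_notice);
}
```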


Re: [PATCH] spi: fsi: Remove multiple sequenced ops for restricted chips

2021-03-24 Thread Joel Stanley
On Wed, 24 Mar 2021 at 22:05, Eddie James  wrote:
>
> Updated restricted chips have trouble processing multiple sequenced
> operations. So remove the capability to sequence multiple operations and
> reduce the maximum transfer size to 8 bytes.
>
> Signed-off-by: Eddie James 

Reviewed-by: Joel Stanley 

> ---
>  drivers/spi/spi-fsi.c | 27 +++
>  1 file changed, 7 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/spi/spi-fsi.c b/drivers/spi/spi-fsi.c
> index 3920cd3286d8..de359718e816 100644
> --- a/drivers/spi/spi-fsi.c
> +++ b/drivers/spi/spi-fsi.c
> @@ -26,7 +26,7 @@
>  #define SPI_FSI_BASE   0x7
>  #define SPI_FSI_INIT_TIMEOUT_MS1000
>  #define SPI_FSI_MAX_XFR_SIZE   2048
> -#define SPI_FSI_MAX_XFR_SIZE_RESTRICTED32
> +#define SPI_FSI_MAX_XFR_SIZE_RESTRICTED8
>
>  #define SPI_FSI_ERROR  0x0
>  #define SPI_FSI_COUNTER_CFG0x1
> @@ -265,14 +265,12 @@ static int fsi_spi_sequence_transfer(struct fsi_spi 
> *ctx,
>  struct fsi_spi_sequence *seq,
>  struct spi_transfer *transfer)
>  {
> -   bool docfg = false;
> int loops;
> int idx;
> int rc;
> u8 val = 0;
> u8 len = min(transfer->len, 8U);
> u8 rem = transfer->len % len;
> -   u64 cfg = 0ULL;
>
> loops = transfer->len / len;
>
> @@ -292,28 +290,17 @@ static int fsi_spi_sequence_transfer(struct fsi_spi 
> *ctx,
> return -EINVAL;
> }
>
> -   if (ctx->restricted) {
> -   const int eidx = rem ? 5 : 6;
> -
> -   while (loops > 1 && idx <= eidx) {
> -   idx = fsi_spi_sequence_add(seq, val);
> -   loops--;
> -   docfg = true;
> -   }
> -
> -   if (loops > 1) {
> -   dev_warn(ctx->dev, "No sequencer slots; aborting.\n");
> -   return -EINVAL;
> -   }
> +   if (ctx->restricted && loops > 1) {
> +   dev_warn(ctx->dev,
> +"Transfer too large; no branches permitted.\n");
> +   return -EINVAL;
> }
>
> if (loops > 1) {
> +   u64 cfg = SPI_FSI_COUNTER_CFG_LOOPS(loops - 1);
> +
> fsi_spi_sequence_add(seq, SPI_FSI_SEQUENCE_BRANCH(idx));
> -   docfg = true;
> -   }
>
> -   if (docfg) {
> -   cfg = SPI_FSI_COUNTER_CFG_LOOPS(loops - 1);
> if (transfer->rx_buf)
> cfg |= SPI_FSI_COUNTER_CFG_N2_RX |
> SPI_FSI_COUNTER_CFG_N2_TX |
> --
> 2.27.0
>


[PATCH] include: linux: host1x: Remove duplicate declaration

2021-03-24 Thread Wan Jiabing
struct host1x is declared at 20th line. Remove the duplicate.

Signed-off-by: Wan Jiabing 
---
 include/linux/host1x.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index ce59a6a6a008..462f0bc7a703 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -140,7 +140,6 @@ static inline void host1x_bo_munmap(struct host1x_bo *bo, 
void *addr)
 
 struct host1x_syncpt_base;
 struct host1x_syncpt;
-struct host1x;
 
 struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, u32 id);
 u32 host1x_syncpt_id(struct host1x_syncpt *sp);
-- 
2.25.1



Re: [PATCH v12 1/2] scsi: ufs: Enable power management for wlun

2021-03-24 Thread Asutosh Das (asd)

On 3/23/2021 12:19 PM, Adrian Hunter wrote:

On 23/03/21 5:13 pm, Asutosh Das (asd) wrote:

On 3/22/2021 11:12 PM, Adrian Hunter wrote:

On 22/03/21 9:53 pm, Asutosh Das (asd) wrote:

On 3/19/2021 10:47 AM, Adrian Hunter wrote:

On 19/03/21 2:35 am, Asutosh Das wrote:

During runtime-suspend of ufs host, the scsi devices are
already suspended and so are the queues associated with them.
But the ufs host sends SSU to wlun during its runtime-suspend.
During the process blk_queue_enter checks if the queue is not in
suspended state. If so, it waits for the queue to resume, and never
comes out of it.
The commit
(d55d15a33: scsi: block: Do not accept any requests while suspended)
adds the check if the queue is in suspended state in blk_queue_enter().

Call trace:
    __switch_to+0x174/0x2c4
    __schedule+0x478/0x764
    schedule+0x9c/0xe0
    blk_queue_enter+0x158/0x228
    blk_mq_alloc_request+0x40/0xa4
    blk_get_request+0x2c/0x70
    __scsi_execute+0x60/0x1c4
    ufshcd_set_dev_pwr_mode+0x124/0x1e4
    ufshcd_suspend+0x208/0x83c
    ufshcd_runtime_suspend+0x40/0x154
    ufshcd_pltfrm_runtime_suspend+0x14/0x20
    pm_generic_runtime_suspend+0x28/0x3c
    __rpm_callback+0x80/0x2a4
    rpm_suspend+0x308/0x614
    rpm_idle+0x158/0x228
    pm_runtime_work+0x84/0xac
    process_one_work+0x1f0/0x470
    worker_thread+0x26c/0x4c8
    kthread+0x13c/0x320
    ret_from_fork+0x10/0x18

Fix this by registering ufs device wlun as a scsi driver and
registering it for block runtime-pm. Also make this as a
supplier for all other luns. That way, this device wlun
suspends after all the consumers and resumes after
hba resumes.

Co-developed-by: Can Guo 
Signed-off-by: Can Guo 
Signed-off-by: Asutosh Das 


I have some more comments that may help straighten things out.

Also please look at ufs_debugfs_get_user_access() and
ufs_debugfs_put_user_access() that now need to scsi_autopm_get/put_device
sdev_ufs_device.

It would also be good if you could re-base on linux-next.



Hi Adrian
Thanks for the comments.

I agree moving the code to wlun probe and other changes.
But it looks to me that it may not fully solve the issue.

Please let me explain my understanding on this:

(Please refer to the logs in v10)
scsi_autopm_*() are invoked on a sdev.
pm_runtime_get_suppliers()/rpm_put_suppliers() are on the supplier device.

For the device wlun:
  slave_configure():
  - doesn't set the rpm_autosuspend
  - pm_runtime_getnoresume()
  scsi_sysfs_add_sdev():
  - pm_runtime_forbid()
  - scsi_autopm_get_device()
  - device_add()
  - ufshcd_wl_probe()
  - scsi_autopm_put_device()

For all other scsi devices:
  slave_alloc():
  - ufshcd_setup_links()
Say all link_add: pm_runtime_put(>sdev_ufs_device->sdev_gendev);


With DL_FLAG_RPM_ACTIVE, links will 'get' not 'put'


I'm referring to the pm_runtime_put(sdev_ufs_device) after all the links are 
setup, that you suggested to add.


Ok


  slave_configure():
  - set rpm_autosuspend
  scsi_sysfs_add_sdev():
  - scsi_autopm_get_device()
  - device_add() -> schedules an async probe()
  - scsi_autopm_put_device() - (1)

Now the rpm_put_suppliers() can be invoked *after* pm_runtime_get_suppliers() 
of the async probe(), since both are running in different contexts.


Only if the sd device suspends.


Correct. What'd stop the sd device from suspending?
We should be stopping the sd device from suspending here - imho.




Hi Adrian,
Thanks for the comments.


You mean for performance reasons.  That is something we can
look at, but let's get it working first.

Not for performance reasons. I meant to say that this issue can be fixed 
if we stop the sd devices from suspending until the sd_probe() is completed.



In that case, the usage_count of supplier would be decremented until rpm_active 
of this link becomes 1.


Right, because the sd device suspended.


Now the pm_runtime_get_suppliers() expects the link_active to be more than 1.


Not sure what you mean here. pm_runtime_*put*_suppliers() won't do anything if 
the link count is 1.

I'm referring to the logs that I pasted before:
[    6.941267][    T7] scsi 0:0:0:4: rpm_put_suppliers: [BEF] Supp (0:0:0:49488) usage_count: 4 rpm_active: 3

-- T196 Context comes in while T7 is running --
[    6.941466][  T196] scsi 0:0:0:4: pm_runtime_get_suppliers: (0:0:0:49488): supp: usage_count: 5 rpm_active: 4
--

[    7.788397][    T7] scsi 0:0:0:4: rpm_put_suppliers: [AFT] Supp (0:0:0:49488) usage_count: 2 rpm_active: 1

I meant to say that, if the rpm_put_suppliers() is invoked after the 
pm_runtime_get_suppliers() as is seen above then the link_active may become 1 
even *after* pm_runtime_get_suppliers() is invoked.

I'm referring to the pm_runtime_get_suppliers() invoked from:
driver_probe_device() - say for, sd 0:0:0:x
 |- pm_runtime_get_suppliers() - for sd 

Re: [PATCH] userfaultfd/shmem: fix minor fault page leak

2021-03-24 Thread Axel Rasmussen
On Wed, Mar 24, 2021 at 5:52 PM Peter Xu  wrote:
>
> Hi, Andrew,
>
> On Wed, Mar 24, 2021 at 04:20:27PM -0700, Andrew Morton wrote:
> > On Mon, 22 Mar 2021 13:48:35 -0700 Axel Rasmussen 
> >  wrote:
> >
> > > This fix is analogous to Peter Xu's fix for hugetlb [0]. If we don't
> > > put_page() after getting the page out of the page cache, we leak the
> > > reference.
> > >
> > > The fix can be verified by checking /proc/meminfo and running the
> > > userfaultfd selftest in shmem mode. Without the fix, we see MemFree /
> > > MemAvailable steadily decreasing with each run of the test. With the
> > > fix, memory is correctly freed after the test program exits.
> > >
> > > Fixes: 00da60b9d0a0 ("userfaultfd: support minor fault handling for 
> > > shmem")
> >
> > Confused.  The affected code:
> >
> > > --- a/mm/shmem.c
> > > +++ b/mm/shmem.c
> > > @@ -1831,6 +1831,7 @@ static int shmem_getpage_gfp(struct inode *inode, 
> > > pgoff_t index,
> > >
> > > if (page && vma && userfaultfd_minor(vma)) {
> > > unlock_page(page);
> > > +   put_page(page);
> > > *fault_type = handle_userfault(vmf, VM_UFFD_MINOR);
> > > return 0;
> > > }
> >
> > Is added by Peter's "page && vma && userfaultfd_minor".  I assume that
> > "Fixes:" is incorrect?
> >
>
> It seems to me the commit is correct as pointed to in "Fixes", but I do have a
> different commit ID here:
>
> commit 63c826b1372c4930f89b8a55092699fa7f0d6f4e
> Author: Axel Rasmussen 
> Date:   Thu Mar 18 10:20:43 2021 -0400
>
> userfaultfd: support minor fault handling for shmem
>
> Axel, did you fetch the commit ID from your local tree, perhaps?  Since I
> should have fetched from hnaz/linux-mm and I can see Andrew's sign-off too.
>
> Thanks,
>
> --
> Peter Xu
>

Ah, this is the SHA I see when I "git log --grep linux-next/akpm"
(where my repo's linux-next remote is [1]):

commit 00da60b9d0a03818c36a2fe862578309c27006ad
Author: Axel Rasmussen 
Date:   Thu Mar 18 17:01:51 2021 +1100

userfaultfd: support minor fault handling for shmem

This is the commit that this new patch fixes. I'll admit I'm a bit
unsure which tree the "Fixes:" tag is meant to refer to before the
commits make it into Linus' tree, if I should look up the commit
another way just let me know. :) And, sorry for the confusion.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git


Re: [PATCH net v2] net: sched: fix packet stuck problem for lockless qdisc

2021-03-24 Thread Yunsheng Lin
On 2021/3/25 3:20, Cong Wang wrote:
> On Tue, Mar 23, 2021 at 7:24 PM Yunsheng Lin  wrote:
>> @@ -176,8 +207,23 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
>>  static inline void qdisc_run_end(struct Qdisc *qdisc)
>>  {
>> write_seqcount_end(&qdisc->running);
>> -   if (qdisc->flags & TCQ_F_NOLOCK)
>> +   if (qdisc->flags & TCQ_F_NOLOCK) {
>> spin_unlock(&qdisc->seqlock);
>> +
>> +   /* qdisc_run_end() is protected by RCU lock, and
>> +* qdisc reset will do a synchronize_net() after
>> +* setting __QDISC_STATE_DEACTIVATED, so testing
>> +* the below two bits separately should be fine.
> 
> Hmm, why synchronize_net() after setting this bit is fine? It could
> still be flipped right after you test RESCHEDULE bit.

That depends on when it will be flipped again.

As I see:
1. __QDISC_STATE_DEACTIVATED is set during the dev_deactivate() process,
   which should also wait for every process related to "test_bit(
   __QDISC_STATE_NEED_RESCHEDULE, &qdisc->state)" to finish by calling
   synchronize_net() and checking some_qdisc_is_busy().

2. it is cleared during the dev_activate() process.

dev_deactivate() and dev_activate() are protected by the RTNL lock, or
serialized by linkwatch.

> 
> 
>> +* For qdisc_run() in net_tx_action() case, we
>> +* really should provide rcu protection explicitly
>> +* for document purposes or PREEMPT_RCU.
>> +*/
>> +   if (unlikely(test_bit(__QDISC_STATE_NEED_RESCHEDULE,
>> + &qdisc->state) &&
>> +!test_bit(__QDISC_STATE_DEACTIVATED,
>> +  &qdisc->state)))
> 
> Why do you want to test __QDISC_STATE_DEACTIVATED bit at all?
> dev_deactivate_many() will wait for those scheduled but being
> deactivated, so what's the problem of scheduling it even with this bit?

The problem I tried to fix is:

  CPU0 (calling dev_deactivate)    CPU1 (calling qdisc_run_end)    CPU2 (calling tx_action)
               .                        __netif_schedule()                    .
               .                      set __QDISC_STATE_SCHED                 .
               .                                .                             .
  set __QDISC_STATE_DEACTIVATED                 .                             .
        synchronize_net()                       .                             .
               .                                .                             .
               .                                .                 clear __QDISC_STATE_SCHED
               .                                .                             .
  some_qdisc_is_busy() returns false            .                             .
               .                                .                             .
               .                                .                        qdisc_run()

some_qdisc_is_busy() checks whether the qdisc is busy by checking
__QDISC_STATE_SCHED and spin_is_locked(&qdisc->seqlock) for a lockless qdisc.
some_qdisc_is_busy() returns false for CPU0 because CPU2 has cleared
__QDISC_STATE_SCHED and has not yet taken qdisc->seqlock, but the qdisc is
clearly still busy when qdisc_run() is run by CPU2 later.

So you are right, testing __QDISC_STATE_DEACTIVATED does not completely solve
the above data race, and __netif_schedule() is also called by dev_requeue_skb()
and __qdisc_run(), which need the same fix.

So I will remove the __QDISC_STATE_DEACTIVATED test from this patch for now and
deal with it later.

> 
> Thanks.
> 
> .
> 



Re: [PATCH][next] scsi: aacraid: Replace one-element array with flexible-array member

2021-03-24 Thread Gustavo A. R. Silva
Hi Martin,

On 3/24/21 20:18, Martin K. Petersen wrote:
> 
> Hi Gustavo!
> 
> Your changes and the original code do not appear to be functionally
> equivalent.
> 
>> @@ -1235,8 +1235,8 @@ static int aac_read_raw_io(struct fib * fib, struct 
>> scsi_cmnd * cmd, u64 lba, u3
>>  if (ret < 0)
>>  return ret;
>>  command = ContainerRawIo2;
>> -fibsize = sizeof(struct aac_raw_io2) +
>> -((le32_to_cpu(readcmd2->sgeCnt)-1) * sizeof(struct 
>> sge_ieee1212));
>> +fibsize = struct_size(readcmd2, sge,
>> + le32_to_cpu(readcmd2->sgeCnt));
> 
> The old code allocated sgeCnt-1 elements (whether that was a mistake or
> not I do not know) whereas the new code would send a larger fib to the
> ASIC. I don't have any aacraid adapters and I am hesitant to merging
> changes that have not been validated on real hardware.

Precisely this sort of confusion is one of the things we want to avoid
by using flexible-array members instead of one-element arrays.

fibsize is actually the same for both the old and the new code. The
difference is that in the original code, the one-element array _sge_
at the bottom of struct aac_raw_io2, contributes to the size of the
structure, as it occupies at least as much space as a single object
of its type. On the other hand, flexible-array members don't contribute
to the size of the enclosing structure. See below...

Old code:

$ pahole -C aac_raw_io2 drivers/scsi/aacraid/aachba.o
struct aac_raw_io2 {
        __le32                     blockLow;             /*     0     4 */
        __le32                     blockHigh;            /*     4     4 */
        __le32                     byteCount;            /*     8     4 */
        __le16                     cid;                  /*    12     2 */
        __le16                     flags;                /*    14     2 */
        __le32                     sgeFirstSize;         /*    16     4 */
        __le32                     sgeNominalSize;       /*    20     4 */
        u8                         sgeCnt;               /*    24     1 */
        u8                         bpTotal;              /*    25     1 */
        u8                         bpComplete;           /*    26     1 */
        u8                         sgeFirstIndex;        /*    27     1 */
        u8                         unused[4];            /*    28     4 */
        struct sge_ieee1212        sge[1];               /*    32    16 */

        /* size: 48, cachelines: 1, members: 13 */
        /* last cacheline: 48 bytes */
};

New code:

$ pahole -C aac_raw_io2 drivers/scsi/aacraid/aachba.o
struct aac_raw_io2 {
        __le32                     blockLow;             /*     0     4 */
        __le32                     blockHigh;            /*     4     4 */
        __le32                     byteCount;            /*     8     4 */
        __le16                     cid;                  /*    12     2 */
        __le16                     flags;                /*    14     2 */
        __le32                     sgeFirstSize;         /*    16     4 */
        __le32                     sgeNominalSize;       /*    20     4 */
        u8                         sgeCnt;               /*    24     1 */
        u8                         bpTotal;              /*    25     1 */
        u8                         bpComplete;           /*    26     1 */
        u8                         sgeFirstIndex;        /*    27     1 */
        u8                         unused[4];            /*    28     4 */
        struct sge_ieee1212        sge[];                /*    32     0 */

        /* size: 32, cachelines: 1, members: 13 */
        /* last cacheline: 32 bytes */
};

So, the old code allocates sgeCnt-1 elements because sizeof(struct aac_raw_io2)
is already counting one element of the _sge_ array.
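
The equivalence of the two size computations can be checked with a small
standalone sketch. The names here are assumptions made for illustration:
struct elem is a hypothetical 16-byte stand-in for struct sge_ieee1212,
fixed[32] stands in for the 32 bytes of fixed fields shown by pahole, and
size_new() spells out the arithmetic that struct_size() performs, without
its overflow checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte element standing in for struct sge_ieee1212. */
struct elem {
	uint32_t addr_low;
	uint32_t addr_high;
	uint32_t length;
	uint32_t flags;
};

/* Old layout: trailing one-element array, counted by sizeof(). */
struct hdr_one {
	uint8_t fixed[32];	/* stands in for the fixed header fields */
	struct elem sge[1];
};

/* New layout: flexible-array member, not counted by sizeof(). */
struct hdr_flex {
	uint8_t fixed[32];
	struct elem sge[];
};

/* Old computation: sizeof() already includes one element, so add n - 1. */
size_t size_old(size_t n)
{
	assert(n >= 1);
	return sizeof(struct hdr_one) + (n - 1) * sizeof(struct elem);
}

/* New computation: header plus n elements of the flexible array. */
size_t size_new(size_t n)
{
	return sizeof(struct hdr_flex) + n * sizeof(struct elem);
}
```

For any n >= 1 the two functions return the same value, because the 16
bytes that sizeof(struct hdr_one) contributes for sge[1] are exactly the
element that the old formula leaves out with n - 1.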

Please, let me know if this is clear now.

Thanks!
--
Gustavo


Re: [PATCH] btrfs: fix a potential hole-punching failure

2021-03-24 Thread bingjing chang
In order to reply in plain text, I send the mail from Gmail.

On Wed, Mar 24, 2021 at 8:16 PM Filipe Manana wrote:
>
> On Wed, Mar 24, 2021 at 11:15 AM bingjingc  wrote:
> >
> > From: BingJing Chang 
> >
> > In commit d77815461f04 ("btrfs: Avoid trucating page or punching hole in
> > a already existed hole."), existed holes can be skipped by calling
> > find_first_non_hole() to adjust *start and *len. However, if the given
> > len is invalid and large, when an EXTENT_MAP_HOLE extent is found, the
> > *len will not be set to zero because (em->start + em->len) is less than
> > (*start + *len). Then the ret will be 1 but the *len will not be set to
> > 0. The propagated non-zero ret will result in fallocate failure.
> >
> > In the while-loop of btrfs_replace_file_extents(), len is not updated
> > every time before it calls find_first_non_hole(). That is, if the last
> > file extent in the given hole-punching range has been dropped but
> > btrfs_drop_extents() fails with -ENOSPC (btrfs_drop_extents() runs out
> > of reserved space of the given transaction), the problem can happen.
>
> This is not entirely clear. Dropping the last extent and still
> returning ENOSPC is confusing.
> I think you mean that it drops the last file extent item that does not
> represent hole (disk_bytenr > 0), and after it there's only one file
> extent item representing a hole (disk_bytenr == 0).
> It fails with -ENOSPC when attempting to drop the file extent item
> representing the hole, after successfully dropping the non-hole file
> extent item.
> Is that it?
>

Thank you for your comments. You're right.
Calling it the last file extent is incorrect and confusing.
I have revised the commit message and sent a v2 patch. Thank you.

> > After it calls find_first_non_hole(), the cur_offset will be adjusted
> > to be larger than or equal to end. However, since the len is not set to
> > zero. The break-loop condition (ret && !len) will not meet. After it
> > leaves the while-loop, uncleared ret will result in fallocate failure.
>
> Ok, fallocate will return 1, an unexpected return value.
>
> >
> > We're not able to construct a reproducible way to let
> > btrfs_drop_extents() fails with -ENOSPC after it drops the last file
> > extent but with remaining holes. However, it's quite easy to fix. We
> > just need to update and check the len every time before we call
> > find_first_non_hole(). To make the while loop more readable, we also
> > pull the variable updates to the bottom of loop like this:
> > while (cur_offset < end) {
> > ...
> > // update cur_offset & len
> > // advance cur_offset & len in hole-punching case if needed
> > }
> >
> > Reported-by: Robbie Ko 
> > Fixes: d77815461f04 ("btrfs: Avoid trucating page or punching hole in a
> > already existed hole.")
> > Reviewed-by: Robbie Ko 
> > Reviewed-by: Chung-Chiang Cheng 
> > Signed-off-by: BingJing Chang 
>
> Looks good.
> Please just update that paragraph to be more clear about what is going on.
>
> Thanks.
>
> > ---
> >  fs/btrfs/file.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index 0e155f0..dccb017 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -2735,8 +2735,6 @@ int btrfs_replace_file_extents(struct inode *inode, 
> > struct btrfs_path *path,
> > extent_info->file_offset += replace_len;
> > }
> >
> > -   cur_offset = drop_args.drop_end;
> > -
> > ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
> > if (ret)
> > break;
> > @@ -2756,7 +2754,9 @@ int btrfs_replace_file_extents(struct inode *inode, 
> > struct btrfs_path *path,
> > BUG_ON(ret);/* shouldn't happen */
> > trans->block_rsv = rsv;
> >
> > -   if (!extent_info) {
> > +   cur_offset = drop_args.drop_end;
> > +   len = end - cur_offset;
> > +   if (!extent_info && len) {
> > > ret = find_first_non_hole(BTRFS_I(inode), &cur_offset,
> > >   &len);
> > if (unlikely(ret < 0))
> > --
> > 2.7.4
> >
>
>
> --
> Filipe David Manana,
>
> “Whether you think you can, or you think you can't — you're right.”

Thanks,
BingJing Chang


Re: [PATCH] [v3] drm/imx: imx-ldb: fix out of bounds array access warning

2021-03-24 Thread Liu Ying
On Wed, 2021-03-24 at 17:47 +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> When CONFIG_OF is disabled, building with 'make W=1' produces warnings
> about out of bounds array access:
> 
> drivers/gpu/drm/imx/imx-ldb.c: In function 'imx_ldb_set_clock.constprop':
> drivers/gpu/drm/imx/imx-ldb.c:186:8: error: array subscript -22 is below 
> array bounds of 'struct clk *[4]' [-Werror=array-bounds]
> 
> Add an error check before the index is used, which helps with the
> warning, as well as any possible other error condition that may be
> triggered at runtime.
> 
> The warning could be fixed by adding a Kconfig dependency on CONFIG_OF,
> but Liu Ying points out that the driver may hit the out-of-bounds
> problem at runtime anyway.

Almost impossible to hit the out-of-bounds problem at runtime, unless
something wrong happens and makes unexpected parameters(node and/or
encoder) be handed over to drm_of_encoder_active_port_id(). Anyway, an
error check on return value from drm_of_encoder_active_port_id() looks
ok to me.

> 
> Signed-off-by: Arnd Bergmann 

Reviewed-by: Liu Ying 

Thanks,
Liu Ying

> ---
> v3: fix build regression from v2
> v2: fix subject line
> expand patch description
> print mux number
> check upper bound as well
> ---
>  drivers/gpu/drm/imx/imx-ldb.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/gpu/drm/imx/imx-ldb.c b/drivers/gpu/drm/imx/imx-ldb.c
> index dbfe39e2f7f6..565482e2b816 100644
> --- a/drivers/gpu/drm/imx/imx-ldb.c
> +++ b/drivers/gpu/drm/imx/imx-ldb.c
> @@ -197,6 +197,11 @@ static void imx_ldb_encoder_enable(struct drm_encoder 
> *encoder)
>   int dual = ldb->ldb_ctrl & LDB_SPLIT_MODE_EN;
>   int mux = drm_of_encoder_active_port_id(imx_ldb_ch->child, encoder);
>  
> + if (mux < 0 || mux >= ARRAY_SIZE(ldb->clk_sel)) {
> + dev_warn(ldb->dev, "%s: invalid mux %d\n", __func__, mux);
> + return;
> + }
> +
>   drm_panel_prepare(imx_ldb_ch->panel);
>  
>   if (dual) {
> @@ -255,6 +260,11 @@ imx_ldb_encoder_atomic_mode_set(struct drm_encoder 
> *encoder,
>   int mux = drm_of_encoder_active_port_id(imx_ldb_ch->child, encoder);
>   u32 bus_format = imx_ldb_ch->bus_format;
>  
> + if (mux < 0 || mux >= ARRAY_SIZE(ldb->clk_sel)) {
> + dev_warn(ldb->dev, "%s: invalid mux %d\n", __func__, mux);
> + return;
> + }
> +
>   if (mode->clock > 17) {
>   dev_warn(ldb->dev,
>"%s: mode exceeds 170 MHz pixel clock\n", __func__);
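
The pattern the patch adds, validating an index that may carry a negative
error code before using it as an array subscript, can be sketched outside
the driver as follows. Everything here is a hypothetical stand-in:
clk_sel_model[] plays the role of the four-entry clk_sel[] array, and
lookup_mux() imitates a helper that returns -EINVAL when no device-tree
node is available.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the four-entry clk_sel[] array. */
static const int clk_sel_model[4] = { 10, 11, 12, 13 };

/* Hypothetical lookup that may return a negative errno instead of a port. */
int lookup_mux(int have_of_node, int port)
{
	assert(port >= 0);
	return have_of_node ? port : -EINVAL;
}

/* Returns the selected clock value, or -1 if mux is not a valid index. */
int select_clock(int mux)
{
	/* Test the sign first so the cast for the upper bound is safe. */
	if (mux < 0 || (size_t)mux >= ARRAY_SIZE(clk_sel_model))
		return -1;
	return clk_sel_model[mux];
}
```

With the check in place, a failed lookup is rejected instead of being used
as a negative subscript, and an index past the end of the array is
rejected as well.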



[PATCH] include: linux: fs: Remove duplicate declaration

2021-03-24 Thread Wan Jiabing
struct iov_iter has been declared at 66th line. 
Remove the duplicate.

Signed-off-by: Wan Jiabing 
---
 include/linux/fs.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index ec8f3ddf4a6a..7f3cbd47670a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1883,7 +1883,6 @@ struct dir_context {
  */
 #define REMAP_FILE_ADVISORY(REMAP_FILE_CAN_SHORTEN)
 
-struct iov_iter;
 
 struct file_operations {
struct module *owner;
-- 
2.25.1



Re: [PATCH] Revert "f2fs: give a warning only for readonly partition"

2021-03-24 Thread Chao Yu

On 2021/3/25 6:44, Jaegeuk Kim wrote:

On 03/24, Chao Yu wrote:

On 2021/3/24 12:22, Jaegeuk Kim wrote:

On 03/24, Chao Yu wrote:

On 2021/3/24 2:39, Jaegeuk Kim wrote:

On 03/23, Chao Yu wrote:

This reverts commit 938a184265d75ea474f1c6fe1da96a5196163789.

Because that commit fails generic/050 testcase which expect failure
during mount a recoverable readonly partition.


I think we need to change generic/050, since f2fs can recover this partition,


Well, not sure we can change that testcase, since it restricts all generic
filesystems behavior. At least, ext4's behavior makes sense to me:

journal_dev_ro = bdev_read_only(journal->j_dev);
really_read_only = bdev_read_only(sb->s_bdev) | journal_dev_ro;

if (journal_dev_ro && !sb_rdonly(sb)) {
ext4_msg(sb, KERN_ERR,
 "journal device read-only, try mounting with '-o ro'");
err = -EROFS;
goto err_out;
}

if (ext4_has_feature_journal_needs_recovery(sb)) {
if (sb_rdonly(sb)) {
ext4_msg(sb, KERN_INFO, "INFO: recovery "
"required on readonly filesystem");
if (really_read_only) {
ext4_msg(sb, KERN_ERR, "write access "
"unavailable, cannot proceed "
"(try mounting with noload)");
err = -EROFS;
goto err_out;
}
ext4_msg(sb, KERN_INFO, "write access will "
   "be enabled during recovery");
}
}


even though using it as readonly. And, valid checkpoint can allow for user to
read all the data without problem.



if (f2fs_hw_is_readonly(sbi)) {


Since device is readonly now, all write to the device will fail, checkpoint can
not persist recovered data, after page cache is expired, user can see stale 
data.


My point is, after mount with ro, there'll be no data write which preserves the
current status. So, in the next time, we can recover fsync'ed data later, if
user succeeds to mount as rw. Another point is, with the current checkpoint, we
should not have any corrupted metadata. So, why not giving a chance to show what
data remained to user? I think this can be doable only with CoW filesystems.


I guess we're talking about the different things...

Let me declare two different readonly status:

1. filesystem readonly: file system is mount with ro mount option, and
app from userspace can not modify any thing of filesystem, but filesystem
itself can modify data on device since device may be writable.

2. device readonly: device is set to readonly status via 'blockdev --setro'
command, and then filesystem should never issue any write IO to the device.

So, what I mean is, *when device is readonly*, rather than f2fs mountpoint
is readonly (f2fs_hw_is_readonly() returns true as below code, instead of
f2fs_readonly() returns true), in this condition, we should not issue any
write IO to device anyway, because, AFAIK, write IO will fail due to
bio_check_ro() check.


In that case, mount(2) will try readonly, no?


Yes, if device is readonly, mount (2) can not mount/remount device to rw
mountpoint.

Thanks,



# blockdev --setro /dev/vdb
# mount -t f2fs /dev/vdb /mnt/test/
mount: /mnt/test: WARNING: source write-protected, mounted read-only.



if (f2fs_hw_is_readonly(sbi)) {
-   if (!is_set_ckpt_flags(sbi, CP_UMOUNT_FLAG)) {
-   err = -EROFS;
+   if (!is_set_ckpt_flags(sbi, CP_UMOUNT_FLAG))
f2fs_err(sbi, "Need to recover fsync data, but write 
access unavailable");
-   goto free_meta;
-   }
-   f2fs_info(sbi, "write access unavailable, skipping 
recovery");
+   else
+   f2fs_info(sbi, "write access unavailable, skipping 
recovery");
goto reset_checkpoint;
}

For the case of filesystem is readonly and device is writable, it's fine
to do recovery in order to let user to see fsynced data.
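
The mount-time decision being argued for, with this revert applied, can be
summarized in a small sketch. The enum and both functions are hypothetical
names introduced only for illustration; hw_readonly models
f2fs_hw_is_readonly() and clean_umount models the CP_UMOUNT_FLAG test from
the quoted diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum recovery_action {
	TRY_RECOVERY,	/* device writable: fsync'ed data may be replayed */
	SKIP_RECOVERY,	/* device read-only, clean checkpoint: nothing to do */
	FAIL_MOUNT,	/* device read-only but recovery would need writes */
};

/* Decision for a given device state and checkpoint state. */
enum recovery_action mount_action(bool hw_readonly, bool clean_umount)
{
	if (hw_readonly) {
		if (!clean_umount)
			return FAIL_MOUNT;
		return SKIP_RECOVERY;
	}
	return TRY_RECOVERY;
}

/* Maps an action to the error code the mount path would return. */
int action_to_errno(enum recovery_action a)
{
	assert(a == TRY_RECOVERY || a == SKIP_RECOVERY || a == FAIL_MOUNT);
	return a == FAIL_MOUNT ? -EROFS : 0;
}
```

Only the combination of a read-only device and an unclean checkpoint fails
the mount; a read-only mountpoint on a writable device still goes through
recovery.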

Thanks,





Am I missing something?

Thanks,





Fixes: 938a184265d7 ("f2fs: give a warning only for readonly partition")
Signed-off-by: Chao Yu 
---
fs/f2fs/super.c | 8 +---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index b48281642e98..2b78ee11f093 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -3952,10 +3952,12 @@ static int f2fs_fill_super(struct super_block *sb, void 
*data, int silent)
 * previous checkpoint was not done by clean system shutdown.
 */
if (f2fs_hw_is_readonly(sbi)) {
-  

[PATCH v2] btrfs: fix a potential hole-punching failure

2021-03-24 Thread bingjingc
From: BingJing Chang 

In commit d77815461f04 ("btrfs: Avoid trucating page or punching hole
in a already existed hole."), existed holes can be skipped by calling
find_first_non_hole() to adjust *start and *len. However, if the given
len is invalid and large, when an EXTENT_MAP_HOLE extent is found, the
*len will not be set to zero because (em->start + em->len) is less than
(*start + *len). Then the ret will be 1 but the *len will not be set to
0. The propagated non-zero ret will result in fallocate failure.

In the while-loop of btrfs_replace_file_extents(), len is not updated
every time before it calls find_first_non_hole(). That is, after
btrfs_drop_extents() successfully drops the last non-hole file extent,
it may fail with -ENOSPC when attempting to drop a file extent item
representing a hole, and then the problem can happen. After it calls
find_first_non_hole(), the cur_offset will be adjusted to be larger
than or equal to end. However, since the len is not set to zero, the
break-loop condition (ret && !len) will not be met. After it leaves the
while-loop, fallocate will return 1, which is an unexpected return
value.

We're not able to construct a reproducible way to let
btrfs_drop_extents() fail with -ENOSPC after it drops the last non-hole
file extent but with remaining holes left. However, it's quite easy to
fix. We just need to update and check the len every time before we call
find_first_non_hole(). To make the while loop more readable, we also
pull the variable updates to the bottom of loop like this:
while (cur_offset < end) {
...
// update cur_offset & len
// advance cur_offset & len in hole-punching case if needed
}

Reported-by: Robbie Ko 
Fixes: d77815461f04 ("btrfs: Avoid trucating page or punching hole in a
already existed hole.")
Reviewed-by: Robbie Ko 
Reviewed-by: Chung-Chiang Cheng 
Signed-off-by: BingJing Chang 
---
 fs/btrfs/file.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 0e155f0..dccb017 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2735,8 +2735,6 @@ int btrfs_replace_file_extents(struct inode *inode, 
struct btrfs_path *path,
extent_info->file_offset += replace_len;
}
 
-   cur_offset = drop_args.drop_end;
-
ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
if (ret)
break;
@@ -2756,7 +2754,9 @@ int btrfs_replace_file_extents(struct inode *inode, 
struct btrfs_path *path,
BUG_ON(ret);/* shouldn't happen */
trans->block_rsv = rsv;
 
-   if (!extent_info) {
+   cur_offset = drop_args.drop_end;
+   len = end - cur_offset;
+   if (!extent_info && len) {
ret = find_first_non_hole(BTRFS_I(inode), &cur_offset,
  &len);
if (unlikely(ret < 0))
-- 
2.7.4


