RE: [PATCH v6 04/10] scsi: ufshpb: Make eviction depends on region's reads
> On 2021-03-22 16:10, Avri Altman wrote:
> > In host mode, eviction is considered an extreme measure.
> > verify that the entering region has enough reads, and the exiting
> > region has much less reads.
> >
> > Signed-off-by: Avri Altman
> > ---
> >  drivers/scsi/ufs/ufshpb.c | 18 +-
> >  1 file changed, 17 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> > index a1519cbb4ce0..5e757220d66a 100644
> > --- a/drivers/scsi/ufs/ufshpb.c
> > +++ b/drivers/scsi/ufs/ufshpb.c
> > @@ -17,6 +17,7 @@
> >  #include "../sd.h"
> >
> >  #define ACTIVATION_THRESHOLD 8 /* 8 IOs */
> > +#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 5) /* 256 IOs */
> >
> >  /* memory management */
> >  static struct kmem_cache *ufshpb_mctx_cache;
> > @@ -1047,6 +1048,13 @@ static struct ufshpb_region *ufshpb_victim_lru_info(struct ufshpb_lu *hpb)
> >  		if (ufshpb_check_srgns_issue_state(hpb, rgn))
> >  			continue;
> >
> > +		/*
> > +		 * in host control mode, verify that the exiting region
> > +		 * has less reads
> > +		 */
> > +		if (hpb->is_hcm && rgn->reads > (EVICTION_THRESHOLD >> 1))
> > +			continue;
> > +
> >  		victim_rgn = rgn;
> >  		break;
> >  	}
> > @@ -1219,7 +1227,7 @@ static int ufshpb_issue_map_req(struct ufshpb_lu *hpb,
> >
> >  static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
> >  {
> > -	struct ufshpb_region *victim_rgn;
> > +	struct ufshpb_region *victim_rgn = NULL;
> >  	struct victim_select_info *lru_info = &hpb->lru_info;
> >  	unsigned long flags;
> >  	int ret = 0;
> > @@ -1246,7 +1254,15 @@ static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
> >  			 * It is okay to evict the least recently used region,
> >  			 * because the device could detect this region
> >  			 * by not issuing HPB_READ
> > +			 *
> > +			 * in host control mode, verify that the entering
> > +			 * region has enough reads
> >  			 */
> > +			if (hpb->is_hcm && rgn->reads < EVICTION_THRESHOLD) {
> > +				ret = -EACCES;
> > +				goto out;
> > +			}
> > +
>
> I cannot understand the logic behind this. A rgn which host chooses to
> activate, is in INACTIVE state now, if its rgn->reads < 256, then don't
> activate it.
> Could you please elaborate?

I am re-citing the commit log:
"In host mode, eviction is considered an extreme measure.
verify that the entering region has enough reads, and the exiting
region has much less reads."

Here the reads counter comes into play as a comparative index.
The maximum number of active regions has been reached, so to activate a
region you need to evict another one. But the activation threshold is
relatively low; how do you know that you will benefit more from the new
region than from the one you choose to evict?

Rather than arbitrarily evicting the "first" (LRU) region, as in device
mode, we are looking for a solid reason for the new region to enter and
for the existing region to leave. Otherwise, you will find yourself
entering and exiting the same region over and over, just thrashing the
active list and creating unnecessary overhead by repeatedly sending map
requests.

For example, say the entering region has 4 reads, but the LRU region has
200, and its reads keep coming. Is it the "correct" decision to evict a
200-reads region for a 4-reads region? If you do evict this 200-reads
region, you will soon evict another region to put it right back, over and
over.

On the other hand, we are not hanging on to "cold" regions: we inactivate
them if there are no recent reads to that region - see the patch with the
"cold" timeout.

I agree that this can be elaborated into more sophisticated policies,
which we tried. For now, let's go with the simplest one: use thresholds
for both the entering and exiting regions.

Thanks,
Avri

>
> Thanks,
> Can Guo.
>
> >  			victim_rgn = ufshpb_victim_lru_info(hpb);
> >  			if (!victim_rgn) {
> >  				dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
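The two-threshold policy discussed in this message can be sketched outside the driver as follows. This is only an illustration under stated assumptions: `struct region`, `may_enter()`, `may_exit()` and `pick_victim()` are hypothetical names, not ufshpb functions, and the array stands in for the LRU list. The two comparisons mirror the ones in the quoted patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define ACTIVATION_THRESHOLD 8                          /* 8 IOs */
#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 5)  /* 256 IOs */

/* Hypothetical stand-in for struct ufshpb_region: only the read counter. */
struct region {
	unsigned int reads;
};

/* The entering region needs at least EVICTION_THRESHOLD reads. */
static bool may_enter(const struct region *rgn)
{
	return rgn->reads >= EVICTION_THRESHOLD;
}

/* An active region may leave only at or below half the threshold. */
static bool may_exit(const struct region *rgn)
{
	return rgn->reads <= (EVICTION_THRESHOLD >> 1);
}

/*
 * Walk the active regions from least to most recently used and return the
 * index of the first one that may be evicted, or -1 if eviction is refused
 * (entering region too cold, or every active region still too hot).
 */
static int pick_victim(const struct region *entering,
		       const struct region *lru, size_t n)
{
	size_t i;

	if (!may_enter(entering))
		return -1;

	for (i = 0; i < n; i++) {
		if (may_exit(&lru[i]))
			return (int)i;
	}
	return -1;
}
```

With the example from the message, a 4-read region never displaces a 200-read region: `pick_victim()` refuses because the entering region is below the threshold, and even a hot entering region skips the 200-read LRU entry because it is above 128 reads.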
Re: [PATCH v5 0/6] KVM: arm64: Add VLPI migration support on GICv4.1
On 2021/3/25 2:19, Marc Zyngier wrote:
> On Mon, 22 Mar 2021 14:01:52 +0800, Shenming Lu wrote:
>> In GICv4.1, migration has been supported except for (directly-injected)
>> VLPI. And GICv4.1 Spec explicitly gives a way to get the VLPI's pending
>> state (which was crucially missing in GICv4.0). So we make VLPI migration
>> capable on GICv4.1 in this series.
>>
>> In order to support VLPI migration, we need to save and restore all
>> required configuration information and pending states of VLPIs. But
>> in fact, the configuration information of VLPIs has already been saved
>> (or will be reallocated on the dst host...) in vgic(kvm) migration.
>> So we only have to migrate the pending states of VLPIs specially.
>>
>> [...]
>
> Applied to next, thanks!

Thanks a lot again for all the comments and suggestions. :-)

Shenming

>
> [1/6] irqchip/gic-v3-its: Add a cache invalidation right after vPE unmapping
>       commit: 301beaf19739cb6e640ed44e630e7da993f0ecc8
> [2/6] irqchip/gic-v3-its: Drop the setting of PTZ altogether
>       commit: c21bc068cdbe5613d3319ae171c3f2eb9f321352
> [3/6] KVM: arm64: GICv4.1: Add function to get VLPI state
>       commit: 80317fe4a65375fae668672a1398a0fb73eb9023
> [4/6] KVM: arm64: GICv4.1: Try to save VLPI state in save_pending_tables
>       commit: f66b7b151e00427168409f8c1857970e926b1e27
> [5/6] KVM: arm64: GICv4.1: Restore VLPI pending state to physical side
>       commit: 12df7429213abbfa9632ab7db94f629ec309a58b
> [6/6] KVM: arm64: GICv4.1: Give a chance to save VLPI state
>       commit: 8082d50f4817ff6a7e08f4b7e9b18e5f8bfa290d
>
> Cheers,
>
> 	M.
>
[PATCH 2/2] nvmem: qfprom: Add support for fuse blowing on sc7280
Handle the differences across LDO voltage needed for blowing fuses,
and the blow timer value, identified using a minor version of 15
on sc7280.

Signed-off-by: Rajendra Nayak
Signed-off-by: Ravi Kumar Bokka
---
Applies on top of https://lore.kernel.org/patchwork/patch/1376175/

 drivers/nvmem/qfprom.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/nvmem/qfprom.c b/drivers/nvmem/qfprom.c
index 100d69d..d6d3f24 100644
--- a/drivers/nvmem/qfprom.c
+++ b/drivers/nvmem/qfprom.c
@@ -45,11 +45,13 @@ MODULE_PARM_DESC(read_raw_data, "Read raw instead of corrected data");
  * @qfprom_blow_timer_value: The timer value of qfprom when doing efuse blow.
  * @qfprom_blow_set_freq:    The frequency required to set when we start the
  *                           fuse blowing.
+ * @qfprom_blow_uV:          LDO voltage to be set when doing efuse blow
  */
 struct qfprom_soc_data {
 	u32 accel_value;
 	u32 qfprom_blow_timer_value;
 	u32 qfprom_blow_set_freq;
+	int qfprom_blow_uV;
 };
 
 /**
@@ -111,6 +113,15 @@ static const struct qfprom_soc_compatible_data sc7180_qfprom = {
 	.nkeepout = ARRAY_SIZE(sc7180_qfprom_keepout)
 };
 
+static const struct nvmem_keepout sc7280_qfprom_keepout[] = {
+	{.start = 0x128, .end = 0x148},
+	{.start = 0x238, .end = 0x248}
+};
+
+static const struct qfprom_soc_compatible_data sc7280_qfprom = {
+	.keepout = sc7280_qfprom_keepout,
+	.nkeepout = ARRAY_SIZE(sc7280_qfprom_keepout)
+};
 /**
  * qfprom_disable_fuse_blowing() - Undo enabling of fuse blowing.
  * @priv: Our driver data.
@@ -168,6 +179,7 @@ static int qfprom_enable_fuse_blowing(const struct qfprom_priv *priv,
 				      struct qfprom_touched_values *old)
 {
 	int ret;
+	int qfprom_blow_uV = priv->soc_data->qfprom_blow_uV;
 
 	ret = clk_prepare_enable(priv->secclk);
 	if (ret) {
@@ -187,9 +199,9 @@ static int qfprom_enable_fuse_blowing(const struct qfprom_priv *priv,
 	 * a rail shared do don't specify a max--regulator constraints
 	 * will handle.
 	 */
-	ret = regulator_set_voltage(priv->vcc, 1800000, INT_MAX);
+	ret = regulator_set_voltage(priv->vcc, qfprom_blow_uV, INT_MAX);
 	if (ret) {
-		dev_err(priv->dev, "Failed to set 1.8 voltage\n");
+		dev_err(priv->dev, "Failed to set %duV\n", qfprom_blow_uV);
 		goto err_clk_rate_set;
 	}
 
@@ -311,6 +323,14 @@ static const struct qfprom_soc_data qfprom_7_8_data = {
 	.accel_value = 0xD10,
 	.qfprom_blow_timer_value = 25,
 	.qfprom_blow_set_freq = 4800000,
+	.qfprom_blow_uV = 1800000,
+};
+
+static const struct qfprom_soc_data qfprom_7_15_data = {
+	.accel_value = 0xD08,
+	.qfprom_blow_timer_value = 24,
+	.qfprom_blow_set_freq = 4800000,
+	.qfprom_blow_uV = 1900000,
 };
 
 static int qfprom_probe(struct platform_device *pdev)
@@ -379,6 +399,8 @@ static int qfprom_probe(struct platform_device *pdev)
 
 		if (major_version == 7 && minor_version == 8)
 			priv->soc_data = &qfprom_7_8_data;
+		if (major_version == 7 && minor_version == 15)
+			priv->soc_data = &qfprom_7_15_data;
 
 		priv->vcc = devm_regulator_get(&pdev->dev, "vcc");
 		if (IS_ERR(priv->vcc))
@@ -405,6 +427,7 @@ static int qfprom_probe(struct platform_device *pdev)
 static const struct of_device_id qfprom_of_match[] = {
 	{ .compatible = "qcom,qfprom",},
 	{ .compatible = "qcom,sc7180-qfprom", .data = &sc7180_qfprom},
+	{ .compatible = "qcom,sc7280-qfprom", .data = &sc7280_qfprom},
 	{/* sentinel */},
 };
 MODULE_DEVICE_TABLE(of, qfprom_of_match);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
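The version-based selection this patch adds can also be pictured as a table lookup. The sketch below is a standalone illustration, not driver code: `struct soc_data`, `soc_table` and `lookup_soc_data()` are hypothetical names, and the microvolt values assume 1.8 V for version 7.8 (as the old error message suggests) and 1.9 V for version 7.15.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down counterpart of struct qfprom_soc_data. */
struct soc_data {
	unsigned int major;
	unsigned int minor;
	unsigned int blow_timer_value;
	int blow_uV;			/* LDO voltage in microvolts (assumed values) */
};

static const struct soc_data soc_table[] = {
	{ .major = 7, .minor = 8,  .blow_timer_value = 25, .blow_uV = 1800000 },
	{ .major = 7, .minor = 15, .blow_timer_value = 24, .blow_uV = 1900000 },
};

/* Return the entry matching the hardware version, or NULL if unknown. */
static const struct soc_data *lookup_soc_data(unsigned int major,
					      unsigned int minor)
{
	size_t i;

	for (i = 0; i < sizeof(soc_table) / sizeof(soc_table[0]); i++) {
		if (soc_table[i].major == major && soc_table[i].minor == minor)
			return &soc_table[i];
	}
	return NULL;
}
```

A table keeps each new SoC revision to a single added row, whereas the patch as posted adds one more `if` per revision; either form yields the same per-version voltage and timer value.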
[PATCH 1/2] dt-bindings: nvmem: Add SoC compatible for sc7280
Document SoC compatible for sc7280

Signed-off-by: Rajendra Nayak
---
 Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml b/Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml
index 992777c..861b205 100644
--- a/Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml
+++ b/Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml
@@ -24,6 +24,7 @@ properties:
           - qcom,msm8998-qfprom
           - qcom,qcs404-qfprom
           - qcom,sc7180-qfprom
+          - qcom,sc7280-qfprom
           - qcom,sdm845-qfprom
       - const: qcom,qfprom
 
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH v3 9/9] dt-bindings: serial: stm32: add phandle 'bluetooth' to fix dtbs_check warrning
Hi Rob,

Thanks for the suggestion.

On Thu, Mar 25, 2021 at 1:45 AM Rob Herring wrote:
>
> On Fri, Mar 19, 2021 at 07:13:27PM +0800, dillon min wrote:
> > Hi Alexandre,
> >
> > Thanks for the reply.
> >
> > On Fri, Mar 19, 2021 at 4:38 PM Alexandre TORGUE
> > wrote:
> > >
> > > Hi Dillon
> > >
> > > On 3/19/21 5:28 AM, dillon min wrote:
> > > > No changes, Just loop lkp in.
> > > >
> > > >
> > > > Hi lkp,
> > > >
> > > > Sorry for the late reply, thanks for your report.
> > > > This patch is to fix the build warning message.
> > > >
> > > > Thanks.
> > > > Regards
> > > >
> > > > On Mon, Mar 15, 2021 at 5:45 PM wrote:
> > > >>
> > > >> From: dillon min
> > > >>
> > > >> when run make dtbs_check with 'bluetoothi brcm,bcm43438-bt'
> > > >> dts enabled on stm32h7, there is a warrning popup:
> > > >>
> > > >> arch/arm/boot/dts/stm32h750i-art-pi.dt.yaml: serial@40004800: 'bluetooth'
> > > >> does not match any of the regexes: 'pinctrl-[0-9]+'
> > > >>
> > > >> to make dtbs_check happy, so add a phandle bluetooth
> > > >>
> > > >> Fixes: 500cdb23d608 ("ARM: dts: stm32: Add STM32H743 MCU and
> > > >> STM32H743i-EVAL board")
> > > >> Signed-off-by: dillon min
> > > >> Reported-by: kernel test robot
> > > >> ---
> > > >>  Documentation/devicetree/bindings/serial/st,stm32-uart.yaml | 5 +
> > > >>  1 file changed, 5 insertions(+)
> > > >>
> > > >> diff --git a/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml
> > > >> b/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml
> > > >> index 8631678283f9..5e674840e62d 100644
> > > >> --- a/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml
> > > >> +++ b/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml
> > > >> @@ -50,6 +50,11 @@ properties:
> > > >>      minItems: 1
> > > >>      maxItems: 2
> > > >>
> > > >> +  bluetooth:
> > > >> +    type: object
> > > >> +    description: |
> > > >> +      phandles to the usart controller and bluetooth
> > > >> +
> > >
> > > Do we really need to add this "generic" property here ? You could test
> > > without the "AditionalProperties:False".
> > Yes, indeed. we have no reason to add a generic 'bluetooth' property
> > into specific soc's interface yaml.
> > I can't just remove "AditionalProperties:False", else make
> > O=../kernel-art/ dtbs dtbs_check will run into
> >
> > /home/fmin/linux/Documentation/devicetree/bindings/serial/st,stm32-uart.yaml:
> > 'oneOf' conditional failed, one must be fixed:
> > 'unevaluatedProperties' is a required property
> > 'additionalProperties' is a required property
> > ...
> >
> > So , i will replace "AditionalProperties:False". with
> > unevaluatedProperties: false, do you agree with this?
>
> This is okay as long as 'serial.yaml' is referenced, but will eventually
> fail if not (unevaluatedProperties isn't actually implemented yet).
>
> > If so, i will send patch v4 later.
>
> Or you can do this:
>
> addtionalProperties:
>   type: object
>
> Which means any other property has to be a node.
>

Okay, I just tested your patch; it fixes the dtbs_check warning as well.
I will merge it into the next submission, thanks.

Hi Valentin CARON,

Could you help double-check it after I submit v5? Thanks so much.

Regards.
Valent

> Rob
Re: [PATCH] cpufreq: dt: check the error returned by dev_pm_opp_of_cpumask_add_table
Hi Viresh,

On 3/25/21 1:24 PM, Viresh Kumar wrote:
> On 25-03-21, 13:15, quanyang.wang wrote:
> > Thank you for pointing it out. Do you mean that even if
> > dev_pm_opp_of_cpumask_add_table returns an error,
> > dev_pm_opp_get_opp_count may still return count > 0 because someone
> > may call dev_pm_opp_add to add OPP to cpu successfully at somewhere
> > else?
>
> Yes. There are two ways we can add OPPs today:
>
> - Statically via device tree. This is what
>   dev_pm_opp_of_cpumask_add_table() tries to do.
>
> - Dynamically via call to dev_pm_opp_add(), which I described earlier.
>
> What failed here is the static way of adding OPPs, we still need to
> check if OPPs were added dynamically.

Thank you for shedding light on this. I will send a V2 patch that checks
only for the -EPROBE_DEFER return value.

Thanks,
Quanyang
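The outcome of this exchange can be modelled as a small decision function. This is a userspace sketch under stated assumptions, not the V2 patch itself: `early_init_result()` is a hypothetical helper, its two inputs stand in for the return value of dev_pm_opp_of_cpumask_add_table() and for the later OPP count, and `EPROBE_DEFER` is defined locally because it is a kernel-internal errno.

```c
#include <errno.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal value, absent from userspace errno.h */
#endif

/*
 * static_ret: result of adding the static (device tree) OPP table,
 *             0 on success or a negative errno.
 * opp_count:  number of OPPs present afterwards, whether they came from
 *             the device tree or from dynamic dev_pm_opp_add() calls.
 *
 * Returns 0 when initialization may continue, otherwise a negative errno.
 */
static int early_init_result(int static_ret, int opp_count,
			     int *have_static_opps)
{
	*have_static_opps = 0;

	/* The clock is not ready yet: ask for the probe to be retried. */
	if (static_ret == -EPROBE_DEFER)
		return static_ret;

	/* Any other static failure is tolerated; OPPs may exist dynamically. */
	if (!static_ret)
		*have_static_opps = 1;

	/* The table must be populated by now, statically or dynamically. */
	if (opp_count <= 0)
		return -ENODEV;

	return 0;
}
```

Only the deferral is propagated; every other static failure still falls through to the count check, which is what keeps dynamically added OPPs working.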
Re: [PATCH v6 04/10] scsi: ufshpb: Make eviction depends on region's reads
On 2021-03-22 16:10, Avri Altman wrote:
> In host mode, eviction is considered an extreme measure.
> verify that the entering region has enough reads, and the exiting
> region has much less reads.
>
> Signed-off-by: Avri Altman
> ---
>  drivers/scsi/ufs/ufshpb.c | 18 +-
>  1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
> index a1519cbb4ce0..5e757220d66a 100644
> --- a/drivers/scsi/ufs/ufshpb.c
> +++ b/drivers/scsi/ufs/ufshpb.c
> @@ -17,6 +17,7 @@
>  #include "../sd.h"
>
>  #define ACTIVATION_THRESHOLD 8 /* 8 IOs */
> +#define EVICTION_THRESHOLD (ACTIVATION_THRESHOLD << 5) /* 256 IOs */
>
>  /* memory management */
>  static struct kmem_cache *ufshpb_mctx_cache;
> @@ -1047,6 +1048,13 @@ static struct ufshpb_region *ufshpb_victim_lru_info(struct ufshpb_lu *hpb)
>  		if (ufshpb_check_srgns_issue_state(hpb, rgn))
>  			continue;
>
> +		/*
> +		 * in host control mode, verify that the exiting region
> +		 * has less reads
> +		 */
> +		if (hpb->is_hcm && rgn->reads > (EVICTION_THRESHOLD >> 1))
> +			continue;
> +
>  		victim_rgn = rgn;
>  		break;
>  	}
> @@ -1219,7 +1227,7 @@ static int ufshpb_issue_map_req(struct ufshpb_lu *hpb,
>
>  static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>  {
> -	struct ufshpb_region *victim_rgn;
> +	struct ufshpb_region *victim_rgn = NULL;
>  	struct victim_select_info *lru_info = &hpb->lru_info;
>  	unsigned long flags;
>  	int ret = 0;
> @@ -1246,7 +1254,15 @@ static int ufshpb_add_region(struct ufshpb_lu *hpb, struct ufshpb_region *rgn)
>  			 * It is okay to evict the least recently used region,
>  			 * because the device could detect this region
>  			 * by not issuing HPB_READ
> +			 *
> +			 * in host control mode, verify that the entering
> +			 * region has enough reads
>  			 */
> +			if (hpb->is_hcm && rgn->reads < EVICTION_THRESHOLD) {
> +				ret = -EACCES;
> +				goto out;
> +			}
> +

I cannot understand the logic behind this. A rgn which the host chooses to
activate is in INACTIVE state now; if its rgn->reads < 256, then it is not
activated.
Could you please elaborate?

Thanks,
Can Guo.

>  			victim_rgn = ufshpb_victim_lru_info(hpb);
>  			if (!victim_rgn) {
>  				dev_warn(&hpb->sdev_ufs_lu->sdev_dev,
Re: [PATCH 2/2] dt-binding: leds: Document leds-multi-gpio bindings
Hi,

See below.

On 24.3.2021 9.56, Hermes Zhang wrote:
> From: Hermes Zhang
>
> Document the device tree bindings of the multiple GPIOs LED driver
> Documentation/devicetree/bindings/leds/leds-multi-gpio.yaml.
>
> Signed-off-by: Hermes Zhang
> ---
>  .../bindings/leds/leds-multi-gpio.yaml | 50 +++
>  1 file changed, 50 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/leds/leds-multi-gpio.yaml
>
> diff --git a/Documentation/devicetree/bindings/leds/leds-multi-gpio.yaml b/Documentation/devicetree/bindings/leds/leds-multi-gpio.yaml
> new file mode 100644
> index 000000000000..6f2b47487b90
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/leds/leds-multi-gpio.yaml
> @@ -0,0 +1,50 @@
> +# SPDX-License-Identifier: GPL-2.0
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/leds/leds-multi-gpio.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Multiple GPIOs LED driver
> +
> +maintainers:
> +  - Hermes Zhang
> +
> +description:
> +  This will support some LED made of multiple GPIOs and the brightness of the
> +  LED could map to different states of the GPIOs.
> +
> +properties:
> +  compatible:
> +    const: multi-gpio-led
> +
> +  led-gpios:
> +    description: Array of one or more GPIOs pins used to control the LED.
> +    minItems: 1
> +    maxItems: 8  # Should be enough

We also have a case with multi-color LEDs (which is probably more common
than a multi-intensity LED), so I am wondering how the two could co-exist.

From:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/leds/leds-gpio.yaml?h=v5.12-rc4#n58

        led-0 {
            gpios = <_pio 0 GPIO_ACTIVE_LOW>;
            linux,default-trigger = "disk-activity";
            function = LED_FUNCTION_DISK;
        };

Now 'gpios' (in the LED context) and 'led-gpios' are very close to each
other and could easily be confused.

Perhaps this could be something like:

intensity-gpios = ...

or even simplified to just:

gpios = <...>

> +
> +  led-states:
> +    description: |
> +      The array list the supported states here which will map to brightness
> +      from 0 to maximum. Each item in the array will present all the GPIOs
> +      value by bit.
> +    $ref: /schemas/types.yaml#/definitions/uint8-array
> +    minItems: 1
> +    maxItems: 16  # Should be enough
> +
> +required:
> +  - compatible
> +  - led-gpios
> +  - led-states
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +    gpios-led {
> +      compatible = "multi-gpio-led";
> +
> +      led-gpios = < 23 0x1>,
> +                  < 24 0x1>;
> +      led-states = /bits/ 8 <0x00 0x01 0x02 0x03>;
> +    };
> +...

From:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/leds/leds-lp55xx.yaml?h=v5.12-rc4#n196

There is an example of a multi-color LED configuration.

In the examples below I used a two-color LED with red and green (which is
what we seem to have most in use).

If we then try to combine these, we get something like:

# Multi color LED with single GPIO line per color

multi-led@2 {
  compatible = "gpio-leds";
  color = ;
  led@0 {
    color = ;
    gpios = <_pio 0 GPIO_ACTIVE_LOW>;
  };
  led@1 {
    color = ;
    gpios = <_pio 1 GPIO_ACTIVE_LOW>;
  };
};

# And with intensity GPIOs:

multi-led@2 {
  compatible = "gpio-leds";
  color = ;
  led@0 {
    color = ;
    gpios = < 23 0x1>,
            < 24 0x1>;
    ... see below
  };
  led@1 {
    color = ;
    gpios = < 25 0x1>,
            < 26 0x1>;
    ... see below
  };
};

# And then single GPIO with intensity GPIOs:

led@2 {
  compatible = "gpio-leds";
  gpios = < 23 0x1>,
          < 24 0x1>;
  gpios-brightness-levels = <0 1 2 3>;
};

I changed 'led-states' to 'gpios-brightness-levels' because it makes
clearer that this is about brightness and not some other state
information.

How would this sound?

Thanks,
Vesa Jääskeläinen
Re: [PATCH v1 3/3] KEYS: trusted: Introduce support for NXP CAAM-based trusted keys
On Wed, 24 Mar 2021 at 19:37, Ahmad Fatoum wrote: > > Hello Sumit, > > On 24.03.21 11:47, Sumit Garg wrote: > > On Wed, 24 Mar 2021 at 14:56, Ahmad Fatoum wrote: > >> > >> Hello Mimi, > >> > >> On 23.03.21 19:07, Mimi Zohar wrote: > >>> On Tue, 2021-03-23 at 17:35 +0100, Ahmad Fatoum wrote: > On 21.03.21 21:48, Horia Geantă wrote: > > caam has random number generation capabilities, so it's worth using that > > by implementing .get_random. > > If the CAAM HWRNG is already seeding the kernel RNG, why not use the > kernel's? > > Makes for less code duplication IMO. > >>> > >>> Using kernel RNG, in general, for trusted keys has been discussed > >>> before. Please refer to Dave Safford's detailed explanation for not > >>> using it [1]. > >> > >> The argument seems to boil down to: > >> > >> - TPM RNG are known to be of good quality > >> - Trusted keys always used it so far > >> > >> Both are fine by me for TPMs, but the CAAM backend is new code and neither > >> point > >> really applies. > >> > >> get_random_bytes_wait is already used for generating key material > >> elsewhere. > >> Why shouldn't new trusted key backends be able to do the same thing? > >> > > > > Please refer to documented trusted keys behaviour here [1]. New > > trusted key backends should align to this behaviour and in your case > > CAAM offers HWRNG so we should be better using that. > > Why is it better? > > Can you explain what benefit a CAAM user would have if the trusted key > randomness comes directly out of the CAAM instead of indirectly from > the kernel entropy pool that is seeded by it? IMO, user trust in case of trusted keys comes from trusted keys backend which is CAAM here. If a user doesn't trust that CAAM would act as a reliable source for RNG then CAAM shouldn't be used as a trust source in the first place. And I think building user's trust for kernel RNG implementation with multiple entropy contributions is pretty difficult when compared with CAAM HWRNG implementation. 
-Sumit > > > Also, do update documentation corresponding to CAAM as a trusted keys > > backend. > > Yes. The documentation should be updated for CAAM and it should describe > how the key material is derived. Will do so for v2. > > Cheers, > Ahmad > > > > > [1] > > https://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd.git/tree/Documentation/security/keys/trusted-encrypted.rst#n87 > > > > -Sumit > > > >> Cheers, > >> Ahmad > >> > >>> > >>> thanks, > >>> > >>> Mimi > >>> > >>> [1] > >>> https://lore.kernel.org/linux-integrity/bca04d5d9a3b764c9b7405bba4d4a3c035f2a...@alpmbapa12.e2k.ad.ge.com/ > >>> > >>> > >>> > >> > >> -- > >> Pengutronix e.K. | | > >> Steuerwalder Str. 21 | http://www.pengutronix.de/ | > >> 31137 Hildesheim, Germany | Phone: +49-5121-206917-0| > >> Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | > > > > -- > Pengutronix e.K. | | > Steuerwalder Str. 21 | http://www.pengutronix.de/ | > 31137 Hildesheim, Germany | Phone: +49-5121-206917-0| > Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- |
Re: [PATCH 2/5] cifsd: add server-side procedures for SMB3
On 23.03.2021 at 08:19, Dan Carpenter wrote:
> On Tue, Mar 23, 2021 at 08:17:47AM +0900, Namjae Jeon wrote:
> > > > +
> > > > +static int
> > > > +compare_oid(unsigned long *oid1, unsigned int oid1len,
> > > > +	    unsigned long *oid2, unsigned int oid2len)
> > > > +{
> > > > +	unsigned int i;
> > > > +
> > > > +	if (oid1len != oid2len)
> > > > +		return 0;
> > > > +
> > > > +	for (i = 0; i < oid1len; i++) {
> > > > +		if (oid1[i] != oid2[i])
> > > > +			return 0;
> > > > +	}
> > > > +	return 1;
> > > > +}
> > >
> > > Call this oid_eq()?
> > Why not compare_oid()? This code is come from cifs.
> > I need clear reason to change both cifs/cifsd...
> >
>
> Boolean functions should tell you what they are testing in the name.
> Without any context you can't know what if (compare_oid(one, two)) {
> means, but if (oid_equal(one, two)) { is readable.
>
> regards,
> dan carpenter
>

Just a minor comment, but

	return !memcmp(oid1, oid2, sizeof(long*) * oid1len);

looks much more efficient than this "for" loop.
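Both suggestions in this thread, the boolean-style name and the memcmp() comparison, can be combined in a short sketch. It is an illustration only, not the code that was merged; one deliberate difference from the one-liner quoted above is that the byte count uses sizeof(*oid1), the element size, because sizeof(long *) matches it only on ABIs where a pointer and an unsigned long have the same width.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true when both OIDs have the same length and the same arcs.
 * The length check must come first so that memcmp() never reads past
 * the end of the shorter array.
 */
static bool oid_equal(const unsigned long *oid1, unsigned int oid1len,
		      const unsigned long *oid2, unsigned int oid2len)
{
	if (oid1len != oid2len)
		return false;

	/* sizeof(*oid1) is the size of one arc, independent of pointer width. */
	return memcmp(oid1, oid2, sizeof(*oid1) * oid1len) == 0;
}
```

A call site then reads as `if (oid_equal(one, two))`, which is the readability point made in the review.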
Re: [PATCH] cpufreq: dt: check the error returned by dev_pm_opp_of_cpumask_add_table
On 25-03-21, 13:15, quanyang.wang wrote:
> Thank you for pointing it out. Do you mean that even if
> dev_pm_opp_of_cpumask_add_table returns an error,
> dev_pm_opp_get_opp_count may still return count > 0 because someone
> may call dev_pm_opp_add to add OPP to cpu successfully at somewhere
> else?

Yes. There are two ways we can add OPPs today:

- Statically via device tree. This is what
  dev_pm_opp_of_cpumask_add_table() tries to do.

- Dynamically via call to dev_pm_opp_add(), which I described earlier.

What failed here is the static way of adding OPPs, we still need to
check if OPPs were added dynamically.

-- 
viresh
[no subject]
Hello my dear, I would like to know whether you have received my earlier message, thank you.
Re: [PATCH] arm64: dts: qcom: sc7280: Add PMIC peripherals for SC7280
Hi Matthias,

On 2021-03-22 23:04, Matthias Kaehlcke wrote:
> Hi Satya,
>
> On Mon, Mar 22, 2021 at 06:50:47PM +0530, ska...@codeaurora.org wrote:
> > Hi Matthias,
> >
> > On 2021-03-13 02:10, Matthias Kaehlcke wrote:
> > > Hi Satya,
> > >
> > > On Thu, Mar 11, 2021 at 04:10:29PM +0530, satya priya wrote:
> > > > Add PM7325/PM8350C/PMK8350/PMR735A peripherals such as PON,
> > > > GPIOs, RTC and other PMIC infra modules for SC7280.
> > > >
> > > > Signed-off-by: satya priya
> > > > ---
> > > > This patch depends on base DT and board files for SC7280 to merge
> > > > first
> > > > https://lore.kernel.org/patchwork/project/lkml/list/?series=487403
> > > >
> > > >  arch/arm64/boot/dts/qcom/pm7325.dtsi  | 60
> > > >  arch/arm64/boot/dts/qcom/pm8350c.dtsi | 60
> > > >  arch/arm64/boot/dts/qcom/pmk8350.dtsi | 104 ++
> > > >  arch/arm64/boot/dts/qcom/pmr735a.dtsi | 60
> > > >  arch/arm64/boot/dts/qcom/sc7280.dtsi  | 8 +++
> > > >  5 files changed, 292 insertions(+)
> > > >  create mode 100644 arch/arm64/boot/dts/qcom/pm7325.dtsi
> > > >  create mode 100644 arch/arm64/boot/dts/qcom/pm8350c.dtsi
> > > >  create mode 100644 arch/arm64/boot/dts/qcom/pmk8350.dtsi
> > > >  create mode 100644 arch/arm64/boot/dts/qcom/pmr735a.dtsi
> > > >
> > > > diff --git a/arch/arm64/boot/dts/qcom/pm7325.dtsi
> > > > b/arch/arm64/boot/dts/qcom/pm7325.dtsi
> > > > new file mode 100644
> > > > index 0000000..393b256
> > > > --- /dev/null
> > > > +++ b/arch/arm64/boot/dts/qcom/pm7325.dtsi
> > > > @@ -0,0 +1,60 @@
> > >
> > > ...
> > >
> > > > +		polling-delay-passive = <100>;
> > > > +		polling-delay = <0>;
> > >
> > > Are you sure that no polling delay is needed? How does the thermal
> > > framework
> > > detect that the temperatures is >= the passive trip point and that it
> > > should
> > > start polling at 'polling-delay-passive' rate?
> > >
> >
> > As the temp-alarm has interrupt support, whenever preconfigured
> > threshold violates it notifies thermal framework, so I think the
> > polling delay is not needed here.
>
> From the documentation I found it's not clear to me how exactly these
> interrupts work. Is a single interrupt triggered when the threshold is
> violated or are there periodic (?) interrupts as long as the temperature
> is above the stage 0 threshold?
>
> Why is 'polling-delay-passive' passive needed if there are interrupts?
> Maybe to detect that the zone should transition from passive to no
> cooling when the temperature drops below the stage 0 threshold?

The PMIC TEMP_ALARM peripheral maintains an internal over-temperature
stage: 0, 1, 2, or 3. Stage 0 is normal operation below the lowest
(stage 1) threshold [usually 95 C]. When in stage 1, the temperature is
between the stage 1 and 2 thresholds [stage 2 threshold is usually 115 C].
Upon hitting the stage 3 threshold [usually 145 C], the PMIC hardware will
automatically shut down the system.

The TEMP_ALARM IRQ fires on stage 0 -> 1 and 1 -> 0 transitions. We
therefore set polling-delay = <0>, since there is no need for software to
monitor the temperature periodically when operating in stage 0. Upon
crossing the stage 1 threshold, SW receives the IRQ and the thermal
framework hits its first trip, changing the thermal zone to passive mode.
This then engages the 100 ms polling enabled via
polling-delay-passive = <100>. If the temperature keeps climbing and
passes the stage 2 threshold, the thermal framework hits the second trip
(which is critical) and initiates an orderly shutdown. If the temperature
drops below the stage 1 threshold, the thermal framework exits passive
mode and stops polling. This approach reduces/eliminates the software
overhead when not at an elevated temperature.

Thanks,
Satya Priya
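The stage logic described in this reply can be modelled in a few lines. This is only a sketch under stated assumptions: the thresholds are the "usual" values quoted in the message, treating each threshold as inclusive is a guess, and `temp_to_stage()`, `irq_fires()` and `polling_interval_ms()` are hypothetical helpers rather than thermal framework or driver functions.

```c
#include <stdbool.h>

/* "Usual" thresholds quoted in the message, in degrees Celsius. */
#define STAGE1_THRESHOLD_C 95
#define STAGE2_THRESHOLD_C 115
#define STAGE3_THRESHOLD_C 145

/* Map a temperature to the over-temperature stage (0..3). */
static int temp_to_stage(int temp_c)
{
	if (temp_c >= STAGE3_THRESHOLD_C)
		return 3;	/* hardware shuts the system down */
	if (temp_c >= STAGE2_THRESHOLD_C)
		return 2;	/* critical trip: orderly shutdown */
	if (temp_c >= STAGE1_THRESHOLD_C)
		return 1;	/* passive trip */
	return 0;		/* normal operation */
}

/*
 * The message names only the 0 -> 1 and 1 -> 0 transitions as IRQ
 * sources; other transitions are not described there and are left
 * out of this model.
 */
static bool irq_fires(int old_stage, int new_stage)
{
	return (old_stage == 0 && new_stage == 1) ||
	       (old_stage == 1 && new_stage == 0);
}

/*
 * Polling interval in milliseconds: 0 (no polling) in stage 0, matching
 * polling-delay = <0>, and 100 ms once the zone is in passive mode,
 * matching polling-delay-passive = <100>.
 */
static unsigned int polling_interval_ms(int stage)
{
	return stage == 0 ? 0 : 100;
}
```

In this model software does nothing until the 0 -> 1 interrupt arrives, then polls every 100 ms until the temperature falls back below the stage 1 threshold.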
Re: [PATCH] fs: Improve eventpoll logging to stop indicting timerfd
Hi Thomas,

On Mon, Mar 22, 2021 at 2:40 PM Thomas Gleixner wrote:
>
> Manish,
>
> On Mon, Mar 22 2021 at 10:15, Manish Varma wrote:
> > On Thu, Mar 18, 2021 at 6:04 AM Thomas Gleixner wrote:
> >> > +static atomic_t instance_count = ATOMIC_INIT(0);
> >>
> >> instance_count is misleading as it does not do any accounting of
> >> instances as the name suggests.
> >>
> >
> > Not sure if I am missing a broader point here, but the objective of this
> > patch is to:
> > A. To help find the process a given timerfd associated with, and
> > B. one step further, if there are multiple fds created by a single
> > process then label each instance using monotonically increasing integer
> > i.e. "instance_count" to help identify each of them separately.
> >
> > So, instance_count in my mind helps with "B", i.e. to keep track and
> > identify each instance of timerfd individually.
>
> I know what you want to do. The point is that instance_count is the
> wrong name as it suggests that it counts instances, and that in most
> cases implies active instances.
>
> It's not a counter, it's a token generator which allows you to create
> unique ids. The fact that it is just incrementing once per created file
> descriptor does not matter. That's just an implementation detail.
>
> Name it something like timerfd_create_id or timerfd_session_id which
> clearly tells that this is not counting any thing. It immediately tells
> the purpose of generating an id.
>
> Naming matters when reading code, really.
>

Noted, and thanks for the clarification!

> >> > +	snprintf(file_name_buf, sizeof(file_name_buf), "[timerfd%d:%s]",
> >> > +		 instance, task_comm_buf);
> >> > +	ufd = anon_inode_getfd(file_name_buf, &timerfd_fops, ctx,
> >> >  			       O_RDWR | (flags & TFD_SHARED_FCNTL_FLAGS));
> >> >  	if (ufd < 0)
> >> >  		kfree(ctx);
> >>
> >> I actually wonder, whether this should be part of anon_inode_get*().
> >>
> >
> > I am curious (and open at the same time) if that will be helpful..
> > In the case of timerfd, I could see it adds up value by stuffing more
> > context to the file descriptor name as eventpoll is using the same file
> > descriptor names as wakesource name, and hence the cost of slightly
> > longer file descriptor name justifies. But I don't have a solid reason
> > if this additional cost (of longer file descriptor names) will be
> > helpful in general with other file descriptors.
>
> Obviously you want to make that depend on a flag handed to anon_...().

Unfortunately, changing file descriptor names does not seem to be a
viable option here (more details in my answer in the next section), and
hence changes in anon_...() do not seem to be required.

>
> The point is that there will be the next anonfd usecase which needs
> unique identification at some point. That is going to copy that
> timerfd code and then make it slightly different just because and then
> userspace needs to parse yet another format.
>
> >> Aside of that this is a user space visible change both for eventpoll and
> >> timerfd.
>
> Not when done right.
>
> >> Have you carefully investigated whether there is existing user space
> >> which might depend on the existing naming conventions?
> >>
> > I am not sure how I can confirm that for all userspace, but open for
> > suggestions if you can share some ideas.
> >
> > However, I have verified and can confirm for Android userspace that
> > there is no dependency on existing naming conventions for timerfd and
> > eventpoll wakesource names, if that helps.
>
> Well, there is a world outside Android and you're working for a company
> which should have tools to search for '[timerfd]' usage in a gazillion of
> projects. The obvious primary targets are distros of all sorts. I'm sure
> there are ways to figure this out without doing it manually.
>
> Not that I expect any real dependencies on it, but as always the devil
> is in the details.
>

Right, there is some userspace that depends on the "[timerfd]" string:
https://codesearch.debian.net/search?q=%22%5Btimerfd%5D%22&literal=1

So, modifying file descriptor names, at least for timerfd, will
definitely break those.

With that said, I am now thinking about leaving the file descriptor names
as they are and instead adding the extra information about the associated
process (i.e. the process name, or rather the PID of the process), along
with a token ID, directly to the wakesource name at the time a new
wakesource is created, i.e. in ep_create_wakeup_source().

So, the wakesources that are currently named "[timerfd]" will be named
something like:
"epollitem:.[timerfd]"

Where N is the number of wakesources created since boot.

This way we can still associate the process with the wakesource name and
also distinguish multiple instances of wakesources using the integer
identifier.

Please share your thoughts!

> Thanks,
>
> tglx

Thanks,
Manish

-- 
Manish Varma | Software Engineer | var...@google.com | 650-686-0858
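The naming idea at the end of this message could look roughly like the userspace sketch below. Everything here is an assumption made for illustration: the exact layout "epollitem<N>:<PID>.<fd name>" is a guess at the proposed format, and `wakesource_create_id` and `format_wakesource_name()` are hypothetical names, not eventpoll code. The counter is used as an id generator, in the sense discussed earlier in the thread, not as a count of live instances.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Id generator: incremented once per created wakesource, never decremented. */
static atomic_uint wakesource_create_id;

/*
 * Build a wakesource name from a fresh id, the owning PID, and the
 * unchanged file descriptor name (for example "[timerfd]").
 * Returns the snprintf() result, i.e. the length the full name needs.
 */
static int format_wakesource_name(char *buf, size_t len, int pid,
				  const char *fd_name)
{
	/* atomic_fetch_add() returns the previous value, so ids start at 1. */
	unsigned int id = atomic_fetch_add(&wakesource_create_id, 1) + 1;

	return snprintf(buf, len, "epollitem%u:%d.%s", id, pid, fd_name);
}
```

Because the file descriptor name itself is passed through untouched, userspace that matches on "[timerfd]" keeps working, and only the wakesource name gains the process and id information.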
Re: [PATCH] cpufreq: dt: check the error returned by dev_pm_opp_of_cpumask_add_table
Hi Viresh, On 3/25/21 12:45 PM, Viresh Kumar wrote: On 25-03-21, 12:31, quanyang.w...@windriver.com wrote: From: Quanyang Wang The function dev_pm_opp_of_cpumask_add_table may return zero or an error. When it returns an error, this means that no OPP table is added for the cpumask because _dev_pm_opp_cpumask_remove_table is called to free all OPPs associated with the cpu devices in the error label "remove_table". So continuing to run the next function dev_pm_opp_get_opp_count is meaningless since it always return the count value as 0. There is another reason why we should check the error returned by dev_pm_opp_of_cpumask_add_table is that it may return -EPROBE_DEFER which comes from clk_get(dev, NULL) in _update_opp_table_clk. When the clk for cpu device isn't ready, dt_cpufreq_probe should be deferred and wait to be called again. But if we ignore the return error of dev_pm_opp_of_cpumask_add_table, dt_cpufreq_probe will return -ENODEV because dev_pm_opp_get_opp_count returns the count value as 0, the cpufreq-dt driver will fail with the error log as below: [0.724069] cpu cpu0: OPP table can't be empty Signed-off-by: Quanyang Wang --- drivers/cpufreq/cpufreq-dt.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c index b1e1bdc63b01..f24359f47b1a 100644 --- a/drivers/cpufreq/cpufreq-dt.c +++ b/drivers/cpufreq/cpufreq-dt.c @@ -255,10 +255,16 @@ static int dt_cpufreq_early_init(struct device *dev, int cpu) * before updating priv->cpus. Otherwise, we will end up creating * duplicate OPPs for the CPUs. * -* OPPs might be populated at runtime, don't check for error here. As the comment (which you removed) clearly says, the OPPs maybe added at runtime, don't check for error here. When we say runtime, we mean someone may have called dev_pm_opp_add() for the devices. Thank you for pointing it out. 
Do you mean that even if dev_pm_opp_of_cpumask_add_table returns an error, dev_pm_opp_get_opp_count may still return a count > 0, because someone may have successfully called dev_pm_opp_add somewhere else to add OPPs for the CPU?

Thanks,
Quanyang

+	 * We need check the return value here, if it is non-zero, there is
+	 * need to go on.
 	 */
-	if (!dev_pm_opp_of_cpumask_add_table(priv->cpus))
-		priv->have_static_opps = true;
+	ret = dev_pm_opp_of_cpumask_add_table(priv->cpus);
+	if (ret) {
+		dev_err(cpu_dev, "Failed to add OPP table for CPUs\n");
+		goto out;
+	}
+
+	priv->have_static_opps = true;

 	/*
 	 * The OPP table must be initialized, statically or dynamically, by this
Re: [PATCH v4] audit: log nftables configuration change events once per table
Hi Richard, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on nf/master] [also build test WARNING on nf-next/master pcmoore-audit/next v5.12-rc4 next-20210324] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Richard-Guy-Briggs/audit-log-nftables-configuration-change-events-once-per-table/20210325-115438 base: https://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master config: arc-allyesconfig (attached as .config) compiler: arceb-elf-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/e2632994acb2553a22a739b3a876a091d04f446c git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Richard-Guy-Briggs/audit-log-nftables-configuration-change-events-once-per-table/20210325-115438 git checkout e2632994acb2553a22a739b3a876a091d04f446c # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arc If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All warnings (new ones prefixed by >>): >> net/netfilter/nf_tables_api.c:7993:5: warning: no previous prototype for >> 'nf_tables_commit_audit_alloc' [-Wmissing-prototypes] 7993 | int nf_tables_commit_audit_alloc(struct list_head *adl, | ^~~~ >> net/netfilter/nf_tables_api.c:8011:6: warning: no previous prototype for >> 'nf_tables_commit_audit_collect' [-Wmissing-prototypes] 8011 | void nf_tables_commit_audit_collect(struct list_head *adl, | ^~ >> net/netfilter/nf_tables_api.c:8030:6: warning: no previous prototype for >> 'nf_tables_commit_audit_log' [-Wmissing-prototypes] 8030 | void 
nf_tables_commit_audit_log(struct list_head *adl, u32 generation)
        |      ^~

vim +/nf_tables_commit_audit_alloc +7993 net/netfilter/nf_tables_api.c

  7992
> 7993	int nf_tables_commit_audit_alloc(struct list_head *adl,
  7994					 struct nft_table *table)
  7995	{
  7996		struct nft_audit_data *adp;
  7997
  7998		list_for_each_entry(adp, adl, list) {
  7999			if (adp->table == table)
  8000				return 0;
  8001		}
  8002		adp = kzalloc(sizeof(*adp), GFP_KERNEL);
  8003		if (!adp)
  8004			return -ENOMEM;
  8005		adp->table = table;
  8006		INIT_LIST_HEAD(&adp->list);
  8007		list_add(&adp->list, adl);
  8008		return 0;
  8009	}
  8010
> 8011	void nf_tables_commit_audit_collect(struct list_head *adl,
  8012					    struct nft_table *table, u32 op)
  8013	{
  8014		struct nft_audit_data *adp;
  8015
  8016		list_for_each_entry(adp, adl, list) {
  8017			if (adp->table == table)
  8018				goto found;
  8019		}
  8020		WARN_ONCE("table=%s not expected in commit list", table->name);
  8021		return;
  8022	found:
  8023		adp->entries++;
  8024		if (!adp->op || adp->op > op)
  8025			adp->op = op;
  8026	}
  8027
  8028	#define AUNFTABLENAMELEN (NFT_TABLE_MAXNAMELEN + 22)
  8029
> 8030	void nf_tables_commit_audit_log(struct list_head *adl, u32 generation)
  8031	{
  8032		struct nft_audit_data *adp, *adn;
  8033		char aubuf[AUNFTABLENAMELEN];
  8034
  8035		list_for_each_entry_safe(adp, adn, adl, list) {
  8036			snprintf(aubuf, AUNFTABLENAMELEN, "%s:%u", adp->table->name,
  8037				 generation);
  8038			audit_log_nfcfg(aubuf, adp->table->family, adp->entries,
  8039					nft2audit_op[adp->op], GFP_KERNEL);
  8040			list_del(&adp->list);
  8041			kfree(adp);
  8042		}
  8043	}
  8044

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

.config.gz
Description: application/gzip
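The three warnings in this report come from non-static functions that have no prototype in any header. A usual way to silence -Wmissing-prototypes for helpers used in a single file is to give them internal linkage with "static". The sketch below is a user-space, array-based stand-in for the once-per-table bookkeeping; all names (audit_list, audit_alloc, audit_collect, MAX_TABLES) are hypothetical and it is not the netfilter code.

```c
#include <stddef.h>

#define MAX_TABLES 8	/* arbitrary bound for this sketch */

struct audit_data {
	const char *table;	/* stand-in for struct nft_table * */
	unsigned int entries;
	unsigned int op;
};

struct audit_list {
	struct audit_data item[MAX_TABLES];
	size_t count;
};

/* "static" gives internal linkage, so no separate prototype is needed. */
static struct audit_data *audit_find(struct audit_list *l, const char *table)
{
	for (size_t i = 0; i < l->count; i++)
		if (l->item[i].table == table)
			return &l->item[i];
	return NULL;
}

/* Register a table once; repeated calls for the same table are no-ops. */
static int audit_alloc(struct audit_list *l, const char *table)
{
	if (audit_find(l, table))
		return 0;
	if (l->count == MAX_TABLES)
		return -1;
	l->item[l->count++] = (struct audit_data){ .table = table };
	return 0;
}

/* Count one change and remember the lowest non-zero op seen so far. */
static int audit_collect(struct audit_list *l, const char *table,
			 unsigned int op)
{
	struct audit_data *d = audit_find(l, table);

	if (!d)
		return -1;	/* table was never registered */
	d->entries++;
	if (!d->op || d->op > op)
		d->op = op;
	return 0;
}
```

If the real functions are needed outside nf_tables_api.c, the alternative is to declare them in a shared header instead of making them static.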
Re: [PATCH v4 14/22] x86/fpu/xstate: Expand the xstate buffer on the first use of dynamic user state
On 3/25/2021 5:09 AM, Len Brown wrote: On Tue, Mar 23, 2021 at 11:15 PM Liu, Jing2 wrote: IMO, the problem with AVX512 state is that we guaranteed it will be zero for XINUSE=0. That means we have to write 0's on saves. why "we have to write 0's on saves" when XINUSE=0. Since due to SDM, if XINUSE=0, the XSAVES will *not* save the data and xstate_bv bit is 0; if use XSAVE, it need save the state but xstate_bv bit is also 0. It would be better to be able to skip the write -- even if we can't save the space we can save the data transfer. (This is what we did for AMX). With XFD feature that XFD=1, XSAVE command still has to save INIT state to the area. So it seems with XINUSE=0 and XFD=1, the XSAVE(S) commands do the same that both can help save the data transfer. Hi Jing, Good observation! There are 3 cases. Hi Len, thanks for your reply. 1. Task context switch save into the context switch buffer. Here we use XSAVES, and as you point out, XSAVES includes the compaction optimization feature tracked by XINUSE. So when AMX is enabled, but clean, XSAVES doesn't write zeros. Further, it omits the buffer space for AMX in the destination altogether! However, since XINUSE=1 is possible, we have to *allocate* a buffer large enough to handle the dirty data for when XSAVES can not employ that optimization. Yes, I agree with you about the first case. 2. Entry into user signal handler saves into the user space sigframe. Here we use XSAVE, and so the hardware will write zeros for XINUSE=0, and for AVX512, we save neither time or space. My understanding that for application compatibility, we can *not* compact the destination buffer that user-space sees. This is because existing code may have adopted fixed size offsets. (which is unfortunate). And so, for AVX512, we both reserve the space, and we write zeros for clean AVX512 state. By XSAVE, I think this is true if we assume setting EDX:EAX AVX512 bits as 1, which means XSAVE will write zeros when XINUSE=0. 
Is this the same assumption as yours?...

For AMX, we must still reserve the space, but we are not going to write zeros for clean state. We do this in software by checking XINUSE=0, and clearing the xstate_bv for the XSAVE. As a result, for XINUSE=0, we can skip writing the zeros, even though we can't compress the space.

So my understanding is that clearing xstate_bv will not prevent the zeros from being saved; only clearing the corresponding bits in EDX:EAX will, given the following logic. Not sure if this is just what you mean. :)

RFBM ← XCR0 AND EDX:EAX; /* bitwise logical AND */
OLD_BV ← XSTATE_BV field from XSAVE header;
...
FOR i ← 2 TO 62
    IF RFBM[i] = 1
        THEN save XSAVE state component i at offset n from base of XSAVE area;
    FI;
ENDFOR;
XSTATE_BV field in XSAVE header ← (OLD_BV AND NOT RFBM) OR (XINUSE AND RFBM);

3. user space always uses fully uncompacted XSAVE buffers.

The reason I'm interested in the XINUSE denotation is that it might be helpful for the XFD MSRs context switch cost during vmexit and vmenter.

As the guest OS may be using XFD, the VMM can not use it for itself. Rather, the VMM must context switch it when it switches between guests. (or not expose it to guests at all)

My understanding is that KVM cannot assume whether userspace qemu uses XFD or not, so KVM needs to context switch XFD between vcpu threads on vmexit/vmenter. That's why I am thinking about detecting XINUSE on vmexit; otherwise, a wrongly armed IA32_XFD will impact XSAVES/XRSTORS and cause guest AMX state to be lost.

Thanks,
Jing

cheers,
-Len

cheers,
Len Brown, Intel Open Source Technology Center
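The quoted XSAVE pseudocode can be turned into a small bit-level model, which makes the argument easier to check: what gets written depends only on RFBM, while XINUSE only affects the XSTATE_BV header field. This is a sketch of the quoted logic for components 2..62 only; the struct and function names are made up for illustration, and XSAVEOPT/XSAVES behaviour is deliberately not modeled.

```c
#include <stdint.h>

/* Bit i selects state component i. */
#define COMP(i) (UINT64_C(1) << (i))

struct xsave_result {
	uint64_t written;	/* components stored to the XSAVE area */
	uint64_t xstate_bv;	/* header field after the instruction */
};

/* Model of the quoted pseudocode for plain XSAVE, components 2..62. */
static struct xsave_result xsave_model(uint64_t xcr0, uint64_t edx_eax,
				       uint64_t xinuse, uint64_t old_bv)
{
	const uint64_t loop_mask = ((UINT64_C(1) << 63) - 1) & ~UINT64_C(3);
	uint64_t rfbm = xcr0 & edx_eax;	/* requested-feature bitmap */
	struct xsave_result r;

	/* A component is saved whenever RFBM[i] = 1, regardless of XINUSE. */
	r.written = rfbm & loop_mask;
	/* XINUSE only decides which header bits end up set. */
	r.xstate_bv = (old_bv & ~rfbm) | (xinuse & rfbm);
	return r;
}
```

With a clean component requested in EDX:EAX, the model still reports it as written (zeros in the real hardware case), while its XSTATE_BV bit ends up clear; dropping the bit from EDX:EAX is what removes the write.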
[PATCH] usb: typec: Fix a typo
s/Acknowlege/Acknowledge/ Signed-off-by: Bhaskar Chowdhury --- drivers/usb/typec/ucsi/ucsi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c index 244270755ae6..282c3c825c13 100644 --- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -63,7 +63,7 @@ static int ucsi_read_error(struct ucsi *ucsi) u16 error; int ret; - /* Acknowlege the command that failed */ + /* Acknowledge the command that failed */ ret = ucsi_acknowledge_command(ucsi); if (ret) return ret; -- 2.30.1
Re: [PATCH v1] usb: dwc3: core: Add shutdown callback for dwc3
On 3/24/2021 9:01 AM, Stephen Boyd wrote:

Quoting Sandeep Maheswaram (2021-03-23 12:27:32)

This patch adds a shutdown callback to USB DWC core driver to ensure that it is properly shutdown in reboot/shutdown path. This is required where SMMU address translation is enabled like on SC7180 SoC and few others. If the hardware is still accessing memory after SMMU translation is disabled as part of SMMU shutdown callback in system reboot or shutdown path, then IOVAs(I/O virtual address) which it was using will go on the bus as the physical addresses which might result in unknown crashes (NoC/interconnect errors).

Previously this was added in dwc3 qcom glue driver.
https://patchwork.kernel.org/project/linux-arm-msm/list/?series=382449
But observed kernel panic as glue driver shutdown getting called after iommu shutdown. As we are adding iommu nodes in dwc core node in device tree adding shutdown callback in core driver seems correct.

Signed-off-by: Sandeep Maheswaram
---
 drivers/usb/dwc3/core.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 94fdbe5..777b2b5 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -1634,11 +1634,9 @@ static int dwc3_probe(struct platform_device *pdev)
 	return ret;
 }

-static int dwc3_remove(struct platform_device *pdev)
+static void __dwc3_teardown(struct dwc3 *dwc)
 {
-	struct dwc3 *dwc = platform_get_drvdata(pdev);
-
-	pm_runtime_get_sync(&pdev->dev);
+	pm_runtime_get_sync(dwc->dev);

 	dwc3_debugfs_exit(dwc);
 	dwc3_core_exit_mode(dwc);
@@ -1646,19 +1644,32 @@ static int dwc3_remove(struct platform_device *pdev)
 	dwc3_core_exit(dwc);
 	dwc3_ulpi_exit(dwc);

-	pm_runtime_disable(&pdev->dev);
-	pm_runtime_put_noidle(&pdev->dev);
-	pm_runtime_set_suspended(&pdev->dev);
+	pm_runtime_disable(dwc->dev);
+	pm_runtime_put_noidle(dwc->dev);
+	pm_runtime_set_suspended(dwc->dev);

 	dwc3_free_event_buffers(dwc);
 	dwc3_free_scratch_buffers(dwc);

 	if (dwc->usb_psy)
 		power_supply_put(dwc->usb_psy);
+}
+
+static int dwc3_remove(struct platform_device *pdev)
+{
+	struct dwc3 *dwc = platform_get_drvdata(pdev);
+
+	__dwc3_teardown(dwc);

 	return 0;
 }

+static void dwc3_shutdown(struct platform_device *pdev)
+{
+	struct dwc3 *dwc = platform_get_drvdata(pdev);
+
+	__dwc3_teardown(dwc);
+}

Can't this be

static void dwc3_shutdown(struct platform_device *pdev)
{
	dwc3_remove(pdev);
}

and then there's nothing else to change? Basically ignore return value of dwc3_remove() to make shutdown and remove harmonize. I also wonder if this is more common than we think and a struct driver flag could be set to say "call remove for shutdown" and then have driver core swizzle on that and save some duplicate functions.

I was referring to similar patch
https://patchwork.kernel.org/project/linux-usb/patch/20190817174140.6394-1-vice...@gmail.com/

 #ifdef CONFIG_PM
 static int dwc3_core_init_for_resume(struct dwc3 *dwc)
 {
@@ -1976,6 +1987,7 @@ MODULE_DEVICE_TABLE(acpi, dwc3_acpi_match);
 static struct platform_driver dwc3_driver = {
 	.probe = dwc3_probe,
 	.remove = dwc3_remove,
+	.shutdown = dwc3_shutdown,
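Stephen's suggestion (have shutdown simply call remove and discard the return value) can be illustrated with a small user-space stand-in. Nothing below is the platform driver API; struct fake_dev, fake_remove() and fake_shutdown() are hypothetical names, and the "active" guard is an extra assumption so that a second teardown is harmless.

```c
#include <stdbool.h>

/* Hypothetical device state, standing in for struct dwc3. */
struct fake_dev {
	bool active;		/* hardware may still touch memory */
	int teardown_calls;	/* how often real teardown work ran */
};

/* Shared teardown path: stop the hardware exactly once. */
static void fake_teardown(struct fake_dev *d)
{
	if (!d->active)
		return;
	d->active = false;
	d->teardown_calls++;
}

/* remove() returns int, as the platform driver callback did at the time. */
static int fake_remove(struct fake_dev *d)
{
	fake_teardown(d);
	return 0;
}

/* shutdown() returns void, so it reuses remove() and drops the result. */
static void fake_shutdown(struct fake_dev *d)
{
	(void)fake_remove(d);
}
```

Whether remove is safe to run in the shutdown path of a real driver depends on what else remove does, so this only shows the shape of the reuse, not that it is correct for dwc3.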
Re: Re: [PATCH] fuse: Fix a potential double free in virtio_fs_get_tree
> -Original Message-
> From: "Vivek Goyal"
> Sent: 2021-03-24 01:10:03 (Wednesday)
> To: "Lv Yunlong"
> Cc: stefa...@redhat.com, mik...@szeredi.hu, virtualizat...@lists.linux-foundation.org, linux-fsde...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] fuse: Fix a potential double free in virtio_fs_get_tree
>
> On Mon, Mar 22, 2021 at 10:18:31PM -0700, Lv Yunlong wrote:
> > In virtio_fs_get_tree, fm is allocated by kzalloc() and
> > assigned to fsc->s_fs_info by fsc->s_fs_info=fm statement.
> > If the kzalloc() failed, it will goto err directly, so that
> > fsc->s_fs_info must be non-NULL and fm will be freed.
>
> sget_fc() will either consume fsc->s_fs_info in case a new super
> block is allocated and set fsc->s_fs_info. In that case we don't
> free fc or fm.
>
> Or, sget_fc() will return with fsc->s_fs_info set in case we already
> found a super block. In that case we need to free fc and fm.
>
> In case of error from sget_fc(), fc/fm need to be freed first and
> then error needs to be returned to caller.
>
>         if (IS_ERR(sb))
>                 return PTR_ERR(sb);
>
>
> If we allocated a new super block in sget_fc(), then next step is
> to initialize it.
>
>         if (!sb->s_root) {
>                 err = virtio_fs_fill_super(sb, fsc);
>         }
>
> If we run into errors here, then fc/fm need to be freed.
>
> So current code looks fine to me.
>
> Vivek
>
> >
> > But later fm is freed again when virtio_fs_fill_super() failed.
> > I think the statement if (fsc->s_fs_info) {kfree(fm);} is
> > misplaced.
> >
> > My patch puts this statement in the correct place to avoid
> > double free.
> >
> > Signed-off-by: Lv Yunlong
> > ---
> >  fs/fuse/virtio_fs.c | 10 ++
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> > index 8868ac31a3c0..727cf436828f 100644
> > --- a/fs/fuse/virtio_fs.c
> > +++ b/fs/fuse/virtio_fs.c
> > @@ -1437,10 +1437,7 @@ static int virtio_fs_get_tree(struct fs_context *fsc)
> >
> >  	fsc->s_fs_info = fm;
> >  	sb = sget_fc(fsc, virtio_fs_test_super, set_anon_super_fc);
> > -	if (fsc->s_fs_info) {
> > -		fuse_conn_put(fc);
> > -		kfree(fm);
> > -	}
> > +
> >  	if (IS_ERR(sb))
> >  		return PTR_ERR(sb);
> >
> > @@ -1457,6 +1454,11 @@ static int virtio_fs_get_tree(struct fs_context *fsc)
> >  		sb->s_flags |= SB_ACTIVE;
> >  	}
> >
> > +	if (fsc->s_fs_info) {
> > +		fuse_conn_put(fc);
> > +		kfree(fm);
> > +	}
> > +
> >  	WARN_ON(fsc->root);
> >  	fsc->root = dget(sb->s_root);
> >  	return 0;
> > --
> > 2.25.1
> >
>

Ok, thanks. It should be a false positive.
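Vivek's explanation is an ownership hand-off: the context pointer is either consumed (and cleared) by the callee, or left in place for the caller to free. The following user-space sketch models just that rule. The names (struct ctx, fake_sget, get_tree) and the "reuse_existing" switch are stand-ins invented for illustration and do not reproduce the VFS code.

```c
#include <stdlib.h>

/* Stand-in for struct fs_context: holds the pointer being handed off. */
struct ctx {
	void *s_fs_info;
};

/*
 * Stand-in for sget_fc(). When a new object is created it takes
 * ownership of s_fs_info and clears the context pointer; when an
 * existing object is reused, the pointer is left for the caller.
 */
static void fake_sget(struct ctx *c, int reuse_existing, void **owner)
{
	if (!reuse_existing) {
		*owner = c->s_fs_info;	/* new object now owns it */
		c->s_fs_info = NULL;
	}
}

/* Returns how many frees the caller performed (0 or 1), or -1 on OOM. */
static int get_tree(int reuse_existing, void **owner)
{
	struct ctx c;
	int freed = 0;

	c.s_fs_info = malloc(16);
	if (!c.s_fs_info)
		return -1;

	fake_sget(&c, reuse_existing, owner);

	/* Only free what was not consumed: this cannot double free. */
	if (c.s_fs_info) {
		free(c.s_fs_info);
		freed = 1;
	}
	return freed;
}
```

Because the check sits directly after the hand-off, each allocation has exactly one owner afterwards, which is why the existing placement in virtio_fs_get_tree() was judged correct in the thread.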
[PATCH] drivers: gpu: drm: Remove repeated declaration
struct drm_i915_private, struct intel_crtc_state and struct intel_crtc have been declared before. Remove the duplicate. Signed-off-by: Wan Jiabing --- drivers/gpu/drm/i915/display/intel_crt.h | 1 - drivers/gpu/drm/i915/display/intel_display.h | 1 - drivers/gpu/drm/i915/display/intel_vrr.h | 1 - 3 files changed, 3 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_crt.h b/drivers/gpu/drm/i915/display/intel_crt.h index 1b3fba359efc..6c5c44600cbd 100644 --- a/drivers/gpu/drm/i915/display/intel_crt.h +++ b/drivers/gpu/drm/i915/display/intel_crt.h @@ -11,7 +11,6 @@ enum pipe; struct drm_encoder; struct drm_i915_private; -struct drm_i915_private; bool intel_crt_port_enabled(struct drm_i915_private *dev_priv, i915_reg_t adpa_reg, enum pipe *pipe); diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h index 76f8a805b0a3..29cb6d84ed70 100644 --- a/drivers/gpu/drm/i915/display/intel_display.h +++ b/drivers/gpu/drm/i915/display/intel_display.h @@ -48,7 +48,6 @@ struct i915_ggtt_view; struct intel_atomic_state; struct intel_crtc; struct intel_crtc_state; -struct intel_crtc_state; struct intel_digital_port; struct intel_dp; struct intel_encoder; diff --git a/drivers/gpu/drm/i915/display/intel_vrr.h b/drivers/gpu/drm/i915/display/intel_vrr.h index fac01bf4ab50..96f9c9c27ab9 100644 --- a/drivers/gpu/drm/i915/display/intel_vrr.h +++ b/drivers/gpu/drm/i915/display/intel_vrr.h @@ -15,7 +15,6 @@ struct intel_crtc; struct intel_crtc_state; struct intel_dp; struct intel_encoder; -struct intel_crtc; bool intel_vrr_is_capable(struct drm_connector *connector); void intel_vrr_check_modeset(struct intel_atomic_state *state); -- 2.25.1
Re: [PATCH] tee: optee: fix build error caused by recent optee tracepoints feature
On Thu, Mar 25, 2021 at 12:06:01PM +0800, Jisheng Zhang wrote: > If build kernel without "O=dir", below error will be seen: > > In file included from drivers/tee/optee/optee_trace.h:67, > from drivers/tee/optee/call.c:18: > ./include/trace/define_trace.h:95:42: fatal error: ./optee_trace.h: No such > file or directory >95 | #include TRACE_INCLUDE(TRACE_INCLUDE_FILE) > | ^ > compilation terminated. > > Fix it by adding below line to Makefile: > CFLAGS_call.o := -I$(src) > > Tested with and without "O=dir", both can build successfully. > > Reported-by: Guenter Roeck > Suggested-by: Steven Rostedt > Signed-off-by: Jisheng Zhang Tested-by: Guenter Roeck > --- > drivers/tee/optee/Makefile | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/tee/optee/Makefile b/drivers/tee/optee/Makefile > index 56263ae3b1d7..3aa33ea9e6a6 100644 > --- a/drivers/tee/optee/Makefile > +++ b/drivers/tee/optee/Makefile > @@ -6,3 +6,6 @@ optee-objs += rpc.o > optee-objs += supp.o > optee-objs += shm_pool.o > optee-objs += device.o > + > +# for tracing framework to find optee_trace.h > +CFLAGS_call.o := -I$(src) > -- > 2.31.0 >
[PATCH] ARM: imx: Fix a typo
s/confgiured/configured/ Signed-off-by: Bhaskar Chowdhury --- arch/arm/mach-imx/pm-imx5.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/mach-imx/pm-imx5.c b/arch/arm/mach-imx/pm-imx5.c index e9962b48e30c..2e3af2bc7758 100644 --- a/arch/arm/mach-imx/pm-imx5.c +++ b/arch/arm/mach-imx/pm-imx5.c @@ -45,7 +45,7 @@ * This is also the lowest power state possible without affecting * non-cpu parts of the system. For these reasons, imx5 should default * to always using this state for cpu idling. The PM_SUSPEND_STANDBY also - * uses this state and needs to take no action when registers remain confgiured + * uses this state and needs to take no action when registers remain configured * for this state. */ #define IMX5_DEFAULT_CPU_IDLE_STATE WAIT_UNCLOCKED_POWER_OFF -- 2.30.1
RE: [PATCH v9 1/7] smccc: Add HVC call variant with result registers other than 0 thru 3
From: Mark Rutland Sent: Wednesday, March 24, 2021 9:55 AM > > Hi Michael, > > On Mon, Mar 08, 2021 at 11:57:13AM -0800, Michael Kelley wrote: > > Hypercalls to Hyper-V on ARM64 may return results in registers other > > than X0 thru X3, as permitted by the SMCCC spec version 1.2 and later. > > Accommodate this by adding a variant of arm_smccc_1_1_hvc that allows > > the caller to specify which 3 registers are returned in addition to X0. > > > > Signed-off-by: Michael Kelley > > --- > > There are several ways to support returning results from registers > > other than X0 thru X3, and Hyper-V usage should be compatible with > > whatever the maintainers prefer. What's implemented in this patch > > may be the most flexible, but it has the downside of not being a > > true function interface in that args 0 thru 2 must be fixed strings, > > and not general "C" expressions. > > For the benefit of others here, SMCCCv1.2 allows: > > * SMC64/HVC64 to use all of x1-x17 for both parameters and return values > * SMC32/HVC32 to use all of r1-r7 for both parameters and return values > > The rationale for this was to make it possible to pass a large number of > arguments in one call without the hypervisor/firmware needing to access > the memory of the caller. > > My preference would be to add arm_smccc_1_2_{hvc,smc}() assembly > functions which read all the permitted argument registers from a struct, > and write all the permitted result registers to a struct, leaving it to > callers to set those up and decompose them. > > That way we only have to write one implementation that all callers can > use, which'll be far easier to maintain. I suspect that in general the > cost of temporarily bouncing the values through memory will be dominated > by whatever the hypervisor/firmware is going to do, and if it's not we > can optimize that away in future. > Thanks for the feedback, and I'm working on implementing this approach. 
But I've hit a snag in that gcc limits the "asm" statement to 30 arguments, which gives us 15 registers as parameters and 15 registers as return values, instead of the 18 each allowed by SMCCC v1.2. I will continue with the 15 register limit for now, unless someone knows a way to exceed that. The alternative would be to go to pure assembly language. I'll post a standalone RFC patch when I have something that works. My C pre-processor wizardry is limited, so others will probably know some tricks that can improve on my first cut. Michael
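For readers following along, the struct-based interface Mark describes can be sketched as follows. This is plain C with a fake "hypervisor" function so that it runs anywhere; struct smccc_regs, SMCCC_NREGS and fake_hvc() are hypothetical names for illustration and are not the arm_smccc API, and no real HVC instruction is issued.

```c
#include <stddef.h>

#define SMCCC_NREGS 18	/* x0..x17, per the SMCCC v1.2 summary above */

/* One slot per register; used for both arguments and results. */
struct smccc_regs {
	unsigned long a[SMCCC_NREGS];
};

/*
 * Hypothetical stand-in for the hypervisor. A real implementation
 * would load x0..x17 from *args, issue HVC, and store x0..x17 to
 * *res. Here each result is simply the argument plus its index.
 */
static void fake_hvc(const struct smccc_regs *args, struct smccc_regs *res)
{
	for (size_t i = 0; i < SMCCC_NREGS; i++)
		res->a[i] = args->a[i] + i;
}
```

The caller fills one struct, passes it in, and picks whichever result registers it cares about out of the second struct, so a single implementation serves callers with different register layouts. The operand-count arithmetic from the mail is independent of this: with 30 asm operands split evenly, inline asm covers 15 inputs and 15 outputs rather than 18 of each.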
Re: [PATCH 1/2] extcon: extcon-gpio: Log error if work-queue init fails
On Thu, 2021-03-25 at 09:49 +0900, Chanwoo Choi wrote: > On 3/24/21 6:51 PM, Vaittinen, Matti wrote: > > Hello Hans, Chanwoo, Greg, > > > > On Wed, 2021-03-24 at 10:25 +0100, Hans de Goede wrote: > > > Hi, > > > > > > On 3/24/21 10:21 AM, Matti Vaittinen wrote: > > > > Add error print for probe failure when resource managed work- > > > > queue > > > > initialization fails. > > > > > > > > Signed-off-by: Matti Vaittinen < > > > > matti.vaitti...@fi.rohmeurope.com> > > > > Suggested-by: Chanwoo Choi > > > > --- > > > > drivers/extcon/extcon-gpio.c | 4 +++- > > > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/extcon/extcon-gpio.c > > > > b/drivers/extcon/extcon- > > > > gpio.c > > > > index 4105df74f2b0..8ea2cda8f7f3 100644 > > > > --- a/drivers/extcon/extcon-gpio.c > > > > +++ b/drivers/extcon/extcon-gpio.c > > > > @@ -114,8 +114,10 @@ static int gpio_extcon_probe(struct > > > > platform_device *pdev) > > > > return ret; > > > > > > > > ret = devm_delayed_work_autocancel(dev, >work, > > > > gpio_extcon_work); > > > > - if (ret) > > > > + if (ret) { > > > > + dev_err(dev, "Failed to initialize > > > > delayed_work"); > > > > return ret; > > > > + } > > > > > > The only ret which we can have here is -ENOMEM and as a rule we > > > don't > > > log > > > errors for those, because the kernel memory-management code > > > already > > > complains > > > loudly when this happens. > > > > I know. This is why I originally omitted the print. Besides, if the > > memory is so low that devres adding fails - then we probably have > > plenty of other complaints as well... But as Chanwoo maintains the > > driver and wanted to have the print - I do not have objections to > > that > > either. Maybe someone some-day adds another error path to wq > > initialization in which case seeing it failed could make sense. > > > > > So IMHO this patch should be dropped. > > Fine for me - as well as keeping it. I have no strong opinion on > > this. 
>
> If it is the same handling way for -ENOMEM, there is no need to add the log, as
> Hans said.
> Thanks to Hans.

So be it :)

Greg, can you just apply patch 2/2 and drop this one? (There should be no dependency between them.) Or do you want me to resend 2/2 alone?

> > Br,
> > Matti
> >
> >
[PATCH] drivers: gpu: drm: Remove duplicate declaration
struct dss_device has been declared at 51st line. Remove the duplicate. Signed-off-by: Wan Jiabing --- drivers/gpu/drm/omapdrm/dss/omapdss.h | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/omapdrm/dss/omapdss.h b/drivers/gpu/drm/omapdrm/dss/omapdss.h index a40abeafd2e9..2658aadee09a 100644 --- a/drivers/gpu/drm/omapdrm/dss/omapdss.h +++ b/drivers/gpu/drm/omapdrm/dss/omapdss.h @@ -52,7 +52,6 @@ struct dss_device; struct omap_drm_private; struct omap_dss_device; struct dispc_device; -struct dss_device; struct dss_lcd_mgr_config; struct snd_aes_iec958; struct snd_cea_861_aud_if; -- 2.25.1
Re: [PATCH 01/13] kconfig: split randconfig setup code into set_randconfig_seed()
On Sun, Mar 14, 2021 at 4:48 AM Masahiro Yamada wrote:
>
> This code is too big to be placed in the switch statement.
>
> Move the code into a new helper function. I slightly refactor the code
> without changing the behavior.
>
> Signed-off-by: Masahiro Yamada
> ---

All applied to linux-kbuild/kconfig.

>  scripts/kconfig/conf.c | 54 --
>  1 file changed, 31 insertions(+), 23 deletions(-)
>
> diff --git a/scripts/kconfig/conf.c b/scripts/kconfig/conf.c
> index 957d2a0832f7..063c9e7a34c1 100644
> --- a/scripts/kconfig/conf.c
> +++ b/scripts/kconfig/conf.c
> @@ -82,6 +82,36 @@ static void xfgets(char *str, int size, FILE *in)
>  	printf("%s", str);
>  }
>
> +static void set_randconfig_seed(void)
> +{
> +	unsigned int seed;
> +	char *env;
> +	bool seed_set = false;
> +
> +	env = getenv("KCONFIG_SEED");
> +	if (env && *env) {
> +		char *endp;
> +
> +		seed = strtol(env, &endp, 0);
> +		if (*endp == '\0')
> +			seed_set = true;
> +	}
> +
> +	if (!seed_set) {
> +		struct timeval now;
> +
> +		/*
> +		 * Use microseconds derived seed, compensate for systems where it may
> +		 * be zero.
> +		 */
> +		gettimeofday(&now, NULL);
> +		seed = (now.tv_sec + 1) * (now.tv_usec + 1);
> +	}
> +
> +	printf("KCONFIG_SEED=0x%X\n", seed);
> +	srand(seed);
> +}
> +
>  static int conf_askvalue(struct symbol *sym, const char *def)
>  {
>  	if (!sym_has_value(sym))
> @@ -515,30 +545,8 @@ int main(int ac, char **av)
>  			defconfig_file = optarg;
>  			break;
>  		case randconfig:
> -		{
> -			struct timeval now;
> -			unsigned int seed;
> -			char *seed_env;
> -
> -			/*
> -			 * Use microseconds derived seed,
> -			 * compensate for systems where it may be zero
> -			 */
> -			gettimeofday(&now, NULL);
> -			seed = (unsigned int)((now.tv_sec + 1) * (now.tv_usec + 1));
> -
> -			seed_env = getenv("KCONFIG_SEED");
> -			if( seed_env && *seed_env ) {
> -				char *endp;
> -				int tmp = (int)strtol(seed_env, &endp, 0);
> -				if (*endp == '\0') {
> -					seed = tmp;
> -				}
> -			}
> -			fprintf( stderr, "KCONFIG_SEED=0x%X\n", seed );
> -			srand(seed);
> +			set_randconfig_seed();
>  			break;
> -		}
>  		case oldaskconfig:
>  		case oldconfig:
>  		case allnoconfig:
> --
> 2.27.0
>

--
Best Regards
Masahiro Yamada
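The seed handling in the helper hinges on one detail: strtol() with base 0 accepts decimal, octal and hex, and the end pointer tells whether the whole string was consumed. A minimal standalone sketch of that check follows; parse_seed() is a hypothetical name, not a function in scripts/kconfig/conf.c.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: accept the seed only if the entire string is
 * a valid number. Base 0 lets strtol() handle "42", "052" and "0x2A".
 */
static bool parse_seed(const char *env, unsigned int *seed)
{
	char *endp;
	long val;

	if (!env || !*env)
		return false;

	val = strtol(env, &endp, 0);
	if (*endp != '\0')
		return false;	/* trailing junk: fall back to a time-based seed */

	*seed = (unsigned int)val;
	return true;
}
```

A caller would use the parsed value when this returns true and otherwise derive a seed from the current time, as the patch does with gettimeofday().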
Re: [PATCH] kconfig: use true and false for bool variable
On Mon, Mar 15, 2021 at 3:55 PM Yang Li wrote:
>
> fixed the following coccicheck:
> ./scripts/kconfig/confdata.c:36:9-10: WARNING: return of 0/1 in function
> 'is_dir' with return type bool
>
> Reported-by: Abaci Robot
> Signed-off-by: Yang Li
> ---

Applied. Thanks.

>  scripts/kconfig/confdata.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/scripts/kconfig/confdata.c b/scripts/kconfig/confdata.c
> index 2568dbe..b65b8c3 100644
> --- a/scripts/kconfig/confdata.c
> +++ b/scripts/kconfig/confdata.c
> @@ -33,7 +33,7 @@ static bool is_dir(const char *path)
>  	struct stat st;
>
>  	if (stat(path, &st))
> -		return 0;
> +		return false;
>
>  	return S_ISDIR(st.st_mode);
>  }
> --
> 1.8.3.1
>

--
Best Regards
Masahiro Yamada
Re: [PATCH] cpufreq: dt: check the error returned by dev_pm_opp_of_cpumask_add_table
On 25-03-21, 12:31, quanyang.w...@windriver.com wrote: > From: Quanyang Wang > > The function dev_pm_opp_of_cpumask_add_table may return zero or an > error. When it returns an error, this means that no OPP table is > added for the cpumask because _dev_pm_opp_cpumask_remove_table is > called to free all OPPs associated with the cpu devices in the error > label "remove_table". So continuing to run the next function > dev_pm_opp_get_opp_count is meaningless since it always return the > count value as 0. > > There is another reason why we should check the error returned by > dev_pm_opp_of_cpumask_add_table is that it may return -EPROBE_DEFER > which comes from clk_get(dev, NULL) in _update_opp_table_clk. When > the clk for cpu device isn't ready, dt_cpufreq_probe should be deferred > and wait to be called again. But if we ignore the return error of > dev_pm_opp_of_cpumask_add_table, dt_cpufreq_probe will return -ENODEV > because dev_pm_opp_get_opp_count returns the count value as 0, > the cpufreq-dt driver will fail with the error log as below: > > [0.724069] cpu cpu0: OPP table can't be empty > > Signed-off-by: Quanyang Wang > --- > drivers/cpufreq/cpufreq-dt.c | 12 +--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c > index b1e1bdc63b01..f24359f47b1a 100644 > --- a/drivers/cpufreq/cpufreq-dt.c > +++ b/drivers/cpufreq/cpufreq-dt.c > @@ -255,10 +255,16 @@ static int dt_cpufreq_early_init(struct device *dev, > int cpu) >* before updating priv->cpus. Otherwise, we will end up creating >* duplicate OPPs for the CPUs. >* > - * OPPs might be populated at runtime, don't check for error here. As the comment (which you removed) clearly says, the OPPs maybe added at runtime, don't check for error here. When we say runtime, we mean someone may have called dev_pm_opp_add() for the devices. > + * We need check the return value here, if it is non-zero, there is > + * need to go on. 
>*/ > - if (!dev_pm_opp_of_cpumask_add_table(priv->cpus)) > - priv->have_static_opps = true; > + ret = dev_pm_opp_of_cpumask_add_table(priv->cpus); > + if (ret) { > + dev_err(cpu_dev, "Failed to add OPP table for CPUs\n"); > + goto out; > + } > + > + priv->have_static_opps = true; > > /* >* The OPP table must be initialized, statically or dynamically, by this -- viresh
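The reasoning behind "don't check for error here" can be shown with a toy model: the static (DT) table may fail to load, yet OPPs added dynamically beforehand still make the count positive, so the only hard failure is an empty table. Everything below (struct opp_table, add_static_table(), early_init(), the count of 3) is a made-up user-space stand-in, not the OPP core API.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy OPP table: only the number of entries matters here. */
struct opp_table {
	int count;
};

/* Stand-in for adding the static DT table; fails if DT has none. */
static int add_static_table(struct opp_table *t, bool dt_has_table)
{
	if (!dt_has_table)
		return -ENODEV;
	t->count += 3;	/* arbitrary number of static entries */
	return 0;
}

/*
 * Mirrors the existing flow: a failed static add is not fatal, because
 * OPPs may already have been added at runtime. Only an empty table is.
 */
static int early_init(struct opp_table *t, bool dt_has_table,
		      bool *have_static)
{
	if (!add_static_table(t, dt_has_table))
		*have_static = true;

	if (t->count <= 0)
		return -ENODEV;	/* "OPP table can't be empty" */
	return 0;
}
```

This sketch only covers the dynamic-OPP argument; it does not model the -EPROBE_DEFER case raised in the patch description, which would need separate handling.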
[PATCH V4] kbuild: Add rule to build .dt.yaml files for overlays
The overlay source files are named with .dtso extension now, add a new rule to generate .dt.yaml for them. Reviewed-by: Geert Uytterhoeven Tested-by: Geert Uytterhoeven Signed-off-by: Viresh Kumar --- V4: - Rebase over Frank's cleanup patch: https://lore.kernel.org/lkml/20210324223713.1334666-1-frowand.l...@gmail.com/ - Drop changes to drivers/of/unittest-data/Makefile. - Drop modifications to the rule that builds .dtbo files (as it is already updated by Frank). scripts/Makefile.lib | 3 +++ 1 file changed, 3 insertions(+) diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib index 814b430b407e..a682869d8f4b 100644 --- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -376,6 +376,9 @@ endef $(obj)/%.dt.yaml: $(src)/%.dts $(DTC) $(DT_TMP_SCHEMA) FORCE $(call if_changed_rule,dtc,yaml) +$(obj)/%.dt.yaml: $(src)/%.dtso $(DTC) $(DT_TMP_SCHEMA) FORCE + $(call if_changed_rule,dtc,yaml) + dtc-tmp = $(subst $(comma),_,$(dot-target).dts.tmp) # Bzip2 -- 2.25.0.rc1.19.g042ed3e048af
[PATCH] tools: perf: util: Remove duplicate declaration
struct evlist has been declared at 10th line.
struct comm has been declared at 15th line.
Remove the duplicate.

Signed-off-by: Wan Jiabing
---
 tools/perf/util/metricgroup.h  | 1 -
 tools/perf/util/thread-stack.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
index ed1b9392e624..026bbf416c48 100644
--- a/tools/perf/util/metricgroup.h
+++ b/tools/perf/util/metricgroup.h
@@ -9,7 +9,6 @@
 
 struct evlist;
 struct evsel;
-struct evlist;
 struct option;
 struct rblist;
 struct pmu_events_map;
diff --git a/tools/perf/util/thread-stack.h b/tools/perf/util/thread-stack.h
index 3bc47a42af8e..b3cd09beb62f 100644
--- a/tools/perf/util/thread-stack.h
+++ b/tools/perf/util/thread-stack.h
@@ -16,7 +16,6 @@ struct comm;
 struct ip_callchain;
 struct symbol;
 struct dso;
-struct comm;
 struct perf_sample;
 struct addr_location;
 struct call_path;
-- 
2.25.1
[PATCH] Bluetooth: L2CAP: Rudimentary typo fixes
s/minium/minimum/
s/procdure/procedure/

Signed-off-by: Bhaskar Chowdhury
---
 net/bluetooth/l2cap_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index 72c2f5226d67..b38e80a0e819 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -1690,7 +1690,7 @@ static void l2cap_le_conn_ready(struct l2cap_conn *conn)
 	smp_conn_security(hcon, hcon->pending_sec_level);
 
 	/* For LE slave connections, make sure the connection interval
-	 * is in the range of the minium and maximum interval that has
+	 * is in the range of the minimum and maximum interval that has
 	 * been configured for this connection. If not, then trigger
 	 * the connection update procedure.
 	 */
@@ -7542,7 +7542,7 @@ static void l2cap_data_channel(struct l2cap_conn *conn, u16 cid,
 	BT_DBG("chan %p, len %d", chan, skb->len);
 
 	/* If we receive data on a fixed channel before the info req/rsp
-	 * procdure is done simply assume that the channel is supported
+	 * procedure is done simply assume that the channel is supported
 	 * and mark it as ready.
 	 */
 	if (chan->chan_type == L2CAP_CHAN_FIXED)
-- 
2.30.1
Re: [RFC] mm: activate access-more-than-once page via NUMA balancing
Hi, Mel,

Thanks for the comments!

Mel Gorman writes:

> On Wed, Mar 24, 2021 at 04:32:09PM +0800, Huang Ying wrote:
>> One idea behind the LRU page reclaiming algorithm is to put the
>> access-once pages in the inactive list and access-more-than-once pages
>> in the active list. This is true for the file pages that are accessed
>> via syscall (read()/write(), etc.), but not for the pages accessed via
>> the page tables. We can only activate them via page reclaim scanning
>> now. This may cause some problems. For example, even if there are
>> only hot file pages accessed via the page tables in the inactive list,
>> we will enable the cache trim mode incorrectly to scan only the hot
>> file pages instead of cold anon pages.
>>
>
> I caution against this patch.
>
> It's non-deterministic for a number of reasons. As it requires NUMA
> balancing to be enabled, the pageout behaviour of a system changes when
> NUMA balancing is active. If this led to pages being artificially and
> inappropriately preserved, NUMA balancing could be disabled for the
> wrong reasons. It only applies to pages that have no target node so
> memory policies affect which pages are activated differently. Similarly,
> NUMA balancing does not scan all VMAs and some pages may never trap a
> NUMA fault as a result. The timing of when an address space gets scanned
> is driven by the locality of pages and so the timing of page activation
> potentially becomes linked to whether pages are local or need to migrate
> (although not right now for this patch as it only affects pages with a
> target nid of NUMA_NO_NODE). In other words, changes in NUMA balancing
> that affect migration potentially affect the aging rate. Similarly,
> the activate rate of a process with a single thread and multiple threads
> potentially have different activation rates.
>
> Finally, the NUMA balancing scan algorithm is sub-optimal. It potentially
> scans the entire address space even though only a small number of pages
> are scanned. This is particularly problematic when a process has a lot
> of threads because threads are redundantly scanning the same regions. If
> NUMA balancing ever introduced range tracking of faulted pages to limit
> how much scanning it has to do, it would inadvertently cause a change in
> page activation rate.
>
> NUMA balancing is about page locality, it should not get conflated with
> page aging.

I understand your concerns about coupling NUMA balancing with page
reclaiming. The requirements of page locality and page aging are
different, so the policies need to be different. That part of the patch
is wrong.

From another point of view, it's still possible to share some underlying
mechanisms (and code) between them. That is, scanning the page tables to
make pages inaccessible and capturing the page accesses via the page
fault. Currently this page access information is used for page locality.
Do you think it's a good idea to use this information for page aging too
(but with a different policy, as you pointed out)?

From yet another point of view :-), the current NUMA balancing
implementation assumes that the node private pages can fit in the
accessing node. But this may not always be true. Is it a valid
optimization to migrate the hot private pages first?

Best Regards,
Huang, Ying
[PATCH] cpufreq: dt: check the error returned by dev_pm_opp_of_cpumask_add_table
From: Quanyang Wang

The function dev_pm_opp_of_cpumask_add_table may return zero or an
error. When it returns an error, this means that no OPP table is
added for the cpumask because _dev_pm_opp_cpumask_remove_table is
called to free all OPPs associated with the cpu devices in the error
label "remove_table". So continuing to run the next function
dev_pm_opp_get_opp_count is meaningless since it always return the
count value as 0.

There is another reason why we should check the error returned by
dev_pm_opp_of_cpumask_add_table is that it may return -EPROBE_DEFER
which comes from clk_get(dev, NULL) in _update_opp_table_clk. When
the clk for cpu device isn't ready, dt_cpufreq_probe should be deferred
and wait to be called again. But if we ignore the return error of
dev_pm_opp_of_cpumask_add_table, dt_cpufreq_probe will return -ENODEV
because dev_pm_opp_get_opp_count returns the count value as 0,
the cpufreq-dt driver will fail with the error log as below:

[0.724069] cpu cpu0: OPP table can't be empty

Signed-off-by: Quanyang Wang
---
 drivers/cpufreq/cpufreq-dt.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index b1e1bdc63b01..f24359f47b1a 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -255,10 +255,16 @@ static int dt_cpufreq_early_init(struct device *dev, int cpu)
 	 * before updating priv->cpus. Otherwise, we will end up creating
 	 * duplicate OPPs for the CPUs.
 	 *
-	 * OPPs might be populated at runtime, don't check for error here.
+	 * We need check the return value here, if it is non-zero, there is
+	 * need to go on.
 	 */
-	if (!dev_pm_opp_of_cpumask_add_table(priv->cpus))
-		priv->have_static_opps = true;
+	ret = dev_pm_opp_of_cpumask_add_table(priv->cpus);
+	if (ret) {
+		dev_err(cpu_dev, "Failed to add OPP table for CPUs\n");
+		goto out;
+	}
+
+	priv->have_static_opps = true;
 
 	/*
 	 * The OPP table must be initialized, statically or dynamically, by this
-- 
2.25.1
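The failure mode described in the commit message above, and the runtime-OPP case raised in the reply in this thread, can be sketched in plain userspace C. This is only a model under stated assumptions, not the driver code: struct mock_opps, the mock_* helpers and both early_init_* functions are hypothetical stand-ins, and EPROBE_DEFER is defined locally (517, the kernel's value) because userspace headers do not provide it.

```c
#include <assert.h>
#include <errno.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from userspace headers */
#endif

/* Hypothetical model of the OPP state for one CPU; not a kernel structure. */
struct mock_opps {
	int add_table_ret;	/* result the mocked "add table" call returns */
	int static_opps;	/* OPPs the device tree table would provide */
	int dynamic_opps;	/* OPPs already added at runtime */
};

/* Stand-in for dev_pm_opp_of_cpumask_add_table(): 0 or a negative errno. */
static int mock_add_table(const struct mock_opps *o)
{
	assert(o->add_table_ret <= 0);
	return o->add_table_ret;
}

/* Stand-in for dev_pm_opp_get_opp_count(). */
static int mock_opp_count(const struct mock_opps *o)
{
	int count = o->dynamic_opps;

	/* Static OPPs only exist if adding the table succeeded. */
	if (o->add_table_ret == 0)
		count += o->static_opps;
	return count;
}

/* Shape of the code before the patch: the error is dropped. */
static int early_init_ignoring_error(int add_table_ret, int static_opps,
				     int dynamic_opps)
{
	struct mock_opps o = { add_table_ret, static_opps, dynamic_opps };

	(void)mock_add_table(&o);
	if (mock_opp_count(&o) <= 0)
		return -ENODEV;	/* "OPP table can't be empty" */
	return 0;
}

/* Shape of the code with the patch: any error goes back to the caller. */
static int early_init_checking_error(int add_table_ret, int static_opps,
				     int dynamic_opps)
{
	struct mock_opps o = { add_table_ret, static_opps, dynamic_opps };
	int ret = mock_add_table(&o);

	if (ret)
		return ret;
	if (mock_opp_count(&o) <= 0)
		return -ENODEV;
	return 0;
}
```

In this model, a deferred clock turns into -ENODEV when the error is ignored and stays -EPROBE_DEFER when it is checked. A device with only runtime-added OPPs succeeds when the error is ignored but fails when every error is treated as fatal, which is the case the reply points at.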
Re: linux-next: manual merge of the opp tree with the v4l-dvb tree
On 24-03-21, 16:49, Stanimir Varbanov wrote: > Thanks Stephen! > > On 3/23/21 2:27 AM, Stephen Rothwell wrote: > > Hi all, > > > > Today's linux-next merge of the opp tree got a conflict in: > > > > drivers/media/platform/qcom/venus/pm_helpers.c > > > > between commit: > > > > 08b1cf474b7f ("media: venus: core, venc, vdec: Fix probe dependency > > error") > > > > from the v4l-dvb tree and commit: > > > > 857219ae4043 ("media: venus: Convert to use resource-managed OPP API") > > > > from the opp tree. > > > > I fixed it up (see below) and can carry the fix as necessary. This > > is now fixed as far as linux-next is concerned, but any non trivial > > conflicts should be mentioned to your upstream maintainer when your tree > > is submitted for merging. You may also want to consider cooperating > > with the maintainer of the conflicting tree to minimise any particularly > > complex conflicts. > > > > I don't know what is the best solution here. > > Viresh, Can I take the OPP API changes through media-tree to avoid > conflicts? I already suggested something similar earlier, and I was expecting Thierry to respond to that.. Not sure who should pick those patches. https://lore.kernel.org/lkml/20210318103250.shjyd66pxw2g2nsd@vireshk-i7/ Can you please respond to this series then ? -- viresh
[PATCH] btrfs: fixed rudimentary typos
s/contaning/containing/
s/clearning/clearing/

Signed-off-by: Bhaskar Chowdhury
---
 fs/btrfs/inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7cdf65be3707..e0c08176bc18 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2784,8 +2784,8 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans,
 	/*
 	 * If we dropped an inline extent here, we know the range where it is
 	 * was not marked with the EXTENT_DELALLOC_NEW bit, so we update the
-	 * number of bytes only for that range contaning the inline extent.
-	 * The remaining of the range will be processed when clearning the
+	 * number of bytes only for that range containing the inline extent.
+	 * The remaining of the range will be processed when clearing the
 	 * EXTENT_DELALLOC_BIT bit through the ordered extent completion.
 	 */
 	if (file_pos == 0 && !IS_ALIGNED(drop_args.bytes_found, sectorsize)) {
-- 
2.30.1
Re: [PATCH v2 1/1] dmaengine: dw: Make it dependent to HAS_IOMEM
On 24-03-21, 16:17, Andy Shevchenko wrote: > Some architectures do not provide devm_*() APIs. Hence make the driver > dependent on HAVE_IOMEM. > > Fixes: dbde5c2934d1 ("dw_dmac: use devm_* functions to simplify code") > Reported-by: kernel test robot > Signed-off-by: Andy Shevchenko > --- > v2: used proper option (Serge) > drivers/dma/dw/Kconfig | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/dma/dw/Kconfig b/drivers/dma/dw/Kconfig > index e5162690de8f..db25f9b7778c 100644 > --- a/drivers/dma/dw/Kconfig > +++ b/drivers/dma/dw/Kconfig > @@ -10,6 +10,7 @@ config DW_DMAC_CORE > > config DW_DMAC > tristate "Synopsys DesignWare AHB DMA platform driver" > + depends on HAS_IOMEM > select DW_DMAC_CORE > help > Support the Synopsys DesignWare AHB DMA controller. This > @@ -18,6 +19,7 @@ config DW_DMAC > config DW_DMAC_PCI > tristate "Synopsys DesignWare AHB DMA PCI driver" > depends on PCI > + depends on HAS_IOMEM > select DW_DMAC_CORE > help > Support the Synopsys DesignWare AHB DMA controller on the Acked-by: Viresh Kumar -- viresh
RE: [PATCH v2 05/15] PCI: xilinx: Convert to MSI domains
> Subject: Re: [PATCH v2 05/15] PCI: xilinx: Convert to MSI domains > > On Wed, 24 Mar 2021 13:56:16 +, > Bharat Kumar Gogada wrote: > > > > Thanks for that. Can you please try the following patch and let me > > > know if it helps? > > > > > > Thanks, > > > > > > M. > > > > > > diff --git a/drivers/pci/controller/pcie-xilinx.c > > > b/drivers/pci/controller/pcie- xilinx.c index > > > ad9abf405167..14001febf59a 100644 > > > --- a/drivers/pci/controller/pcie-xilinx.c > > > +++ b/drivers/pci/controller/pcie-xilinx.c > > > @@ -194,8 +194,18 @@ static struct pci_ops xilinx_pcie_ops = { > > > > > > /* MSI functions */ > > > > > > +static void xilinx_msi_top_irq_ack(struct irq_data *d) { > > > + /* > > > + * xilinx_pcie_intr_handler() will have performed the Ack. > > > + * Eventually, this should be fixed and the Ack be moved in > > > + * the respective callbacks for INTx and MSI. > > > + */ > > > +} > > > + > > > static struct irq_chip xilinx_msi_top_chip = { > > > .name = "PCIe MSI", > > > + .irq_ack= xilinx_msi_top_irq_ack, > > > }; > > > > > > static int xilinx_msi_set_affinity(struct irq_data *d, const struct > > > cpumask *mask, bool force) @@ -206,7 +216,7 @@ static int > > > xilinx_msi_set_affinity(struct irq_data *d, const struct cpumask > > > *mas static void xilinx_compose_msi_msg(struct irq_data *data, struct > msi_msg *msg) { > > > struct xilinx_pcie_port *pcie = irq_data_get_irq_chip_data(data); > > > - phys_addr_t pa = virt_to_phys(pcie); > > > + phys_addr_t pa = ALIGN_DOWN(virt_to_phys(pcie), SZ_4K); > > > > > > msg->address_lo = lower_32_bits(pa); > > > msg->address_hi = upper_32_bits(pa); @@ -468,7 +478,7 @@ static > > > int xilinx_pcie_init_irq_domain(struct > > > xilinx_pcie_port *port) > > > > > > /* Setup MSI */ > > > if (IS_ENABLED(CONFIG_PCI_MSI)) { > > > - phys_addr_t pa = virt_to_phys(port); > > > + phys_addr_t pa = ALIGN_DOWN(virt_to_phys(port), SZ_4K); > > > > > > ret = xilinx_allocate_msi_domains(port); > > > if (ret) > > > > > Thanks Marc. 
> > With above patch now everything works fine, tested a Samsung NVMe SSD. > > tst~# lspci > > 00:00.0 PCI bridge: Xilinx Corporation Device 0706 > > 01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd > > NVMe SSD Controller 172Xa/172Xb (rev 01) > > Great, thanks for giving it a shot. Can I take this as a Tested-by: > tag? > Yes. Regards, Bharat
[PATCH] xtensa: Couple of typo fixes
s/contans/contains/
s/desination/destination/

Signed-off-by: Bhaskar Chowdhury
---
 arch/xtensa/kernel/head.S | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/xtensa/kernel/head.S b/arch/xtensa/kernel/head.S
index e0c1fac0910f..c74fdaacf4cf 100644
--- a/arch/xtensa/kernel/head.S
+++ b/arch/xtensa/kernel/head.S
@@ -212,7 +212,7 @@ ENTRY(_startup)
 	 *
 	 * The linker script used to build the Linux kernel image
 	 * creates a table located at __boot_reloc_table_start
-	 * that contans the information what data needs to be unpacked.
+	 * that contains the information what data needs to be unpacked.
 	 *
 	 * Uses a2-a7.
 	 */
@@ -222,7 +222,7 @@ ENTRY(_startup)
 
 1:	beq	a2, a3, 3f	# no more entries?
 	l32i	a4, a2, 0	# start destination (in RAM)
-	l32i	a5, a2, 4	# end desination (in RAM)
+	l32i	a5, a2, 4	# end destination (in RAM)
 	l32i	a6, a2, 8	# start source (in ROM)
 	addi	a2, a2, 12	# next entry
 	beq	a4, a5, 1b	# skip, empty entry
-- 
2.30.1
[PATCH] tee: optee: fix build error caused by recent optee tracepoints feature
If build kernel without "O=dir", below error will be seen:

In file included from drivers/tee/optee/optee_trace.h:67,
                 from drivers/tee/optee/call.c:18:
./include/trace/define_trace.h:95:42: fatal error: ./optee_trace.h: No such file or directory
   95 | #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
      |                                          ^
compilation terminated.

Fix it by adding below line to Makefile:
CFLAGS_call.o := -I$(src)

Tested with and without "O=dir", both can build successfully.

Reported-by: Guenter Roeck
Suggested-by: Steven Rostedt
Signed-off-by: Jisheng Zhang
---
 drivers/tee/optee/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/tee/optee/Makefile b/drivers/tee/optee/Makefile
index 56263ae3b1d7..3aa33ea9e6a6 100644
--- a/drivers/tee/optee/Makefile
+++ b/drivers/tee/optee/Makefile
@@ -6,3 +6,6 @@ optee-objs += rpc.o
 optee-objs += supp.o
 optee-objs += shm_pool.o
 optee-objs += device.o
+
+# for tracing framework to find optee_trace.h
+CFLAGS_call.o := -I$(src)
-- 
2.31.0
Re: [syzbot] WARNING in firmware_fallback_sysfs
syzbot has found a reproducer for the following issue on: HEAD commit:20f1b5f9 Add linux-next specific files for 20210324 git tree: linux-next console output: https://syzkaller.appspot.com/x/log.txt?x=1506414ed0 kernel config: https://syzkaller.appspot.com/x/.config?x=31aa577aa2dca78c dashboard link: https://syzkaller.appspot.com/bug?extid=95f2e2439b97575ec3c0 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14e50426d0 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1388dfe6d0 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+95f2e2439b97575ec...@syzkaller.appspotmail.com sysfs group 'power' not found for kobject 'ueagle-atm!eagleI.fw' WARNING: CPU: 1 PID: 36 at fs/sysfs/group.c:279 sysfs_remove_group+0x126/0x170 fs/sysfs/group.c:279 Modules linked in: CPU: 1 PID: 36 Comm: kworker/1:1 Not tainted 5.12.0-rc4-next-20210324-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events request_firmware_work_func RIP: 0010:sysfs_remove_group+0x126/0x170 fs/sysfs/group.c:279 Code: 48 89 d9 49 8b 14 24 48 b8 00 00 00 00 00 fc ff df 48 c1 e9 03 80 3c 01 00 75 37 48 8b 33 48 c7 c7 e0 7d 7c 89 e8 9d cc d9 06 <0f> 0b eb 98 e8 f1 23 c9 ff e9 01 ff ff ff 48 89 df e8 e4 23 c9 ff RSP: 0018:c9e6faa0 EFLAGS: 00010282 RAX: RBX: 89da8900 RCX: RDX: 888011e01c80 RSI: 815c3fd5 RDI: f520001cdf46 RBP: R08: R09: R10: 815bd77e R11: R12: 8880276ac008 R13: 89da8ea0 R14: 8880133e6878 R15: 8880133e68c0 FS: () GS:8880b9d0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7f2c3971a0c8 CR3: 1cf2a000 CR4: 001506e0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: dpm_sysfs_remove+0x97/0xb0 drivers/base/power/sysfs.c:837 device_del+0x20c/0xd40 drivers/base/core.c:3398 fw_load_sysfs_fallback drivers/base/firmware_loader/fallback.c:543 [inline] fw_load_from_user_helper drivers/base/firmware_loader/fallback.c:581 [inline] firmware_fallback_sysfs+0x9ff/0xe20 
drivers/base/firmware_loader/fallback.c:657 _request_firmware+0xa80/0xe80 drivers/base/firmware_loader/main.c:833 request_firmware_work_func+0xdd/0x230 drivers/base/firmware_loader/main.c:1079 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
Re: [PATCH V2] arm64: dts: qcom: sc7280: Add nodes for eMMC and SD card
On 3/23/2021 9:41 PM, Doug Anderson wrote: Hi, On Sat, Mar 20, 2021 at 11:18 AM Shaik Sajida Bhanu wrote: Add nodes for eMMC and SD card on sc7280. Signed-off-by: Shaik Sajida Bhanu --- This change is depends on the below patch series: https://lore.kernel.org/patchwork/project/lkml/list/?series=488871 https://lore.kernel.org/patchwork/project/lkml/list/?series=489530 https://lore.kernel.org/patchwork/project/lkml/list/?series=488429 Changes since V1: - Moved SDHC nodes as suggested by Bjorn Andersson. - Dropped "pinconf-" prefix as suggested by Bjorn Andersson. - Removed extra newlines as suggested by Konrad Dybcio. - Changed sd-cd pin to bias-pull-up in sdc2_off as suggested by Veerabhadrarao Badiganti. - Added bandwidth votes for eMMC and SD card. --- arch/arm64/boot/dts/qcom/sc7280-idp.dts | 25 arch/arm64/boot/dts/qcom/sc7280.dtsi| 213 2 files changed, 238 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sc7280-idp.dts b/arch/arm64/boot/dts/qcom/sc7280-idp.dts index 54d2cb3..4105263 100644 --- a/arch/arm64/boot/dts/qcom/sc7280-idp.dts +++ b/arch/arm64/boot/dts/qcom/sc7280-idp.dts @@ -8,6 +8,7 @@ /dts-v1/; #include "sc7280.dtsi" +#include / { model = "Qualcomm Technologies, Inc. sc7280 IDP platform"; @@ -242,6 +243,30 @@ status = "okay"; }; +_1 { + status = "okay"; When I apply your patch I find that your sort order is wrong. "s" comes before "u" and after "q" in the alphabet so "sdhc_1" and "sdhc_2" should sort _after "qupv3_id_0" and before "uart5" + pinctrl-names = "default", "sleep"; + pinctrl-0 = <_on>; + pinctrl-1 = <_off>; + + vmmc-supply = <_l7b_2p9>; + vqmmc-supply = <_l19b_1p8>; +}; + +_2 { + status = "okay"; + + pinctrl-names = "default","sleep"; + pinctrl-0 = <_on>; + pinctrl-1 = <_off>; + + vmmc-supply = <_l9c_2p9>; + vqmmc-supply = <_l6c_2p9>; + + cd-gpios = < 91 GPIO_ACTIVE_LOW>; Where is the pinctrl for the card detect? Oh, I see it's in "sdc2_on". Probably would be good to break it out since this is board-specific. See below. 
+}; + /* PINCTRL - additions to nodes defined in sc7280.dtsi */ _uart5_default { diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi b/arch/arm64/boot/dts/qcom/sc7280.dtsi index 8f6b569..69eb064 100644 --- a/arch/arm64/boot/dts/qcom/sc7280.dtsi +++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi @@ -20,6 +20,11 @@ chosen { }; + aliases { + mmc1 = _1; + mmc2 = _2; + }; + clocks { xo_board: xo-board { compatible = "fixed-clock"; @@ -305,6 +310,64 @@ #power-domain-cells = <1>; }; + sdhc_1: sdhci@7c4000 { + compatible = "qcom,sdhci-msm-v5"; Please make the compatible: compatible = "qcom,sc7280-sdhci", "qcom,sdhci-msm-v5"; ...and add to the bindings. It should be a trivial bindings patch so not too much trouble. NOTE: even though the "qcom,sc7280-sdhci" should be in the bindings and here you _shouldn't_ be adding any code for it. Just let the fallback compatible string ("qcom,sdhci-msm-v5") do its magic. Adding the sc7280 specific version is more of a "just in case we need it later" type thing. + reg = <0 0x7c4000 0 0x1000>, + <0 0x7c5000 0 0x1000>; + reg-names = "hc", "cqhci"; + + iommus = <_smmu 0xC0 0x0>; + interrupts = , + ; + interrupt-names = "hc_irq", "pwr_irq"; + + clocks = < GCC_SDCC1_APPS_CLK>, + < GCC_SDCC1_AHB_CLK>, + < RPMH_CXO_CLK>; + clock-names = "core", "iface", "xo"; I'm curious: why is the "xo" clock needed here but not for sc7180? Actually its needed even for sc7180. We are making use of this clock in msm_init_cm_dll() The default PoR value is also same as calculated value for HS200/HS400/SDR104 modes. But just not to rely on default register values we need this entry. + interconnects = <_noc MASTER_SDCC_1 0 _virt SLAVE_EBI1 0>, + <_noc MASTER_APPSS_PROC 0 SLAVE_SDCC_1 0>; + interconnect-names = "sdhc-ddr","cpu-sdhc"; + power-domains = < SC7280_CX>; + operating-points-v2 = <_opp_table>; + + bus-width = <8>; + non-removable; This was actually a problem on sc7180 too, but you probably don't want "non-removable" in the SoC file. 
Board files really should be adding this. Though the SoC might be designed with the idea that this would be used for a non-removable
Re: [PATCH 00/36] [Set 4] Rid W=1 warnings in SCSI
On Wed, 17 Mar 2021 09:11:54 +, Lee Jones wrote: > This set is part of a larger effort attempting to clean-up W=1 > kernel builds, which are currently overwhelmingly riddled with > niggly little warnings. > > Lee Jones (36): > scsi: myrb: Demote non-conformant kernel-doc headers and fix others > scsi: ipr: Fix incorrect function names in their headers > scsi: mvumi: Fix formatting and doc-rot issues > scsi: sd_zbc: Place function name into header > scsi: pmcraid: Fix a whole host of kernel-doc issues > scsi: sd: Fix function name in header > scsi: aic94xx: aic94xx_dump: Correct misspelling of function > asd_dump_seq_state() > scsi: be2iscsi: be_main: Ensure function follows directly after its > header > scsi: dc395x: Fix some function param descriptions > scsi: initio: Fix a few kernel-doc misdemeanours > scsi: a100u2w: Fix some misnaming and formatting issues > scsi: myrs: Add missing ':' to make the kernel-doc checker happy > scsi: pmcraid: Correct function name pmcraid_show_adapter_id() in > header > scsi: mpt3sas: mpt3sas_scs: Fix a few kernel-doc issues > scsi: be2iscsi: be_main: Demote incomplete/non-conformant kernel-doc > header > scsi: isci: phy: Fix a few different kernel-doc related issues > scsi: fnic: fnic_scsi: Demote non-conformant kernel-doc headers > scsi: fnic: fnic_fcs: Kernel-doc headers must contain the function > name > scsi: isci: phy: Provide function name and demote non-conforming > header > scsi: isci: request: Fix a myriad of kernel-doc issues > scsi: isci: host: Fix bunch of kernel-doc related issues > scsi: isci: task: Demote non-conformant header and remove superfluous > param > scsi: isci: remote_node_table: Fix a bunch of kernel-doc misdemeanours > scsi: isci: remote_node_context: Fix one function header and demote a > couple more > scsi: isci: port_config: Fix a bunch of doc-rot and demote abuses > scsi: isci: remote_device: Fix a bunch of doc-rot issues > scsi: isci: request: Fix doc-rot issue relating to 'ireq' param > scsi: isci: 
port: Fix a bunch of kernel-doc issues > scsi: isci: remote_node_context: Demote kernel-doc abuse > scsi: isci: remote_node_table: Provide some missing params and remove > others > scsi: cxlflash: main: Fix a little do-rot > scsi: cxlflash: superpipe: Fix a few misnaming issues > scsi: ibmvscsi: Fix a bunch of kernel-doc related issues > scsi: ibmvscsi: ibmvfc: Fix a bunch of misdocumentation > scsi: ibmvscsi_tgt: ibmvscsi_tgt: Remove duplicate section 'NOTE' > scsi: cxlflash: vlun: Fix some misnaming related doc-rot > > [...] Applied to 5.13/scsi-queue, thanks! [01/36] scsi: myrb: Demote non-conformant kernel-doc headers and fix others https://git.kernel.org/mkp/scsi/c/12a1b740f225 [02/36] scsi: ipr: Fix incorrect function names in their headers https://git.kernel.org/mkp/scsi/c/637b5c3ebc1c [03/36] scsi: mvumi: Fix formatting and doc-rot issues https://git.kernel.org/mkp/scsi/c/5ccd626516e1 [04/36] scsi: sd_zbc: Place function name into header https://git.kernel.org/mkp/scsi/c/59863cb53d80 [05/36] scsi: pmcraid: Fix a whole host of kernel-doc issues https://git.kernel.org/mkp/scsi/c/3673b7b0007b [06/36] scsi: sd: Fix function name in header https://git.kernel.org/mkp/scsi/c/ad907c54e36f [07/36] scsi: aic94xx: aic94xx_dump: Correct misspelling of function asd_dump_seq_state() https://git.kernel.org/mkp/scsi/c/3e2f4679ea03 [08/36] scsi: be2iscsi: be_main: Ensure function follows directly after its header https://git.kernel.org/mkp/scsi/c/f1d50e8ee5c9 [09/36] scsi: dc395x: Fix some function param descriptions https://git.kernel.org/mkp/scsi/c/33c8ef953ece [10/36] scsi: initio: Fix a few kernel-doc misdemeanours https://git.kernel.org/mkp/scsi/c/100ec495e01e [11/36] scsi: a100u2w: Fix some misnaming and formatting issues https://git.kernel.org/mkp/scsi/c/c548a6250627 [12/36] scsi: myrs: Add missing ':' to make the kernel-doc checker happy https://git.kernel.org/mkp/scsi/c/9eb292eb2ef7 [13/36] scsi: pmcraid: Correct function name pmcraid_show_adapter_id() in header 
https://git.kernel.org/mkp/scsi/c/a364a147b1dc [14/36] scsi: mpt3sas: mpt3sas_scs: Fix a few kernel-doc issues https://git.kernel.org/mkp/scsi/c/a8d548b0b3ee [15/36] scsi: be2iscsi: be_main: Demote incomplete/non-conformant kernel-doc header https://git.kernel.org/mkp/scsi/c/a90a8c607570 [16/36] scsi: isci: phy: Fix a few different kernel-doc related issues https://git.kernel.org/mkp/scsi/c/6af1d9bd9051 [17/36] scsi: fnic: fnic_scsi: Demote non-conformant kernel-doc headers https://git.kernel.org/mkp/scsi/c/c7eab0704c30 [18/36] scsi: fnic: fnic_fcs: Kernel-doc headers must contain the function name https://git.kernel.org/mkp/scsi/c/2efd8631d6a5 [19/36] scsi: isci: phy: Provide function name and demote non-conforming header
Re: [PATCH v3] scsi: ufs: Tidy up WB configuration code
On Thu, 18 Mar 2021 17:55:36 +0800, Yue Hu wrote:

> There are similar code implementations for WB configuration in
> ufshcd_wb_{ctrl, toggle_flush_during_h8, toggle_flush}. We can
> extract the part to create a new helper with a flag parameter to
> reduce code duplication.
>
> Meanwhile, rename ufshcd_wb_ctrl() to ufshcd_wb_toggle() for better
> readability.
>
> [...]

Applied to 5.13/scsi-queue, thanks!

[1/1] scsi: ufs: Tidy up WB configuration code
      https://git.kernel.org/mkp/scsi/c/3b5f3c0d0548

-- 
Martin K. Petersen	Oracle Linux Engineering
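The refactoring described in the quoted commit message (one helper with a flag parameter behind three thin wrappers) can be sketched as below. This is a userspace illustration only: enum wb_flag, struct mock_dev, mock_query_flag() and the wb_* function names are hypothetical and are not the UFS driver's identifiers or flag IDs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag identifiers for the sketch, not the real UFS flag IDs. */
enum wb_flag {
	WB_EN,
	WB_FLUSH_DURING_H8,
	WB_FLUSH_EN,
	WB_FLAG_COUNT
};

/* Hypothetical device model: current flag values plus a query counter. */
struct mock_dev {
	bool flags[WB_FLAG_COUNT];
	int queries;
};

/* Stand-in for sending a "set flag" or "clear flag" query to the device. */
static int mock_query_flag(struct mock_dev *dev, enum wb_flag flag, bool set)
{
	assert(flag < WB_FLAG_COUNT);
	dev->queries++;
	dev->flags[flag] = set;
	return 0;
}

/* The shared part: one helper, parameterised by the flag to change. */
static int wb_toggle_flag(struct mock_dev *dev, bool set, enum wb_flag flag)
{
	return mock_query_flag(dev, flag, set);
}

/* The three former near-duplicates become one-line wrappers. */
static int wb_toggle(struct mock_dev *dev, bool enable)
{
	return wb_toggle_flag(dev, enable, WB_EN);
}

static int wb_toggle_flush_during_h8(struct mock_dev *dev, bool set)
{
	return wb_toggle_flag(dev, set, WB_FLUSH_DURING_H8);
}

static int wb_toggle_flush(struct mock_dev *dev, bool enable)
{
	return wb_toggle_flag(dev, enable, WB_FLUSH_EN);
}
```

The wrappers keep call sites readable while the query logic lives in one place, so a change to how a flag is set or cleared only needs to be made once.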
Re: [PATCH] scsi: fnic: Rudimentary spelling fixes throughout the file fnic_trace.c
On Wed, 17 Mar 2021 14:52:40 +0530, Bhaskar Chowdhury wrote:

> Rudimentary typo fixes throughout the file.

Applied to 5.13/scsi-queue, thanks!

[1/1] scsi: fnic: Rudimentary spelling fixes throughout the file fnic_trace.c
      https://git.kernel.org/mkp/scsi/c/bcf064bc2a3b

-- 
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH 0/2] Fix EH race and MQ support
On Fri, 19 Mar 2021 14:50:27 -0600, Tyrel Datwyler wrote:

> Changes to the locking pattern protecting the event lists and handling of scsi
> command completion introduced a race where an outstanding command that EH is
> waiting for to complete is no longer identifiable by being on the sent list,
> but instead as a command that is not on the free list. This is a result of
> moving commands to be completed off the sent list to a private list to be
> completed outside the list lock.
>
> [...]

Applied to 5.12/scsi-fixes, thanks!

[1/2] ibmvfc: fix potential race in ibmvfc_wait_for_ops
      https://git.kernel.org/mkp/scsi/c/8b1c9b202549
[2/2] ibmvfc: make ibmvfc_wait_for_ops MQ aware
      https://git.kernel.org/mkp/scsi/c/62fc2661482b

-- 
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH] message: fusion: Fix a typo in the file mptbase.h
On Wed, 17 Mar 2021 15:42:38 +0530, Bhaskar Chowdhury wrote:

> s/contets/contents/

Applied to 5.13/scsi-queue, thanks!

[1/1] message: fusion: Fix a typo in the file mptbase.h
      https://git.kernel.org/mkp/scsi/c/69a1709e2ec8

-- 
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH] scsi: qedi: fix error return code of qedi_alloc_global_queues()
On Sun, 7 Mar 2021 19:30:24 -0800, Jia-Ju Bai wrote:

> When kzalloc() returns NULL to qedi->global_queues[i], no error return
> code of qedi_alloc_global_queues() is assigned.
> To fix this bug, status is assigned with -ENOMEM in this case.

Applied to 5.12/scsi-fixes, thanks!

[1/1] scsi: qedi: fix error return code of qedi_alloc_global_queues()
      https://git.kernel.org/mkp/scsi/c/f69953837ca5

-- 
Martin K. Petersen	Oracle Linux Engineering
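The bug class described in the quoted commit message (an allocation fails, the function jumps to its cleanup label, and the status variable still holds 0) can be reproduced in a small userspace sketch. Everything here is hypothetical illustration rather than qedi code: NUM_QUEUES, struct queue, alloc_queue() with its fail_at injection parameter, and the two alloc_queues_* functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define NUM_QUEUES 4	/* hypothetical queue count for the sketch */

struct queue {
	int id;
};

/* Allocation hook that fails on purpose at index fail_at (-1: never). */
static struct queue *alloc_queue(int i, int fail_at)
{
	assert(i >= 0);
	if (i == fail_at)
		return NULL;	/* simulated allocation failure */
	return calloc(1, sizeof(struct queue));
}

static void free_queues(struct queue **q)
{
	for (int i = 0; i < NUM_QUEUES; i++) {
		free(q[i]);	/* free(NULL) is a no-op */
		q[i] = NULL;
	}
}

/* Buggy shape: the failure path leaves status at 0, so callers see success. */
static int alloc_queues_buggy(int fail_at)
{
	struct queue *q[NUM_QUEUES] = { NULL };
	int status = 0;

	for (int i = 0; i < NUM_QUEUES; i++) {
		q[i] = alloc_queue(i, fail_at);
		if (!q[i])
			goto out;	/* bug: no error code assigned */
	}
out:
	free_queues(q);	/* the sketch releases everything either way */
	return status;
}

/* Fixed shape: assign -ENOMEM before leaving through the cleanup label. */
static int alloc_queues_fixed(int fail_at)
{
	struct queue *q[NUM_QUEUES] = { NULL };
	int status = 0;

	for (int i = 0; i < NUM_QUEUES; i++) {
		q[i] = alloc_queue(i, fail_at);
		if (!q[i]) {
			status = -ENOMEM;
			goto out;
		}
	}
out:
	free_queues(q);
	return status;
}
```

With a failure injected at index 2, the buggy variant reports 0 while the fixed variant reports -ENOMEM. The same one-line assignment is what the mpt3sas fix later in this archive describes for its own status variable.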
Re: [PATCH] scsi: mpt3sas: fix error return code of mpt3sas_base_attach()
On Sun, 7 Mar 2021 19:52:41 -0800, Jia-Ju Bai wrote:

> When kzalloc() returns NULL, no error return code of
> mpt3sas_base_attach() is assigned.
> To fix this bug, r is assigned with -ENOMEM in this case.

Applied to 5.12/scsi-fixes, thanks!

[1/1] scsi: mpt3sas: fix error return code of mpt3sas_base_attach()
      https://git.kernel.org/mkp/scsi/c/3401ecf7fc1b

-- 
Martin K. Petersen	Oracle Linux Engineering
[PATCH v4] audit: log nftables configuration change events once per table
Reduce logging of nftables events to a level similar to iptables. Restore the table field to list the table, adding the generation. Indicate the op as the most significant operation in the event. A couple of sample events: type=PROCTITLE msg=audit(2021-03-18 09:30:49.801:143) : proctitle=/usr/bin/python3 -s /usr/sbin/firewalld --nofork --nopid type=SYSCALL msg=audit(2021-03-18 09:30:49.801:143) : arch=x86_64 syscall=sendmsg success=yes exit=172 a0=0x6 a1=0x7ffdcfcbe650 a2=0x0 a3=0x7ffdcfcbd52c items=0 ppid=1 pid=367 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=roo t sgid=root fsgid=root tty=(none) ses=unset comm=firewalld exe=/usr/bin/python3.9 subj=system_u:system_r:firewalld_t:s0 key=(null) type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : table=firewalld:2 family=ipv6 entries=1 op=nft_register_table pid=367 subj=system_u:system_r:firewalld_t:s0 comm=firewalld type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : table=firewalld:2 family=ipv4 entries=1 op=nft_register_table pid=367 subj=system_u:system_r:firewalld_t:s0 comm=firewalld type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : table=firewalld:2 family=inet entries=1 op=nft_register_table pid=367 subj=system_u:system_r:firewalld_t:s0 comm=firewalld type=PROCTITLE msg=audit(2021-03-18 09:30:49.839:144) : proctitle=/usr/bin/python3 -s /usr/sbin/firewalld --nofork --nopid type=SYSCALL msg=audit(2021-03-18 09:30:49.839:144) : arch=x86_64 syscall=sendmsg success=yes exit=22792 a0=0x6 a1=0x7ffdcfcbe650 a2=0x0 a3=0x7ffdcfcbd52c items=0 ppid=1 pid=367 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=r oot sgid=root fsgid=root tty=(none) ses=unset comm=firewalld exe=/usr/bin/python3.9 subj=system_u:system_r:firewalld_t:s0 key=(null) type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : table=firewalld:3 family=ipv6 entries=30 op=nft_register_chain pid=367 subj=system_u:system_r:firewalld_t:s0 comm=firewalld type=NETFILTER_CFG msg=audit(2021-03-18 
09:30:49.839:144) : table=firewalld:3 family=ipv4 entries=30 op=nft_register_chain pid=367 subj=system_u:system_r:firewalld_t:s0 comm=firewalld type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : table=firewalld:3 family=inet entries=165 op=nft_register_chain pid=367 subj=system_u:system_r:firewalld_t:s0 comm=firewalld The issue was originally documented in https://github.com/linux-audit/audit-kernel/issues/124 Signed-off-by: Richard Guy Briggs --- Changelog: v4: - move nf_tables_commit_audit_log() before nf_tables_commit_release() [fw] - move nft2audit_op[] from audit.h to nf_tables_api.c v3: - fix function braces, reduce parameter scope [pna] - pre-allocate nft_audit_data per table in step 1, bail on ENOMEM [pna] v2: - convert NFT ops to array indicies in nft2audit_op[] [ps] - use linux lists [pna] - use functions for each of collection and logging of audit data [pna] --- net/netfilter/nf_tables_api.c | 187 +++--- 1 file changed, 104 insertions(+), 83 deletions(-) diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index c1eb5cdb3033..9c930fe72005 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -66,6 +66,41 @@ static const struct rhashtable_params nft_objname_ht_params = { .automatic_shrinking= true, }; +struct nft_audit_data { + struct nft_table *table; + int entries; + int op; + struct list_head list; +}; + +static const u8 nft2audit_op[NFT_MSG_MAX] = { // enum nf_tables_msg_types + [NFT_MSG_NEWTABLE] = AUDIT_NFT_OP_TABLE_REGISTER, + [NFT_MSG_GETTABLE] = AUDIT_NFT_OP_INVALID, + [NFT_MSG_DELTABLE] = AUDIT_NFT_OP_TABLE_UNREGISTER, + [NFT_MSG_NEWCHAIN] = AUDIT_NFT_OP_CHAIN_REGISTER, + [NFT_MSG_GETCHAIN] = AUDIT_NFT_OP_INVALID, + [NFT_MSG_DELCHAIN] = AUDIT_NFT_OP_CHAIN_UNREGISTER, + [NFT_MSG_NEWRULE] = AUDIT_NFT_OP_RULE_REGISTER, + [NFT_MSG_GETRULE] = AUDIT_NFT_OP_INVALID, + [NFT_MSG_DELRULE] = AUDIT_NFT_OP_RULE_UNREGISTER, + [NFT_MSG_NEWSET]= AUDIT_NFT_OP_SET_REGISTER, + [NFT_MSG_GETSET]= 
AUDIT_NFT_OP_INVALID, + [NFT_MSG_DELSET]= AUDIT_NFT_OP_SET_UNREGISTER, + [NFT_MSG_NEWSETELEM]= AUDIT_NFT_OP_SETELEM_REGISTER, + [NFT_MSG_GETSETELEM]= AUDIT_NFT_OP_INVALID, + [NFT_MSG_DELSETELEM]= AUDIT_NFT_OP_SETELEM_UNREGISTER, + [NFT_MSG_NEWGEN]= AUDIT_NFT_OP_GEN_REGISTER, + [NFT_MSG_GETGEN]= AUDIT_NFT_OP_INVALID, + [NFT_MSG_TRACE] = AUDIT_NFT_OP_INVALID, + [NFT_MSG_NEWOBJ]= AUDIT_NFT_OP_OBJ_REGISTER, + [NFT_MSG_GETOBJ]= AUDIT_NFT_OP_INVALID, + [NFT_MSG_DELOBJ]= AUDIT_NFT_OP_OBJ_UNREGISTER, + [NFT_MSG_GETOBJ_RESET] = AUDIT_NFT_OP_OBJ_RESET, + [NFT_MSG_NEWFLOWTABLE] = AUDIT_NFT_OP_FLOWTABLE_REGISTER, + [NFT_MSG_GETFLOWTABLE]
[PATCH resend 3/4] nfc: fix memory leak in llcp_sock_connect()
In llcp_sock_connect(), memory for "llcp_sock->service_name" is allocated
with kmemdup(). That memory is not released at the sock_unlink label of the
subsequent failure branch, so it is leaked.

Fixes CVE-2020-25672.

Fixes: d646960f7986 ("NFC: Initial LLCP support")
Reported-by: "kiyin(尹亮)"
Link: https://www.openwall.com/lists/oss-security/2020/11/01/1
Cc: #v3.3
Signed-off-by: Xiaoming Ni
---
 net/nfc/llcp_sock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 9e2799ee1595..59172614b249 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -746,6 +746,8 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr,
 
 sock_unlink:
 	nfc_llcp_sock_unlink(&local->connecting_sockets, sk);
+	kfree(llcp_sock->service_name);
+	llcp_sock->service_name = NULL;
 
 sock_llcp_release:
 	nfc_llcp_put_ssap(local, llcp_sock->ssap);
-- 
2.27.0
[PATCH resend 0/4] nfc: fix Resource leakage and endless loop
Fix resource leaks and an endless loop in net/nfc/llcp_sock.c, reported by
"kiyin(尹亮)".

Link: https://www.openwall.com/lists/oss-security/2020/11/01/1

Xiaoming Ni (4):
  nfc: fix refcount leak in llcp_sock_bind()
  nfc: fix refcount leak in llcp_sock_connect()
  nfc: fix memory leak in llcp_sock_connect()
  nfc: Avoid endless loops caused by repeated llcp_sock_connect()

 net/nfc/llcp_sock.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

-- 
2.27.0
[PATCH resend 1/4] nfc: fix refcount leak in llcp_sock_bind()
nfc_llcp_local_get() is invoked in llcp_sock_bind(), but
nfc_llcp_local_put() is not invoked in the subsequent failure branches.
As a result, the reference count leaks. To fix it, call
nfc_llcp_local_put() in those branches.

Fixes CVE-2020-25670.

Fixes: c7aa12252f51 ("NFC: Take a reference on the LLCP local pointer when creating a socket")
Reported-by: "kiyin(尹亮)"
Link: https://www.openwall.com/lists/oss-security/2020/11/01/1
Cc: #v3.6
Signed-off-by: Xiaoming Ni
---
 net/nfc/llcp_sock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index d257ed3b732a..68832ee4b9f8 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -108,11 +108,13 @@ static int llcp_sock_bind(struct socket *sock, struct sockaddr *addr, int alen)
 					  llcp_sock->service_name_len,
 					  GFP_KERNEL);
 	if (!llcp_sock->service_name) {
+		nfc_llcp_local_put(llcp_sock->local);
 		ret = -ENOMEM;
 		goto put_dev;
 	}
 	llcp_sock->ssap = nfc_llcp_get_sdp_ssap(local, llcp_sock);
 	if (llcp_sock->ssap == LLCP_SAP_MAX) {
+		nfc_llcp_local_put(llcp_sock->local);
 		kfree(llcp_sock->service_name);
 		llcp_sock->service_name = NULL;
 		ret = -EADDRINUSE;
-- 
2.27.0
[PATCH resend 2/4] nfc: fix refcount leak in llcp_sock_connect()
nfc_llcp_local_get() is invoked in llcp_sock_connect(), but
nfc_llcp_local_put() is not invoked in the subsequent failure branches.
As a result, the reference count leaks. To fix it, call
nfc_llcp_local_put() in those branches.

Fixes CVE-2020-25671.

Fixes: c7aa12252f51 ("NFC: Take a reference on the LLCP local pointer when creating a socket")
Reported-by: "kiyin(尹亮)"
Link: https://www.openwall.com/lists/oss-security/2020/11/01/1
Cc: #v3.6
Signed-off-by: Xiaoming Ni
---
 net/nfc/llcp_sock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 68832ee4b9f8..9e2799ee1595 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -704,6 +704,7 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr,
 	llcp_sock->local = nfc_llcp_local_get(local);
 	llcp_sock->ssap = nfc_llcp_get_local_ssap(local);
 	if (llcp_sock->ssap == LLCP_SAP_MAX) {
+		nfc_llcp_local_put(llcp_sock->local);
 		ret = -ENOMEM;
 		goto put_dev;
 	}
@@ -748,6 +749,7 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr,
 
 sock_llcp_release:
 	nfc_llcp_put_ssap(local, llcp_sock->ssap);
+	nfc_llcp_local_put(llcp_sock->local);
 
 put_dev:
 	nfc_put_device(dev);
-- 
2.27.0
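The get/put balance described in the two refcount patches can be sketched in
userspace as follows. This is only a toy model, not the kernel code: struct
toy_local, toy_get(), toy_put(), toy_connect() and refcnt_after() are
hypothetical names, and the counter is a plain int rather than a kref. It
shows that every failure path after the "get" must drop the reference, either
inline or through the unwind label.

```c
#include <assert.h>

/* Hypothetical stand-in for the refcounted LLCP local object. */
struct toy_local {
	int refcnt;
};

static struct toy_local *toy_get(struct toy_local *l)
{
	l->refcnt++;
	return l;
}

static void toy_put(struct toy_local *l)
{
	l->refcnt--;
}

/*
 * Mimics the shape of a connect path: take a reference, then fail early
 * (fail_step == 1), fail late (fail_step == 2), or succeed (fail_step == 0).
 */
static int toy_connect(struct toy_local *l, int fail_step)
{
	struct toy_local *ref = toy_get(l);
	int ret = 0;

	if (fail_step == 1) {
		/* early failure: drop the reference before bailing out */
		toy_put(ref);
		return -1;
	}
	if (fail_step == 2) {
		ret = -2;
		goto release;
	}
	return 0;	/* success: the socket keeps its reference */

release:
	/* late failure: the unwind label drops the reference */
	toy_put(ref);
	return ret;
}

/* Start with one owner and report the count after one connect attempt. */
static int refcnt_after(int fail_step)
{
	struct toy_local l = { 1 };

	toy_connect(&l, fail_step);
	return l.refcnt;
}
```

In this model, refcnt_after(1) and refcnt_after(2) leave the count where it
started, while the success case keeps one extra reference for the socket.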
[PATCH resend 4/4] nfc: Avoid endless loops caused by repeated llcp_sock_connect()
When sock_wait_state() returns -EINPROGRESS, "sk->sk_state" is
LLCP_CONNECTING. If llcp_sock_connect() is then invoked again,
nfc_llcp_sock_link() adds sk to local->connecting_sockets a second time.
sk->sk_node->next then points to itself, which causes an endless loop and
hangs the system.

To fix it, check whether sk->sk_state is LLCP_CONNECTING in
llcp_sock_connect() and return early to avoid the repeated invocation.

Fixes: b4011239a08e ("NFC: llcp: Fix non blocking sockets connections")
Reported-by: "kiyin(尹亮)"
Link: https://www.openwall.com/lists/oss-security/2020/11/01/1
Cc: #v3.11
Signed-off-by: Xiaoming Ni
---
 net/nfc/llcp_sock.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 59172614b249..a3b46f03 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -673,6 +673,10 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr,
 		ret = -EISCONN;
 		goto error;
 	}
+	if (sk->sk_state == LLCP_CONNECTING) {
+		ret = -EINPROGRESS;
+		goto error;
+	}
 
 	dev = nfc_get_device(addr->dev_idx);
 	if (dev == NULL) {
-- 
2.27.0
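The self-referencing "next" pointer described above can be reproduced with a
minimal singly linked list. The sketch below is an assumption-laden userspace
model: struct toy_node, struct toy_head, toy_add_head() and toy_walk() are
hypothetical names that only imitate a head insertion without an "already
linked" check; they are not the kernel's hlist implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified list types (not the kernel's hlist). */
struct toy_node {
	struct toy_node *next;
};

struct toy_head {
	struct toy_node *first;
};

/* Insert n at the head; nothing checks whether n is already linked. */
static void toy_add_head(struct toy_head *h, struct toy_node *n)
{
	n->next = h->first;
	h->first = n;
}

/* Walk the list, giving up after `limit` steps; returns steps taken. */
static int toy_walk(const struct toy_head *h, int limit)
{
	const struct toy_node *n;
	int steps = 0;

	for (n = h->first; n && steps < limit; n = n->next)
		steps++;
	return steps;
}

/* Returns 1 if adding the same node twice leaves it pointing at itself. */
static int double_add_self_loops(void)
{
	struct toy_head h = { NULL };
	struct toy_node sk = { NULL };

	toy_add_head(&h, &sk);	/* first connect attempt */
	toy_add_head(&h, &sk);	/* repeated connect: sk.next = h.first = &sk */

	/* The walk only stops because of the step limit. */
	return sk.next == &sk && toy_walk(&h, 1000) == 1000;
}
```

Without the step limit, toy_walk() would never terminate, which matches the
hang described in the commit message.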
Re: [PATCH] tee: optee: add invoke_fn tracepoints
On Wed, 24 Mar 2021 10:53:13 -0400 Steven Rostedt wrote: > > On Wed, 24 Mar 2021 07:48:53 -0700 > Guenter Roeck wrote: > > > On Wed, Mar 24, 2021 at 07:34:07AM -0700, Guenter Roeck wrote: > > > On Wed, Feb 10, 2021 at 02:44:09PM +0800, Jisheng Zhang wrote: > > > > Add tracepoints to retrieve information about the invoke_fn. This would > > > > help to measure how many invoke_fn are triggered and how long it takes > > > > to complete one invoke_fn call. > > > > > > > > Signed-off-by: Jisheng Zhang > > > > > > arm64:defconfig: > > > > > > make-arm64 -j drivers/tee/optee/call.o > > > CALLscripts/atomic/check-atomics.sh > > > CALLscripts/checksyscalls.sh > > > CC drivers/tee/optee/call.o > > > In file included from drivers/tee/optee/optee_trace.h:67, > > > from drivers/tee/optee/call.c:18: > > > ./include/trace/define_trace.h:95:42: fatal error: ./optee_trace.h: No > > > such file or directory > > >95 | #include TRACE_INCLUDE(TRACE_INCLUDE_FILE) > > > | ^ > > > compilation terminated. Interesting, I always build linux kernel with "O=", didn't see such build error and IIRC, we didn't receive any lkp robot build error report. My steps are: mkdir /tmp/test make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- O=/tmp/test defconfig make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- O=/tmp/test drivers/tee/optee/ Today, I tried to build the linux kernel w/o "O=...", I reproduced this error! This is the first time I saw "O=" make a different behavior. I'll send out a patch to fix it. Thanks > > > > > > > The problem also affects arm:imx_v6_v7_defconfig. > > > > I think it affects everything. The problem is that the > drivers/tee/optee/Makefile needs to be updated with: > > CFLAGS_call.o := -I$(src) > > otherwise the compiler wont know how to find the path to optee_tree.h. > > This is described in: > >samples/trace_events/Makefile Thank Steven for pointing this out.
Re: [PATCH 2/2] media: videobuf2: cleanup size argument from attach_dmabuf()
Hi Helen, I love your patch! Yet something to improve: [auto build test ERROR on linuxtv-media/master] [also build test ERROR on next-20210324] [cannot apply to v5.12-rc4] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Helen-Koike/media-videobuf2-use-dmabuf-size-for-length/20210325-082047 base: git://linuxtv.org/media_tree.git master config: powerpc64-randconfig-r016-20210325 (attached as .config) compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 5d6b4aa80d6df62b924a12af030c5ded868ee4f1) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install powerpc64 cross compiling tool for clang build # apt-get install binutils-powerpc64-linux-gnu # https://github.com/0day-ci/linux/commit/41e2cea31db8378b33e31785aec668a009d1355b git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Helen-Koike/media-videobuf2-use-dmabuf-size-for-length/20210325-082047 git checkout 41e2cea31db8378b33e31785aec668a009d1355b # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>): >> drivers/media/common/videobuf2/videobuf2-dma-sg.c:631:14: error: use of >> undeclared identifier 'dmabuf'; did you mean 'dbuf'? buf->size = dmabuf->size; ^~ dbuf drivers/media/common/videobuf2/videobuf2-dma-sg.c:608:75: note: 'dbuf' declared here static void *vb2_dma_sg_attach_dmabuf(struct device *dev, struct dma_buf *dbuf, ^ 1 error generated. 
vim +631 drivers/media/common/videobuf2/videobuf2-dma-sg.c

   607	
   608	static void *vb2_dma_sg_attach_dmabuf(struct device *dev, struct dma_buf *dbuf,
   609		enum dma_data_direction dma_dir)
   610	{
   611		struct vb2_dma_sg_buf *buf;
   612		struct dma_buf_attachment *dba;
   613	
   614		if (WARN_ON(!dev))
   615			return ERR_PTR(-EINVAL);
   616	
   617		buf = kzalloc(sizeof(*buf), GFP_KERNEL);
   618		if (!buf)
   619			return ERR_PTR(-ENOMEM);
   620	
   621		buf->dev = dev;
   622		/* create attachment for the dmabuf with the user device */
   623		dba = dma_buf_attach(dbuf, buf->dev);
   624		if (IS_ERR(dba)) {
   625			pr_err("failed to attach dmabuf\n");
   626			kfree(buf);
   627			return dba;
   628		}
   629	
   630		buf->dma_dir = dma_dir;
 > 631		buf->size = dmabuf->size;
   632		buf->db_attach = dba;
   633	
   634		return buf;
   635	}
   636	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

.config.gz
Description: application/gzip
Re: [PATCH 5.10 000/150] 5.10.26-rc3 review
On 3/24/2021 2:40 AM, Greg Kroah-Hartman wrote: > This is the start of the stable review cycle for the 5.10.26 release. > There are 150 patches in this series, all will be posted as a response > to this one. If anyone has any issues with these being applied, please > let me know. > > Responses should be made by Fri, 26 Mar 2021 09:33:54 +. > Anything received after that time might be too late. > > The whole patch series can be found in one patch at: > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.26-rc3.gz > or in the git tree and branch at: > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git > linux-5.10.y > and the diffstat can be found below. > > thanks, > > greg k-h On ARCH_BRCMSTB, using 32-bit and 64-bit ARM kernels: Tested-by: Florian Fainelli -- Florian
[PATCH 1/3] ASoC:codec:max98373: Changed amp shutdown register as volatile
0x20FF(amp global enable) register was defined as non-volatile, but it is not. Overheating, overcurrent can cause amp shutdown in hardware. 'regmap_write' compare register readback value before writing to avoid same value writing. 'regmap_read' just read cache not actual hardware value for the non-volatile register. When amp is internally shutdown by some reason, next 'AMP ON' command can be ignored because regmap think amp is already ON. Signed-off-by: Ryan Lee --- sound/soc/codecs/max98373-i2c.c | 1 + sound/soc/codecs/max98373-sdw.c | 1 + 2 files changed, 2 insertions(+) diff --git a/sound/soc/codecs/max98373-i2c.c b/sound/soc/codecs/max98373-i2c.c index 85f6865019d4..ddb6436835d7 100644 --- a/sound/soc/codecs/max98373-i2c.c +++ b/sound/soc/codecs/max98373-i2c.c @@ -446,6 +446,7 @@ static bool max98373_volatile_reg(struct device *dev, unsigned int reg) case MAX98373_R2054_MEAS_ADC_PVDD_CH_READBACK: case MAX98373_R2055_MEAS_ADC_THERM_CH_READBACK: case MAX98373_R20B6_BDE_CUR_STATE_READBACK: + case MAX98373_R20FF_GLOBAL_SHDN: case MAX98373_R21FF_REV_ID: return true; default: diff --git a/sound/soc/codecs/max98373-sdw.c b/sound/soc/codecs/max98373-sdw.c index d8c47667a9ea..f3a12205cd48 100644 --- a/sound/soc/codecs/max98373-sdw.c +++ b/sound/soc/codecs/max98373-sdw.c @@ -220,6 +220,7 @@ static bool max98373_volatile_reg(struct device *dev, unsigned int reg) case MAX98373_R2054_MEAS_ADC_PVDD_CH_READBACK: case MAX98373_R2055_MEAS_ADC_THERM_CH_READBACK: case MAX98373_R20B6_BDE_CUR_STATE_READBACK: + case MAX98373_R20FF_GLOBAL_SHDN: case MAX98373_R21FF_REV_ID: /* SoundWire Control Port Registers */ case MAX98373_R0040_SCP_INIT_STAT_1 ... MAX98373_R0070_SCP_FRAME_CTLR: -- 2.17.1
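The caching problem described in this patch can be illustrated with a toy
model. The sketch below is not regmap; struct toy_reg, toy_read(),
toy_update_bits() and enable_after_fault() are hypothetical names. It only
assumes what the commit message states: a non-volatile register is read from
the cache, and a write that would not change the observed value is skipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one register with a hardware value and a driver-side cache. */
struct toy_reg {
	unsigned int hw;	/* what the chip actually holds */
	unsigned int cache;	/* what the driver last wrote */
	bool is_volatile;	/* volatile registers bypass the cache on read */
	int hw_writes;		/* number of writes that reached the chip */
};

static unsigned int toy_read(const struct toy_reg *r)
{
	return r->is_volatile ? r->hw : r->cache;
}

/* Read-modify-write that skips the write when nothing would change. */
static void toy_update_bits(struct toy_reg *r, unsigned int mask,
			    unsigned int val)
{
	unsigned int old = toy_read(r);
	unsigned int next = (old & ~mask) | (val & mask);

	if (next == old)
		return;		/* looks unchanged: no bus write */
	r->hw = next;
	r->cache = next;
	r->hw_writes++;
}

/*
 * Enable the amp, let the hardware shut itself down (for example after
 * overheating), then try to enable it again. Returns the final hardware
 * value of the enable bit.
 */
static unsigned int enable_after_fault(bool is_volatile)
{
	struct toy_reg r = { 0, 0, is_volatile, 0 };

	toy_update_bits(&r, 1, 1);	/* AMP ON */
	r.hw = 0;			/* fault clears the bit behind the cache */
	toy_update_bits(&r, 1, 1);	/* AMP ON again */
	return r.hw;
}
```

In the non-volatile case the second "AMP ON" is dropped because the cache
still reads 1; marking the register volatile makes the read see the real
value, so the write goes through.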
[PATCH 3/3] ASoC:codec:max98373: Added controls for autorestart config
3 new controls are added. "OVC Autorestart Switch" : controls whether or not the speaker amplifier automatically re-enables after an overcurrent fault condition. "THERM Autorestart Switch" : controls whether or not the device automatically resumes playback when the die temperature recovers from thermal shutdown. "CMON Autorestart Switch" : controls whether or not the device automatically resumes playback when the clock returns after stopping. Above Auto Restart functions are enabled by default. Signed-off-by: Ryan Lee --- sound/soc/codecs/max98373.c | 14 ++ sound/soc/codecs/max98373.h | 3 +++ 2 files changed, 17 insertions(+) diff --git a/sound/soc/codecs/max98373.c b/sound/soc/codecs/max98373.c index 1346a98ce8a1..e14fe98349a5 100644 --- a/sound/soc/codecs/max98373.c +++ b/sound/soc/codecs/max98373.c @@ -204,6 +204,15 @@ SOC_SINGLE("Ramp Up Switch", MAX98373_R203F_AMP_DSP_CFG, MAX98373_AMP_DSP_CFG_RMP_UP_SHIFT, 1, 0), SOC_SINGLE("Ramp Down Switch", MAX98373_R203F_AMP_DSP_CFG, MAX98373_AMP_DSP_CFG_RMP_DN_SHIFT, 1, 0), +/* Speaker Amplifier Overcurrent Automatic Restart Enable */ +SOC_SINGLE("OVC Autorestart Switch", MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG, + MAX98373_OVC_AUTORESTART_SHIFT, 1, 0), +/* Thermal Shutdown Automatic Restart Enable */ +SOC_SINGLE("THERM Autorestart Switch", MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG, + MAX98373_THERM_AUTORESTART_SHIFT, 1, 0), +/* Clock Monitor Automatic Restart Enable */ +SOC_SINGLE("CMON Autorestart Switch", MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG, + MAX98373_CMON_AUTORESTART_SHIFT, 1, 0), SOC_SINGLE("CLK Monitor Switch", MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG, MAX98373_CLOCK_MON_SHIFT, 1, 0), SOC_SINGLE("Dither Switch", MAX98373_R203F_AMP_DSP_CFG, @@ -392,6 +401,11 @@ static int max98373_probe(struct snd_soc_component *component) MAX98373_R2021_PCM_TX_HIZ_EN_2, 1 << (max98373->i_slot - 8), 0); + /* enable auto restart function by default */ + regmap_write(max98373->regmap, + MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG, + 0xF); + 
/* speaker feedback slot configuration */ regmap_write(max98373->regmap, MAX98373_R2023_PCM_TX_SRC_2, diff --git a/sound/soc/codecs/max98373.h b/sound/soc/codecs/max98373.h index 71f5a5228f34..73a2cf69d84a 100644 --- a/sound/soc/codecs/max98373.h +++ b/sound/soc/codecs/max98373.h @@ -195,6 +195,9 @@ #define MAX98373_LIMITER_EN_SHIFT (0) /* MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG */ +#define MAX98373_OVC_AUTORESTART_SHIFT (3) +#define MAX98373_THERM_AUTORESTART_SHIFT (2) +#define MAX98373_CMON_AUTORESTART_SHIFT (1) #define MAX98373_CLOCK_MON_SHIFT (0) /* MAX98373_R20FF_GLOBAL_SHDN */ -- 2.17.1
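As a quick check of the default written in max98373_probe(), the value 0xF is
what results from setting all four low bits of the auto-restart register. The
sketch below reuses the shift values from the header hunk above; the TOY_
prefixed macro names and auto_restart_default() are hypothetical and exist
only for this illustration. Note that bit 0 is the existing clock monitor
enable, so the default also turns that on.

```c
#include <assert.h>

/* Shift values copied from the patch; the TOY_ names are illustrative. */
#define TOY_OVC_AUTORESTART_SHIFT	3
#define TOY_THERM_AUTORESTART_SHIFT	2
#define TOY_CMON_AUTORESTART_SHIFT	1
#define TOY_CLOCK_MON_SHIFT		0

/* Build the register value with all four enable bits set. */
static unsigned int auto_restart_default(void)
{
	return (1u << TOY_OVC_AUTORESTART_SHIFT) |
	       (1u << TOY_THERM_AUTORESTART_SHIFT) |
	       (1u << TOY_CMON_AUTORESTART_SHIFT) |
	       (1u << TOY_CLOCK_MON_SHIFT);
}
```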
[PATCH 2/3] ASoC:codec:max98373: Added 30ms turn on/off time delay
The amp requires 10 to 30 ms to power on and off. Add a 30 ms delay for
stability.

Signed-off-by: Ryan Lee
---
 sound/soc/codecs/max98373.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sound/soc/codecs/max98373.c b/sound/soc/codecs/max98373.c
index 746c829312b8..1346a98ce8a1 100644
--- a/sound/soc/codecs/max98373.c
+++ b/sound/soc/codecs/max98373.c
@@ -28,11 +28,13 @@ static int max98373_dac_event(struct snd_soc_dapm_widget *w,
 		regmap_update_bits(max98373->regmap,
 			MAX98373_R20FF_GLOBAL_SHDN,
 			MAX98373_GLOBAL_EN_MASK, 1);
+		usleep_range(30000, 31000);
 		break;
 	case SND_SOC_DAPM_POST_PMD:
 		regmap_update_bits(max98373->regmap,
 			MAX98373_R20FF_GLOBAL_SHDN,
 			MAX98373_GLOBAL_EN_MASK, 0);
+		usleep_range(30000, 31000);
 		max98373->tdm_mode = false;
 		break;
 	default:
-- 
2.17.1
Re: [PATCH V2] arm64: dts: qcom: sc7280: Add nodes for eMMC and SD card
On 3/24/2021 9:58 PM, Stephen Boyd wrote:
> Quoting Stephen Boyd (2021-03-24 08:57:33)
>> Quoting sbh...@codeaurora.org (2021-03-24 08:23:55)
>>> On 2021-03-23 12:31, Stephen Boyd wrote:
>>>> Quoting Shaik Sajida Bhanu (2021-03-20 11:17:00)
>>>>> +
>>>>> +	bus-width = <8>;
>>>>> +	non-removable;
>>>>> +	supports-cqe;
>>>>> +	no-sd;
>>>>> +	no-sdio;
>>>>> +
>>>>> +	max-frequency = <192000000>;
>>>>
>>>> Is this necessary?
>>>
>>> yes, to avoid lower speed modes running with high clock rates.
>>
>> Is it part of the DT binding? I don't see any mention of it.
>
> Nevermind, found it in mmc-controller.yaml. But I think this is to work
> around some problem with the clk driver picking lower speeds than
> requested? That has been fixed on the clk driver side (see commit like
> 148ddaa89d4a "clk: qcom: gcc-sc7180: Use floor ops for the correct sdcc1
> clk") so ideally this property can be omitted.

This is a good-to-have DT property. It aligns clock requests between the
mmc core layer and the sdhci-msm platform driver.

For example, for the HS200/HS400 modes of eMMC, the mmc core layer tries to
set the clock to 200 MHz, whereas sdhci-msm expects 192 MHz for these
modes, so we have to rely on the clock driver's floor/ceil values. With
this property, the mmc core layer itself requests 192 MHz.

The same applies to the SD card SDR104 mode: the core layer expects a
208 MHz clock, whereas sdhci-msm can operate at 202 MHz at most. With this
property, the core layer requests only 202 MHz for SDR104 mode.

BTW, this helps only for the maximum possible speed modes. For lower speed
modes (such as DDR52) we still need to rely on clock floor rounding.
Re: [PATCH] powerpc/asm-offsets: GPR14 is not needed either
On Mon, 2021-03-15 at 11:01 +, Christophe Leroy wrote: > Commit aac6a91fea93 ("powerpc/asm: Remove unused symbols in > asm-offsets.c") removed GPR15 to GPR31 but kept GPR14, > probably because it pops up in a couple of comments when doing > a grep. > > However, it was never used either, so remove it as well. > Looks good to me. Reviewed-by: Rashmica Gupta > Fixes: aac6a91fea93 ("powerpc/asm: Remove unused symbols in asm- > offsets.c") > Cc: Rashmica Gupta > Signed-off-by: Christophe Leroy > --- > arch/powerpc/kernel/asm-offsets.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/arch/powerpc/kernel/asm-offsets.c > b/arch/powerpc/kernel/asm-offsets.c > index f3a662201a9f..4d230c5c7099 100644 > --- a/arch/powerpc/kernel/asm-offsets.c > +++ b/arch/powerpc/kernel/asm-offsets.c > @@ -323,9 +323,6 @@ int main(void) > STACK_PT_REGS_OFFSET(GPR11, gpr[11]); > STACK_PT_REGS_OFFSET(GPR12, gpr[12]); > STACK_PT_REGS_OFFSET(GPR13, gpr[13]); > -#ifndef CONFIG_PPC64 > - STACK_PT_REGS_OFFSET(GPR14, gpr[14]); > -#endif /* CONFIG_PPC64 */ > /* >* Note: these symbols include _ because they overlap with > special >* register names
[question] kernel panic at timerqueue_add+32
On the x86 platform, we encountered the following problems. The kernel version we are using is 3.10. The following is our analysis process, hoping to get your help. kernel panic at timerqueue_add+32.The stack information is as follows. crash> bt -c 3 PID: 27797 TASK: 9f9e28805f40 CPU: 3 COMMAND: "ipmi_sim" #0 [9f9ec0ac3dd0] die at ac82f97b #1 [9f9ec0ac3e00] do_general_protection at acf3211e #2 [9f9ec0ac3e30] general_protection at acf31718 [exception RIP: timerqueue_add+32] RIP: acb67340 RSP: 9f9ec0ac3ee0 RFLAGS: 00010006 RAX: 7401f88348078b48 RBX: 9f9ec0ad3fa0 RCX: RDX: ac8d4395 RSI: 9f9ec0ad3fa0 RDI: ac8d4395 RBP: 9f9ec0ac3ef0 R8: 00405b31f6958080 R9: 9f9ec0ac3de0 R10: 0002 R11: 9f9ec0ac3de8 R12: ac8d4395 R13: ac8d4385 R14: 0001 R15: 9f9ec0ad3b58 ORIG_RAX: CS: 0010 SS: 0018 #3 [9f9ec0ac3ef8] enqueue_hrtimer at ac8c32f5 #4 [9f9ec0ac3f20] __hrtimer_run_queues at ac8c3c7d #5 [9f9ec0ac3f78] hrtimer_interrupt at ac8c41af #6 [9f9ec0ac3fc0] local_apic_timer_interrupt at ac85aeeb #7 [9f9ec0ac3fd8] smp_apic_timer_interrupt at acf3f0a3 #8 [9f9ec0ac3ff0] apic_timer_interrupt at acf3b7ba --- --- bt: cannot transition from IRQ stack to current process stack: IRQ stack pointer: 9f9ec0ac3dd0 process stack pointer: 9f708e693df8 current stack base: 9f9e25764000 We first parse timerqueue_add+32 crash> dis -l timerqueue_add+32 /usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 52 0xacb67340 : mov0x18(%rax),%rsi 39 void timerqueue_add(struct timerqueue_head *head, struct timerqueue_node *node) 40 { 41 struct rb_node **p = >head.rb_node; 42 struct rb_node *parent = NULL; 43 struct timerqueue_node *ptr; 44 45 /* Make sure we don't add nodes that are already added */ 46 WARN_ON_ONCE(!RB_EMPTY_NODE(>node)); 47 48 while (*p) { 49 parent = *p; 50 ptr = rb_entry(parent, struct timerqueue_node, node); 51 if (node->expires.tv64 < ptr->expires.tv64) 52 p = &(*p)->rb_left; //at here, the p is the invalid address 53 else 54 p = &(*p)->rb_right; 55 } 56 
rb_link_node(>node, parent, p); 57 rb_insert_color(>node, >head); 58 59 if (!head->next || node->expires.tv64 < head->next->expires.tv64) 60 head->next = node; 61 } 62 EXPORT_SYMBOL_GPL(timerqueue_add); Let's disassemble the timerqueue_add function, the following is the part of the disassembled code of the timerqueue_add function crash> dis -l timerqueue_add /usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 40 0xacb67320 :push %rbp 0xacb67321 : mov%rsp,%rbp 0xacb67324 : push %r12 0xacb67326 : mov%rdi,%r12 0xacb67329 : push %rbx /usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 46 0xacb6732a : cmp(%rsi),%rsi /usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 40 0xacb6732d : mov%rsi,%rbx /usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 46 0xacb67330 : jne0xacb6739e /usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 41 0xacb67332 : mov%r12,%rdx /usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 42 0xacb67335 : xor%ecx,%ecx /usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 48 0xacb67337 : jmp0xacb67357 0xacb67339 : nopl 0x0(%rax) /usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 52 0xacb67340 : mov0x18(%rax),%rsi //rax is the p 0xacb67344 : cmp%rsi,0x18(%rbx) 0xacb67348 : lea0x8(%rax),%rcx 0xacb6734c : lea0x10(%rax),%rdx 0xacb67350 : cmovge %rcx,%rdx 0xacb67354 : mov%rax,%rcx /usr/src/debug/kernel-3.10.0/linux-3.10.0-862.14.1.6_110.x86_64/lib/timerqueue.c: 48 0xacb67357 : mov(%rdx),%rax 0xacb6735a : test %rax,%rax 0xacb6735d : jne0xacb67340 Through the disassembly code of the timerqueue_add function, you can see that rdi is the first parameter of the timerqueue_add function (struct timerqueue_head *head), and rsi is the second parameter of the timerqueue_add function (struct timerqueue_node *node). 
We go to parse rdi (ac8d4395) and rsi(9f9ec0ad3fa0) to get the value of the parameter. The result is as follows. crash> struct timerqueue_head -ox struct timerqueue_head {
Re: [PATCH -tip v4 10/12] x86/kprobes: Push a fake return address at kretprobe_trampoline
On Wed, 24 Mar 2021 20:26:13 -0400 Steven Rostedt wrote: > On Thu, 25 Mar 2021 08:47:41 +0900 > Masami Hiramatsu wrote: > > > > I think the REGS and REGS_PARTIAL cases can also be affected by function > > > graph tracing. So should they use the generic unwind_recover_ret_addr() > > > instead of unwind_recover_kretprobe()? > > > > Yes, but I'm not sure this parameter can be applied. > > For example, it passed "state->sp - sizeof(unsigned long)" as where the > > return address stored address. Is that same on ftrace graph too? > > Stack traces on the return side of function graph tracer has never > worked. It's on my todo list, because that's one of the requirements to > get right if we every manage to combine kretprobe and function graph > tracers together. OK, then at this point let's just fix the kretprobe side. Thanks, > > -- Steve -- Masami Hiramatsu
[RFC] Convert sysv filesystem to use folios exclusively
I decided to see what a filesystem free from struct page would look
like. I chose sysv more-or-less at random; I wanted a relatively simple
filesystem, but I didn't want a toy. The advantage of sysv is that the
maintainer is quite interested in folios ;-)

$ git grep page fs/sysv
fs/sysv/dir.c:#include <linux/pagemap.h>
fs/sysv/dir.c:	if (offset_in_page(diter->pos)) {
fs/sysv/inode.c:	.get_link = page_get_link,
fs/sysv/inode.c:	truncate_inode_pages_final(&inode->i_data);
fs/sysv/itree.c:	block_truncate_page(inode->i_mapping, inode->i_size, get_block);
fs/sysv/itree.c:	truncate_pagecache(inode, inode->i_size);
fs/sysv/itree.c:	.readpage = sysv_read_folio,
fs/sysv/itree.c:	.writepage = sysv_write_folio,
fs/sysv/namei.c:#include <linux/pagemap.h>
fs/sysv/namei.c:	err = page_symlink(inode, symname, l);

I think those are "acceptable" mentions of pages -- offset_in_page()
is related to kmap(), page_get_link and page_symlink are in the VFS
(to be ported separately), and the others are just the names of the
functions.

The big change here is the rewrite of directory iteration.
sysv_delete_entry() (and a couple of other functions) needs to recover
'pos' from the in-memory address and the struct page. Once we move from
pages to folios, we can't realistically ask where the folio is mapped.
So switch to an iterator-based approach which keeps the pos, the dirent
mapped address and the struct folio together. It's actually a nice
cleanup: 204 insertions(+), 259 deletions(-).

We could be more tricksy and pass around the pgoff_t instead of the
loff_t, but I'm not really interested in saving 4 bytes on the stack
for 32-bit arches.

I don't know if this is really how one would do the conversion. We
could easily say "directories never use folios larger than a page" and
that would make everything much simpler, but that wasn't the point of
this exercise. There are probably bugs here; again, that wasn't the
point. The direction here looks sound -- it should be possible to write
a filesystem without the use of struct page in the future.
This patch won't apply to anything published; it won't even link for me because I just changed a bunch of random function types in the header files to prototype this work. I might submit a patch to do the diter conversion anyway, although I have no clue how to test the sysv filesystem. Is there a mkfs for Linux? I assume there's no support in xfstests for it. diff --git a/fs/sysv/dir.c b/fs/sysv/dir.c index 88e38cd8f5c9..df38f53f1385 100644 --- a/fs/sysv/dir.c +++ b/fs/sysv/dir.c @@ -28,80 +28,85 @@ const struct file_operations sysv_dir_operations = { .fsync = generic_file_fsync, }; -static inline void dir_put_page(struct page *page) +void sysv_diter_end(struct sysv_diter *diter) { - kunmap(page); - put_page(page); + if (diter->entry) { + kunmap_local(diter->entry); + put_folio(diter->folio); + } } -static int dir_commit_chunk(struct page *page, loff_t pos, unsigned len) +static int sysv_diter_next(struct inode *dir, struct sysv_diter *diter) { - struct address_space *mapping = page->mapping; + struct address_space *mapping = dir->i_mapping; + struct folio *folio = diter->folio; + size_t offset; + + if (diter->entry) { + diter->pos += sizeof(*diter->entry); + if (offset_in_page(diter->pos)) { + diter->entry++; + return 0; + } + kunmap_local(diter->entry); + offset = offset_in_folio(folio, diter->pos); + if (offset != 0) + goto map; + put_folio(folio); + } + folio = read_mapping_folio(mapping, diter->pos / PAGE_SIZE, NULL); + if (IS_ERR(folio)) { + diter->pos = round_up(diter->pos, PAGE_SIZE); + diter->entry = NULL; + return PTR_ERR(folio); + } + diter->folio = folio; + offset = offset_in_folio(folio, diter->pos); + +map: + diter->entry = kmap_local_folio(folio, offset); + return 0; +} + +static int dir_commit_chunk(struct folio *folio, loff_t pos, unsigned len) +{ + struct address_space *mapping = folio->mapping; struct inode *dir = mapping->host; int err = 0; - block_write_end(NULL, mapping, pos, len, len, page, NULL); + block_write_end(NULL, mapping, pos, len, 
len, folio, NULL); if (pos+len > dir->i_size) { i_size_write(dir, pos+len); mark_inode_dirty(dir); } if (IS_DIRSYNC(dir)) - err = write_one_page(page); + err = write_one_folio(folio); else - unlock_page(page); + unlock_folio(folio); return err; } -static struct page * dir_get_page(struct inode *dir, unsigned long n) -{ - struct address_space *mapping = dir->i_mapping; -
Re: [PATCH v7 0/5] clk: add driver for the SiFive FU740
On Wed, Mar 24, 2021 at 6:36 PM Andreas Schwab wrote: > > Were you able to reproduce the problem? > Hi Andreas, Sorry, I'm not available past few days, I'm just coming back, I would take a look at this again. Could you also let me know which bootloader you used (FSBL or U-boot-SPL)? Thanks. > Andreas. > > -- > Andreas Schwab, sch...@linux-m68k.org > GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 > "And now for something completely different."
Re: [PATCH v3] audit: log nftables configuration change events once per table
On 2021-03-24 12:32, Paul Moore wrote: > On Tue, Mar 23, 2021 at 4:05 PM Richard Guy Briggs wrote: > > > > Reduce logging of nftables events to a level similar to iptables. > > Restore the table field to list the table, adding the generation. > > > > Indicate the op as the most significant operation in the event. > > > > A couple of sample events: > > > > type=PROCTITLE msg=audit(2021-03-18 09:30:49.801:143) : > > proctitle=/usr/bin/python3 -s /usr/sbin/firewalld --nofork --nopid > > type=SYSCALL msg=audit(2021-03-18 09:30:49.801:143) : arch=x86_64 > > syscall=sendmsg success=yes exit=172 a0=0x6 a1=0x7ffdcfcbe650 a2=0x0 > > a3=0x7ffdcfcbd52c items=0 ppid=1 pid=367 auid=unset uid=root gid=root > > euid=root suid=root fsuid=root egid=roo > > t sgid=root fsgid=root tty=(none) ses=unset comm=firewalld > > exe=/usr/bin/python3.9 subj=system_u:system_r:firewalld_t:s0 key=(null) > > type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : > > table=firewalld:2 family=ipv6 entries=1 op=nft_register_table pid=367 > > subj=system_u:system_r:firewalld_t:s0 comm=firewalld > > type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : > > table=firewalld:2 family=ipv4 entries=1 op=nft_register_table pid=367 > > subj=system_u:system_r:firewalld_t:s0 comm=firewalld > > type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.801:143) : > > table=firewalld:2 family=inet entries=1 op=nft_register_table pid=367 > > subj=system_u:system_r:firewalld_t:s0 comm=firewalld > > > > type=PROCTITLE msg=audit(2021-03-18 09:30:49.839:144) : > > proctitle=/usr/bin/python3 -s /usr/sbin/firewalld --nofork --nopid > > type=SYSCALL msg=audit(2021-03-18 09:30:49.839:144) : arch=x86_64 > > syscall=sendmsg success=yes exit=22792 a0=0x6 a1=0x7ffdcfcbe650 a2=0x0 > > a3=0x7ffdcfcbd52c items=0 ppid=1 pid=367 auid=unset uid=root gid=root > > euid=root suid=root fsuid=root egid=r > > oot sgid=root fsgid=root tty=(none) ses=unset comm=firewalld > > exe=/usr/bin/python3.9 subj=system_u:system_r:firewalld_t:s0 
key=(null) > > type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : > > table=firewalld:3 family=ipv6 entries=30 op=nft_register_chain pid=367 > > subj=system_u:system_r:firewalld_t:s0 comm=firewalld > > type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : > > table=firewalld:3 family=ipv4 entries=30 op=nft_register_chain pid=367 > > subj=system_u:system_r:firewalld_t:s0 comm=firewalld > > type=NETFILTER_CFG msg=audit(2021-03-18 09:30:49.839:144) : > > table=firewalld:3 family=inet entries=165 op=nft_register_chain pid=367 > > subj=system_u:system_r:firewalld_t:s0 comm=firewalld > > > > The issue was originally documented in > > https://github.com/linux-audit/audit-kernel/issues/124 > > > > Signed-off-by: Richard Guy Briggs > > --- > > Changelog: > > v3: > > - fix function braces, reduce parameter scope > > - pre-allocate nft_audit_data per table in step 1, bail on ENOMEM > > > > v2: > > - convert NFT ops to array indicies in nft2audit_op[] > > - use linux lists > > - use functions for each of collection and logging of audit data > > --- > > include/linux/audit.h | 28 ++ > > net/netfilter/nf_tables_api.c | 160 -- > > 2 files changed, 105 insertions(+), 83 deletions(-) > > ... 
> > > diff --git a/include/linux/audit.h b/include/linux/audit.h > > index 82b7c1116a85..5fafcf4c13de 100644 > > --- a/include/linux/audit.h > > +++ b/include/linux/audit.h > > @@ -118,6 +118,34 @@ enum audit_nfcfgop { > > AUDIT_NFT_OP_INVALID, > > }; > > > > +static const u8 nft2audit_op[NFT_MSG_MAX] = { // enum nf_tables_msg_types > > + [NFT_MSG_NEWTABLE] = AUDIT_NFT_OP_TABLE_REGISTER, > > + [NFT_MSG_GETTABLE] = AUDIT_NFT_OP_INVALID, > > + [NFT_MSG_DELTABLE] = AUDIT_NFT_OP_TABLE_UNREGISTER, > > + [NFT_MSG_NEWCHAIN] = AUDIT_NFT_OP_CHAIN_REGISTER, > > + [NFT_MSG_GETCHAIN] = AUDIT_NFT_OP_INVALID, > > + [NFT_MSG_DELCHAIN] = AUDIT_NFT_OP_CHAIN_UNREGISTER, > > + [NFT_MSG_NEWRULE] = AUDIT_NFT_OP_RULE_REGISTER, > > + [NFT_MSG_GETRULE] = AUDIT_NFT_OP_INVALID, > > + [NFT_MSG_DELRULE] = AUDIT_NFT_OP_RULE_UNREGISTER, > > + [NFT_MSG_NEWSET]= AUDIT_NFT_OP_SET_REGISTER, > > + [NFT_MSG_GETSET]= AUDIT_NFT_OP_INVALID, > > + [NFT_MSG_DELSET]= AUDIT_NFT_OP_SET_UNREGISTER, > > + [NFT_MSG_NEWSETELEM]= AUDIT_NFT_OP_SETELEM_REGISTER, > > + [NFT_MSG_GETSETELEM]= AUDIT_NFT_OP_INVALID, > > + [NFT_MSG_DELSETELEM]= AUDIT_NFT_OP_SETELEM_UNREGISTER, > > + [NFT_MSG_NEWGEN]= AUDIT_NFT_OP_GEN_REGISTER, > > + [NFT_MSG_GETGEN]= AUDIT_NFT_OP_INVALID, > > + [NFT_MSG_TRACE] = AUDIT_NFT_OP_INVALID, > > + [NFT_MSG_NEWOBJ]= AUDIT_NFT_OP_OBJ_REGISTER, > > + [NFT_MSG_GETOBJ]= AUDIT_NFT_OP_INVALID, > > + [NFT_MSG_DELOBJ]= AUDIT_NFT_OP_OBJ_UNREGISTER, > > +
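The nft2audit_op[] table in the hunk above maps each netlink message type to an audit operation through designated initializers. A minimal user-space sketch of that lookup pattern follows. The enum names, their values, and the bounds check in lookup_audit_op() are hypothetical stand-ins for illustration and are not taken from the patch.

```c
#include <stddef.h>

/* Hypothetical, shortened stand-ins for the netlink message types. */
enum model_msg {
	MODEL_MSG_NEWTABLE,
	MODEL_MSG_GETTABLE,
	MODEL_MSG_DELTABLE,
	MODEL_MSG_NEWCHAIN,
	MODEL_MSG_MAX
};

/* Hypothetical audit operations; MODEL_OP_INVALID marks messages not logged. */
enum model_audit_op {
	MODEL_OP_TABLE_REGISTER,
	MODEL_OP_TABLE_UNREGISTER,
	MODEL_OP_CHAIN_REGISTER,
	MODEL_OP_INVALID
};

/* One entry per message type, indexed directly by the message value. */
static const unsigned char model_msg2audit_op[MODEL_MSG_MAX] = {
	[MODEL_MSG_NEWTABLE] = MODEL_OP_TABLE_REGISTER,
	[MODEL_MSG_GETTABLE] = MODEL_OP_INVALID,
	[MODEL_MSG_DELTABLE] = MODEL_OP_TABLE_UNREGISTER,
	[MODEL_MSG_NEWCHAIN] = MODEL_OP_CHAIN_REGISTER,
};

/* Assumed helper: out-of-range message types are treated as invalid. */
static enum model_audit_op lookup_audit_op(unsigned int msg)
{
	if (msg >= MODEL_MSG_MAX)
		return MODEL_OP_INVALID;
	return (enum model_audit_op)model_msg2audit_op[msg];
}
```

Listing every index explicitly, including the ones that map to the invalid operation, keeps the table readable next to the enum it mirrors and avoids relying on zero-initialization of unlisted slots.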
[PATCH net v3] net: sched: fix packet stuck problem for lockless qdisc
Lockless qdisc has the following concurrency problem:

    cpu0                 cpu1
     .                     .
q->enqueue                 .
     .                     .
qdisc_run_begin()          .
     .                     .
dequeue_skb()              .
     .                     .
sch_direct_xmit()          .
     .                     .
     .                q->enqueue
     .             qdisc_run_begin()
     .            return and do nothing
     .                     .
qdisc_run_end()            .

cpu1 enqueues a skb without calling __qdisc_run() because cpu0 has not released the lock yet and spin_trylock() returns false for cpu1 in qdisc_run_begin(), and cpu0 does not see the skb enqueued by cpu1 when calling dequeue_skb() because cpu1 may enqueue the skb after cpu0 calls dequeue_skb() and before cpu0 calls qdisc_run_end().

Lockless qdisc has another concurrency problem when tx_action is involved:

cpu0(serving tx_action)     cpu1             cpu2
          .                   .                .
          .              q->enqueue            .
          .            qdisc_run_begin()       .
          .              dequeue_skb()         .
          .                   .            q->enqueue
          .                   .                .
          .             sch_direct_xmit()      .
          .                   .         qdisc_run_begin()
          .                   .       return and do nothing
          .                   .                .
 clear __QDISC_STATE_SCHED    .                .
 qdisc_run_begin()            .                .
 return and do nothing        .                .
          .                   .                .
          .            qdisc_run_end()         .

This patch fixes the above data race by:
1. Getting the flag before doing spin_trylock().
2. If the first spin_trylock() returns false and the flag was not set before the first spin_trylock(), setting the flag and retrying spin_trylock(), in case the other CPU does not see the new flag after it releases the lock.
3. Rescheduling if the flag is set after the lock is released at the end of qdisc_run_end().

For the tx_action case, the flag is also set when cpu1 is at the end of qdisc_run_end(), so tx_action will be rescheduled again to dequeue the skb enqueued by cpu2.

Only clear the flag before retrying a dequeue when dequeuing returns NULL, in order to reduce the overhead of the above double spin_trylock() and __netif_schedule() calls.
The performance impact of this patch, tested using pktgen and dummy netdev with pfifo_fast qdisc attached:

threads  without+this_patch  with+this_patch  delta
   1         2.61Mpps           2.60Mpps      -0.3%
   2         3.97Mpps           3.82Mpps      -3.7%
   4         5.62Mpps           5.59Mpps      -0.5%
   8         2.78Mpps           2.77Mpps      -0.3%
  16         2.22Mpps           2.22Mpps      -0.0%

Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
Signed-off-by: Yunsheng Lin
---
V3: fix a compile error and a few comment typo, remove the __QDISC_STATE_DEACTIVATED checking, and update the performance data.
V2: Avoid the overhead of fixing the data race as much as possible.
---
 include/net/sch_generic.h | 38 +-
 net/sched/sch_generic.c   | 12
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index f7a6e14..e3f46eb 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -36,6 +36,7 @@ struct qdisc_rate_table {
 enum qdisc_state_t {
 	__QDISC_STATE_SCHED,
 	__QDISC_STATE_DEACTIVATED,
+	__QDISC_STATE_NEED_RESCHEDULE,
 };

 struct qdisc_size_table {
@@ -159,8 +160,38 @@ static inline bool qdisc_is_empty(const struct Qdisc *qdisc)
 static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 {
 	if (qdisc->flags & TCQ_F_NOLOCK) {
+		bool dont_retry = test_bit(__QDISC_STATE_NEED_RESCHEDULE,
+					   >state);
+
+		if (spin_trylock(>seqlock))
+			goto nolock_empty;
+
+		/* If the flag is set before doing the spin_trylock() and
+		 * the above spin_trylock() return false, it means other cpu
+		 * holding the lock will do dequeuing for us, or it wil see
+		 * the flag set after releasing lock and reschedule the
+		 * net_tx_action() to do the dequeuing.
+		 */
+		if (dont_retry)
+			return false;
+
+		/* We could do set_bit() before the first spin_trylock(),
+		 * and avoid doing second spin_trylock() completely, then
+		 * we could have multi cpus doing the set_bit(). Here use
+		 * dont_retry to avoid doing the set_bit() and the second
+		 * spin_trylock(), which has 5% performance improvement
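The three steps in the commit message can be sketched as a small user-space model, which may help when reading the truncated hunk above. This is only an illustration under stated assumptions: struct model_qdisc, model_trylock() and the resched_count counter are hypothetical stand-ins for struct Qdisc, spin_trylock() and __netif_schedule(), and the clearing of the flag on an empty dequeue is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the relevant bits of struct Qdisc. */
struct model_qdisc {
	atomic_bool locked;        /* models qdisc->seqlock */
	atomic_bool need_resched;  /* models __QDISC_STATE_NEED_RESCHEDULE */
	int resched_count;         /* counts modelled __netif_schedule() calls */
};

static void model_init(struct model_qdisc *q)
{
	atomic_init(&q->locked, false);
	atomic_init(&q->need_resched, false);
	q->resched_count = 0;
}

/* Returns true if the lock was free and is now held by the caller. */
static bool model_trylock(struct model_qdisc *q)
{
	return !atomic_exchange(&q->locked, true);
}

static bool model_run_begin(struct model_qdisc *q)
{
	/* Step 1: sample the flag before the first trylock. */
	bool dont_retry = atomic_load(&q->need_resched);

	if (model_trylock(q))
		return true;

	/* Flag was already set: the lock holder will reschedule for us. */
	if (dont_retry)
		return false;

	/* Step 2: publish the flag, then retry the trylock once. */
	atomic_store(&q->need_resched, true);
	return model_trylock(q);
}

static void model_run_end(struct model_qdisc *q)
{
	atomic_store(&q->locked, false);

	/* Step 3: reschedule if the flag is set after the lock is released. */
	if (atomic_load(&q->need_resched))
		q->resched_count++;
}
```

In this model a second caller that loses both trylock attempts leaves the flag set, so the lock holder's model_run_end() records a reschedule instead of the enqueued packet being left behind.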
RE: Re: [PATCH v2 1/3] dt-bindings: imx6q-pcie: add one regulator used to power up pcie phy
> -Original Message- > From: Lucas Stach > Sent: Wednesday, March 24, 2021 5:27 PM > To: Richard Zhu ; andrew.smir...@gmail.com; > shawn...@kernel.org; k...@linux.com; bhelg...@google.com; > ste...@agner.ch; lorenzo.pieral...@arm.com > Cc: linux-...@vger.kernel.org; dl-linux-imx ; > linux-arm-ker...@lists.infradead.org; linux-kernel@vger.kernel.org; > ker...@pengutronix.de > Subject: Re: [PATCH v2 1/3] dt-bindings: imx6q-pcie: add one regulator > used to power up pcie phy > Hi Richard, > > Am Mittwoch, dem 24.03.2021 um 13:34 +0800 schrieb Richard Zhu: > > Both 1.8v and 3.3v power supplies can be used by i.MX8MQ PCIe PHY. > > In default, the PCIE_VPH voltage is suggested to be 1.8v refer to data > > sheet. When PCIE_VPH is supplied by 3.3v in the HW schematic design, > > the VREG_BYPASS bits of GPR registers should be cleared from default > > value 1b'1 to 1b'0. Thus, the internal 3v3 to 1v8 translator would be > > turned on. > > > > Signed-off-by: Richard Zhu > > --- > > Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt > > b/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt > > index de4b2baf91e8..3248b7192ced 100644 > > --- a/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt > > +++ b/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt > > @@ -38,6 +38,12 @@ Optional properties: > >The regulator will be enabled when initializing the PCIe host and > >disabled either as part of the init process or when shutting down the > >host. > > +- vph-supply: Should specify the regulator in charge of PCIe PHY power. > > + On i.MX8MQ, both 1.8v and 3.3v power supplies can be used by > > +i.MX8MQ PCIe > > + PHY. In default, the PCIE_VPH voltage is suggested to be 1.8v refer > > +to data > > + sheet. 
When PCIE_VPH is supplied by 3.3v in the HW schematic > > +design, the > > + VREG_BYPASS bits of GPR registers should be cleared from default > > +value 1b'1 > > + to 1b'0. > > This description of the internal driver behavior does not belong into a DT > binding description. > Instead the binding should describe the function of the regulator exactly. > From > the datasheet I can see that there are actually 3 supplies (VPH, VP, VPTX) > going into the PCIe PHY, so "regulator in charge of PCIe PHY power" doesn't > seem like a very accurate description. [Richard Zhu] Hi Lucas: Thanks for your comments. VP/VPTX are combined together and connected to VDD_PHY_0V9. Only VPH can be supplied by different voltage power supplies. So, only VPH is specified in the DT binding, might be used to distinguish different HW board designs. How about this description: - vph-supply: Should specify the regulator in charge of VPH one of the three PCIe PHY powers. This regulator can be supplied by both 1.8v and 3.3v voltage supplies. Might be used to distinguish different HW board designs. > > Regards, > Lucas
[PATCH] perf x86 kvm-stat: support to analyze kvm msr
From: Lei Zhao

usage:
- kvm stat
  run a command and gather performance counter statistics
- show the result:
  perf kvm stat report --event=msr

See the msr events:

Analyze events for all VMs, all VCPUs:

MSR Access    Samples  Samples%    Time%  Min Time   Max Time  Avg time
   0x6e0:W      67007    98.17%   98.31%    0.59us    10.69us  0.90us ( +-  0.10% )
   0x830:W       1186     1.74%    1.60%    0.53us   108.34us  0.82us ( +- 11.02% )
    0x3b:R         66     0.10%    0.09%    0.56us     1.26us  0.80us ( +-  3.24% )

Total Samples:68259, Total events handled time:61150.95us.

Signed-off-by: Li RongQing
Signed-off-by: Lei Zhao
---
 tools/perf/arch/x86/util/kvm-stat.c | 46 +
 1 file changed, 46 insertions(+)

diff --git a/tools/perf/arch/x86/util/kvm-stat.c b/tools/perf/arch/x86/util/kvm-stat.c
index 072920475b65..c5dd54f6ef5e 100644
--- a/tools/perf/arch/x86/util/kvm-stat.c
+++ b/tools/perf/arch/x86/util/kvm-stat.c
@@ -133,11 +133,56 @@ static struct kvm_events_ops ioport_events = {
 	.name = "IO Port Access"
 };

+ /* The time of emulation msr is from kvm_msr to kvm_entry. */
+static void msr_event_get_key(struct evsel *evsel,
+			      struct perf_sample *sample,
+			      struct event_key *key)
+{
+	key->key = evsel__intval(evsel, sample, "ecx");
+	key->info = evsel__intval(evsel, sample, "write");
+}
+
+static bool msr_event_begin(struct evsel *evsel,
+			    struct perf_sample *sample,
+			    struct event_key *key)
+{
+	if (!strcmp(evsel->name, "kvm:kvm_msr")) {
+		msr_event_get_key(evsel, sample, key);
+		return true;
+	}
+
+	return false;
+}
+
+static bool msr_event_end(struct evsel *evsel,
+			  struct perf_sample *sample __maybe_unused,
+			  struct event_key *key __maybe_unused)
+{
+	return kvm_entry_event(evsel);
+}
+
+static void msr_event_decode_key(struct perf_kvm_stat *kvm __maybe_unused,
+				 struct event_key *key,
+				 char *decode)
+{
+	scnprintf(decode, decode_str_len, "%#llx:%s",
+		  (unsigned long long)key->key,
+		  key->info ? "W" : "R");
+}
+
+static struct kvm_events_ops msr_events = {
+	.is_begin_event = msr_event_begin,
+	.is_end_event = msr_event_end,
+	.decode_key = msr_event_decode_key,
+	.name = "MSR Access"
+};
+
 const char *kvm_events_tp[] = {
 	"kvm:kvm_entry",
 	"kvm:kvm_exit",
 	"kvm:kvm_mmio",
 	"kvm:kvm_pio",
+	"kvm:kvm_msr",
 	NULL,
 };

@@ -145,6 +190,7 @@ struct kvm_reg_events_ops kvm_reg_events_ops[] = {
 	{ .name = "vmexit", .ops = _events },
 	{ .name = "mmio", .ops = _events },
 	{ .name = "ioport", .ops = _events },
+	{ .name = "msr", .ops = _events },
 	{ NULL, NULL },
 };

--
2.17.3
[PATCH] net: Fix a misspell in socket.c
s/addres/address Reported-by: Hulk Robot Signed-off-by: Lu Wei --- net/socket.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/socket.c b/net/socket.c index 84a8049c2b09..27e3e7d53f8e 100644 --- a/net/socket.c +++ b/net/socket.c @@ -3568,7 +3568,7 @@ EXPORT_SYMBOL(kernel_accept); * @addrlen: address length * @flags: flags (O_NONBLOCK, ...) * - * For datagram sockets, @addr is the addres to which datagrams are sent + * For datagram sockets, @addr is the address to which datagrams are sent * by default, and the only address from which datagrams are received. * For stream sockets, attempts to connect to @addr. * Returns 0 or an error code. -- 2.17.1
Re: [PATCH V3] exit: trigger panic when global init has exited
>> But,my patch has another purpose,protect some key variables(such
>> as:task->mm,task->nsproxy,etc) to recover init coredump from
>> fulldump,if sub-threads finish do_exit(),

> Yes I know.

> But the purpose of this SIGNAL_GROUP_EXIT check is not clear and not
> documented. That is why I said it should be documented at least in the
> changelog.

Ok. I will update the changelog as you suggest.

Oleg Nesterov wrote on Thu, Mar 25, 2021 at 2:12 AM:
>
> Hi,
>
> On 03/23, qianli zhao wrote:
> >
> > Hi,Oleg
> >
> > > You certainly don't understand me :/
> >
> > > Please read my email you quoted below. I didn't mean the current logic.
> > > I meant the logic after your patch which moves atomic_dec_and_test() and
> > > panic() before exit_signals().
> >
> > Sorry, I think I see what you mean now.
> >
> > You mean that after apply my patch,SIGNAL_GROUP_EXIT no longer needs
> > to be tested or avoid zap_pid_ns_processes()->BUG().
> > Yes,your consideration is correct.
>
> OK, great
>
> > But,my patch has another purpose,protect some key variables(such
> > as:task->mm,task->nsproxy,etc) to recover init coredump from
> > fulldump,if sub-threads finish do_exit(),
>
> Yes I know.
>
> But the purpose of this SIGNAL_GROUP_EXIT check is not clear and not
> documented. That is why I said it should be documented at least in the
> changelog.
>
> Oleg.
>
Re: [PATCH] livepatch: klp_send_signal should treat PF_IO_WORKER like PF_KTHREAD
On 3/24/21 9:48 PM, Dong Kai wrote:
> commit 15b2219facad ("kernel: freezer should treat PF_IO_WORKER like
> PF_KTHREAD for freezing") is to fix the freezeing issue of IO threads

nit: s/freezeing/freezing

> by making the freezer not send them fake signals.
>
> Here live patching consistency model call klp_send_signals to wake up
> all tasks by send fake signal to all non-kthread which only check the
> PF_KTHREAD flag, so it still send signal to io threads which may lead to
> freezeing issue of io threads.
>
> Here we take the same fix action by treating PF_IO_WORKERS as PF_KTHREAD
> within klp_send_signal function.
>
> Signed-off-by: Dong Kai
> ---
> note:
> the io threads freeze issue links:
> [1] https://lore.kernel.org/io-uring/yegnip43%2f6kfn...@kevinlocke.name/
> [2] https://lore.kernel.org/io-uring/d7350ce7-17dc-75d7-611b-27ebf2cb5...@kernel.dk/
>
>  kernel/livepatch/transition.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index f6310f848f34..0e1c35c8f4b4 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -358,7 +358,7 @@ static void klp_send_signals(void)
>  		 * Meanwhile the task could migrate itself and the action
>  		 * would be meaningless. It is not serious though.
>  		 */
> -		if (task->flags & PF_KTHREAD) {
> +		if (task->flags & (PF_KTHREAD | PF_IO_WORKER)) {
>  			/*
>  			 * Wake up a kthread which sleeps interruptedly and
>  			 * still has not been migrated.

(PF_KTHREAD | PF_IO_WORKER) is open coded in so many places that maybe this is a silly question, but...

If the livepatch code could use fake_signal_wake_up(), we could consolidate the pattern in klp_send_signals() with the one in freeze_task(). Then there would be only one place for the wake up / fake signal logic.

I don't fully understand the differences in the freeze_task() version, so I only pose this as a question and not as a v2 request.

As it is, this change seems logical to me, so:

Acked-by: Joe Lawrence

Thanks,

-- Joe
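For readers skimming the thread, the one-line change boils down to a combined flag mask test. A minimal user-space sketch follows. The MODEL_ flag values and the helper name are assumptions for illustration only and should not be read as the kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative flag bits; assumed values, not copied from the kernel headers. */
#define MODEL_PF_IO_WORKER 0x00000010u
#define MODEL_PF_KTHREAD   0x00200000u

/*
 * Hypothetical helper: true when the task should simply be woken up
 * rather than sent a fake signal, i.e. when either flag is set.
 */
static bool model_wake_instead_of_signal(unsigned int flags)
{
	return (flags & (MODEL_PF_KTHREAD | MODEL_PF_IO_WORKER)) != 0;
}
```

Wrapping the mask in one helper is the kind of consolidation the reply asks about: callers would no longer open code the two-flag test.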
[PATCH v14 4/7] soc: mediatek: SVS: add debug commands
The purpose of SVS is to help find the suitable voltages for DVFS. Therefore, if SVS bank voltages are concerned to be wrong, we can adjust SVS bank voltages by this patch. Signed-off-by: Roger Lu --- drivers/soc/mediatek/mtk-svs.c | 328 + 1 file changed, 328 insertions(+) diff --git a/drivers/soc/mediatek/mtk-svs.c b/drivers/soc/mediatek/mtk-svs.c index ee3b3989ab88..e36b3abfee03 100644 --- a/drivers/soc/mediatek/mtk-svs.c +++ b/drivers/soc/mediatek/mtk-svs.c @@ -6,6 +6,7 @@ #include #include #include +#include #include #include #include @@ -24,6 +25,7 @@ #include #include #include +#include #include #include #include @@ -60,6 +62,39 @@ #define SVSB_INTSTS_COMPLETE 0x1 #define SVSB_INTSTS_CLEAN 0x00ff +#define debug_fops_ro(name)\ + static int svs_##name##_debug_open(struct inode *inode, \ + struct file *filp) \ + { \ + return single_open(filp, svs_##name##_debug_show, \ + inode->i_private); \ + } \ + static const struct file_operations svs_##name##_debug_fops = { \ + .owner = THIS_MODULE, \ + .open = svs_##name##_debug_open,\ + .read = seq_read, \ + .llseek = seq_lseek,\ + .release = single_release, \ + } + +#define debug_fops_rw(name)\ + static int svs_##name##_debug_open(struct inode *inode, \ + struct file *filp) \ + { \ + return single_open(filp, svs_##name##_debug_show, \ + inode->i_private); \ + } \ + static const struct file_operations svs_##name##_debug_fops = { \ + .owner = THIS_MODULE, \ + .open = svs_##name##_debug_open,\ + .read = seq_read, \ + .write = svs_##name##_debug_write, \ + .llseek = seq_lseek,\ + .release = single_release, \ + } + +#define svs_dentry(name) {__stringify(name), _##name##_debug_fops} + static DEFINE_SPINLOCK(mtk_svs_lock); /* @@ -81,6 +116,7 @@ enum svsb_phase { SVSB_PHASE_INIT01, SVSB_PHASE_INIT02, SVSB_PHASE_MON, + SVSB_PHASE_NUM, }; enum svs_reg_index { @@ -138,6 +174,7 @@ enum svs_reg_index { SPARE2, SPARE3, THSLPEVEB, + SVS_REG_NUM, }; static const u32 svs_regs_v2[] = { @@ -241,6 +278,7 @@ struct thermal_parameter { * 
@opp_volts: signed-off voltages from default opp table * @freqs_pct: percent of "opp_freqs / freq_base" for bank init * @volts: bank voltages + * @reg_data: bank register data of each phase * @freq_base: reference frequency for bank init * @vboot: voltage request for bank init01 stage only * @volt_step: bank voltage step @@ -259,6 +297,7 @@ struct thermal_parameter { * @opp_count: bank opp count * @int_st: bank interrupt identification * @sw_id: bank software identification + * @hw_id: bank hardware identification * @ctl0: bank thermal sensor selection * @cpu_id: cpu core id for SVS CPU only * @@ -284,6 +323,7 @@ struct svs_bank { u32 opp_volts[16]; u32 freqs_pct[16]; u32 volts[16]; + u32 reg_data[SVSB_PHASE_NUM][SVS_REG_NUM]; u32 freq_base; u32 vboot; u32 volt_step; @@ -321,6 +361,7 @@ struct svs_bank { u32 opp_count; u32 int_st; u32 sw_id; + u32 hw_id; u32 ctl0; u32 cpu_id; }; @@ -636,11 +677,15 @@ static void svs_set_bank_phase(struct svs_platform *svsp, static inline void svs_init01_isr_handler(struct svs_platform *svsp) { struct svs_bank *svsb = svsp->pbank; + enum svs_reg_index rg_i; dev_info(svsb->dev, "%s: VDN74~30:0x%08x~0x%08x, DC:0x%08x\n", __func__, svs_readl(svsp, VDESIGN74), svs_readl(svsp, VDESIGN30), svs_readl(svsp, DCVALUES)); + for (rg_i = DESCHAR; rg_i < SVS_REG_NUM; rg_i++) + svsb->reg_data[SVSB_PHASE_INIT01][rg_i] = svs_readl(svsp, rg_i); + svsb->phase = SVSB_PHASE_INIT01; svsb->dc_voffset_in =
[PATCH v14 1/7] dt-bindings: soc: mediatek: add mtk svs dt-bindings
Document the binding for enabling mtk svs on MediaTek SoC. Signed-off-by: Roger Lu --- .../bindings/soc/mediatek/mtk-svs.yaml| 84 +++ 1 file changed, 84 insertions(+) create mode 100644 Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml diff --git a/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml b/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml new file mode 100644 index ..a855ced410f8 --- /dev/null +++ b/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml @@ -0,0 +1,84 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/soc/mediatek/mtk-svs.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Mediatek Smart Voltage Scaling (SVS) Device Tree Bindings + +maintainers: + - Roger Lu + - Matthias Brugger + - Kevin Hilman + +description: |+ + The SVS engine is a piece of hardware which has several + controllers(banks) for calculating suitable voltage to + different power domains(CPU/GPU/CCI) according to + chip process corner, temperatures and other factors. Then DVFS + driver could apply SVS bank voltage to PMIC/Buck. + +properties: + compatible: +enum: + - mediatek,mt8183-svs + + reg: +maxItems: 1 +description: Address range of the MTK SVS controller. + + interrupts: +maxItems: 1 + + clocks: +maxItems: 1 +description: Main clock for MTK SVS controller to work. + + clock-names: +const: main + + nvmem-cells: +minItems: 1 +maxItems: 2 +description: + Phandle to the calibration data provided by a nvmem device. 
+items: + - description: SVS efuse for SVS controller + - description: Thermal efuse for SVS controller + + nvmem-cell-names: +items: + - const: svs-calibration-data + - const: t-calibration-data + +required: + - compatible + - reg + - interrupts + - clocks + - clock-names + - nvmem-cells + - nvmem-cell-names + +additionalProperties: false + +examples: + - | +#include +#include +#include + +soc { +#address-cells = <2>; +#size-cells = <2>; + +svs@1100b000 { +compatible = "mediatek,mt8183-svs"; +reg = <0 0x1100b000 0 0x1000>; +interrupts = ; +clocks = < CLK_INFRA_THERM>; +clock-names = "main"; +nvmem-cells = <_calibration>, <_calibration>; +nvmem-cell-names = "svs-calibration-data", "t-calibration-data"; +}; +}; -- 2.18.0
[PATCH v14 3/7] soc: mediatek: SVS: introduce MTK SVS engine
The Smart Voltage Scaling(SVS) engine is a piece of hardware which calculates suitable SVS bank voltages to OPP voltage table. Then, DVFS driver could apply those SVS bank voltages to PMIC/Buck when receiving OPP_EVENT_ADJUST_VOLTAGE. Signed-off-by: Roger Lu --- drivers/soc/mediatek/Kconfig | 10 + drivers/soc/mediatek/Makefile |1 + drivers/soc/mediatek/mtk-svs.c | 1702 3 files changed, 1713 insertions(+) create mode 100644 drivers/soc/mediatek/mtk-svs.c diff --git a/drivers/soc/mediatek/Kconfig b/drivers/soc/mediatek/Kconfig index fdd8bc08569e..3c3eedea35f7 100644 --- a/drivers/soc/mediatek/Kconfig +++ b/drivers/soc/mediatek/Kconfig @@ -73,4 +73,14 @@ config MTK_MMSYS Say yes here to add support for the MediaTek Multimedia Subsystem (MMSYS). +config MTK_SVS + tristate "MediaTek Smart Voltage Scaling(SVS)" + depends on MTK_EFUSE && NVMEM + help + The Smart Voltage Scaling(SVS) engine is a piece of hardware + which has several controllers(banks) for calculating suitable + voltage to different power domains(CPU/GPU/CCI) according to + chip process corner, temperatures and other factors. Then DVFS + driver could apply SVS bank voltage to PMIC/Buck. + endmenu diff --git a/drivers/soc/mediatek/Makefile b/drivers/soc/mediatek/Makefile index 90270f8114ed..0e9e703c931a 100644 --- a/drivers/soc/mediatek/Makefile +++ b/drivers/soc/mediatek/Makefile @@ -7,3 +7,4 @@ obj-$(CONFIG_MTK_SCPSYS) += mtk-scpsys.o obj-$(CONFIG_MTK_SCPSYS_PM_DOMAINS) += mtk-pm-domains.o obj-$(CONFIG_MTK_MMSYS) += mtk-mmsys.o obj-$(CONFIG_MTK_MMSYS) += mtk-mutex.o +obj-$(CONFIG_MTK_SVS) += mtk-svs.o diff --git a/drivers/soc/mediatek/mtk-svs.c b/drivers/soc/mediatek/mtk-svs.c new file mode 100644 index ..ee3b3989ab88 --- /dev/null +++ b/drivers/soc/mediatek/mtk-svs.c @@ -0,0 +1,1702 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2020 MediaTek Inc. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* svs bank 1-line sw id */ +#define SVSB_CPU_LITTLEBIT(0) +#define SVSB_CPU_BIG BIT(1) +#define SVSB_CCI BIT(2) +#define SVSB_GPU BIT(3) + +/* svs bank mode support */ +#define SVSB_MODE_ALL_DISABLE 0 +#define SVSB_MODE_INIT01 BIT(1) +#define SVSB_MODE_INIT02 BIT(2) +#define SVSB_MODE_MON BIT(3) + +/* svs bank init01 condition */ +#define SVSB_INIT01_VOLT_IGNOREBIT(1) +#define SVSB_INIT01_VOLT_INC_ONLY BIT(2) +#define SVSB_INIT01_CLK_EN BIT(31) + +/* svs bank common setting */ +#define SVSB_TZONE_HIGH_TEMP_MAX U32_MAX +#define SVSB_RUNCONFIG_DEFAULT 0x8000 +#define SVSB_DC_SIGNED_BIT 0x8000 +#define SVSB_INTEN_INIT0x 0x5f01 +#define SVSB_INTEN_MONVOPEN0x00ff +#define SVSB_EN_OFF0x0 +#define SVSB_EN_MASK 0x7 +#define SVSB_EN_INIT01 0x1 +#define SVSB_EN_INIT02 0x5 +#define SVSB_EN_MON0x2 +#define SVSB_INTSTS_MONVOP 0x00ff +#define SVSB_INTSTS_COMPLETE 0x1 +#define SVSB_INTSTS_CLEAN 0x00ff + +static DEFINE_SPINLOCK(mtk_svs_lock); + +/* + * enum svsb_phase - svs bank phase enumeration + * @SVSB_PHASE_INIT01: basic init for svs bank + * @SVSB_PHASE_INIT02: svs bank can provide voltages + * @SVSB_PHASE_MON: svs bank can provide voltages with thermal effect + * @SVSB_PHASE_ERROR: svs bank encounters unexpected condition + * + * Each svs bank has its own independent phase. We enable each svs bank by + * running their phase orderly. However, When svs bank encounters unexpected + * condition, it will fire an irq (PHASE_ERROR) to inform svs software. 
+ * + * svs bank general phase-enabled order: + * SVSB_PHASE_INIT01 -> SVSB_PHASE_INIT02 -> SVSB_PHASE_MON + */ +enum svsb_phase { + SVSB_PHASE_ERROR = 0, + SVSB_PHASE_INIT01, + SVSB_PHASE_INIT02, + SVSB_PHASE_MON, +}; + +enum svs_reg_index { + DESCHAR = 0, + TEMPCHAR, + DETCHAR, + AGECHAR, + DCCONFIG, + AGECONFIG, + FREQPCT30, + FREQPCT74, + LIMITVALS, + VBOOT, + DETWINDOW, + CONFIG, + TSCALCS, + RUNCONFIG, + SVSEN, + INIT2VALS, + DCVALUES, + AGEVALUES, + VOP30, + VOP74, + TEMP, + INTSTS, + INTSTSRAW, + INTEN, + CHKINT, + CHKSHIFT, + STATUS, + VDESIGN30, + VDESIGN74, + DVT30, + DVT74, + AGECOUNT, + SMSTATE0, + SMSTATE1, + CTL0, +
[PATCH v14 5/7] dt-bindings: soc: mediatek: add mt8192 svs dt-bindings
Signed-off-by: Roger Lu --- .../devicetree/bindings/soc/mediatek/mtk-svs.yaml | 8 1 file changed, 8 insertions(+) diff --git a/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml b/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml index a855ced410f8..59342e627b67 100644 --- a/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml +++ b/Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml @@ -22,6 +22,7 @@ properties: compatible: enum: - mediatek,mt8183-svs + - mediatek,mt8192-svs reg: maxItems: 1 @@ -51,6 +52,13 @@ properties: - const: svs-calibration-data - const: t-calibration-data + resets: +maxItems: 1 + + reset-names: +items: + - const: svs_rst + required: - compatible - reg -- 2.18.0
[PATCH v14 6/7] arm64: dts: mt8192: add svs device information
Add compatible/reg/irq/clock/efuse/reset settings in the svs node.

Signed-off-by: Roger Lu
---
 arch/arm64/boot/dts/mediatek/mt8192.dtsi | 34
 1 file changed, 34 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8192.dtsi b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
index 2f0b4824a024..f3a339de8992 100644
--- a/arch/arm64/boot/dts/mediatek/mt8192.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
@@ -268,6 +268,14 @@
 			compatible = "mediatek,mt8192-infracfg", "syscon";
 			reg = <0 0x10001000 0 0x1000>;
 			#clock-cells = <1>;
+
+			infracfg_rst: reset-controller {
+				compatible = "mediatek,infra-reset", "ti,syscon-reset";
+				#reset-cells = <1>;
+				ti,reset-bits = <
+					0x150 5 0x154 5 0 0 (ASSERT_SET | DEASSERT_SET | STATUS_NONE) /* 0: svs */
+				>;
+			};
 		};

 		pericfg: syscon@10003000 {
@@ -362,6 +370,20 @@
 			status = "disabled";
 		};

+		svs: svs@1100b000 {
+			compatible = "mediatek,mt8192-svs";
+			reg = <0 0x1100b000 0 0x1000>;
+			interrupts = ;
+			clocks = < CLK_INFRA_THERM>;
+			clock-names = "main";
+			nvmem-cells = <_calibration>,
+				      <_e_data1>;
+			nvmem-cell-names = "svs-calibration-data",
+					   "t-calibration-data";
+			resets = <_rst 0>;
+			reset-names = "svs_rst";
+		};
+
 		spi1: spi@1101 {
 			compatible = "mediatek,mt8192-spi",
 				     "mediatek,mt6765-spi";
@@ -473,6 +495,18 @@
 			status = "disable";
 		};

+		efuse: efuse@11c1 {
+			compatible = "mediatek,efuse";
+			reg = <0 0x11c1 0 0x1000>;
+
+			lvts_e_data1: data1 {
+				reg = <0x1C0 0x58>;
+			};
+			svs_calibration: calib@580 {
+				reg = <0x580 0x68>;
+			};
+		};
+
 		i2c3: i2c3@11cb {
 			compatible = "mediatek,mt8192-i2c";
 			reg = <0 0x11cb 0 0x1000>,
--
2.18.0
[PATCH v14 7/7] soc: mediatek: SVS: add mt8192 SVS GPU driver
Signed-off-by: Roger Lu --- drivers/soc/mediatek/mtk-svs.c | 477 - 1 file changed, 471 insertions(+), 6 deletions(-) diff --git a/drivers/soc/mediatek/mtk-svs.c b/drivers/soc/mediatek/mtk-svs.c index e36b3abfee03..3e152a86d280 100644 --- a/drivers/soc/mediatek/mtk-svs.c +++ b/drivers/soc/mediatek/mtk-svs.c @@ -36,6 +36,10 @@ #define SVSB_CCI BIT(2) #define SVSB_GPU BIT(3) +/* svs bank 2-line type */ +#define SVSB_LOW BIT(4) +#define SVSB_HIGH BIT(5) + /* svs bank mode support */ #define SVSB_MODE_ALL_DISABLE 0 #define SVSB_MODE_INIT01 BIT(1) @@ -280,6 +284,7 @@ struct thermal_parameter { * @volts: bank voltages * @reg_data: bank register data of each phase * @freq_base: reference frequency for bank init + * @turn_freq_base: refenrece frequency for turn point * @vboot: voltage request for bank init01 stage only * @volt_step: bank voltage step * @volt_base: bank voltage base @@ -300,6 +305,8 @@ struct thermal_parameter { * @hw_id: bank hardware identification * @ctl0: bank thermal sensor selection * @cpu_id: cpu core id for SVS CPU only + * @turn_pt: turn point informs which opp_volt calculated by high/low bank. + * @type: bank type to represent it is 2-line (high/low) bank or 1-line bank. 
* * Other structure members which are not listed above are svs platform * efuse data for bank init @@ -325,6 +332,7 @@ struct svs_bank { u32 volts[16]; u32 reg_data[SVSB_PHASE_NUM][SVS_REG_NUM]; u32 freq_base; + u32 turn_freq_base; u32 vboot; u32 volt_step; u32 volt_base; @@ -364,6 +372,8 @@ struct svs_bank { u32 hw_id; u32 ctl0; u32 cpu_id; + u32 turn_pt; + u32 type; }; /* @@ -441,6 +451,37 @@ static u32 svs_bank_volt_to_opp_volt(u32 svsb_volt, u32 svsb_volt_step, return (svsb_volt * svsb_volt_step) + svsb_volt_base; } +static u32 svs_opp_volt_to_bank_volt(u32 opp_u_volt, u32 svsb_volt_step, +u32 svsb_volt_base) +{ + return (opp_u_volt - svsb_volt_base) / svsb_volt_step; +} + +static int svs_sync_bank_volts_from_opp(struct svs_bank *svsb) +{ + struct dev_pm_opp *opp; + u32 i, opp_u_volt; + + for (i = 0; i < svsb->opp_count; i++) { + opp = dev_pm_opp_find_freq_exact(svsb->opp_dev, +svsb->opp_freqs[i], +true); + if (IS_ERR(opp)) { + dev_err(svsb->dev, "cannot find freq = %u (%ld)\n", + svsb->opp_freqs[i], PTR_ERR(opp)); + return PTR_ERR(opp); + } + + opp_u_volt = dev_pm_opp_get_voltage(opp); + svsb->volts[i] = svs_opp_volt_to_bank_volt(opp_u_volt, + svsb->volt_step, + svsb->volt_base); + dev_pm_opp_put(opp); + } + + return 0; +} + static int svs_get_bank_zone_temperature(const char *tzone_name, int *tzone_temp) { @@ -456,7 +497,7 @@ static int svs_get_bank_zone_temperature(const char *tzone_name, static int svs_adjust_pm_opp_volts(struct svs_bank *svsb, bool force_update) { int tzone_temp, ret = -EPERM; - u32 i, svsb_volt, opp_volt, temp_offset = 0; + u32 i, svsb_volt, opp_volt, temp_offset = 0, opp_start, opp_stop; mutex_lock(>lock); @@ -470,6 +511,21 @@ static int svs_adjust_pm_opp_volts(struct svs_bank *svsb, bool force_update) goto unlock_mutex; } + /* +* 2-line bank updates its corresponding opp volts. +* 1-line bank updates all opp volts. 
+*/ + if (svsb->type == SVSB_HIGH) { + opp_start = 0; + opp_stop = svsb->turn_pt; + } else if (svsb->type == SVSB_LOW) { + opp_start = svsb->turn_pt; + opp_stop = svsb->opp_count; + } else { + opp_start = 0; + opp_stop = svsb->opp_count; + } + /* Get thermal effect */ if (svsb->phase == SVSB_PHASE_MON) { if (svsb->temp > svsb->temp_upper_bound && @@ -491,10 +547,16 @@ static int svs_adjust_pm_opp_volts(struct svs_bank *svsb, bool force_update) temp_offset += svsb->tzone_high_temp_offset; else if (tzone_temp <= svsb->tzone_low_temp) temp_offset += svsb->tzone_low_temp_offset; + + /* 2-line bank takes thermal factor to update all opp volts */ + if (svsb->type == SVSB_HIGH || svsb->type == SVSB_LOW) { + opp_start = 0; + opp_stop = svsb->opp_count; + }
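The voltage conversion helpers and the OPP range selection in the hunks above can be tried out in isolation. The sketch below mirrors the arithmetic of svs_bank_volt_to_opp_volt() and svs_opp_volt_to_bank_volt() and the start/stop choice for 2-line banks outside the monitor phase. The enum, the struct and the function names here are hypothetical, and any step or base values passed in are example inputs rather than figures from the driver.

```c
#include <stdint.h>

/* Bank step index to OPP voltage in microvolts. */
static uint32_t bank_volt_to_opp_volt(uint32_t svsb_volt, uint32_t volt_step,
				      uint32_t volt_base)
{
	return svsb_volt * volt_step + volt_base;
}

/* Inverse mapping; integer division truncates toward the lower step. */
static uint32_t opp_volt_to_bank_volt(uint32_t opp_u_volt, uint32_t volt_step,
				      uint32_t volt_base)
{
	return (opp_u_volt - volt_base) / volt_step;
}

/* Hypothetical stand-ins for the 1-line / SVSB_LOW / SVSB_HIGH bank types. */
enum model_bank_type { MODEL_ONE_LINE, MODEL_LOW, MODEL_HIGH };

/* Half-open range [start, stop) of OPP indices a bank updates. */
struct model_opp_range {
	uint32_t start;
	uint32_t stop;
};

static struct model_opp_range model_bank_opp_range(enum model_bank_type type,
						   uint32_t turn_pt,
						   uint32_t opp_count)
{
	/* A 1-line bank covers every OPP entry. */
	struct model_opp_range r = { 0, opp_count };

	if (type == MODEL_HIGH)
		r.stop = turn_pt;   /* high bank: entries before the turn point */
	else if (type == MODEL_LOW)
		r.start = turn_pt;  /* low bank: entries from the turn point on */

	return r;
}
```

With an assumed step of 6250 uV and base of 500000 uV, step index 48 maps to 800000 uV and back to 48. Note that the patch widens both 2-line ranges to all OPP entries once the thermal factor is applied in the monitor phase, which this sketch does not model.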
[PATCH v14 2/7] arm64: dts: mt8183: add svs device information
Add compatible/reg/irq/clock/efuse settings to the svs node.

Signed-off-by: Roger Lu
---
 arch/arm64/boot/dts/mediatek/mt8183.dtsi | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 80519a145f13..441d617ece43 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -657,6 +657,18 @@
 			status = "disabled";
 		};

+		svs: svs@1100b000 {
+			compatible = "mediatek,mt8183-svs";
+			reg = <0 0x1100b000 0 0x1000>;
+			interrupts = ;
+			clocks = < CLK_INFRA_THERM>;
+			clock-names = "main";
+			nvmem-cells = <_calibration>,
+				      <_calibration>;
+			nvmem-cell-names = "svs-calibration-data",
+					   "t-calibration-data";
+		};
+
 		pwm0: pwm@1100e000 {
 			compatible = "mediatek,mt8183-disp-pwm";
 			reg = <0 0x1100e000 0 0x1000>;
@@ -941,9 +953,15 @@
 			reg = <0 0x11f1 0 0x1000>;
 			#address-cells = <1>;
 			#size-cells = <1>;
+			thermal_calibration: calib@180 {
+				reg = <0x180 0xc>;
+			};
 			mipi_tx_calibration: calib@190 {
 				reg = <0x190 0xc>;
 			};
+			svs_calibration: calib@580 {
+				reg = <0x580 0x64>;
+			};
 		};

 		u3phy: usb-phy@11f4 {
--
2.18.0
[PATCH v14 0/7] soc: mediatek: SVS: introduce MTK SVS
1. The SVS driver uses the OPP adjust event in [1] to update the voltage
   part of the OPP table.
2. The SVS driver gets the thermal/GPU device by node [2][3] and the CPU
   device by get_cpu_device(). After retrieving the subsys device, the SVS
   driver calls device_link_add() to ensure the probe/suspend callback
   ordering.
3. The SVS dts refers to the reset controller [4] to help reset the SVS HW.

#mt8183 SVS related patches
[1] https://patchwork.kernel.org/patch/11193513/
[2] https://patchwork.kernel.org/project/linux-mediatek/patch/20201013102358.22588-2-michael@mediatek.com/
[3] https://patchwork.kernel.org/project/linux-mediatek/patch/20200306041345.259332-3-drink...@chromium.org/

#mt8192 SVS related patches
[1] https://patchwork.kernel.org/patch/11193513/
[2] https://patchwork.kernel.org/project/linux-mediatek/patch/20201223074944.2061-1-michael@mediatek.com/
[3] https://lore.kernel.org/patchwork/patch/1360551/
[4] https://patchwork.kernel.org/project/linux-mediatek/patch/20200817030324.5690-5-crystal@mediatek.com/

changes since v13:
- Fix "mtk-svs.yaml: properties:nvmem-cells:maxItems: False schema does
  not allow 2"
- Remove wrong maintainer "Nishanth Menon "
- When turn_pt = 0, the SVS HIGH bank fills FREQPCT74 / FREQPCT30 with 0
  and the SVS controller won't run normally. Therefore, we initialize the
  SVS HIGH bank's FREQPCT30 with svsb->freqs_pct[0] to avoid this issue.
- Change the SVS GPU opp count back from 14 to 16 because GPU DVFS has a
  better solution

Roger Lu (7):
  [v14,1/7]: dt-bindings: soc: mediatek: add mtk svs dt-bindings
  [v14,2/7]: arm64: dts: mt8183: add svs device information
  [v14,3/7]: soc: mediatek: SVS: introduce MTK SVS engine
  [v14,4/7]: soc: mediatek: SVS: add debug commands
  [v14,5/7]: dt-bindings: soc: mediatek: add mt8192 svs dt-bindings
  [v14,6/7]: arm64: dts: mt8192: add svs device information
  [v14,7/7]: soc: mediatek: SVS: add mt8192 SVS GPU driver

 .../bindings/soc/mediatek/mtk-svs.yaml    |   92 +
 arch/arm64/boot/dts/mediatek/mt8183.dtsi  |   18 +
 arch/arm64/boot/dts/mediatek/mt8192.dtsi  |   34 +
 drivers/soc/mediatek/Kconfig              |   10 +
 drivers/soc/mediatek/Makefile             |    1 +
 drivers/soc/mediatek/mtk-svs.c            | 2495 +
 6 files changed, 2650 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/mediatek/mtk-svs.yaml
 create mode 100644 drivers/soc/mediatek/mtk-svs.c
Re: [PATCH 1/2] perf/core: Share an event with multiple cgroups
Hi Song, Thanks for your review! On Thu, Mar 25, 2021 at 9:56 AM Song Liu wrote: > > On Mar 23, 2021, at 9:21 AM, Namhyung Kim wrote: > > > > As we can run many jobs (in container) on a big machine, we want to > > measure each job's performance during the run. To do that, the > > perf_event can be associated to a cgroup to measure it only. > > > > However such cgroup events need to be opened separately and it causes > > significant overhead in event multiplexing during the context switch > > as well as resource consumption like in file descriptors and memory > > footprint. > > > > As a cgroup event is basically a cpu event, we can share a single cpu > > event for multiple cgroups. All we need is a separate counter (and > > two timing variables) for each cgroup. I added a hash table to map > > from cgroup id to the attached cgroups. > > > > With this change, the cpu event needs to calculate a delta of event > > counter values when the cgroups of current and the next task are > > different. And it attributes the delta to the current task's cgroup. > > > > This patch adds two new ioctl commands to perf_event for light-weight > > cgroup event counting (i.e. perf stat). > > > > * PERF_EVENT_IOC_ATTACH_CGROUP - it takes a buffer consists of a > > 64-bit array to attach given cgroups. The first element is a > > number of cgroups in the buffer, and the rest is a list of cgroup > > ids to add a cgroup info to the given event. > > > > * PERF_EVENT_IOC_READ_CGROUP - it takes a buffer consists of a 64-bit > > array to get the event counter values. The first element is size > > of the array in byte, and the second element is a cgroup id to > > read. The rest is to save the counter value and timings. > > > > This attaches all cgroups in a single syscall and I didn't add the > > DETACH command deliberately to make the implementation simple. The > > attached cgroup nodes would be deleted when the file descriptor of the > > perf_event is closed. 
> > > > Cc: Tejun Heo > > Signed-off-by: Namhyung Kim > > --- > > include/linux/perf_event.h | 22 ++ > > include/uapi/linux/perf_event.h | 2 + > > kernel/events/core.c| 474 ++-- > > 3 files changed, 471 insertions(+), 27 deletions(-) > > > > diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h > > index 3f7f89ea5e51..2760f3b07534 100644 > > --- a/include/linux/perf_event.h > > +++ b/include/linux/perf_event.h > > @@ -771,6 +771,18 @@ struct perf_event { > > > > #ifdef CONFIG_CGROUP_PERF > > struct perf_cgroup *cgrp; /* cgroup event is attach to */ > > + > > + /* to share an event for multiple cgroups */ > > + struct hlist_head *cgrp_node_hash; > > + struct perf_cgroup_node *cgrp_node_entries; > > + int nr_cgrp_nodes; > > + int cgrp_node_hash_bits; > > + > > + struct list_headcgrp_node_entry; > > + > > + u64 cgrp_node_count; > > + u64 cgrp_node_time_enabled; > > + u64 cgrp_node_time_running; > > A comment saying the above values are from previous reading would be helpful. Sure, will add. > > > #endif > > > > #ifdef CONFIG_SECURITY > > @@ -780,6 +792,14 @@ struct perf_event { > > #endif /* CONFIG_PERF_EVENTS */ > > }; > > > > +struct perf_cgroup_node { > > + struct hlist_node node; > > + u64 id; > > + u64 count; > > + u64 time_enabled; > > + u64 time_running; > > + u64 padding[2]; > > Do we really need the padding? For cache line alignment? Yeah I was thinking about it. It seems I need to use the ___cacheline_aligned macro instead. 
> > > +}; > > > > struct perf_event_groups { > > struct rb_root tree; > > @@ -843,6 +863,8 @@ struct perf_event_context { > > int pin_count; > > #ifdef CONFIG_CGROUP_PERF > > int nr_cgroups; /* cgroup evts */ > > + struct list_headcgrp_node_list; > > + struct list_headcgrp_ctx_entry; > > #endif > > void*task_ctx_data; /* pmu specific data > > */ > > struct rcu_head rcu_head; > > diff --git a/include/uapi/linux/perf_event.h > > b/include/uapi/linux/perf_event.h > > index ad15e40d7f5d..06bc7ab13616 100644 > > --- a/include/uapi/linux/perf_event.h > > +++ b/include/uapi/linux/perf_event.h > > @@ -479,6 +479,8 @@ struct perf_event_query_bpf { > > #define PERF_EVENT_IOC_PAUSE_OUTPUT _IOW('$', 9, __u32) > > #define PERF_EVENT_IOC_QUERY_BPF _IOWR('$', 10, struct > > perf_event_query_bpf *) > > #define PERF_EVENT_IOC_MODIFY_ATTRIBUTES _IOW('$', 11,
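The buffer layouts described for the two proposed ioctls can be sketched in user space as follows. This only builds the 64-bit arrays; it does not call ioctl(). The helper names are hypothetical, and the assumption that the read buffer carries exactly three result slots (count, time_enabled, time_running, in that order) is mine, inferred from the fields of struct perf_cgroup_node rather than stated in the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Layout described for PERF_EVENT_IOC_ATTACH_CGROUP:
 *   buf[0]     = number of cgroups
 *   buf[1..nr] = cgroup ids
 * Returns a heap buffer the caller must free(), or NULL on failure.
 */
static uint64_t *build_attach_buf(const uint64_t *cgrp_ids, uint64_t nr)
{
	uint64_t *buf = calloc(nr + 1, sizeof(*buf));

	if (!buf)
		return NULL;

	buf[0] = nr;
	if (nr)
		memcpy(&buf[1], cgrp_ids, nr * sizeof(*buf));

	return buf;
}

/* Assumed slot count: size, cgroup id, count, time_enabled, time_running. */
#define READ_BUF_WORDS 5

/*
 * Layout described for PERF_EVENT_IOC_READ_CGROUP:
 *   buf[0] = size of the array in bytes
 *   buf[1] = cgroup id to read
 *   buf[2..] = space for the counter value and timings (zeroed here)
 */
static void init_read_buf(uint64_t buf[READ_BUF_WORDS], uint64_t cgrp_id)
{
	memset(buf, 0, READ_BUF_WORDS * sizeof(uint64_t));
	buf[0] = READ_BUF_WORDS * sizeof(uint64_t);
	buf[1] = cgrp_id;
}
```

Since the patch attaches all cgroups in one call and has no DETACH command, a caller would build the attach buffer once per event and then reuse one read buffer per cgroup id.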
Re: [PATCH v6 00/12] SVM cleanup and INVPCID feature support
On Wed, 24 Mar 2021, Hugh Dickins wrote: > On Wed, 24 Mar 2021, Borislav Petkov wrote: > > > Ok, > > > > some more experimenting Babu and I did lead us to: > > > > --- > > diff --git a/arch/x86/include/asm/tlbflush.h > > b/arch/x86/include/asm/tlbflush.h > > index f5ca15622dc9..259aa4889cad 100644 > > --- a/arch/x86/include/asm/tlbflush.h > > +++ b/arch/x86/include/asm/tlbflush.h > > @@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned > > long addr) > > */ > > if (kaiser_enabled) > > invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr); > > + else > > + asm volatile("invlpg (%0)" ::"r" (addr) : "memory"); > > + > > invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr); > > } > > > > applied on the guest kernel which fixes the issue. And let me add Hugh > > who did that PCID stuff at the time. So lemme summarize for Hugh and to > > ask him nicely to sanity-check me. :-) > > Just a brief interim note to assure you that I'm paying attention, > but wow, it's a long time since I gave any thought down here! > Trying to page it all back in... > > I see no harm in your workaround if it works, but it's not as if > this is a previously untried path: so I'm suspicious how an issue > here with Globals could have gone unnoticed for so long, and need > to understand it better. Right, after looking into it more, I completely agree with you: the Kaiser series (in both 4.4-stable and 4.9-stable) was simply wrong to lose that invlpg - fine in the kaiser case when we don't enable Globals at all, but plain wrong in the !kaiser_enabled case. One way or another, we have somehow got away with it for three years. I do agree with Paolo that the PCID_ASID_KERN flush would be better moved under the "if (kaiser_enabled)" now. (And if this were ongoing development, I'd want to rewrite the function altogether: but no, these old stable trees are not the place for that.) Boris, may I leave both -stable fixes to you? Let me know if you'd prefer me to clean up my mess. 
Thanks a lot for tracking this down, Hugh > > > > Basically, you have an AMD host which supports PCID and INVPCID and you > > boot on it a 4.9 guest. It explodes like the panic below. > > > > What fixes it is this: > > > > diff --git a/arch/x86/include/asm/tlbflush.h > > b/arch/x86/include/asm/tlbflush.h > > index f5ca15622dc9..259aa4889cad 100644 > > --- a/arch/x86/include/asm/tlbflush.h > > +++ b/arch/x86/include/asm/tlbflush.h > > @@ -250,6 +250,9 @@ static inline void __native_flush_tlb_single(unsigned > > long addr) > > */ > > if (kaiser_enabled) > > invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr); > > + else > > + asm volatile("invlpg (%0)" ::"r" (addr) : "memory"); > > + > > invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr); > > } > > > > --- > > > > and the reason why it does, IMHO, is because on AMD, kaiser_enabled is > > false because AMD is not affected by Meltdown, which means, there's no > > user/kernel pagetables split. > > > > And that also means, you have global TLB entries which means that if you > > look at that __native_flush_tlb_single() function, it needs to flush > > global TLB entries on CPUs with X86_FEATURE_INVPCID_SINGLE by doing an > > INVLPG in the kaiser_enabled=0 case. Errgo, the above hunk. > > > > But I might be completely off here thus this note... > > > > Thoughts? > > > > Thx. > > > > > > [1.235726] [ cut here ] > > [1.237515] kernel BUG at > > /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709! 
> > [1.240926] invalid opcode: [#1] SMP > > [1.243301] Modules linked in: > > [1.244585] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 > > Debian 4.9.228-1 > > [1.247657] Hardware name: Google Google Compute Engine/Google Compute > > Engine, BIOS Google 01/01/2011 > > [1.251249] task: 909363e94040 task.stack: a41bc0194000 > > [1.253519] RIP: 0010:[] [] > > text_poke+0x18c/0x240 > > [1.256593] RSP: 0018:a41bc0197d90 EFLAGS: 00010096 > > [1.258657] RAX: 000f RBX: 01020800 RCX: > > feda3203 > > [1.261388] RDX: 178bfbff RSI: RDI: > > ff57a000 > > [1.264168] RBP: 8fbd3eca R08: R09: > > 0003 > > [1.266983] R10: 0003 R11: 0112 R12: > > 0001 > > [1.269702] R13: a41bc0197dcf R14: 0286 R15: > > ed1c40407500 > > [1.272572] FS: () GS:90936630() > > knlGS: > > [1.275791] CS: 0010 DS: ES: CR0: 80050033 > > [1.278032] CR2: CR3: 10c08000 CR4: > > 003606f0 > > [1.280815] Stack: > > [1.281630] 8fbd3eca 0005 a41bc0197e03 > > 8fbd3ecb > > [1.284660] 8fa2e835 > >
[PATCH v2] sched/topology: remove redundant cpumask_and in init_overlap_sched_group
mask is built in build_balance_mask() by for_each_cpu(i, sg_span), so
it must be a subset of sched_group_span(sg). Though cpumask_first_and
doesn't lead to a wrong result of balance cpu, it is pointless to do
cpumask_and again.

Signed-off-by: Barry Song
Reviewed-by: Valentin Schneider
---
-v2: add reviewed-by of Valentin, thanks!

 kernel/sched/topology.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index f2066d682cd8..d1aec244c027 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -934,7 +934,7 @@ static void init_overlap_sched_group(struct sched_domain *sd,
 	int cpu;

 	build_balance_mask(sd, sg, mask);
-	cpu = cpumask_first_and(sched_group_span(sg), mask);
+	cpu = cpumask_first(mask);

 	sg->sgc = *per_cpu_ptr(sdd->sgc, cpu);
 	if (atomic_inc_return(>sgc->ref) == 1)
--
2.25.1
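The subset argument in the commit message can be checked with plain 64-bit masks. This is a user-space sketch only: `mask_first()`, `mask_first_and()` and `build_subset_mask()` are hypothetical stand-ins for the cpumask helpers, and the `filter` parameter stands in for whatever condition build_balance_mask() applies to each CPU.

```c
#include <stdint.h>

/* Index of the lowest set bit, or 64 if the mask is empty. */
static unsigned int mask_first(uint64_t mask)
{
	unsigned int i;

	for (i = 0; i < 64; i++)
		if (mask & (UINT64_C(1) << i))
			return i;

	return 64;
}

/* Lowest bit set in both masks. */
static unsigned int mask_first_and(uint64_t a, uint64_t b)
{
	return mask_first(a & b);
}

/*
 * Build a mask by visiting only the bits set in span, the way the commit
 * message describes build_balance_mask() iterating over sg_span. The
 * result can never contain a bit that is not in span.
 */
static uint64_t build_subset_mask(uint64_t span, uint64_t filter)
{
	uint64_t mask = 0;
	unsigned int i;

	for (i = 0; i < 64; i++) {
		uint64_t bit = UINT64_C(1) << i;

		if ((span & bit) && (filter & bit))
			mask |= bit;
	}

	return mask;
}
```

Because the built mask is a subset of span, `span & mask == mask`, so `mask_first_and(span, mask)` and `mask_first(mask)` always agree.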
[PATCH] include: linux: debug_locks: Remove duplicate declaration
struct task_struct is already declared on line 9. Remove the duplicate.

Signed-off-by: Wan Jiabing
---
 include/linux/debug_locks.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/debug_locks.h b/include/linux/debug_locks.h
index 2915f56ad421..0b3187a5290d 100644
--- a/include/linux/debug_locks.h
+++ b/include/linux/debug_locks.h
@@ -46,7 +46,6 @@ extern int debug_locks_off(void);
 # define locking_selftest()	do { } while (0)
 #endif

-struct task_struct;

 #ifdef CONFIG_LOCKDEP
 extern void debug_show_all_locks(void);
--
2.25.1
RE: [EXT] Re: [PATCH v2 3/3] PCI: imx: clear vreg bypass when pcie vph voltage is 3v3
> -Original Message- > From: Lucas Stach > Sent: Wednesday, March 24, 2021 5:30 PM > To: Richard Zhu ; andrew.smir...@gmail.com; > shawn...@kernel.org; k...@linux.com; bhelg...@google.com; > ste...@agner.ch; lorenzo.pieral...@arm.com > Cc: linux-...@vger.kernel.org; dl-linux-imx ; > linux-arm-ker...@lists.infradead.org; linux-kernel@vger.kernel.org; > ker...@pengutronix.de > Subject: Re: [PATCH v2 3/3] PCI: imx: clear vreg bypass when pcie vph > voltage is 3v3 > Am Mittwoch, dem 24.03.2021 um 13:34 +0800 schrieb Richard Zhu: > > Both 1.8v and 3.3v power supplies can be used by i.MX8MQ PCIe PHY. > > In default, the PCIE_VPH voltage is suggested to be 1.8v refer to data > > sheet. When PCIE_VPH is supplied by 3.3v in the HW schematic design, > > the VREG_BYPASS bits of GPR registers should be cleared from default > > value 1b'1 to 1b'0. Thus, the internal 3v3 to 1v8 translator would be > > turned on. > > > > Signed-off-by: Richard Zhu > > --- > > drivers/pci/controller/dwc/pci-imx6.c | 23 +++ > > 1 file changed, 23 insertions(+) > > > > diff --git a/drivers/pci/controller/dwc/pci-imx6.c > > b/drivers/pci/controller/dwc/pci-imx6.c > > index 853ea8e82952..beca085a9300 100644 > > --- a/drivers/pci/controller/dwc/pci-imx6.c > > +++ b/drivers/pci/controller/dwc/pci-imx6.c > > @@ -37,6 +37,7 @@ > > #define IMX8MQ_GPR_PCIE_REF_USE_PAD BIT(9) > > #define IMX8MQ_GPR_PCIE_CLK_REQ_OVERRIDE_EN BIT(10) > > #define IMX8MQ_GPR_PCIE_CLK_REQ_OVERRIDE BIT(11) > > +#define IMX8MQ_GPR_PCIE_VREG_BYPASS BIT(12) > > #define IMX8MQ_GPR12_PCIE2_CTRL_DEVICE_TYPE GENMASK(11, 8) > > #define IMX8MQ_PCIE2_BASE_ADDR > 0x33c0 > > > > > > > > > > @@ -80,6 +81,7 @@ struct imx6_pcie { > > u32 tx_swing_full; > > u32 tx_swing_low; > > struct regulator*vpcie; > > + struct regulator*vph; > > void __iomem*phy_base; > > > > > > > > > > /* power domain for pcie */ > > @@ -611,6 +613,8 @@ static void imx6_pcie_configure_type(struct > > imx6_pcie *imx6_pcie) > > > > > > > > > > static void 
imx6_pcie_init_phy(struct imx6_pcie *imx6_pcie) { > > + int phy_uv; > > + > No need for this variable... [Richard Zhu] Thanks, would be removed later. > > > switch (imx6_pcie->drvdata->variant) { > > case IMX8MQ: > > /* > > @@ -621,6 +625,18 @@ static void imx6_pcie_init_phy(struct imx6_pcie > *imx6_pcie) > > imx6_pcie_grp_offset(imx6_pcie), > > > IMX8MQ_GPR_PCIE_REF_USE_PAD, > > > IMX8MQ_GPR_PCIE_REF_USE_PAD); > > + /* > > + * Regarding to the datasheet, the PCIE_VPH is suggested > > + * to be 1.8V. If the PCIE_VPH is supplied by 3.3V, the > > + * VREG_BYPASS should be cleared to zero. > > + */ > > + if (imx6_pcie->vph) > > + phy_uv = > regulator_get_voltage(imx6_pcie->vph); > > + if (phy_uv > 300) > > + regmap_update_bits(imx6_pcie->iomuxc_gpr, > > + > imx6_pcie_grp_offset(imx6_pcie), > > + > IMX8MQ_GPR_PCIE_VREG_BYPASS, > > +0); > > ...if you just fold this into a single condition. Right now phy_uv might be > used > uninitialized when the vph-supply is not specified in the DT. Better write > this > as: > > if (imx6_pcie->vph && regulator_get_voltage(imx6_pcie->vph) > 300) [Richard Zhu] Thanks. Would be changed as this way. > > Regards, > Lucas > > > break; > > case IMX7D: > > regmap_update_bits(imx6_pcie->iomuxc_gpr, > IOMUXC_GPR12, > > @@ -1130,6 +1146,13 @@ static int imx6_pcie_probe(struct > platform_device *pdev) > > imx6_pcie->vpcie = NULL; > > } > > > > > > > > > > > > > > > > > > + imx6_pcie->vph = devm_regulator_get_optional(>dev, > "vph"); > > + if (IS_ERR(imx6_pcie->vph)) { > > + if (PTR_ERR(imx6_pcie->vph) != -ENODEV) > > + return PTR_ERR(imx6_pcie->vph); > > + imx6_pcie->vph = NULL; > > + } > > + > > platform_set_drvdata(pdev, imx6_pcie); > > > > > > > > > > > > > > > > > > ret = imx6_pcie_attach_pd(dev); >
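Lucas's point about folding the check into one condition relies on short-circuit evaluation of `&&`: the voltage is only read when the regulator pointer is non-NULL, so no temporary can be left uninitialized. A minimal user-space sketch follows. The `fake_regulator` type, `fake_get_voltage()` and the 3000000 uV threshold are assumptions for illustration, not the driver's API or the value used in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct regulator. */
struct fake_regulator {
	int microvolts;
};

/* Hypothetical stand-in for regulator_get_voltage(); returns microvolts. */
static int fake_get_voltage(const struct fake_regulator *reg)
{
	return reg->microvolts;
}

/* Assumed cut-off between a 1.8 V and a 3.3 V supply, in microvolts. */
#define VPH_3V3_THRESHOLD_UV 3000000

/*
 * True when the bypass bit should be cleared. The right-hand side of &&
 * runs only if vph is non-NULL, so a missing optional supply is safe.
 */
static bool needs_vreg_bypass_clear(const struct fake_regulator *vph)
{
	return vph && fake_get_voltage(vph) > VPH_3V3_THRESHOLD_UV;
}
```

With this shape, a board that does not describe a vph supply takes the false branch and keeps the default bypass setting.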
[PATCH] ext4: Fix bug on in ext4_es_cache_extent as ext4_split_extent_at failed
We got the following BUG_ON:
[130747.323114] kernel BUG at fs/ext4/extents_status.c:762!
[130747.323117] Internal error: Oops - BUG: 0 [#1] SMP
..
[130747.334329] Call trace:
[130747.334553]  ext4_es_cache_extent+0x150/0x168 [ext4]
[130747.334975]  ext4_cache_extents+0x64/0xe8 [ext4]
[130747.335368]  ext4_find_extent+0x300/0x330 [ext4]
[130747.335759]  ext4_ext_map_blocks+0x74/0x1178 [ext4]
[130747.336179]  ext4_map_blocks+0x2f4/0x5f0 [ext4]
[130747.336567]  ext4_mpage_readpages+0x4a8/0x7a8 [ext4]
[130747.336995]  ext4_readpage+0x54/0x100 [ext4]
[130747.337359]  generic_file_buffered_read+0x410/0xae8
[130747.337767]  generic_file_read_iter+0x114/0x190
[130747.338152]  ext4_file_read_iter+0x5c/0x140 [ext4]
[130747.338556]  __vfs_read+0x11c/0x188
[130747.338851]  vfs_read+0x94/0x150
[130747.339110]  ksys_read+0x74/0xf0

If ext4_ext_insert_extent() fails after the new extent has already been
inserted, we just restore "ex->ee_len = orig_ex.ee_len". This leads to
overlapping extents, which then triggers the BUG_ON when the extent is
cached. So when ext4_ext_insert_extent() fails, don't restore ex->ee_len
to the old value. This may leak blocks, but that can be fixed by fsck
later.

Signed-off-by: Ye Bin
---
 fs/ext4/extents.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 77c84d6f1af6..970eb2dfcc46 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3246,7 +3246,7 @@ static int ext4_split_extent_at(handle_t *handle,

 		goto out;
 	} else if (err)
-		goto fix_extent_len;
+		goto err;

 out:
 	ext4_ext_show_leaf(inode, path);
@@ -3254,6 +3254,7 @@ static int ext4_split_extent_at(handle_t *handle,

 fix_extent_len:
 	ex->ee_len = orig_ex.ee_len;
+err:
 	/*
 	 * Ignore ext4_ext_dirty return value since we are already in error path
 	 * and err is a non-zero error code.
--
2.25.4
Re: [PATCH] scsi: bnx2i: make bnx2i_process_iscsi_error simpler and more robust
Rasmus,

> Instead of strcpy'ing into a stack buffer, just let additional_notice
> point to a string literal living in .rodata. This is better in a few
> ways:

Applied to 5.13/scsi-staging, thanks!

--
Martin K. Petersen	Oracle Linux Engineering
Re: [PATCH] spi: fsi: Remove multiple sequenced ops for restricted chips
On Wed, 24 Mar 2021 at 22:05, Eddie James wrote: > > Updated restricted chips have trouble processing multiple sequenced > operations. So remove the capability to sequence multiple operations and > reduce the maximum transfer size to 8 bytes. > > Signed-off-by: Eddie James Reviewed-by: Joel Stanley > --- > drivers/spi/spi-fsi.c | 27 +++ > 1 file changed, 7 insertions(+), 20 deletions(-) > > diff --git a/drivers/spi/spi-fsi.c b/drivers/spi/spi-fsi.c > index 3920cd3286d8..de359718e816 100644 > --- a/drivers/spi/spi-fsi.c > +++ b/drivers/spi/spi-fsi.c > @@ -26,7 +26,7 @@ > #define SPI_FSI_BASE 0x7 > #define SPI_FSI_INIT_TIMEOUT_MS1000 > #define SPI_FSI_MAX_XFR_SIZE 2048 > -#define SPI_FSI_MAX_XFR_SIZE_RESTRICTED32 > +#define SPI_FSI_MAX_XFR_SIZE_RESTRICTED8 > > #define SPI_FSI_ERROR 0x0 > #define SPI_FSI_COUNTER_CFG0x1 > @@ -265,14 +265,12 @@ static int fsi_spi_sequence_transfer(struct fsi_spi > *ctx, > struct fsi_spi_sequence *seq, > struct spi_transfer *transfer) > { > - bool docfg = false; > int loops; > int idx; > int rc; > u8 val = 0; > u8 len = min(transfer->len, 8U); > u8 rem = transfer->len % len; > - u64 cfg = 0ULL; > > loops = transfer->len / len; > > @@ -292,28 +290,17 @@ static int fsi_spi_sequence_transfer(struct fsi_spi > *ctx, > return -EINVAL; > } > > - if (ctx->restricted) { > - const int eidx = rem ? 
5 : 6; > - > - while (loops > 1 && idx <= eidx) { > - idx = fsi_spi_sequence_add(seq, val); > - loops--; > - docfg = true; > - } > - > - if (loops > 1) { > - dev_warn(ctx->dev, "No sequencer slots; aborting.\n"); > - return -EINVAL; > - } > + if (ctx->restricted && loops > 1) { > + dev_warn(ctx->dev, > +"Transfer too large; no branches permitted.\n"); > + return -EINVAL; > } > > if (loops > 1) { > + u64 cfg = SPI_FSI_COUNTER_CFG_LOOPS(loops - 1); > + > fsi_spi_sequence_add(seq, SPI_FSI_SEQUENCE_BRANCH(idx)); > - docfg = true; > - } > > - if (docfg) { > - cfg = SPI_FSI_COUNTER_CFG_LOOPS(loops - 1); > if (transfer->rx_buf) > cfg |= SPI_FSI_COUNTER_CFG_N2_RX | > SPI_FSI_COUNTER_CFG_N2_TX | > -- > 2.27.0 >
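The length arithmetic that remains after this patch can be followed with a small user-space sketch. It mirrors the `len`, `rem` and `loops` computation from fsi_spi_sequence_transfer() and the new restricted-chip check. The `xfer_plan` struct, the function name and the zero-length guard are assumptions added for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical container for the values the driver computes per transfer. */
struct xfer_plan {
	unsigned int len;	/* bytes per shift operation, at most 8 */
	unsigned int rem;	/* leftover bytes after the full loops */
	unsigned int loops;	/* number of full len-sized operations */
};

static int plan_transfer(unsigned int xfer_len, bool restricted,
			 struct xfer_plan *plan)
{
	/* Guard added for this sketch so the division below is defined. */
	if (xfer_len == 0)
		return -EINVAL;

	plan->len = xfer_len < 8 ? xfer_len : 8;
	plan->rem = xfer_len % plan->len;
	plan->loops = xfer_len / plan->len;

	/* Restricted chips may not branch, so more than one loop is refused. */
	if (restricted && plan->loops > 1)
		return -EINVAL;

	return 0;
}
```

This is consistent with lowering SPI_FSI_MAX_XFR_SIZE_RESTRICTED to 8: any transfer of up to 8 bytes yields a single loop, while a 16-byte transfer on a restricted chip is rejected.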
[PATCH] include: linux: host1x: Remove duplicate declaration
struct host1x is already declared on line 20. Remove the duplicate.

Signed-off-by: Wan Jiabing
---
 include/linux/host1x.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index ce59a6a6a008..462f0bc7a703 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -140,7 +140,6 @@ static inline void host1x_bo_munmap(struct host1x_bo *bo, void *addr)

 struct host1x_syncpt_base;
 struct host1x_syncpt;
-struct host1x;

 struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, u32 id);
 u32 host1x_syncpt_id(struct host1x_syncpt *sp);
--
2.25.1
Re: [PATCH v12 1/2] scsi: ufs: Enable power management for wlun
On 3/23/2021 12:19 PM, Adrian Hunter wrote: On 23/03/21 5:13 pm, Asutosh Das (asd) wrote: On 3/22/2021 11:12 PM, Adrian Hunter wrote: On 22/03/21 9:53 pm, Asutosh Das (asd) wrote: On 3/19/2021 10:47 AM, Adrian Hunter wrote: On 19/03/21 2:35 am, Asutosh Das wrote: During runtime-suspend of ufs host, the scsi devices are already suspended and so are the queues associated with them. But the ufs host sends SSU to wlun during its runtime-suspend. During the process blk_queue_enter checks if the queue is not in suspended state. If so, it waits for the queue to resume, and never comes out of it. The commit (d55d15a33: scsi: block: Do not accept any requests while suspended) adds the check if the queue is in suspended state in blk_queue_enter(). Call trace: __switch_to+0x174/0x2c4 __schedule+0x478/0x764 schedule+0x9c/0xe0 blk_queue_enter+0x158/0x228 blk_mq_alloc_request+0x40/0xa4 blk_get_request+0x2c/0x70 __scsi_execute+0x60/0x1c4 ufshcd_set_dev_pwr_mode+0x124/0x1e4 ufshcd_suspend+0x208/0x83c ufshcd_runtime_suspend+0x40/0x154 ufshcd_pltfrm_runtime_suspend+0x14/0x20 pm_generic_runtime_suspend+0x28/0x3c __rpm_callback+0x80/0x2a4 rpm_suspend+0x308/0x614 rpm_idle+0x158/0x228 pm_runtime_work+0x84/0xac process_one_work+0x1f0/0x470 worker_thread+0x26c/0x4c8 kthread+0x13c/0x320 ret_from_fork+0x10/0x18 Fix this by registering ufs device wlun as a scsi driver and registering it for block runtime-pm. Also make this as a supplier for all other luns. That way, this device wlun suspends after all the consumers and resumes after hba resumes. Co-developed-by: Can Guo Signed-off-by: Can Guo Signed-off-by: Asutosh Das I have some more comments that may help straighten things out. Also please look at ufs_debugfs_get_user_access() and ufs_debugfs_put_user_access() that now need to scsi_autopm_get/put_device sdev_ufs_device. It would also be good if you could re-base on linux-next. Hi Adrian Thanks for the comments. I agree moving the code to wlun probe and other changes. 
But it looks to me that it may not fully solve the issue. Please let me explain my understanding on this: (Please refer to the logs in v10) scsi_autopm_*() are invoked on a sdev. pm_runtime_get_suppliers()/rpm_put_suppliers() are on the supplier device. For the device wlun: slave_configure(): - doesn't set the rpm_autosuspend - pm_runtime_getnoresume() scsi_sysfs_add_sdev(): - pm_runtime_forbid() - scsi_autopm_get_device() - device_add() - ufshcd_wl_probe() - scsi_autopm_put_device() For all other scsi devices: slave_alloc(): - ufshcd_setup_links() Say all link_add: pm_runtime_put(>sdev_ufs_device->sdev_gendev); With DL_FLAG_RPM_ACTIVE, links will 'get' not 'put' I'm referring to the pm_runtime_put(sdev_ufs_device) after all the links are setup, that you suggested to add. Ok slave_configure(): - set rpm_autosuspend scsi_sysfs_add_sdev(): - scsi_autopm_get_device() - device_add() -> schedules an async probe() - scsi_autopm_put_device() - (1) Now the rpm_put_suppliers() can be invoked *after* pm_runtime_get_suppliers() of the async probe(), since both are running in different contexts. Only if the sd device suspends. Correct. What'd stop the sd device from suspending? We should be stopping the sd device from suspending here - imho. Hi Adrian, Thanks for the comments. You mean for performance reasons. That is something we can look at, but let's get it working first. Not for performance reasons. I meant to say that this issue can be fixed if we stop the sd devices from suspending until the sd_probe() is completed. In that case, the usage_count of supplier would be decremented until rpm_active of this link becomes 1. Right, because the sd device suspended. Now the pm_runtime_get_suppliers() expects the link_active to be more than 1. Not sure what you mean here. pm_runtime_*put*_suppliers() won't do anything if the link count is 1. 
I'm referring to the logs that I pasted before: [ 6.941267][ T7] scsi 0:0:0:4: rpm_put_suppliers: [BEF] Supp (0:0:0:49488) usage_count: 4 rpm_active: 3 -- T196 Context comes in while T7 is running -- [ 6.941466][ T196] scsi 0:0:0:4: pm_runtime_get_suppliers: (0:0:0:49488): supp: usage_count: 5 rpm_active: 4 -- [ 7.788397][ T7] scsi 0:0:0:4: rpm_put_suppliers: [AFT] Supp (0:0:0:49488) usage_count: 2 rpm_active: 1 I meant to say that, if the rpm_put_suppliers() is invoked after the pm_runtime_get_suppliers() as is seen above then the link_active may become 1 even *after* pm_runtime_get_suppliers() is invoked. I'm referring to the pm_runtime_get_suppliers() invoked from: driver_probe_device() - say for, sd 0:0:0:x |- pm_runtime_get_suppliers() - for sd
Re: [PATCH] userfaultfd/shmem: fix minor fault page leak
On Wed, Mar 24, 2021 at 5:52 PM Peter Xu wrote: > > Hi, Andrew, > > On Wed, Mar 24, 2021 at 04:20:27PM -0700, Andrew Morton wrote: > > On Mon, 22 Mar 2021 13:48:35 -0700 Axel Rasmussen > > wrote: > > > > > This fix is analogous to Peter Xu's fix for hugetlb [0]. If we don't > > > put_page() after getting the page out of the page cache, we leak the > > > reference. > > > > > > The fix can be verified by checking /proc/meminfo and running the > > > userfaultfd selftest in shmem mode. Without the fix, we see MemFree / > > > MemAvailable steadily decreasing with each run of the test. With the > > > fix, memory is correctly freed after the test program exits. > > > > > > Fixes: 00da60b9d0a0 ("userfaultfd: support minor fault handling for > > > shmem") > > > > Confused. The affected code: > > > > > --- a/mm/shmem.c > > > +++ b/mm/shmem.c > > > @@ -1831,6 +1831,7 @@ static int shmem_getpage_gfp(struct inode *inode, > > > pgoff_t index, > > > > > > if (page && vma && userfaultfd_minor(vma)) { > > > unlock_page(page); > > > + put_page(page); > > > *fault_type = handle_userfault(vmf, VM_UFFD_MINOR); > > > return 0; > > > } > > > > Is added by Peter's "page && vma && userfaultfd_minor". I assume that > > "Fixes:" is incorrect? > > > > It seems to me the commit is correct as pointed to in "Fixes", but I do have a > different commit ID here: > > commit 63c826b1372c4930f89b8a55092699fa7f0d6f4e > Author: Axel Rasmussen > Date: Thu Mar 18 10:20:43 2021 -0400 > > userfaultfd: support minor fault handling for shmem > > Axel, did you fetched the commit ID from your local tree, perhaps? Since I > should have fetched from hnaz/linux-mm and I can see Andrew's sign-off too. 
>
> Thanks,
>
> --
> Peter Xu
>

Ah, this is the SHA I see when I "git log --grep linux-next/akpm" (where
my repo's linux-next remote is [1]):

commit 00da60b9d0a03818c36a2fe862578309c27006ad
Author: Axel Rasmussen
Date:   Thu Mar 18 17:01:51 2021 +1100

    userfaultfd: support minor fault handling for shmem

This is the commit that this new patch fixes. I'll admit I'm a bit unsure
which tree the "Fixes:" tag is meant to refer to before the commits make
it into Linus' tree. If I should look up the commit another way, just let
me know. :)

And, sorry for the confusion.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Re: [PATCH net v2] net: sched: fix packet stuck problem for lockless qdisc
On 2021/3/25 3:20, Cong Wang wrote:
> On Tue, Mar 23, 2021 at 7:24 PM Yunsheng Lin wrote:
>> @@ -176,8 +207,23 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
>>  static inline void qdisc_run_end(struct Qdisc *qdisc)
>>  {
>>         write_seqcount_end(>running);
>> -       if (qdisc->flags & TCQ_F_NOLOCK)
>> +       if (qdisc->flags & TCQ_F_NOLOCK) {
>>                 spin_unlock(>seqlock);
>> +
>> +               /* qdisc_run_end() is protected by RCU lock, and
>> +                * qdisc reset will do a synchronize_net() after
>> +                * setting __QDISC_STATE_DEACTIVATED, so testing
>> +                * the below two bits separately should be fine.
>
> Hmm, why synchronize_net() after setting this bit is fine? It could
> still be flipped right after you test RESCHEDULE bit.

That depends on when it will be flipped again.

As I see:
1. __QDISC_STATE_DEACTIVATED is set during the dev_deactivate() process,
   which should also wait for all processes related to
   "test_bit( __QDISC_STATE_NEED_RESCHEDULE, >state)" to finish
   by calling synchronize_net() and checking some_qdisc_is_busy().

2. It is cleared during the dev_activate() process.

And dev_deactivate() and dev_activate() are protected by the RTNL lock, or
serialized by linkwatch.

>
>
>> +                * For qdisc_run() in net_tx_action() case, we
>> +                * really should provide rcu protection explicitly
>> +                * for document purposes or PREEMPT_RCU.
>> +                */
>> +               if (unlikely(test_bit(__QDISC_STATE_NEED_RESCHEDULE,
>> +                                     >state) &&
>> +                            !test_bit(__QDISC_STATE_DEACTIVATED,
>> +                                      >state)))
>
> Why do you want to test __QDISC_STATE_DEACTIVATED bit at all?
> dev_deactivate_many() will wait for those scheduled but being
> deactivated, so what's the problem of scheduling it even with this bit?

The problem I tried to fix is:

  CPU0(calling dev_deactivate)   CPU1(calling qdisc_run_end)   CPU2(calling tx_action)
             .                       __netif_schedule()                   .
             .                     set __QDISC_STATE_SCHED                .
             .                                .                           .
clear __QDISC_STATE_DEACTIVATED               .                           .
     synchronize_net()                        .                           .
             .                                .                           .
             .                                .              clear __QDISC_STATE_SCHED
             .                                .                           .
 some_qdisc_is_busy() return false            .                           .
             .                                .                           .
             .                                .                      qdisc_run()

some_qdisc_is_busy() checks if the qdisc is busy by checking
__QDISC_STATE_SCHED and spin_is_locked(>seqlock) for lockless qdisc.
some_qdisc_is_busy() returns false for CPU0 because CPU2 has cleared
__QDISC_STATE_SCHED and has not taken the qdisc->seqlock yet, but the
qdisc is clearly still busy when qdisc_run() is run by CPU2 later.

So you are right, testing __QDISC_STATE_DEACTIVATED does not completely
solve the above data race, and there are __netif_schedule() calls in
dev_requeue_skb() and __qdisc_run() too, which need the same fix.

So I will remove the __QDISC_STATE_DEACTIVATED testing from this patch
first, and deal with it later.

>
> Thanks.
>
> .
>
Re: [PATCH][next] scsi: aacraid: Replace one-element array with flexible-array member
Hi Martin,

On 3/24/21 20:18, Martin K. Petersen wrote:
>
> Hi Gustavo!
>
> Your changes and the original code do not appear to be functionally
> equivalent.
>
>> @@ -1235,8 +1235,8 @@ static int aac_read_raw_io(struct fib * fib, struct scsi_cmnd * cmd, u64 lba, u3
>>  	if (ret < 0)
>>  		return ret;
>>  	command = ContainerRawIo2;
>> -	fibsize = sizeof(struct aac_raw_io2) +
>> -		((le32_to_cpu(readcmd2->sgeCnt)-1) * sizeof(struct sge_ieee1212));
>> +	fibsize = struct_size(readcmd2, sge,
>> +			      le32_to_cpu(readcmd2->sgeCnt));
>
> The old code allocated sgeCnt-1 elements (whether that was a mistake or
> not I do not know) whereas the new code would send a larger fib to the
> ASIC. I don't have any aacraid adapters and I am hesitant to merging
> changes that have not been validated on real hardware.

Precisely this sort of confusion is one of the things we want to avoid
by using flexible-array members instead of one-element arrays.

fibsize is actually the same for both the old and the new code. The
difference is that in the original code, the one-element array _sge_ at
the bottom of struct aac_raw_io2 contributes to the size of the
structure, as it occupies at least as much space as a single object of
its type. On the other hand, flexible-array members don't contribute to
the size of the enclosing structure. See below...

Old code:

$ pahole -C aac_raw_io2 drivers/scsi/aacraid/aachba.o
struct aac_raw_io2 {
	__le32                     blockLow;             /*     0     4 */
	__le32                     blockHigh;            /*     4     4 */
	__le32                     byteCount;            /*     8     4 */
	__le16                     cid;                  /*    12     2 */
	__le16                     flags;                /*    14     2 */
	__le32                     sgeFirstSize;         /*    16     4 */
	__le32                     sgeNominalSize;       /*    20     4 */
	u8                         sgeCnt;               /*    24     1 */
	u8                         bpTotal;              /*    25     1 */
	u8                         bpComplete;           /*    26     1 */
	u8                         sgeFirstIndex;        /*    27     1 */
	u8                         unused[4];            /*    28     4 */
	struct sge_ieee1212        sge[1];               /*    32    16 */

	/* size: 48, cachelines: 1, members: 13 */
	/* last cacheline: 48 bytes */
};

New code:

$ pahole -C aac_raw_io2 drivers/scsi/aacraid/aachba.o
struct aac_raw_io2 {
	__le32                     blockLow;             /*     0     4 */
	__le32                     blockHigh;            /*     4     4 */
	__le32                     byteCount;            /*     8     4 */
	__le16                     cid;                  /*    12     2 */
	__le16                     flags;                /*    14     2 */
	__le32                     sgeFirstSize;         /*    16     4 */
	__le32                     sgeNominalSize;       /*    20     4 */
	u8                         sgeCnt;               /*    24     1 */
	u8                         bpTotal;              /*    25     1 */
	u8                         bpComplete;           /*    26     1 */
	u8                         sgeFirstIndex;        /*    27     1 */
	u8                         unused[4];            /*    28     4 */
	struct sge_ieee1212        sge[];                /*    32     0 */

	/* size: 32, cachelines: 1, members: 13 */
	/* last cacheline: 32 bytes */
};

So, the old code allocates sgeCnt-1 elements because sizeof(struct
aac_raw_io2) is already counting one element of the _sge_ array.

Please, let me know if this is clear now.

Thanks!
--
Gustavo
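The size arithmetic in the reply above can be checked outside the kernel. What follows is only a minimal userspace sketch, not driver code: struct demo_sge, struct demo_io_old, struct demo_io_new and the two fibsize helpers are hypothetical stand-ins for struct sge_ieee1212, struct aac_raw_io2 and the fibsize computation, assuming the 32-byte header and 16-byte element shown in the pahole output. The kernel's struct_size() also guards against arithmetic overflow, which this sketch leaves out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte element, standing in for struct sge_ieee1212. */
struct demo_sge {
	uint32_t addr_low;
	uint32_t addr_high;
	uint32_t length;
	uint32_t flags;
};

/* Old layout: the trailing one-element array is counted by sizeof(). */
struct demo_io_old {
	uint8_t header[32];
	struct demo_sge sge[1];
};

/* New layout: a flexible-array member adds nothing to sizeof(). */
struct demo_io_new {
	uint8_t header[32];
	struct demo_sge sge[];
};

/* Old formula: one element is already inside sizeof(), so add cnt - 1. */
static size_t fibsize_old(size_t sge_cnt)
{
	return sizeof(struct demo_io_old) +
	       (sge_cnt - 1) * sizeof(struct demo_sge);
}

/* New formula: what struct_size() computes, minus its overflow checks. */
static size_t fibsize_new(size_t sge_cnt)
{
	return sizeof(struct demo_io_new) + sge_cnt * sizeof(struct demo_sge);
}
```

With these stand-ins, fibsize_old(4) and fibsize_new(4) both come to 96 bytes: 48 + 3 * 16 in the old layout and 32 + 4 * 16 in the new one.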
Re: [PATCH] btrfs: fix a potential hole-punching failure
I am sending this mail from Gmail in order to reply in plain text.

Filipe Manana wrote on Wed, Mar 24, 2021 at 8:16 PM:
>
> On Wed, Mar 24, 2021 at 11:15 AM bingjingc wrote:
> >
> > From: BingJing Chang
> >
> > In commit d77815461f04 ("btrfs: Avoid trucating page or punching hole in
> > a already existed hole."), existed holes can be skipped by calling
> > find_first_non_hole() to adjust *start and *len. However, if the given
> > len is invalid and large, when an EXTENT_MAP_HOLE extent is found, the
> > *len will not be set to zero because (em->start + em->len) is less than
> > (*start + *len). Then the ret will be 1 but the *len will not be set to
> > 0. The propagated non-zero ret will result in fallocate failure.
> >
> > In the while-loop of btrfs_replace_file_extents(), len is not updated
> > every time before it calls find_first_non_hole(). That is, if the last
> > file extent in the given hole-punching range has been dropped but
> > btrfs_drop_extents() fails with -ENOSPC (btrfs_drop_extents() runs out
> > of reserved space of the given transaction), the problem can happen.
>
> This is not entirely clear. Dropping the last extent and still
> returning ENOSPC is confusing.
> I think you mean that it drops the last file extent item that does not
> represent hole (disk_bytenr > 0), and after it there's only one file
> extent item representing a hole (disk_bytenr == 0).
> It fails with -ENOSPC when attempting to drop the file extent item
> representing the hole, after successfully dropping the non-hole file
> extent item.
> Is that it?
>

Thank you for your comments. You're right. Saying "the last file extent"
is incorrect and confusing. I revised the commit message and sent a v2
patch to fix it. Thank you.

> > After it calls find_first_non_hole(), the cur_offset will be adjusted
> > to be larger than or equal to end. However, since the len is not set to
> > zero. The break-loop condition (ret && !len) will not meet. After it
> > leaves the while-loop, uncleared ret will result in fallocate failure.
>
> Ok, fallocate will return 1, an unexpected return value.
>
> >
> > We're not able to construct a reproducible way to let
> > btrfs_drop_extents() fails with -ENOSPC after it drops the last file
> > extent but with remaining holes. However, it's quite easy to fix. We
> > just need to update and check the len every time before we call
> > find_first_non_hole(). To make the while loop more readable, we also
> > pull the variable updates to the bottom of loop like this:
> > while (cur_offset < end) {
> >         ...
> >         // update cur_offset & len
> >         // advance cur_offset & len in hole-punching case if needed
> > }
> >
> > Reported-by: Robbie Ko
> > Fixes: d77815461f04 ("btrfs: Avoid trucating page or punching hole in a
> > already existed hole.")
> > Reviewed-by: Robbie Ko
> > Reviewed-by: Chung-Chiang Cheng
> > Signed-off-by: BingJing Chang
>
> Looks good.
> Please just update that paragraph to be more clear about what is going on.
>
> Thanks.
>
> > ---
> >  fs/btrfs/file.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > index 0e155f0..dccb017 100644
> > --- a/fs/btrfs/file.c
> > +++ b/fs/btrfs/file.c
> > @@ -2735,8 +2735,6 @@ int btrfs_replace_file_extents(struct inode *inode, struct btrfs_path *path,
> >  			extent_info->file_offset += replace_len;
> >  		}
> >
> > -		cur_offset = drop_args.drop_end;
> > -
> >  		ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
> >  		if (ret)
> >  			break;
> > @@ -2756,7 +2754,9 @@ int btrfs_replace_file_extents(struct inode *inode, struct btrfs_path *path,
> >  		BUG_ON(ret);	/* shouldn't happen */
> >  		trans->block_rsv = rsv;
> >
> > -		if (!extent_info) {
> > +		cur_offset = drop_args.drop_end;
> > +		len = end - cur_offset;
> > +		if (!extent_info && len) {
> >  			ret = find_first_non_hole(BTRFS_I(inode),
> >  						  &cur_offset, &len);
> >  			if (unlikely(ret < 0))
> > --
> > 2.7.4
> >
>
>
> --
> Filipe David Manana,
>
> “Whether you think you can, or you think you can't - you're right.”

Thanks,
BingJing Chang
Re: [PATCH] [v3] drm/imx: imx-ldb: fix out of bounds array access warning
On Wed, 2021-03-24 at 17:47 +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann
>
> When CONFIG_OF is disabled, building with 'make W=1' produces warnings
> about out of bounds array access:
>
> drivers/gpu/drm/imx/imx-ldb.c: In function 'imx_ldb_set_clock.constprop':
> drivers/gpu/drm/imx/imx-ldb.c:186:8: error: array subscript -22 is below
> array bounds of 'struct clk *[4]' [-Werror=array-bounds]
>
> Add an error check before the index is used, which helps with the
> warning, as well as any possible other error condition that may be
> triggered at runtime.
>
> The warning could be fixed by adding a Kconfig dependency on CONFIG_OF,
> but Liu Ying points out that the driver may hit the out-of-bounds
> problem at runtime anyway.

It is almost impossible to hit the out-of-bounds problem at runtime,
unless something goes wrong and unexpected parameters (node and/or
encoder) are handed over to drm_of_encoder_active_port_id().

Anyway, an error check on the return value of
drm_of_encoder_active_port_id() looks ok to me.

>
> Signed-off-by: Arnd Bergmann

Reviewed-by: Liu Ying

Thanks,
Liu Ying

> ---
> v3: fix build regression from v2
> v2: fix subject line
>     expand patch description
>     print mux number
>     check upper bound as well
> ---
>  drivers/gpu/drm/imx/imx-ldb.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/drivers/gpu/drm/imx/imx-ldb.c b/drivers/gpu/drm/imx/imx-ldb.c
> index dbfe39e2f7f6..565482e2b816 100644
> --- a/drivers/gpu/drm/imx/imx-ldb.c
> +++ b/drivers/gpu/drm/imx/imx-ldb.c
> @@ -197,6 +197,11 @@ static void imx_ldb_encoder_enable(struct drm_encoder *encoder)
>  	int dual = ldb->ldb_ctrl & LDB_SPLIT_MODE_EN;
>  	int mux = drm_of_encoder_active_port_id(imx_ldb_ch->child, encoder);
>
> +	if (mux < 0 || mux >= ARRAY_SIZE(ldb->clk_sel)) {
> +		dev_warn(ldb->dev, "%s: invalid mux %d\n", __func__, mux);
> +		return;
> +	}
> +
>  	drm_panel_prepare(imx_ldb_ch->panel);
>
>  	if (dual) {
> @@ -255,6 +260,11 @@ imx_ldb_encoder_atomic_mode_set(struct drm_encoder *encoder,
>  	int mux = drm_of_encoder_active_port_id(imx_ldb_ch->child, encoder);
>  	u32 bus_format = imx_ldb_ch->bus_format;
>
> +	if (mux < 0 || mux >= ARRAY_SIZE(ldb->clk_sel)) {
> +		dev_warn(ldb->dev, "%s: invalid mux %d\n", __func__, mux);
> +		return;
> +	}
> +
>  	if (mode->clock > 17) {
>  		dev_warn(ldb->dev,
>  			 "%s: mode exceeds 170 MHz pixel clock\n", __func__);
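The check added by this patch follows a common pattern: a lookup that may return a negative error code is validated before its result is used as an array subscript. Below is a small userspace sketch of that pattern, with assumptions stated up front: demo_clk_sel, demo_mux_is_valid() and demo_select_clock() are hypothetical names for illustration, the four-entry table only mirrors the 'struct clk *[4]' from the warning text, and ARRAY_SIZE is redefined locally because the kernel macro is not available in userspace.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local equivalent of the kernel's ARRAY_SIZE() for this sketch. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical four-entry table, mirroring 'struct clk *[4]'. */
static const char *const demo_clk_sel[4] = {
	"sel0", "sel1", "sel2", "sel3",
};

/*
 * Reject negative values (such as an errno-style -22) first, then
 * compare against the table size. Testing mux < 0 before the cast
 * keeps a negative int from turning into a huge unsigned value.
 */
static bool demo_mux_is_valid(int mux)
{
	return mux >= 0 && (size_t)mux < ARRAY_SIZE(demo_clk_sel);
}

/* Return the selected entry, or NULL when the index is out of range. */
static const char *demo_select_clock(int mux)
{
	if (!demo_mux_is_valid(mux))
		return NULL;

	return demo_clk_sel[mux];
}
```

Here demo_select_clock(-22) and demo_select_clock(4) both return NULL instead of reading outside the table, which is the same early-return behaviour the patch gives the two encoder callbacks.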
[PATCH] include: linux: fs: Remove duplicate declaration
struct iov_iter is already declared at line 66. Remove the duplicate
declaration.

Signed-off-by: Wan Jiabing
---
 include/linux/fs.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index ec8f3ddf4a6a..7f3cbd47670a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1883,7 +1883,6 @@ struct dir_context {
  */
 #define REMAP_FILE_ADVISORY		(REMAP_FILE_CAN_SHORTEN)

-struct iov_iter;

 struct file_operations {
 	struct module *owner;
--
2.25.1
Re: [PATCH] Revert "f2fs: give a warning only for readonly partition"
On 2021/3/25 6:44, Jaegeuk Kim wrote:
On 03/24, Chao Yu wrote:
On 2021/3/24 12:22, Jaegeuk Kim wrote:
On 03/24, Chao Yu wrote:
On 2021/3/24 2:39, Jaegeuk Kim wrote:
On 03/23, Chao Yu wrote:

This reverts commit 938a184265d75ea474f1c6fe1da96a5196163789.

Because that commit fails generic/050 testcase which expect failure
during mount a recoverable readonly partition.

I think we need to change generic/050, since f2fs can recover this
partition,

Well, not sure we can change that testcase, since it restricts all
generic filesystems behavior. At least, ext4's behavior makes sense to
me:

	journal_dev_ro = bdev_read_only(journal->j_dev);
	really_read_only = bdev_read_only(sb->s_bdev) | journal_dev_ro;

	if (journal_dev_ro && !sb_rdonly(sb)) {
		ext4_msg(sb, KERN_ERR,
			 "journal device read-only, try mounting with '-o ro'");
		err = -EROFS;
		goto err_out;
	}

	if (ext4_has_feature_journal_needs_recovery(sb)) {
		if (sb_rdonly(sb)) {
			ext4_msg(sb, KERN_INFO, "INFO: recovery "
					"required on readonly filesystem");
			if (really_read_only) {
				ext4_msg(sb, KERN_ERR, "write access "
					"unavailable, cannot proceed "
					"(try mounting with noload)");
				err = -EROFS;
				goto err_out;
			}
			ext4_msg(sb, KERN_INFO, "write access will "
			       "be enabled during recovery");
		}
	}

even though using it as readonly. And, valid checkpoint can allow for
user to read all the data without problem.

	if (f2fs_hw_is_readonly(sbi)) {

Since device is readonly now, all write to the device will fail,
checkpoint can not persist recovered data, after page cache is expired,
user can see stale data.

My point is, after mount with ro, there'll be no data write which
preserves the current status. So, in the next time, we can recover
fsync'ed data later, if user succeeds to mount as rw. Another point is,
with the current checkpoint, we should not have any corrupted metadata.
So, why not giving a chance to show what data remained to user? I think
this can be doable only with CoW filesystems.

I guess we're talking about the different things...

Let me declare two different readonly status:

1. filesystem readonly: file system is mount with ro mount option, and
   app from userspace can not modify any thing of filesystem, but
   filesystem itself can modify data on device since device may be
   writable.

2. device readonly: device is set to readonly status via
   'blockdev --setro' command, and then filesystem should never issue
   any write IO to the device.

So, what I mean is, *when device is readonly*, rather than f2fs
mountpoint is readonly (f2fs_hw_is_readonly() returns true as below
code, instead of f2fs_readonly() returns true), in this condition, we
should not issue any write IO to device anyway, because, AFAIK, write IO
will fail due to bio_check_ro() check.

In that case, mount(2) will try readonly, no?

Yes, if device is readonly, mount(2) can not mount/remount device to rw
mountpoint.

Thanks,

# blockdev --setro /dev/vdb
# mount -t f2fs /dev/vdb /mnt/test/
mount: /mnt/test: WARNING: source write-protected, mounted read-only.

 	if (f2fs_hw_is_readonly(sbi)) {
-		if (!is_set_ckpt_flags(sbi, CP_UMOUNT_FLAG)) {
-			err = -EROFS;
+		if (!is_set_ckpt_flags(sbi, CP_UMOUNT_FLAG))
 			f2fs_err(sbi, "Need to recover fsync data, but write access unavailable");
-			goto free_meta;
-		}
-		f2fs_info(sbi, "write access unavailable, skipping recovery");
+		else
+			f2fs_info(sbi, "write access unavailable, skipping recovery");
 		goto reset_checkpoint;
 	}

For the case of filesystem is readonly and device is writable, it's fine
to do recovery in order to let user to see fsynced data.

Thanks,

Am I missing something?

Thanks,

Fixes: 938a184265d7 ("f2fs: give a warning only for readonly partition")
Signed-off-by: Chao Yu
---
 fs/f2fs/super.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index b48281642e98..2b78ee11f093 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -3952,10 +3952,12 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 	 * previous checkpoint was not done by clean system shutdown.
 	 */
 	if (f2fs_hw_is_readonly(sbi)) {
-
[PATCH v2] btrfs: fix a potential hole-punching failure
From: BingJing Chang

In commit d77815461f04 ("btrfs: Avoid trucating page or punching hole in
a already existed hole."), existing holes can be skipped by calling
find_first_non_hole() to adjust *start and *len. However, if the given
len is invalid and large, when an EXTENT_MAP_HOLE extent is found, *len
will not be set to zero because (em->start + em->len) is less than
(*start + *len). Then ret will be 1 but *len will not be set to 0. The
propagated non-zero ret will result in fallocate failure.

In the while-loop of btrfs_replace_file_extents(), len is not updated
every time before it calls find_first_non_hole(). That is, after
btrfs_drop_extents() successfully drops the last non-hole file extent,
it may fail with -ENOSPC when attempting to drop a file extent item
representing a hole. In that case the problem can happen: after
find_first_non_hole() is called, cur_offset is adjusted to be larger
than or equal to end, but since len is not set to zero, the loop's break
condition (ret && !len) is not met. After leaving the while-loop,
fallocate will return 1, which is an unexpected return value.

We're not able to construct a reproducible way to let
btrfs_drop_extents() fail with -ENOSPC after it drops the last non-hole
file extent but with remaining holes left. However, it's quite easy to
fix. We just need to update and check len every time before we call
find_first_non_hole(). To make the while loop more readable, we also
pull the variable updates to the bottom of the loop like this:

while (cur_offset < end) {
	...
	// update cur_offset & len
	// advance cur_offset & len in hole-punching case if needed
}

Reported-by: Robbie Ko
Fixes: d77815461f04 ("btrfs: Avoid trucating page or punching hole in a already existed hole.")
Reviewed-by: Robbie Ko
Reviewed-by: Chung-Chiang Cheng
Signed-off-by: BingJing Chang
---
 fs/btrfs/file.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 0e155f0..dccb017 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2735,8 +2735,6 @@ int btrfs_replace_file_extents(struct inode *inode, struct btrfs_path *path,
 			extent_info->file_offset += replace_len;
 		}

-		cur_offset = drop_args.drop_end;
-
 		ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
 		if (ret)
 			break;
@@ -2756,7 +2754,9 @@ int btrfs_replace_file_extents(struct inode *inode, struct btrfs_path *path,
 		BUG_ON(ret);	/* shouldn't happen */
 		trans->block_rsv = rsv;

-		if (!extent_info) {
+		cur_offset = drop_args.drop_end;
+		len = end - cur_offset;
+		if (!extent_info && len) {
 			ret = find_first_non_hole(BTRFS_I(inode),
 						  &cur_offset, &len);
 			if (unlikely(ret < 0))
--
2.7.4
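The effect of a stale len can be illustrated with a toy model. This is a sketch based only on the behaviour described in the commit message, not the btrfs implementation: struct toy_hole and toy_skip_hole() are hypothetical, a single hole stands in for the extent map lookup, and the offsets are made-up values.

```c
#include <stdint.h>

/* Hypothetical stand-in for an EXTENT_MAP_HOLE extent map. */
struct toy_hole {
	uint64_t start;
	uint64_t len;
};

/*
 * Toy model of the adjustment described for find_first_non_hole():
 * if *start lies inside the hole, move *start to the end of the hole
 * and shrink *len to whatever part of the range is left after it.
 * Returns 1 when a hole was skipped, 0 otherwise.
 */
static int toy_skip_hole(const struct toy_hole *em,
			 uint64_t *start, uint64_t *len)
{
	uint64_t hole_end = em->start + em->len;
	uint64_t range_end = *start + *len;

	if (*start < em->start || *start >= hole_end)
		return 0;

	/* len only becomes 0 if the hole reaches the end of the range. */
	*len = hole_end > range_end ? 0 : range_end - hole_end;
	*start = hole_end;
	return 1;
}
```

Assume a punch range ending at offset 8192 whose tail [4096, 8192) is a hole, and cur_offset already advanced to 4096. With the stale len of 8192 the call returns 1 and leaves len at 4096, so a break condition of the form (ret && !len) is not met. With len recomputed as end - cur_offset = 4096, the same call returns 1 with len equal to 0, and the loop can terminate cleanly.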