date:20210328

Re: [PATCH v1] Input: elants_i2c - drop zero-checking of ABS_MT_TOUCH_MAJOR resolution

2021-03-28 Thread Dmitry Torokhov

On Mon, Mar 29, 2021 at 02:55:07AM +0300, Dmitry Osipenko wrote:
> Drop unnecessary zero-checking of ABS_MT_TOUCH_MAJOR resolution since
> there is no difference between setting resolution to 0 vs not setting
> it at all. This change makes code cleaner a tad.
> 
> Suggested-by: Dmitry Torokhov 
> Signed-off-by: Dmitry Osipenko 

Applied, thank you.

-- 
Dmitry

Re: [PATCH] bpf: remove redundant assignment of variable id

2021-03-28 Thread Dan Carpenter

On Fri, Mar 26, 2021 at 01:18:36PM -0700, Song Liu wrote:
> On Fri, Mar 26, 2021 at 12:45 PM Colin King  wrote:
> >
> > From: Colin Ian King 
> >
> > The variable id is being assigned a value that is never
> > read, the assignment is redundant and can be removed.
> >
> > Addresses-Coverity: ("Unused value")
> > Signed-off-by: Colin Ian King 
> 
> Acked-by: Song Liu 
> 
> For future patches, please prefix it as [PATCH bpf-next] or
> [PATCH bpf], based on which tree the patch should apply to.
> 

You can keep asking us to do that but it's never going to happen... :P
I do this for networking but it's a massive pain in the butt and I get
it wrong 20% of the time.

regards,
dan carpenter

Re: [PATCH] i2c: add i2c bus driver for amd navi gpu

2021-03-28 Thread Goswami, Sanket

Hi Andy,

Thanks for the review.

On 26-Mar-21 16:10, Andy Shevchenko wrote:
> On Fri, Mar 26, 2021 at 03:53:34PM +0530, Goswami, Sanket wrote:
>> On 25-Mar-21 22:35, Andy Shevchenko wrote:
>>> On Mon, Mar 22, 2021 at 10:26:55PM +0530, Goswami, Sanket wrote:
>>>> On 09-Mar-21 19:56, Andy Shevchenko wrote:
> On Tue, Mar 09, 2021 at 07:01:47PM +0530, Sanket Goswami wrote:
> 
> ...
> 
>>> And I think I already have told you that I prefer to see rather MODEL_
>>> quirk.
>>
>> I did not find MODEL_ quirk reference in the PCI device tree, It is actually
>> used in platform device tree which is completely different from our PCI
>> based configuration, can you please provide some reference of MODEL_ quirk
>> which will be part of the PCI device tree?
> 
> I meant the name of new definition for quirk.

Can you please elaborate on this? I am not able to follow what you mean.

> ...
> 
> Also why (1) and this can't be instantiated from ACPI / DT?
>>>> It is in line with the existing PCIe-based DesignWare driver,
>>>> A similar approach is used by the various vendors.
>>>
>>> Here is no answer to the question. What prevents you to fix your ACPI
>>> tables or DT?
>>>
>>> We already got rid of FIFO hard coded values, timings are harder to achieve,
>>> but we expect that new firmwares will provide values in the ACPI tables.
>>
>> AMD NAVI GPU card is the PCI initiated driver, not ACPI initiated,
> 
> Which doesn't prevent to have an ACPI companion (via description in the
> tables).
> 
>> and also
>> It does not contain a corresponding ACPI match table.
> 
> Yes, that's what should be done in the firmware.
> At least for the new version of firmware consider to add proper data into the
> tables.
> 
>> Moreover, AMD  NAVI GPU
>> based products are already in the commercial market hence going by this
>> approach will break the functionalities for the same.
> 
> This is quite bad and unfortunate. So, you have to elaborate this in the
> commit message.

Sure, will take care of this as part of commit message of v3.
> 
> --
> With Best Regards,
> Andy Shevchenko
> 
> 
Thanks,
Sanket

linux-next: build failure after merge of the staging tree

2021-03-28 Thread Stephen Rothwell

Hi all,

After merging the staging tree, today's linux-next build (x86_64
allmodconfig) failed like this:

drivers/iio/adc/ti-ads131e08.c: In function 'ads131e08_read_reg':
drivers/iio/adc/ti-ads131e08.c:180:5: error: 'struct spi_transfer' has no member named 'delay_usecs'
  180 |.delay_usecs = st->sdecode_delay_us,
  | ^~~
drivers/iio/adc/ti-ads131e08.c: In function 'ads131e08_write_reg':
drivers/iio/adc/ti-ads131e08.c:206:5: error: 'struct spi_transfer' has no member named 'delay_usecs'
  206 |.delay_usecs = st->sdecode_delay_us,
  | ^~~

Caused by commit

  d935eddd2799 ("iio: adc: Add driver for Texas Instruments ADS131E0x ADC family")

interacting with commit

  3ab1cce55337 ("spi: core: remove 'delay_usecs' field from spi_transfer")

from the spi tree.

I have applied the following merge fix patch.

From: Stephen Rothwell 
Date: Mon, 29 Mar 2021 16:51:22 +1100
Subject: [PATCH] iio: adc: merge fix for "spi: core: remove 'delay_usecs'
 field from spi_transfer"

Signed-off-by: Stephen Rothwell 
---
 drivers/iio/adc/ti-ads131e08.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/iio/adc/ti-ads131e08.c b/drivers/iio/adc/ti-ads131e08.c
index 0060d5f0abb0..764dab087b41 100644
--- a/drivers/iio/adc/ti-ads131e08.c
+++ b/drivers/iio/adc/ti-ads131e08.c
@@ -177,7 +177,10 @@ static int ads131e08_read_reg(struct ads131e08_state *st, u8 reg)
{
.tx_buf = &st->tx_buf,
.len = 2,
-   .delay_usecs = st->sdecode_delay_us,
+   .delay = {
+   .value = st->sdecode_delay_us,
+   .unit = SPI_DELAY_UNIT_USECS,
+   },
}, {
.rx_buf = &st->rx_buf,
.len = 1,
@@ -203,7 +206,10 @@ static int ads131e08_write_reg(struct ads131e08_state *st, u8 reg, u8 value)
{
.tx_buf = &st->tx_buf,
.len = 3,
-   .delay_usecs = st->sdecode_delay_us,
+   .delay = {
+   .value = st->sdecode_delay_us,
+   .unit = SPI_DELAY_UNIT_USECS,
+   },
}
};
 
-- 
2.30.0

-- 
Cheers,
Stephen Rothwell
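
For readers following the merge fix above, here is a minimal user-space
sketch of the conversion it performs. It assumes simplified stand-ins for
struct spi_transfer and struct spi_delay (these are not the kernel
definitions from <linux/spi/spi.h>), and make_tx_transfer() is a
hypothetical helper used only for illustration.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel types; not the real SPI header. */
#define SPI_DELAY_UNIT_USECS 0

struct spi_delay {
	uint16_t value;
	uint8_t unit;
};

struct spi_transfer {
	const void *tx_buf;
	void *rx_buf;
	unsigned int len;
	struct spi_delay delay;	/* takes over from the removed delay_usecs field */
};

/* Hypothetical helper: a TX transfer followed by a delay given in microseconds. */
static struct spi_transfer make_tx_transfer(const void *buf, unsigned int len,
					    uint16_t delay_us)
{
	struct spi_transfer t = {
		.tx_buf = buf,
		.len = len,
		.delay = {
			.value = delay_us,
			.unit = SPI_DELAY_UNIT_USECS,
		},
	};

	return t;
}
```

The point of the sketch is only that the delay now carries an explicit
unit next to its value, instead of a bare microsecond count.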


Re: [PATCH 00/30] DMA: Mundane typo fixes

2021-03-28 Thread Bhaskar Chowdhury


On 07:29 Mon 29 Mar 2021, Christoph Hellwig wrote:

> I really don't think these typo patchbombs are that useful.  I'm all
> for fixing typos when working with a subsystem, but I'm not sure these
> patchbombs help anything.


I am sure you are holding the wrong end of the wand and grossly failing to
understand.

Anyway, I hope I have given a heads-up... find "your way" to fix those damn
things... they are glaring.


On Mon, Mar 29, 2021 at 05:22:56AM +0530, Bhaskar Chowdhury wrote:

This patch series fixes some trivial and rudimentary spellings in the COMMENT
sections.

Bhaskar Chowdhury (30):
  acpi-dma.c: Fix couple of typos
  altera-msgdma.c: Couple of typos fixed
  amba-pl08x.c: Fixed couple of typos
  bcm-sba-raid.c: Few typos fixed
  bcm2835-dma.c: Fix a typo
  idma64.c: Fix couple of typos
  iop-adma.c: Few typos fixed
  mv_xor.c: Fix a typo
  mv_xor.h: Fixed a typo
  mv_xor_v2.c: Fix a typo
  nbpfaxi.c: Fixed a typo
  of-dma.c: Fixed a typo
  s3c24xx-dma.c: Fix a typo
  Revert "s3c24xx-dma.c: Fix a typo"
  s3c24xx-dma.c: Few typos fixed
  st_fdma.h: Fix couple of typos
  ste_dma40_ll.h: Fix a typo
  tegra20-apb-dma.c: Fixed a typo
  xgene-dma.c: Few spello fixes
  at_hdmac.c: Quite a few spello fixes
  owl-dma.c: Fix a typo
  at_hdmac_regs.h: Couple of typo fixes
  dma-jz4780.c: Fix a typo
  Kconfig: Change Synopsys to Synopsis
  ste_dma40.c: Few spello fixes
  dw-axi-dmac-platform.c: Few typos fixed
  dpaa2-qdma.c: Fix a typo
  usb-dmac.c: Fix a typo
  edma.c: Fix a typo
  xilinx_dma.c: Fix a typo

 drivers/dma/Kconfig|  8 
 drivers/dma/acpi-dma.c |  4 ++--
 drivers/dma/altera-msgdma.c|  4 ++--
 drivers/dma/amba-pl08x.c   |  4 ++--
 drivers/dma/at_hdmac.c | 14 +++---
 drivers/dma/at_hdmac_regs.h|  4 ++--
 drivers/dma/bcm-sba-raid.c |  8 
 drivers/dma/bcm2835-dma.c  |  2 +-
 drivers/dma/dma-jz4780.c   |  2 +-
 drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c |  8 
 drivers/dma/idma64.c   |  4 ++--
 drivers/dma/iop-adma.c |  6 +++---
 drivers/dma/mv_xor.c   |  2 +-
 drivers/dma/mv_xor.h   |  2 +-
 drivers/dma/mv_xor_v2.c|  2 +-
 drivers/dma/nbpfaxi.c  |  2 +-
 drivers/dma/of-dma.c   |  2 +-
 drivers/dma/owl-dma.c  |  2 +-
 drivers/dma/s3c24xx-dma.c  |  6 +++---
 drivers/dma/sh/shdmac.c|  2 +-
 drivers/dma/sh/usb-dmac.c  |  2 +-
 drivers/dma/st_fdma.h  |  4 ++--
 drivers/dma/ste_dma40.c| 10 +-
 drivers/dma/ste_dma40_ll.h |  2 +-
 drivers/dma/tegra20-apb-dma.c  |  2 +-
 drivers/dma/ti/edma.c  |  2 +-
 drivers/dma/xgene-dma.c|  6 +++---
 drivers/dma/xilinx/xilinx_dma.c|  2 +-
 28 files changed, 59 insertions(+), 59 deletions(-)

--
2.26.3

---end quoted text---


Re: [PATCH 15/17] ASoC: ti: omap-mcsp: remove duplicate test

2021-03-28 Thread Péter Ujfalusi

Hi Pierre,

On 3/26/21 11:59 PM, Pierre-Louis Bossart wrote:
> cppcheck warning:
> 
> sound/soc/ti/omap-mcbsp.c:379:11: style: The if condition is the same
> as the previous if condition [duplicateCondition]
> 
>  if (mcbsp->irq) {
>   ^
> sound/soc/ti/omap-mcbsp.c:376:11: note: First condition
>  if (mcbsp->irq)
>   ^
> sound/soc/ti/omap-mcbsp.c:379:11: note: Second condition
>  if (mcbsp->irq) {
>   ^
> 
> Keeping two separate tests was probably intentional for clarity, but
> since this generates warnings we might as well make cppcheck happy so
> that we have fewer warnings.

There might be other historical reasons why it ended up like this but
merging them does not make it any less clean.

Acked-by: Peter Ujfalusi 

> Signed-off-by: Pierre-Louis Bossart 
> ---
>  sound/soc/ti/omap-mcbsp.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/sound/soc/ti/omap-mcbsp.c b/sound/soc/ti/omap-mcbsp.c
> index 6025b30bbe77..db47981768c5 100644
> --- a/sound/soc/ti/omap-mcbsp.c
> +++ b/sound/soc/ti/omap-mcbsp.c
> @@ -373,10 +373,9 @@ static void omap_mcbsp_free(struct omap_mcbsp *mcbsp)
>   MCBSP_WRITE(mcbsp, WAKEUPEN, 0);
>  
>   /* Disable interrupt requests */
> - if (mcbsp->irq)
> + if (mcbsp->irq) {
>   MCBSP_WRITE(mcbsp, IRQEN, 0);
>  
> - if (mcbsp->irq) {
>   free_irq(mcbsp->irq, (void *)mcbsp);
>   } else {
>   free_irq(mcbsp->rx_irq, (void *)mcbsp);
> 

-- 
Péter

Re: [PATCH 01/23] atomctl.rst: A typo fix

2021-03-28 Thread Bhaskar Chowdhury


On 22:38 Sun 28 Mar 2021, Max Filippov wrote:

On Sun, Mar 28, 2021 at 10:37 PM Max Filippov  wrote:


On Sun, Mar 28, 2021 at 10:18 PM Bhaskar Chowdhury
 wrote:
>
> s/controlers/controllers/
>
> Signed-off-by: Bhaskar Chowdhury 
> ---
>  Documentation/xtensa/atomctl.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/xtensa/atomctl.rst b/Documentation/xtensa/atomctl.rst
> index 1ecbd0ba9a2e..a0efab2abe8f 100644
> --- a/Documentation/xtensa/atomctl.rst
> +++ b/Documentation/xtensa/atomctl.rst
> @@ -23,7 +23,7 @@ doing a Cached (WB) transaction and use the Memory RCW for un-cached
>  operations.
>
>  For systems without an coherent cache controller, non-MX, we always
> -use the memory controllers RCW, thought non-MX controlers likely
> +use the memory controllers RCW, thought non-MX controllers likely

In this line you could also do s/thought/though/.


...and s/memory controllers/memory controller's/


Thanks, will make both of the mentioned changes in V2.

--
Thanks.
-- Max


Re: [PATCH 14/17] ASoC: ti: omap-abe-twl6040: remove useless assignment

2021-03-28 Thread Péter Ujfalusi




On 3/26/21 11:59 PM, Pierre-Louis Bossart wrote:
> cppcheck warning:
> 
> sound/soc/ti/omap-abe-twl6040.c:173:10: style: Variable 'ret' is
> assigned a value that is never used. [unreadVariable]
>  int ret = 0;
>  ^

Thanks,
Acked-by: Peter Ujfalusi 

> 
> Signed-off-by: Pierre-Louis Bossart 
> ---
>  sound/soc/ti/omap-abe-twl6040.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/sound/soc/ti/omap-abe-twl6040.c b/sound/soc/ti/omap-abe-twl6040.c
> index 16ea039ff865..91cc9a4f44d7 100644
> --- a/sound/soc/ti/omap-abe-twl6040.c
> +++ b/sound/soc/ti/omap-abe-twl6040.c
> @@ -170,7 +170,7 @@ static int omap_abe_twl6040_init(struct snd_soc_pcm_runtime *rtd)
>   struct snd_soc_card *card = rtd->card;
>   struct abe_twl6040 *priv = snd_soc_card_get_drvdata(card);
>   int hs_trim;
> - int ret = 0;
> + int ret;
>  
>   /*
>* Configure McPDM offset cancellation based on the HSOTRIM value from
> 

-- 
Péter

[PATCH v4 15/16] KVM: x86/cpuid: Refactor host/guest CPU model consistency check

2021-03-28 Thread Like Xu

The legacy intel_pmu_lbr_is_compatible() can be renamed so that more
callers can reuse it for the same purpose; the comment about the LBR
use case is removed along the way.

Signed-off-by: Like Xu 
---
 arch/x86/kvm/cpuid.h |  5 +
 arch/x86/kvm/vmx/pmu_intel.c | 12 +---
 arch/x86/kvm/vmx/vmx.c   |  2 +-
 arch/x86/kvm/vmx/vmx.h   |  1 -
 4 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 2a0c5064497f..fb478fb45b9e 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -270,6 +270,11 @@ static inline int guest_cpuid_model(struct kvm_vcpu *vcpu)
return x86_model(best->eax);
 }
 
+static inline bool cpuid_model_is_consistent(struct kvm_vcpu *vcpu)
+{
+   return boot_cpu_data.x86_model == guest_cpuid_model(vcpu);
+}
+
 static inline int guest_cpuid_stepping(struct kvm_vcpu *vcpu)
 {
struct kvm_cpuid_entry2 *best;
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 3c1ee59571d9..4fe13cf80bb5 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -173,16 +173,6 @@ static inline struct kvm_pmc *get_fw_gp_pmc(struct kvm_pmu *pmu, u32 msr)
return get_gp_pmc(pmu, msr, MSR_IA32_PMC0);
 }
 
-bool intel_pmu_lbr_is_compatible(struct kvm_vcpu *vcpu)
-{
-   /*
-* As a first step, a guest could only enable LBR feature if its
-* cpu model is the same as the host because the LBR registers
-* would be pass-through to the guest and they're model specific.
-*/
-   return boot_cpu_data.x86_model == guest_cpuid_model(vcpu);
-}
-
 bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
 {
struct x86_pmu_lbr *lbr = vcpu_to_lbr_records(vcpu);
@@ -576,7 +566,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 
nested_vmx_pmu_entry_exit_ctls_update(vcpu);
 
-   if (intel_pmu_lbr_is_compatible(vcpu))
+   if (cpuid_model_is_consistent(vcpu))
x86_perf_get_lbr(&lbr_desc->records);
else
lbr_desc->records.nr = 0;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 966fa7962808..b0f2cb790359 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2259,7 +2259,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if ((data & PMU_CAP_LBR_FMT) !=
(vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT))
return 1;
-   if (!intel_pmu_lbr_is_compatible(vcpu))
+   if (!cpuid_model_is_consistent(vcpu))
return 1;
}
ret = kvm_set_msr_common(vcpu, msr_info);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 0029aaad8eda..d214b6c43886 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -97,7 +97,6 @@ union vmx_exit_reason {
 #define vcpu_to_lbr_records(vcpu) (&to_vmx(vcpu)->lbr_desc.records)
 
 void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu);
-bool intel_pmu_lbr_is_compatible(struct kvm_vcpu *vcpu);
 bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu);
 
 int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu);
-- 
2.29.2

[PATCH v4 16/16] KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64

2021-03-28 Thread Like Xu

The CPUID features PDCM, DS and DTES64 are required for the PEBS feature.
KVM exposes the PDCM, DS and DTES64 CPUID features to the guest when PEBS
is supported by KVM on Ice Lake server platforms.

Originally-by: Andi Kleen 
Co-developed-by: Kan Liang 
Signed-off-by: Kan Liang 
Co-developed-by: Luwei Kang 
Signed-off-by: Luwei Kang 
Signed-off-by: Like Xu 
---
 arch/x86/kvm/vmx/capabilities.h | 26 --
 arch/x86/kvm/vmx/vmx.c  | 15 +++
 2 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index d1d77985e889..df06da09f84c 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -5,6 +5,7 @@
 #include 
 
 #include "lapic.h"
+#include "pmu.h"
 
 extern bool __read_mostly enable_vpid;
 extern bool __read_mostly flexpriority_enabled;
@@ -378,20 +379,33 @@ static inline bool vmx_pt_mode_is_host_guest(void)
return pt_mode == PT_MODE_HOST_GUEST;
 }
 
-static inline u64 vmx_get_perf_capabilities(void)
+static inline bool vmx_pebs_supported(void)
 {
-   u64 perf_cap = 0;
+   struct x86_pmu_capability x86_pmu;
 
-   if (boot_cpu_has(X86_FEATURE_PDCM))
-   rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap);
+   perf_get_x86_pmu_capability(&x86_pmu);
 
-   perf_cap &= PMU_CAP_LBR_FMT;
+   return boot_cpu_has(X86_FEATURE_PEBS) && x86_pmu.pebs_vmx;
+}
 
+static inline u64 vmx_get_perf_capabilities(void)
+{
/*
 * Since counters are virtualized, KVM would support full
 * width counting unconditionally, even if the host lacks it.
 */
-   return PMU_CAP_FW_WRITES | perf_cap;
+   u64 perf_cap = PMU_CAP_FW_WRITES;
+   u64 host_perf_cap = 0;
+
+   if (boot_cpu_has(X86_FEATURE_PDCM))
+   rdmsrl(MSR_IA32_PERF_CAPABILITIES, host_perf_cap);
+
+   perf_cap |= host_perf_cap & PMU_CAP_LBR_FMT;
+
+   if (vmx_pebs_supported())
+   perf_cap |= host_perf_cap & PERF_CAP_PEBS_MASK;
+
+   return perf_cap;
 }
 
 static inline u64 vmx_supported_debugctl(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b0f2cb790359..7cd9370357f9 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2262,6 +2262,17 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (!cpuid_model_is_consistent(vcpu))
return 1;
}
+   if (data & PERF_CAP_PEBS_FORMAT) {
+   if ((data & PERF_CAP_PEBS_MASK) !=
+   (vmx_get_perf_capabilities() & PERF_CAP_PEBS_MASK))
+   return 1;
+   if (!guest_cpuid_has(vcpu, X86_FEATURE_DS))
+   return 1;
+   if (!guest_cpuid_has(vcpu, X86_FEATURE_DTES64))
+   return 1;
+   if (!cpuid_model_is_consistent(vcpu))
+   return 1;
+   }
ret = kvm_set_msr_common(vcpu, msr_info);
break;
 
@@ -7264,6 +7275,10 @@ static __init void vmx_set_cpu_caps(void)
kvm_cpu_cap_clear(X86_FEATURE_INVPCID);
if (vmx_pt_mode_is_host_guest())
kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);
+   if (vmx_pebs_supported()) {
+   kvm_cpu_cap_check_and_set(X86_FEATURE_DS);
+   kvm_cpu_cap_check_and_set(X86_FEATURE_DTES64);
+   }
 
if (vmx_umip_emulated())
kvm_cpu_cap_set(X86_FEATURE_UMIP);
-- 
2.29.2
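
As a reading aid for the reworked vmx_get_perf_capabilities(), the
following is a minimal user-space sketch of how the guest value is
composed. The bit values are assumptions made for illustration and should
be checked against the kernel headers; guest_perf_capabilities() is a
hypothetical pure-function stand-in in which host_perf_cap replaces the
MSR read and pebs_supported replaces vmx_pebs_supported().

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout assumed for illustration only. */
#define PMU_CAP_FW_WRITES	(1ULL << 13)
#define PMU_CAP_LBR_FMT		0x3fULL
#define PERF_CAP_PEBS_MASK	((1ULL << 6) | (1ULL << 7) | 0xf00ULL | (1ULL << 14))

/*
 * Hypothetical stand-in: host_perf_cap is what the host MSR would report
 * (0 when the host lacks PDCM).
 */
static uint64_t guest_perf_capabilities(uint64_t host_perf_cap,
					bool pebs_supported)
{
	/* Full-width writes are offered unconditionally. */
	uint64_t perf_cap = PMU_CAP_FW_WRITES;

	/* The LBR format is always inherited from the host. */
	perf_cap |= host_perf_cap & PMU_CAP_LBR_FMT;

	/* PEBS bits are inherited only when guest PEBS can be supported. */
	if (pebs_supported)
		perf_cap |= host_perf_cap & PERF_CAP_PEBS_MASK;

	return perf_cap;
}
```

With these assumed values, a host reporting 0xc405 yields 0x2005 without
PEBS support and 0x6405 with it; bit 15 is dropped in both cases because
no mask covers it.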

[PATCH v4 14/16] KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability

2021-03-28 Thread Like Xu

The information obtained from the interface perf_get_x86_pmu_capability()
doesn't change, so an exported "struct x86_pmu_capability" is introduced
for all guests in the KVM, and it's initialized before hardware_setup().

Signed-off-by: Like Xu 
---
 arch/x86/kvm/cpuid.c | 24 +++-
 arch/x86/kvm/pmu.c   |  3 +++
 arch/x86/kvm/pmu.h   | 19 +++
 arch/x86/kvm/vmx/pmu_intel.c | 13 +
 arch/x86/kvm/x86.c   |  9 -
 5 files changed, 38 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6bd2f8b830e4..b3c751d425b7 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -680,32 +680,22 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
case 9:
break;
case 0xa: { /* Architectural Performance Monitoring */
-   struct x86_pmu_capability cap;
union cpuid10_eax eax;
union cpuid10_edx edx;
 
-   perf_get_x86_pmu_capability(&cap);
+   eax.split.version_id = kvm_pmu_cap.version;
+   eax.split.num_counters = kvm_pmu_cap.num_counters_gp;
+   eax.split.bit_width = kvm_pmu_cap.bit_width_gp;
+   eax.split.mask_length = kvm_pmu_cap.events_mask_len;
 
-   /*
-* Only support guest architectural pmu on a host
-* with architectural pmu.
-*/
-   if (!cap.version)
-   memset(&cap, 0, sizeof(cap));
-
-   eax.split.version_id = min(cap.version, 2);
-   eax.split.num_counters = cap.num_counters_gp;
-   eax.split.bit_width = cap.bit_width_gp;
-   eax.split.mask_length = cap.events_mask_len;
-
-   edx.split.num_counters_fixed = min(cap.num_counters_fixed, MAX_FIXED_COUNTERS);
-   edx.split.bit_width_fixed = cap.bit_width_fixed;
+   edx.split.num_counters_fixed = kvm_pmu_cap.num_counters_fixed;
+   edx.split.bit_width_fixed = kvm_pmu_cap.bit_width_fixed;
edx.split.anythread_deprecated = 1;
edx.split.reserved1 = 0;
edx.split.reserved2 = 0;
 
entry->eax = eax.full;
-   entry->ebx = cap.events_mask;
+   entry->ebx = kvm_pmu_cap.events_mask;
entry->ecx = 0;
entry->edx = edx.full;
break;
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 0081cb742743..28deb51242e1 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -19,6 +19,9 @@
 #include "lapic.h"
 #include "pmu.h"
 
+struct x86_pmu_capability __read_mostly kvm_pmu_cap;
+EXPORT_SYMBOL_GPL(kvm_pmu_cap);
+
 /* This is enough to filter the vast majority of currently defined events. */
 #define KVM_PMU_EVENT_FILTER_MAX_EVENTS 300
 
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 6c902b2d2d5a..3f84640d8f8c 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -160,6 +160,23 @@ static inline bool pmc_speculative_in_use(struct kvm_pmc *pmc)
return pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE;
 }
 
+extern struct x86_pmu_capability kvm_pmu_cap;
+
+static inline void kvm_init_pmu_capability(void)
+{
+   perf_get_x86_pmu_capability(&kvm_pmu_cap);
+
+   /*
+* Only support guest architectural pmu on
+* a host with architectural pmu.
+*/
+   if (!kvm_pmu_cap.version)
+   memset(&kvm_pmu_cap, 0, sizeof(kvm_pmu_cap));
+
+   kvm_pmu_cap.version = min(kvm_pmu_cap.version, 2);
+   kvm_pmu_cap.num_counters_fixed = min(kvm_pmu_cap.num_counters_fixed, MAX_FIXED_COUNTERS);
+}
+
 void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel);
 void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int fixed_idx);
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx);
@@ -177,9 +194,11 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu);
 void kvm_pmu_cleanup(struct kvm_vcpu *vcpu);
 void kvm_pmu_destroy(struct kvm_vcpu *vcpu);
 int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp);
+void kvm_init_pmu_capability(void);
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx);
 
 extern struct kvm_pmu_ops intel_pmu_ops;
 extern struct kvm_pmu_ops amd_pmu_ops;
+
 #endif /* __KVM_X86_PMU_H */
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 55caa941e336..3c1ee59571d9 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -504,8 +504,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 {
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
-
-   struct x86_pmu_capability x86_pmu;
struct kvm_cpuid_entry2 *entry;
union cpuid10_eax eax;
union cpuid10_edx edx;
@@ -532,13 +530,12 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
return;

[PATCH v4 12/16] KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h

2021-03-28 Thread Like Xu

It allows this inline function to be reused by more callers in
more files, such as pmu_intel.c.

Signed-off-by: Like Xu 
---
 arch/x86/kvm/pmu.c | 11 ---
 arch/x86/kvm/pmu.h | 11 +++
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 8d2873cfec69..0081cb742743 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -476,17 +476,6 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
kvm_pmu_refresh(vcpu);
 }
 
-static inline bool pmc_speculative_in_use(struct kvm_pmc *pmc)
-{
-   struct kvm_pmu *pmu = pmc_to_pmu(pmc);
-
-   if (pmc_is_fixed(pmc))
-   return fixed_ctrl_field(pmu->fixed_ctr_ctrl,
-   pmc->idx - INTEL_PMC_IDX_FIXED) & 0x3;
-
-   return pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE;
-}
-
 /* Release perf_events for vPMCs that have been unused for a full time slice.  */
 void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index d9157128e6eb..6c902b2d2d5a 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -149,6 +149,17 @@ static inline u64 get_sample_period(struct kvm_pmc *pmc, u64 counter_value)
return sample_period;
 }
 
+static inline bool pmc_speculative_in_use(struct kvm_pmc *pmc)
+{
+   struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+
+   if (pmc_is_fixed(pmc))
+   return fixed_ctrl_field(pmu->fixed_ctr_ctrl,
+   pmc->idx - INTEL_PMC_IDX_FIXED) & 0x3;
+
+   return pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE;
+}
+
 void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel);
 void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int fixed_idx);
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx);
-- 
2.29.2

[PATCH v4 13/16] KVM: x86/pmu: Disable guest PEBS before vm-entry in two cases

2021-03-28 Thread Like Xu

The guest PEBS will be disabled when some users try to perf KVM and
its user-space through the same PEBS facility OR when the host perf
doesn't schedule the guest PEBS counter in a one-to-one mapping manner
(neither of these are typical scenarios).

The PEBS records in the guest DS buffer are still accurate, and the
above two restrictions are checked before each vm-entry only if
guest PEBS is deemed to be enabled.

Signed-off-by: Like Xu 
---
 arch/x86/events/intel/core.c|  8 +++-
 arch/x86/include/asm/kvm_host.h |  9 +
 arch/x86/kvm/vmx/pmu_intel.c| 16 
 arch/x86/kvm/vmx/vmx.c  |  4 
 arch/x86/kvm/vmx/vmx.h  |  1 +
 5 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 3bbdfc4f6931..20ee1b3fd06b 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3858,7 +3858,13 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
if (pmu && x86_pmu.pebs) {
arr[1].msr = MSR_IA32_PEBS_ENABLE;
arr[1].host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask;
-   arr[1].guest = cpuc->pebs_enabled & ~cpuc->intel_ctrl_host_mask;
+   if (!arr[1].host) {
+   arr[1].guest = cpuc->pebs_enabled & ~cpuc->intel_ctrl_host_mask;
+   /* Disable guest PEBS for cross-mapped PEBS counters. */
+   arr[1].guest &= ~pmu->host_cross_mapped_mask;
+   } else
+   /* Disable guest PEBS if host PEBS is enabled. */
+   arr[1].guest = 0;
 
arr[2].msr = MSR_IA32_DS_AREA;
arr[2].host = (unsigned long)ds;
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 94366da2dfee..cfb5467be7e6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -466,6 +466,15 @@ struct kvm_pmu {
u64 pebs_data_cfg;
u64 pebs_data_cfg_mask;
 
+   /*
+* If a guest counter is cross-mapped to host counter with different
+* index, its PEBS capability will be temporarily disabled.
+*
+* The user should make sure that this mask is updated
+* after disabling interrupts and before perf_guest_get_msrs();
+*/
+   u64 host_cross_mapped_mask;
+
/*
 * The gate to release perf_events not marked in
 * pmc_in_use only once in a vcpu time slice.
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 4dcf66e6c398..55caa941e336 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -767,6 +767,22 @@ static void intel_pmu_cleanup(struct kvm_vcpu *vcpu)
intel_pmu_release_guest_lbr_event(vcpu);
 }
 
+void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu)
+{
+   struct kvm_pmc *pmc = NULL;
+   int bit;
+
+   for_each_set_bit(bit, (unsigned long *)&pmu->global_ctrl, X86_PMC_IDX_MAX) {
+   pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);
+
+   if (!pmc || !pmc_speculative_in_use(pmc) || !pmc_is_enabled(pmc))
+   continue;
+
+   if (pmc->perf_event && (pmc->idx != pmc->perf_event->hw.idx))
+   pmu->host_cross_mapped_mask |= BIT_ULL(pmc->perf_event->hw.idx);
+   }
+}
+
 struct kvm_pmu_ops intel_pmu_ops = {
.find_arch_event = intel_find_arch_event,
.find_fixed_event = intel_find_fixed_event,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 594c058f6f0f..966fa7962808 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6516,6 +6516,10 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
struct perf_guest_switch_msr *msrs;
struct kvm_pmu *pmu = vcpu_to_pmu(&vmx->vcpu);
 
+   pmu->host_cross_mapped_mask = 0;
+   if (pmu->pebs_enable & pmu->global_ctrl)
+   intel_pmu_cross_mapped_check(pmu);
+
/* Note, nr_msrs may be garbage if perf_guest_get_msrs() returns NULL. */
msrs = perf_guest_get_msrs(&nr_msrs, (void *)pmu);
if (!msrs)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 0fb3236b0283..0029aaad8eda 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -96,6 +96,7 @@ union vmx_exit_reason {
 #define vcpu_to_lbr_desc(vcpu) (&to_vmx(vcpu)->lbr_desc)
 #define vcpu_to_lbr_records(vcpu) (&to_vmx(vcpu)->lbr_desc.records)
 
+void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu);
 bool intel_pmu_lbr_is_compatible(struct kvm_vcpu *vcpu);
 bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu);
 
-- 
2.29.2
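
To make the two restrictions in this patch easier to follow, here is a
minimal user-space sketch of the decision logic. It is an illustration
under simplifying assumptions, not the kernel code: struct guest_counter,
NR_COUNTERS, cross_mapped_mask() and guest_pebs_enable() are all
hypothetical names, and the in_use flag folds together the
pmc_speculative_in_use() and pmc_is_enabled() checks.

```c
#include <stdint.h>

#define NR_COUNTERS 8	/* illustrative; not the kernel's X86_PMC_IDX_MAX */

/* Hypothetical, simplified view of a guest counter and its host event. */
struct guest_counter {
	int in_use;	/* the guest has enabled this counter */
	int has_event;	/* a host perf event backs the counter */
	int host_idx;	/* index of the host counter chosen by perf */
};

/* Collect host counter bits whose index differs from the guest index. */
static uint64_t cross_mapped_mask(const struct guest_counter *ctr,
				  uint64_t global_ctrl)
{
	uint64_t mask = 0;

	for (int i = 0; i < NR_COUNTERS; i++) {
		if (!(global_ctrl & (1ULL << i)) || !ctr[i].in_use)
			continue;
		if (ctr[i].has_event && ctr[i].host_idx != i)
			mask |= 1ULL << ctr[i].host_idx;
	}
	return mask;
}

/* Guest PEBS_ENABLE value: zero if the host uses PEBS, else minus cross-mapped bits. */
static uint64_t guest_pebs_enable(uint64_t host_pebs, uint64_t guest_pebs,
				  uint64_t cross_mask)
{
	if (host_pebs)
		return 0;
	return guest_pebs & ~cross_mask;
}
```

For example, if guest counter 1 is backed by host counter 3, bit 3 lands
in the mask and is cleared from the guest PEBS_ENABLE value, while any
host PEBS use clears the guest value entirely.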

[PATCH v4 10/16] KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled

2021-03-28 Thread Like Xu

Bit 12 represents "Processor Event Based Sampling Unavailable (RO)":
1 = PEBS is not supported.
0 = PEBS is supported.

A write to this PEBS_UNAVAIL bit will raise #GP(0) when guest PEBS
is enabled. Some PEBS drivers in the guest may care about this bit.

Signed-off-by: Like Xu 
---
 arch/x86/kvm/vmx/pmu_intel.c | 2 ++
 arch/x86/kvm/x86.c   | 4 
 2 files changed, 6 insertions(+)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 7f18c760dbae..4dcf66e6c398 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -588,6 +588,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
bitmap_set(pmu->all_valid_pmc_idx, INTEL_PMC_IDX_FIXED_VLBR, 1);
 
if (vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT) {
+   vcpu->arch.ia32_misc_enable_msr &= ~MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL;
if (vcpu->arch.perf_capabilities & PERF_CAP_PEBS_BASELINE) {
pmu->pebs_enable_mask = ~pmu->global_ctrl;
pmu->reserved_bits &= ~ICL_EVENTSEL_ADAPTIVE;
@@ -598,6 +599,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
} else
pmu->pebs_enable_mask = ~((1ull << pmu->nr_arch_gp_counters) - 1);
} else {
+   vcpu->arch.ia32_misc_enable_msr |= MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL;
vcpu->arch.perf_capabilities &= ~PERF_CAP_PEBS_MASK;
}
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 536b64360b75..888f2c3cc288 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3126,6 +3126,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
break;
case MSR_IA32_MISC_ENABLE:
data &= ~MSR_IA32_MISC_ENABLE_EMON;
+   if (!msr_info->host_initiated &&
+   (vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT) &&
+   (data & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL))
+   return 1;
if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
((vcpu->arch.ia32_misc_enable_msr ^ data) & MSR_IA32_MISC_ENABLE_MWAIT)) {
if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
-- 
2.29.2
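
The handling of bit 12 described in this patch can be sketched in user
space as follows. This is a hedged illustration, not the kernel code:
the macro name and both helpers are hypothetical, and guest_has_pebs
stands in for the PERF_CAP_PEBS_FORMAT test on the vCPU's perf
capabilities.

```c
#include <stdbool.h>
#include <stdint.h>

#define MISC_ENABLE_PEBS_UNAVAIL (1ULL << 12)	/* bit 12, per the commit message */

/* Hypothetical helper: refresh the read-only bit from guest PEBS support. */
static uint64_t refresh_misc_enable(uint64_t misc_enable, bool guest_has_pebs)
{
	if (guest_has_pebs)
		return misc_enable & ~MISC_ENABLE_PEBS_UNAVAIL;
	return misc_enable | MISC_ENABLE_PEBS_UNAVAIL;
}

/*
 * Hypothetical helper: true when a write must be rejected with #GP.
 * Mirrors the condition in the patch: a guest-initiated write that sets
 * PEBS_UNAVAIL while PEBS is exposed to the guest.
 */
static bool misc_enable_write_faults(uint64_t data, bool host_initiated,
				     bool guest_has_pebs)
{
	return !host_initiated && guest_has_pebs &&
	       (data & MISC_ENABLE_PEBS_UNAVAIL) != 0;
}
```

Host-initiated writes (for example, state restore from userspace) are
deliberately exempt from the check, matching the !host_initiated test in
the diff.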

[PATCH v4 09/16] KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS

2021-03-28 Thread Like Xu

If IA32_PERF_CAPABILITIES.PEBS_BASELINE [bit 14] is set, the adaptive
PEBS is supported. The PEBS_DATA_CFG MSR and adaptive record enable
bits (IA32_PERFEVTSELx.Adaptive_Record and IA32_FIXED_CTR_CTRL.
FCx_Adaptive_Record) are also supported.

Adaptive PEBS provides software the capability to configure the PEBS
records to capture only the data of interest, keeping the record size
compact. An overflow of PMCx results in generation of an adaptive PEBS
record with state information based on the selections specified in
MSR_PEBS_DATA_CFG (Memory Info [bit 0], GPRs [bit 1], XMMs [bit 2],
and LBRs [bit 3], LBR Entries [bit 31:24]). By default, the PEBS record
will only contain the Basic group.

When guest adaptive PEBS is enabled, the IA32_PEBS_ENABLE MSR will
be added to the perf_guest_switch_msr() and switched during the VMX
transitions just like CORE_PERF_GLOBAL_CTRL MSR.
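
The group-select bits listed above can be illustrated with a minimal
user-space sketch. The bit positions are taken from the commit message;
the macro names are defined locally for the example, and pebs_data_cfg()
is a hypothetical helper. The sketch treats the LBR Entries field as a
raw 8-bit value and makes no claim about how the hardware encodes the
entry count.

```c
#include <stdint.h>

/* Group-select bits as listed in the commit message. */
#define PEBS_DATACFG_MEMINFO	(1ULL << 0)
#define PEBS_DATACFG_GPRS	(1ULL << 1)
#define PEBS_DATACFG_XMMS	(1ULL << 2)
#define PEBS_DATACFG_LBRS	(1ULL << 3)
#define PEBS_DATACFG_LBR_SHIFT	24	/* LBR Entries field, bits 31:24 */

/* Hypothetical helper: build a config value from the selected groups. */
static uint64_t pebs_data_cfg(uint64_t groups, uint8_t lbr_entries)
{
	uint64_t cfg = groups & 0xfULL;

	/* The LBR Entries field is only meaningful when LBRs are selected. */
	if (groups & PEBS_DATACFG_LBRS)
		cfg |= (uint64_t)lbr_entries << PEBS_DATACFG_LBR_SHIFT;

	return cfg;
}
```

A value of 0 selects no optional group, which corresponds to the default
record containing only the Basic group.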

Co-developed-by: Luwei Kang 
Signed-off-by: Luwei Kang 
Signed-off-by: Like Xu 
---
 arch/x86/events/intel/core.c| 11 ++-
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/vmx/pmu_intel.c| 16 
 3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 7f3821a59b84..3bbdfc4f6931 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3844,6 +3844,7 @@ static struct perf_guest_switch_msr 
*intel_guest_get_msrs(int *nr, void *data)
struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
struct kvm_pmu *pmu = (struct kvm_pmu *)data;
+   bool baseline = x86_pmu.intel_cap.pebs_baseline;
 
arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
arr[0].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
@@ -3863,6 +3864,12 @@ static struct perf_guest_switch_msr 
*intel_guest_get_msrs(int *nr, void *data)
arr[2].host = (unsigned long)ds;
arr[2].guest = pmu->ds_area;
 
+   if (baseline) {
+   arr[3].msr = MSR_PEBS_DATA_CFG;
+   arr[3].host = cpuc->pebs_data_cfg;
+   arr[3].guest = pmu->pebs_data_cfg;
+   }
+
/*
 * If PMU counter has PEBS enabled it is not enough to
 * disable counter on a guest entry since PEBS memory
@@ -3879,9 +3886,11 @@ static struct perf_guest_switch_msr 
*intel_guest_get_msrs(int *nr, void *data)
else {
arr[1].guest = arr[1].host;
arr[2].guest = arr[2].host;
+   if (baseline)
+   arr[3].guest = arr[3].host;
}
 
-   *nr = 3;
+   *nr = baseline ? 4 : 3;
}
 
return arr;
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2275cc144f58..94366da2dfee 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -463,6 +463,8 @@ struct kvm_pmu {
u64 ds_area;
u64 pebs_enable;
u64 pebs_enable_mask;
+   u64 pebs_data_cfg;
+   u64 pebs_data_cfg_mask;
 
/*
 * The gate to release perf_events not marked in
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 77d30106abca..7f18c760dbae 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -226,6 +226,9 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 
msr)
case MSR_IA32_DS_AREA:
ret = guest_cpuid_has(vcpu, X86_FEATURE_DS);
break;
+   case MSR_PEBS_DATA_CFG:
+   ret = vcpu->arch.perf_capabilities & PERF_CAP_PEBS_BASELINE;
+   break;
default:
ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -379,6 +382,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
case MSR_IA32_DS_AREA:
msr_info->data = pmu->ds_area;
return 0;
+   case MSR_PEBS_DATA_CFG:
+   msr_info->data = pmu->pebs_data_cfg;
+   return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
(pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -452,6 +458,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
return 1;
pmu->ds_area = data;
return 0;
+   case MSR_PEBS_DATA_CFG:
+   if (pmu->pebs_data_cfg == data)
+   return 0;
+   if (!(data & pmu->pebs_data_cfg_mask)) {
+   pmu->pebs_data_cfg = data;
+   return 0;
+   }
+   break;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0))

[PATCH v4 11/16] KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter

2021-03-28 Thread Like Xu

The PEBS-PDIR facility on Ice Lake server is supported on IA32_FIXED0 only.
If the guest configures counter 32 and PEBS is enabled, the PEBS-PDIR
facility is supposed to be used, in which case KVM adjusts attr.precise_ip
to 3 and requests that host perf assign exactly the requested counter or fail.

The cpu model check is also required since some platforms may place the
PEBS-PDIR facility in another counter index.
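
The selection described above can be sketched in isolation as follows. This is a simplified model rather than the patch itself: ex_guest_precise_ip(), its boolean parameters, and EX_FIXED0_IDX are hypothetical names, and the host CPU model check is reduced to a flag supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>

#define EX_FIXED0_IDX 32  /* index of the first fixed counter, as in the text */

/*
 * Pick the precision level requested from host perf for a guest counter.
 * 'host_is_icl_server' stands in for the CPU model check.
 */
static unsigned int ex_guest_precise_ip(bool pebs, bool host_is_icl_server,
					unsigned int pmc_idx)
{
	assert(pmc_idx < 64);

	if (!pebs)
		return 0;  /* not a PEBS counter: no precision requested */
	if (host_is_icl_server && pmc_idx == EX_FIXED0_IDX)
		return 3;  /* PDIR counter: ask for the exact counter */
	return 1;          /* ordinary guest PEBS event */
}
```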

Signed-off-by: Like Xu 
---
 arch/x86/kvm/pmu.c | 2 ++
 arch/x86/kvm/pmu.h | 7 +++
 2 files changed, 9 insertions(+)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 3509b18478b9..8d2873cfec69 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -148,6 +148,8 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 
type,
 * in the PEBS record is calibrated on the guest side.
 */
attr.precise_ip = 1;
+   if (x86_match_cpu(vmx_icl_pebs_cpu) && pmc->idx == 32)
+   attr.precise_ip = 3;
}
 
event = perf_event_create_kernel_counter(&attr, -1, current,
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 7b30bc967af3..d9157128e6eb 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -4,6 +4,8 @@
 
 #include 
 
+#include 
+
 #define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu)
 #define pmu_to_vcpu(pmu)  (container_of((pmu), struct kvm_vcpu, arch.pmu))
 #define pmc_to_pmu(pmc)   (&(pmc)->vcpu->arch.pmu)
@@ -16,6 +18,11 @@
 #define VMWARE_BACKDOOR_PMC_APPARENT_TIME  0x10002
 
 #define MAX_FIXED_COUNTERS 3
+static const struct x86_cpu_id vmx_icl_pebs_cpu[] = {
+   X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, NULL),
+   X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, NULL),
+   {}
+};
 
 struct kvm_event_hw_type_mapping {
u8 eventsel;
-- 
2.29.2

[PATCH v4 08/16] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to manage guest DS buffer

2021-03-28 Thread Like Xu

When CPUID.01H:EDX.DS[21] is set, the IA32_DS_AREA MSR exists and
points to the linear address of the first byte of the DS buffer
management area, which is used to manage the PEBS records.

When guest PEBS is enabled and the value is different from the
host, KVM will add the IA32_DS_AREA MSR to the msr-switch list.
The guest's DS value can be loaded to the real HW before VM-entry,
and will be removed when guest PEBS is disabled.

The WRMSR to IA32_DS_AREA MSR brings a #GP(0) if the source register
contains a non-canonical address. The switch of the IA32_DS_AREA MSR would
also set up a quiescent period to write the host PEBS records (if any)
to the host DS area rather than the guest DS area.
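
To make the non-canonical rule concrete, here is a small standalone sketch of a canonical-address check and of a write handler that rejects bad values. It assumes 48 virtual address bits; ex_is_canonical() and ex_set_ds_area() are hypothetical helpers written for this example, and returning 1 to signal a fault only mirrors the convention visible in the diffs of this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * An address is canonical when bits 63..(vaddr_bits - 1) are all equal,
 * i.e. the upper bits are a sign extension of the top implemented bit.
 */
static bool ex_is_canonical(uint64_t la, unsigned int vaddr_bits)
{
	assert(vaddr_bits >= 1 && vaddr_bits <= 64);

	uint64_t top = la >> (vaddr_bits - 1);
	uint64_t all_ones = ~0ULL >> (vaddr_bits - 1);

	return top == 0 || top == all_ones;
}

/* Returns 0 on success, 1 if the write should fault (assumed convention). */
static int ex_set_ds_area(uint64_t *ds_area, uint64_t data)
{
	if (!ex_is_canonical(data, 48))  /* 48-bit addresses are an assumption */
		return 1;
	*ds_area = data;
	return 0;
}
```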

When guest PEBS is enabled, the MSR_IA32_DS_AREA MSR will be
added to the perf_guest_switch_msr() and switched during the
VMX transitions just like CORE_PERF_GLOBAL_CTRL MSR.

Originally-by: Andi Kleen 
Co-developed-by: Kan Liang 
Signed-off-by: Kan Liang 
Signed-off-by: Like Xu 
---
 arch/x86/events/intel/core.c| 15 ---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/vmx/pmu_intel.c| 11 +++
 arch/x86/kvm/vmx/vmx.c  |  1 +
 4 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 2ca8ed61f444..7f3821a59b84 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "../perf_event.h"
 
@@ -3841,6 +3842,8 @@ static struct perf_guest_switch_msr 
*intel_guest_get_msrs(int *nr, void *data)
 {
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
+   struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
+   struct kvm_pmu *pmu = (struct kvm_pmu *)data;
 
arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
arr[0].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
@@ -3851,11 +3854,15 @@ static struct perf_guest_switch_msr 
*intel_guest_get_msrs(int *nr, void *data)
arr[0].guest &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);
*nr = 1;
 
-   if (x86_pmu.pebs) {
+   if (pmu && x86_pmu.pebs) {
arr[1].msr = MSR_IA32_PEBS_ENABLE;
arr[1].host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask;
arr[1].guest = cpuc->pebs_enabled & ~cpuc->intel_ctrl_host_mask;
 
+   arr[2].msr = MSR_IA32_DS_AREA;
+   arr[2].host = (unsigned long)ds;
+   arr[2].guest = pmu->ds_area;
+
/*
 * If PMU counter has PEBS enabled it is not enough to
 * disable counter on a guest entry since PEBS memory
@@ -3869,10 +3876,12 @@ static struct perf_guest_switch_msr 
*intel_guest_get_msrs(int *nr, void *data)
 
if (arr[1].guest)
arr[0].guest |= arr[1].guest;
-   else
+   else {
arr[1].guest = arr[1].host;
+   arr[2].guest = arr[2].host;
+   }
 
-   *nr = 2;
+   *nr = 3;
}
 
return arr;
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f620485d7836..2275cc144f58 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -460,6 +460,7 @@ struct kvm_pmu {
DECLARE_BITMAP(all_valid_pmc_idx, X86_PMC_IDX_MAX);
DECLARE_BITMAP(pmc_in_use, X86_PMC_IDX_MAX);
 
+   u64 ds_area;
u64 pebs_enable;
u64 pebs_enable_mask;
 
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 0700d6d739f7..77d30106abca 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -223,6 +223,9 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 
msr)
case MSR_IA32_PEBS_ENABLE:
ret = vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT;
break;
+   case MSR_IA32_DS_AREA:
+   ret = guest_cpuid_has(vcpu, X86_FEATURE_DS);
+   break;
default:
ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -373,6 +376,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
case MSR_IA32_PEBS_ENABLE:
msr_info->data = pmu->pebs_enable;
return 0;
+   case MSR_IA32_DS_AREA:
+   msr_info->data = pmu->ds_area;
+   return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
(pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -441,6 +447,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
return 0;
}
break;
+   case MSR_IA32_DS_AREA:
+   if

[PATCH v4 06/16] KVM: x86/pmu: Reprogram guest PEBS event to emulate guest PEBS counter

2021-03-28 Thread Like Xu

When a guest counter is configured as a PEBS counter through
IA32_PEBS_ENABLE, a guest PEBS event will be reprogrammed by
configuring a non-zero precision level in the perf_event_attr.

The guest PEBS overflow PMI bit would be set in the guest
GLOBAL_STATUS MSR when PEBS facility generates a PEBS
overflow PMI based on guest IA32_DS_AREA MSR.

The attr.precise_ip would be adjusted to a special precision
level when the new PEBS-PDIR feature is supported later which
would affect the host counters scheduling.

The guest PEBS event would not be reused for non-PEBS
guest event even with the same guest counter index.

Originally-by: Andi Kleen 
Co-developed-by: Kan Liang 
Signed-off-by: Kan Liang 
Signed-off-by: Like Xu 
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/pmu.c  | 33 +++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c560960544a3..9b814bdc9137 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -460,6 +460,8 @@ struct kvm_pmu {
DECLARE_BITMAP(all_valid_pmc_idx, X86_PMC_IDX_MAX);
DECLARE_BITMAP(pmc_in_use, X86_PMC_IDX_MAX);
 
+   u64 pebs_enable;
+
/*
 * The gate to release perf_events not marked in
 * pmc_in_use only once in a vcpu time slice.
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 827886c12c16..3509b18478b9 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -74,11 +74,20 @@ static void kvm_perf_overflow_intr(struct perf_event 
*perf_event,
 {
struct kvm_pmc *pmc = perf_event->overflow_handler_context;
struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+   bool skip_pmi = false;
 
if (!test_and_set_bit(pmc->idx, pmu->reprogram_pmi)) {
-   __set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
+   if (perf_event->attr.precise_ip) {
+   /* Indicate PEBS overflow PMI to guest. */
+   skip_pmi = test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
+   (unsigned long *)&pmu->global_status);
+   } else
+   __set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
 
+   if (skip_pmi)
+   return;
+
/*
 * Inject PMI. If vcpu was in a guest mode during NMI PMI
 * can be ejected on a guest mode re-entry. Otherwise we can't
@@ -99,6 +108,7 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 
type,
  bool exclude_kernel, bool intr,
  bool in_tx, bool in_tx_cp)
 {
+   struct kvm_pmu *pmu = vcpu_to_pmu(pmc->vcpu);
struct perf_event *event;
struct perf_event_attr attr = {
.type = type,
@@ -110,6 +120,7 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 
type,
.exclude_kernel = exclude_kernel,
.config = config,
};
+   bool pebs = test_bit(pmc->idx, (unsigned long *)&pmu->pebs_enable);
 
attr.sample_period = get_sample_period(pmc, pmc->counter);
 
@@ -124,9 +135,23 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 
type,
attr.sample_period = 0;
attr.config |= HSW_IN_TX_CHECKPOINTED;
}
+   if (pebs) {
+   /*
+* The non-zero precision level of guest event makes the 
ordinary
+* guest event becomes a guest PEBS event and triggers the host
+* PEBS PMI handler to determine whether the PEBS overflow PMI
+* comes from the host counters or the guest.
+*
+* For most PEBS hardware events, the difference in the software
+* precision levels of guest and host PEBS events will not 
affect
+* the accuracy of the PEBS profiling result, because the 
"event IP"
+* in the PEBS record is calibrated on the guest side.
+*/
+   attr.precise_ip = 1;
+   }
 
event = perf_event_create_kernel_counter(&attr, -1, current,
-intr ? kvm_perf_overflow_intr :
+(intr || pebs) ? 
kvm_perf_overflow_intr :
 kvm_perf_overflow, pmc);
if (IS_ERR(event)) {
pr_debug_ratelimited("kvm_pmu: event creation failed %ld for 
pmc->idx = %d\n",
@@ -161,6 +186,10 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc)
  get_sample_period(pmc, pmc->counter)))
return false;
 
+   if (!test_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->pebs_enable) &&
+   pmc->perf_event->attr.precise_ip)
+   return false;
+
/* reuse

[PATCH v4 05/16] KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter

2021-03-28 Thread Like Xu

The mask value of the fixed counter control register should be dynamically
adjusted with the number of fixed counters. This patch introduces a
variable that includes the reserved bits of the fixed counter control
register. This is needed for later Ice Lake fixed counter changes.
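
The mask computation can be tried out in isolation with the sketch below. It mirrors the loop added to intel_pmu_refresh() in this patch (start from all ones, then clear 0xb in each counter's 4-bit field), but ex_fixed_ctr_ctrl_mask() and ex_write_allowed() are hypothetical standalone helpers, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Build the reserved-bit mask for a fixed counter control register with
 * 'nr_fixed' counters. Each counter owns a 4-bit field; clearing 0xb
 * makes bits 0, 1 and 3 of that field writable and leaves bit 2 reserved.
 */
static uint64_t ex_fixed_ctr_ctrl_mask(unsigned int nr_fixed)
{
	uint64_t mask = ~0ULL;
	unsigned int i;

	assert(nr_fixed <= 16);  /* 16 fields of 4 bits fit in 64 bits */

	for (i = 0; i < nr_fixed; i++)
		mask &= ~(0xbULL << (i * 4));
	return mask;
}

/* A write is acceptable only if it touches no reserved bit. */
static bool ex_write_allowed(uint64_t data, unsigned int nr_fixed)
{
	return !(data & ex_fixed_ctr_ctrl_mask(nr_fixed));
}
```

With three fixed counters the low 12 bits of the mask are 0x444; with four counters the fourth field becomes writable as well, so a value such as 0xb000 is rejected in the first case and accepted in the second.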

Co-developed-by: Luwei Kang 
Signed-off-by: Luwei Kang 
Signed-off-by: Like Xu 
---
 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/vmx/pmu_intel.c| 6 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a52f973bdff6..c560960544a3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -444,6 +444,7 @@ struct kvm_pmu {
unsigned nr_arch_fixed_counters;
unsigned available_event_types;
u64 fixed_ctr_ctrl;
+   u64 fixed_ctr_ctrl_mask;
u64 global_ctrl;
u64 global_status;
u64 global_ovf_ctrl;
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index d9dbebe03cae..ac7fe714e6c1 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -400,7 +400,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
case MSR_CORE_PERF_FIXED_CTR_CTRL:
if (pmu->fixed_ctr_ctrl == data)
return 0;
-   if (!(data & 0xf444ull)) {
+   if (!(data & pmu->fixed_ctr_ctrl_mask)) {
reprogram_fixed_counters(pmu, data);
return 0;
}
@@ -470,6 +470,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
struct kvm_cpuid_entry2 *entry;
union cpuid10_eax eax;
union cpuid10_edx edx;
+   int i;
 
pmu->nr_arch_gp_counters = 0;
pmu->nr_arch_fixed_counters = 0;
@@ -477,6 +478,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
pmu->version = 0;
pmu->reserved_bits = 0x0020ull;
+   pmu->fixed_ctr_ctrl_mask = ~0ull;
 
entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
if (!entry)
@@ -511,6 +513,8 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
((u64)1 << edx.split.bit_width_fixed) - 1;
}
 
+   for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
+   pmu->fixed_ctr_ctrl_mask &= ~(0xbull << (i * 4));
pmu->global_ctrl = ((1ull << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << 
INTEL_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
-- 
2.29.2

[PATCH v4 07/16] KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS

2021-03-28 Thread Like Xu

If IA32_PERF_CAPABILITIES.PEBS_BASELINE [bit 14] is set, the
IA32_PEBS_ENABLE MSR exists and all architecturally enumerated fixed
and general purpose counters have corresponding bits in IA32_PEBS_ENABLE
that enable generation of PEBS records. The general-purpose counter bits
start at bit IA32_PEBS_ENABLE[0], and the fixed counter bits start at
bit IA32_PEBS_ENABLE[32].
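
The bit assignment described above can be sketched as follows. This is only an illustration of the layout (general-purpose counters from bit 0, fixed counters from bit 32); ex_pebs_enable_bits() is a hypothetical helper, and how the patch derives pebs_enable_mask is not shown in this excerpt.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PEBS_FIXED_SHIFT 32  /* first fixed-counter bit, as in the text */

/*
 * Return the set of IA32_PEBS_ENABLE bits that correspond to existing
 * counters, given the number of general-purpose and fixed counters.
 */
static uint64_t ex_pebs_enable_bits(unsigned int nr_gp, unsigned int nr_fixed)
{
	assert(nr_gp < 32 && nr_fixed < 32);

	uint64_t gp = (1ULL << nr_gp) - 1;
	uint64_t fixed = ((1ULL << nr_fixed) - 1) << EX_PEBS_FIXED_SHIFT;

	return gp | fixed;
}
```

Under these assumptions, the complement of the returned value would be a natural candidate for a reserved-bit mask.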

When guest PEBS is enabled, the IA32_PEBS_ENABLE MSR will be
added to the perf_guest_switch_msr() and switched during the
VMX transitions just like CORE_PERF_GLOBAL_CTRL MSR.

Originally-by: Andi Kleen 
Co-developed-by: Kan Liang 
Signed-off-by: Kan Liang 
Co-developed-by: Luwei Kang 
Signed-off-by: Luwei Kang 
Signed-off-by: Like Xu 
---
 arch/x86/events/intel/core.c | 17 +
 arch/x86/include/asm/kvm_host.h  |  1 +
 arch/x86/include/asm/msr-index.h |  6 ++
 arch/x86/kvm/vmx/pmu_intel.c | 28 
 4 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index e8fee7cf767f..2ca8ed61f444 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3851,7 +3851,11 @@ static struct perf_guest_switch_msr 
*intel_guest_get_msrs(int *nr, void *data)
arr[0].guest &= ~(cpuc->pebs_enabled & PEBS_COUNTER_MASK);
*nr = 1;
 
-   if (x86_pmu.pebs && x86_pmu.pebs_no_isolation) {
+   if (x86_pmu.pebs) {
+   arr[1].msr = MSR_IA32_PEBS_ENABLE;
+   arr[1].host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask;
+   arr[1].guest = cpuc->pebs_enabled & ~cpuc->intel_ctrl_host_mask;
+
/*
 * If PMU counter has PEBS enabled it is not enough to
 * disable counter on a guest entry since PEBS memory
@@ -3860,9 +3864,14 @@ static struct perf_guest_switch_msr 
*intel_guest_get_msrs(int *nr, void *data)
 *
 * Don't do this if the CPU already enforces it.
 */
-   arr[1].msr = MSR_IA32_PEBS_ENABLE;
-   arr[1].host = cpuc->pebs_enabled;
-   arr[1].guest = 0;
+   if (x86_pmu.pebs_no_isolation)
+   arr[1].guest = 0;
+
+   if (arr[1].guest)
+   arr[0].guest |= arr[1].guest;
+   else
+   arr[1].guest = arr[1].host;
+
*nr = 2;
}
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9b814bdc9137..f620485d7836 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -461,6 +461,7 @@ struct kvm_pmu {
DECLARE_BITMAP(pmc_in_use, X86_PMC_IDX_MAX);
 
u64 pebs_enable;
+   u64 pebs_enable_mask;
 
/*
 * The gate to release perf_events not marked in
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 546d6ecf0a35..9afcad882f4f 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -186,6 +186,12 @@
 #define MSR_IA32_DS_AREA   0x0600
 #define MSR_IA32_PERF_CAPABILITIES 0x0345
 #define MSR_PEBS_LD_LAT_THRESHOLD  0x03f6
+#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
+#define PERF_CAP_ARCH_REG  BIT_ULL(7)
+#define PERF_CAP_PEBS_FORMAT   0xf00
+#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
+#define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
+   PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
 
 #define MSR_IA32_RTIT_CTL  0x0570
 #define RTIT_CTL_TRACEEN   BIT(0)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index ac7fe714e6c1..0700d6d739f7 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -220,6 +220,9 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 
msr)
case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
ret = pmu->version > 1;
break;
+   case MSR_IA32_PEBS_ENABLE:
+   ret = vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT;
+   break;
default:
ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -367,6 +370,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
msr_info->data = pmu->global_ovf_ctrl;
return 0;
+   case MSR_IA32_PEBS_ENABLE:
+   msr_info->data = pmu->pebs_enable;
+   return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
(pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -427,6 +433,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
return 0;
}
break;
+

[PATCH v4 04/16] KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled

2021-03-28 Thread Like Xu

On Intel platforms, software may use the IA32_MISC_ENABLE[7]
bit to detect whether the performance monitoring facility
is supported in the processor.

It's dependent on the PMU being enabled for the guest and
a write to this PMU available bit will be ignored.

Cc: Yao Yuan 
Signed-off-by: Like Xu 
---
 arch/x86/kvm/vmx/pmu_intel.c | 1 +
 arch/x86/kvm/x86.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 9efc1a6b8693..d9dbebe03cae 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -488,6 +488,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
if (!pmu->version)
return;
 
+   vcpu->arch.ia32_misc_enable_msr |= MSR_IA32_MISC_ENABLE_EMON;
perf_get_x86_pmu_capability(&x86_pmu);
 
pmu->nr_arch_gp_counters = min_t(int, eax.split.num_counters,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a9d95f90a048..536b64360b75 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3125,6 +3125,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
}
break;
case MSR_IA32_MISC_ENABLE:
+   data &= ~MSR_IA32_MISC_ENABLE_EMON;
if (!kvm_check_has_quirk(vcpu->kvm, 
KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
((vcpu->arch.ia32_misc_enable_msr ^ data) & 
MSR_IA32_MISC_ENABLE_MWAIT)) {
if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
-- 
2.29.2

[PATCH v4 00/16] KVM: x86/pmu: Add basic support to enable Guest PEBS via DS

2021-03-28 Thread Like Xu

The guest Precise Event Based Sampling (PEBS) feature can provide
the architectural state of the instruction executed after the guest
instruction that exactly caused the event. It needs a new hardware
facility that is only available on Intel Ice Lake Server platforms. This
patch set enables the basic PEBS via DS feature for KVM guests on ICX.

We can use PEBS feature on the Linux guest like native:

  # perf record -e instructions:ppp ./br_instr a
  # perf record -c 10 -e instructions:pp ./br_instr a

To emulate guest PEBS facility for the above perf usages,
we need to implement 2 code paths:

1) Fast path

This is when the host assigned physical PMC has an identical index as
the virtual PMC (e.g. using physical PMC0 to emulate virtual PMC0).
This path is used in most common use cases.

2) Slow path

This is when the host assigned physical PMC has a different index
from the virtual PMC (e.g. using physical PMC1 to emulate virtual PMC0)
In this case, KVM needs to rewrite the PEBS records to change the
applicable counter indexes to the virtual PMC indexes, which would
otherwise contain the physical counter index written by PEBS facility,
and switch the counter reset values to the offset corresponding to
the physical counter indexes in the DS data structure.
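
The distinction between the two paths comes down to comparing counter indexes, which the following sketch illustrates. It is a simplified model, not code from this series: the enum, EX_MAX_PMC, and ex_classify_pebs_path() are hypothetical names chosen for the example.

```c
#include <assert.h>

#define EX_MAX_PMC 64  /* assumed upper bound on counter indexes */

enum ex_pebs_path {
	EX_PEBS_FAST_PATH,  /* host and guest counter indexes match */
	EX_PEBS_SLOW_PATH,  /* cross-mapped: PEBS records would need rewriting */
};

/* Classify a guest PEBS counter by how the host backed it. */
static enum ex_pebs_path ex_classify_pebs_path(unsigned int host_idx,
					       unsigned int guest_idx)
{
	assert(host_idx < EX_MAX_PMC && guest_idx < EX_MAX_PMC);

	return host_idx == guest_idx ? EX_PEBS_FAST_PATH : EX_PEBS_SLOW_PATH;
}
```

Using physical PMC0 for virtual PMC0 falls on the fast path, while using physical PMC1 for virtual PMC0 falls on the slow path.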

The previous version [0] enables both fast path and slow path, which
seems a bit more complex as the first step. In this patchset, we want
to start with the fast path to get the basic guest PEBS enabled while
keeping the slow path disabled. More focused discussion on the slow
path [1] is planned to be put to another patchset in the next step.

Compared to later versions in subsequent steps, two pieces of
functionality are missing in this patch set: support for host and
guest PEBS being enabled at the same time, and emulation of guest PEBS
when the counter is cross-mapped (neither is a typical scenario).

With the basic support, the guest can retrieve the correct PEBS
information from its own PEBS records on the Ice Lake servers.
We expect this to keep working when the guest migrates to another
Ice Lake host, and we expect no regression in host perf.

Here are the results of pebs test from guest/host for same workload:

perf report on guest:
# Samples: 2K of event 'instructions:ppp', # Event count (approx.): 1473377250
# Overhead  Command   Shared Object  Symbol
  57.74%  br_instr  br_instr   [.] lfsr_cond
  41.40%  br_instr  br_instr   [.] cmp_end
   0.21%  br_instr  [kernel.kallsyms]  [k] __lock_acquire

perf report on host:
# Samples: 2K of event 'instructions:ppp', # Event count (approx.): 1462721386
# Overhead  Command   Shared Object Symbol
  57.90%  br_instr  br_instr  [.] lfsr_cond
  41.95%  br_instr  br_instr  [.] cmp_end
   0.05%  br_instr  [kernel.vmlinux]  [k] lock_acquire
Conclusion: the profiling results on the guest are similar to those on the host.

Please check more details in each commit and feel free to comment.

Previous:
[0] https://lore.kernel.org/kvm/20210104131542.495413-1-like...@linux.intel.com/
[1] https://lore.kernel.org/kvm/20210115191113.nktlnmivc3eds...@two.firstfloor.org/

v3->v4 Changelog:
- Update this cover letter and propose a new upstream plan;
[PERF]
- Drop check host DS and move handler to handle_pmi_common();
- Pass "struct kvm_pmu *" to intel_guest_get_msrs();
- Propose new assignment logic for perf_guest_switch_msr();
- Introduce x86_pmu.pebs_vmx for future capability maintenance;
[KVM]
- Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability;
- Raising PEBS PMI only when OVF_BIT 62 is not set;
- Make vmx_icl_pebs_cpu specific for PEBS-PDIR emulation;
- Fix a bug for fixed_ctr_ctrl_mask;
- Add two minor refactoring patches for reuse;

Like Xu (16):
  perf/x86/intel: Add x86_pmu.pebs_vmx for Ice Lake Servers
  perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest
  perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values
  KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
  KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
  KVM: x86/pmu: Reprogram guest PEBS event to emulate guest PEBS counter
  KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
  KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to manage guest DS buffer
  KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS
  KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled
  KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
  KVM: x86/pmu: Move pmc_speculative_in_use() to arch/x86/kvm/pmu.h
  KVM: x86/pmu: Disable guest PEBS before vm-entry in two cases
  KVM: x86/pmu: Add kvm_pmu_cap to optimize perf_get_x86_pmu_capability
  KVM: x86/cpuid: Refactor host/guest CPU model consistency check
  KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64

 arch/x86/events/core.c|   5 +-
 arch/x86/events/intel/core.c  |  93 +++---
 arch/x86/events/perf_event.h  |   5 +-

[PATCH v4 03/16] perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values

2021-03-28 Thread Like Xu

Splitting the logic for determining the guest values is unnecessarily
confusing, and potentially fragile. Perf should have full knowledge and
control of what values are loaded for the guest.

If we change .guest_get_msrs() to take a struct kvm_pmu pointer, then it
can generate the full set of guest values by grabbing guest ds_area and
pebs_data_cfg. Alternatively, .guest_get_msrs() could take the desired
guest MSR values directly (ds_area and pebs_data_cfg), but kvm_pmu is
vendor agnostic, so we don't see any reason to not just pass the pointer.

Suggested-by: Sean Christopherson 
Signed-off-by: Like Xu 
---
 arch/x86/events/core.c| 4 ++--
 arch/x86/events/intel/core.c  | 4 ++--
 arch/x86/events/perf_event.h  | 2 +-
 arch/x86/include/asm/perf_event.h | 4 ++--
 arch/x86/kvm/vmx/vmx.c| 3 ++-
 5 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 06bef6ba8a9b..7e2264a8c3f7 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -673,9 +673,9 @@ void x86_pmu_disable_all(void)
}
 }
 
-struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
+struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr, void *data)
 {
-   return static_call(x86_pmu_guest_get_msrs)(nr);
+   return static_call(x86_pmu_guest_get_msrs)(nr, data);
 }
 EXPORT_SYMBOL_GPL(perf_guest_get_msrs);
 
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index af9ac48fe840..e8fee7cf767f 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3837,7 +3837,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
return 0;
 }
 
-static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
+static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 {
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
@@ -3869,7 +3869,7 @@ static struct perf_guest_switch_msr 
*intel_guest_get_msrs(int *nr)
return arr;
 }
 
-static struct perf_guest_switch_msr *core_guest_get_msrs(int *nr)
+static struct perf_guest_switch_msr *core_guest_get_msrs(int *nr, void *data)
 {
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 85dc4e1d4514..e52b35333e1f 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -809,7 +809,7 @@ struct x86_pmu {
/*
 * Intel host/guest support (KVM)
 */
-   struct perf_guest_switch_msr *(*guest_get_msrs)(int *nr);
+   struct perf_guest_switch_msr *(*guest_get_msrs)(int *nr, void *data);
 
/*
 * Check period value for PERF_EVENT_IOC_PERIOD ioctl.
diff --git a/arch/x86/include/asm/perf_event.h 
b/arch/x86/include/asm/perf_event.h
index 6a6e707905be..d5957b68906b 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -491,10 +491,10 @@ static inline void perf_check_microcode(void) { }
 #endif
 
 #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
-extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
+extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr, void *data);
 extern int x86_perf_get_lbr(struct x86_pmu_lbr *lbr);
 #else
-struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
+struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr, void *data);
 static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
 {
return -1;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c8a4a548e96b..8063cb7e8387 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6513,9 +6513,10 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
 {
int i, nr_msrs;
struct perf_guest_switch_msr *msrs;
+   struct kvm_pmu *pmu = vcpu_to_pmu(&vmx->vcpu);
 
/* Note, nr_msrs may be garbage if perf_guest_get_msrs() returns NULL. */
-   msrs = perf_guest_get_msrs(&nr_msrs);
+   msrs = perf_guest_get_msrs(&nr_msrs, (void *)pmu);
if (!msrs)
return;
 
-- 
2.29.2

[PATCH v4 02/16] perf/x86/intel: Handle guest PEBS overflow PMI for KVM guest

2021-03-28 Thread Like Xu

With PEBS virtualization, the guest PEBS records get delivered to the
guest DS, and the host pmi handler uses perf_guest_cbs->is_in_guest()
to distinguish whether the PMI comes from the guest code like Intel PT.

No matter how many guest PEBS counters are overflowed, only triggering
one fake event is enough. The fake event causes the KVM PMI callback to
be called, thereby injecting the PEBS overflow PMI into the guest.

KVM will inject the PMI with BUFFER_OVF set, even if the guest DS is
empty. That should really be harmless. Thus the guest PEBS handler would
retrieve the correct information from its own PEBS records buffer.

Originally-by: Andi Kleen 
Co-developed-by: Kan Liang 
Signed-off-by: Kan Liang 
Signed-off-by: Like Xu 
---
 arch/x86/events/intel/core.c | 45 +++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 591d60cc8436..af9ac48fe840 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2747,6 +2747,46 @@ static void intel_pmu_reset(void)
local_irq_restore(flags);
 }
 
+/*
+ * We may be running with guest PEBS events created by KVM, and the
+ * PEBS records are logged into the guest's DS and invisible to host.
+ *
+ * In the case of guest PEBS overflow, we only trigger a fake event
+ * to emulate the PEBS overflow PMI for guest PEBS counters in KVM.
+ * The guest will then vm-entry and check the guest DS area to read
+ * the guest PEBS records.
+ *
+ * The contents and other behavior of the guest event do not matter.
+ */
+static int x86_pmu_handle_guest_pebs(struct pt_regs *regs,
+   struct perf_sample_data *data)
+{
+   struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+   u64 guest_pebs_idxs = cpuc->pebs_enabled & ~cpuc->intel_ctrl_host_mask;
+   struct perf_event *event = NULL;
+   int bit;
+
+   if (!x86_pmu.pebs_active || !guest_pebs_idxs)
+   return 0;
+
+   for_each_set_bit(bit, (unsigned long *)&guest_pebs_idxs,
+   INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed) {
+
+   event = cpuc->events[bit];
+   if (!event->attr.precise_ip)
+   continue;
+
+   perf_sample_data_init(data, 0, event->hw.last_period);
+   if (perf_event_overflow(event, data, regs))
+   x86_pmu_stop(event, 0);
+
+   /* Injecting one fake event is enough. */
+   return 1;
+   }
+
+   return 0;
+}
+
 static int handle_pmi_common(struct pt_regs *regs, u64 status)
 {
struct perf_sample_data data;
@@ -2797,7 +2837,10 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
u64 pebs_enabled = cpuc->pebs_enabled;
 
handled++;
-   x86_pmu.drain_pebs(regs, &data);
+   if (x86_pmu.pebs_vmx && perf_guest_cbs && perf_guest_cbs->is_in_guest())
+   x86_pmu_handle_guest_pebs(regs, &data);
+   else
+   x86_pmu.drain_pebs(regs, &data);
status &= x86_pmu.intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
 
/*
-- 
2.29.2

[PATCH v4 01/16] perf/x86/intel: Add x86_pmu.pebs_vmx for Ice Lake Servers

2021-03-28 Thread Like Xu

The new hardware facility supporting guest PEBS is only available
on Intel Ice Lake Server platforms for now. KVM will check this field
through perf_get_x86_pmu_capability() instead of hard coding the cpu
models in the KVM code. If it is supported, the guest PEBS capability
will be exposed to the guest.

Signed-off-by: Like Xu 
---
 arch/x86/events/core.c| 1 +
 arch/x86/events/intel/core.c  | 1 +
 arch/x86/events/perf_event.h  | 3 ++-
 arch/x86/include/asm/perf_event.h | 1 +
 4 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 18df17129695..06bef6ba8a9b 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2776,5 +2776,6 @@ void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
cap->bit_width_fixed = x86_pmu.cntval_bits;
cap->events_mask = (unsigned int)x86_pmu.events_maskl;
cap->events_mask_len = x86_pmu.events_mask_len;
+   cap->pebs_vmx = x86_pmu.pebs_vmx;
 }
 EXPORT_SYMBOL_GPL(perf_get_x86_pmu_capability);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 7bbb5bb98d8c..591d60cc8436 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5574,6 +5574,7 @@ __init int intel_pmu_init(void)
 
case INTEL_FAM6_ICELAKE_X:
case INTEL_FAM6_ICELAKE_D:
+   x86_pmu.pebs_vmx = 1;
pmem = true;
fallthrough;
case INTEL_FAM6_ICELAKE_L:
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 53b2b5fc23bc..85dc4e1d4514 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -729,7 +729,8 @@ struct x86_pmu {
pebs_prec_dist  :1,
pebs_no_tlb :1,
pebs_no_isolation   :1,
-   pebs_block  :1;
+   pebs_block  :1,
+   pebs_vmx:1;
int pebs_record_size;
int pebs_buffer_size;
int max_pebs_events;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 544f41a179fb..6a6e707905be 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -192,6 +192,7 @@ struct x86_pmu_capability {
int bit_width_fixed;
unsigned int events_mask;
int events_mask_len;
+   unsigned int pebs_vmx:1;
 };
 
 /*
-- 
2.29.2

Re: [PATCH 01/23] atomctl.rst: A typo fix

2021-03-28 Thread Max Filippov

On Sun, Mar 28, 2021 at 10:18 PM Bhaskar Chowdhury
 wrote:
>
> s/controlers/controllers/
>
> Signed-off-by: Bhaskar Chowdhury 
> ---
>  Documentation/xtensa/atomctl.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/xtensa/atomctl.rst b/Documentation/xtensa/atomctl.rst
> index 1ecbd0ba9a2e..a0efab2abe8f 100644
> --- a/Documentation/xtensa/atomctl.rst
> +++ b/Documentation/xtensa/atomctl.rst
> @@ -23,7 +23,7 @@ doing a Cached (WB) transaction and use the Memory RCW for un-cached
>  operations.
>
>  For systems without an coherent cache controller, non-MX, we always
> -use the memory controllers RCW, thought non-MX controlers likely
> +use the memory controllers RCW, thought non-MX controllers likely

In this line you could also do s/thought/though/.

-- 
Thanks.
-- Max

linux-next: manual merge of the staging tree with the scmi tree

2021-03-28 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the staging tree got a conflict in:

  drivers/iio/common/scmi_sensors/scmi_iio.c

between commit:

  fc91d6b6f0ba ("iio/scmi: port driver to the new scmi_sensor_proto_ops interface")

from the scmi tree and commit:

  1b33dfa5d5f1 ("Merge remote-tracking branch 'local/ib-iio-scmi-5.12-rc2-take3' into togreg")

from the staging tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/iio/common/scmi_sensors/scmi_iio.c
index b4bdc3f3a946,63e4cec9de5e..
--- a/drivers/iio/common/scmi_sensors/scmi_iio.c
+++ b/drivers/iio/common/scmi_sensors/scmi_iio.c
@@@ -501,23 -528,9 +501,9 @@@ static int scmi_iio_set_sampling_freq_a
return 0;
  }
  
- static int scmi_iio_buffers_setup(struct iio_dev *scmi_iiodev)
- {
-   struct iio_buffer *buffer;
- 
-   buffer = devm_iio_kfifo_allocate(&scmi_iiodev->dev);
-   if (!buffer)
-   return -ENOMEM;
- 
-   iio_device_attach_buffer(scmi_iiodev, buffer);
-   scmi_iiodev->modes |= INDIO_BUFFER_SOFTWARE;
-   scmi_iiodev->setup_ops = &scmi_iio_buffer_ops;
-   return 0;
- }
- 
 -static struct iio_dev *scmi_alloc_iiodev(struct device *dev,
 -   struct scmi_handle *handle,
 -   const struct scmi_sensor_info *sensor_info)
 +static struct iio_dev *
 +scmi_alloc_iiodev(struct scmi_device *sdev, struct scmi_protocol_handle *ph,
 +const struct scmi_sensor_info *sensor_info)
  {
struct iio_chan_spec *iio_channels;
struct scmi_iio_priv *sensor;



EROFS big pcluster feature benchmark

2021-03-28 Thread Gao Xiang

Hi folks,

The following shows the latest progress of the EROFS big pcluster
feature for the upcoming 5.13. Note that big pcluster also enables
in-place decompression to minimize extra page allocation and cache
thrashing.

Kernel: Linux 5.10-rc5
Testsuite: erofs-openbenchmark
Testdata: enwik9 (1000000000 bytes)
Compression algorithm: lz4hc, 9

Processor: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
SSD: INTEL SSDPEKKF360G7H (360 GB)
DDR: Samsung M471A1K43CB1-CRC (8 GB)
OS Distribution: Debian 10
Test environment:
Turbo Boost disabled
scaling_governor = userspace, scaling_{min,max}_freq = 1801000

Squashfs configuration:
CONFIG_SQUASHFS_FILE_DIRECT=y
CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU=y

EROFS git repos:
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/linux.git -b erofs/bigpcluster
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git -b experimental-bigpcluster-compact

Note that results will vary across CPU/storage combinations.
The reason seq read improves is that many (not all) storage devices
show lower I/O latency with smaller I/O sizes, so increasing the
pcluster size raises the compression ratio (C/R) and the I/O size
becomes smaller.
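
To put a rough number on C/R, the image sizes in the table below can be
divided into the uncompressed size. This is only a sketch: it assumes
enwik9 is 1000000000 bytes, it ignores filesystem metadata overhead, and
the helper name is made up for illustration.

```c
#include <stdint.h>

/* Assumed uncompressed size of enwik9 in bytes. */
#define MODEL_ENWIK9_BYTES 1000000000.0

/* Hypothetical helper: approximate compression ratio of an image. */
static double model_compression_ratio(uint64_t image_bytes)
{
	return MODEL_ENWIK9_BYTES / (double)image_bytes;
}
```

Under that assumption erofs_4k (556879872 bytes) comes out at about 1.80
and erofs_64k (393814016 bytes) at about 2.54.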

 
|  file system  |   size    | seq read | rand read | rand9m read |
|_______________|___________|_ MiB/s __|__ MiB/s __|___ MiB/s ___|
|___erofs_4k____|_556879872_|_ 781.4 __|__ 55.3 ___|___ 25.3  ___|
|___erofs_16k___|_452509696_|_ 864.8 __|_ 123.2 ___|___ 20.8  ___|
|___erofs_32k___|_415223808_|_ 899.8 __|_ 105.8 _*_|___ 16.8  ___|
|___erofs_64k___|_393814016_|_ 906.6 __|__ 66.6 _*_|___ 11.8  ___|
|__squashfs_8k__|_556191744_|_  64.9 __|__ 19.3 ___|____ 9.1  ___|
|__squashfs_16k_|_502661120_|_  98.9 __|__ 38.0 ___|____ 9.8  ___|
|__squashfs_32k_|_458784768_|_ 115.4 __|__ 71.6 _*_|___ 10.0  ___|
|_squashfs_128k_|_398204928_|_ 257.2 __|_ 253.8 _*_|___ 10.9  ___|
|____ext4_4k____|____()_____|_ 786.6 __|__ 28.6 ___|___ 27.8  ___|


* Squashfs grabs more page cache than the normally requested readahead
  in order to keep all decompressed data, using
  grab_cache_page_nowait() (see squashfs_copy_cache and
  squashfs_readpage_block).
  In principle, EROFS could also cache all such decompressed data
  if necessary, but that is low priority for now and of little use
  (rand9m is actually a better rand read workload, since the amount
  of I/O is 9m rather than the full-sized 1000m).

For the comparison of other filesystems, see:
https://github.com/erofs/erofs-openbenchmark/wiki

Thanks,
Gao Xiang

RAW DATA:
benchmarking imgs/enwik9_4k.erofs.compacted.img with erofs
mntdir/enwik9
[seqread]
   READ: bw=832MiB/s (873MB/s), 832MiB/s-832MiB/s (873MB/s-873MB/s), io=954MiB 
(1000MB), run=1146-1146msec
   READ: bw=780MiB/s (818MB/s), 780MiB/s-780MiB/s (818MB/s-818MB/s), io=954MiB 
(1000MB), run=1222-1222msec
   READ: bw=771MiB/s (808MB/s), 771MiB/s-771MiB/s (808MB/s-808MB/s), io=954MiB 
(1000MB), run=1237-1237msec
   READ: bw=761MiB/s (797MB/s), 761MiB/s-761MiB/s (797MB/s-797MB/s), io=954MiB 
(1000MB), run=1254-1254msec
   READ: bw=763MiB/s (800MB/s), 763MiB/s-763MiB/s (800MB/s-800MB/s), io=954MiB 
(1000MB), run=1250-1250msec
[randread]
   READ: bw=56.1MiB/s (58.8MB/s), 56.1MiB/s-56.1MiB/s (58.8MB/s-58.8MB/s), 
io=954MiB (1000MB), run=16995-16995msec
   READ: bw=54.6MiB/s (57.3MB/s), 54.6MiB/s-54.6MiB/s (57.3MB/s-57.3MB/s), 
io=954MiB (1000MB), run=17457-17457msec
   READ: bw=54.6MiB/s (57.3MB/s), 54.6MiB/s-54.6MiB/s (57.3MB/s-57.3MB/s), 
io=954MiB (1000MB), run=17460-17460msec
   READ: bw=56.7MiB/s (59.5MB/s), 56.7MiB/s-56.7MiB/s (59.5MB/s-59.5MB/s), 
io=954MiB (1000MB), run=16811-16811msec
   READ: bw=54.6MiB/s (57.2MB/s), 54.6MiB/s-54.6MiB/s (57.2MB/s-57.2MB/s), 
io=954MiB (1000MB), run=17479-17479msec
[randread_9m]
   READ: bw=23.8MiB/s (24.0MB/s), 23.8MiB/s-23.8MiB/s (24.0MB/s-24.0MB/s), 
io=9216KiB (9437kB), run=378-378msec
   READ: bw=24.6MiB/s (25.8MB/s), 24.6MiB/s-24.6MiB/s (25.8MB/s-25.8MB/s), 
io=9216KiB (9437kB), run=366-366msec
   READ: bw=25.6MiB/s (26.8MB/s), 25.6MiB/s-25.6MiB/s (26.8MB/s-26.8MB/s), 
io=9216KiB (9437kB), run=352-352msec
   READ: bw=26.1MiB/s (27.4MB/s), 26.1MiB/s-26.1MiB/s (27.4MB/s-27.4MB/s), 
io=9216KiB (9437kB), run=345-345msec
   READ: bw=26.3MiB/s (27.6MB/s), 26.3MiB/s-26.3MiB/s (27.6MB/s-27.6MB/s), 
io=9216KiB (9437kB), run=342-342msec

benchmarking imgs/enwik9_16k.erofs.compacted.img with erofs
mntdir/enwik9
[seqread]
   READ: bw=905MiB/s (949MB/s), 905MiB/s-905MiB/s (949MB/s-949MB/s), io=954MiB 
(1000MB), run=1054-1054msec
   READ: bw=845MiB/s (887MB/s), 845MiB/s-845MiB/s (887MB/s-887MB/s), io=954MiB 
(1000MB), run=1128-1128msec
   READ: bw=860MiB/s (902MB/s), 860MiB/s-860MiB/s (902MB/s-902MB/s), io=954MiB 
(1000MB), run=1109-1109msec
   READ: bw=861MiB/s (903MB/s), 861MiB/s-861MiB/s (903MB/s-903MB/s), io=954MiB 
(1000MB), run=1107-1107msec
   READ: bw=853MiB/s (894MB/s), 853MiB/s-853MiB/s (894MB/s-894MB/s), io=954MiB 
(1000MB), run=1118-1118msec
[randread]
   READ: bw=126MiB/s (132MB/s),

[PATCH] mm: add ___GFP_NOINIT flag which disables zeroing on alloc

2021-03-28 Thread Hyunsoon Kim

This patch allows a caller to skip zero initialization on page
allocation even when the kernel config "CONFIG_INIT_ON_ALLOC_DEFAULT_ON"
is enabled. That option exists to prevent uninitialized heap memory
flaws, and Android has enabled it for security and deterministic
execution times. Please refer to the link below.

https://android-review.googlesource.com/c/kernel/common/+/1235132

However, there are cases where zeroing the page memory is unnecessary
because the page is used for a specific purpose and will be zeroed
automatically by hardware that accesses the memory through DMA.
For instance, pages allocated for IP packet reception from the Exynos
modem are used solely for packet reception. Although such a page will
eventually be freed and reused for some other purpose, initialization
at the moment of reuse is sufficient to avoid uninitialized heap
memory flaws. To support this kind of control, this patch adds a new
gfp flag called ___GFP_NOINIT, which skips zeroing at page allocation
time in the many related APIs such as page_frag_alloc, alloc_pages,
etc.
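
The resulting decision order can be sketched in plain userspace C as
below. This is a simplified model, not the kernel code: the flag values
are made up for illustration and the init_on_alloc static key is passed
in as an ordinary boolean.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the gfp bits; values are illustrative only. */
#define MODEL_GFP_ZERO   0x1u
#define MODEL_GFP_NOINIT 0x2u

/* Models want_init_on_alloc() as changed by this patch. */
static bool model_want_init_on_alloc(unsigned int flags, bool init_on_alloc)
{
	if (flags & MODEL_GFP_NOINIT)
		return false;	/* caller opted out of zeroing */
	if (init_on_alloc)
		return true;	/* hardening default is on */
	return (flags & MODEL_GFP_ZERO) != 0;	/* explicit zeroing request */
}
```

Note that, because the new check comes first, the opt-out also wins over
an explicit zeroing request made in the same flags word.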

Signed-off-by: Hyunsoon Kim 
---
 include/linux/gfp.h | 2 ++
 include/linux/mm.h  | 4 +++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 8572a14..4ddd947 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -58,6 +58,8 @@ struct vm_area_struct;
 #else
 #define ___GFP_NOLOCKDEP   0
 #endif
+#define ___GFP_NOINIT  0x100u
+
 /* If the above are modified, __GFP_BITS_SHIFT may need updating */
 
 /*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8ba4342..06a23bb 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2907,7 +2907,9 @@ static inline void kernel_unpoison_pages(struct page *page, int numpages) { }
 DECLARE_STATIC_KEY_FALSE(init_on_alloc);
 static inline bool want_init_on_alloc(gfp_t flags)
 {
-   if (static_branch_unlikely(&init_on_alloc))
+   if (flags & ___GFP_NOINIT)
+   return false;
+   else if (static_branch_unlikely(&init_on_alloc))
return true;
return flags & __GFP_ZERO;
 }
-- 
2.7.4

Re: [PATCH v2] kernel/resource: Fix locking in request_free_mem_region

2021-03-28 Thread Balbir Singh

On Mon, Mar 29, 2021 at 12:55:15PM +1100, Alistair Popple wrote:
> On Friday, 26 March 2021 4:15:36 PM AEDT Balbir Singh wrote:
> > On Fri, Mar 26, 2021 at 12:20:35PM +1100, Alistair Popple wrote:
> > > +static int __region_intersects(resource_size_t start, size_t size,
> > > +unsigned long flags, unsigned long desc)
> > > +{
> > > + struct resource res;
> > > + int type = 0; int other = 0;
> > > + struct resource *p;
> > > +
> > > + res.start = start;
> > > + res.end = start + size - 1;
> > > +
> > > + for (p = iomem_resource.child; p ; p = p->sibling) {
> > > + bool is_type = (((p->flags & flags) == flags) &&
> > > + ((desc == IORES_DESC_NONE) ||
> > > +  (desc == p->desc)));
> > 
> > is_type is a bad name, are we saying "is" as in boolean question?
> > Or is it short for something like intersection_type? I know you've
> > just moved the code over :)
> 
> Yeah, I'm not a fan of that name either but I was just moving code over and 
> couldn't come up with anything better :)
> 
> It is a boolean question though - it is checking to see if resource *p is the 
> same type (flags+desc) of region as what is being checked for intersection.
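
For reference, the classification being discussed can be modelled in
userspace roughly as follows. All names here are hypothetical, the
resource list is a flat array rather than the iomem tree, and the desc
comparison is omitted to keep the sketch short.

```c
#include <stdbool.h>
#include <stddef.h>

enum model_result { MODEL_DISJOINT, MODEL_INTERSECTS, MODEL_MIXED };

struct model_res {
	unsigned long start, end, flags;
};

/* Two inclusive ranges overlap if each starts before the other ends. */
static bool model_overlaps(const struct model_res *a,
			   const struct model_res *b)
{
	return a->start <= b->end && a->end >= b->start;
}

static enum model_result
model_region_intersects(const struct model_res *tbl, size_t n,
			unsigned long start, unsigned long size,
			unsigned long flags)
{
	struct model_res q = { start, start + size - 1, flags };
	int type = 0, other = 0;

	for (size_t i = 0; i < n; i++) {
		/* "is_type": does this entry match the requested flags? */
		bool same_type = (tbl[i].flags & flags) == flags;

		if (model_overlaps(&tbl[i], &q)) {
			if (same_type)
				type++;
			else
				other++;
		}
	}

	if (type == 0)
		return MODEL_DISJOINT;
	if (other == 0)
		return MODEL_INTERSECTS;
	return MODEL_MIXED;
}
```

A range that only touches entries of another type is still reported as
disjoint, since no entry of the requested type overlaps it.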
>
> > > +
> > > + if (resource_overlaps(p, &res))
> > > + is_type ? type++ : other++;
> > > + }
> > > +
> > > + if (type == 0)
> > > + return REGION_DISJOINT;
> > > +
> > > + if (other == 0)
> > > + return REGION_INTERSECTS;
> > > +
> > > + return REGION_MIXED;
> > > +}
> > > +
> > >  /**
> > >   * region_intersects() - determine intersection of region with known resources
> > >   * @start: region start address
> > > @@ -546,31 +574,12 @@ EXPORT_SYMBOL_GPL(page_is_ram);
> > >  int region_intersects(resource_size_t start, size_t size, unsigned long flags,
> > > unsigned long desc)
> > >  {
> > > - struct resource res;
> > > - int type = 0; int other = 0;
> > > - struct resource *p;
> > > -
> > > - res.start = start;
> > > - res.end = start + size - 1;
> > > + int rc;
> > >
> > >   read_lock(&resource_lock);
> > > - for (p = iomem_resource.child; p ; p = p->sibling) {
> > > - bool is_type = (((p->flags & flags) == flags) &&
> > > - ((desc == IORES_DESC_NONE) ||
> > > -  (desc == p->desc)));
> > > -
> > > - if (resource_overlaps(p, &res))
> > > - is_type ? type++ : other++;
> > > - }
> > > + rc = __region_intersects(start, size, flags, desc);
> > >   read_unlock(&resource_lock);
> > > -
> > > - if (type == 0)
> > > - return REGION_DISJOINT;
> > > -
> > > - if (other == 0)
> > > - return REGION_INTERSECTS;
> > > -
> > > - return REGION_MIXED;
> > > + return rc;
> > >  }
> > >  EXPORT_SYMBOL_GPL(region_intersects);
> > >
> > > @@ -1171,31 +1180,17 @@ struct address_space *iomem_get_mapping(void)
> > >   return smp_load_acquire(&iomem_inode)->i_mapping;
> > >  }
> > >
> > > -/**
> > > - * __request_region - create a new busy resource region
> > > - * @parent: parent resource descriptor
> > > - * @start: resource start address
> > > - * @n: resource region size
> > > - * @name: reserving caller's ID string
> > > - * @flags: IO resource flags
> > > - */
> > > -struct resource * __request_region(struct resource *parent,
> > > -resource_size_t start, resource_size_t n,
> > > -const char *name, int flags)
> > > +static bool request_region_locked(struct resource *parent,
> > > + struct resource *res, resource_size_t start,
> > > + resource_size_t n, const char *name, int flags)
> > >  {
> > > - DECLARE_WAITQUEUE(wait, current);
> > > - struct resource *res = alloc_resource(GFP_KERNEL);
> > >   struct resource *orig_parent = parent;
> > > -
> > > - if (!res)
> > > - return NULL;
> > > + DECLARE_WAITQUEUE(wait, current);
> > 
> > This part of the diff looks confusing, do we have a waitqueue and we call
> > schedule() within a function called with the lock held?
> 
> Good point. schedule() does get called but the lock is dropped first:
> 
>   if (conflict->flags & flags & IORESOURCE_MUXED) {
>   add_wait_queue(&muxed_resource_wait, &wait);
>   write_unlock(&resource_lock);
>   set_current_state(TASK_UNINTERRUPTIBLE);
>   schedule();
>   remove_wait_queue(&muxed_resource_wait, &wait);
>   write_lock(&resource_lock);
>   continue;
>   }
> 
> This isn't an issue though as it's only used for request_muxed_region() which 
> isn't used for the ZONE_DEVICE allocation and by design doesn't search for 
> free space. Ie. IORESOURCE_MUXED will

Re: [PATCH 01/23] atomctl.rst: A typo fix

2021-03-28 Thread Max Filippov

On Sun, Mar 28, 2021 at 10:37 PM Max Filippov  wrote:
>
> On Sun, Mar 28, 2021 at 10:18 PM Bhaskar Chowdhury
>  wrote:
> >
> > s/controlers/controllers/
> >
> > Signed-off-by: Bhaskar Chowdhury 
> > ---
> >  Documentation/xtensa/atomctl.rst | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/Documentation/xtensa/atomctl.rst b/Documentation/xtensa/atomctl.rst
> > index 1ecbd0ba9a2e..a0efab2abe8f 100644
> > --- a/Documentation/xtensa/atomctl.rst
> > +++ b/Documentation/xtensa/atomctl.rst
> > @@ -23,7 +23,7 @@ doing a Cached (WB) transaction and use the Memory RCW for un-cached
> >  operations.
> >
> >  For systems without an coherent cache controller, non-MX, we always
> > -use the memory controllers RCW, thought non-MX controlers likely
> > +use the memory controllers RCW, thought non-MX controllers likely
>
> In this line you could also do s/thought/though/.

...and s/memory controllers/memory controller's/

-- 
Thanks.
-- Max

[PATCH v2] mm: fix race by making init_zero_pfn() early_initcall

2021-03-28 Thread Ilya Lipnitskiy

There are code paths that rely on zero_pfn to be fully initialized
before core_initcall. For example, wq_sysfs_init() is a core_initcall
function that eventually results in a call to kernel_execve, which
causes a page fault with a subsequent mmput. If zero_pfn is not
initialized by then it may not get cleaned up properly and result in an
error:
  BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1

Here is an analysis of the race as seen on a MIPS device. On this
particular MT7621 device (Ubiquiti ER-X), zero_pfn is PFN 0 until
initialized, at which point it becomes PFN 5120:
  1. wq_sysfs_init calls into kobject_uevent_env at core_initcall:
   [<80340dc8>] kobject_uevent_env+0x7e4/0x7ec
   [<8033f8b8>] kset_register+0x68/0x88
   [<803cf824>] bus_register+0xdc/0x34c
   [<803cfac8>] subsys_virtual_register+0x34/0x78
   [<8086afb0>] wq_sysfs_init+0x1c/0x4c
   [<80001648>] do_one_initcall+0x50/0x1a8
   [<8086503c>] kernel_init_freeable+0x230/0x2c8
   [<8066bca0>] kernel_init+0x10/0x100
   [<80003038>] ret_from_kernel_thread+0x14/0x1c

  2. kobject_uevent_env() calls call_usermodehelper_exec() which executes
 kernel_execve asynchronously.

  3. Memory allocations in kernel_execve cause a page fault, bumping the
 MM reference counter:
   [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
   [<80160d58>] handle_mm_fault+0x6e4/0xea0
   [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
   [<8015992c>] __get_user_pages_remote+0x128/0x360
   [<801a6d9c>] get_arg_page+0x34/0xa0
   [<801a7394>] copy_string_kernel+0x194/0x2a4
   [<801a880c>] kernel_execve+0x11c/0x298
   [<800420f4>] call_usermodehelper_exec_async+0x114/0x194

  4. In case zero_pfn has not been initialized yet, zap_pte_range does
 not decrement the MM_ANONPAGES RSS counter and the BUG message is
 triggered shortly afterwards when __mmdrop checks the ref counters:
   [<800285e8>] __mmdrop+0x98/0x1d0
   [<801a6de8>] free_bprm+0x44/0x118
   [<801a86a8>] kernel_execve+0x160/0x1d8
   [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
   [<80003198>] ret_from_kernel_thread+0x14/0x1c

To avoid races such as the one described above, run init_zero_pfn() at
early_initcall level. Depending on the architecture, ZERO_PAGE is either
constant or gets initialized even earlier, at paging_init, so there is
no issue with initializing zero_pfn earlier.
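
The ordering argument can be illustrated with a small userspace model.
Everything below is hypothetical scaffolding rather than kernel code: two
levels stand in for the initcall levels, the consumer stands in for
wq_sysfs_init(), and it is assumed that within one level the consumer is
registered ahead of the zero_pfn initializer.

```c
#include <stddef.h>

enum model_level { MODEL_EARLY, MODEL_CORE };

struct model_initcall {
	enum model_level level;
	int (*fn)(void);
};

static unsigned long model_zero_pfn;	/* 0 until initialized */
static int model_saw_uninitialized;

static int model_init_zero_pfn(void)
{
	model_zero_pfn = 5120;	/* PFN value taken from the report above */
	return 0;
}

/* Stand-in for a core_initcall that ends up depending on zero_pfn. */
static int model_wq_sysfs_init(void)
{
	if (model_zero_pfn == 0)
		model_saw_uninitialized = 1;
	return 0;
}

/* Run every level in order; within a level, keep registration order. */
static int model_run_initcalls(const struct model_initcall *calls, size_t n)
{
	model_zero_pfn = 0;
	model_saw_uninitialized = 0;

	for (int level = MODEL_EARLY; level <= MODEL_CORE; level++)
		for (size_t i = 0; i < n; i++)
			if ((int)calls[i].level == level)
				calls[i].fn();

	return model_saw_uninitialized;
}
```

With both entries at the core level the consumer can observe PFN 0; once
the initializer moves to the early level it always runs first, whatever
the registration order.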

ML discussion: 
https://lore.kernel.org/lkml/CALCv0x2YqOXEAy2Q=hafjhHCtTHVodChv1qpM=niaxopqeb...@mail.gmail.com/

Signed-off-by: Ilya Lipnitskiy 
Cc: "Eric W. Biederman" 
Cc: sta...@vger.kernel.org
---
 mm/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 46ef306375bd..a8bbc4fc121f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -166,7 +166,7 @@ static int __init init_zero_pfn(void)
zero_pfn = page_to_pfn(ZERO_PAGE(0));
return 0;
 }
-core_initcall(init_zero_pfn);
+early_initcall(init_zero_pfn);
 
 void mm_trace_rss_stat(struct mm_struct *mm, int member, long count)
 {
-- 
2.31.0

[PATCH] tick/broadcast: Allow later probed device enter oneshot mode

2021-03-28 Thread Jindong Yue

The broadcast device is switched to oneshot mode in
hrtimer_switch_to_hres() -> tick_broadcast_switch_to_oneshot().
Once high resolution timers are enabled, a newly installed
broadcast device has no chance to switch modes.

This issue happens in the following situation:
To build a broadcast clock source driver as a module,
module_platform_driver() is used in place of TIMER_OF_DECLARE().
As a result, the clock source driver is probed after
high resolution timers have been enabled.

Signed-off-by: Jindong Yue 
---
 kernel/time/tick-broadcast.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index e51778c312f1..f38a7544cb5b 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -47,6 +47,7 @@ static inline void tick_resume_broadcast_oneshot(struct clock_event_device *bc)
 static inline void tick_broadcast_oneshot_offline(unsigned int cpu) { }
 # endif
 #endif
+static void tick_handle_oneshot_broadcast(struct clock_event_device *dev);
 
 /*
  * Debugging: see timer_list.c
@@ -115,8 +116,20 @@ void tick_install_broadcast_device(struct clock_event_device *dev)
 * notification the systems stays stuck in periodic mode
 * forever.
 */
-   if (dev->features & CLOCK_EVT_FEAT_ONESHOT)
+   if (dev->features & CLOCK_EVT_FEAT_ONESHOT) {
tick_clock_notify();
+
+   /*
+* If a new broadcast device is installed after high resolution
+* timers have been enabled, it can no longer switch to oneshot
+* mode on its own. Give it a chance here.
+*/
+   if (tick_broadcast_oneshot_active() &&
+   dev->event_handler != tick_handle_oneshot_broadcast) {
+   tick_broadcast_switch_to_oneshot();
+   }
+   }
+
 }
 
 /*
-- 
2.17.1

Re: [PATCH 00/30] DMA: Mundane typo fixes

2021-03-28 Thread Christoph Hellwig

I really don't think these typo patchbombs are that useful.  I'm all
for fixing typos when working with a subsystem, but I'm not sure these
patchbombs help anything.

On Mon, Mar 29, 2021 at 05:22:56AM +0530, Bhaskar Chowdhury wrote:
> This patch series fixes some trivial and rudimentary spellings in the COMMENT
> sections.
> 
> Bhaskar Chowdhury (30):
>   acpi-dma.c: Fix couple of typos
>   altera-msgdma.c: Couple of typos fixed
>   amba-pl08x.c: Fixed couple of typos
>   bcm-sba-raid.c: Few typos fixed
>   bcm2835-dma.c: Fix a typo
>   idma64.c: Fix couple of typos
>   iop-adma.c: Few typos fixed
>   mv_xor.c: Fix a typo
>   mv_xor.h: Fixed a typo
>   mv_xor_v2.c: Fix a typo
>   nbpfaxi.c: Fixed a typo
>   of-dma.c: Fixed a typo
>   s3c24xx-dma.c: Fix a typo
>   Revert "s3c24xx-dma.c: Fix a typo"
>   s3c24xx-dma.c: Few typos fixed
>   st_fdma.h: Fix couple of typos
>   ste_dma40_ll.h: Fix a typo
>   tegra20-apb-dma.c: Fixed a typo
>   xgene-dma.c: Few spello fixes
>   at_hdmac.c: Quite a few spello fixes
>   owl-dma.c: Fix a typo
>   at_hdmac_regs.h: Couple of typo fixes
>   dma-jz4780.c: Fix a typo
>   Kconfig: Change Synopsys to Synopsis
>   ste_dma40.c: Few spello fixes
>   dw-axi-dmac-platform.c: Few typos fixed
>   dpaa2-qdma.c: Fix a typo
>   usb-dmac.c: Fix a typo
>   edma.c: Fix a typo
>   xilinx_dma.c: Fix a typo
> 
>  drivers/dma/Kconfig|  8 
>  drivers/dma/acpi-dma.c |  4 ++--
>  drivers/dma/altera-msgdma.c|  4 ++--
>  drivers/dma/amba-pl08x.c   |  4 ++--
>  drivers/dma/at_hdmac.c | 14 +++---
>  drivers/dma/at_hdmac_regs.h|  4 ++--
>  drivers/dma/bcm-sba-raid.c |  8 
>  drivers/dma/bcm2835-dma.c  |  2 +-
>  drivers/dma/dma-jz4780.c   |  2 +-
>  drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c |  8 
>  drivers/dma/idma64.c   |  4 ++--
>  drivers/dma/iop-adma.c |  6 +++---
>  drivers/dma/mv_xor.c   |  2 +-
>  drivers/dma/mv_xor.h   |  2 +-
>  drivers/dma/mv_xor_v2.c|  2 +-
>  drivers/dma/nbpfaxi.c  |  2 +-
>  drivers/dma/of-dma.c   |  2 +-
>  drivers/dma/owl-dma.c  |  2 +-
>  drivers/dma/s3c24xx-dma.c  |  6 +++---
>  drivers/dma/sh/shdmac.c|  2 +-
>  drivers/dma/sh/usb-dmac.c  |  2 +-
>  drivers/dma/st_fdma.h  |  4 ++--
>  drivers/dma/ste_dma40.c| 10 +-
>  drivers/dma/ste_dma40_ll.h |  2 +-
>  drivers/dma/tegra20-apb-dma.c  |  2 +-
>  drivers/dma/ti/edma.c  |  2 +-
>  drivers/dma/xgene-dma.c|  6 +++---
>  drivers/dma/xilinx/xilinx_dma.c|  2 +-
>  28 files changed, 59 insertions(+), 59 deletions(-)
> 
> --
> 2.26.3
---end quoted text---

[PATCH] mm: fix race by making init_zero_pfn() early_initcall

2021-03-28 Thread Ilya Lipnitskiy

There are code paths that rely on zero_pfn to be fully initialized
before core_initcall. For example, wq_sysfs_init() is a core_initcall
function that eventually results in a call to kernel_execve, which
causes a page fault with a subsequent mmput. If zero_pfn is not
initialized by then it may not get cleaned up properly and result in an
error:
  BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1

Here is an analysis of the race as seen on a MIPS device. On this
particular MT7621 device (Ubiquiti ER-X), zero_pfn is PFN 0 until
initialized, at which point it becomes PFN 5120:
  1. wq_sysfs_init calls into kobject_uevent_env at core_initcall:
   [<80340dc8>] kobject_uevent_env+0x7e4/0x7ec
   [<8033f8b8>] kset_register+0x68/0x88
   [<803cf824>] bus_register+0xdc/0x34c
   [<803cfac8>] subsys_virtual_register+0x34/0x78
   [<8086afb0>] wq_sysfs_init+0x1c/0x4c
   [<80001648>] do_one_initcall+0x50/0x1a8
   [<8086503c>] kernel_init_freeable+0x230/0x2c8
   [<8066bca0>] kernel_init+0x10/0x100
   [<80003038>] ret_from_kernel_thread+0x14/0x1c

  2. kobject_uevent_env() calls call_usermodehelper_exec() which executes
 kernel_execve asynchronously.

  3. Memory allocations in kernel_execve cause a page fault, bumping the
 MM reference counter:
   [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
   [<80160d58>] handle_mm_fault+0x6e4/0xea0
   [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
   [<8015992c>] __get_user_pages_remote+0x128/0x360
   [<801a6d9c>] get_arg_page+0x34/0xa0
   [<801a7394>] copy_string_kernel+0x194/0x2a4
   [<801a880c>] kernel_execve+0x11c/0x298
   [<800420f4>] call_usermodehelper_exec_async+0x114/0x194

  4. In case zero_pfn has not been initialized yet, zap_pte_range does
 not decrement the MM_ANONPAGES RSS counter and the BUG message is
 triggered shortly afterwards when __mmdrop checks the ref counters:
   [<800285e8>] __mmdrop+0x98/0x1d0
   [<801a6de8>] free_bprm+0x44/0x118
   [<801a86a8>] kernel_execve+0x160/0x1d8
   [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
   [<80003198>] ret_from_kernel_thread+0x14/0x1c

To avoid races such as the one described above, run init_zero_pfn() at
early_initcall level. Depending on the architecture, ZERO_PAGE is either
constant or gets initialized even earlier, at paging_init, so there is
no issue with initializing zero_pfn earlier.

ML discussion: 
https://lore.kernel.org/lkml/CALCv0x2YqOXEAy2Q=hafjhHCtTHVodChv1qpM=niaxopqeb...@mail.gmail.com/

Signed-off-by: Ilya Lipnitskiy 
Cc: "Eric W. Biederman" 
---
 mm/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 46ef306375bd..a8bbc4fc121f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -166,7 +166,7 @@ static int __init init_zero_pfn(void)
zero_pfn = page_to_pfn(ZERO_PAGE(0));
return 0;
 }
-core_initcall(init_zero_pfn);
+early_initcall(init_zero_pfn);
 
 void mm_trace_rss_stat(struct mm_struct *mm, int member, long count)
 {
-- 
2.31.0

[PATCH v2] PCI: Disable D3cold support on Intel XMM7360

2021-03-28 Thread Kai-Heng Feng

On some platforms, the root port for Intel XMM7360 WWAN supports D3cold.
When the root port is put into D3cold by system suspend or runtime
suspend, any attempt at system resume or runtime resume freezes the
laptop for a while, after which it shuts down automatically.

The root cause is unclear for now, as the Intel XMM7360 doesn't have a
driver yet.

So disable D3cold for XMM7360 as a workaround until a proper device
driver is in place.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212419
Signed-off-by: Kai-Heng Feng 
---
v2:
 - Add comment.

 drivers/pci/quirks.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 653660e3ba9e..c48b0b4a4164 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5612,3 +5612,16 @@ static void apex_pci_fixup_class(struct pci_dev *pdev)
 }
 DECLARE_PCI_FIXUP_CLASS_HEADER(0x1ac1, 0x089a,
   PCI_CLASS_NOT_DEFINED, 8, apex_pci_fixup_class);
+
+/*
+ * Device [8086:7360]
+ * Resuming from D3cold freezes the system, which then shuts down.
+ * Currently there's no driver for XMM7360, so add it as a PCI quirk.
+ * https://bugzilla.kernel.org/show_bug.cgi?id=212419
+ */
+static void pci_fixup_no_d3cold(struct pci_dev *pdev)
+{
+   pci_info(pdev, "disable D3cold\n");
+   pci_d3cold_disable(pdev);
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x7360, pci_fixup_no_d3cold);
-- 
2.30.2

[PATCH 22/23] riscv: pmu.rst : A spello fix

2021-03-28 Thread Bhaskar Chowdhury

s/resonable/reasonable/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/riscv/pmu.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/riscv/pmu.rst b/Documentation/riscv/pmu.rst
index acb216b99c26..fde31b6aa861 100644
--- a/Documentation/riscv/pmu.rst
+++ b/Documentation/riscv/pmu.rst
@@ -168,7 +168,7 @@ counter (event->count), but also updates the left period to the next interrupt
 But the core of perf does not need direct write to counters.  Writing counters
 is hidden behind the abstraction of 1) *pmu->start*, literally start counting so one
 has to set the counter to a good value for the next interrupt; 2) inside the IRQ
-it should set the counter to the same resonable value.
+it should set the counter to the same reasonable value.

 Reading is not a problem in RISC-V but writing would need some effort, since
 counters are not allowed to be written by S-mode.
--
2.26.3

[PATCH 21/23] scheduler: sched-nice-design.rst: Fix a typo

2021-03-28 Thread Bhaskar Chowdhury

s/assymetry/asymmetry/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/scheduler/sched-nice-design.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/scheduler/sched-nice-design.rst 
b/Documentation/scheduler/sched-nice-design.rst
index 0571f1b47e64..3511d86575e7 100644
--- a/Documentation/scheduler/sched-nice-design.rst
+++ b/Documentation/scheduler/sched-nice-design.rst
@@ -60,7 +60,7 @@ within the constraints of HZ and jiffies and their nasty 
design level
 coupling to timeslices and granularity it was not really viable.

 The second (less frequent but still periodically occurring) complaint
-about Linux's nice level support was its assymetry around the origo
+about Linux's nice level support was its asymmetry around the origo
 (which you can see demonstrated in the picture above), or more
 accurately: the fact that nice level behavior depended on the _absolute_
 nice level as well, while the nice API itself is fundamentally
--
2.26.3

[PATCH 23/23] openrisc: openrisc_port.rst: Fix a typo

2021-03-28 Thread Bhaskar Chowdhury

s/enhancments/enhancements/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/openrisc/openrisc_port.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/openrisc/openrisc_port.rst 
b/Documentation/openrisc/openrisc_port.rst
index 657ac4af7be6..b3c6c5e258b0 100644
--- a/Documentation/openrisc/openrisc_port.rst
+++ b/Documentation/openrisc/openrisc_port.rst
@@ -114,7 +114,7 @@ History
port to 2.6.x

 30-11-2004 Matjaz Breskvar (phoe...@bsemi.com)
-   lots of bugfixes and enhancments.
+   lots of bugfixes and enhancements.
added opencores framebuffer driver.

 09-10-2010Jonas Bonn (jo...@southpole.se)
--
2.26.3

[PATCH 20/23] scheduler: sched-bwc.rst: Fix a typo

2021-03-28 Thread Bhaskar Chowdhury

s/simultanously/simultaneously/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/scheduler/sched-bwc.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/scheduler/sched-bwc.rst 
b/Documentation/scheduler/sched-bwc.rst
index 845eee659199..a7f9be925ab8 100644
--- a/Documentation/scheduler/sched-bwc.rst
+++ b/Documentation/scheduler/sched-bwc.rst
@@ -133,7 +133,7 @@ average usage, albeit over a longer time window than a 
single period.  This
 also limits the burst ability to no more than 1ms per cpu.  This provides
 better more predictable user experience for highly threaded applications with
 small quota limits on high core count machines. It also eliminates the
-propensity to throttle these applications while simultanously using less than
+propensity to throttle these applications while simultaneously using less than
 quota amounts of cpu. Another way to say this, is that by allowing the unused
 portion of a slice to remain valid across periods we have decreased the
 possibility of wastefully expiring quota on cpu-local silos that don't need a
--
2.26.3

[PATCH 19/23] scsi: ChangeLog.ncr53c8xx: Quite a few spello fixes

2021-03-28 Thread Bhaskar Chowdhury



A few trivial spelling fixes.

Signed-off-by: Bhaskar Chowdhury 
---
 A few lines appear to have modified themselves, meaning they were removed
 and added back. I have no clue why that happens.

 Documentation/scsi/ChangeLog.ncr53c8xx | 260 -
 1 file changed, 130 insertions(+), 130 deletions(-)

diff --git a/Documentation/scsi/ChangeLog.ncr53c8xx 
b/Documentation/scsi/ChangeLog.ncr53c8xx
index 9288e3d8974a..0dacfeef22a5 100644
--- a/Documentation/scsi/ChangeLog.ncr53c8xx
+++ b/Documentation/scsi/ChangeLog.ncr53c8xx
@@ -9,8 +9,8 @@ Mon Feb 12 22:30 2001 Gerard Roudier (groud...@club-internet.fr)
- Call pci_enable_device() as AC wants this to be done.
- Get both the BAR cookies actual and PCI BAR values.
  (see Changelog.sym53c8xx rev. 1.7.3 for details)
-   - Merge changes for linux-2.4 that declare the host template
- in the driver object also when the driver is statically
+   - Merge changes for linux-2.4 that declare the host template
+ in the driver object also when the driver is statically
  linked with the kernel.

 Sun Sep 24 21:30 2000 Gerard Roudier (groud...@club-internet.fr)
@@ -25,34 +25,34 @@ Wed Jul 26 23:30 2000 Gerard Roudier 
(groud...@club-internet.fr)
 Sun Jul 09 16:30 2000 Gerard Roudier (groud...@club-internet.fr)
* version ncr53c8xx-3.4.0
- Remove the PROFILE C and SCRIPTS code.
- This facility was not this useful and thus was not longer
+ This facility was not this useful and thus was not longer
  desirable given the increasing complexity of the driver code.
- Merges from FreeBSD sym-1.6.2 driver:
- * Clarify memory barriers needed by the driver for architectures
+ * Clarify memory barriers needed by the driver for architectures
that implement a weak memory ordering.
- General cleanup:
- Move definitions for barriers and IO/MMIO operations to the
- sym53c8xx_defs.h header files. They are now shared by the
+ Move definitions for barriers and IO/MMIO operations to the
+ sym53c8xx_defs.h header files. They are now shared by the
  both drivers.
  Use SCSI_NCR_IOMAPPED instead of NCR_IOMAPPED.

 Thu May 11   12:30 2000 Pam Delaney (pam.dela...@lsil.com)
* revision 3.3b
-
+
 Mon Apr 24 12:00 2000 Gerard Roudier (groud...@club-internet.fr)
* revision 3.2i
- Return value 1 (instead of 0) from the driver setup routine.
-   - Let the driver also attach controllers that have been set to
+   - Let the driver also attach controllers that have been set to
  OFF in the NVRAM as it did prior to revision 3.2g.

 Sat Apr 1  12:00 2000 Gerard Roudier (groud...@club-internet.fr)
* revision 3.2h
- Fix a compilation problem on Alpha introduced in version 3.2g.
   (`port' changed to `base_io').
-   - Move from `sym' to this driver a tiny change for __sparc__ that
+   - Move from `sym' to this driver a tiny change for __sparc__ that
  applies to cache line size (? Probably from David S Miller).
-   - Make sure no data transfer will happen for Scsi_Cmnd requests
- that supply SCSI_DATA_NONE direction (this avoids some BUG()
+   - Make sure no data transfer will happen for Scsi_Cmnd requests
+ that supply SCSI_DATA_NONE direction (this avoids some BUG()
  statement in the PCI code when a data buffer is also supplied).

 Thu Mar 16   9:30 2000 Pam Delaney (pam.dela...@lsil.com)
@@ -62,10 +62,10 @@ Thu Mar 16   9:30 2000 Pam Delaney (pam.dela...@lsil.com)

 Mon March 6  23:15 2000 Gerard Roudier (groud...@club-internet.fr)
* revision 3.2g
-   - Add the file sym53c8xx_comm.h that collects code that should
- be shared by sym53c8xx and ncr53c8xx drivers. For now, it is
- a header file that is only included by the ncr53c8xx driver,
- but things will be cleaned up later. This code addresses
+   - Add the file sym53c8xx_comm.h that collects code that should
+ be shared by sym53c8xx and ncr53c8xx drivers. For now, it is
+ a header file that is only included by the ncr53c8xx driver,
+ but things will be cleaned up later. This code addresses
  notably:
  * Chip detection and PCI related initialisations
  * NVRAM detection and reading
@@ -74,7 +74,7 @@ Mon March 6  23:15 2000 Gerard Roudier 
(groud...@club-internet.fr)
  * And some other ...
- Add support for the new dynamic dma mapping kernel interface.
  Requires Linux-2.3.47 (tested with pre-2.3.47-6).
-   - Get data transfer direction from the scsi command structure
+   - Get data transfer direction from the scsi command structure
  (Scsi_Cmnd) when this information is available.

 Mon March 6  23:15 2000 Gerard Roudier (groud...@club-internet.fr)
@@ -96,12 +96,12 @@ Mon March 6  23:15 2000 Gerard Roudier

[PATCH 17/23] sparc: dax-hv-api.txt: Fix quite a few spellos

2021-03-28 Thread Bhaskar Chowdhury



s/indicies/indices/
s/retricted/restricted/
s/indentifier/identifier/   two different places.
s/proccessed/processed/ three different places.


Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/sparc/oradax/dax-hv-api.txt | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/Documentation/sparc/oradax/dax-hv-api.txt 
b/Documentation/sparc/oradax/dax-hv-api.txt
index 73e8d506cf64..300fbb58ad04 100644
--- a/Documentation/sparc/oradax/dax-hv-api.txt
+++ b/Documentation/sparc/oradax/dax-hv-api.txt
@@ -742,7 +742,7 @@ Offset   Size   Field Description
 code in the CCB header.

 There are two supported formats for the output stream: the bit vector 
and index array formats (codes 0x8,
-0xD, and 0xE). The index array format is an array of indicies of bits 
which would have been set if the
+0xD, and 0xE). The index array format is an array of indices of bits 
which would have been set if the
 output format was a bit array.

 The return value of the CCB completion area contains the number of 
bits set in the output bit vector,
@@ -1254,7 +1254,7 @@ EUNAVAILABLE   The requested CCB operation could not be 
performed at this time.
submitted CCB, or may apply to a larger scope. The status 
should not be
interpreted as permanent, and the guest should attempt to 
submit CCBs in
the future which had previously been unable to be performed. 
The status
-   data provides additional information about scope of the 
retricted availability
+   data provides additional information about scope of the 
restricted availability
as follows:
Value   Description
0   Processing for the exact CCB instance submitted was 
unavailable,
@@ -1330,20 +1330,20 @@ EUNAVAILABLE   The requested CCB operation could not be 
performed at this time.
  of other CCBs ahead of the requested CCB, to provide a relative 
estimate of when the CCB may execute.

  The dax return value is only valid when the state is ENQUEUED. The 
value returned is the DAX unit
- instance indentifier for the DAX unit processing the queue where the 
requested CCB is located. The value
+ instance identifier for the DAX unit processing the queue where the 
requested CCB is located. The value
  matches the value that would have been, or was, returned by 
ccb_submit using the queue info flag.

  The queue return value is only valid when the state is ENQUEUED. The 
value returned is the DAX
- queue instance indentifier for the DAX unit processing the queue 
where the requested CCB is located. The
+ queue instance identifier for the DAX unit processing the queue where 
the requested CCB is located. The
  value matches the value that would have been, or was, returned by 
ccb_submit using the queue info flag.

 36.3.2.1. Errors

-  EOK   The request was proccessed and the CCB 
state is valid.
+  EOK   The request was processed and the CCB 
state is valid.
   EBADALIGN address is not on a 64-byte aligned.
   ENORADDR  The real address provided for address is 
not valid.
   EINVALThe CCB completion area contents are not 
valid.
-  EWOULDBLOCK   Internal resource contraints prevented the 
CCB state from being queried at this
+  EWOULDBLOCK   Internal resource constraints prevented 
the CCB state from being queried at this
 time. The guest should retry the request.
   ENOACCESS The guest does not have permission to 
access the coprocessor virtual device
 functionality.
@@ -1401,11 +1401,11 @@ EUNAVAILABLE   The requested CCB operation could not be 
performed at this time.

 36.3.3.2. Errors

-  EOKThe request was proccessed and the result 
is valid.
+  EOKThe request was processed and the result 
is valid.
   EBADALIGN  address is not on a 64-byte aligned.
   ENORADDR   The real address provided for address is 
not valid.
   EINVAL The CCB completion area contents are not 
valid.
-  EWOULDBLOCKInternal resource contraints prevented 
the CCB from being killed at this time.
+  EWOULDBLOCKInternal resource constraints prevented 
the CCB from being killed at this time.
  The guest should retry the request.
   ENOACCESS  The guest does not have permission to 
access the coprocessor virtual device
  functionality.
@@ -1423,7 +1423,7 @@ EUNAVAILABLE   The
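
The dax-hv-api text quoted in the first hunk above describes two output formats: a bit vector, and an array of the indices of the bits that would have been set in that vector. A minimal sketch of that relationship in plain C follows. The function name, the 32-bit index width, and the least-significant-bit-first ordering are assumptions made for illustration, not details taken from the DAX specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert a bit vector into an index array.
 * Assumptions (hypothetical, not from the DAX spec): bit i lives in
 * byte i / 8 at position i % 8, and indices are stored as uint32_t.
 * Returns the number of set bits; at most `cap` indices are written.
 */
static size_t bits_to_indices(const uint8_t *bits, size_t nbits,
			      uint32_t *out, size_t cap)
{
	size_t count = 0;
	size_t i;

	assert(bits != NULL && out != NULL);
	for (i = 0; i < nbits; i++) {
		if (bits[i / 8] & (1u << (i % 8))) {
			if (count < cap)
				out[count] = (uint32_t)i;
			count++;
		}
	}
	return count;
}
```

The return value counts every set bit even when `out` is too small, which loosely matches the quoted statement that the completion area reports the number of bits set.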

[PATCH 18/23] security: core.rst: Fixed a spello

2021-03-28 Thread Bhaskar Chowdhury

s/implemenation/implementation/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/security/keys/core.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/security/keys/core.rst 
b/Documentation/security/keys/core.rst
index b3ed5c581034..d66ad592c6cc 100644
--- a/Documentation/security/keys/core.rst
+++ b/Documentation/security/keys/core.rst
@@ -869,7 +869,7 @@ The keyctl syscall functions are:

 - ``char *hashname`` specifies the NUL terminated string identifying
   the hash used from the kernel crypto API and applied for the KDF
-  operation. The KDF implemenation complies with SP800-56A as well
+  operation. The KDF implementation complies with SP800-56A as well
   as with SP800-108 (the counter KDF).

 - ``char *otherinfo`` specifies the OtherInfo data as documented in
--
2.26.3

[PATCH 16/23] sparc: oradax/oracle-dax.rst: Fix a typo

2021-03-28 Thread Bhaskar Chowdhury

s/discontiguities/discontinuities/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/sparc/oradax/oracle-dax.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/sparc/oradax/oracle-dax.rst 
b/Documentation/sparc/oradax/oracle-dax.rst
index d1e14d572918..54ccb35ed51d 100644
--- a/Documentation/sparc/oradax/oracle-dax.rst
+++ b/Documentation/sparc/oradax/oracle-dax.rst
@@ -197,7 +197,7 @@ Memory Constraints
 ==

 The DAX hardware operates only on physical addresses. Therefore, it is
-not aware of virtual memory mappings and the discontiguities that may
+not aware of virtual memory mappings and the discontinuities that may
 exist in the physical memory that a virtual buffer maps to. There is
 no I/O TLB or any scatter/gather mechanism. All buffers, whether input
 or output, must reside in a physically contiguous region of memory.
--
2.26.3

[PATCH 13/23] trace: hwlat_detector.rst: Fix a typo

2021-03-28 Thread Bhaskar Chowdhury

s/occuring/occurring/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/trace/hwlat_detector.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/trace/hwlat_detector.rst 
b/Documentation/trace/hwlat_detector.rst
index 5739349649c8..f7811e2ddf34 100644
--- a/Documentation/trace/hwlat_detector.rst
+++ b/Documentation/trace/hwlat_detector.rst
@@ -14,7 +14,7 @@ originally written for use by the "RT" patch since the Real 
Time
 kernel is highly latency sensitive.

 SMIs are not serviced by the Linux kernel, which means that it does not
-even know that they are occuring. SMIs are instead set up by BIOS code
+even know that they are occurring. SMIs are instead set up by BIOS code
 and are serviced by BIOS code, usually for "critical" events such as
 management of thermal sensors and fans. Sometimes though, SMIs are used for
 other tasks and those tasks can spend an inordinate amount of time in the
--
2.26.3

[PATCH 14/23] trace: ftrace-uses.rst: Fix a typo

2021-03-28 Thread Bhaskar Chowdhury

s/preemptable/preemptible/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/trace/ftrace-uses.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/trace/ftrace-uses.rst 
b/Documentation/trace/ftrace-uses.rst
index f7d98ae5b885..2903a58f5ac2 100644
--- a/Documentation/trace/ftrace-uses.rst
+++ b/Documentation/trace/ftrace-uses.rst
@@ -193,7 +193,7 @@ FTRACE_OPS_FL_RECURSION
Not, if this flag is set, then the callback will always be called
with preemption disabled. If it is not set, then it is possible
(but not guaranteed) that the callback will be called in
-   preemptable context.
+   preemptible context.

 FTRACE_OPS_FL_IPMODIFY
Requires FTRACE_OPS_FL_SAVE_REGS set. If the callback is to "hijack"
--
2.26.3

[PATCH 15/23] trace: events.rst: Fix a typo

2021-03-28 Thread Bhaskar Chowdhury

s/specfied/specified/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/trace/events.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
index 8ddb9b09451c..80052adc592d 100644
--- a/Documentation/trace/events.rst
+++ b/Documentation/trace/events.rst
@@ -884,7 +884,7 @@ functions can be used.

 To create a kprobe event, an empty or partially empty kprobe event
 should first be created using kprobe_event_gen_cmd_start().  The name
-of the event and the probe location should be specfied along with one
+of the event and the probe location should be specified along with one
 or args each representing a probe field should be supplied to this
 function.  Before calling kprobe_event_gen_cmd_start(), the user
 should create and initialize a dynevent_cmd object using
--
2.26.3

[PATCH 12/23] v4l: hist-v4l2.rst: Fix a typo

2021-03-28 Thread Bhaskar Chowdhury



s/directon/direction/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/userspace-api/media/v4l/hist-v4l2.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/userspace-api/media/v4l/hist-v4l2.rst b/Documentation/userspace-api/media/v4l/hist-v4l2.rst
index 28a2750d5c8c..7061496126ad 100644
--- a/Documentation/userspace-api/media/v4l/hist-v4l2.rst
+++ b/Documentation/userspace-api/media/v4l/hist-v4l2.rst
@@ -47,7 +47,7 @@ Codec API was released.
 1998-11-08: Many minor changes. Most symbols have been renamed. Some
 material changes to struct v4l2_capability.

-1998-11-12: The read/write directon of some ioctls was misdefined.
+1998-11-12: The read/write direction of some ioctls was misdefined.

 1998-11-14: ``V4L2_PIX_FMT_RGB24`` changed to ``V4L2_PIX_FMT_BGR24``,
 and ``V4L2_PIX_FMT_RGB32`` changed to ``V4L2_PIX_FMT_BGR32``. Audio
--
2.26.3

[PATCH 09/23] virt: kvm: /mmu.rst: Fix a typo

2021-03-28 Thread Bhaskar Chowdhury

s/unsychronized/unsynchronized/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/virt/kvm/mmu.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/mmu.rst b/Documentation/virt/kvm/mmu.rst
index 5bfe28b0728e..e6c525280813 100644
--- a/Documentation/virt/kvm/mmu.rst
+++ b/Documentation/virt/kvm/mmu.rst
@@ -244,7 +244,7 @@ Shadow pages contain the following information:
 unsynchronized children).
   unsync_child_bitmap:
 A bitmap indicating which sptes in spt point (directly or indirectly) at
-pages that may be unsynchronized.  Used to quickly locate all unsychronized
+pages that may be unsynchronized.  Used to quickly locate all unsynchronized
 pages reachable from a given page.
   clear_spte_count:
 Only present on 32-bit hosts, where a 64-bit spte cannot be written
--
2.26.3

[PATCH 10/23] virt: kvm: halt-polling.rst: Fixed a typo

2021-03-28 Thread Bhaskar Chowdhury

s/dependant/dependent/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/virt/kvm/halt-polling.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/halt-polling.rst 
b/Documentation/virt/kvm/halt-polling.rst
index 4922e4a15f18..c428d319de45 100644
--- a/Documentation/virt/kvm/halt-polling.rst
+++ b/Documentation/virt/kvm/halt-polling.rst
@@ -14,7 +14,7 @@ before giving up the cpu to the scheduler in order to let 
something else run.
 Polling provides a latency advantage in cases where the guest can be run again
 very quickly by at least saving us a trip through the scheduler, normally on
 the order of a few micro-seconds, although performance benefits are workload
-dependant. In the event that no wakeup source arrives during the polling
+dependent. In the event that no wakeup source arrives during the polling
 interval or some other task on the runqueue is runnable the scheduler is
 invoked. Thus halt polling is especially useful on workloads with very short
 wakeup periods where the time spent halt polling is minimised and the time
--
2.26.3

[PATCH 11/23] virt: user_mode_linux_howto_v2.rst: Few typo fixes

2021-03-28 Thread Bhaskar Chowdhury

s/absense/absence/
s/sripts/scripts/
s/resultion/resolution/
s/desireable/desirable/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/virt/uml/user_mode_linux_howto_v2.rst | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/virt/uml/user_mode_linux_howto_v2.rst 
b/Documentation/virt/uml/user_mode_linux_howto_v2.rst
index 312e431695d9..0e1371c3e235 100644
--- a/Documentation/virt/uml/user_mode_linux_howto_v2.rst
+++ b/Documentation/virt/uml/user_mode_linux_howto_v2.rst
@@ -317,7 +317,7 @@ Shared Options
 * ``v6=[0,1]`` to specify if a v6 connection is desired for all
   transports which operate over IP. Additionally, for transports that
   have some differences in the way they operate over v4 and v6 (for example
-  EoL2TPv3), sets the correct mode of operation. In the absense of this
+  EoL2TPv3), sets the correct mode of operation. In the absence of this
   option, the socket type is determined based on what do the src and dst
   arguments resolve/parse to.

@@ -726,7 +726,7 @@ kernel.  When you boot UML, you'll see a line like::

mconsole initialized on /home/jdike/.uml/umlNJ32yL/mconsole

-If you specify a unique machine id one the UML command line, i.e.
+If you specify a unique machine id one the UML command line, i.e.
 ``umid=debian``, you'll see this::

mconsole initialized on /home/jdike/.uml/debian/mconsole
@@ -1073,7 +1073,7 @@ If you have something to contribute such as a patch, a 
bugfix, a
 new feature, please send it to ``linux...@lists.infradead.org``

 Please follow all standard Linux patch guidelines such as cc-ing
-relevant maintainers and run ``./sripts/checkpatch.pl`` on your patch.
+relevant maintainers and run ``./scripts/checkpatch.pl`` on your patch.
 For more details see ``Documentation/process/submitting-patches.rst``

 Note - the list does not accept HTML or attachments, all emails must
@@ -1131,7 +1131,7 @@ This is a typical picture from a mostly idle UML instance
 * The sequence of ptrace calls is part of MMU emulation and runnin the
   UML userspace
 * ``timer_settime`` is part of the UML high res timer subsystem mapping
-  timer requests from inside UML onto the host high resultion timers.
+  timer requests from inside UML onto the host high resolution timers.
 * ``clock_nanosleep`` is UML going into idle (similar to the way a PC
   will execute an ACPI idle).

@@ -1195,7 +1195,7 @@ between a driver and the host at the UML command line is 
OK
 security-wise. Allowing it as a loadable module parameter
 isn't.

-If such functionality is desireable for a particular application
+If such functionality is desirable for a particular application
 (e.g. loading BPF "firmware" for raw socket network transports),
 it should be off by default and should be explicitly turned on
 as a command line parameter at startup.
--
2.26.3

[PATCH 06/23] vm: hwpoison.rst: A typo fix

2021-03-28 Thread Bhaskar Chowdhury

s/expection/expectation/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/vm/hwpoison.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/vm/hwpoison.rst b/Documentation/vm/hwpoison.rst
index a5c884293dac..727f080073dd 100644
--- a/Documentation/vm/hwpoison.rst
+++ b/Documentation/vm/hwpoison.rst
@@ -50,7 +50,7 @@ of applications. KVM support requires a recent qemu-kvm release.
 For the KVM use there was need for a new signal type so that
 KVM can inject the machine check into the guest with the proper
 address. This in theory allows other applications to handle
-memory failures too. The expection is that near all applications
+memory failures too. The expectation is that near all applications
 won't do that, but some very specialized ones might.

 Failure recovery modes
--
2.26.3

[PATCH 05/23] vm: unevictable-lru.rst: Fix a typo

2021-03-28 Thread Bhaskar Chowdhury

s/mmaped/mapped/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/vm/unevictable-lru.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/vm/unevictable-lru.rst 
b/Documentation/vm/unevictable-lru.rst
index 0e1490524f53..74e7d5ded114 100644
--- a/Documentation/vm/unevictable-lru.rst
+++ b/Documentation/vm/unevictable-lru.rst
@@ -467,7 +467,7 @@ In addition the mlock()/mlockall() system calls, an 
application can request
 that a region of memory be mlocked supplying the MAP_LOCKED flag to the mmap()
 call. There is one important and subtle difference here, though. mmap() + 
mlock()
 will fail if the range cannot be faulted in (e.g. because mm_populate fails)
-and returns with ENOMEM while mmap(MAP_LOCKED) will not fail. The mmaped
+and returns with ENOMEM while mmap(MAP_LOCKED) will not fail. The mapped
 area will still have properties of the locked area - aka. pages will not get
 swapped out - but major page faults to fault memory in might still happen.

--
2.26.3
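
The unevictable-lru text quoted in the hunk above contrasts mmap() followed by mlock() with a single mmap(MAP_LOCKED) call. The two call sequences can be sketched as below. This is a Linux-specific sketch: the helper names are hypothetical, and whether locking succeeds depends on RLIMIT_MEMLOCK and privileges on the machine that runs it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS and MAP_LOCKED on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * mmap() + mlock(): mlock() returns an error if the range cannot be
 * locked, so the caller learns about the failure. Returns NULL then.
 */
static void *map_then_mlock(size_t len)
{
	void *p;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;
	if (mlock(p, len) != 0) {
		munmap(p, len);
		return NULL;
	}
	return p;
}

/*
 * mmap(MAP_LOCKED): per the quoted text, this call does not fail when
 * the range cannot be faulted in, so a non-NULL result does not prove
 * that every page is resident yet.
 */
static void *map_locked_flag(size_t len)
{
	void *p;

	assert(len > 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}
```

A caller that needs to know the pages really are resident would use the first helper; the second is shorter but leaves room for the later major faults the quoted text mentions.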

[PATCH 08/23] virt: kvm: vm.rst: Fix a typo

2021-03-28 Thread Bhaskar Chowdhury

s/imlemented/implemented/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/virt/kvm/devices/vm.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/devices/vm.rst 
b/Documentation/virt/kvm/devices/vm.rst
index 0aa5b1cfd700..3eb1abf505c9 100644
--- a/Documentation/virt/kvm/devices/vm.rst
+++ b/Documentation/virt/kvm/devices/vm.rst
@@ -92,7 +92,7 @@ Allows user space to retrieve or request to change cpu 
related information for a
 KVM does not enforce or limit the cpu model data in any form. Take the 
information
 retrieved by means of KVM_S390_VM_CPU_MACHINE as hint for reasonable 
configuration
 setups. Instruction interceptions triggered by additionally set facility bits 
that
-are not handled by KVM need to by imlemented in the VM driver code.
+are not handled by KVM need to by implemented in the VM driver code.

 :Parameters: address of buffer to store/set the processor related cpu
 data of type struct kvm_s390_vm_cpu_processor*.
--
2.26.3

[PATCH 07/23] vm: hwpoison.rst: A typo fix

2021-03-28 Thread Bhaskar Chowdhury

s/focusses/focuses/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/vm/hwpoison.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/vm/hwpoison.rst b/Documentation/vm/hwpoison.rst
index 727f080073dd..7840463a0e9a 100644
--- a/Documentation/vm/hwpoison.rst
+++ b/Documentation/vm/hwpoison.rst
@@ -19,7 +19,7 @@ To quote the overview comment::
hardware as being corrupted usually due to a 2bit ECC memory or cache
failure.

-   This focusses on pages detected as corrupted in the background.
+   This focuses on pages detected as corrupted in the background.
When the current CPU tries to consume corruption the currently
running process can just be killed directly instead. This implies
that if the error cannot be handled for some reason it's safe to
--
2.26.3

[PATCH 04/23] w1-generic.rst: A typo fix

2021-03-28 Thread Bhaskar Chowdhury

s/beetwen/between/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/w1/w1-generic.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/w1/w1-generic.rst b/Documentation/w1/w1-generic.rst
index da4e8b4e9b01..229b16b6399b 100644
--- a/Documentation/w1/w1-generic.rst
+++ b/Documentation/w1/w1-generic.rst
@@ -101,7 +101,7 @@ w1_master_search  (rw) the number of searches left 
to do,
 w1_master_slave_count (ro) the number of slaves found
 w1_master_slaves  (ro) the names of the slaves, one per line
 w1_master_timeout (ro) the delay in seconds between searches
-w1_master_timeout_us  (ro) the delay in microseconds beetwen searches
+w1_master_timeout_us  (ro) the delay in microseconds between searches
 = =

 If you have a w1 bus that never changes (you don't add or remove devices),
--
2.26.3

[PATCH 02/23] w1: ds2482.rst: A typo fix

2021-03-28 Thread Bhaskar Chowdhury

s/busses/buses/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/w1/masters/ds2482.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/w1/masters/ds2482.rst 
b/Documentation/w1/masters/ds2482.rst
index 17ebe8f660cd..5862024e4b15 100644
--- a/Documentation/w1/masters/ds2482.rst
+++ b/Documentation/w1/masters/ds2482.rst
@@ -22,7 +22,7 @@ Description
 ---

 The Maxim/Dallas Semiconductor DS2482 is a I2C device that provides
-one (DS2482-100) or eight (DS2482-800) 1-wire busses.
+one (DS2482-100) or eight (DS2482-800) 1-wire buses.


 General Remarks
--
2.26.3

[PATCH 03/23] w1-netlink.rst: A typo fix

2021-03-28 Thread Bhaskar Chowdhury

s/strucutre/structure/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/w1/w1-netlink.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/w1/w1-netlink.rst b/Documentation/w1/w1-netlink.rst
index aaa13243a5e4..be4f7b82dcb4 100644
--- a/Documentation/w1/w1-netlink.rst
+++ b/Documentation/w1/w1-netlink.rst
@@ -66,7 +66,7 @@ Each connector message can include one or more w1_netlink_msg 
with
 zero or more attached w1_netlink_cmd messages.

 For event messages there are no w1_netlink_cmd embedded structures,
-only connector header and w1_netlink_msg strucutre with "len" field
+only connector header and w1_netlink_msg structure with "len" field
 being zero and filled type (one of event types) and id:
 either 8 bytes of slave unique id in host order,
 or master's id, which is assigned to bus master device
--
2.26.3

[PATCH 00/23] docs: Mundane typo fixes

2021-03-28 Thread Bhaskar Chowdhury

This patch series fixes trivial spelling mistakes.

Bhaskar Chowdhury (23):
  atomctl.rst: A typo fix
  w1: ds2482.rst: A typo fix
  w1-netlink.rst: A typo fix
  w1-generic.rst: A typo fix
  vm: unevictable-lru.rst: Fix a typo
  vm: hwpoison.rst: A typo fix
  vm: hwpoison.rst: A typo fix
  virt: kvm: vm.rst: Fix a typo
  virt: kvm: /mmu.rst: Fix a typo
  virt: kvm: halt-polling.rst: Fixed a typo
  virt: user_mode_linux_howto_v2.rst: Few typo fixes
  v4l: hist-v4l2.rst: Fix a typo
  trace: hwlat_detector.rst: Fix a typo
  trace: ftrace-uses.rst: Fix a typo
  trace: events.rst: Fix a typo
  sparc: oradax/oracle-dax.rst: Fix a typo
  sparc: dax-hv-api.txt: Fix quite a few spellos
  security: core.rst: Fixed a spello
  scsi: ChangeLog.ncr53c8xx: Quite a few spello fixes
  scheduler: sched-bwc.rst: Fix a typo
  scheduler: sched-nice-design.rst: Fix a typo
  riscv: pmu.rst : A spello fix
  openrisc: openrisc_port.rst: Fix a typo

 Documentation/openrisc/openrisc_port.rst  |   2 +-
 Documentation/riscv/pmu.rst   |   2 +-
 Documentation/scheduler/sched-bwc.rst |   2 +-
 Documentation/scheduler/sched-nice-design.rst |   2 +-
 Documentation/scsi/ChangeLog.ncr53c8xx| 260 +-
 Documentation/security/keys/core.rst  |   2 +-
 Documentation/sparc/oradax/dax-hv-api.txt |  18 +-
 Documentation/sparc/oradax/oracle-dax.rst |   2 +-
 Documentation/trace/events.rst|   2 +-
 Documentation/trace/ftrace-uses.rst   |   2 +-
 Documentation/trace/hwlat_detector.rst|   2 +-
 .../userspace-api/media/v4l/hist-v4l2.rst |   2 +-
 Documentation/virt/kvm/devices/vm.rst |   2 +-
 Documentation/virt/kvm/halt-polling.rst   |   2 +-
 Documentation/virt/kvm/mmu.rst|   2 +-
 .../virt/uml/user_mode_linux_howto_v2.rst |  10 +-
 Documentation/vm/hwpoison.rst |   4 +-
 Documentation/vm/unevictable-lru.rst  |   2 +-
 Documentation/w1/masters/ds2482.rst   |   2 +-
 Documentation/w1/w1-generic.rst   |   2 +-
 Documentation/w1/w1-netlink.rst   |   2 +-
 Documentation/xtensa/atomctl.rst  |   2 +-
 22 files changed, 164 insertions(+), 164 deletions(-)

--
2.26.3

[PATCH 01/23] atomctl.rst: A typo fix

2021-03-28 Thread Bhaskar Chowdhury

s/controlers/controllers/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/xtensa/atomctl.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/xtensa/atomctl.rst b/Documentation/xtensa/atomctl.rst
index 1ecbd0ba9a2e..a0efab2abe8f 100644
--- a/Documentation/xtensa/atomctl.rst
+++ b/Documentation/xtensa/atomctl.rst
@@ -23,7 +23,7 @@ doing a Cached (WB) transaction and use the Memory RCW for 
un-cached
 operations.

 For systems without an coherent cache controller, non-MX, we always
-use the memory controllers RCW, thought non-MX controlers likely
+use the memory controllers RCW, thought non-MX controllers likely
 support the Internal Operation.

 CUSTOMER-WARNING:
--
2.26.3

Re: [PATCH v7 1/5] misc: Add Synopsys DesignWare xData IP driver

2021-03-28 Thread Greg Kroah-Hartman

On Sun, Mar 28, 2021 at 09:06:47PM +, Gustavo Pimentel wrote:
> > > +static const struct pci_device_id dw_xdata_pcie_id_table[] = {
> > > + { PCI_DEVICE_DATA(SYNOPSYS, EDDA, &snps_edda_data) },
> > 
> > Why do you need a pointer to snps_edda_data here?
> 
> The structure snps_edda_data indicates the location of this IP block (BAR 
> and offset) for this particular endpoint.
> It's very likely in the future to be more variants that for HW design 
> reasons might require this IP block to be on a different location.

Then make the change when that happens sometime in the future.  Don't
add unneeded complexity today, that just makes the code harder to review
by us now, and for you to maintain today.

thanks,

greg k-h

Re: Compiling kernel-3.4.xxx with gcc-9.x. Need some help.

2021-03-28 Thread Greg KH

On Sun, Mar 28, 2021 at 10:20:50PM +0200, Fawad Lateef wrote:
> Hi
> 
> I am using an Olimex A20 SOM with NAND and due to some binary blob for
> NAND driver, I am stuck with the sunxi kernel 3.4.xxx version. (Repo
> here: https://github.com/linux-sunxi/linux-sunxi)

Please work with the vendor that is forcing you to use this obsolete and
insecure kernel version.  You are paying for that support, and they are
the only ones that can support you.

> I am currently using buildroot-2016 and gcc-5.5 for building the
> kernel and every other package needed.
> 
> Now the requirement is to move to the latest version of gcc-9.x, so
> that we can have glibc++ provided by the gcc-9.1 toolchain.
> 
> Main problem for moving to later versions of buildroot is the kernel
> 3.4 which we couldn't get to work with gcc-6 a few years ago _but_ now the
> gcc-9.1 requirement is mandatory so now we have to look into compiling
> linux-3.4 with gcc-9.1 or above.
> 
> Now I need some help.
> 
> -- Is it realistic to expect 3.4 kernel compiling and boot
> successfully with gcc-9.1?

No.  It took a lot of work and effort just to get the 4.4.y kernel to
work with that gcc version.

Again, please work with the company that you are already paying for
support from, they can do this for you.

good luck!

greg k-h

Re: [PATCH 2/2] cpupower: fix amd cpu (family >= 0x17) active state issue

2021-03-28 Thread xufuhai

Hi Shuah, thanks for your reply.

This issue is not about "current CPU frequency" but about the "Active" field
under "boost state support" in the output of the "cpupower frequency-info"
command, as shown below:

boost state support:
Supported: yes
Active: yes/no

I think the command should report the state for AMD CPUs (family >= 0x17)
as follows:

as non-root Active state:
Active: Error while evaluating Boost Capabilities on CPU xx -- are you root?

as root Active state:
Active: yes (if supported)
no  (if not supported)

I do not want the state to be reported like this:

as non-root Active state:
Active: no

as root Active state:
Active: yes (if supported)
no  (if not supported)

I will paste the related code below to show the issue in detail.

If the AMD CPU family is >= 0x17, read_msr() is called, which reads
/dev/cpu/%d/msr. A non-root caller cannot open the msr node because it has
no permission, but cpufreq_has_boost_support() still returns 0 to its
caller, which does not tell the caller that it had no permission to read
/dev/cpu/%d/msr. I believe we should return a negative value in this case:

-
/linux/tools/power/cpupower/utils/helpers/misc.c
cpufreq_has_boost_support:

if (cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_CPB) {
*support = 1;

/* AMD Family 0x17 does not utilize PCI D18F4 like prior
 * families and has no fixed discrete boost states but
 * has Hardware determined variable increments instead.
 */

if (cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_CPB_MSR) {
if (!read_msr(cpu, MSR_AMD_HWCR, &val)) {
if (!(val & CPUPOWER_AMD_CPBDIS))
*active = 1;
}
} else {
ret = amd_pci_get_num_boost_states(active, states);
if (ret)
return ret;
}
} else if (cpupower_cpu_info.caps & CPUPOWER_CAP_INTEL_IDA)
*support = *active = 1;
return 0;
---
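
To make the proposal concrete, here is a minimal userspace sketch of the error propagation described above. It is not the cpupower code: fake_read_msr() and boost_active_from_msr() are hypothetical stand-ins, and the AMD_CPBDIS bit value is assumed for illustration only.

```c
#include <stdint.h>

/* Assumed bit position of the "boost disabled" flag, for illustration only. */
#define AMD_CPBDIS (1ULL << 25)

/*
 * Hypothetical stand-in for read_msr(): returns -1 when the caller is not
 * permitted to open the MSR node, 0 on success.
 */
static int fake_read_msr(int permitted, uint64_t hwcr, uint64_t *val)
{
	if (!permitted)
		return -1;
	*val = hwcr;
	return 0;
}

/*
 * Returns 0 and fills *active on success, or a negative value when the
 * MSR could not be read, so the caller can print an error instead of
 * silently reporting "Active: no".
 */
static int boost_active_from_msr(int permitted, uint64_t hwcr, int *active)
{
	uint64_t val = 0;
	int ret;

	*active = 0;
	ret = fake_read_msr(permitted, hwcr, &val);
	if (ret)
		return ret;	/* propagate the failure to the caller */
	if (!(val & AMD_CPBDIS))
		*active = 1;
	return 0;
}
```

With this shape the caller can tell "boost is off" (return 0, active 0) apart from "could not read the register" (negative return).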

On 2021/3/27 4:13 AM, Shuah Khan wrote:
> On 3/24/21 4:28 AM, xufuhai wrote:
>> From: xufuhai 
>>
>> If the read_msr function is executed by a non-root user, the function
>> returns -1, which means that there is no permission to access
>> /dev/cpu/%d/msr. In that case cpufreq_has_boost_support should also
>> return -1 immediately instead of following the original logic and
>> returning 0, which causes the cpupower tool to report the turbo active
>> status on AMD as 0.
>>
>> Reproduce procedure:
>>  cpupower frequency-info
>>
> 
> Please run get_maintainer.pl and send the patch to the maintainers
> and others suggested by the tool. I don't see this in my
> Inbox for me to review/accept and send it to the pm maintainer.
> 
> Please include the output from before and after the patch when you run
> cpupower frequency-info
> 
> thanks,
> -- Shuah
>

Re: [PATCH 1/2] cpupower: fix amd cpu (family < 0x17) active state issue

2021-03-28 Thread xufuhai

Hi Shuah, thanks for your reply.

This issue is not about "current CPU frequency" but about the "Active" field
under "boost state support" in the output of the "cpupower frequency-info"
command, as shown below:

boost state support:
Supported: yes
Active: yes/no

I think the command should report the state for AMD CPUs (family < 0x17)
as follows:

as non-root Active state:
Active: Error while evaluating Boost Capabilities on CPU xx -- are you root?

as root Active state:
Active: yes (if supported)
no  (if not supported)

I do not want the state to be reported like this:

as non-root Active state:
Active: yes

as root Active state:
Active: yes (if supported)
no  (if not supported)

I will paste the related code below to show the issue in detail.

If the AMD CPU family is < 0x17, the amd_pci_get_num_boost_states
function is called:
-
/linux/tools/power/cpupower/utils/helpers/misc.c
 
if (cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_CPB) {
*support = 1;

/* AMD Family 0x17 does not utilize PCI D18F4 like prior
 * families and has no fixed discrete boost states but
 * has Hardware determined variable increments instead.
 */

if (cpupower_cpu_info.caps & CPUPOWER_CAP_AMD_CPB_MSR) {
if (!read_msr(cpu, MSR_AMD_HWCR, &val)) {
if (!(val & CPUPOWER_AMD_CPBDIS))
*active = 1;
}
} else {
ret = amd_pci_get_num_boost_states(active, states);
if (ret)
return ret;
}
---

/linux/tools/power/cpupower/utils/helpers/amd.c
amd_pci_get_num_boost_states:

val = pci_read_byte(device, 0x15c);

if (val & 3)
*active = 1;
else


pci_read_byte leaves val set to 0xff (via memset) if the caller has no
permission to read from the pci_dev, but amd_pci_get_num_boost_states then
sets the active state to 1, meaning "yes". I believe
amd_pci_get_num_boost_states should instead return a negative value to its
caller, meaning "no permission", so that cpupower shows
"Error while evaluating Boost Capabilities on CPU xx -- are you root?".


static inline void
pci_read_data(struct pci_dev *d, void *buf, int pos, int len)
{
  if (pos & (len-1))
d->access->error("Unaligned read: pos=%02x, len=%d", pos, len);
  if (pos + len <= d->cache_len)
memcpy(buf, d->cache + pos, len);
  else if (!d->methods->read(d, pos, buf, len))
memset(buf, 0xff, len);
}

byte
pci_read_byte(struct pci_dev *d, int pos)
{
  byte buf;
  pci_read_data(d, &buf, pos, 1);
  return buf;
}
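
The effect of the 0xff fill on the boost check can be sketched in a few lines of plain C. This is only an illustration under my own naming: active_unchecked() and active_checked() are hypothetical helpers, not functions from cpupower or pciutils.

```c
#include <stdint.h>

/*
 * Mirrors the existing check: if either of the two low bits is set,
 * the boost state is reported as active.
 */
static int active_unchecked(uint8_t val)
{
	return (val & 3) ? 1 : 0;
}

/*
 * Variant with the guard from the patch: a value of 0xff is treated as
 * "the read did not succeed" and reported as an error instead.
 */
static int active_checked(uint8_t val, int *active)
{
	if (val == 0xff)
		return -1;
	*active = (val & 3) ? 1 : 0;
	return 0;
}
```

Since 0xff has both low bits set, the unchecked version reports "active" for a failed read, which is the behaviour seen as non-root.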



On 2021/3/24 6:27 PM, xufuhai wrote:
> From: xufuhai 
> 
> For old AMD processors (family < 0x17), cpupower calls the
> amd_pci_get_num_boost_states function. For a non-root user, the
> pci_read_byte function (whose implementation comes from the pciutils
> library) sets val to 0xff, indicating that the read callback did not
> succeed. In that case the original logic sets the cpupower turbo
> active state to yes, which is wrong.
> 
> Reproduce procedure:
>   cpupower frequency-info
> 
> Signed-off-by: xufuhai 
> Signed-off-by: chenguanqiao 
> Signed-off-by: lishujin 
> ---
>  tools/power/cpupower/utils/helpers/amd.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/tools/power/cpupower/utils/helpers/amd.c b/tools/power/cpupower/utils/helpers/amd.c
> index 97f2c857048e..6f9504906afa 100644
> --- a/tools/power/cpupower/utils/helpers/amd.c
> +++ b/tools/power/cpupower/utils/helpers/amd.c
> @@ -137,6 +137,13 @@ int amd_pci_get_num_boost_states(int *active, int *states)
>   return -ENODEV;
>  
>   val = pci_read_byte(device, 0x15c);
> +
> + /* If val is 0xff, meaning has no permission to
> +  * get the boost states, return -1
> +  */
> + if (val == 0xff)
> + return -1;
> +
>   if (val & 3)
>   *active = 1;
>   else
>

[PATCH] ASoC: fsl_rpmsg: initialise pointers to NULL

2021-03-28 Thread Shengjiu Wang

This fixes the following sparse warnings:

sound/soc/fsl/fsl_rpmsg.c:45:45: sparse: sparse: Using plain integer as NULL pointer
sound/soc/fsl/fsl_rpmsg.c:45:56: sparse: sparse: Using plain integer as NULL pointer

Fixes: b73d9e6225e8 ("ASoC: fsl_rpmsg: Add CPU DAI driver for audio base on rpmsg")
Signed-off-by: Shengjiu Wang 
Reported-by: kernel test robot 
---
 sound/soc/fsl/fsl_rpmsg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/fsl/fsl_rpmsg.c b/sound/soc/fsl/fsl_rpmsg.c
index 2d09d8850e2c..ea5c973e2e84 100644
--- a/sound/soc/fsl/fsl_rpmsg.c
+++ b/sound/soc/fsl/fsl_rpmsg.c
@@ -42,7 +42,7 @@ static int fsl_rpmsg_hw_params(struct snd_pcm_substream *substream,
   struct snd_soc_dai *dai)
 {
struct fsl_rpmsg *rpmsg = snd_soc_dai_get_drvdata(dai);
-   struct clk *p = rpmsg->mclk, *pll = 0, *npll = 0;
+   struct clk *p = rpmsg->mclk, *pll = NULL, *npll = NULL;
u64 rate = params_rate(params);
int ret = 0;
 
-- 
2.17.1

[PATCH v2 -next] arm64: smp: Add missing prototype for some smp.c functions

2021-03-28 Thread Chen Lifu

In commit eb631bb5bf5b
("arm64: Support arch_irq_work_raise() via self IPIs") a new
function "arch_irq_work_raise" was added without a prototype.

In commit d914d4d49745
("arm64: Implement panic_smp_self_stop()") a new
function "panic_smp_self_stop" was added without a prototype.

We get the following warnings on W=1:
arch/arm64/kernel/smp.c:842:6: warning: no previous prototype
for ‘arch_irq_work_raise’ [-Wmissing-prototypes]
arch/arm64/kernel/smp.c:862:6: warning: no previous prototype
for ‘panic_smp_self_stop’ [-Wmissing-prototypes]

Fix the warnings by:
1. Adding the prototype for 'arch_irq_work_raise' in irq_work.h
2. Adding the prototype for 'panic_smp_self_stop' in smp.h

Signed-off-by: Chen Lifu 
---
v2:
- move the prototype for 'panic_smp_self_stop' to smp.h
 arch/arm64/include/asm/irq_work.h | 2 ++
 arch/arm64/include/asm/smp.h  | 1 +
 2 files changed, 3 insertions(+)

diff --git a/arch/arm64/include/asm/irq_work.h b/arch/arm64/include/asm/irq_work.h
index a1020285ea75..81bbfa3a035b 100644
--- a/arch/arm64/include/asm/irq_work.h
+++ b/arch/arm64/include/asm/irq_work.h
@@ -2,6 +2,8 @@
 #ifndef __ASM_IRQ_WORK_H
 #define __ASM_IRQ_WORK_H
 
+extern void arch_irq_work_raise(void);
+
 static inline bool arch_irq_work_has_interrupt(void)
 {
return true;
diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index bcb01ca15325..0e357757c0cc 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -145,6 +145,7 @@ bool cpus_are_stuck_in_kernel(void);
 
 extern void crash_smp_send_stop(void);
 extern bool smp_crash_stop_failed(void);
+extern void panic_smp_self_stop(void);
 
 #endif /* ifndef __ASSEMBLY__ */
 
-- 
2.17.1

Re: [PATCH v2 4/6] dt-bindings: PCI: Add SiFive FU740 PCIe host controller

2021-03-28 Thread Greentime Hu

On Wed, Mar 24, 2021 at 4:35 AM, Rob Herring wrote:
>
> On Thu, Mar 18, 2021 at 02:08:11PM +0800, Greentime Hu wrote:
> > Add PCIe host controller DT bindings of SiFive FU740.
> >
> > Signed-off-by: Greentime Hu 
> > ---
> >  .../bindings/pci/sifive,fu740-pcie.yaml   | 119 ++
> >  1 file changed, 119 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/pci/sifive,fu740-pcie.yaml
[...]
> > +examples:
> > +  - |
> > +pcie@e {
> > +#address-cells = <3>;
> > +#interrupt-cells = <1>;
> > +#size-cells = <2>;
> > +compatible = "sifive,fu740-pcie";
> > +reg = <0xe 0x 0x1 0x0
>
> Humm, 4GB for DBI space? The DWC controller doesn't have that much
> space, and the kernel will map *all* of that. That's not an
> insignificant amount of memory just for page tables.

Thank you for the review and for pointing this out. :)

I checked the spec description for DBI in DWC_pcie_ctl_dm_databook.pdf,
sections 3.15 and 3.16 and Table 3-17.

I think CX_SRIOV_ENABLE and CX_ARI_ENABLE will be set to 0 because
these two are endpoint mode features:

Single Root I/O Virtualization (SR-IOV): This section describes the
SR-IOV features implemented in EP mode. The parameter for enabling
SR-IOV is CX_SRIOV_ENABLE.

Alternative Routing-ID Interpretation (ARI): ARI allows an endpoint to
support more than eight physical functions (PFs). ARI is enabled by
the CX_ARI_ENABLE parameter.

So based on Table 3-17, we will need to map 2GB (bit 30) instead of 4GB (bit 31).

csky-linux-gcc: error: unrecognized command line option '-mbacktrace'; did you mean '-fbacktrace'?

2021-03-28 Thread kernel test robot

Hi Guo,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   a5e13c6df0e41702d2b2c77c8ad41677ebb065b3
commit: 000591f1ca3312d9a29e15a9e3fe5c4171f75586 csky: Enable LOCKDEP_SUPPORT
date:   12 months ago
config: csky-randconfig-r001-20210328 (attached as .config)
compiler: csky-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=000591f1ca3312d9a29e15a9e3fe5c4171f75586
git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout 000591f1ca3312d9a29e15a9e3fe5c4171f75586
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=csky

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   csky-linux-gcc: error: unrecognized argument in option '-mcpu=ck860'
   csky-linux-gcc: note: valid arguments to '-mcpu=' are: ck801 ck801t ck802 
ck802j ck802t ck803 ck803e ck803ef ck803efh ck803efhr1 ck803efht ck803efhtr1 
ck803efr1 ck803eft ck803eftr1 ck803eh ck803ehr1 ck803eht ck803ehtr1 ck803er1 
ck803et ck803etr1 ck803f ck803fh ck803fhr1 ck803fr1 ck803ft ck803ftr1 ck803h 
ck803hr1 ck803ht ck803htr1 ck803r1 ck803s ck803se ck803sef ck803seft ck803sf 
ck803st ck803t ck803tr1 ck807 ck807e ck807ef ck807f ck810 ck810e ck810ef 
ck810eft ck810et ck810f ck810ft ck810ftv ck810fv ck810t ck810tv ck810v; did you 
mean 'ck810'?
>> csky-linux-gcc: error: unrecognized command line option '-mbacktrace'; did you mean '-fbacktrace'?
--
   csky-linux-gcc: error: unrecognized argument in option '-mcpu=ck860'
   csky-linux-gcc: note: valid arguments to '-mcpu=' are: ck801 ck801t ck802 
ck802j ck802t ck803 ck803e ck803ef ck803efh ck803efhr1 ck803efht ck803efhtr1 
ck803efr1 ck803eft ck803eftr1 ck803eh ck803ehr1 ck803eht ck803ehtr1 ck803er1 
ck803et ck803etr1 ck803f ck803fh ck803fhr1 ck803fr1 ck803ft ck803ftr1 ck803h 
ck803hr1 ck803ht ck803htr1 ck803r1 ck803s ck803se ck803sef ck803seft ck803sf 
ck803st ck803t ck803tr1 ck807 ck807e ck807ef ck807f ck810 ck810e ck810ef 
ck810eft ck810et ck810f ck810ft ck810ftv ck810fv ck810t ck810tv ck810v; did you 
mean 'ck810'?
>> csky-linux-gcc: error: unrecognized command line option '-mbacktrace'; did you mean '-fbacktrace'?
   /usr/bin/ld: scripts/dtc/dtc-parser.tab.o:(.bss+0x20): multiple definition 
of `yylloc'; scripts/dtc/dtc-lexer.lex.o:(.bss+0x0): first defined here
   collect2: error: ld returned 1 exit status
   make[2]: *** [scripts/Makefile.host:116: scripts/dtc/dtc] Error 1
   make[2]: Target '__build' not remade because of errors.
   make[1]: *** [Makefile:1260: scripts_dtc] Error 2
   make[1]: Target 'modules_prepare' not remade because of errors.
   make: *** [Makefile:180: sub-make] Error 2
   make: Target 'modules_prepare' not remade because of errors.
--
   csky-linux-gcc: error: unrecognized argument in option '-mcpu=ck860'
   csky-linux-gcc: note: valid arguments to '-mcpu=' are: ck801 ck801t ck802 
ck802j ck802t ck803 ck803e ck803ef ck803efh ck803efhr1 ck803efht ck803efhtr1 
ck803efr1 ck803eft ck803eftr1 ck803eh ck803ehr1 ck803eht ck803ehtr1 ck803er1 
ck803et ck803etr1 ck803f ck803fh ck803fhr1 ck803fr1 ck803ft ck803ftr1 ck803h 
ck803hr1 ck803ht ck803htr1 ck803r1 ck803s ck803se ck803sef ck803seft ck803sf 
ck803st ck803t ck803tr1 ck807 ck807e ck807ef ck807f ck810 ck810e ck810ef 
ck810eft ck810et ck810f ck810ft ck810ftv ck810fv ck810t ck810tv ck810v; did you 
mean 'ck810'?
>> csky-linux-gcc: error: unrecognized command line option '-mbacktrace'; did you mean '-fbacktrace'?
   /usr/bin/ld: scripts/dtc/dtc-parser.tab.o:(.bss+0x20): multiple definition 
of `yylloc'; scripts/dtc/dtc-lexer.lex.o:(.bss+0x0): first defined here
   collect2: error: ld returned 1 exit status
   make[2]: *** [scripts/Makefile.host:116: scripts/dtc/dtc] Error 1
   make[2]: Target '__build' not remade because of errors.
   make[1]: *** [Makefile:1260: scripts_dtc] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:180: sub-make] Error 2
   make: Target 'prepare' not remade because of errors.

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for FRAME_POINTER
   Depends on DEBUG_KERNEL && (M68K || UML || SUPERH) || 
ARCH_WANT_FRAME_POINTERS
   Selected by
   - LOCKDEP && DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT && !MIPS && !PPC && !ARM 
&& !S390 && !MICROBLAZE && !ARC && !X86

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



Re: [PATCH] arm: 9016/2: Make symbol 'tmp_pmd_table' static

2021-03-28 Thread Liu Shixin

I'm sorry for making such a stupid mistake. There was only one patch committed
before (5615f69bc209 "ARM: 9016/2: Initialize the mapping of KASan shadow
memory"), and I used the same subject prefix by mistake.

Thanks for your correction. I will revise the subject and resend it. How about
using "arm: mm: kasan_init" in the subject?


On 2021/3/27 18:20, Russell King - ARM Linux admin wrote:
> Why do you have 9016/2 in the subject line? That's an identifier from
> the patch system which shouldn't be in the subject line.
>
> If you want to refer to something already committed, please do so via
> the sha1 git hash and quote the first line of the commit description
> within ("...") in the body of your commit description.
>
> Thanks.
>
> On Sat, Mar 27, 2021 at 04:30:18PM +0800, Shixin Liu wrote:
>> Symbol 'tmp_pmd_table' is not used outside of kasan_init.c and is only used
>> when CONFIG_ARM_LPAE is enabled. So mark it static and guard it with
>> CONFIG_ARM_LPAE.
>>
>> Signed-off-by: Shixin Liu 
>> ---
>>  arch/arm/mm/kasan_init.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/mm/kasan_init.c b/arch/arm/mm/kasan_init.c
>> index 9c348042a724..3a06d3b51f97 100644
>> --- a/arch/arm/mm/kasan_init.c
>> +++ b/arch/arm/mm/kasan_init.c
>> @@ -27,7 +27,9 @@
>>  
>>  static pgd_t tmp_pgd_table[PTRS_PER_PGD] __initdata __aligned(PGD_SIZE);
>>  
>> -pmd_t tmp_pmd_table[PTRS_PER_PMD] __page_aligned_bss;
>> +#ifdef CONFIG_ARM_LPAE
>> +static pmd_t tmp_pmd_table[PTRS_PER_PMD] __page_aligned_bss;
>> +#endif
>>  
>>  static __init void *kasan_alloc_block(size_t size)
>>  {
>> -- 
>> 2.25.1
>>
>>

/usr/bin/ld: ll_temac_main.c:undefined reference to `devm_platform_ioremap_resource'

2021-03-28 Thread kernel test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   a5e13c6df0e41702d2b2c77c8ad41677ebb065b3
commit: cc6596fc7295e9dcd78156ed42f9f8e1221f7530 net: ll_temac: Fix potential NULL dereference in temac_probe()
date:   4 months ago
config: um-randconfig-r015-20210329 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cc6596fc7295e9dcd78156ed42f9f8e1221f7530
git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout cc6596fc7295e9dcd78156ed42f9f8e1221f7530
# save the attached .config to linux build tree
make W=1 ARCH=um

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   /usr/bin/ld: drivers/net/ethernet/xilinx/ll_temac_main.o: in function 
`temac_probe':
   ll_temac_main.c:(.text+0x1af2): undefined reference to 
`devm_platform_ioremap_resource_byname'
>> /usr/bin/ld: ll_temac_main.c:(.text+0x1c2e): undefined reference to 
>> `devm_platform_ioremap_resource'
   /usr/bin/ld: ll_temac_main.c:(.text+0x2172): undefined reference to 
`devm_platform_ioremap_resource_byname'
   /usr/bin/ld: drivers/net/ethernet/xilinx/xilinx_axienet_main.o: in function 
`axienet_probe':
   xilinx_axienet_main.c:(.text+0xddb): undefined reference to 
`devm_ioremap_resource'
   collect2: error: ld returned 1 exit status

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



Re: [PATCH 1/2] dt-bindings: devapc: Update bindings

2021-03-28 Thread Nina Wu

Hi, Rob


On Fri, 2021-03-26 at 13:58 -0600, Rob Herring wrote:
> On Fri, Mar 26, 2021 at 03:31:10PM +0800, Nina Wu wrote:
> > From: Nina Wu 
> > 
> > To support newer hardware architecture of devapc,
> > update device tree bindings.
> > 
> > Signed-off-by: Nina Wu 
> > ---
> >  .../devicetree/bindings/soc/mediatek/devapc.yaml   | 41 
> > ++
> >  1 file changed, 41 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml b/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml
> > index 31e4d3c..489f6a9 100644
> > --- a/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml
> > +++ b/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml
> > @@ -20,9 +20,27 @@ properties:
> >compatible:
> >  enum:
> >- mediatek,mt6779-devapc
> > +  - mediatek,mt8192-devapc
> > +
> > +  version:
> > +description: The version of the hardware architecture
> 
> This should be implied by the compatible string.

The version attribute is used to decide how we interpret the debug info
read from the registers.
As you mentioned, we can infer the version of the architecture from the
compatible string, but I think that leads to code like this:

if (compatible is mt6779) version = 1
else if (compatible is mt8192) version = 2

Once we have more chips to support, that code will become quite long.
So I prefer to add a 'version' property here.
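
For comparison, the if/else chain above can also be written as a table keyed on the compatible string, so that adding a chip means adding one row. In the kernel this role is normally played by the .data pointer of struct of_device_id together with of_device_get_match_data(); the userspace sketch below only imitates that pattern, and struct devapc_data, match_table and lookup() are hypothetical names used for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-chip data; a real driver would carry more fields. */
struct devapc_data {
	int version;
};

/* Hypothetical table row: one compatible string and its per-chip data. */
struct match_entry {
	const char *compatible;
	const struct devapc_data *data;
};

static const struct devapc_data mt6779_data = { .version = 1 };
static const struct devapc_data mt8192_data = { .version = 2 };

static const struct match_entry match_table[] = {
	{ "mediatek,mt6779-devapc", &mt6779_data },
	{ "mediatek,mt8192-devapc", &mt8192_data },
};

/* Returns the per-chip data for a compatible string, or NULL if unknown. */
static const struct devapc_data *lookup(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(match_table) / sizeof(match_table[0]); i++)
		if (strcmp(match_table[i].compatible, compatible) == 0)
			return match_table[i].data;
	return NULL;
}
```

The lookup code stays the same length no matter how many chips are listed.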


> 
> > +$ref: /schemas/types.yaml#/definitions/uint32
> > +enum: [1, 2]
> > +maxItems: 1
> > +
> > +  slave_type_num:
> 
> vendor prefix needed and s/_/-/

I will fix it in the next version.

> 
> > +description: The number of the devapc set
> 
> What?

For mt8192, there are multiple devapc HW instances for different subsystems,
e.g. infra devapc, peri devapc, etc.
'slave_type_num' is the total number of devapc HW instances.
I cannot come up with an accurate description, though.


> 
> > +$ref: /schemas/types.yaml#/definitions/uint32
> > +enum: [1, 4]
> > +maxItems: 1
> >  
> >reg:
> >  description: The base address of devapc register bank
> > +maxItems: 4
> 
> Need to define what each region is.

I will fix it in the next version.

> 
> > +
> > +  vio_idx_num:
> 
> vendor prefix needed and s/_/-/

OK, will be fixed in the next version.

> 
> > +description: The number of the devices controlled by devapc
> 
> No need to know which devices?

Yes, the current driver does not care about each of them individually.

> 
> > +$ref: /schemas/types.yaml#/definitions/uint32-array
> >  maxItems: 1
> 
> uint32-array with 'maxItems: 1' is just 'uint32'
> 

got it, so it should be 'maxItems: 4'

> >  
> >interrupts:
> > @@ -39,7 +57,10 @@ properties:
> >  
> >  required:
> >- compatible
> > +  - version
> > +  - slave_type_num
> >- reg
> > +  - vio_idx_num
> >- interrupts
> >- clocks
> >- clock-names
> > @@ -53,8 +74,28 @@ examples:
> >  
> >  devapc: devapc@10207000 {
> >compatible = "mediatek,mt6779-devapc";
> > +  version = <1>;
> > +  slave_type_num = <1>;
> >reg = <0x10207000 0x1000>;
> > +  vio_idx_num = <511>;
> >interrupts = ;
> >clocks = <_ao CLK_INFRA_DEVICE_APC>;
> >clock-names = "devapc-infra-clock";
> >  };
> > +  - |
> > +#include 
> > +#include 
> > +
> > +devapc: devapc@10207000 {
> > +compatible = "mediatek,mt8192-devapc";
> > +version = <2>;
> > +slave_type_num = <4>;
> > +reg = <0 0x10207000 0 0x1000>,
> > +<0 0x10274000 0 0x1000>,
> > +<0 0x10275000 0 0x1000>,
> > +<0 0x1102 0 0x1000>;
> > +vio_idx_num = <367 292 242 58>;
> 
> Is the length of this the same as the value of slave_type_num? If so, 
> don't need both.
> 

yes, the length is equal to slave_type_num.
I will try to remove it in the next version.

> > +interrupts = ;
> > +clocks = <_ao CLK_INFRA_DEVICE_APC>;
> > +clock-names = "devapc-infra-clock";
> > +};
> > -- 
> > 2.6.4
> >

Re: [PATCH net 1/4] virtchnl: Fix layout of RSS structures

2021-03-28 Thread Samudrala, Sridhar


On 3/27/2021 2:53 AM, Geert Uytterhoeven wrote:

Hi Samudrala,

On Fri, Mar 26, 2021 at 11:45 PM Samudrala, Sridhar
 wrote:

On 3/26/2021 1:06 AM, Geert Uytterhoeven wrote:

On Thu, Mar 25, 2021 at 11:29 PM Tony Nguyen  wrote:
From: Norbert Ciosek 

Remove padding from RSS structures. Previous layout
could lead to unwanted compiler optimizations
in loops when iterating over key and lut arrays.

 From an earlier private conversation with Mateusz, I understand the real
explanation is that key[] and lut[] must be at the end of the
structures, because they are used as flexible array members?

Fixes: 65ece6de0114 ("virtchnl: Add missing explicit padding to structures")
Signed-off-by: Norbert Ciosek 
Tested-by: Konrad Jankowski 
Signed-off-by: Tony Nguyen 

--- a/include/linux/avf/virtchnl.h
+++ b/include/linux/avf/virtchnl.h
@@ -476,7 +476,6 @@ struct virtchnl_rss_key {
 u16 vsi_id;
 u16 key_len;
 u8 key[1]; /* RSS hash key, packed bytes */
-   u8 pad[1];
  };

  VIRTCHNL_CHECK_STRUCT_LEN(6, virtchnl_rss_key);
@@ -485,7 +484,6 @@ struct virtchnl_rss_lut {
 u16 vsi_id;
 u16 lut_entries;
 u8 lut[1];/* RSS lookup table */
-   u8 pad[1];
  };

If you use a flexible array member, it should be declared without a size,
i.e.

 u8 key[];

Everything else is (trying to) fool the compiler, and leading to undefined
behavior, and people (re)adding explicit padding.

This header file is shared across other OSes that use C++ that doesn't support
flexible arrays. So the structures in this file use an array of size 1 as a last
element to enable variable sized arrays.

I don't think it is accepted practice to have non-Linux-isms in
include/*linux*/avf/virtchnl.h header files.  Moreover, using a size
of 1 is counter-intuitive for people used to Linux kernel development,
and may lead to off-by-one errors in calculation of sizes.
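
The size difference between the two idioms can be sketched as follows. The structs are hypothetical copies of the layout quoted above, written with fixed-width types so the example builds outside the kernel; the sizes in the comments assume a typical ABI where uint16_t is 2-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy using the one-element array idiom. */
struct rss_key_one {
	uint16_t vsi_id;
	uint16_t key_len;
	uint8_t key[1];		/* sizeof() counts this byte plus padding */
};

/* Hypothetical copy using a C99 flexible array member. */
struct rss_key_flex {
	uint16_t vsi_id;
	uint16_t key_len;
	uint8_t key[];		/* contributes nothing to sizeof() */
};

/* With key[1], the header size must be taken with offsetof(). */
static size_t alloc_size_one(size_t key_len)
{
	return offsetof(struct rss_key_one, key) + key_len;
}

/* With key[], sizeof() is already the header size. */
static size_t alloc_size_flex(size_t key_len)
{
	return sizeof(struct rss_key_flex) + key_len;
}
```

Using sizeof(struct rss_key_one) + key_len instead of the offsetof() form over-counts by the one-element array and its padding, which is the kind of size miscalculation mentioned above.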

If you insist on ignoring the above, this definitely deserves a
comment next to the member's declaration.
Sure. We can add a comment indicating that these fields are used as
variable-sized arrays.


Thanks
Sridhar

Re: [PATCH v2 01/13] dt-bindings: usb: mtk-xhci: support property usb2-lpm-disable

2021-03-28 Thread Chunfeng Yun

On Sat, 2021-03-27 at 11:24 -0600, Rob Herring wrote:
> On Tue, Mar 23, 2021 at 03:02:43PM +0800, Chunfeng Yun wrote:
> > Add support common property usb2-lpm-disable
> > 
> > Signed-off-by: Chunfeng Yun 
> > ---
> > v2: no changes
> > ---
> >  Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml | 4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml b/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml
> > index 14f40efb3b22..2246d29a5e4e 100644
> > --- a/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml
> > +++ b/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.yaml
> > @@ -103,6 +103,10 @@ properties:
> >  description: supports USB3.0 LPM
> >  type: boolean
> >  
> > +  usb2-lpm-disable:
> > +description: disable USB2 HW LPM
> > +type: boolean
> 
> Already has a type. Don't redefine here. Just 'usb2-lpm-disable: true' 
> and make sure usb-xhci.yaml is referenced.
Ok, thanks
> 
> > +
> >imod-interval-ns:
> >  description:
> >Interrupt moderation interval value, it is 8 times as much as that
> > -- 
> > 2.18.0
> >

[PATCH] misc: hpilo: MAINTAINERS: add entry for hpilo

2021-03-28 Thread matt . hsiao

From: Matt Hsiao 

The original maintainer left the company, add myself as the successor.

Signed-off-by: Matt Hsiao 
---
 MAINTAINERS | 5 +
 1 file changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fb2a3633b719..0546e7f84a4e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7890,6 +7890,11 @@ W:   https://linuxtv.org
 T: git git://linuxtv.org/media_tree.git
 F: drivers/media/usb/hdpvr/
 
+HEWLETT PACKARD ENTERPRISE ILO CHIF DRIVER
+M: Matt Hsiao 
+S: Supported
+F: drivers/misc/hpilo.[ch]
+
 HEWLETT PACKARD ENTERPRISE ILO NMI WATCHDOG DRIVER
 M: Jerry Hoemann 
 S: Supported
-- 
2.16.6

Re: exec error: BUG: Bad rss-counter

2021-03-28 Thread Ilya Lipnitskiy

On Sat, Mar 20, 2021 at 8:59 AM Zhou Yanjie  wrote:
>
> Hi Ilya,
>
> On 2021/3/3 11:55 PM, Ilya Lipnitskiy wrote:
> > On Wed, Mar 3, 2021 at 7:50 AM Eric W. Biederman  
> > wrote:
> >> Ilya Lipnitskiy  writes:
> >>
> >>> On Tue, Mar 2, 2021 at 11:37 AM Eric W. Biederman  
> >>> wrote:
>  Ilya Lipnitskiy  writes:
> 
> > On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman 
> >  wrote:
> >> Ilya Lipnitskiy  writes:
> >>
> >>> Eric, All,
> >>>
> >>> The following error appears when running Linux 5.10.18 on an embedded
> >>> MIPS mt7621 target:
> >>> [0.301219] BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1
> >>>
> >>> Being a very generic error, I started digging and added a stack dump
> >>> before the BUG:
> >>> Call Trace:
> >>> [<80008094>] show_stack+0x30/0x100
> >>> [<8033b238>] dump_stack+0xac/0xe8
> >>> [<800285e8>] __mmdrop+0x98/0x1d0
> >>> [<801a6de8>] free_bprm+0x44/0x118
> >>> [<801a86a8>] kernel_execve+0x160/0x1d8
> >>> [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> >>> [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >>>
> >>> So that's how I got to looking at fs/exec.c and noticed quite a few
> >>> changes last year. Turns out this message only occurs once very early
> >>> at boot during the very first call to kernel_execve. current->mm is
> >>> NULL at this stage, so acct_arg_size() is effectively a no-op.
> >> If you believe this is a new error you could bisect the kernel
> >> to see which change introduced the behavior you are seeing.
> >>
> >>> More digging, and I traced the RSS counter increment to:
> >>> [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
> >>> [<80160d58>] handle_mm_fault+0x6e4/0xea0
> >>> [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
> >>> [<8015992c>] __get_user_pages_remote+0x128/0x360
> >>> [<801a6d9c>] get_arg_page+0x34/0xa0
> >>> [<801a7394>] copy_string_kernel+0x194/0x2a4
> >>> [<801a880c>] kernel_execve+0x11c/0x298
> >>> [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> >>> [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >>>
> >>> In fact, I also checked vma_pages(bprm->vma) and lo and behold it is 
> >>> set to 1.
> >>>
> >>> How is fs/exec.c supposed to handle implied RSS increments that happen
> >>> due to page faults when discarding the bprm structure? In this case,
> >>> the bug-generating kernel_execve call never succeeded, it returned -2,
> >>> but I didn't trace exactly what failed.
> >> Unless I am mistaken any left over pages should be purged by exit_mmap
> >> which is called by mmput before mmput calls mmdrop.
> > Good to know. Some more digging and I can say that we hit this error
> > when trying to unmap PFN 0 (is_zero_pfn(pfn) returns TRUE,
> > vm_normal_page returns NULL, zap_pte_range does not decrement
> > MM_ANONPAGES RSS counter). Is my understanding correct that PFN 0 is
> > usable, but special? Or am I totally off the mark here?
>  It would be good to know if that is the page that get_user_pages_remote
>  returned to copy_string_kernel.  The zero page that is always zero,
>  should never be returned when a writable mapping is desired.
> >>> Indeed, pfn 0 is returned from get_arg_page: (page is 0x809cf000,
> >>> page_to_pfn(page) is 0) and it is the same page that is being freed and 
> >>> not
> >>> refcounted in mmput/zap_pte_range. Confirmed with good old printk. Also,
> >>> ZERO_PAGE(0)==0x809fc000 -> PFN 5120.
> >>>
> >>> I think I have found the problem though, after much digging and thanks to 
> >>> all
> >>> the information provided. init_zero_pfn() gets called too late (after
> >>> the call to
> >>> is_zero_pfn(0) from mmput returns true), until then zero_pfn == 0, and 
> >>> after,
> >>> zero_pfn == 5120. Boom.
> >>>
> >>> So PFN 0 is special, but only for a little bit, enough for something
> >>> on my system
> >>> to call kernel_execve :)
> >>>
> >>> Question: is my system not supposed to be calling kernel_execve this
> >>> early or does
> >>> init_zero_pfn() need to happen earlier? init_zero_pfn is currently a
> >>> core_initcall.
> >> Looking quickly it seems that init_zero_pfn() is in mm/memory.c and is
> >> common for both mips and x86.  Further it appears init_zero_pfn() has
> >> been that way since 2009 a13ea5b75964 ("mm: reinstate ZERO_PAGE").
> >>
> >> Given the testing that x86 gets and that nothing like this has been
> >> reported it looks like whatever driver is triggering the kernel_execve
> >> is doing something wrong.
> >> Because honestly.  If the zero page isn't working there is not a chance
> >> that anything in userspace is working so it is clearly much too early.
> >>
> >> I suspect there is some driver that is initialized very early that is
> >> doing something that looks innocuous (like triggering a hotplug event)
> >> and that happens to cause a

Re: exec error: BUG: Bad rss-counter

2021-03-28 Thread Ilya Lipnitskiy

On Wed, Mar 3, 2021 at 7:50 AM Eric W. Biederman  wrote:
>
> Ilya Lipnitskiy  writes:
>
> > On Tue, Mar 2, 2021 at 11:37 AM Eric W. Biederman  
> > wrote:
> >>
> >> Ilya Lipnitskiy  writes:
> >>
> >> > On Mon, Mar 1, 2021 at 12:43 PM Eric W. Biederman 
> >> >  wrote:
> >> >>
> >> >> Ilya Lipnitskiy  writes:
> >> >>
> >> >> > Eric, All,
> >> >> >
> >> >> > The following error appears when running Linux 5.10.18 on an embedded
> >> >> > MIPS mt7621 target:
> >> >> > [0.301219] BUG: Bad rss-counter state mm:(ptrval) 
> >> >> > type:MM_ANONPAGES val:1
> >> >> >
> >> >> > Being a very generic error, I started digging and added a stack dump
> >> >> > before the BUG:
> >> >> > Call Trace:
> >> >> > [<80008094>] show_stack+0x30/0x100
> >> >> > [<8033b238>] dump_stack+0xac/0xe8
> >> >> > [<800285e8>] __mmdrop+0x98/0x1d0
> >> >> > [<801a6de8>] free_bprm+0x44/0x118
> >> >> > [<801a86a8>] kernel_execve+0x160/0x1d8
> >> >> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> >> >> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >> >> >
> >> >> > So that's how I got to looking at fs/exec.c and noticed quite a few
> >> >> > changes last year. Turns out this message only occurs once very early
> >> >> > at boot during the very first call to kernel_execve. current->mm is
> >> >> > NULL at this stage, so acct_arg_size() is effectively a no-op.
> >> >>
> >> >> If you believe this is a new error you could bisect the kernel
> >> >> to see which change introduced the behavior you are seeing.
> >> >>
> >> >> > More digging, and I traced the RSS counter increment to:
> >> >> > [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
> >> >> > [<80160d58>] handle_mm_fault+0x6e4/0xea0
> >> >> > [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
> >> >> > [<8015992c>] __get_user_pages_remote+0x128/0x360
> >> >> > [<801a6d9c>] get_arg_page+0x34/0xa0
> >> >> > [<801a7394>] copy_string_kernel+0x194/0x2a4
> >> >> > [<801a880c>] kernel_execve+0x11c/0x298
> >> >> > [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
> >> >> > [<80003198>] ret_from_kernel_thread+0x14/0x1c
> >> >> >
> >> >> > In fact, I also checked vma_pages(bprm->vma) and lo and behold it is 
> >> >> > set to 1.
> >> >> >
> >> >> > How is fs/exec.c supposed to handle implied RSS increments that happen
> >> >> > due to page faults when discarding the bprm structure? In this case,
> >> >> > the bug-generating kernel_execve call never succeeded, it returned -2,
> >> >> > but I didn't trace exactly what failed.
> >> >>
> >> >> Unless I am mistaken any left over pages should be purged by exit_mmap
> >> >> which is called by mmput before mmput calls mmdrop.
> >> > Good to know. Some more digging and I can say that we hit this error
> >> > when trying to unmap PFN 0 (is_zero_pfn(pfn) returns TRUE,
> >> > vm_normal_page returns NULL, zap_pte_range does not decrement
> >> > MM_ANONPAGES RSS counter). Is my understanding correct that PFN 0 is
> >> > usable, but special? Or am I totally off the mark here?
> >>
> >> It would be good to know if that is the page that get_user_pages_remote
> >> returned to copy_string_kernel.  The zero page that is always zero,
> >> should never be returned when a writable mapping is desired.
> >
> > Indeed, pfn 0 is returned from get_arg_page: (page is 0x809cf000,
> > page_to_pfn(page) is 0) and it is the same page that is being freed and not
> > refcounted in mmput/zap_pte_range. Confirmed with good old printk. Also,
> > ZERO_PAGE(0)==0x809fc000 -> PFN 5120.
> >
> > I think I have found the problem though, after much digging and thanks to 
> > all
> > the information provided. init_zero_pfn() gets called too late (after
> > the call to
> > is_zero_pfn(0) from mmput returns true), until then zero_pfn == 0, and 
> > after,
> > zero_pfn == 5120. Boom.
> >
> > So PFN 0 is special, but only for a little bit, enough for something
> > on my system
> > to call kernel_execve :)
> >
> > Question: is my system not supposed to be calling kernel_execve this
> > early or does
> > init_zero_pfn() need to happen earlier? init_zero_pfn is currently a
> > core_initcall.
>
> Looking quickly it seems that init_zero_pfn() is in mm/memory.c and is
> common for both mips and x86.  Further it appears init_zero_pfn() has
> been that way since 2009 a13ea5b75964 ("mm: reinstate ZERO_PAGE").
>
> Given the testing that x86 gets and that nothing like this has been
> reported it looks like whatever driver is triggering the kernel_execve
> is doing something wrong.
>
> Because honestly.  If the zero page isn't working there is not a chance
> that anything in userspace is working so it is clearly much too early.
>
> I suspect there is some driver that is initialized very early that is
> doing something that looks innocuous (like triggering a hotplug event)
> and that happens to cause a call_usermode_helper which then calls
> kernel_execve.

Here is the data that's passed into the very first kernel_execve call:
kernel_filename: /sbin/hotplug
argv: [/sbin/hotplug, bus]
envp:
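
The ordering problem discussed in this thread can be sketched as a small user-space toy model. This is only a sketch under stated assumptions: the function names mirror the kernel ones mentioned above, but the bodies are simplified stand-ins (the real MIPS is_zero_pfn() and zap_pte_range() do more), and MODEL_ZERO_PAGE_PFN and leaked_rss() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code. Static storage starts at 0, which is why
 * PFN 0 looks like the zero page until init_zero_pfn() has run. */
static unsigned long zero_pfn;

#define MODEL_ZERO_PAGE_PFN 5120UL	/* value reported in the thread */

static bool is_zero_pfn(unsigned long pfn)
{
	return pfn == zero_pfn;
}

/* Stand-in for the core_initcall that records the real zero page PFN. */
static void init_zero_pfn(void)
{
	zero_pfn = MODEL_ZERO_PAGE_PFN;
}

/* Stand-in for the unmap path: a page that looks like the zero page is
 * skipped, so the counter charged at fault time is never given back. */
static long rss_after_unmap(unsigned long pfn, long rss)
{
	if (is_zero_pfn(pfn))
		return rss;	/* skipped: the counter leaks */
	return rss - 1;		/* normal page: the counter drops */
}

/* Fault one anonymous page at `pfn` in, then tear the mm down.
 * A nonzero result corresponds to "Bad rss-counter state". */
static long leaked_rss(unsigned long pfn)
{
	long rss = 0;

	rss += 1;		/* page fault charged MM_ANONPAGES */
	rss = rss_after_unmap(pfn, rss);
	assert(rss >= 0);
	return rss;
}
```

In this model, leaked_rss(0) reports one leaked page when called before init_zero_pfn() and none afterwards, which matches the val:1 seen in the log.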

Re: [PATCH 1/2] dt-bindings: devapc: Update bindings

2021-03-28 Thread Nina Wu

Hi, Chun-Kuang

On Sat, 2021-03-27 at 00:18 +0800, Chun-Kuang Hu wrote:
> Hi, Nina:
> 
> On Fri, Mar 26, 2021 at 3:34 PM, Nina Wu wrote:
> >
> > From: Nina Wu 
> >
> > To support newer hardware architecture of devapc,
> > update device tree bindings.
> >
> > Signed-off-by: Nina Wu 
> > ---
> >  .../devicetree/bindings/soc/mediatek/devapc.yaml   | 41 
> > ++
> >  1 file changed, 41 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml 
> > b/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml
> > index 31e4d3c..489f6a9 100644
> > --- a/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml
> > +++ b/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml
> > @@ -20,9 +20,27 @@ properties:
> >compatible:
> >  enum:
> >- mediatek,mt6779-devapc
> > +  - mediatek,mt8192-devapc
> > +
> > +  version:
> > +description: The version of the hardware architecture
> > +$ref: /schemas/types.yaml#/definitions/uint32
> > +enum: [1, 2]
> > +maxItems: 1
> > +
> > +  slave_type_num:
> > +description: The number of the devapc set
> > +$ref: /schemas/types.yaml#/definitions/uint32
> > +enum: [1, 4]
> > +maxItems: 1
> >
> >reg:
> >  description: The base address of devapc register bank
> > +maxItems: 4
> > +
> > +  vio_idx_num:
> > +description: The number of the devices controlled by devapc
> > +$ref: /schemas/types.yaml#/definitions/uint32-array
> >  maxItems: 1
> >
> >interrupts:
> > @@ -39,7 +57,10 @@ properties:
> >
> >  required:
> >- compatible
> > +  - version
> > +  - slave_type_num
> >- reg
> > +  - vio_idx_num
> >- interrupts
> >- clocks
> >- clock-names
> > @@ -53,8 +74,28 @@ examples:
> >
> >  devapc: devapc@10207000 {
> >compatible = "mediatek,mt6779-devapc";
> > +  version = <1>;
> 
> I think version is redundant. For example, if mt0001-devapc is
> identical to mt6779-devapc, its compatible should be
> 
> compatible = "mediatek,mt0001-devapc", "mediatek,mt6779-devapc";
> 
> In driver, only keep compatible for mt6779 and no mt0001 because
> mt0001 is identical to mt6779.
> In probe sequence, try first compatible string
> "mediatek,mt0001-devapc", but it does not exist in driver, so try next
> compatible string "mediatek,mt6779-devapc" and match.
> So mt0001-devapc would work as mt6779-devapc.
> 

I think the version is still needed, because there is a small difference
in the registers that save the debug info.


> > +  slave_type_num = <1>;
> >reg = <0x10207000 0x1000>;
> > +  vio_idx_num = <511>;
> >interrupts = ;
> >clocks = <_ao CLK_INFRA_DEVICE_APC>;
> >clock-names = "devapc-infra-clock";
> >  };
> > +  - |
> > +#include 
> > +#include 
> > +
> > +devapc: devapc@10207000 {
> > +compatible = "mediatek,mt8192-devapc";
> > +version = <2>;
> > +slave_type_num = <4>;
> > +reg = <0 0x10207000 0 0x1000>,
> > +<0 0x10274000 0 0x1000>,
> > +<0 0x10275000 0 0x1000>,
> > +<0 0x1102 0 0x1000>;
> > +vio_idx_num = <367 292 242 58>;
> > +interrupts = ;
> > +clocks = <_ao CLK_INFRA_DEVICE_APC>;
> > +clock-names = "devapc-infra-clock";
> > +};
> 
> It looks like that there are 4 devapc device in mt8192.
> These 4 device work independently, so I would like to decouple them
> rather than couple them.
> 
> devapc0: devapc@10207000 {
> compatible = "mediatek,mt8192-devapc";
> reg = <0 0x10207000 0 0x1000>;
> vio_idx_num = <367>;
> ...
> };
> 
> devapc1: devapc@10274000 {
> compatible = "mediatek,mt8192-devapc";
> reg = <0 0x10274000 0 0x1000>;
> vio_idx_num = <292>;
> ...
> };
> 
> devapc2: devapc@10275000 {
> compatible = "mediatek,mt8192-devapc";
> reg = <0 0x10275000 0 0x1000>;
> vio_idx_num = <242>;
> ...
> };
> 
> devapc3: devapc@1102 {
> compatible = "mediatek,mt8192-devapc";
> reg = <0 0x1102 0 0x1000>;
> vio_idx_num = <58>;
> ...
> };
> 

I will try this with shared IRQ and re-submit another version.


> Regards,
> Chun-Kuang.
> 
> > --
> > 2.6.4
> > ___
> > Linux-mediatek mailing list
> > linux-media...@lists.infradead.org
> > https://urldefense.com/v3/__http://lists.infradead.org/mailman/listinfo/linux-mediatek__;!!CTRNKA9wMg0ARbw!02jtHESdXiknfQKFC-IqkUJOuWEjeE-GMqwk3RmPMm3_T-Xv9pmUk9Zoi2e2kvXjoKc$
> >
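
The compatible-string fallback that Chun-Kuang describes above can be sketched as follows. This is a minimal illustration, assuming a driver whose match table only lists the mt6779 string; driver_compat and first_match() are hypothetical names, and the real OF matching code is more involved than a plain string comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical driver match table: only the mt6779 entry exists. */
static const char *const driver_compat[] = {
	"mediatek,mt6779-devapc",
};

/*
 * Walk the node's compatible list, most specific string first, and
 * return the index of the first string the driver knows, or -1.
 */
static int first_match(const char *const *node_compat, size_t n)
{
	size_t n_drv = sizeof(driver_compat) / sizeof(driver_compat[0]);

	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < n_drv; j++)
			if (strcmp(node_compat[i], driver_compat[j]) == 0)
				return (int)i;
	return -1;
}
```

With a node listing "mediatek,mt0001-devapc" followed by "mediatek,mt6779-devapc", the first string finds no driver entry and the second one matches, so the node binds as an mt6779 device.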

Re: [PATCH bpf-next 5/5] libbpf: add selftests for TC-BPF API

2021-03-28 Thread Andrii Nakryiko

On Sun, Mar 28, 2021 at 6:40 PM Alexei Starovoitov
 wrote:
>
> On Sat, Mar 27, 2021 at 09:32:58PM -0700, Andrii Nakryiko wrote:
> > > I think it's better to start with new library for tc/xdp and have
> > > libbpf as a dependency on that new lib.
> > > For example we can add it as subdir in tools/lib/bpf/.
> > >
> > > Similarly I think integrating static linking into libbpf was a mistake.
> > > It should be a sub library as well.
> > >
> > > If we end up with core libbpf and ten sublibs for tc, xdp, af_xdp, 
> > > linking,
> > > whatever else the users would appreciate that we don't shove single libbpf
> > > to them with a ton of features that they might never use.
> >
> > What's the concern exactly? The size of the library? Having 10
> > micro-libraries has its own set of downsides,
>
> specifically?

You didn't answer my question, but from what you write below I assume
libbpf size is your main concern?

As for downsides, I'm sure I'm not yet seeing all of the problems
we'll encounter when splitting libbpf into 10 pieces. But as a user,
having to figure out which libraries I need to use is a big hassle.
E.g., for XDP application using ringbuf, I'll need libbpfelf,
libbpftrace, libbpfnet, which implicitly also would depend on
libsysbpf, libbtf, libbpfutil, I assume. So having to list 3 vs 1
library is already annoying, but when statically linking I'd need to
specify all 6. I'd very much rather know that it has to be -lbpf and it
will provide me with all the basics (and it's already -lz and -lelf in
static linking scenario, which I wish we could get rid of).

>
> > I'm not convinced that's
> > a better situation for end users. And would certainly cause more
> > hassle for libbpf developers and packagers.
>
> For developers and packagers.. yes.
> For users.. quite the opposite.

See above. I don't know what hassle libbpf is for users today. You
were implying code size used for functionality users might not use
(e.g., linker). Libbpf is a very small library, <300KB. There are
users building tools for constrained embedded systems that use libbpf.
There are/were various problems mentioned, but the size of libbpf
wasn't yet one of them. We should certainly watch the code bloat, but
we are not yet at the point where library is too big for users to be
turned off. In shared library case it's even less of a concern.

> The skel gen and static linking must be split out before the next libbpf 
> release.
> Not a single application linked with libbpf is going to use those pieces.
> bpftool is one and only that needs them. Hence forcing libbpf users
> to increase their .text with a dead code is a selfish call of libbpf
> developers and packagers. The user's priorities must come first.
>
> > And what did you include in "core libbpf"?
>
> I would take this opportunity to split libbpf into maintainable pieces:
> - libsysbpf - sys_bpf wrappers (pretty much tools/lib/bpf/bpf.c)
> - libbpfutil - hash, strset

strset and hash are internal data structures, I never intended to
expose them through public APIs. I haven't investigated, but if we
have a separate shared library (libbpfutil), I imagine we won't be
able to hide those APIs, right?

> - libbtf - BTF read/write
> - libbpfelf - ELF parsing, CORE, ksym, kconfig
> - libbpfskel - skeleton gen used by bpftool only

skeleton generation is already part of bpftool, there is no need to
split anything out

> - libbpflink - linker used by bpftool only
> - libbpfnet - networking attachment via netlink including TC and XDP
> - libbpftrace - perfbuf, ringbuf

ringbuf and perfbuf are both very small code-wise, and are used in
majority of BPF applications anyways

> - libxdp - Toke's xdp chaining
> - libxsk - af_xdp logic
>

Now, if we look at libbpf .o files, we can approximately see what
functionality is using most code:

File                Size  Percent

bpf.o              17800     4.88
bpf_prog_linfo.o    2952     0.81
btf_dump.o         20472     5.61
btf.o              58160    15.93
hashmap.o           4056     1.11
libbpf_errno.o      2912     0.80
libbpf.o          190072    52.06
libbpf_probes.o     6696     1.83
linker.o           29408     8.05
netlink.o           5944     1.63
nlattr.o            2744     0.75
ringbuf.o           6128     1.68
str_error.o         1640     0.45
strset.o            3656     1.00
xsk.o              12456     3.41

Total             365096   100.00

So libbpf.o, which has mostly bpf_object open/load logic and CO-RE, already
takes more than half. And it depends on still more stuff in btf,
hashmap, bpf, libbpf_probes, errno. But the final code size is even
smaller, because libbpf.so is just 285128 bytes (not 365096 as implied
by the table above), so even these numbers are pessimistic.

linker.o is about 8% of the code right now, but it actually takes less
than 29KB: when I remove linker.o and re-compile, the final libbpf.so
goes from 285128 to 267576 bytes, a reduction of 17552 bytes. Even if it
grows 2x, I'd still say it's not a big deal.
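
The arithmetic behind these numbers can be re-checked with a short sketch. The helper names share_x100() and size_delta() are made up for illustration; the inputs are the byte counts quoted in this message.

```c
#include <assert.h>

/*
 * Share of `part` in `total`, in hundredths of a percent, rounded to
 * the nearest unit (so 5206 means 52.06%).
 */
static unsigned long long share_x100(unsigned long long part,
				     unsigned long long total)
{
	assert(total != 0);
	return (part * 10000ULL + total / 2) / total;
}

/* Size difference between two builds of the shared library, in bytes. */
static long size_delta(long with_obj, long without_obj)
{
	return with_obj - without_obj;
}
```

For example, share_x100(190072, 365096) reproduces the 52.06% listed for libbpf.o, and size_delta(285128, 267576) gives the 17552 bytes attributed to the linker in the final libbpf.so.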

One reason to keep

Re: [PATCH 1/2] dt-bindings: devapc: Update bindings

2021-03-28 Thread Nina Wu

Hi, Rob

I just found that there is an unmerged patch that this one depends on:
https://patchwork.kernel.org/project/linux-mediatek/patch/20210324104110.13383-7-chun-jie.c...@mediatek.com/

I will add this to the commit message in the next version.

Thanks

On Fri, 2021-03-26 at 09:10 -0600, Rob Herring wrote:
> On Fri, 26 Mar 2021 15:31:10 +0800, Nina Wu wrote:
> > From: Nina Wu 
> > 
> > To support newer hardware architecture of devapc,
> > update device tree bindings.
> > 
> > Signed-off-by: Nina Wu 
> > ---
> >  .../devicetree/bindings/soc/mediatek/devapc.yaml   | 41 
> > ++
> >  1 file changed, 41 insertions(+)
> > 
> 
> My bot found errors running 'make dt_binding_check' on your patch:
> 
> yamllint warnings/errors:
> 
> dtschema/dtc warnings/errors:
> /builds/robherring/linux-dt-review/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml: properties:version:enum: False schema does not allow [1, 2]
> /builds/robherring/linux-dt-review/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml: properties:slave_type_num:enum: False schema does not allow [1, 4]
> /builds/robherring/linux-dt-review/Documentation/devicetree/bindings/soc/mediatek/devapc.yaml: ignoring, error in schema: properties: slave_type_num: enum
> warning: no schema found in file: ./Documentation/devicetree/bindings/soc/mediatek/devapc.yaml
> Documentation/devicetree/bindings/soc/mediatek/devapc.example.dts:51:18: fatal error: dt-bindings/clock/mt8192-clk.h: No such file or directory
>    51 | #include <dt-bindings/clock/mt8192-clk.h>
>       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> compilation terminated.
> make[1]: *** [scripts/Makefile.lib:349: Documentation/devicetree/bindings/soc/mediatek/devapc.example.dt.yaml] Error 1
> make[1]: *** Waiting for unfinished jobs....
> make: *** [Makefile:1380: dt_binding_check] Error 2
> 
> See 
> https://urldefense.com/v3/__https://patchwork.ozlabs.org/patch/1458687__;!!CTRNKA9wMg0ARbw!zZmjm74Leee8o-eaQUB_yHYvh-66g88Rgjozv_ecSkwW-yfo7G_c9o6-p0JlFfst3VI$
>  
> 
> This check can fail if there are any dependencies. The base for a patch
> series is generally the most recent rc1.
> 
> If you already ran 'make dt_binding_check' and didn't see the above
> error(s), then make sure 'yamllint' is installed and dt-schema is up to
> date:
> 
> pip3 install dtschema --upgrade
> 
> Please check and re-submit.
>

[PATCH V4] exit: trigger panic when global init has exited

2021-03-28 Thread Qianli Zhao

From: Qianli Zhao 

When init sub-threads running on different CPUs exit at the same time,
zap_pid_ns_processes()->BUG() may be hit (the timing is shown below). Move
panic() before PF_EXITING is set to fix this problem.

In addition, if panic() is called after other sub-threads have finished
do_exit(), some key variables (task->mm, task->nsproxy, etc.) of those
sub-threads are lost, which makes it difficult to parse a coredump from a
full dump. Check SIGNAL_GROUP_EXIT so that the panic happens before the
init sub-threads exit.

[   24.705376] Kernel panic - not syncing: Attempted to kill init! exitcode=0x7f00
[   24.705382] CPU: 4 PID: 552 Comm: init Tainted: G S O 4.14.180-perf-g4483caa8ae80-dirty #1
[   24.705390] kernel BUG at include/linux/pid_namespace.h:98!

PID: 552   CPU: 4   COMMAND: "init"
PID: 1     CPU: 7   COMMAND: "init"

core4                                   core7
...                                     sys_exit_group()
                                        do_group_exit()
                                           - sig->flags = SIGNAL_GROUP_EXIT
                                           - zap_other_threads()
                                        do_exit() //PF_EXITING is set
ret_to_user()
do_notify_resume()
get_signal()
    - signal_group_exit
    - goto fatal;
do_group_exit()
do_exit() //PF_EXITING is set
    - panic("Attempted to kill init! exitcode=0x%08x\n")
                                        exit_notify()
                                        find_alive_thread() //no alive sub-threads
                                        zap_pid_ns_processes() //CONFIG_PID_NS is not set
                                        BUG()

Signed-off-by: Qianli Zhao 
---
V4:
- Changelog update

V3:
- Use group_dead instead of thread_group_empty() to test single init exit.

V2:
- Changelog update
- Remove wrong usage of SIGNAL_UNKILLABLE.
- Add thread_group_empty() test to handle single init thread exit

---
 kernel/exit.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 04029e3..f95f8dc 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -766,6 +766,17 @@ void __noreturn do_exit(long code)
 
validate_creds_for_do_exit(tsk);
 
+   group_dead = atomic_dec_and_test(&tsk->signal->live);
+   /*
+* If global init has exited,
+* panic immediately to get a useable coredump.
+*/
+   if (unlikely(is_global_init(tsk) &&
+   (group_dead || (tsk->signal->flags & SIGNAL_GROUP_EXIT)))) {
+   panic("Attempted to kill init! exitcode=0x%08x\n",
+   tsk->signal->group_exit_code ?: (int)code);
+   }
+
/*
 * We're taking recursive faults here in do_exit. Safest is to just
 * leave this task alone and wait for reboot.
@@ -784,16 +795,8 @@ void __noreturn do_exit(long code)
if (tsk->mm)
sync_mm_rss(tsk->mm);
acct_update_integrals(tsk);
-   group_dead = atomic_dec_and_test(&tsk->signal->live);
-   if (group_dead) {
-   /*
-* If the last thread of global init has exited, panic
-* immediately to get a useable coredump.
-*/
-   if (unlikely(is_global_init(tsk)))
-   panic("Attempted to kill init! exitcode=0x%08x\n",
-   tsk->signal->group_exit_code ?: (int)code);
 
+   if (group_dead) {
 #ifdef CONFIG_POSIX_TIMERS
hrtimer_cancel(&tsk->signal->real_timer);
exit_itimers(tsk->signal);
-- 
1.9.1
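
The change in the panic condition can be sketched as a user-space toy model. This is only an illustration under assumptions: struct model_task, MODEL_SIGNAL_GROUP_EXIT and the two helper functions are hypothetical stand-ins, and the decrement of `live` only imitates what atomic_dec_and_test() does on signal->live.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_SIGNAL_GROUP_EXIT 0x4u	/* illustrative flag bit for the model */

struct model_task {
	bool is_global_init;
	unsigned int signal_flags;
	int live;		/* threads of the group that have not exited */
};

/* Before the patch: panic only when the last init thread exits. */
static bool should_panic_old(struct model_task *t)
{
	bool group_dead = (--t->live == 0);

	return group_dead && t->is_global_init;
}

/* After the patch: also panic as soon as a group exit is in progress. */
static bool should_panic_new(struct model_task *t)
{
	bool group_dead = (--t->live == 0);

	return t->is_global_init &&
	       (group_dead || (t->signal_flags & MODEL_SIGNAL_GROUP_EXIT));
}
```

With two live init threads and the group-exit flag set, the old check lets the first exiting thread continue, while the new check panics on that first thread, before the other sub-threads have torn down their state.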

Re: [PATCH v14 01/11] x86: kdump: replace the hard-coded alignment with macro CRASH_ALIGN

2021-03-28 Thread chenzhou




On 2021/3/2 15:43, Baoquan He wrote:
> On 02/26/21 at 09:38am, Eric W. Biederman wrote:
>> chenzhou  writes:
>>
>>> On 2021/2/25 15:25, Baoquan He wrote:
 On 02/24/21 at 02:19pm, Catalin Marinas wrote:
> On Sat, Jan 30, 2021 at 03:10:15PM +0800, Chen Zhou wrote:
>> Move CRASH_ALIGN to header asm/kexec.h for later use. Besides, the
>> alignment of crash kernel regions in x86 is 16M(CRASH_ALIGN), but
>> function reserve_crashkernel() also used 1M alignment. So just
>> replace hard-coded alignment 1M with macro CRASH_ALIGN.
> [...]
>> @@ -510,7 +507,7 @@ static void __init reserve_crashkernel(void)
>>  } else {
>>  unsigned long long start;
>>  
>> -start = memblock_phys_alloc_range(crash_size, SZ_1M, 
>> crash_base,
>> +start = memblock_phys_alloc_range(crash_size, 
>> CRASH_ALIGN, crash_base,
>>crash_base + 
>> crash_size);
>>  if (start != crash_base) {
>>  pr_info("crashkernel reservation failed - 
>> memory is in use.\n");
> There is a small functional change here for x86. Prior to this patch,
> crash_base passed by the user on the command line is allowed to be 1MB
> aligned. With this patch, such reservation will fail.
>
> Is the current behaviour a bug in the current x86 code or it does allow
> 1MB-aligned reservations?
 Hmm, you are right. Here we should keep 1MB alignment as is because
 users specify the address and size, their intention should be respected.
 The 1MB alignment for fixed memory region reservation was introduced in
 below commit, but it doesn't tell what is Eric's request at that time, I
 guess it meant respecting users' specifying.
>>
>>> I think we could make the alignment unified. Why is the alignment system 
>>> reserved and
>>> user specified different? Besides, there is no document about the 1MB 
>>> alignment.
>>> How about adding the alignment size(16MB) in doc  if user specified
>>> start address as arm64 does.
>> Looking at what the code is doing.  Attempting to reserve a crash region
>> at the location the user specified.  Adding unnecessary alignment
>> constraints is totally broken. 
>>
>> I am not even certain enforcing a 1MB alignment makes sense.  I suspect
>> it was added so that we don't accidentally reserve low memory on x86.
>> Frankly I am not even certain that makes sense.
>>
>> Now in practice there might be an argument for 2MB alignment that goes
>> with huge page sizes on x86.  But until someone finds that there are
>> actual problems with 1MB alignment I would not touch it.
>>
>> The proper response to something that isn't documented and confusing is
>> not to arbitrarily change it and risk breaking users.  Especially in
>> this case where it is clear that adding additional alignment is total
>> nonsense.  The proper response to something that isn't clear and
>> documented is to dig in and document it, or to leave it alone and let it
> Sounds reasonable. Then adding document or code comment around looks
> like a good way to go further so that people can easily get why its
> alignment is different than other reservation.
Hi Baoquan & Eric,

Sorry for the late reply, I missed it earlier.

Thanks for your explanation, I will just leave the 1MB alignment here as is.

I will introduce CRASH_ALIGN_SPECIFIED to help make the function
reserve_crashkernel generic. CRASH_ALIGN_SPECIFIED is used for a
user-specified start address and is distinct from the default CRASH_ALIGN.

Thanks,
Chen Zhou
>
>> be the next persons problem.
>>
>> In this case there is no reason for changing this bit of code.
>> All CRASH_ALIGN is about is a default alignment when none is specified.
>> It is not a functional requirement but just something so that things
>> come out nicely.
>>
>>
>> Eric
>>
> .
>
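
The split between the two alignments could look roughly like the sketch below. It is not the proposed patch: the MODEL_ macros and both helpers are hypothetical, the 16MB and 1MB values are the ones mentioned in this thread for x86, and treating a base of 0 as "not specified" is an assumption made for the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_1M				(1ULL << 20)
#define MODEL_CRASH_ALIGN		(16 * SZ_1M)	/* default alignment */
#define MODEL_CRASH_ALIGN_SPECIFIED	SZ_1M		/* user-specified base */

/* Pick the alignment: a base of 0 is taken to mean "not specified". */
static uint64_t crash_alignment(uint64_t user_base)
{
	return user_base ? MODEL_CRASH_ALIGN_SPECIFIED : MODEL_CRASH_ALIGN;
}

/* A user-specified base is accepted if it meets the 1MB alignment. */
static bool base_is_acceptable(uint64_t user_base)
{
	uint64_t align = crash_alignment(user_base);

	return (user_base & (align - 1)) == 0;
}
```

Under this model a base at 17MB is still accepted, which is the existing behaviour that Catalin and Eric asked to preserve; it would be rejected if the 16MB default were applied to user-specified addresses.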

[PATCH v2] powerpc: Fix HAVE_HARDLOCKUP_DETECTOR_ARCH build configuration

2021-03-28 Thread Chen Huang

When compiling powerpc with SMP disabled, the build fails with:

arch/powerpc/kernel/watchdog.c: In function ‘watchdog_smp_panic’:
arch/powerpc/kernel/watchdog.c:177:4: error: implicit declaration of function ‘smp_send_nmi_ipi’; did you mean ‘smp_send_stop’? [-Werror=implicit-function-declaration]
  177 |    smp_send_nmi_ipi(c, wd_lockup_ipi, 100);
      |    ^~~~~~~~~~~~~~~~
      |    smp_send_stop
cc1: all warnings being treated as errors
make[2]: *** [scripts/Makefile.build:273: arch/powerpc/kernel/watchdog.o] Error 1
make[1]: *** [scripts/Makefile.build:534: arch/powerpc/kernel] Error 2
make: *** [Makefile:1980: arch/powerpc] Error 2
make: *** Waiting for unfinished jobs....
The powerpc hardlockup watchdog is implemented with IPIs, so
HAVE_HARDLOCKUP_DETECTOR_ARCH should depend on SMP.

Fixes: 2104180a5369 ("powerpc/64s: implement arch-specific hardlockup watchdog")
Reported-by: Hulk Robot 
Signed-off-by: Chen Huang 
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 764df010baee..a5196e1a1281 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -210,6 +210,7 @@ config PPC
select HAVE_FUNCTION_TRACER
select HAVE_GCC_PLUGINS if GCC_VERSION >= 50200   # plugin support 
on gcc <= 5.1 is buggy on PPC
select HAVE_GENERIC_VDSO
+   select HAVE_HARDLOCKUP_DETECTOR_ARCH    if PPC_BOOK3S_64 && SMP
select HAVE_HW_BREAKPOINT   if PERF_EVENTS && (PPC_BOOK3S || PPC_8xx)
select HAVE_IDE
select HAVE_IOREMAP_PROT
@@ -225,7 +226,6 @@ config PPC
select HAVE_LIVEPATCH   if HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_MOD_ARCH_SPECIFIC
select HAVE_NMI if PERF_EVENTS || (PPC64 && PPC_BOOK3S)
-   select HAVE_HARDLOCKUP_DETECTOR_ARCH    if (PPC64 && PPC_BOOK3S)
select HAVE_OPTPROBES   if PPC64
select HAVE_PERF_EVENTS
select HAVE_PERF_EVENTS_NMI if PPC64
--
2.17.1

Re: [PATCH v2] arm64: dts: imx8mm/q: Fix pad control of SD1_DATA0

2021-03-28 Thread Shawn Guo

On Wed, Mar 24, 2021 at 02:28:41PM +0100, Oliver Stäbler wrote:
> Fix address of the pad control register
> (IOMUXC_SW_PAD_CTL_PAD_SD1_DATA0) for SD1_DATA0_GPIO2_IO2.  This seems
> to be a typo but it leads to an exception when pinctrl is applied due to
> wrong memory address access.
> 
> Signed-off-by: Oliver Stäbler 

Applied, thanks.

RE: [PATCH] arch: nios2: fix unmet dependency for SERIAL_CORE_CONSOLE

2021-03-28 Thread Tan, Ley Foon




> -Original Message-
> From: Julian Braha  On Behalf Of Julian Braha
> Sent: Saturday, March 27, 2021 2:56 AM
> To: Tan, Ley Foon 
> Cc: linux-kernel@vger.kernel.org; fazilyildi...@gmail.com
> Subject: [PATCH] arch: nios2: fix unmet dependency for
> SERIAL_CORE_CONSOLE
> 
> When EARLY_PRINTK is enabled and TTY is disabled, Kbuild gives the
> following warning:
> 
> WARNING: unmet direct dependencies detected for
> SERIAL_CORE_CONSOLE
>   Depends on [n]: TTY [=n] && HAS_IOMEM [=y]
>   Selected by [y]:
>   - EARLY_PRINTK [=y]
> 
> This is because EARLY_PRINTK selects SERIAL_CORE_CONSOLE without
> selecting or depending on TTY, despite SERIAL_CORE_CONSOLE depending
> on TTY.
> 
> Signed-off-by: Julian Braha 
> ---
>  arch/nios2/Kconfig.debug | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/nios2/Kconfig.debug b/arch/nios2/Kconfig.debug index
> a8bc06e96ef5..f453d5c1fd38 100644
> --- a/arch/nios2/Kconfig.debug
> +++ b/arch/nios2/Kconfig.debug
> @@ -3,6 +3,7 @@
>  config EARLY_PRINTK
>   bool "Activate early kernel debugging"
>   default y
> + depends on TTY
>   select SERIAL_CORE_CONSOLE
>   help
> Enable early printk on console
> --
> 2.25.1

Acked-by: Ley Foon Tan

linux-next: manual merge of the drm tree with Linus' tree

2021-03-28 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the drm tree got a conflict in:

  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

between commits:

  9adb125dde69 ("drm/amdgpu: re-enable suspend phase 2 for S0ix")
  4021229e32bd ("drm/amdgpu/swsmu: skip gfx cgpg on s0ix suspend")
  9bb735abcbd8 ("drm/amdgpu: update comments about s0ix suspend/resume")

from Linus' tree and commit:

  e3c1b0712fdb ("drm/amdgpu: Reset the devices in the XGMI hive duirng probe")

from the drm tree.

I fixed it up (I think - see below) and can carry the fix as necessary.
This is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 8a5a8ff5d362,0f82c5d21237..
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@@ -2743,16 -2712,15 +2720,25 @@@ static int amdgpu_device_ip_suspend_pha
continue;
}
  
 +  /* skip suspend of gfx and psp for S0ix
 +   * gfx is in gfxoff state, so on resume it will exit gfxoff just
 +   * like at runtime. PSP is also part of the always on hardware
 +   * so no need to suspend it.
 +   */
 +  if (adev->in_s0ix &&
 +  (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_PSP 
||
 +   adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GFX))
 +  continue;
 +
+   /* skip unnecessary suspend if we do not initialize them yet */
+   if (adev->gmc.xgmi.pending_reset &&
+   !(adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_GMC ||
+ adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_SMC ||
+ adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_COMMON ||
+ adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_IH)) {
+   adev->ip_blocks[i].status.hw = false;
+   continue;
+   }
/* XXX handle errors */
r = adev->ip_blocks[i].version->funcs->suspend(adev);
/* XXX handle errors */



Re: Linux 5.12-rc5

2021-03-28 Thread Guenter Roeck

On Sun, Mar 28, 2021 at 04:05:54PM -0700, Linus Torvalds wrote:
> So if rc4 was perhaps a bit smaller than average, it looks like rc5 is
> bigger than average.  We're not breaking any records, but it
> certainly isn't tiny, and the rc's aren't shrinking.
> 
> I'm not overly worried yet, but let's just say that the trend had
> better not continue, or I'll start feeling like we will need to make
> this one of those releases that need an rc8.
> 
> Most of the changes are drivers (gpu and networking stand out, but
> there's various other smaller driver updates elsewhere too) with core
> networking (including bpf) fixes being another noticeable subsystem.
> 
> Other than that, there's a smattering of noise all over: minor arch
> fixes, some filesystem fixes (btrfs, cifs, squashfs), selinux, perf
> tools, documentation.
> 
> io_uring continues to have noise in it, this time mainly due to some
> signal handling fixes. That removed a fair amount of problematic
> special casing, but the timing certainly isn't great.
> 
> So again, nothing really scary, just rather more than I would have
> liked to have in an rc5.
> 
> Shortlog appended for people who want to delve into the details,
> 

Build results:
total: 151 pass: 151 fail: 0
Qemu test results:
total: 458 pass: 457 fail: 1
Failed tests:
openrisc:or1ksim_defconfig

This is not really a new problem. I enabled devicetree unit tests
in the openrisc kernel and was rewarded with a crash.
https://lore.kernel.org/lkml/20210327224116.69309-1-li...@roeck-us.net/
has all the glorious details.

Guenter

[PATCH v2 09/13] dt-bindings: mmc: Add Pensando Elba SoC binding

2021-03-28 Thread Brad Larson

Pensando Elba ARM 64-bit SoC is integrated with this IP

Signed-off-by: Brad Larson 
---
 Documentation/devicetree/bindings/mmc/cdns,sdhci.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/mmc/cdns,sdhci.yaml b/Documentation/devicetree/bindings/mmc/cdns,sdhci.yaml
index af7442f73881..3e8eb3254b99 100644
--- a/Documentation/devicetree/bindings/mmc/cdns,sdhci.yaml
+++ b/Documentation/devicetree/bindings/mmc/cdns,sdhci.yaml
@@ -18,6 +18,7 @@ properties:
 items:
   - enum:
   - socionext,uniphier-sd4hc
+  - pensando,elba-emmc
   - const: cdns,sd4hc
 
   reg:
-- 
2.17.1

Re: [PATCH] powerpc: Fix HAVE_HARDLOCKUP_DETECTOR_ARCH build configuration

2021-03-28 Thread Chen Huang




On 2021/3/28 19:06, Christophe Leroy wrote:
> 
> 
> On 27/03/2021 at 10:49, Chen Huang wrote:
>> When compiling the powerpc with the SMP disabled, it shows the issue:
>>
>> arch/powerpc/kernel/watchdog.c: In function ‘watchdog_smp_panic’:
>> arch/powerpc/kernel/watchdog.c:177:4: error: implicit declaration of 
>> function ‘smp_send_nmi_ipi’; did you mean ‘smp_send_stop’? 
>> [-Werror=implicit-function-declaration]
>>    177 |    smp_send_nmi_ipi(c, wd_lockup_ipi, 100);
>>    |    ^~~~
>>    |    smp_send_stop
>> cc1: all warnings being treated as errors
>> make[2]: *** [scripts/Makefile.build:273: arch/powerpc/kernel/watchdog.o] 
>> Error 1
>> make[1]: *** [scripts/Makefile.build:534: arch/powerpc/kernel] Error 2
>> make: *** [Makefile:1980: arch/powerpc] Error 2
>> make: *** Waiting for unfinished jobs
>>
>> We found that powerpc used ipi to implement hardlockup watchdog, so the
>> HAVE_HARDLOCKUP_DETECTOR_ARCH should depend on the SMP.
>>
>> Fixes: 2104180a5369 ("powerpc/64s: implement arch-specific hardlockup 
>> watchdog")
>> Reported-by: Hulk Robot 
>> Signed-off-by: Chen Huang 
>> ---
>>   arch/powerpc/Kconfig | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index 764df010baee..2d4f37b117ce 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -225,7 +225,7 @@ config PPC
>>   select HAVE_LIVEPATCH    if HAVE_DYNAMIC_FTRACE_WITH_REGS
>>   select HAVE_MOD_ARCH_SPECIFIC
>>   select HAVE_NMI    if PERF_EVENTS || (PPC64 && PPC_BOOK3S)
>> -    select HAVE_HARDLOCKUP_DETECTOR_ARCH    if (PPC64 && PPC_BOOK3S)
>> +    select HAVE_HARDLOCKUP_DETECTOR_ARCH    if PPC64 && PPC_BOOK3S && SMP
> 
> While modifying this line, you should restore the alphabetic order by moving 
> it up.
> 
> You can use PPC_BOOK3S_64 instead of PPC64 && PPC_BOOK3S
> 

I will modify it. Thanks!

>>   select HAVE_OPTPROBES    if PPC64
>>   select HAVE_PERF_EVENTS
>>   select HAVE_PERF_EVENTS_NMI    if PPC64
>>
> .

[PATCH v2 05/13] mmc: sdhci-cadence: Add Pensando Elba SoC support

2021-03-28 Thread Brad Larson

Add support for Pensando Elba SoC which explicitly controls
byte-lane enables on writes.  Refactor to allow platform
specific write ops.

Signed-off-by: Brad Larson 
---
 drivers/mmc/host/Kconfig  |  15 +++
 drivers/mmc/host/Makefile |   1 +
 drivers/mmc/host/sdhci-cadence-elba.c | 137 ++
 drivers/mmc/host/sdhci-cadence.c  |  81 ---
 drivers/mmc/host/sdhci-cadence.h  |  68 +
 5 files changed, 260 insertions(+), 42 deletions(-)
 create mode 100644 drivers/mmc/host/sdhci-cadence-elba.c
 create mode 100644 drivers/mmc/host/sdhci-cadence.h
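
The byte-lane handling described in the commit message can be tried outside the kernel. What follows is a minimal sketch, assuming (from the constants used by the write accessors in this patch) that bits 6:3 of the control register act as per-byte lane enables. The helper names are hypothetical and are not part of the driver; only the arithmetic mirrors elba_write_b(), elba_write_w() and elba_write_l().

```c
#include <stdint.h>

/*
 * Hypothetical helpers: each returns the value the patch writes to
 * priv->ctl_addr before the actual register access.
 */

/* Byte write: enable the single lane selected by the low two address bits. */
static inline uint32_t elba_lane_ctl_b(unsigned int reg)
{
	return (0x1u << (reg & 0x3)) << 3;
}

/* 16-bit write: enable two adjacent lanes starting at the addressed one. */
static inline uint32_t elba_lane_ctl_w(unsigned int reg)
{
	return (0x3u << (reg & 0x3)) << 3;
}

/* 32-bit write: enable all four lanes (0xf << 3 == 0x78). */
static inline uint32_t elba_lane_ctl_l(void)
{
	return 0x78;
}
```

For example, a 16-bit write to a register at offset 0x2e would program 0x60 into the control register under these assumptions, since only the low two offset bits select the lanes.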

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index b236dfe2e879..65ea323c06f2 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -250,6 +250,21 @@ config MMC_SDHCI_CADENCE
 
  If unsure, say N.
 
+config MMC_SDHCI_CADENCE_ELBA
+   tristate "SDHCI support for the Pensando/Cadence SD/SDIO/eMMC controller"
+   depends on ARCH_PENSANDO_ELBA_SOC
+   depends on MMC_SDHCI
+   depends on OF
+   depends on MMC_SDHCI_CADENCE
+   depends on MMC_SDHCI_PLTFM
+   select MMC_SDHCI_IO_ACCESSORS
+   help
+ This selects the Pensando/Cadence SD/SDIO/eMMC controller.
+
+ If you have a controller with this interface, say Y or M here.
+
+ If unsure, say N.
+
 config MMC_SDHCI_CNS3XXX
tristate "SDHCI support on the Cavium Networks CNS3xxx SoC"
depends on ARCH_CNS3XXX || COMPILE_TEST
diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
index 6df5c4774260..f2a6d50e64de 100644
--- a/drivers/mmc/host/Makefile
+++ b/drivers/mmc/host/Makefile
@@ -80,6 +80,7 @@ obj-$(CONFIG_MMC_REALTEK_USB) += rtsx_usb_sdmmc.o
 
 obj-$(CONFIG_MMC_SDHCI_PLTFM)  += sdhci-pltfm.o
 obj-$(CONFIG_MMC_SDHCI_CADENCE)+= sdhci-cadence.o
+obj-$(CONFIG_MMC_SDHCI_CADENCE_ELBA)   += sdhci-cadence-elba.o
 obj-$(CONFIG_MMC_SDHCI_CNS3XXX)+= sdhci-cns3xxx.o
 obj-$(CONFIG_MMC_SDHCI_ESDHC_MCF)   += sdhci-esdhc-mcf.o
 obj-$(CONFIG_MMC_SDHCI_ESDHC_IMX)  += sdhci-esdhc-imx.o
diff --git a/drivers/mmc/host/sdhci-cadence-elba.c b/drivers/mmc/host/sdhci-cadence-elba.c
new file mode 100644
index ..ec23f43de407
--- /dev/null
+++ b/drivers/mmc/host/sdhci-cadence-elba.c
@@ -0,0 +1,137 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2020 Pensando Systems, Inc.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "sdhci-pltfm.h"
+#include "sdhci-cadence.h"
+
+// delay regs address
+#define SDIO_REG_HRS4  0x10
+#define REG_DELAY_HS   0x00
+#define REG_DELAY_DEFAULT  0x01
+#define REG_DELAY_UHSI_SDR50   0x04
+#define REG_DELAY_UHSI_DDR50   0x05
+
+static void elba_write_l(struct sdhci_host *host, u32 val, int reg)
+{
+   struct sdhci_cdns_priv *priv = sdhci_cdns_priv(host);
+   unsigned long flags;
+
+   spin_lock_irqsave(&priv->wrlock, flags);
+   writel(0x78, priv->ctl_addr);
+   writel(val, host->ioaddr + reg);
+   spin_unlock_irqrestore(&priv->wrlock, flags);
+}
+
+static void elba_write_w(struct sdhci_host *host, u16 val, int reg)
+{
+   struct sdhci_cdns_priv *priv = sdhci_cdns_priv(host);
+   unsigned long flags;
+   u32 m = (reg & 0x3);
+   u32 msk = (0x3 << (m));
+
+   spin_lock_irqsave(&priv->wrlock, flags);
+   writel(msk << 3, priv->ctl_addr);
+   writew(val, host->ioaddr + reg);
+   spin_unlock_irqrestore(&priv->wrlock, flags);
+}
+
+static void elba_write_b(struct sdhci_host *host, u8 val, int reg)
+{
+   struct sdhci_cdns_priv *priv = sdhci_cdns_priv(host);
+   unsigned long flags;
+   u32 m = (reg & 0x3);
+   u32 msk = (0x1 << (m));
+
+   spin_lock_irqsave(&priv->wrlock, flags);
+   writel(msk << 3, priv->ctl_addr);
+   writeb(val, host->ioaddr + reg);
+   spin_unlock_irqrestore(&priv->wrlock, flags);
+}
+
+static void elba_priv_write_l(struct sdhci_cdns_priv *priv,
+   u32 val, void __iomem *reg)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&priv->wrlock, flags);
+   writel(0x78, priv->ctl_addr);
+   writel(val, reg);
+   spin_unlock_irqrestore(&priv->wrlock, flags);
+}
+
+static const struct sdhci_ops sdhci_elba_ops = {
+   .write_l = elba_write_l,
+   .write_w = elba_write_w,
+   .write_b = elba_write_b,
+   .set_clock = sdhci_set_clock,
+   .get_timeout_clock = sdhci_cdns_get_timeout_clock,
+   .set_bus_width = sdhci_set_bus_width,
+   .reset = sdhci_reset,
+   .set_uhs_signaling = sdhci_cdns_set_uhs_signaling,
+};
+
+static void sd4_set_dlyvr(struct sdhci_host *host,
+ unsigned char addr, unsigned char data)
+{
+   unsigned long dlyrv_reg;
+
+   dlyrv_reg = ((unsigned long)data << 8);
+   dlyrv_reg |= addr;
+
+   // set data and address
+   writel(dlyrv_reg, host->ioaddr + SDIO_REG_HRS4);
+   dlyrv_reg |= (1uL

[PATCH v2 07/13] arm64: dts: Add Pensando Elba SoC support

2021-03-28 Thread Brad Larson

Add Pensando common and Elba SoC specific device nodes

Signed-off-by: Brad Larson 
---
 arch/arm64/boot/dts/Makefile  |   1 +
 arch/arm64/boot/dts/pensando/Makefile |   6 +
 arch/arm64/boot/dts/pensando/elba-16core.dtsi | 170 ++
 .../boot/dts/pensando/elba-asic-common.dtsi   | 112 +++
 arch/arm64/boot/dts/pensando/elba-asic.dts|   7 +
 .../boot/dts/pensando/elba-flash-parts.dtsi   |  78 +
 arch/arm64/boot/dts/pensando/elba.dtsi| 310 ++
 7 files changed, 684 insertions(+)
 create mode 100644 arch/arm64/boot/dts/pensando/Makefile
 create mode 100644 arch/arm64/boot/dts/pensando/elba-16core.dtsi
 create mode 100644 arch/arm64/boot/dts/pensando/elba-asic-common.dtsi
 create mode 100644 arch/arm64/boot/dts/pensando/elba-asic.dts
 create mode 100644 arch/arm64/boot/dts/pensando/elba-flash-parts.dtsi
 create mode 100644 arch/arm64/boot/dts/pensando/elba.dtsi

diff --git a/arch/arm64/boot/dts/Makefile b/arch/arm64/boot/dts/Makefile
index f1173cd93594..c85db0a097fe 100644
--- a/arch/arm64/boot/dts/Makefile
+++ b/arch/arm64/boot/dts/Makefile
@@ -19,6 +19,7 @@ subdir-y += marvell
 subdir-y += mediatek
 subdir-y += microchip
 subdir-y += nvidia
+subdir-y += pensando
 subdir-y += qcom
 subdir-y += realtek
 subdir-y += renesas
diff --git a/arch/arm64/boot/dts/pensando/Makefile b/arch/arm64/boot/dts/pensando/Makefile
new file mode 100644
index ..0c2c0961e64a
--- /dev/null
+++ b/arch/arm64/boot/dts/pensando/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+dtb-$(CONFIG_ARCH_PENSANDO_ELBA_SOC) += elba-asic.dtb
+
+always-y   := $(dtb-y)
+subdir-y   := $(dts-dirs)
+clean-files:= *.dtb
diff --git a/arch/arm64/boot/dts/pensando/elba-16core.dtsi b/arch/arm64/boot/dts/pensando/elba-16core.dtsi
new file mode 100644
index ..a6c47899b69a
--- /dev/null
+++ b/arch/arm64/boot/dts/pensando/elba-16core.dtsi
@@ -0,0 +1,170 @@
+
+/ {
+   cpus {
+   #address-cells = <2>;
+   #size-cells = <0>;
+
+   cpu-map {
+   cluster0 {
+   core0 { cpu = <&cpu0>; };
+   core1 { cpu = <&cpu1>; };
+   core2 { cpu = <&cpu2>; };
+   core3 { cpu = <&cpu3>; };
+   };
+   cluster1 {
+   core0 { cpu = <&cpu4>; };
+   core1 { cpu = <&cpu5>; };
+   core2 { cpu = <&cpu6>; };
+   core3 { cpu = <&cpu7>; };
+   };
+   cluster2 {
+   core0 { cpu = <&cpu8>; };
+   core1 { cpu = <&cpu9>; };
+   core2 { cpu = <&cpu10>; };
+   core3 { cpu = <&cpu11>; };
+   };
+   cluster3 {
+   core0 { cpu = <&cpu12>; };
+   core1 { cpu = <&cpu13>; };
+   core2 { cpu = <&cpu14>; };
+   core3 { cpu = <&cpu15>; };
+   };
+   };
+
+   // CLUSTER 0
+   cpu0: cpu@0 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a72", "arm,armv8";
+   reg = <0 0x0>;
+   enable-method = "spin-table";
+   next-level-cache = <&l2_0>;
+   };
+   cpu1: cpu@1 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a72", "arm,armv8";
+   reg = <0 0x1>;
+   enable-method = "spin-table";
+   next-level-cache = <&l2_0>;
+   };
+   cpu2: cpu@2 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a72", "arm,armv8";
+   reg = <0 0x2>;
+   enable-method = "spin-table";
+   next-level-cache = <&l2_0>;
+   };
+   cpu3: cpu@3 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a72", "arm,armv8";
+   reg = <0 0x3>;
+   enable-method = "spin-table";
+   next-level-cache = <&l2_0>;
+   };
+
+   l2_0: l2-cache0 {
+   compatible = "cache";
+   };
+
+   // CLUSTER 1
+   cpu4: cpu@100 {
+   device_type = "cpu";
+   compatible = "arm,cortex-a72", "arm,armv8";
+   reg = <0 0x100>;
+   enable-method = "spin-table";
+   next-level-cache = <&l2_1>;
+   };
+   cpu5: cpu@101 {
+   device_type = "cpu";
+   compatible =

[PATCH v2 11/13] dt-bindings: gpio: Add Pensando Elba SoC support

2021-03-28 Thread Brad Larson

The Pensando Elba SoC gpio driver provides control
of four chip selects on two SPI busses.

Signed-off-by: Brad Larson 
---
 .../bindings/gpio/pensando,elba-spics.yaml| 50 +++
 1 file changed, 50 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/gpio/pensando,elba-spics.yaml

diff --git a/Documentation/devicetree/bindings/gpio/pensando,elba-spics.yaml b/Documentation/devicetree/bindings/gpio/pensando,elba-spics.yaml
new file mode 100644
index ..c93b481d4ad3
--- /dev/null
+++ b/Documentation/devicetree/bindings/gpio/pensando,elba-spics.yaml
@@ -0,0 +1,50 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/gpio/pensando,elba-spics.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Pensando Elba SPI Chip Select Driver
+
+description: |
+  The Pensando Elba SoC provides four SPI bus chip selects.
+
+maintainers:
+  - Brad Larson 
+
+properties:
+  $nodename:
+pattern: "^spics@[0-9a-f]+$"
+  
+  compatible:
+const: pensando,elba-spics
+
+  reg:
+maxItems: 1
+
+  gpio-controller: true
+
+  "#gpio-cells":
+const: 2
+
+required:
+  - compatible
+  - reg
+  - gpio-controller
+  - "#gpio-cells"
+
+additionalProperties: false
+
+examples:
+  - |
+soc {
+#address-cells = <2>;
+#size-cells = <2>;
+
+spics: spics@307c2468 {
+compatible = "pensando,elba-spics";
+reg = <0x0 0x307c2468 0x0 0x4>;
+gpio-controller;
+#gpio-cells = <2>;
+};
+};
-- 
2.17.1

[PATCH v2 12/13] MAINTAINERS: Add entry for PENSANDO

2021-03-28 Thread Brad Larson

Add entry for PENSANDO maintainer and files

Signed-off-by: Brad Larson 
---
 MAINTAINERS | 9 +
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fb2a3633b719..272c7a7fde75 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2246,6 +2246,15 @@ S:   Maintained
 W: http://hackndev.com
 F: arch/arm/mach-pxa/palmz72.*
 
+ARM/PENSANDO SUPPORT
+M: Brad Larson 
+L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
+S: Maintained
+F: Documentation/devicetree/bindings/*/pensando*
+F: arch/arm64/boot/dts/pensando/
+F: drivers/gpio/gpio-elba-spics.c
+F: drivers/mmc/host/sdhci-cadence-elba.c
+
 ARM/PLEB SUPPORT
 M: Peter Chubb 
 S: Maintained
-- 
2.17.1

[PATCH v2 13/13] gpio: Use linux/gpio/driver.h

2021-03-28 Thread Brad Larson

New drivers should include <linux/gpio/driver.h> instead
of legacy <linux/gpio.h>.

Signed-off-by: Brad Larson 
---
 drivers/gpio/gpio-elba-spics.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpio/gpio-elba-spics.c b/drivers/gpio/gpio-elba-spics.c
index 351bbaeea033..c0dce5333f35 100644
--- a/drivers/gpio/gpio-elba-spics.c
+++ b/drivers/gpio/gpio-elba-spics.c
@@ -6,11 +6,10 @@
  */
 
 #include 
-#include <linux/gpio.h>
+#include <linux/gpio/driver.h>
 #include 
 #include 
 #include 
-//#include 
 #include 
 #include 
 #include 
-- 
2.17.1

[PATCH v2 10/13] dt-bindings: spi: cadence-qspi: Add support for Pensando Elba SoC

2021-03-28 Thread Brad Larson

Add new vendor Pensando Systems Elba SoC compatible
string and convert to json-schema.

Signed-off-by: Brad Larson 
---
 .../bindings/spi/cadence-quadspi.txt  |  68 
 .../bindings/spi/cadence-quadspi.yaml | 153 ++
 2 files changed, 153 insertions(+), 68 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/spi/cadence-quadspi.txt
 create mode 100644 Documentation/devicetree/bindings/spi/cadence-quadspi.yaml

diff --git a/Documentation/devicetree/bindings/spi/cadence-quadspi.txt b/Documentation/devicetree/bindings/spi/cadence-quadspi.txt
deleted file mode 100644
index 8ace832a2d80..
--- a/Documentation/devicetree/bindings/spi/cadence-quadspi.txt
+++ /dev/null
@@ -1,68 +0,0 @@
-* Cadence Quad SPI controller
-
-Required properties:
-- compatible : should be one of the following:
-   Generic default - "cdns,qspi-nor".
-   For TI 66AK2G SoC - "ti,k2g-qspi", "cdns,qspi-nor".
-   For TI AM654 SoC  - "ti,am654-ospi", "cdns,qspi-nor".
-   For Intel LGM SoC - "intel,lgm-qspi", "cdns,qspi-nor".
-- reg : Contains two entries, each of which is a tuple consisting of a
-   physical address and length. The first entry is the address and
-   length of the controller register set. The second entry is the
-   address and length of the QSPI Controller data area.
-- interrupts : Unit interrupt specifier for the controller interrupt.
-- clocks : phandle to the Quad SPI clock.
-- cdns,fifo-depth : Size of the data FIFO in words.
-- cdns,fifo-width : Bus width of the data FIFO in bytes.
-- cdns,trigger-address : 32-bit indirect AHB trigger address.
-
-Optional properties:
-- cdns,is-decoded-cs : Flag to indicate whether decoder is used or not.
-- cdns,rclk-en : Flag to indicate that QSPI return clock is used to latch
-  the read data rather than the QSPI clock. Make sure that QSPI return
-  clock is populated on the board before using this property.
-
-Optional subnodes:
-Subnodes of the Cadence Quad SPI controller are spi slave nodes with additional
-custom properties:
-- cdns,read-delay : Delay for read capture logic, in clock cycles
-- cdns,tshsl-ns : Delay in nanoseconds for the length that the master
-  mode chip select outputs are de-asserted between
- transactions.
-- cdns,tsd2d-ns : Delay in nanoseconds between one chip select being
-  de-activated and the activation of another.
-- cdns,tchsh-ns : Delay in nanoseconds between last bit of current
-  transaction and deasserting the device chip select
- (qspi_n_ss_out).
-- cdns,tslch-ns : Delay in nanoseconds between setting qspi_n_ss_out low
-  and first bit transfer.
-- resets   : Must contain an entry for each entry in reset-names.
- See ../reset/reset.txt for details.
-- reset-names  : Must include either "qspi" and/or "qspi-ocp".
-
-Example:
-
-   qspi: spi@ff705000 {
-   compatible = "cdns,qspi-nor";
-   #address-cells = <1>;
-   #size-cells = <0>;
-   reg = <0xff705000 0x1000>,
- <0xffa0 0x1000>;
-   interrupts = <0 151 4>;
-   clocks = <_clk>;
-   cdns,is-decoded-cs;
-   cdns,fifo-depth = <128>;
-   cdns,fifo-width = <4>;
-   cdns,trigger-address = <0x>;
-   resets = < QSPI_RESET>, < QSPI_OCP_RESET>;
-   reset-names = "qspi", "qspi-ocp";
-
-   flash0: n25q00@0 {
-   ...
-   cdns,read-delay = <4>;
-   cdns,tshsl-ns = <50>;
-   cdns,tsd2d-ns = <50>;
-   cdns,tchsh-ns = <4>;
-   cdns,tslch-ns = <4>;
-   };
-   };
diff --git a/Documentation/devicetree/bindings/spi/cadence-quadspi.yaml b/Documentation/devicetree/bindings/spi/cadence-quadspi.yaml
new file mode 100644
index ..94d631045153
--- /dev/null
+++ b/Documentation/devicetree/bindings/spi/cadence-quadspi.yaml
@@ -0,0 +1,153 @@
+# SPDX-License-Identifier: GPL-2.0-only
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/spi/cadence-quadspi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Cadence Quad SPI controller
+
+maintainers:
+  - Ramuthevar Vadivel Murugan 
+  - Brad Larson 
+
+properties:
+  compatible:
+contains:
+  enum:
+- cdns,qspi-nor   # Generic default
+- ti,k2g-qspi # TI 66AK2G SoC
+- ti,am654-ospi   # TI AM654 SoC
+- intel,lgm-qspi  # Intel LGM SoC
+- pensando,cdns-qspi  # Pensando Elba SoC
+
+  '#address-cells':
+const: 1
+
+  '#size-cells':
+const: 0
+
+  reg:
+minItems: 2
+maxItems: 2
+description: |
+  Contains two entries, each of which is a tuple consisting of a
+  physical address and length. The first entry is the address and
+

[PATCH v2 00/13] Support Pensando Elba SoC

2021-03-28 Thread Brad Larson

This series enables support for Pensando Elba SoC based platforms.
The Elba SoC has the following features:

- Sixteen ARM64 A72 cores
- Dual DDR 4/5 memory controllers
- 32 lanes of PCIe Gen3/4 to the Host
- Network interfaces: Dual 200GE, Quad 100GE, 50GE, 25GE, 10GE and
  also a single 1GE management port.
- Storage/crypto offloads and 144 programmable P4 cores.
- QSPI and EMMC for SoC storage
- Two SPI interfaces for peripheral management
- I2C bus for platform management

See below for an overview of changes since v1.

== Patch overview ==

- 01    Fix typo, return code value and log message.
- 03    Remove else clause, intrinsic DW chip-select is never used.
- 08-11 Split out dts and bindings to sub-patches
- 10    Converted existing cadence-quadspi.txt to YAML schema
- 13    New driver should use <linux/gpio/driver.h>

Brad Larson (13):
  gpio: Add Elba SoC gpio driver for spi cs control
  spi: cadence-quadspi: Add QSPI support for Pensando Elba SoC
  spi: dw: Add support for Pensando Elba SoC SPI
  spidev: Add Pensando CPLD compatible
  mmc: sdhci-cadence: Add Pensando Elba SoC support
  arm64: Add config for Pensando SoC platforms
  arm64: dts: Add Pensando Elba SoC support
  dt-bindings: Add pensando vendor prefix
  dt-bindings: mmc: Add Pensando Elba SoC binding
  dt-bindings: spi: cadence-qspi: Add support for Pensando Elba SoC
  dt-bindings: gpio: Add Pensando Elba SoC support
  MAINTAINERS: Add entry for PENSANDO
  gpio: Use linux/gpio/driver.h

 .../bindings/gpio/pensando,elba-spics.yaml|  50 +++
 .../devicetree/bindings/mmc/cdns,sdhci.yaml   |   1 +
 .../bindings/spi/cadence-quadspi.txt  |  68 
 .../bindings/spi/cadence-quadspi.yaml | 153 +
 .../devicetree/bindings/vendor-prefixes.yaml  |   2 +
 MAINTAINERS   |   9 +
 arch/arm64/Kconfig.platforms  |   5 +
 arch/arm64/boot/dts/Makefile  |   1 +
 arch/arm64/boot/dts/pensando/Makefile |   6 +
 arch/arm64/boot/dts/pensando/elba-16core.dtsi | 170 ++
 .../boot/dts/pensando/elba-asic-common.dtsi   | 112 +++
 arch/arm64/boot/dts/pensando/elba-asic.dts|   7 +
 .../boot/dts/pensando/elba-flash-parts.dtsi   |  78 +
 arch/arm64/boot/dts/pensando/elba.dtsi| 310 ++
 drivers/gpio/Kconfig  |   6 +
 drivers/gpio/Makefile |   1 +
 drivers/gpio/gpio-elba-spics.c| 113 +++
 drivers/mmc/host/Kconfig  |  15 +
 drivers/mmc/host/Makefile |   1 +
 drivers/mmc/host/sdhci-cadence-elba.c | 137 
 drivers/mmc/host/sdhci-cadence.c  |  81 +++--
 drivers/mmc/host/sdhci-cadence.h  |  68 
 drivers/spi/spi-cadence-quadspi.c |   9 +
 drivers/spi/spi-dw-mmio.c |  28 +-
 drivers/spi/spidev.c  |   1 +
 25 files changed, 1321 insertions(+), 111 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/gpio/pensando,elba-spics.yaml
 delete mode 100644 Documentation/devicetree/bindings/spi/cadence-quadspi.txt
 create mode 100644 Documentation/devicetree/bindings/spi/cadence-quadspi.yaml
 create mode 100644 arch/arm64/boot/dts/pensando/Makefile
 create mode 100644 arch/arm64/boot/dts/pensando/elba-16core.dtsi
 create mode 100644 arch/arm64/boot/dts/pensando/elba-asic-common.dtsi
 create mode 100644 arch/arm64/boot/dts/pensando/elba-asic.dts
 create mode 100644 arch/arm64/boot/dts/pensando/elba-flash-parts.dtsi
 create mode 100644 arch/arm64/boot/dts/pensando/elba.dtsi
 create mode 100644 drivers/gpio/gpio-elba-spics.c
 create mode 100644 drivers/mmc/host/sdhci-cadence-elba.c
 create mode 100644 drivers/mmc/host/sdhci-cadence.h

-- 
2.17.1

[PATCH v2 03/13] spi: dw: Add support for Pensando Elba SoC SPI

2021-03-28 Thread Brad Larson

The Pensando Elba SoC uses a GPIO based chip select
for two DW SPI busses with each bus having two
chip selects.

Signed-off-by: Brad Larson 
---
 drivers/spi/spi-dw-mmio.c | 28 +++-
 1 file changed, 27 insertions(+), 1 deletion(-)
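
The chip-select handling this patch adds comes down to choosing one of two values for the DW_SPI_SER register. A minimal user-space sketch, assuming only what the hunk in this patch shows: the name `elba_ser_value` and the `SER_CS0` macro are hypothetical, and the sense of `enable` simply mirrors the `if (!enable)` test in dw_spi_elba_set_cs().

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of the slave-enable register, i.e. the controller's own CS0. */
#define SER_CS0 (1u << 0)

/*
 * Hypothetical helper: value written to DW_SPI_SER for a given
 * 'enable' argument. With a GPIO-driven chip select the controller
 * still needs one native CS bit set to start its serial engine, so
 * CS0 is selected when 'enable' is false and cleared otherwise.
 */
static uint32_t elba_ser_value(bool enable)
{
	return enable ? 0u : SER_CS0;
}
```

The real chip select is toggled separately through the GPIO; this value only gates the controller's serial engine.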

diff --git a/drivers/spi/spi-dw-mmio.c b/drivers/spi/spi-dw-mmio.c
index 17c06039a74d..c323a5ceecb8 100644
--- a/drivers/spi/spi-dw-mmio.c
+++ b/drivers/spi/spi-dw-mmio.c
@@ -56,7 +56,7 @@ struct dw_spi_mscc {
 /*
  * The Designware SPI controller (referred to as master in the documentation)
  * automatically deasserts chip select when the tx fifo is empty. The chip
- * selects then needs to be either driven as GPIOs or, for the first 4 using the
+ * selects then needs to be either driven as GPIOs or, for the first 4 using
  * the SPI boot controller registers. the final chip select is an OR gate
  * between the Designware SPI controller and the SPI boot controller.
  */
@@ -237,6 +237,31 @@ static int dw_spi_canaan_k210_init(struct platform_device *pdev,
return 0;
 }
 
+static void dw_spi_elba_set_cs(struct spi_device *spi, bool enable)
+{
+   struct dw_spi *dws = spi_master_get_devdata(spi->master);
+
+   if (!enable) {
+   /*
+* Using a GPIO-based chip-select, the DW SPI
+* controller still needs its own CS bit selected
+* to start the serial engine.  On Elba the specific
+* CS doesn't matter to start the serial engine,
+* so using CS0.
+*/
+   dw_writel(dws, DW_SPI_SER, BIT(0));
+   } else {
+   dw_writel(dws, DW_SPI_SER, 0);
+   }
+}
+
+static int dw_spi_elba_init(struct platform_device *pdev,
+   struct dw_spi_mmio *dwsmmio)
+{
+   dwsmmio->dws.set_cs = dw_spi_elba_set_cs;
+   return 0;
+}
+
 static int dw_spi_mmio_probe(struct platform_device *pdev)
 {
int (*init_func)(struct platform_device *pdev,
@@ -351,6 +376,7 @@ static const struct of_device_id dw_spi_mmio_of_match[] = {
{ .compatible = "intel,keembay-ssi", .data = dw_spi_keembay_init},
{ .compatible = "microchip,sparx5-spi", dw_spi_mscc_sparx5_init},
{ .compatible = "canaan,k210-spi", dw_spi_canaan_k210_init},
+   { .compatible = "pensando,elba-spi", .data = dw_spi_elba_init},
{ /* end of table */}
 };
 MODULE_DEVICE_TABLE(of, dw_spi_mmio_of_match);
-- 
2.17.1

[PATCH v2 04/13] spidev: Add Pensando CPLD compatible

2021-03-28 Thread Brad Larson

Pensando Elba SoC platforms have a SPI connected CPLD
for platform management.

Signed-off-by: Brad Larson 
---
 drivers/spi/spidev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/spi/spidev.c b/drivers/spi/spidev.c
index 8cb4d923aeaa..8b285852ce82 100644
--- a/drivers/spi/spidev.c
+++ b/drivers/spi/spidev.c
@@ -683,6 +683,7 @@ static const struct of_device_id spidev_dt_ids[] = {
{ .compatible = "dh,dhcom-board" },
{ .compatible = "menlo,m53cpld" },
{ .compatible = "cisco,spi-petra" },
+   { .compatible = "pensando,cpld" },
{},
 };
 MODULE_DEVICE_TABLE(of, spidev_dt_ids);
-- 
2.17.1

[PATCH v2 08/13] dt-bindings: Add pensando vendor prefix

2021-03-28 Thread Brad Larson

Add vendor prefix for Pensando Systems, Inc.

Signed-off-by: Brad Larson 
---
 Documentation/devicetree/bindings/vendor-prefixes.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/vendor-prefixes.yaml b/Documentation/devicetree/bindings/vendor-prefixes.yaml
index f6064d84a424..9a21d780c5e1 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.yaml
+++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml
@@ -850,6 +850,8 @@ patternProperties:
 description: Parallax Inc.
   "^pda,.*":
 description: Precision Design Associates, Inc.
+  "^pensando,.*":
+description: Pensando Systems Inc.
   "^pericom,.*":
 description: Pericom Technology Inc.
   "^pervasive,.*":
-- 
2.17.1

[PATCH v2 06/13] arm64: Add config for Pensando SoC platforms

2021-03-28 Thread Brad Larson

Add ARCH_PENSANDO configuration option for Pensando SoC
based platforms.

Signed-off-by: Brad Larson 
---
 arch/arm64/Kconfig.platforms | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
index cdfd5fed457f..803e7cf1df55 100644
--- a/arch/arm64/Kconfig.platforms
+++ b/arch/arm64/Kconfig.platforms
@@ -210,6 +210,11 @@ config ARCH_MXC
  This enables support for the ARMv8 based SoCs in the
  NXP i.MX family.
 
+config ARCH_PENSANDO
+   bool "Pensando Platforms"
+   help
+ This enables support for the ARMv8 based Pensando chipsets
+
 config ARCH_QCOM
bool "Qualcomm Platforms"
select GPIOLIB
-- 
2.17.1

[PATCH v2 01/13] gpio: Add Elba SoC gpio driver for spi cs control

2021-03-28 Thread Brad Larson

This GPIO driver is for the Pensando Elba SoC which
provides control of four chip selects on two SPI busses.

Signed-off-by: Brad Larson 
---
 drivers/gpio/Kconfig   |   6 ++
 drivers/gpio/Makefile  |   1 +
 drivers/gpio/gpio-elba-spics.c | 114 +
 3 files changed, 121 insertions(+)
 create mode 100644 drivers/gpio/gpio-elba-spics.c

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index e3607ec4c2e8..4720459b24f5 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -241,6 +241,12 @@ config GPIO_EIC_SPRD
help
  Say yes here to support Spreadtrum EIC device.
 
+config GPIO_ELBA_SPICS
+   bool "Pensando Elba SPI chip-select"
+   depends on (ARCH_PENSANDO_ELBA_SOC || COMPILE_TEST)
+   help
+ Say yes here to support the Pensando Elba SoC SPI chip-select driver
+
 config GPIO_EM
tristate "Emma Mobile GPIO"
depends on (ARCH_EMEV2 || COMPILE_TEST) && OF_GPIO
diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index c58a90a3c3b1..c5c7acad371b 100644
--- a/drivers/gpio/Makefile
+++ b/drivers/gpio/Makefile
@@ -54,6 +54,7 @@ obj-$(CONFIG_GPIO_DAVINCI)+= gpio-davinci.o
 obj-$(CONFIG_GPIO_DLN2)+= gpio-dln2.o
 obj-$(CONFIG_GPIO_DWAPB)   += gpio-dwapb.o
 obj-$(CONFIG_GPIO_EIC_SPRD)+= gpio-eic-sprd.o
+obj-$(CONFIG_GPIO_ELBA_SPICS)  += gpio-elba-spics.o
 obj-$(CONFIG_GPIO_EM)  += gpio-em.o
 obj-$(CONFIG_GPIO_EP93XX)  += gpio-ep93xx.o
 obj-$(CONFIG_GPIO_EXAR)+= gpio-exar.o
diff --git a/drivers/gpio/gpio-elba-spics.c b/drivers/gpio/gpio-elba-spics.c
new file mode 100644
index ..351bbaeea033
--- /dev/null
+++ b/drivers/gpio/gpio-elba-spics.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Pensando Elba SoC SPI chip select driver
+ *
+ * Copyright (c) 2020-2021, Pensando Systems Inc.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+//#include 
+#include 
+#include 
+#include 
+
+/*
+ * pin: 32|   10
+ * bit: 7--6--5--4|---3--2--1--0
+ * cs1  cs1_ovr  cs0  cs0_ovr |  cs1  cs1_ovr  cs0  cs0_ovr
+ *ssi1| ssi0
+ */
+#define SPICS_PIN_SHIFT(pin)   (2 * (pin))
+#define SPICS_MASK(pin)(0x3 << SPICS_PIN_SHIFT(pin))
+#define SPICS_SET(pin, val)    ((((val) << 1) | 0x1) << SPICS_PIN_SHIFT(pin))
+
+struct elba_spics_priv {
+   void __iomem *base;
+   spinlock_t lock;
+   struct gpio_chip chip;
+};
+
+static int elba_spics_get_value(struct gpio_chip *chip, unsigned int pin)
+{
+   return -ENOTSUPP;
+}
+
+static void elba_spics_set_value(struct gpio_chip *chip,
+   unsigned int pin, int value)
+{
+   struct elba_spics_priv *p = gpiochip_get_data(chip);
+   unsigned long flags;
+   u32 tmp;
+
+   /* select chip select from register */
+   spin_lock_irqsave(&p->lock, flags);
+   tmp = readl_relaxed(p->base);
+   tmp = (tmp & ~SPICS_MASK(pin)) | SPICS_SET(pin, value);
+   writel_relaxed(tmp, p->base);
+   spin_unlock_irqrestore(&p->lock, flags);
+}
+
+static int elba_spics_direction_input(struct gpio_chip *chip, unsigned int pin)
+{
+   return -ENOTSUPP;
+}
+
+static int elba_spics_direction_output(struct gpio_chip *chip,
+   unsigned int pin, int value)
+{
+   elba_spics_set_value(chip, pin, value);
+   return 0;
+}
+
+static int elba_spics_probe(struct platform_device *pdev)
+{
+   struct elba_spics_priv *p;
+   struct resource *res;
+   int ret = 0;
+
+   p = devm_kzalloc(&pdev->dev, sizeof(*p), GFP_KERNEL);
+   if (!p)
+   return -ENOMEM;
+
+   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+   p->base = devm_ioremap_resource(&pdev->dev, res);
+   if (IS_ERR(p->base))
+   return PTR_ERR(p->base);
+   spin_lock_init(&p->lock);
+   platform_set_drvdata(pdev, p);
+
+   p->chip.ngpio = 4;  /* 2 cs pins for spi0, and 2 for spi1 */
+   p->chip.base = -1;
+   p->chip.direction_input = elba_spics_direction_input;
+   p->chip.direction_output = elba_spics_direction_output;
+   p->chip.get = elba_spics_get_value;
+   p->chip.set = elba_spics_set_value;
+   p->chip.label = dev_name(&pdev->dev);
+   p->chip.parent = &pdev->dev;
+   p->chip.owner = THIS_MODULE;
+
+   ret = devm_gpiochip_add_data(&pdev->dev, &p->chip, p);
+   if (ret)
+   dev_err(&pdev->dev, "unable to add gpio chip\n");
+   return ret;
+}
+
+static const struct of_device_id elba_spics_of_match[] = {
+   { .compatible = "pensando,elba-spics" },
+   {}
+};
+
+static struct platform_driver elba_spics_driver = {
+   .probe = elba_spics_probe,
+   .driver = {
+   .name = "pensando-elba-spics",
+   .of_match_table = elba_spics_of_match,
+   },
+};
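
The read-modify-write done in elba_spics_set_value() can be checked in isolation. A minimal sketch, assuming the macro definitions as they read in this patch (two bits per pin, with the override bit in the lower position and the chip-select level above it); the function `spics_update` is a hypothetical stand-in for the register update and takes the current register value as an argument instead of reading hardware.

```c
#include <stdint.h>

#define SPICS_PIN_SHIFT(pin)	(2 * (pin))
#define SPICS_MASK(pin)		(0x3u << SPICS_PIN_SHIFT(pin))
#define SPICS_SET(pin, val)	((((val) << 1) | 0x1u) << SPICS_PIN_SHIFT(pin))

/*
 * Hypothetical helper: compute the new register contents for driving
 * chip-select 'pin' (0..3) to 'value' (0 or 1). The two bits owned by
 * the pin are cleared, then the override bit is set together with the
 * requested level; bits of the other pins are left untouched.
 */
static uint32_t spics_update(uint32_t reg, unsigned int pin, unsigned int value)
{
	return (reg & ~SPICS_MASK(pin)) | SPICS_SET(pin, value);
}
```

Under these definitions, driving pin 1 low from a cleared register yields 0x4 (override set, level clear), and driving pin 0 high yields 0x3.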


Mail list logo