[PATCH RESEND -next] mtd: rawnand: Remove redundant dev_err call in hisi_nfc_probe()

2021-04-07 Thread Wei Li
There is an error message within devm_ioremap_resource()
already, so remove the dev_err() call to avoid a redundant
error message.

Reported-by: Hulk Robot 
Signed-off-by: Wei Li 
---
 drivers/mtd/nand/raw/hisi504_nand.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/mtd/nand/raw/hisi504_nand.c b/drivers/mtd/nand/raw/hisi504_nand.c
index 8b2122ce6ec3..78c4e05434e2 100644
--- a/drivers/mtd/nand/raw/hisi504_nand.c
+++ b/drivers/mtd/nand/raw/hisi504_nand.c
@@ -761,10 +761,8 @@ static int hisi_nfc_probe(struct platform_device *pdev)
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
host->mmio = devm_ioremap_resource(dev, res);
-   if (IS_ERR(host->mmio)) {
-   dev_err(dev, "devm_ioremap_resource[1] fail\n");
+   if (IS_ERR(host->mmio))
return PTR_ERR(host->mmio);
-   }
 
mtd->name   = "hisi_nand";
mtd->dev.parent = &pdev->dev;



[PATCH RESEND -next] drm: kirin: Remove redundant dev_err call in ade_hw_ctx_alloc()

2021-04-07 Thread Wei Li
There is an error message within devm_ioremap_resource()
already, so remove the dev_err() call to avoid a redundant
error message.

Reported-by: Hulk Robot 
Signed-off-by: Wei Li 
---
 drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
index 6dcf9ec05eec..78a792048c42 100644
--- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
+++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
@@ -857,10 +857,8 @@ static void *ade_hw_ctx_alloc(struct platform_device *pdev,
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
ctx->base = devm_ioremap_resource(dev, res);
-   if (IS_ERR(ctx->base)) {
-   DRM_ERROR("failed to remap ade io base\n");
+   if (IS_ERR(ctx->base))
return ERR_PTR(-EIO);
-   }
 
ctx->reset = devm_reset_control_get(dev, NULL);
if (IS_ERR(ctx->reset))



Re: [PATCH] arm64: mm: decrease the section size to reduce the memory reserved for the page map

2020-12-07 Thread Wei Li
(+ saberlily + jiapeng)

On 2020/12/7 18:39, Anshuman Khandual wrote:
> 
> 
> On 12/7/20 3:34 PM, Mike Rapoport wrote:
>> On Mon, Dec 07, 2020 at 10:49:26AM +0100, Ard Biesheuvel wrote:
>>> On Mon, 7 Dec 2020 at 10:42, Mike Rapoport  wrote:
>>>>
>>>> On Mon, Dec 07, 2020 at 09:35:06AM +, Marc Zyngier wrote:
>>>>> On 2020-12-07 09:09, Ard Biesheuvel wrote:
>>>>>> (+ Marc)
>>>>>>
>>>>>> On Fri, 4 Dec 2020 at 12:14, Will Deacon  wrote:
>>>>>>>
>>>>>>> On Fri, Dec 04, 2020 at 09:44:43AM +0800, Wei Li wrote:
>>>>>>>> For the memory hole, sparse memory model that define SPARSEMEM_VMEMMAP
>>>>>>>> do not free the reserved memory for the page map, decrease the section
>>>>>>>> size can reduce the waste of reserved memory.
>>>>>>>>
>>>>>>>> Signed-off-by: Wei Li 
>>>>>>>> Signed-off-by: Baopeng Feng 
>>>>>>>> Signed-off-by: Xia Qing 
>>>>>>>> ---
>>>>>>>>  arch/arm64/include/asm/sparsemem.h | 2 +-
>>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/arch/arm64/include/asm/sparsemem.h b/arch/arm64/include/asm/sparsemem.h
>>>>>>>> index 1f43fcc79738..8963bd3def28 100644
>>>>>>>> --- a/arch/arm64/include/asm/sparsemem.h
>>>>>>>> +++ b/arch/arm64/include/asm/sparsemem.h
>>>>>>>> @@ -7,7 +7,7 @@
>>>>>>>>
>>>>>>>>  #ifdef CONFIG_SPARSEMEM
>>>>>>>>  #define MAX_PHYSMEM_BITS CONFIG_ARM64_PA_BITS
>>>>>>>> -#define SECTION_SIZE_BITS30
>>>>>>>> +#define SECTION_SIZE_BITS27
>>>>>>>
>>>>>>> We chose '30' to avoid running out of bits in the page flags. What
>>>>>>> changed?
>>>>>>>
>>>>>>> With this patch, I can trigger:
>>>>>>>
>>>>>>> ./include/linux/mmzone.h:1170:2: error: Allocator MAX_ORDER exceeds
>>>>>>> SECTION_SIZE
>>>>>>> #error Allocator MAX_ORDER exceeds SECTION_SIZE
>>>>>>>
>>>>>>> if I bump up NR_CPUS and NODES_SHIFT.
>>>>>>>
>>>>>>
>>>>>> Does this mean we will run into problems with the GICv3 ITS LPI tables
>>>>>> again if we are forced to reduce MAX_ORDER to fit inside
>>>>>> SECTION_SIZE_BITS?
>>>>>
>>>>> Most probably. We are already massively constraint on platforms
>>>>> such as TX1, and dividing the max allocatable range by 8 isn't
>>>>> going to make it work any better...
>>>>
>>>> I don't think MAX_ORDER should shrink. Even if SECTION_SIZE_BITS is
>>>> reduced it should accomodate the existing MAX_ORDER.
>>>>
>>>> My two pennies.
>>>>
>>>
>>> But include/linux/mmzone.h:1170 has this:
>>>
>>> #if (MAX_ORDER - 1 + PAGE_SHIFT) > SECTION_SIZE_BITS
>>> #error Allocator MAX_ORDER exceeds SECTION_SIZE
>>> #endif
>>>
>>> and Will managed to trigger it after applying this patch.
>>
>> Right, because with 64K pages section size of 27 bits is not enough to
>> accomodate MAX_ORDER (2^13 pages of 64K).
>>
>> Which means that definition of SECTION_SIZE_BITS should take MAX_ORDER
>> into account either statically with 
>>
>> #ifdef ARM64_4K_PAGES
>> #define SECTION_SIZE_BITS 
>> #elif ARM64_16K_PAGES
>> #define SECTION_SIZE_BITS 
>> #elif ARM64_64K_PAGES
>> #define SECTION_SIZE_BITS 
>> #else
>> #error "and what is the page size?"
>> #endif
>>
>> or dynamically, like e.g. ia64 does:
>>
>> #ifdef CONFIG_FORCE_MAX_ZONEORDER
>> #if ((CONFIG_FORCE_MAX_ZONEORDER - 1 + PAGE_SHIFT) > SECTION_SIZE_BITS)
>> #undef SECTION_SIZE_BITS
>> #define SECTION_SIZE_BITS (CONFIG_FORCE_MAX_ZONEORDER - 1 + PAGE_SHIFT)
>> #endif
> 
> I had proposed the same on the other thread here. But with this the
> SECTION_SIZE_BITS becomes 22 in case of 4K page size reducing to an
> extent where PMD based vmemmap mapping could not be created. Though
> have not looked into much details yet.
> 
> Using CONFIG_FORCE_MAX_ZONEORDER seems to the right thing to do. But
> if that does not reasonably work for 4K pages, we might have to hard
> code it as 27 to have huge page vmemmap mappings.
> .
> 


[PATCH] arm64: mm: decrease the section size to reduce the memory reserved for the page map

2020-12-03 Thread Wei Li
With a sparse memory model and SPARSEMEM_VMEMMAP enabled, the memory
reserved for the page map of memory holes is not freed. Decreasing the
section size can reduce this waste of reserved memory.

Signed-off-by: Wei Li 
Signed-off-by: Baopeng Feng 
Signed-off-by: Xia Qing 
---
 arch/arm64/include/asm/sparsemem.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/sparsemem.h b/arch/arm64/include/asm/sparsemem.h
index 1f43fcc79738..8963bd3def28 100644
--- a/arch/arm64/include/asm/sparsemem.h
+++ b/arch/arm64/include/asm/sparsemem.h
@@ -7,7 +7,7 @@

 #ifdef CONFIG_SPARSEMEM
 #define MAX_PHYSMEM_BITS   CONFIG_ARM64_PA_BITS
-#define SECTION_SIZE_BITS  30
+#define SECTION_SIZE_BITS  27
 #endif

 #endif
--
2.15.0



[PATCH v4] drivers/perf: Add support for ARMv8.3-SPE

2020-12-03 Thread Wei Li
Armv8.3 extends the SPE by adding:
- Alignment field in the Events packet, and filtering on this event
  using PMSEVFR_EL1.
- Support for the Scalable Vector Extension (SVE).

The main additions for SVE are:
- Recording the vector length for SVE operations in the Operation Type
  packet. It is not possible to filter on vector length.
- Incomplete predicate and empty predicate fields in the Events packet,
  and filtering on these events using PMSEVFR_EL1.

Update the PMSEVFR_EL1 RES0 check for the empty/partial predicate SVE
and alignment events in the SPE driver.

Signed-off-by: Wei Li 
---
v3 -> v4:
 - Return the highest supported version by default in arm_spe_pmsevfr_res0().
 - Drop the exposing of 'pmsver'.
   (Suggested by Will.)
---
v2 -> v3:
 - Make the definition of 'pmsevfr_res0' progressive and easy to check.
   (Suggested by Will.)
---
v1 -> v2:
 - Rename 'pmuver' to 'pmsver', change its type to 'u16' from 'int'.
   (Suggested by Will and Leo.)
 - Expose 'pmsver' as cap attribute through sysfs, instead of printing.
   (Suggested by Will.)
---
 arch/arm64/include/asm/sysreg.h |  9 -
 drivers/perf/arm_spe_pmu.c  | 17 +++--
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index d52c1b3ce589..57e5aee6f7e6 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -287,7 +287,11 @@
 #define SYS_PMSFCR_EL1_ST_SHIFT	18
 
 #define SYS_PMSEVFR_EL1			sys_reg(3, 0, 9, 9, 5)
-#define SYS_PMSEVFR_EL1_RES0		0x0000ffff00ff0f55UL
+#define SYS_PMSEVFR_EL1_RES0_8_2   \
+   (GENMASK_ULL(47, 32) | GENMASK_ULL(23, 16) | GENMASK_ULL(11, 8) |\
+BIT_ULL(6) | BIT_ULL(4) | BIT_ULL(2) | BIT_ULL(0))
+#define SYS_PMSEVFR_EL1_RES0_8_3   \
+   (SYS_PMSEVFR_EL1_RES0_8_2 & ~(BIT_ULL(18) | BIT_ULL(17) | BIT_ULL(11)))
 
 #define SYS_PMSLATFR_EL1   sys_reg(3, 0, 9, 9, 6)
 #define SYS_PMSLATFR_EL1_MINLAT_SHIFT  0
@@ -829,6 +833,9 @@
 #define ID_AA64DFR0_PMUVER_8_5 0x6
 #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
 
+#define ID_AA64DFR0_PMSVER_8_2 0x1
+#define ID_AA64DFR0_PMSVER_8_3 0x2
+
 #define ID_DFR0_PERFMON_SHIFT  24
 
 #define ID_DFR0_PERFMON_8_1		0x4
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index cc00915ad6d1..bce9aff9f546 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -54,7 +54,7 @@ struct arm_spe_pmu {
struct hlist_node   hotplug_node;
 
int irq; /* PPI */
-
+   u16 pmsver;
u16 min_period;
u16 counter_sz;
 
@@ -655,6 +655,18 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev)
return IRQ_HANDLED;
 }
 
+static u64 arm_spe_pmsevfr_res0(u16 pmsver)
+{
+   switch (pmsver) {
+   case ID_AA64DFR0_PMSVER_8_2:
+   return SYS_PMSEVFR_EL1_RES0_8_2;
+   case ID_AA64DFR0_PMSVER_8_3:
+   /* Return the highest version we support by default */
+   default:
+   return SYS_PMSEVFR_EL1_RES0_8_3;
+   }
+}
+
 /* Perf callbacks */
 static int arm_spe_pmu_event_init(struct perf_event *event)
 {
@@ -670,7 +682,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
!cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
return -ENOENT;
 
-   if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
+   if (arm_spe_event_to_pmsevfr(event) & arm_spe_pmsevfr_res0(spe_pmu->pmsver))
return -EOPNOTSUPP;
 
if (attr->exclude_idle)
@@ -937,6 +949,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
fld, smp_processor_id());
return;
}
+   spe_pmu->pmsver = (u16)fld;
 
/* Read PMBIDR first to determine whether or not we have access */
reg = read_sysreg_s(SYS_PMBIDR_EL1);
-- 
2.17.1



[PATCH] MIPS: SMP-CPS: Add support for irq migration when CPU offline

2020-12-02 Thread Wei Li
Currently we do not migrate IRQs when a CPU goes offline, although this
has been implemented on most architectures. That can lead to devices
working incorrectly if the cores they are bound to are offline.

IRQ migration can easily be supported by enabling GENERIC_IRQ_MIGRATION,
though it is not clear why it was never enabled on all MIPS platforms.

This patch adds support for IRQ migration on the MIPS CPS platform; it
has been tested on the interAptiv processor.

Signed-off-by: Wei Li 
---
 arch/mips/Kconfig  | 1 +
 arch/mips/kernel/smp-cps.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index a48cb9a71471..8ece19ffe255 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -2510,6 +2510,7 @@ config MIPS_CPS
select SYS_SUPPORTS_SCHED_SMT if CPU_MIPSR6
select SYS_SUPPORTS_SMP
select WEAK_ORDERING
+   select GENERIC_IRQ_MIGRATION if HOTPLUG_CPU
help
  Select this if you wish to run an SMP kernel across multiple cores
  within a MIPS Coherent Processing System. When this option is
diff --git a/arch/mips/kernel/smp-cps.c b/arch/mips/kernel/smp-cps.c
index 3ab433a8e871..26f74f7d7604 100644
--- a/arch/mips/kernel/smp-cps.c
+++ b/arch/mips/kernel/smp-cps.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -465,6 +466,7 @@ static int cps_cpu_disable(void)
smp_mb__after_atomic();
set_cpu_online(cpu, false);
calculate_cpu_foreign_map();
+   irq_migrate_all_off_this_cpu();
 
return 0;
 }
-- 
2.17.1



[PATCH v3] drivers/perf: Add support for ARMv8.3-SPE

2020-11-26 Thread Wei Li
Armv8.3 extends the SPE by adding:
- Alignment field in the Events packet, and filtering on this event
  using PMSEVFR_EL1.
- Support for the Scalable Vector Extension (SVE).

The main additions for SVE are:
- Recording the vector length for SVE operations in the Operation Type
  packet. It is not possible to filter on vector length.
- Incomplete predicate and empty predicate fields in the Events packet,
  and filtering on these events using PMSEVFR_EL1.

Update the PMSEVFR_EL1 RES0 check for the empty/partial predicate SVE
and alignment events in the SPE driver. To allow userspace to adapt to
the SPE version, expose 'pmsver' as a cap attribute.

Signed-off-by: Wei Li 
---
v2 -> v3:
 - Make the definition of 'pmsevfr_res0' progressive and easy to check.
   (Suggested by Will.)
---
v1 -> v2:
 - Rename 'pmuver' to 'pmsver', change its type to 'u16' from 'int'.
   (Suggested by Will and Leo.)
 - Expose 'pmsver' as cap attribute through sysfs, instead of printing.
   (Suggested by Will.)
---
 arch/arm64/include/asm/sysreg.h |  9 -
 drivers/perf/arm_spe_pmu.c  | 21 +++--
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index d52c1b3ce589..57e5aee6f7e6 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -287,7 +287,11 @@
 #define SYS_PMSFCR_EL1_ST_SHIFT	18
 
 #define SYS_PMSEVFR_EL1			sys_reg(3, 0, 9, 9, 5)
-#define SYS_PMSEVFR_EL1_RES0		0x0000ffff00ff0f55UL
+#define SYS_PMSEVFR_EL1_RES0_8_2   \
+   (GENMASK_ULL(47, 32) | GENMASK_ULL(23, 16) | GENMASK_ULL(11, 8) |\
+BIT_ULL(6) | BIT_ULL(4) | BIT_ULL(2) | BIT_ULL(0))
+#define SYS_PMSEVFR_EL1_RES0_8_3   \
+   (SYS_PMSEVFR_EL1_RES0_8_2 & ~(BIT_ULL(18) | BIT_ULL(17) | BIT_ULL(11)))
 
 #define SYS_PMSLATFR_EL1   sys_reg(3, 0, 9, 9, 6)
 #define SYS_PMSLATFR_EL1_MINLAT_SHIFT  0
@@ -829,6 +833,9 @@
 #define ID_AA64DFR0_PMUVER_8_5 0x6
 #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
 
+#define ID_AA64DFR0_PMSVER_8_2 0x1
+#define ID_AA64DFR0_PMSVER_8_3 0x2
+
 #define ID_DFR0_PERFMON_SHIFT  24
 
 #define ID_DFR0_PERFMON_8_1		0x4
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index cc00915ad6d1..515c51271d7f 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -54,7 +54,7 @@ struct arm_spe_pmu {
struct hlist_node   hotplug_node;
 
int irq; /* PPI */
-
+   u16 pmsver;
u16 min_period;
u16 counter_sz;
 
@@ -93,6 +93,7 @@ enum arm_spe_pmu_capabilities {
SPE_PMU_CAP_FEAT_MAX,
SPE_PMU_CAP_CNT_SZ = SPE_PMU_CAP_FEAT_MAX,
SPE_PMU_CAP_MIN_IVAL,
+   SPE_PMU_CAP_PMSVER,
 };
 
 static int arm_spe_pmu_feat_caps[SPE_PMU_CAP_FEAT_MAX] = {
@@ -110,6 +111,8 @@ static u32 arm_spe_pmu_cap_get(struct arm_spe_pmu *spe_pmu, int cap)
return spe_pmu->counter_sz;
case SPE_PMU_CAP_MIN_IVAL:
return spe_pmu->min_period;
+   case SPE_PMU_CAP_PMSVER:
+   return spe_pmu->pmsver;
default:
WARN(1, "unknown cap %d\n", cap);
}
@@ -143,6 +146,7 @@ static struct attribute *arm_spe_pmu_cap_attr[] = {
SPE_CAP_EXT_ATTR_ENTRY(ernd, SPE_PMU_CAP_ERND),
SPE_CAP_EXT_ATTR_ENTRY(count_size, SPE_PMU_CAP_CNT_SZ),
SPE_CAP_EXT_ATTR_ENTRY(min_interval, SPE_PMU_CAP_MIN_IVAL),
+   SPE_CAP_EXT_ATTR_ENTRY(pmsver, SPE_PMU_CAP_PMSVER),
NULL,
 };
 
@@ -655,6 +659,18 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev)
return IRQ_HANDLED;
 }
 
+static u64 arm_spe_pmsevfr_res0(u16 pmsver)
+{
+   switch (pmsver) {
+   case ID_AA64DFR0_PMSVER_8_2:
+   return SYS_PMSEVFR_EL1_RES0_8_2;
+   case ID_AA64DFR0_PMSVER_8_3:
+   return SYS_PMSEVFR_EL1_RES0_8_3;
+   default:
+   return -1;
+   }
+}
+
 /* Perf callbacks */
 static int arm_spe_pmu_event_init(struct perf_event *event)
 {
@@ -670,7 +686,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
!cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
return -ENOENT;
 
-   if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
+   if (arm_spe_event_to_pmsevfr(event) & arm_spe_pmsevfr_res0(spe_pmu->pmsver))
return -EOPNOTSUPP;
 
if (attr->exclude_idle)
@@ -937,6 +953,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
fld, smp_processor_id());
return;
}
+   spe_pmu->pmsver = (u16)fld;
 
/* Read PMBIDR first to determine whether or not we have access */
reg = read_sysreg_s(SYS_PMBIDR_EL1);
-- 
2.17.1



[PATCH] drivers/pcmcia: Fix error return code in electra_cf_probe()

2020-11-23 Thread Wei Li
When of_get_property() fails, the code just jumps to 'fail1' while
'status', which will be returned, has not been updated.

Fixes: 2b571a066a2f ("pcmcia: CompactFlash driver for PA Semi Electra boards")
Signed-off-by: Wei Li 
---
 drivers/pcmcia/electra_cf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pcmcia/electra_cf.c b/drivers/pcmcia/electra_cf.c
index 35158cfd9c1a..0570758e3fa8 100644
--- a/drivers/pcmcia/electra_cf.c
+++ b/drivers/pcmcia/electra_cf.c
@@ -228,6 +228,7 @@ static int electra_cf_probe(struct platform_device *ofdev)
}
 
cf->socket.pci_irq = cf->irq;
+   status = -ENODEV;
 
prop = of_get_property(np, "card-detect-gpio", NULL);
if (!prop)
-- 
2.17.1



[PATCH] net: fs_enet: Fix incorrect IS_ERR_VALUE macro usages

2020-11-23 Thread Wei Li
The IS_ERR_VALUE() macro should be used only with the unsigned long
type. In particular, it works incorrectly with shorter unsigned types
on 64-bit machines.

Fixes: 976de6a8c304 ("fs_enet: Be an of_platform device when CONFIG_PPC_CPM_NEW_BINDING is set.")
Fixes: 4c35630ccda5 ("[POWERPC] Change rheap functions to use ulongs instead of pointers")
Signed-off-by: Wei Li 
---
 drivers/net/ethernet/freescale/fs_enet/mac-fcc.c | 2 +-
 drivers/net/ethernet/freescale/fs_enet/mac-scc.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c
index b47490be872c..e2117ad46130 100644
--- a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c
+++ b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c
@@ -107,7 +107,7 @@ static int do_pd_setup(struct fs_enet_private *fep)
 
fep->fcc.mem = (void __iomem *)cpm2_immr;
fpi->dpram_offset = cpm_dpalloc(128, 32);
-   if (IS_ERR_VALUE(fpi->dpram_offset)) {
+   if (IS_ERR_VALUE((unsigned long)(int)fpi->dpram_offset)) {
ret = fpi->dpram_offset;
goto out_fcccp;
}
diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-scc.c b/drivers/net/ethernet/freescale/fs_enet/mac-scc.c
index 64300ac13e02..90f82df0b1bb 100644
--- a/drivers/net/ethernet/freescale/fs_enet/mac-scc.c
+++ b/drivers/net/ethernet/freescale/fs_enet/mac-scc.c
@@ -136,7 +136,7 @@ static int allocate_bd(struct net_device *dev)
 
fep->ring_mem_addr = cpm_dpalloc((fpi->tx_ring + fpi->rx_ring) *
 sizeof(cbd_t), 8);
-   if (IS_ERR_VALUE(fep->ring_mem_addr))
+   if (IS_ERR_VALUE((unsigned long)(int)fep->ring_mem_addr))
return -ENOMEM;
 
fep->ring_base = (void __iomem __force*)
-- 
2.17.1



[PATCH] net/ethernet/freescale: Fix incorrect IS_ERR_VALUE macro usages

2020-11-23 Thread Wei Li
The IS_ERR_VALUE() macro should be used only with the unsigned long
type. In particular, it works incorrectly with shorter unsigned types
on 64-bit machines.

Fixes: 4c35630ccda5 ("[POWERPC] Change rheap functions to use ulongs instead of pointers")
Signed-off-by: Wei Li 
---
 drivers/net/ethernet/freescale/ucc_geth.c | 30 +++
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/freescale/ucc_geth.c b/drivers/net/ethernet/freescale/ucc_geth.c
index 714b501be7d0..8656d9be256a 100644
--- a/drivers/net/ethernet/freescale/ucc_geth.c
+++ b/drivers/net/ethernet/freescale/ucc_geth.c
@@ -286,7 +286,7 @@ static int fill_init_enet_entries(struct ucc_geth_private *ugeth,
else {
init_enet_offset =
qe_muram_alloc(thread_size, thread_alignment);
-   if (IS_ERR_VALUE(init_enet_offset)) {
+   if (IS_ERR_VALUE((unsigned long)(int)init_enet_offset)) {
if (netif_msg_ifup(ugeth))
pr_err("Can not allocate DPRAM memory\n");
qe_put_snum((u8) snum);
@@ -2223,7 +2223,7 @@ static int ucc_geth_alloc_tx(struct ucc_geth_private *ugeth)
ugeth->tx_bd_ring_offset[j] =
qe_muram_alloc(length,
   UCC_GETH_TX_BD_RING_ALIGNMENT);
-   if (!IS_ERR_VALUE(ugeth->tx_bd_ring_offset[j]))
+   if (!IS_ERR_VALUE((unsigned long)(int)ugeth->tx_bd_ring_offset[j]))
ugeth->p_tx_bd_ring[j] =
(u8 __iomem *) qe_muram_addr(ugeth->
 tx_bd_ring_offset[j]);
@@ -2300,7 +2300,7 @@ static int ucc_geth_alloc_rx(struct ucc_geth_private *ugeth)
ugeth->rx_bd_ring_offset[j] =
qe_muram_alloc(length,
   UCC_GETH_RX_BD_RING_ALIGNMENT);
-   if (!IS_ERR_VALUE(ugeth->rx_bd_ring_offset[j]))
+   if (!IS_ERR_VALUE((unsigned long)(int)ugeth->rx_bd_ring_offset[j]))
ugeth->p_rx_bd_ring[j] =
(u8 __iomem *) qe_muram_addr(ugeth->
 rx_bd_ring_offset[j]);
@@ -2510,7 +2510,7 @@ static int ucc_geth_startup(struct ucc_geth_private *ugeth)
ugeth->tx_glbl_pram_offset =
qe_muram_alloc(sizeof(struct ucc_geth_tx_global_pram),
   UCC_GETH_TX_GLOBAL_PRAM_ALIGNMENT);
-   if (IS_ERR_VALUE(ugeth->tx_glbl_pram_offset)) {
+   if (IS_ERR_VALUE((unsigned long)(int)ugeth->tx_glbl_pram_offset)) {
if (netif_msg_ifup(ugeth))
pr_err("Can not allocate DPRAM memory for p_tx_glbl_pram\n");
return -ENOMEM;
@@ -2530,7 +2530,7 @@ static int ucc_geth_startup(struct ucc_geth_private *ugeth)
   sizeof(struct ucc_geth_thread_data_tx) +
   32 * (numThreadsTxNumerical == 1),
   UCC_GETH_THREAD_DATA_ALIGNMENT);
-   if (IS_ERR_VALUE(ugeth->thread_dat_tx_offset)) {
+   if (IS_ERR_VALUE((unsigned long)(int)ugeth->thread_dat_tx_offset)) {
if (netif_msg_ifup(ugeth))
pr_err("Can not allocate DPRAM memory for p_thread_data_tx\n");
return -ENOMEM;
@@ -2557,7 +2557,7 @@ static int ucc_geth_startup(struct ucc_geth_private *ugeth)
qe_muram_alloc(ug_info->numQueuesTx *
   sizeof(struct ucc_geth_send_queue_qd),
   UCC_GETH_SEND_QUEUE_QUEUE_DESCRIPTOR_ALIGNMENT);
-   if (IS_ERR_VALUE(ugeth->send_q_mem_reg_offset)) {
+   if (IS_ERR_VALUE((unsigned long)(int)ugeth->send_q_mem_reg_offset)) {
if (netif_msg_ifup(ugeth))
pr_err("Can not allocate DPRAM memory for p_send_q_mem_reg\n");
return -ENOMEM;
@@ -2597,7 +2597,7 @@ static int ucc_geth_startup(struct ucc_geth_private *ugeth)
ugeth->scheduler_offset =
qe_muram_alloc(sizeof(struct ucc_geth_scheduler),
   UCC_GETH_SCHEDULER_ALIGNMENT);
-   if (IS_ERR_VALUE(ugeth->scheduler_offset)) {
+   if (IS_ERR_VALUE((unsigned long)(int)ugeth->scheduler_offset)) {
if (netif_msg_ifup(ugeth))
pr_err("Can not allocate DPRAM memory for p_scheduler\n");
return -ENOMEM;
@@ -2644,7 +2644,7 @@ static int ucc_geth_startup(struct ucc_geth_private *ugeth)

[PATCH] Documentation/features: Update feature lists for 5.10

2020-11-18 Thread Wei Li
The feature lists don't match reality as of v5.10-rc4; update them
accordingly (generated by features-refresh.sh).

Fixes: aa65ff6b18e0 ("powerpc/64s: Implement queued spinlocks and rwlocks")
Fixes: e95a4f8cb985 ("csky: Add SECCOMP_FILTER supported")
Fixes: 0bb605c2c7f2 ("sh: Add SECCOMP_FILTER")
Fixes: bdcd93ef9afb ("csky: Add context tracking support")
Signed-off-by: Wei Li 
---
 .../features/locking/queued-rwlocks/arch-support.txt  | 2 +-
 .../features/locking/queued-spinlocks/arch-support.txt| 2 +-
 .../features/seccomp/seccomp-filter/arch-support.txt  | 4 ++--
 Documentation/features/time/context-tracking/arch-support.txt | 2 +-
 Documentation/features/time/virt-cpuacct/arch-support.txt | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/features/locking/queued-rwlocks/arch-support.txt b/Documentation/features/locking/queued-rwlocks/arch-support.txt
index 5c6bcfcf8e1f..4dd5e554873f 100644
--- a/Documentation/features/locking/queued-rwlocks/arch-support.txt
+++ b/Documentation/features/locking/queued-rwlocks/arch-support.txt
@@ -22,7 +22,7 @@
 |   nios2: | TODO |
 |openrisc: |  ok  |
 |  parisc: | TODO |
-| powerpc: | TODO |
+| powerpc: |  ok  |
 |   riscv: | TODO |
 |s390: | TODO |
 |  sh: | TODO |
diff --git a/Documentation/features/locking/queued-spinlocks/arch-support.txt b/Documentation/features/locking/queued-spinlocks/arch-support.txt
index b55e420a34ea..b16d4f71e5ce 100644
--- a/Documentation/features/locking/queued-spinlocks/arch-support.txt
+++ b/Documentation/features/locking/queued-spinlocks/arch-support.txt
@@ -22,7 +22,7 @@
 |   nios2: | TODO |
 |openrisc: |  ok  |
 |  parisc: | TODO |
-| powerpc: | TODO |
+| powerpc: |  ok  |
 |   riscv: | TODO |
 |s390: | TODO |
 |  sh: | TODO |
diff --git a/Documentation/features/seccomp/seccomp-filter/arch-support.txt b/Documentation/features/seccomp/seccomp-filter/arch-support.txt
index c688aba22a8d..eb3d74092c61 100644
--- a/Documentation/features/seccomp/seccomp-filter/arch-support.txt
+++ b/Documentation/features/seccomp/seccomp-filter/arch-support.txt
@@ -11,7 +11,7 @@
 | arm: |  ok  |
 |   arm64: |  ok  |
 | c6x: | TODO |
-|csky: | TODO |
+|csky: |  ok  |
 |   h8300: | TODO |
 | hexagon: | TODO |
 |ia64: | TODO |
@@ -25,7 +25,7 @@
 | powerpc: |  ok  |
 |   riscv: |  ok  |
 |s390: |  ok  |
-|  sh: | TODO |
+|  sh: |  ok  |
 |   sparc: | TODO |
 |  um: |  ok  |
 | x86: |  ok  |
diff --git a/Documentation/features/time/context-tracking/arch-support.txt b/Documentation/features/time/context-tracking/arch-support.txt
index 266c81e8a721..52aea275aab7 100644
--- a/Documentation/features/time/context-tracking/arch-support.txt
+++ b/Documentation/features/time/context-tracking/arch-support.txt
@@ -11,7 +11,7 @@
 | arm: |  ok  |
 |   arm64: |  ok  |
 | c6x: | TODO |
-|csky: | TODO |
+|csky: |  ok  |
 |   h8300: | TODO |
 | hexagon: | TODO |
 |ia64: | TODO |
diff --git a/Documentation/features/time/virt-cpuacct/arch-support.txt b/Documentation/features/time/virt-cpuacct/arch-support.txt
index 56b372da6b01..e51f3af38e31 100644
--- a/Documentation/features/time/virt-cpuacct/arch-support.txt
+++ b/Documentation/features/time/virt-cpuacct/arch-support.txt
@@ -11,7 +11,7 @@
 | arm: |  ok  |
 |   arm64: |  ok  |
 | c6x: | TODO |
-|csky: | TODO |
+|csky: |  ok  |
 |   h8300: | TODO |
 | hexagon: | TODO |
 |ia64: |  ok  |
-- 
2.17.1



[PATCH] drm/msm: Fix error return code in msm_drm_init()

2020-11-16 Thread Wei Li
When creating the crtc_event kthread fails, the code just jumps to
err_msm_uninit while 'ret' is not updated. So assign the return code
before the jump.

Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support")
Reported-by: Hulk Robot 
Signed-off-by: Wei Li 
---
 drivers/gpu/drm/msm/msm_drv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 49685571dc0e..37a373c5ced3 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -506,6 +506,7 @@ static int msm_drm_init(struct device *dev, struct drm_driver *drv)
"crtc_event:%d", priv->event_thread[i].crtc_id);
if (IS_ERR(priv->event_thread[i].worker)) {
DRM_DEV_ERROR(dev, "failed to create crtc_event kthread\n");
+   ret = PTR_ERR(priv->event_thread[i].worker);
goto err_msm_uninit;
}
 
-- 
2.17.1



[PATCH] scsi: fdomain: Fix error return code in fdomain_probe()

2020-11-16 Thread Wei Li
When request_region() fails, the code just jumps to 'fail_disable'
while 'ret' is not updated. So assign the return code before the
jump.

Fixes: 8674a8aa2c39 ("scsi: fdomain: Add PCMCIA support")
Reported-by: Hulk Robot 
Signed-off-by: Wei Li 
---
 drivers/scsi/pcmcia/fdomain_cs.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/pcmcia/fdomain_cs.c b/drivers/scsi/pcmcia/fdomain_cs.c
index e42acf314d06..6bde57ed6c50 100644
--- a/drivers/scsi/pcmcia/fdomain_cs.c
+++ b/drivers/scsi/pcmcia/fdomain_cs.c
@@ -45,8 +45,10 @@ static int fdomain_probe(struct pcmcia_device *link)
goto fail_disable;
 
if (!request_region(link->resource[0]->start, FDOMAIN_REGION_SIZE,
-   "fdomain_cs"))
+   "fdomain_cs")) {
+   ret = -ENOMEM;
goto fail_disable;
+   }
 
sh = fdomain_create(link->resource[0]->start, link->irq, 7, &link->dev);
if (!sh) {
-- 
2.17.1



[PATCH v2] drivers/perf: Add support for ARMv8.3-SPE

2020-09-30 Thread Wei Li
Armv8.3 extends the SPE by adding:
- Alignment field in the Events packet, and filtering on this event
  using PMSEVFR_EL1.
- Support for the Scalable Vector Extension (SVE).

The main additions for SVE are:
- Recording the vector length for SVE operations in the Operation Type
  packet. It is not possible to filter on vector length.
- Incomplete predicate and empty predicate fields in the Events packet,
  and filtering on these events using PMSEVFR_EL1.

Update the PMSEVFR_EL1 RES0 check for the empty/partial predicate SVE
and alignment events in the SPE driver. To allow userspace to adapt to
the SPE version, expose 'pmsver' as a cap attribute.

Signed-off-by: Wei Li 
---
v1 -> v2:
 - Rename 'pmuver' to 'pmsver', change its type to 'u16' from 'int'.
   (Suggested by Will and Leo.)
 - Expose 'pmsver' as cap attribute through sysfs, instead of printing.
   (Suggested by Will.)
---
 arch/arm64/include/asm/sysreg.h |  4 +++-
 drivers/perf/arm_spe_pmu.c  | 18 --
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 554a7e8ecb07..f4f9c1fc6398 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -281,7 +281,6 @@
 #define SYS_PMSFCR_EL1_ST_SHIFT	18
 
 #define SYS_PMSEVFR_EL1			sys_reg(3, 0, 9, 9, 5)
-#define SYS_PMSEVFR_EL1_RES0		0x0000ffff00ff0f55UL
 
 #define SYS_PMSLATFR_EL1   sys_reg(3, 0, 9, 9, 6)
 #define SYS_PMSLATFR_EL1_MINLAT_SHIFT  0
@@ -787,6 +786,9 @@
 #define ID_AA64DFR0_PMUVER_8_5 0x6
 #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
 
+#define ID_AA64DFR0_PMSVER_8_2 0x1
+#define ID_AA64DFR0_PMSVER_8_3 0x2
+
 #define ID_DFR0_PERFMON_SHIFT  24
 
 #define ID_DFR0_PERFMON_8_1		0x4
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index cc00915ad6d1..52e7869f5621 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -54,7 +54,7 @@ struct arm_spe_pmu {
struct hlist_node   hotplug_node;
 
int irq; /* PPI */
-
+   u16 pmsver;
u16 min_period;
u16 counter_sz;
 
@@ -80,6 +80,15 @@ struct arm_spe_pmu {
 /* Keep track of our dynamic hotplug state */
 static enum cpuhp_state arm_spe_pmu_online;
 
+static u64 sys_pmsevfr_el1_mask[] = {
+   [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
+   GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
+   BIT_ULL(1),
+   [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
+   GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
+   BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
+};
+
 enum arm_spe_pmu_buf_fault_action {
SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
SPE_PMU_BUF_FAULT_ACT_FATAL,
@@ -93,6 +102,7 @@ enum arm_spe_pmu_capabilities {
SPE_PMU_CAP_FEAT_MAX,
SPE_PMU_CAP_CNT_SZ = SPE_PMU_CAP_FEAT_MAX,
SPE_PMU_CAP_MIN_IVAL,
+   SPE_PMU_CAP_PMSVER,
 };
 
 static int arm_spe_pmu_feat_caps[SPE_PMU_CAP_FEAT_MAX] = {
@@ -110,6 +120,8 @@ static u32 arm_spe_pmu_cap_get(struct arm_spe_pmu *spe_pmu, int cap)
return spe_pmu->counter_sz;
case SPE_PMU_CAP_MIN_IVAL:
return spe_pmu->min_period;
+   case SPE_PMU_CAP_PMSVER:
+   return spe_pmu->pmsver;
default:
WARN(1, "unknown cap %d\n", cap);
}
@@ -143,6 +155,7 @@ static struct attribute *arm_spe_pmu_cap_attr[] = {
SPE_CAP_EXT_ATTR_ENTRY(ernd, SPE_PMU_CAP_ERND),
SPE_CAP_EXT_ATTR_ENTRY(count_size, SPE_PMU_CAP_CNT_SZ),
SPE_CAP_EXT_ATTR_ENTRY(min_interval, SPE_PMU_CAP_MIN_IVAL),
+   SPE_CAP_EXT_ATTR_ENTRY(pmsver, SPE_PMU_CAP_PMSVER),
NULL,
 };
 
@@ -670,7 +683,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
!cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
return -ENOENT;
 
-   if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
+   if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmsver])
return -EOPNOTSUPP;
 
if (attr->exclude_idle)
@@ -937,6 +950,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
fld, smp_processor_id());
return;
}
+   spe_pmu->pmsver = (u16)fld;
 
/* Read PMBIDR first to determine whether or not we have access */
reg = read_sysreg_s(SYS_PMBIDR_EL1);
-- 
2.17.1



[PATCH] MIPS: Add the missing 'CPU_1074K' into __get_cpu_type()

2020-09-23 Thread Wei Li
Commit 442e14a2c55e ("MIPS: Add 1074K CPU support explicitly.") split
the 1074K from the 74K as a unique CPU type, but it missed adding
'CPU_1074K' to __get_cpu_type(). Add the missing case.

Fixes: 442e14a2c55e ("MIPS: Add 1074K CPU support explicitly.")
Signed-off-by: Wei Li 
---
 arch/mips/include/asm/cpu-type.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/mips/include/asm/cpu-type.h b/arch/mips/include/asm/cpu-type.h
index 75a7a382da09..3288cef4b168 100644
--- a/arch/mips/include/asm/cpu-type.h
+++ b/arch/mips/include/asm/cpu-type.h
@@ -47,6 +47,7 @@ static inline int __pure __get_cpu_type(const int cpu_type)
case CPU_34K:
case CPU_1004K:
case CPU_74K:
+   case CPU_1074K:
case CPU_M14KC:
case CPU_M14KEC:
case CPU_INTERAPTIV:
-- 
2.17.1



[PATCH] MIPS: BCM47XX: Remove the needless check with the 1074K

2020-09-23 Thread Wei Li
As there is no known SoC in the bcm47xx series powered by a MIPS 1074K,
the check against CPU_1074K is needless. Just remove it.

Link: https://wireless.wiki.kernel.org/en/users/Drivers/b43/soc
Fixes: 442e14a2c55e ("MIPS: Add 1074K CPU support explicitly.")
Signed-off-by: Wei Li 
---
 arch/mips/bcm47xx/setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/bcm47xx/setup.c b/arch/mips/bcm47xx/setup.c
index 82627c264964..01427bde2397 100644
--- a/arch/mips/bcm47xx/setup.c
+++ b/arch/mips/bcm47xx/setup.c
@@ -148,7 +148,7 @@ void __init plat_mem_setup(void)
 {
struct cpuinfo_mips *c = &current_cpu_data;
 
-   if ((c->cputype == CPU_74K) || (c->cputype == CPU_1074K)) {
+   if (c->cputype == CPU_74K) {
pr_info("Using bcma bus\n");
 #ifdef CONFIG_BCM47XX_BCMA
bcm47xx_bus_type = BCM47XX_BUS_TYPE_BCMA;
-- 
2.17.1



[PATCH 0/2] perf stat: Unbreak perf stat with ARMv8 PMU events

2020-09-21 Thread Wei Li
Currently, perf-stat with armv8_pmu events with a workload is broken.
This patch set just fixes that.

Before the patch set:
[root@localhost hulk]# tools/perf/perf stat -e armv8_pmuv3_0/ll_cache_rd/,armv8_pmuv3_0/ll_cache_miss_rd/ ls > /dev/null
Segmentation fault

After the patch set:
[root@localhost hulk]# tools/perf/perf stat -e armv8_pmuv3_0/ll_cache_rd/,armv8_pmuv3_0/ll_cache_miss_rd/ ls > /dev/null

 Performance counter stats for 'ls':

39,882  armv8_pmuv3_0/ll_cache_rd/  
 
 9,639  armv8_pmuv3_0/ll_cache_miss_rd/ 
  

   0.001416690 seconds time elapsed

   0.001469000 seconds user
   0.0 seconds sys

Wei Li (2):
  perf stat: Fix segfault when counting armv8 PMU events
  perf stat: Unbreak perf stat with armv8 PMU events

 tools/lib/perf/include/internal/evlist.h |  1 +
 tools/perf/builtin-stat.c| 37 
 tools/perf/util/evlist.c | 23 ++-
 3 files changed, 48 insertions(+), 13 deletions(-)

-- 
2.17.1



[PATCH 1/2] perf stat: Fix segfault when counting armv8_pmu events

2020-09-21 Thread Wei Li
When executing perf stat with armv8_pmu events with a workload, it
reports a segfault:

(gdb) bt
#0  0x00603fc8 in perf_evsel__close_fd_cpu (evsel=<optimized out>, cpu=<optimized out>) at evsel.c:122
#1  perf_evsel__close_cpu (evsel=evsel@entry=0x716e950, cpu=7) at evsel.c:156
#2  0x004d4718 in evlist__close (evlist=0x70a7cb0) at util/evlist.c:1242
#3  0x00453404 in __run_perf_stat (argc=3, argc@entry=1, argv=0x30, argv@entry=0xfaea2f90, run_idx=119, run_idx@entry=1701998435) at builtin-stat.c:929
#4  0x00455058 in run_perf_stat (run_idx=1701998435, argv=0xfaea2f90, argc=1) at builtin-stat.c:947
#5  cmd_stat (argc=1, argv=0xfaea2f90) at builtin-stat.c:2357
#6  0x004bb888 in run_builtin (p=p@entry=0x9764b8 , argc=argc@entry=4, argv=argv@entry=0xfaea2f90) at perf.c:312
#7  0x004bbb54 in handle_internal_command (argc=argc@entry=4, argv=argv@entry=0xfaea2f90) at perf.c:364
#8  0x00435378 in run_argv (argcp=<optimized out>, argv=<optimized out>) at perf.c:408
#9  main (argc=4, argv=0xfaea2f90) at perf.c:538

After debugging, I found the root cause: the xyarray of fds is created
by evsel__open_per_thread(), ignoring the cpu passed in
create_perf_stat_counter(), while the evsel's cpumap is assigned the
corresponding PMU's cpumap in __add_event(). Thus, the xyarray is sized
for the ncpus of the dummy cpumap, and an out-of-bounds 'cpu' index is
used in perf_evsel__close_fd_cpu().

To address this, add a flag to mark this situation and avoid using the
affinity technique when closing/enabling/disabling events.

Fixes: 7736627b865d ("perf stat: Use affinity for closing file descriptors")
Fixes: 704e2f5b700d ("perf stat: Use affinity for enabling/disabling events")
Signed-off-by: Wei Li 
---
 tools/lib/perf/include/internal/evlist.h |  1 +
 tools/perf/builtin-stat.c|  3 +++
 tools/perf/util/evlist.c | 23 ++-
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/tools/lib/perf/include/internal/evlist.h b/tools/lib/perf/include/internal/evlist.h
index 2d0fa02b036f..c02d7e583846 100644
--- a/tools/lib/perf/include/internal/evlist.h
+++ b/tools/lib/perf/include/internal/evlist.h
@@ -17,6 +17,7 @@ struct perf_evlist {
struct list_head entries;
int  nr_entries;
bool has_user_cpus;
+   bool open_per_thread;
struct perf_cpu_map *cpus;
struct perf_cpu_map *all_cpus;
struct perf_thread_map  *threads;
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index fddc97cac984..6e6ceacce634 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -725,6 +725,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
if (group)
perf_evlist__set_leader(evsel_list);
 
+   if (!(target__has_cpu(&target) && !target__has_per_thread(&target)))
+   evsel_list->core.open_per_thread = true;
+
if (affinity__setup(&affinity) < 0)
return -1;
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index e3fa3bf7498a..bf8a3ccc599f 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -383,6 +383,15 @@ void evlist__disable(struct evlist *evlist)
int cpu, i, imm = 0;
bool has_imm = false;
 
+   if (evlist->core.open_per_thread) {
+   evlist__for_each_entry(evlist, pos) {
+   if (pos->disabled || !evsel__is_group_leader(pos) || !pos->core.fd)
+   continue;
+   evsel__disable(pos);
+   }
+   goto out;
+   }
+
if (affinity__setup(&affinity) < 0)
return;
 
@@ -414,6 +423,7 @@ void evlist__disable(struct evlist *evlist)
pos->disabled = true;
}
 
+out:
evlist->enabled = false;
 }
 
@@ -423,6 +433,15 @@ void evlist__enable(struct evlist *evlist)
struct affinity affinity;
int cpu, i;
 
+   if (evlist->core.open_per_thread) {
+   evlist__for_each_entry(evlist, pos) {
+   if (!evsel__is_group_leader(pos) || !pos->core.fd)
+   continue;
+   evsel__enable(pos);
+   }
+   goto out;
+   }
+
if (affinity__setup(&affinity) < 0)
return;
 
@@ -444,6 +463,7 @@ void evlist__enable(struct evlist *evlist)
pos->disabled = false;
}
 
+out:
evlist->enabled = true;
 }
 
@@ -1223,9 +1243,10 @@ void evlist__close(struct evlist *evlist)
 
/*
 * With perf record core.cpus is usually NULL.
+* Or perf stat may open events per-thread.
 * Use the old method to handle this for now.
 */
-   if (!evlist->core.cpus) {
+   if (evlist->core.open_per_threa

[PATCH 2/2] perf stat: Unbreak perf stat with armv8_pmu events

2020-09-21 Thread Wei Li
After the segfault is fixed, perf-stat with armv8_pmu events with a
workload is still broken:

[root@localhost hulk]# tools/perf/perf stat -e armv8_pmuv3_0/ll_cache_rd/,armv8_pmuv3_0/ll_cache_miss_rd/ ls > /dev/null

 Performance counter stats for 'ls':

   armv8_pmuv3_0/ll_cache_rd/  
   (0.00%)
   armv8_pmuv3_0/ll_cache_miss_rd/ 
(0.00%)

   0.002052670 seconds time elapsed

   0.0 seconds user
   0.002086000 seconds sys

In fact, while the event is opened per-thread,
create_perf_stat_counter() is called as many times as there are CPUs in
the evlist's cpumap, losing all the file descriptors except the last
one. If that last counter is not scheduled during the measurement
period, it is reported as "not counted".

Avoid opening the needless events in this situation.

Fixes: 4804e0111662 ("perf stat: Use affinity for opening events")
Signed-off-by: Wei Li 
---
 tools/perf/builtin-stat.c | 36 +++-
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 6e6ceacce634..9a43b3de26d1 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -712,6 +712,7 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
struct affinity affinity;
int i, cpu;
bool second_pass = false;
+   bool open_per_thread = false;
 
if (forks) {
if (perf_evlist__prepare_workload(evsel_list, &target, argv, is_pipe,
@@ -726,16 +727,17 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
perf_evlist__set_leader(evsel_list);
 
if (!(target__has_cpu(&target) && !target__has_per_thread(&target)))
-   evsel_list->core.open_per_thread = true;
+   evsel_list->core.open_per_thread = open_per_thread = true;
 
if (affinity__setup(&affinity) < 0)
return -1;
 
evlist__for_each_cpu (evsel_list, i, cpu) {
-   affinity__set(&affinity, cpu);
+   if (!open_per_thread)
+   affinity__set(&affinity, cpu);
 
evlist__for_each_entry(evsel_list, counter) {
-   if (evsel__cpu_iter_skip(counter, cpu))
+   if (!open_per_thread && evsel__cpu_iter_skip(counter, cpu))
continue;
if (counter->reset_group || counter->errored)
continue;
@@ -753,7 +755,8 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
if ((errno == EINVAL || errno == EBADF) &&
counter->leader != counter &&
counter->weak_group) {
-   perf_evlist__reset_weak_group(evsel_list, counter, false);
+   perf_evlist__reset_weak_group(evsel_list, counter,
+   open_per_thread);
assert(counter->reset_group);
second_pass = true;
continue;
@@ -773,6 +776,9 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
}
counter->supported = true;
}
+
+   if (open_per_thread)
+   break;
}
 
if (second_pass) {
@@ -782,20 +788,22 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 */
 
evlist__for_each_cpu(evsel_list, i, cpu) {
-   affinity__set(&affinity, cpu);
-   /* First close errored or weak retry */
-   evlist__for_each_entry(evsel_list, counter) {
-   if (!counter->reset_group && !counter->errored)
-   continue;
-   if (evsel__cpu_iter_skip_no_inc(counter, cpu))
-   continue;
-   perf_evsel__close_cpu(&counter->core, counter->cpu_iter);
+   if (!open_per_thread) {
+   affinity__set(&affinity, cpu);
+   /* First close errored or weak retry */
+   evlist__for_each_entry(evsel_list, counter) {
+   if (!counter->reset_group && !counter->errored)
+   continue;
+   if (evsel__cpu_iter_skip_no_inc(counter, cpu))
+   continue;
+   perf_evsel__close_cpu(&g

[PATCH] MIPS: Correct the header guard of r4k-timer.h

2020-09-17 Thread Wei Li
Rename the header guard of r4k-timer.h from __ASM_R4K_TYPES_H to
__ASM_R4K_TIMER_H so that it corresponds with the file name.

Signed-off-by: Wei Li 
---
 arch/mips/include/asm/r4k-timer.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/mips/include/asm/r4k-timer.h 
b/arch/mips/include/asm/r4k-timer.h
index afe9e0e03fe9..6e7361629348 100644
--- a/arch/mips/include/asm/r4k-timer.h
+++ b/arch/mips/include/asm/r4k-timer.h
@@ -5,8 +5,8 @@
  *
  * Copyright (C) 2008 by Ralf Baechle (r...@linux-mips.org)
  */
-#ifndef __ASM_R4K_TYPES_H
-#define __ASM_R4K_TYPES_H
+#ifndef __ASM_R4K_TIMER_H
+#define __ASM_R4K_TIMER_H
 
 #include 
 
@@ -27,4 +27,4 @@ static inline void synchronise_count_slave(int cpu)
 
 #endif
 
-#endif /* __ASM_R4K_TYPES_H */
+#endif /* __ASM_R4K_TIMER_H */
-- 
2.17.1



[PATCH v2] perf metric: Code cleanup with map_for_each_event()

2020-09-17 Thread Wei Li
Since we have introduced map_for_each_event() to walk the 'pmu_events_map',
clean up metricgroup__print() and metricgroup__has_metric() with it.

Signed-off-by: Wei Li 
Acked-by: Namhyung Kim 
---
v1 -> v2:
 - Move map_for_each_metric() after match_metric() to avoid potential
   use-before-declare.
---
 tools/perf/util/metricgroup.c | 33 +
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 8831b964288f..50ee36437b99 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -374,6 +374,17 @@ static bool match_metric(const char *n, const char *list)
return false;
 }
 
+#define map_for_each_event(__pe, __idx, __map) \
+   for (__idx = 0, __pe = &__map->table[__idx];\
+__pe->name || __pe->metric_group || __pe->metric_name; \
+__pe = &__map->table[++__idx])
+
+#define map_for_each_metric(__pe, __idx, __map, __metric)  \
+   map_for_each_event(__pe, __idx, __map)  \
+   if (__pe->metric_expr &&\
+   (match_metric(__pe->metric_group, __metric) ||  \
+match_metric(__pe->metric_name, __metric)))
+
 struct mep {
struct rb_node nd;
const char *name;
@@ -475,12 +486,9 @@ void metricgroup__print(bool metrics, bool metricgroups, char *filter,
groups.node_new = mep_new;
groups.node_cmp = mep_cmp;
groups.node_delete = mep_delete;
-   for (i = 0; ; i++) {
+   map_for_each_event(pe, i, map) {
const char *g;
-   pe = &map->table[i];
 
-   if (!pe->name && !pe->metric_group && !pe->metric_name)
-   break;
if (!pe->metric_expr)
continue;
g = pe->metric_group;
@@ -745,17 +753,6 @@ static int __add_metric(struct list_head *metric_list,
return 0;
 }
 
-#define map_for_each_event(__pe, __idx, __map) \
-   for (__idx = 0, __pe = &__map->table[__idx];\
-__pe->name || __pe->metric_group || __pe->metric_name; \
-__pe = &__map->table[++__idx])
-
-#define map_for_each_metric(__pe, __idx, __map, __metric)  \
-   map_for_each_event(__pe, __idx, __map)  \
-   if (__pe->metric_expr &&\
-   (match_metric(__pe->metric_group, __metric) ||  \
-match_metric(__pe->metric_name, __metric)))
-
static struct pmu_event *find_metric(const char *metric, struct pmu_events_map *map)
 {
struct pmu_event *pe;
@@ -1092,11 +1089,7 @@ bool metricgroup__has_metric(const char *metric)
if (!map)
return false;
 
-   for (i = 0; ; i++) {
-   pe = &map->table[i];
-
-   if (!pe->name && !pe->metric_group && !pe->metric_name)
-   break;
+   map_for_each_event(pe, i, map) {
if (!pe->metric_expr)
continue;
if (match_metric(pe->metric_name, metric))
-- 
2.17.1



[PATCH net v2] hinic: fix potential resource leak

2020-09-17 Thread Wei Li
In rx_request_irq(), the function just returns whatever
irq_set_affinity_hint() returns. If that call fails, the NAPI instance
and the IRQ that were requested are not freed properly. Add error exits
to handle these failures.

Signed-off-by: Wei Li 
---
v1 -> v2:
 - Free irq as well when irq_set_affinity_hint() fails.
---
 drivers/net/ethernet/huawei/hinic/hinic_rx.c | 21 +---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/huawei/hinic/hinic_rx.c b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
index 5bee951fe9d4..cc1d425d070c 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_rx.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
@@ -543,18 +543,25 @@ static int rx_request_irq(struct hinic_rxq *rxq)
if (err) {
netif_err(nic_dev, drv, rxq->netdev,
  "Failed to set RX interrupt coalescing attribute\n");
-   rx_del_napi(rxq);
-   return err;
+   goto err_req_irq;
}
 
err = request_irq(rq->irq, rx_irq, 0, rxq->irq_name, rxq);
-   if (err) {
-   rx_del_napi(rxq);
-   return err;
-   }
+   if (err)
+   goto err_req_irq;
 
cpumask_set_cpu(qp->q_id % num_online_cpus(), &rq->affinity_mask);
-   return irq_set_affinity_hint(rq->irq, &rq->affinity_mask);
+   err = irq_set_affinity_hint(rq->irq, &rq->affinity_mask);
+   if (err)
+   goto err_irq_affinity;
+
+   return 0;
+
+err_irq_affinity:
+   free_irq(rq->irq, rxq);
+err_req_irq:
+   rx_del_napi(rxq);
+   return err;
 }
 
 static void rx_free_irq(struct hinic_rxq *rxq)
-- 
2.17.1



[PATCH] Revert "perf report: Treat an argument as a symbol filter"

2020-09-16 Thread Wei Li
Since commit 6db6127c4dad ("perf report: Treat an argument as a symbol
filter"), a single unrecognized argument to perf-report is treated as a
symbol filter. This is described neither in the man page nor in the
help info, and the result is really confusing, especially when the user
misspecifies the arguments (e.g. a missing -i before perf.data).

As we can use "--symbol-filter=" if we really want to filter a symbol,
it may be better to revert this misfeature.

Signed-off-by: Wei Li 
---
 tools/perf/builtin-report.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 3c74c9c0f3c3..f57ebc1bcd20 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1317,13 +1317,9 @@ int cmd_report(int argc, const char **argv)
argc = parse_options(argc, argv, options, report_usage, 0);
if (argc) {
/*
-* Special case: if there's an argument left then assume that
-* it's a symbol filter:
+* Any (unrecognized) arguments left?
 */
-   if (argc > 1)
-   usage_with_options(report_usage, options);
-
-   report.symbol_filter_str = argv[0];
+   usage_with_options(report_usage, options);
}
 
if (annotate_check_args(_opts) < 0)
-- 
2.17.1



[PATCH] hinic: fix potential resource leak

2020-09-16 Thread Wei Li
In rx_request_irq(), the function just returns whatever
irq_set_affinity_hint() returns. If that call fails, the NAPI instance
that was added is not deleted properly. Add a common error exit to
handle this.

Signed-off-by: Wei Li 
---
 drivers/net/ethernet/huawei/hinic/hinic_rx.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/huawei/hinic/hinic_rx.c b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
index 5bee951fe9d4..63da9cc8ca51 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_rx.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
@@ -543,18 +543,23 @@ static int rx_request_irq(struct hinic_rxq *rxq)
if (err) {
netif_err(nic_dev, drv, rxq->netdev,
  "Failed to set RX interrupt coalescing attribute\n");
-   rx_del_napi(rxq);
-   return err;
+   goto err_irq;
}
 
err = request_irq(rq->irq, rx_irq, 0, rxq->irq_name, rxq);
-   if (err) {
-   rx_del_napi(rxq);
-   return err;
-   }
+   if (err)
+   goto err_irq;
 
cpumask_set_cpu(qp->q_id % num_online_cpus(), &rq->affinity_mask);
-   return irq_set_affinity_hint(rq->irq, &rq->affinity_mask);
+   err = irq_set_affinity_hint(rq->irq, &rq->affinity_mask);
+   if (err)
+   goto err_irq;
+
+   return 0;
+
+err_irq:
+   rx_del_napi(rxq);
+   return err;
 }
 
 static void rx_free_irq(struct hinic_rxq *rxq)
-- 
2.17.1



[PATCH] perf metric: Code cleanup with map_for_each_event()

2020-09-16 Thread Wei Li
Since we have introduced map_for_each_event() to walk the 'pmu_events_map',
clean up metricgroup__print() and metricgroup__has_metric() with it.

Signed-off-by: Wei Li 
---
 tools/perf/util/metricgroup.c | 33 +
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 8831b964288f..3734cbb2c456 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -26,6 +26,17 @@
 #include "util.h"
 #include 
 
+#define map_for_each_event(__pe, __idx, __map) \
+   for (__idx = 0, __pe = &__map->table[__idx];\
+__pe->name || __pe->metric_group || __pe->metric_name; \
+__pe = &__map->table[++__idx])
+
+#define map_for_each_metric(__pe, __idx, __map, __metric)  \
+   map_for_each_event(__pe, __idx, __map)  \
+   if (__pe->metric_expr &&\
+   (match_metric(__pe->metric_group, __metric) ||  \
+match_metric(__pe->metric_name, __metric)))
+
 struct metric_event *metricgroup__lookup(struct rblist *metric_events,
 struct evsel *evsel,
 bool create)
@@ -475,12 +486,9 @@ void metricgroup__print(bool metrics, bool metricgroups, char *filter,
groups.node_new = mep_new;
groups.node_cmp = mep_cmp;
groups.node_delete = mep_delete;
-   for (i = 0; ; i++) {
+   map_for_each_event(pe, i, map) {
const char *g;
-   pe = &map->table[i];
 
-   if (!pe->name && !pe->metric_group && !pe->metric_name)
-   break;
if (!pe->metric_expr)
continue;
g = pe->metric_group;
@@ -745,17 +753,6 @@ static int __add_metric(struct list_head *metric_list,
return 0;
 }
 
-#define map_for_each_event(__pe, __idx, __map) \
-   for (__idx = 0, __pe = &__map->table[__idx];\
-__pe->name || __pe->metric_group || __pe->metric_name; \
-__pe = &__map->table[++__idx])
-
-#define map_for_each_metric(__pe, __idx, __map, __metric)  \
-   map_for_each_event(__pe, __idx, __map)  \
-   if (__pe->metric_expr &&\
-   (match_metric(__pe->metric_group, __metric) ||  \
-match_metric(__pe->metric_name, __metric)))
-
static struct pmu_event *find_metric(const char *metric, struct pmu_events_map *map)
 {
struct pmu_event *pe;
@@ -1092,11 +1089,7 @@ bool metricgroup__has_metric(const char *metric)
if (!map)
return false;
 
-   for (i = 0; ; i++) {
-   pe = &map->table[i];
-
-   if (!pe->name && !pe->metric_group && !pe->metric_name)
-   break;
+   map_for_each_event(pe, i, map) {
if (!pe->metric_expr)
continue;
if (match_metric(pe->metric_name, metric))
-- 
2.17.1



[PATCH] perf record: Correct the help info of option "--no-bpf-event"

2020-08-18 Thread Wei Li
The help text of the option "--no-bpf-event" wrongly reads
"record bpf events"; correct it.

Fixes: 71184c6ab7e6 ("perf record: Replace option --bpf-event with 
--no-bpf-event")
Signed-off-by: Wei Li 
---
 tools/perf/builtin-record.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f91352f847c0..772f1057647f 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -2452,7 +2452,7 @@ static struct option __record_options[] = {
OPT_BOOLEAN(0, "tail-synthesize", _synthesize,
"synthesize non-sample events at the end of output"),
OPT_BOOLEAN(0, "overwrite", , "use overwrite 
mode"),
-   OPT_BOOLEAN(0, "no-bpf-event", _bpf_event, "record bpf 
events"),
+   OPT_BOOLEAN(0, "no-bpf-event", _bpf_event, "do not 
record bpf events"),
OPT_BOOLEAN(0, "strict-freq", _freq,
"Fail if the specified frequency can't be used"),
OPT_CALLBACK('F', "freq", , "freq or 'max'",
-- 
2.17.1



[PATCH v2] arm64: mm: free unused memmap for sparse memory model that define VMEMMAP

2020-08-11 Thread Wei Li
For memory holes, the sparse memory model with SPARSEMEM_VMEMMAP
defined does not free the memory reserved for the page map; this patch
does so.

Signed-off-by: Wei Li 
Signed-off-by: Chen Feng 
Signed-off-by: Xia Qing 

v2: fix the v1 compile errors caused by the patch not being based on the latest mainline.
---
 arch/arm64/mm/init.c | 81 +---
 1 file changed, 71 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1e93cfc7c47a..600889945cd0 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -441,7 +441,48 @@ void __init bootmem_init(void)
memblock_dump_all();
 }

-#ifndef CONFIG_SPARSEMEM_VMEMMAP
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+#define VMEMMAP_PAGE_INUSE 0xFD
+static inline void free_memmap(unsigned long start_pfn, unsigned long end_pfn)
+{
+   unsigned long addr, end;
+   unsigned long next;
+   pmd_t *pmd;
+   void *page_addr;
+   phys_addr_t phys_addr;
+
+   addr = (unsigned long)pfn_to_page(start_pfn);
+   end = (unsigned long)pfn_to_page(end_pfn);
+
+   pmd = pmd_off_k(addr);
+   for (; addr < end; addr = next, pmd++) {
+   next = pmd_addr_end(addr, end);
+
+   if (!pmd_present(*pmd))
+   continue;
+
+   if (IS_ALIGNED(addr, PMD_SIZE) &&
+   IS_ALIGNED(next, PMD_SIZE)) {
+   phys_addr = __pfn_to_phys(pmd_pfn(*pmd));
+   memblock_free(phys_addr, PMD_SIZE);
+   pmd_clear(pmd);
+   } else {
+   /* If here, we are freeing vmemmap pages. */
+   memset((void *)addr, VMEMMAP_PAGE_INUSE, next - addr);
+   page_addr = page_address(pmd_page(*pmd));
+
+   if (!memchr_inv(page_addr, VMEMMAP_PAGE_INUSE,
+   PMD_SIZE)) {
+   phys_addr = __pfn_to_phys(pmd_pfn(*pmd));
+   memblock_free(phys_addr, PMD_SIZE);
+   pmd_clear(pmd);
+   }
+   }
+   }
+
+   flush_tlb_all();
+}
+#else
 static inline void free_memmap(unsigned long start_pfn, unsigned long end_pfn)
 {
struct page *start_pg, *end_pg;
@@ -468,31 +509,53 @@ static inline void free_memmap(unsigned long start_pfn, unsigned long end_pfn)
memblock_free(pg, pgend - pg);
 }

+#endif
+
 /*
  * The mem_map array can get very big. Free the unused area of the memory map.
  */
 static void __init free_unused_memmap(void)
 {
-   unsigned long start, prev_end = 0;
+   unsigned long start, cur_start, prev_end = 0;
struct memblock_region *reg;

for_each_memblock(memory, reg) {
-   start = __phys_to_pfn(reg->base);
+   cur_start = __phys_to_pfn(reg->base);

 #ifdef CONFIG_SPARSEMEM
/*
 * Take care not to free memmap entries that don't exist due
 * to SPARSEMEM sections which aren't present.
 */
-   start = min(start, ALIGN(prev_end, PAGES_PER_SECTION));
-#endif
+   start = min(cur_start, ALIGN(prev_end, PAGES_PER_SECTION));
+
/*
-* If we had a previous bank, and there is a space between the
-* current bank and the previous, free it.
+* Free memory in the case of:
+* 1. if cur_start - prev_end <= PAGES_PER_SECTION,
+* free prev_end ~ cur_start.
+* 2. if cur_start - prev_end > PAGES_PER_SECTION,
+* free prev_end ~ ALIGN(prev_end, PAGES_PER_SECTION).
 */
if (prev_end && prev_end < start)
free_memmap(prev_end, start);

+   /*
+* Free memory in the case of:
+* if cur_start - prev_end > PAGES_PER_SECTION,
+* free ALIGN_DOWN(cur_start, PAGES_PER_SECTION) ~ cur_start.
+*/
+   if (cur_start > start &&
+   !IS_ALIGNED(cur_start, PAGES_PER_SECTION))
+   free_memmap(ALIGN_DOWN(cur_start, PAGES_PER_SECTION),
+   cur_start);
+#else
+   /*
+* If we had a previous bank, and there is a space between the
+* current bank and the previous, free it.
+*/
+   if (prev_end && prev_end < cur_start)
+   free_memmap(prev_end, cur_start);
+#endif
/*
 * Align up here since the VM subsystem insists that the
 * memmap entries are valid from the bank end aligned to
@@ -507,7 +570,6 @@ static void __init free_unused_memmap(void)
free_memmap(prev_end, ALIGN(prev_end, PAGES_PER_SECTION));
 #endif
 }
-#endi

[PATCH 1/4] drivers/perf: Add support for ARMv8.3-SPE

2020-07-24 Thread Wei Li
Armv8.3 extends the SPE by adding:
- Alignment field in the Events packet, and filtering on this event
  using PMSEVFR_EL1.
- Support for the Scalable Vector Extension (SVE).

The main additions for SVE are:
- Recording the vector length for SVE operations in the Operation Type
  packet. It is not possible to filter on vector length.
- Incomplete predicate and empty predicate fields in the Events packet,
  and filtering on these events using PMSEVFR_EL1.

Update the kernel driver's pmsevfr check for the empty/partial
predicated SVE and alignment events.

Signed-off-by: Wei Li 
---
 arch/arm64/include/asm/sysreg.h |  4 +++-
 drivers/perf/arm_spe_pmu.c  | 18 ++
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 463175f80341..be4c44ccdb56 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -281,7 +281,6 @@
 #define SYS_PMSFCR_EL1_ST_SHIFT18
 
 #define SYS_PMSEVFR_EL1sys_reg(3, 0, 9, 9, 5)
-#define SYS_PMSEVFR_EL1_RES0   0x00ff0f55UL
 
 #define SYS_PMSLATFR_EL1   sys_reg(3, 0, 9, 9, 6)
 #define SYS_PMSLATFR_EL1_MINLAT_SHIFT  0
@@ -769,6 +768,9 @@
 #define ID_AA64DFR0_PMUVER_8_5 0x6
 #define ID_AA64DFR0_PMUVER_IMP_DEF 0xf
 
+#define ID_AA64DFR0_PMSVER_8_2 0x1
+#define ID_AA64DFR0_PMSVER_8_3 0x2
+
 #define ID_DFR0_PERFMON_SHIFT  24
 
 #define ID_DFR0_PERFMON_8_10x4
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index e51ddb6d63ed..5ec7ee0c8fa1 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -54,7 +54,7 @@ struct arm_spe_pmu {
struct hlist_node   hotplug_node;
 
int irq; /* PPI */
-
+   int pmuver;
u16 min_period;
u16 counter_sz;
 
@@ -80,6 +80,15 @@ struct arm_spe_pmu {
 /* Keep track of our dynamic hotplug state */
 static enum cpuhp_state arm_spe_pmu_online;
 
+static u64 sys_pmsevfr_el1_mask[] = {
+   [ID_AA64DFR0_PMSVER_8_2] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
+   GENMASK_ULL(15, 12) | BIT_ULL(7) | BIT_ULL(5) | BIT_ULL(3) |
+   BIT_ULL(1),
+   [ID_AA64DFR0_PMSVER_8_3] = GENMASK_ULL(63, 48) | GENMASK_ULL(31, 24) |
+   GENMASK_ULL(18, 17) | GENMASK_ULL(15, 11) | BIT_ULL(7) |
+   BIT_ULL(5) | BIT_ULL(3) | BIT_ULL(1),
+};
+
 enum arm_spe_pmu_buf_fault_action {
SPE_PMU_BUF_FAULT_ACT_SPURIOUS,
SPE_PMU_BUF_FAULT_ACT_FATAL,
@@ -670,7 +679,7 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
!cpumask_test_cpu(event->cpu, &spe_pmu->supported_cpus))
return -ENOENT;
 
-   if (arm_spe_event_to_pmsevfr(event) & SYS_PMSEVFR_EL1_RES0)
+   if (arm_spe_event_to_pmsevfr(event) & ~sys_pmsevfr_el1_mask[spe_pmu->pmuver])
return -EOPNOTSUPP;
 
if (attr->exclude_idle)
@@ -937,6 +946,7 @@ static void __arm_spe_pmu_dev_probe(void *info)
fld, smp_processor_id());
return;
}
+   spe_pmu->pmuver = fld;
 
/* Read PMBIDR first to determine whether or not we have access */
reg = read_sysreg_s(SYS_PMBIDR_EL1);
@@ -1027,8 +1037,8 @@ static void __arm_spe_pmu_dev_probe(void *info)
}
 
dev_info(dev,
-"probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
-cpumask_pr_args(&spe_pmu->supported_cpus),
+"v%d probed for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
+spe_pmu->pmuver, cpumask_pr_args(&spe_pmu->supported_cpus),
 spe_pmu->max_record_sz, spe_pmu->align, spe_pmu->features);
 
spe_pmu->features |= SPE_PMU_FEAT_DEV_PROBED;
-- 
2.17.1



[PATCH 0/4] Add support for ARMv8.3-SPE

2020-07-24 Thread Wei Li
ARMv8.3-SPE adds an Alignment flag in the Events packet, filtering on
this event using PMSEVFR_EL1, and support for profiling Scalable Vector
Extension operations.

Patch 1: Update the kernel driver, mainly for PMSEVFR_EL1.

Patch 2: Update the decode process of Events packet and Operation Type
  packet in perf tool.

Patch 3-4: Synthesize unaligned address access events and partial/empty
  predicated SVE events, also add two itrace options for filtering.

Wei Li (4):
  drivers/perf: Add support for ARMv8.3-SPE
  perf: arm-spe: Add support for ARMv8.3-SPE
  perf auxtrace: Add new itrace options for ARMv8.3-SPE
  perf: arm-spe: Synthesize new events for ARMv8.3-SPE

 arch/arm64/include/asm/sysreg.h   |  4 +-
 drivers/perf/arm_spe_pmu.c| 18 +++--
 tools/perf/Documentation/itrace.txt   |  2 +
 .../util/arm-spe-decoder/arm-spe-decoder.c| 11 +++
 .../util/arm-spe-decoder/arm-spe-decoder.h|  3 +
 .../arm-spe-decoder/arm-spe-pkt-decoder.c | 69 ++-
 tools/perf/util/arm-spe.c | 61 
 tools/perf/util/auxtrace.c|  8 +++
 tools/perf/util/auxtrace.h|  4 ++
 9 files changed, 173 insertions(+), 7 deletions(-)

-- 
2.17.1



[PATCH 2/4] perf: arm-spe: Add support for ARMv8.3-SPE

2020-07-24 Thread Wei Li
Armv8.3 extends the SPE by adding:
- Alignment field in the Events packet, and filtering on this event
  using PMSEVFR_EL1.
- Support for the Scalable Vector Extension (SVE).

The main additions for SVE are:
- Recording the vector length for SVE operations in the Operation Type
  packet. It is not possible to filter on vector length.
- Incomplete predicate and empty predicate fields in the Events packet,
  and filtering on these events using PMSEVFR_EL1.

Add the corresponding decode process of Events packet and Operation Type
packet in perf tool.

Signed-off-by: Wei Li 
---
 .../arm-spe-decoder/arm-spe-pkt-decoder.c | 69 ++-
 1 file changed, 67 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index b94001b756c7..10a3692839de 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -347,6 +347,24 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
blen -= ret;
}
}
+   if (idx > 2) {
+   if (payload & 0x800) {
+   ret = snprintf(buf, buf_len, " ALIGNMENT");
+   buf += ret;
+   blen -= ret;
+   }
+   if (payload & 0x2) {
+   ret = snprintf(buf, buf_len, " SVE-PRED-PARTIAL");
+   buf += ret;
+   blen -= ret;
+   }
+   if (payload & 0x4) {
+   ret = snprintf(buf, buf_len, " SVE-PRED-EMPTY");
+   buf += ret;
+   blen -= ret;
+   }
+   }
+
if (ret < 0)
return ret;
blen -= ret;
@@ -354,8 +372,38 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
}
case ARM_SPE_OP_TYPE:
switch (idx) {
-   case 0: return snprintf(buf, buf_len, "%s", payload & 0x1 ?
-   "COND-SELECT" : "INSN-OTHER");
+   case 0: {
+   if (payload & 0x8) {
+   size_t blen = buf_len;
+
+   ret = snprintf(buf, buf_len, "SVE-OTHER");
+   buf += ret;
+   blen -= ret;
+   if (payload & 0x2) {
+   ret = snprintf(buf, buf_len, " FP");
+   buf += ret;
+   blen -= ret;
+   }
+   if (payload & 0x4) {
+   ret = snprintf(buf, buf_len, " PRED");
+   buf += ret;
+   blen -= ret;
+   }
+   if (payload & 0x70) {
+   ret = snprintf(buf, buf_len, " EVL %d",
+   32 << ((payload & 0x70) >> 4));
+   buf += ret;
+   blen -= ret;
+   }
+   if (ret < 0)
+   return ret;
+   blen -= ret;
+   return buf_len - blen;
+   } else {
+   return snprintf(buf, buf_len, "%s", payload & 0x1 ?
+   "COND-SELECT" : "INSN-OTHER");
+   }
+   }
case 1: {
size_t blen = buf_len;
 
@@ -385,6 +433,23 @@ int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
ret = snprintf(buf, buf_len, " SIMD-FP");
buf += ret;
blen -= ret;
+   } else if (payload & 0x8) {
+   if (payload & 0x4) {
+   ret = snprintf(buf, buf_len, " PRED");
+   buf += ret;
+   blen -= ret;
+   }
+   if (payload & 0x70) {
+   

[PATCH 4/4] perf: arm-spe: Synthesize new events for ARMv8.3-SPE

2020-07-24 Thread Wei Li
Synthesize unaligned address access events and partial/empty
predicated SVE operation introduced by ARMv8.3-SPE.

They can be filtered by itrace options when reporting.

Signed-off-by: Wei Li 
---
 .../util/arm-spe-decoder/arm-spe-decoder.c| 11 
 .../util/arm-spe-decoder/arm-spe-decoder.h|  3 +
 tools/perf/util/arm-spe.c | 61 +++
 3 files changed, 75 insertions(+)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 93e063f22be5..fac8102c0149 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -197,6 +197,17 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
if (payload & BIT(EV_MISPRED))
decoder->record.type |= ARM_SPE_BRANCH_MISS;
 
+   if ((idx == 4 || idx == 8) &&
+   (payload & BIT(EV_ALIGNMENT)))
+   decoder->record.type |= ARM_SPE_ALIGNMENT;
+
+   if ((idx == 4 || idx == 8) &&
+   (payload & BIT(EV_PARTIAL_PREDICATE)))
+   decoder->record.type |= ARM_SPE_PARTIAL_PREDICATE;
+
+   if ((idx == 4 || idx == 8) &&
+   (payload & BIT(EV_EMPTY_PREDICATE)))
+   decoder->record.type |= ARM_SPE_EMPTY_PREDICATE;
break;
case ARM_SPE_DATA_SOURCE:
break;
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
index a5111a8d4360..d165418fcc13 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
@@ -39,6 +39,9 @@ enum arm_spe_sample_type {
ARM_SPE_TLB_MISS= 1 << 5,
ARM_SPE_BRANCH_MISS = 1 << 6,
ARM_SPE_REMOTE_ACCESS   = 1 << 7,
+   ARM_SPE_ALIGNMENT   = 1 << 8,
+   ARM_SPE_PARTIAL_PREDICATE   = 1 << 9,
+   ARM_SPE_EMPTY_PREDICATE = 1 << 10,
 };
 
 struct arm_spe_record {
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 3882a5360ada..e36d6eea269b 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -53,6 +53,8 @@ struct arm_spe {
u8  sample_tlb;
u8  sample_branch;
u8  sample_remote_access;
+   u8  sample_alignment;
+   u8  sample_sve;
 
u64 l1d_miss_id;
u64 l1d_access_id;
@@ -62,6 +64,9 @@ struct arm_spe {
u64 tlb_access_id;
u64 branch_miss_id;
u64 remote_access_id;
+   u64 alignment_id;
+   u64 epred_sve_id;
+   u64 ppred_sve_id;
 
u64 kernel_start;
 
@@ -344,6 +349,30 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
return err;
}
 
+   if (spe->sample_alignment &&
+   (record->type & ARM_SPE_ALIGNMENT)) {
+   err = arm_spe_synth_spe_events_sample(speq,
+ spe->alignment_id);
+   if (err)
+   return err;
+   }
+
+   if (spe->sample_sve) {
+   if (record->type & ARM_SPE_EMPTY_PREDICATE) {
+   err = arm_spe_synth_spe_events_sample(
+   speq, spe->epred_sve_id);
+   if (err)
+   return err;
+   }
+
+   if (record->type & ARM_SPE_PARTIAL_PREDICATE) {
+   err = arm_spe_synth_spe_events_sample(
+   speq, spe->ppred_sve_id);
+   if (err)
+   return err;
+   }
+   }
+
return 0;
 }
 
@@ -907,6 +936,38 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
id += 1;
}
 
+   if (spe->synth_opts.alignment) {
+   spe->sample_alignment = true;
+
+   /* Alignment */
+   err = arm_spe_synth_event(session, , id);
+   if (err)
+   return err;
+   spe->alignment_id = id;
+   arm_spe_set_event_name(evlist, id, "alignment");
+   id += 1;
+   }
+
+   if (spe->synt

[PATCH 3/4] perf auxtrace: Add new itrace options for ARMv8.3-SPE

2020-07-24 Thread Wei Li
This patch adds two options to synthesize events, which are
described below:

 'u': synthesize unaligned address access events
 'v': synthesize partial/empty predicated SVE events

These two options will be used by Arm SPE as their first consumer.

Signed-off-by: Wei Li 
---
 tools/perf/Documentation/itrace.txt | 2 ++
 tools/perf/util/auxtrace.c  | 8 
 tools/perf/util/auxtrace.h  | 4 
 3 files changed, 14 insertions(+)

diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index e817179c5027..25bcf3622709 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -13,6 +13,8 @@
m   synthesize last level cache events
t   synthesize TLB events
a   synthesize remote access events
+   u   synthesize unaligned address access events
+   v   synthesize partial/empty predicated SVE events
g   synthesize a call chain (use with i or x)
G   synthesize a call chain on existing event records
l   synthesize last branch entries (use with i or x)
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index 25c639ac4ad4..2033eb3708ec 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1334,6 +1334,8 @@ void itrace_synth_opts__set_default(struct itrace_synth_opts *synth_opts,
synth_opts->llc = true;
synth_opts->tlb = true;
synth_opts->remote_access = true;
+   synth_opts->alignment = true;
+   synth_opts->sve = true;
 
if (no_sample) {
synth_opts->period_type = PERF_ITRACE_PERIOD_INSTRUCTIONS;
@@ -1507,6 +1509,12 @@ int itrace_parse_synth_opts(const struct option *opt, const char *str,
case 'a':
synth_opts->remote_access = true;
break;
+   case 'u':
+   synth_opts->alignment = true;
+   break;
+   case 'v':
+   synth_opts->sve = true;
+   break;
case ' ':
case ',':
break;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 142ccf7d34df..972df7b06b0d 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -116,6 +116,8 @@ struct itrace_synth_opts {
boolllc;
booltlb;
boolremote_access;
+   boolalignment;
+   boolsve;
unsigned intcallchain_sz;
unsigned intlast_branch_sz;
unsigned long long  period;
@@ -617,6 +619,8 @@ bool auxtrace__evsel_is_auxtrace(struct perf_session *session,
 "  m:  synthesize last level cache events\n" \
 "  t:  synthesize TLB events\n" \
 "  a:  synthesize remote access events\n" \
+"  u:  synthesize unaligned address access events\n" \
+"  v:  synthesize partial/empty predicated SVE events\n" \
 "  g[len]: synthesize a call chain (use with i or x)\n" \
 "  l[len]: synthesize last branch entries (use with i or x)\n" \
 "  sNUMBER:skip initial number of events\n"\
-- 
2.17.1



[PATCH] perf: arm-spe: Fix check error when synthesizing events

2020-07-24 Thread Wei Li
In arm_spe_read_record(), when we are processing an Events packet,
'decoder->packet.index' is the length of the payload in bytes, as
transformed by payloadlen(). So correct the checks of 'idx' accordingly.

Signed-off-by: Wei Li 
---
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 302a14d0aca9..93e063f22be5 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -182,15 +182,15 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
if (payload & BIT(EV_TLB_ACCESS))
decoder->record.type |= ARM_SPE_TLB_ACCESS;
 
-   if ((idx == 1 || idx == 2 || idx == 3) &&
+   if ((idx == 2 || idx == 4 || idx == 8) &&
(payload & BIT(EV_LLC_MISS)))
decoder->record.type |= ARM_SPE_LLC_MISS;
 
-   if ((idx == 1 || idx == 2 || idx == 3) &&
+   if ((idx == 2 || idx == 4 || idx == 8) &&
(payload & BIT(EV_LLC_ACCESS)))
decoder->record.type |= ARM_SPE_LLC_ACCESS;
 
-   if ((idx == 1 || idx == 2 || idx == 3) &&
+   if ((idx == 2 || idx == 4 || idx == 8) &&
(payload & BIT(EV_REMOTE_ACCESS)))
decoder->record.type |= ARM_SPE_REMOTE_ACCESS;
 
-- 
2.17.1



[PATCH v2 2/2] perf tools: ARM SPE code cleanup

2020-07-24 Thread Wei Li
- Firstly, the function auxtrace_record__init() will be invoked only
  once and the variable "arm_spe_pmus" will not be used afterwards,
  thus we don't need to check whether "arm_spe_pmus" is NULL;
- Secondly, even though SPE is micro-architecture dependent, so far it
  only supports "statistical-profiling-extension-v1" and we have no
  chance to use multiple SPE PMU events in a perf command.

So remove the useless check code to make it clear.

Signed-off-by: Wei Li 
---
 tools/perf/arch/arm/util/auxtrace.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
index 28a5d0c18b1d..b187bddbd01a 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -57,17 +57,15 @@ struct auxtrace_record
struct evsel *evsel;
bool found_etm = false;
struct perf_pmu *found_spe = NULL;
-   static struct perf_pmu **arm_spe_pmus = NULL;
-   static int nr_spes = 0;
+   struct perf_pmu **arm_spe_pmus = NULL;
+   int nr_spes = 0;
int i = 0;
 
if (!evlist)
return NULL;
 
cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
-
-   if (!arm_spe_pmus)
-   arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
+   arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
 
evlist__for_each_entry(evlist, evsel) {
if (cs_etm_pmu &&
@@ -84,6 +82,7 @@ struct auxtrace_record
}
}
}
+   free(arm_spe_pmus);
 
if (found_etm && found_spe) {
pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n");
-- 
2.17.1



[PATCH v2 0/2] perf tools: Fix record failure when mixed with ARM SPE event

2020-07-24 Thread Wei Li
v1 -> v2:
 - Optimize code in patch 1 as Mathieu advised.
 - Fix memleak in patch 2.
 - Detail the commit info to explain the reason.

This patch set fixes perf record failure when we mix arm_spe_x event
with other events in specific order.

Wei Li (2):
  perf tools: Fix record failure when mixed with ARM SPE event
  perf tools: ARM SPE code cleanup

 tools/perf/arch/arm/util/auxtrace.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

-- 
2.17.1



[PATCH v2 1/2] perf tools: Fix record failure when mixed with ARM SPE event

2020-07-24 Thread Wei Li
When recording with the cache-misses and arm_spe_x events, I found that
perf record just fails without showing any error info if I put
cache-misses after the 'arm_spe_x' event.

[root@localhost 0620]# perf record -e cache-misses -e \
arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,\
jitter=1,store_filter=1,min_latency=0/ sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.067 MB perf.data ]
[root@localhost 0620]# perf record -e \
arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,\
store_filter=1,min_latency=0/ -e cache-misses sleep 1
[root@localhost 0620]#

The current code can only work if the only event to be traced is an
'arm_spe_x', or if it is the last event to be specified. Otherwise the
last event type will be checked against all the arm_spe_pmus[i]->types,
none will match and an out of bound 'i' index will be used in
arm_spe_recording_init().

We don't support concurrent multiple arm_spe_x events currently; that
is checked in arm_spe_recording_options(), which will show the relevant
info. So add a check and record the first found 'arm_spe_pmu' to fix
this issue here.

Fixes: ffd3d18c20b8d ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")
Signed-off-by: Wei Li 
---
 tools/perf/arch/arm/util/auxtrace.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
index 0a6e75b8777a..28a5d0c18b1d 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -56,7 +56,7 @@ struct auxtrace_record
struct perf_pmu *cs_etm_pmu;
struct evsel *evsel;
bool found_etm = false;
-   bool found_spe = false;
+   struct perf_pmu *found_spe = NULL;
static struct perf_pmu **arm_spe_pmus = NULL;
static int nr_spes = 0;
int i = 0;
@@ -74,12 +74,12 @@ struct auxtrace_record
evsel->core.attr.type == cs_etm_pmu->type)
found_etm = true;
 
-   if (!nr_spes)
+   if (!nr_spes || found_spe)
continue;
 
for (i = 0; i < nr_spes; i++) {
if (evsel->core.attr.type == arm_spe_pmus[i]->type) {
-   found_spe = true;
+   found_spe = arm_spe_pmus[i];
break;
}
}
@@ -96,7 +96,7 @@ struct auxtrace_record
 
 #if defined(__aarch64__)
if (found_spe)
-   return arm_spe_recording_init(err, arm_spe_pmus[i]);
+   return arm_spe_recording_init(err, found_spe);
 #endif
 
/*
-- 
2.17.1



[PATCH] arm64: mm: free unused memmap for sparse memory model that define VMEMMAP

2020-07-21 Thread Wei Li
For memory holes, the sparse memory model configured with
SPARSEMEM_VMEMMAP does not free the reserved memory for the page map;
this patch does it.

Signed-off-by: Wei Li 
Signed-off-by: Chen Feng 
Signed-off-by: Xia Qing 
---
 arch/arm64/mm/init.c | 81 +---
 1 file changed, 71 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 1e93cfc7c47a..d1b56b47d5ba 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -441,7 +441,48 @@ void __init bootmem_init(void)
memblock_dump_all();
 }

-#ifndef CONFIG_SPARSEMEM_VMEMMAP
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+#define VMEMMAP_PAGE_INUSE 0xFD
+static inline void free_memmap(unsigned long start_pfn, unsigned long end_pfn)
+{
+   unsigned long addr, end;
+   unsigned long next;
+   pmd_t *pmd;
+   void *page_addr;
+   phys_addr_t phys_addr;
+
+   addr = (unsigned long)pfn_to_page(start_pfn);
+   end = (unsigned long)pfn_to_page(end_pfn);
+
+   pmd = pmd_offset(pud_offset(pgd_offset_k(addr), addr), addr);
+   for (; addr < end; addr = next, pmd++) {
+   next = pmd_addr_end(addr, end);
+
+   if (!pmd_present(*pmd))
+   continue;
+
+   if (IS_ALIGNED(addr, PMD_SIZE) &&
+   IS_ALIGNED(next, PMD_SIZE)) {
+   phys_addr = __pfn_to_phys(pmd_pfn(*pmd));
+   free_bootmem(phys_addr, PMD_SIZE);
+   pmd_clear(pmd);
+   } else {
+   /* If here, we are freeing vmemmap pages. */
+   memset((void *)addr, VMEMMAP_PAGE_INUSE, next - addr);
+   page_addr = page_address(pmd_page(*pmd));
+
+   if (!memchr_inv(page_addr, VMEMMAP_PAGE_INUSE,
+   PMD_SIZE)) {
+   phys_addr = __pfn_to_phys(pmd_pfn(*pmd));
+   free_bootmem(phys_addr, PMD_SIZE);
+   pmd_clear(pmd);
+   }
+   }
+   }
+
+   flush_tlb_all();
+}
+#else
 static inline void free_memmap(unsigned long start_pfn, unsigned long end_pfn)
 {
struct page *start_pg, *end_pg;
@@ -468,31 +509,53 @@ static inline void free_memmap(unsigned long start_pfn, unsigned long end_pfn)
memblock_free(pg, pgend - pg);
 }

+#endif
+
 /*
  * The mem_map array can get very big. Free the unused area of the memory map.
  */
 static void __init free_unused_memmap(void)
 {
-   unsigned long start, prev_end = 0;
+   unsigned long start, cur_start, prev_end = 0;
struct memblock_region *reg;

for_each_memblock(memory, reg) {
-   start = __phys_to_pfn(reg->base);
+   cur_start = __phys_to_pfn(reg->base);

 #ifdef CONFIG_SPARSEMEM
/*
 * Take care not to free memmap entries that don't exist due
 * to SPARSEMEM sections which aren't present.
 */
-   start = min(start, ALIGN(prev_end, PAGES_PER_SECTION));
-#endif
+   start = min(cur_start, ALIGN(prev_end, PAGES_PER_SECTION));
+
/*
-* If we had a previous bank, and there is a space between the
-* current bank and the previous, free it.
+* Free memory in the case of:
+* 1. if cur_start - prev_end <= PAGES_PER_SECTION,
+* free pre_end ~ cur_start.
+* 2. if cur_start - prev_end > PAGES_PER_SECTION,
+* free pre_end ~ ALIGN(prev_end, PAGES_PER_SECTION).
 */
if (prev_end && prev_end < start)
free_memmap(prev_end, start);

+   /*
+* Free memory in the case of:
+* if cur_start - prev_end > PAGES_PER_SECTION,
+* free ALIGN_DOWN(cur_start, PAGES_PER_SECTION) ~ cur_start.
+*/
+   if (cur_start > start &&
+   !IS_ALIGNED(cur_start, PAGES_PER_SECTION))
+   free_memmap(ALIGN_DOWN(cur_start, PAGES_PER_SECTION),
+   cur_start);
+#else
+   /*
+* If we had a previous bank, and there is a space between the
+* current bank and the previous, free it.
+*/
+   if (prev_end && prev_end < cur_start)
+   free_memmap(prev_end, cur_start);
+#endif
/*
 * Align up here since the VM subsystem insists that the
 * memmap entries are valid from the bank end aligned to
@@ -507,7 +570,6 @@ static void __init free_unused_memmap(void)
free_memmap(prev_end, ALIGN(prev_end, PAGES_PER_SECTION));
 #endif
 }
-#endif /* !CONFIG_SPARSEMEM_VMEMMAP */

 /*
  * 

[PATCH 2/2] perf tools: Fix record failure when mixed with ARM SPE event

2020-06-23 Thread Wei Li
When recording with the cache-misses and arm_spe_x events, I found that
perf record just fails without showing any error info if I put
cache-misses after the arm_spe_x event.

[root@localhost 0620]# perf record -e cache-misses -e \
arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,\
jitter=1,store_filter=1,min_latency=0/ sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.067 MB perf.data ]
[root@localhost 0620]# perf record -e \
arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,\
store_filter=1,min_latency=0/ -e cache-misses sleep 1
[root@localhost 0620]#

Finally, I found that the reason is that the parameter 'arm_spe_pmu'
passed to arm_spe_recording_init() in auxtrace_record__init() is wrong.
When the arm_spe_x event is not the last event, 'arm_spe_pmus[i]' will
be out of bounds.

It seems that the code can't support multiple different concurrent
arm_spe_x events currently. So add code to check for this and record
the found 'arm_spe_pmu' to fix the issue.

In fact, we don't support multiple identical concurrent arm_spe_x
events either; that is checked in arm_spe_recording_options(), which
will show the relevant info.

Fixes: ffd3d18c20b8d ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")
Signed-off-by: Wei Li 
---
 tools/perf/arch/arm/util/auxtrace.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
index 62b7b03d691a..7bb6f29e766c 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -58,6 +58,7 @@ struct auxtrace_record
bool found_etm = false;
bool found_spe = false;
static struct perf_pmu **arm_spe_pmus;
+   static struct perf_pmu *arm_spe_pmu;
static int nr_spes = 0;
int i = 0;
 
@@ -77,6 +78,13 @@ struct auxtrace_record
 
for (i = 0; i < nr_spes; i++) {
if (evsel->core.attr.type == arm_spe_pmus[i]->type) {
+   if (found_spe && (arm_spe_pmu != arm_spe_pmus[i])) {
+   pr_err("Concurrent multiple SPE operation not currently supported\n");
+   *err = -EOPNOTSUPP;
+   return NULL;
+   }
+
+   arm_spe_pmu = arm_spe_pmus[i];
found_spe = true;
break;
}
@@ -94,7 +102,7 @@ struct auxtrace_record
 
 #if defined(__aarch64__)
if (found_spe)
-   return arm_spe_recording_init(err, arm_spe_pmus[i]);
+   return arm_spe_recording_init(err, arm_spe_pmu);
 #endif
 
/*
-- 
2.17.1



[PATCH 0/2] perf tools: Fix record failure when mixed with ARM SPE event

2020-06-23 Thread Wei Li
This patch set fixes perf record failure when we mix arm_spe_x event with
other events in specific order.

Wei Li (2):
  perf tools: ARM SPE code cleanup
  perf tools: Fix record failure when mixed with ARM SPE event

 tools/perf/arch/arm/util/auxtrace.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

-- 
2.17.1



[PATCH 1/2] perf tools: ARM SPE code cleanup

2020-06-23 Thread Wei Li
Remove the useless check to make the code clearer.

Signed-off-by: Wei Li 
---
 tools/perf/arch/arm/util/auxtrace.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
index 0a6e75b8777a..62b7b03d691a 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -57,7 +57,7 @@ struct auxtrace_record
struct evsel *evsel;
bool found_etm = false;
bool found_spe = false;
-   static struct perf_pmu **arm_spe_pmus = NULL;
+   static struct perf_pmu **arm_spe_pmus;
static int nr_spes = 0;
int i = 0;
 
@@ -65,9 +65,7 @@ struct auxtrace_record
return NULL;
 
cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
-
-   if (!arm_spe_pmus)
-   arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
+   arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
 
evlist__for_each_entry(evlist, evsel) {
if (cs_etm_pmu &&
-- 
2.17.1



[PATCH] perf list: Fix memleak in print_sdt_events()

2020-06-20 Thread Wei Li
Valgrind check info:
==30629== Command: ./perf list sdt
==30629==
==30629==
==30629== HEAP SUMMARY:
==30629== in use at exit: 12,226 bytes in 195 blocks
==30629==   total heap usage: 3,239 allocs, 3,044 frees, 3,546,759 bytes allocated
==30629==
==30629== 8,028 bytes in 115 blocks are definitely lost in loss record 54 of 54
==30629==at 0x4885E44: realloc (vg_replace_malloc.c:785)
==30629==by 0x56468C3: __vasprintf_chk (in /usr/lib64/libc-2.28.so)
==30629==by 0x5646723: __asprintf_chk (in /usr/lib64/libc-2.28.so)
==30629==by 0x4E2C3F: asprintf (stdio2.h:181)
==30629==by 0x4E2C3F: print_sdt_events (parse-events.c:2611)
==30629==by 0x446587: cmd_list (builtin-list.c:87)
==30629==by 0x4B8947: run_builtin (perf.c:312)
==30629==by 0x4B8C13: handle_internal_command (perf.c:364)
==30629==by 0x434717: run_argv (perf.c:408)
==30629==by 0x434717: main (perf.c:538)
==30629==
==30629== LEAK SUMMARY:
==30629==definitely lost: 8,028 bytes in 115 blocks
==30629==indirectly lost: 0 bytes in 0 blocks
==30629==  possibly lost: 0 bytes in 0 blocks
==30629==still reachable: 4,198 bytes in 80 blocks
==30629== suppressed: 0 bytes in 0 blocks

Signed-off-by: Wei Li 
---
 tools/perf/util/parse-events.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 3decbb203846..b2acacd5646e 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2579,13 +2579,12 @@ void print_sdt_events(const char *subsys_glob, const char *event_glob,
struct probe_cache *pcache;
struct probe_cache_entry *ent;
struct strlist *bidlist, *sdtlist;
-   struct strlist_config cfg = {.dont_dupstr = true};
struct str_node *nd, *nd2;
char *buf, *path, *ptr = NULL;
bool show_detail = false;
int ret;
 
-   sdtlist = strlist__new(NULL, &cfg);
+   sdtlist = strlist__new(NULL, NULL);
if (!sdtlist) {
pr_debug("Failed to allocate new strlist for SDT\n");
return;
@@ -2610,8 +2609,10 @@ void print_sdt_events(const char *subsys_glob, const char *event_glob,
continue;
ret = asprintf(&buf, "%s:%s@%s", ent->pev.group,
ent->pev.event, nd->s);
-   if (ret > 0)
+   if (ret > 0) {
strlist__add(sdtlist, buf);
+   free(buf);
+   }
}
probe_cache__delete(pcache);
}
-- 
2.19.1



[PATCH] perf report TUI: Fix segmentation fault in perf_evsel__hists_browse()

2020-06-12 Thread Wei Li
The segmentation fault can be reproduced with the following steps:
1) Run perf report in TUI mode.
2) Type '/x' to filter symbols so that nothing matches.
3) Press enter with no entry selected.
Then it will report a segmentation fault.

It is caused by the missing check of browser->he_selection when
accessing its member res_samples in perf_evsel__hists_browse().

These operations are only meaningful for a selected sample, so we can
skip them when nothing is selected.

Fixes: 4968ac8fb7c3 ("perf report: Implement browsing of individual samples")
Signed-off-by: Wei Li 
---
 tools/perf/ui/browsers/hists.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 487e54ef56a9..2101b6b770d8 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2288,6 +2288,11 @@ static struct thread *hist_browser__selected_thread(struct hist_browser *browser
return browser->he_selection->thread;
 }
 
+static struct res_sample *hist_browser__selected_res_sample(struct hist_browser *browser)
+{
+   return browser->he_selection ? browser->he_selection->res_samples : NULL;
+}
+
 /* Check whether the browser is for 'top' or 'report' */
 static inline bool is_report_browser(void *timer)
 {
@@ -3357,16 +3362,16 @@ static int perf_evsel__hists_browse(struct evsel *evsel, int nr_events,
 &options[nr_options], NULL, NULL, evsel);
nr_options += add_res_sample_opt(browser, &actions[nr_options],
 &options[nr_options],
-hist_browser__selected_entry(browser)->res_samples,
-evsel, A_NORMAL);
+hist_browser__selected_res_sample(browser),
+evsel, A_NORMAL);
nr_options += add_res_sample_opt(browser, &actions[nr_options],
 &options[nr_options],
-hist_browser__selected_entry(browser)->res_samples,
-evsel, A_ASM);
+hist_browser__selected_res_sample(browser),
+evsel, A_ASM);
nr_options += add_res_sample_opt(browser, &actions[nr_options],
 &options[nr_options],
-hist_browser__selected_entry(browser)->res_samples,
-evsel, A_SOURCE);
+hist_browser__selected_res_sample(browser),
+evsel, A_SOURCE);
nr_options += add_switch_opt(browser, &actions[nr_options],
 &options[nr_options]);
 skip_scripting:
-- 
2.17.1



[PATCH 0/4] perf: Fix memory errors

2020-05-21 Thread Wei Li
Fix several memory errors in perf tool.

Hongbo Yao (1):
  perf metricgroup: Fix memory leak of metric_events

Li Bin (2):
  perf svghelper: Fix memory leak in svg_build_topology_map
  perf util: Fix potential segment fault in put_tracepoints_path

Xie XiuQi (1):
  perf util: Fix memory leak of prefix_if_not_in

 tools/perf/util/metricgroup.c  |  3 +++
 tools/perf/util/sort.c |  2 +-
 tools/perf/util/svghelper.c| 10 +++---
 tools/perf/util/trace-event-info.c |  2 +-
 4 files changed, 12 insertions(+), 5 deletions(-)

-- 
2.17.1



[PATCH 4/4] perf util: Fix potential segment fault in put_tracepoints_path

2020-05-21 Thread Wei Li
From: Li Bin 

This patch fixes a potential segmentation fault triggered in
put_tracepoints_path() when the address of the local variable 'path'
is freed in the error path of record_saved_cmdline().

Signed-off-by: Li Bin 
---
 tools/perf/util/trace-event-info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/trace-event-info.c b/tools/perf/util/trace-event-info.c
index 086e98ff42a3..0e5c4786f296 100644
--- a/tools/perf/util/trace-event-info.c
+++ b/tools/perf/util/trace-event-info.c
@@ -428,7 +428,7 @@ get_tracepoints_path(struct list_head *pattrs)
if (!ppath->next) {
 error:
pr_debug("No memory to alloc tracepoints list\n");
-   put_tracepoints_path(&path);
+   put_tracepoints_path(path.next);
return NULL;
}
 next:
-- 
2.17.1



[PATCH 1/4] perf metricgroup: Fix memory leak of metric_events

2020-05-21 Thread Wei Li
From: Hongbo Yao 

Fix memory leak of metric_events in function metricgroup__setup_events()

Signed-off-by: Hongbo Yao 
---
 tools/perf/util/metricgroup.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 926449a7cdbf..69bf0f4d646e 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -184,6 +184,7 @@ static int metricgroup__setup_events(struct list_head *groups,
if (!evsel) {
pr_debug("Cannot resolve %s: %s\n",
eg->metric_name, eg->metric_expr);
+   free(metric_events);
continue;
}
for (i = 0; i < eg->idnum; i++)
@@ -191,11 +192,13 @@ static int metricgroup__setup_events(struct list_head *groups,
me = metricgroup__lookup(metric_events_list, evsel, true);
if (!me) {
ret = -ENOMEM;
+   free(metric_events);
break;
}
expr = malloc(sizeof(struct metric_expr));
if (!expr) {
ret = -ENOMEM;
+   free(metric_events);
break;
}
expr->metric_expr = eg->metric_expr;
-- 
2.17.1



[PATCH 3/4] perf util: Fix memory leak of prefix_if_not_in

2020-05-21 Thread Wei Li
From: Xie XiuQi 

Free "str" before returning when asprintf() fails, to avoid a memory
leak.

Signed-off-by: Xie XiuQi 
---
 tools/perf/util/sort.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index f14cc728c358..8ed777565c82 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -2811,7 +2811,7 @@ static char *prefix_if_not_in(const char *pre, char *str)
return str;
 
if (asprintf(&n, "%s,%s", pre, str) < 0)
-   return NULL;
+   n = NULL;
 
free(str);
return n;
-- 
2.17.1



[PATCH 2/4] perf svghelper: Fix memory leak in svg_build_topology_map

2020-05-21 Thread Wei Li
From: Li Bin 

Fix a leak of the memory pointed to by t.sib_thr and t.sib_core in
svg_build_topology_map().

Signed-off-by: Li Bin 
---
 tools/perf/util/svghelper.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/svghelper.c b/tools/perf/util/svghelper.c
index 96f941e01681..e2b3b0e2fafe 100644
--- a/tools/perf/util/svghelper.c
+++ b/tools/perf/util/svghelper.c
@@ -754,6 +754,7 @@ int svg_build_topology_map(struct perf_env *env)
int i, nr_cpus;
struct topology t;
char *sib_core, *sib_thr;
+   int ret;
 
nr_cpus = min(env->nr_cpus_online, MAX_NR_CPUS);
 
@@ -767,12 +768,14 @@ int svg_build_topology_map(struct perf_env *env)
 
if (!t.sib_core || !t.sib_thr) {
fprintf(stderr, "topology: no memory\n");
+   ret = -1;
goto exit;
}
 
for (i = 0; i < env->nr_sibling_cores; i++) {
if (str_to_bitmap(sib_core, &t.sib_core[i], nr_cpus)) {
fprintf(stderr, "topology: can't parse siblings map\n");
+   ret = -1;
goto exit;
}
 
@@ -782,6 +785,7 @@ int svg_build_topology_map(struct perf_env *env)
for (i = 0; i < env->nr_sibling_threads; i++) {
if (str_to_bitmap(sib_thr, &t.sib_thr[i], nr_cpus)) {
fprintf(stderr, "topology: can't parse siblings map\n");
+   ret = -1;
goto exit;
}
 
@@ -791,6 +795,7 @@ int svg_build_topology_map(struct perf_env *env)
topology_map = malloc(sizeof(int) * nr_cpus);
if (!topology_map) {
fprintf(stderr, "topology: no memory\n");
+   ret = -1;
goto exit;
}
 
@@ -798,12 +803,11 @@ int svg_build_topology_map(struct perf_env *env)
topology_map[i] = -1;
 
scan_core_topology(topology_map, , nr_cpus);
-
-   return 0;
+   ret = 0;
 
 exit:
zfree(&t.sib_core);
zfree(&t.sib_thr);
 
-   return -1;
+   return ret;
 }
-- 
2.17.1



[PATCH v3] kdb: Remove the misfeature 'KDBFLAGS'

2020-05-21 Thread Wei Li
Currently, 'KDBFLAGS' is an internal variable of kdb, combining
'KDBDEBUG' and the state flags. It is shown only when 'KDBDEBUG' is
set, yet the user can also define an environment variable named
'KDBFLAGS'. This is puzzling indeed.

After communicating with Daniel, it seems that 'KDBFLAGS' is a
misfeature. So let's replace 'KDBFLAGS' with 'KDBDEBUG' to show just
the value we wrote into it. After this modification, we can use
`md4c1 kdb_flags` instead to observe the state flags.

Suggested-by: Daniel Thompson 
Signed-off-by: Wei Li 
---
v2 -> v3:
 - Change to replace the internal env 'KDBFLAGS' with 'KDBDEBUG'.
v1 -> v2:
 - Fix lack of braces.

 kernel/debug/kdb/kdb_main.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 4fc43fb17127..392029287083 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -418,8 +418,7 @@ int kdb_set(int argc, const char **argv)
argv[2]);
return 0;
}
-   kdb_flags = (kdb_flags &
-~(KDB_DEBUG_FLAG_MASK << KDB_DEBUG_FLAG_SHIFT))
+   kdb_flags = (kdb_flags & ~KDB_DEBUG(MASK))
| (debugflags << KDB_DEBUG_FLAG_SHIFT);
 
return 0;
@@ -2081,7 +2080,8 @@ static int kdb_env(int argc, const char **argv)
}
 
if (KDB_DEBUG(MASK))
-   kdb_printf("KDBFLAGS=0x%x\n", kdb_flags);
+   kdb_printf("KDBDEBUG=0x%x\n",
+   (kdb_flags & KDB_DEBUG(MASK)) >> KDB_DEBUG_FLAG_SHIFT);
 
return 0;
 }
-- 
2.17.1



[PATCH v2] kdb: Make the internal env 'KDBFLAGS' undefinable

2020-05-16 Thread Wei Li
'KDBFLAGS' is an internal kdb variable that combines the 'KDBDEBUG' value
with the state flags. But the user can also define an environment variable
named 'KDBFLAGS', so let's make it undefinable to avoid confusion.

Signed-off-by: Wei Li 
Reviewed-by: Douglas Anderson 
---
v1 -> v2:
 - Fix lack of braces.

 kernel/debug/kdb/kdb_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 4fc43fb17127..75b798340300 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -423,6 +423,8 @@ int kdb_set(int argc, const char **argv)
| (debugflags << KDB_DEBUG_FLAG_SHIFT);
 
return 0;
+   } else if (strcmp(argv[1], "KDBFLAGS") == 0) {
+   return KDB_NOPERM;
}
 
/*
-- 
2.17.1



[PATCH] perf timechart: Remove redundant assignment

2020-05-11 Thread Wei Li
Remove redundant assignment, no functional change.

Signed-off-by: Wei Li 
---
 tools/perf/builtin-timechart.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/perf/builtin-timechart.c b/tools/perf/builtin-timechart.c
index 9e84fae9b096..5e4f809d7e5d 100644
--- a/tools/perf/builtin-timechart.c
+++ b/tools/perf/builtin-timechart.c
@@ -1149,7 +1149,6 @@ static void draw_io_bars(struct timechart *tchart)
}
 
svg_box(Y, c->start_time, c->end_time, "process3");
-   sample = c->io_samples;
for (sample = c->io_samples; sample; sample = 
sample->next) {
double h = (double)sample->bytes / c->max_bytes;
 
-- 
2.17.1



[PATCH] kdb: Make the internal env 'KDBFLAGS' undefinable

2020-05-10 Thread Wei Li
'KDBFLAGS' is an internal kdb variable that combines the 'KDBDEBUG' value
with the state flags. But the user can also define an environment variable
named 'KDBFLAGS', so let's make it undefinable to avoid confusion.

Signed-off-by: Wei Li 
---
 kernel/debug/kdb/kdb_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 4fc43fb17127..d3d060136821 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -423,7 +423,8 @@ int kdb_set(int argc, const char **argv)
| (debugflags << KDB_DEBUG_FLAG_SHIFT);
 
return 0;
-   }
+   } else if (strcmp(argv[1], "KDBFLAGS") == 0)
+   return KDB_NOPERM;
 
/*
 * Tokenizer squashed the '=' sign.  argv[1] is variable
-- 
2.17.1



[PATCH 0/4] arm64: kgdb/kdb: Fix single-step debugging issues

2020-05-09 Thread Wei Li
This patch set fixes several issues with single-step debugging
in kgdb/kdb on arm64.

These issues seem to have been shelved for a very long time,
but I still hope to solve them, as single-step debugging
is a useful feature.

Note:
Based on patch "arm64: cacheflush: Fix KGDB trap detection",
https://www.spinics.net/lists/arm-kernel/msg803741.html

Wei Li (4):
  arm64: kgdb: Fix single-step exception handling oops
  arm64: Extract kprobes_save_local_irqflag() and
kprobes_restore_local_irqflag()
  arm64: kgdb: Fix single-stepping into the irq handler wrongly
  arm64: kgdb: Set PSTATE.SS to 1 to reenable single-step

 arch/arm64/include/asm/debug-monitors.h |  6 ++
 arch/arm64/kernel/debug-monitors.c  | 28 -
 arch/arm64/kernel/kgdb.c| 16 +++---
 arch/arm64/kernel/probes/kprobes.c  | 28 ++---
 4 files changed, 48 insertions(+), 30 deletions(-)

-- 
2.17.1



[PATCH 2/4] arm64: Extract kprobes_save_local_irqflag() and kprobes_restore_local_irqflag()

2020-05-09 Thread Wei Li
PSTATE.I and PSTATE.D are crucial for single-step to work correctly.

Without disabling interrupts on the local CPU, an interrupt may occur
between the exception return and the start of the out-of-line
single-step, resulting in wrongly single-stepping into the interrupt
handler. And if the D bit is set at that point, an undefined exception
is raised; when its handler enables debug, a single-step exception is
generated unexpectedly.

As these bits are already maintained properly in
kprobes_save_local_irqflag() and kprobes_restore_local_irqflag() for
kprobes, extract that logic as kernel_prepare_single_step() and
kernel_cleanup_single_step() for general use.

Signed-off-by: Wei Li 
---
 arch/arm64/include/asm/debug-monitors.h |  4 
 arch/arm64/kernel/debug-monitors.c  | 26 +++
 arch/arm64/kernel/probes/kprobes.c  | 28 ++---
 3 files changed, 32 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/include/asm/debug-monitors.h 
b/arch/arm64/include/asm/debug-monitors.h
index 7619f473155f..b62469f3475b 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -113,6 +113,10 @@ void user_fastforward_single_step(struct task_struct 
*task);
 void kernel_enable_single_step(struct pt_regs *regs);
 void kernel_disable_single_step(void);
 int kernel_active_single_step(void);
+void kernel_prepare_single_step(unsigned long *flags,
+   struct pt_regs *regs);
+void kernel_cleanup_single_step(unsigned long flags,
+   struct pt_regs *regs);
 
 #ifdef CONFIG_HAVE_HW_BREAKPOINT
 int reinstall_suspended_bps(struct pt_regs *regs);
diff --git a/arch/arm64/kernel/debug-monitors.c 
b/arch/arm64/kernel/debug-monitors.c
index 48222a4760c2..25ce6b5a52d2 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -429,6 +429,32 @@ int kernel_active_single_step(void)
 }
 NOKPROBE_SYMBOL(kernel_active_single_step);
 
+/*
+ * Interrupts need to be disabled before single-step mode is set, and not
+ * reenabled until after single-step mode ends.
+ * Without disabling interrupt on local CPU, there is a chance of
+ * interrupt occurrence in the period of exception return and  start of
+ * out-of-line single-step, that result in wrongly single stepping
+ * into the interrupt handler.
+ */
+void kernel_prepare_single_step(unsigned long *flags,
+   struct pt_regs *regs)
+{
+   *flags = regs->pstate & DAIF_MASK;
+   regs->pstate |= PSR_I_BIT;
+   /* Unmask PSTATE.D for enabling software step exceptions. */
+   regs->pstate &= ~PSR_D_BIT;
+}
+NOKPROBE_SYMBOL(kernel_prepare_single_step);
+
+void kernel_cleanup_single_step(unsigned long flags,
+   struct pt_regs *regs)
+{
+   regs->pstate &= ~DAIF_MASK;
+   regs->pstate |= flags;
+}
+NOKPROBE_SYMBOL(kernel_cleanup_single_step);
+
 /* ptrace API */
 void user_enable_single_step(struct task_struct *task)
 {
diff --git a/arch/arm64/kernel/probes/kprobes.c 
b/arch/arm64/kernel/probes/kprobes.c
index d1c95dcf1d78..c655b6b543e3 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -168,30 +168,6 @@ static void __kprobes set_current_kprobe(struct kprobe *p)
__this_cpu_write(current_kprobe, p);
 }
 
-/*
- * Interrupts need to be disabled before single-step mode is set, and not
- * reenabled until after single-step mode ends.
- * Without disabling interrupt on local CPU, there is a chance of
- * interrupt occurrence in the period of exception return and  start of
- * out-of-line single-step, that result in wrongly single stepping
- * into the interrupt handler.
- */
-static void __kprobes kprobes_save_local_irqflag(struct kprobe_ctlblk *kcb,
-   struct pt_regs *regs)
-{
-   kcb->saved_irqflag = regs->pstate & DAIF_MASK;
-   regs->pstate |= PSR_I_BIT;
-   /* Unmask PSTATE.D for enabling software step exceptions. */
-   regs->pstate &= ~PSR_D_BIT;
-}
-
-static void __kprobes kprobes_restore_local_irqflag(struct kprobe_ctlblk *kcb,
-   struct pt_regs *regs)
-{
-   regs->pstate &= ~DAIF_MASK;
-   regs->pstate |= kcb->saved_irqflag;
-}
-
 static void __kprobes
 set_ss_context(struct kprobe_ctlblk *kcb, unsigned long addr)
 {
@@ -227,7 +203,7 @@ static void __kprobes setup_singlestep(struct kprobe *p,
set_ss_context(kcb, slot);  /* mark pending ss */
 
/* IRQs and single stepping do not mix well. */
-   kprobes_save_local_irqflag(kcb, regs);
+   kernel_prepare_single_step(>saved_irqflag, regs);
kernel_enable_single_step(regs);
instruction_pointer_set(regs, slot);
} else {
@@ -

[PATCH 3/4] arm64: kgdb: Fix single-stepping into the irq handler wrongly

2020-05-09 Thread Wei Li
After the single-step exception handling oops is fixed, when we execute
a single step in kdb/kgdb, we may see it jump into the irq handler
(where PSTATE.D is cleared) instead of stopping at the next instruction.

Add the prepare and cleanup work for enabling and disabling single-step,
to maintain PSTATE.I and PSTATE.D carefully.

Before this patch:
* kdb:
Entering kdb (current=0x8000119e2dc0, pid 0) on processor 0 due to Keyboard 
Entry
[0]kdb> bp printk
Instruction(i) BP #0 at 0x8000101486cc (printk)
is enabled   addr at 8000101486cc, hardtype=0 installed=0

[0]kdb> g

/ # echo h > /proc/sysrq-trigger

Entering kdb (current=0xfada65c0, pid 267) on processor 0 due to 
Breakpoint @ 0x8000101486cc
[0]kdb> ss

Entering kdb (current=0xfada65c0, pid 267) on processor 0 due to SS 
trap @ 0x800010082ab8
[0]kdb> 0x800010082ab8
0x800010082ab8 = 0x800010082ab8 (el1_irq+0x78)
[0]kdb>

   0x800010082ab0 <+112>:   nop
   0x800010082ab4 <+116>:   msr daifclr, #0xd
   0x800010082ab8 <+120>:   adrpx1, 0x8000113a7000 
   0x800010082abc <+124>:   ldr x1, [x1, #1408]

* kgdb:
(gdb) target remote 127.1:23002
Remote debugging using 127.1:23002
arch_kgdb_breakpoint () at 
/home/liwei/main_code/linux/arch/arm64/include/asm/kgdb.h:21
21  asm ("brk %0" : : "I" (KGDB_COMPILED_DBG_BRK_IMM));
(gdb) b printk
Breakpoint 1 at 0x8000101486cc: file 
/home/liwei/main_code/linux/kernel/printk/printk.c, line 2076.
(gdb) c
Continuing.
[New Thread 287]
[Switching to Thread 283]

Thread 177 hit Breakpoint 1, printk (fmt=0x80001130c9d8 "\001\066sysrq: 
HELP : ")
at /home/liwei/main_code/linux/kernel/printk/printk.c:2076
2076{
(gdb) stepi
el1_irq () at /home/liwei/main_code/linux/arch/arm64/kernel/entry.S:608
608 irq_handler
(gdb)

After this patch:
* kdb:
Entering kdb (current=0x8000119d2dc0, pid 0) on processor 0 due to Keyboard 
Entry
[0]kdb> bp printk
Instruction(i) BP #0 at 0x80001014874c (printk)
is enabled   addr at 80001014874c, hardtype=0 installed=0

[0]kdb> g

/ # echo h > /proc/sysrq-trigger

Entering kdb (current=0xfa6948c0, pid 265) on processor 0 due to 
Breakpoint @ 0x80001014874c
[0]kdb> ss

Entering kdb (current=0xfa6948c0, pid 265) on processor 0 due to SS 
trap @ 0x800010148750
[0]kdb>

* kgdb:
(gdb) target remote 127.1:23002
Remote debugging using 127.1:23002
arch_kgdb_breakpoint () at 
/home/liwei/main_code/linux/arch/arm64/include/asm/kgdb.h:21
21  asm ("brk %0" : : "I" (KGDB_COMPILED_DBG_BRK_IMM));
(gdb) b printk
Breakpoint 1 at 0x80001014874c: file 
/home/liwei/main_code/linux/kernel/printk/printk.c, line 2076.
(gdb) c
Continuing.
[New Thread 277]
[Switching to Thread 276]

Thread 171 hit Breakpoint 1, printk (fmt=0x8000112fc130 "\001\066sysrq: 
HELP : ")
at /home/liwei/main_code/linux/kernel/printk/printk.c:2076
2076{
(gdb) stepi
0xffff800010148750  2076{
(gdb)

Fixes: 44679a4f142b ("arm64: KGDB: Add step debugging support")
Signed-off-by: Wei Li 
---
 arch/arm64/kernel/kgdb.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index 1a157ca33262..3910ac06c261 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -100,6 +100,8 @@ struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
{ "fpcr", 4, -1 },
 };
 
+static DEFINE_PER_CPU(unsigned long, kgdb_ss_flags);
+
 char *dbg_get_reg(int regno, void *mem, struct pt_regs *regs)
 {
if (regno >= DBG_MAX_REG_NUM || regno < 0)
@@ -200,8 +202,11 @@ int kgdb_arch_handle_exception(int exception_vector, int 
signo,
/*
 * Received continue command, disable single step
 */
-   if (kernel_active_single_step())
+   if (kernel_active_single_step()) {
+   kernel_cleanup_single_step(per_cpu(kgdb_ss_flags,
+   raw_smp_processor_id()), linux_regs);
kernel_disable_single_step();
+   }
 
err = 0;
break;
@@ -221,8 +226,12 @@ int kgdb_arch_handle_exception(int exception_vector, int 
signo,
/*
 * Enable single step handling
 */
-   if (!kernel_active_single_step())
+   if (!kernel_active_single_step()) {
+   kernel_prepare_single_step(_cpu(kgdb_ss_flags,
+   raw_smp_processor_id()), linux_regs);
kernel_enable_single_step(linux_regs);
+   }
+
err = 0;
break;
default:
-- 
2.17.1



[PATCH 4/4] arm64: kgdb: Set PSTATE.SS to 1 to reenable single-step

2020-05-09 Thread Wei Li
After fixing the wrong single-stepping into the irq handler, when we
execute a single step in kdb/kgdb, we can see that only the first step
works.

Referring to the ARM Architecture Reference Manual (ARM DDI 0487E.a)
D2.12, I think PSTATE.SS=1 should be set on each step to transfer the PE
to the 'Active-not-pending' state. The problem here is that PSTATE.SS=1
is not set from the second single-step onwards.

After the first single-step, the PE transitions to the 'Inactive' state,
with PSTATE.SS=0 and MDSCR.SS=1, so PSTATE.SS won't be set to 1 since
kernel_active_single_step() is true. The PE then transitions to the
'Active-pending' state on ERET and returns to the debugger via a step
exception.

Before this patch:
* kdb:
Entering kdb (current=0x8000119d2dc0, pid 0) on processor 0 due to Keyboard 
Entry
[0]kdb> bp printk
Instruction(i) BP #0 at 0x80001014874c (printk)
is enabled   addr at 80001014874c, hardtype=0 installed=0

[0]kdb> g

/ # echo h > /proc/sysrq-trigger

Entering kdb (current=0xfa6948c0, pid 265) on processor 3 due to 
Breakpoint @ 0x80001014874c
[3]kdb> ss

Entering kdb (current=0xfa6948c0, pid 265) on processor 3 due to SS 
trap @ 0x800010148750
[3]kdb> ss

Entering kdb (current=0xfa6948c0, pid 265) on processor 3 due to SS 
trap @ 0x800010148750
[3]kdb> ss

Entering kdb (current=0xfa6948c0, pid 265) on processor 3 due to SS 
trap @ 0x800010148750
[3]kdb>

* kgdb:
(gdb) target remote 127.1:23002
Remote debugging using 127.1:23002
arch_kgdb_breakpoint () at 
/home/liwei/main_code/linux/arch/arm64/include/asm/kgdb.h:21
21  asm ("brk %0" : : "I" (KGDB_COMPILED_DBG_BRK_IMM));
(gdb) b printk
Breakpoint 1 at 0x80001014874c: file 
/home/liwei/main_code/linux/kernel/printk/printk.c, line 2076.
(gdb) c
Continuing.
[New Thread 277]
[Switching to Thread 276]

Thread 171 hit Breakpoint 1, printk (fmt=0x8000112fc130 "\001\066sysrq: 
HELP : ")
at /home/liwei/main_code/linux/kernel/printk/printk.c:2076
2076{
(gdb) stepi
0x800010148750  2076{
(gdb) stepi
0x800010148750  2076{
(gdb) stepi
0x800010148750  2076{
(gdb)

After this patch:
* kdb:
Entering kdb (current=0x8000119d2dc0, pid 0) on processor 0 due to Keyboard 
Entry
[0]kdb> bp printk
Instruction(i) BP #0 at 0x80001014874c (printk)
is enabled   addr at 80001014874c, hardtype=0 installed=0

[0]kdb> g

/ # echo h > /proc/sysrq-trigger

Entering kdb (current=0xfa800040, pid 264) on processor 2 due to 
Breakpoint @ 0x80001014874c
[2]kdb> ss

Entering kdb (current=0xfa800040, pid 264) on processor 2 due to SS 
trap @ 0x800010148750
[2]kdb> ss

Entering kdb (current=0xfa800040, pid 264) on processor 2 due to SS 
trap @ 0x800010148754
[2]kdb> ss

Entering kdb (current=0xfa800040, pid 264) on processor 2 due to SS 
trap @ 0x800010148758
[2]kdb>

* kgdb:
(gdb) target remote 127.1:23002
Remote debugging using 127.1:23002
arch_kgdb_breakpoint () at 
/home/liwei/main_code/linux/arch/arm64/include/asm/kgdb.h:21
21  asm ("brk %0" : : "I" (KGDB_COMPILED_DBG_BRK_IMM));
(gdb) b printk
Breakpoint 1 at 0x80001014874c: file 
/home/liwei/main_code/linux/kernel/printk/printk.c, line 2076.
(gdb) c
Continuing.
[New Thread 281]
[New Thread 280]
[Switching to Thread 281]

Thread 174 hit Breakpoint 1, printk (fmt=0x8000112fc138 "\001\066sysrq: 
HELP : ")
at /home/liwei/main_code/linux/kernel/printk/printk.c:2076
2076{
(gdb) stepi
0x800010148750  2076{
(gdb) stepi
2080va_start(args, fmt);
(gdb) stepi
0x800010148758  2080va_start(args, fmt);
(gdb)

Fixes: 44679a4f142b ("arm64: KGDB: Add step debugging support")
Signed-off-by: Wei Li 
---
 arch/arm64/include/asm/debug-monitors.h | 2 ++
 arch/arm64/kernel/debug-monitors.c  | 2 +-
 arch/arm64/kernel/kgdb.c| 3 ++-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/debug-monitors.h 
b/arch/arm64/include/asm/debug-monitors.h
index b62469f3475b..a48b507c89ee 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -78,6 +78,8 @@ struct step_hook {
int (*fn)(struct pt_regs *regs, unsigned int esr);
 };
 
+void set_regs_spsr_ss(struct pt_regs *regs);
+
 void register_user_step_hook(struct step_hook *hook);
 void unregister_user_step_hook(struct step_hook *hook);
 
diff --git a/arch/arm64/kernel/debug-monitors.c 
b/arch/arm64/kernel/debug-monitors.c
index 25ce6b5a52d2..7a58233677de 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -141,7 +141,7 @@ postcore_initcall(debug_monitors_init);
 /*
  * Single step API and exception handling.
  */
-static void set_regs_spsr_ss(struct pt_regs *regs)
+void set_regs_spsr_ss(struct pt_regs *regs)
 

[PATCH 1/4] arm64: kgdb: Fix single-step exception handling oops

2020-05-09 Thread Wei Li
After entering kdb due to a breakpoint, when we execute 'ss' or 'go'
(which delays installing breakpoints and does a single step first), it
doesn't work correctly: kdb is re-entered due to an oops.

This is because the reason obtained in kdb_stub() is not as expected; it
seems the ex_vector for single-step should be 0, as arch powerpc/sh/parisc
have implemented.

Before the patch:
Entering kdb (current=0x8000119e2dc0, pid 0) on processor 0 due to Keyboard 
Entry
[0]kdb> bp printk
Instruction(i) BP #0 at 0x8000101486cc (printk)
is enabled   addr at 8000101486cc, hardtype=0 installed=0

[0]kdb> g

/ # echo h > /proc/sysrq-trigger

Entering kdb (current=0xfa878040, pid 266) on processor 3 due to 
Breakpoint @ 0x8000101486cc
[3]kdb> ss

Entering kdb (current=0xfa878040, pid 266) on processor 3 Oops: (null)
due to oops @ 0x800010082ab8
CPU: 3 PID: 266 Comm: sh Not tainted 5.7.0-rc4-13839-gf0e5ad491718 #6
Hardware name: linux,dummy-virt (DT)
pstate: 0085 (nzcv daIf -PAN -UAO)
pc : el1_irq+0x78/0x180
lr : __handle_sysrq+0x80/0x190
sp : 800015003bf0
x29: 800015003d20 x28: fa878040
x27:  x26: 80001126b1f0
x25: 800011b6a0d8 x24: 
x23: 8025 x22: 8000101486cc
x21: 800015003d30 x20: 
x19: 8000119f2000 x18: 
x17:  x16: 
x15:  x14: 
x13:  x12: 
x11:  x10: 
x9 :  x8 : 800015003e50
x7 : 0002 x6 : 380b9990
x5 : 8000106e99e8 x4 : fadd83c0
x3 :  x2 : 800011b6a0d8
x1 : 800011b6a000 x0 : 80001130c9d8
Call trace:
 el1_irq+0x78/0x180
 printk+0x0/0x84
 write_sysrq_trigger+0xb0/0x118
 proc_reg_write+0xb4/0xe0
 __vfs_write+0x18/0x40
 vfs_write+0xb0/0x1b8
 ksys_write+0x64/0xf0
 __arm64_sys_write+0x14/0x20
 el0_svc_common.constprop.2+0xb0/0x168
 do_el0_svc+0x20/0x98
 el0_sync_handler+0xec/0x1a8
 el0_sync+0x140/0x180

[3]kdb>

After the patch:
Entering kdb (current=0x8000119e2dc0, pid 0) on processor 0 due to Keyboard 
Entry
[0]kdb> bp printk
Instruction(i) BP #0 at 0x8000101486cc (printk)
is enabled   addr at 8000101486cc, hardtype=0 installed=0

[0]kdb> g

/ # echo h > /proc/sysrq-trigger

Entering kdb (current=0xfa852bc0, pid 268) on processor 0 due to 
Breakpoint @ 0x8000101486cc
[0]kdb> g

Entering kdb (current=0xfa852bc0, pid 268) on processor 0 due to 
Breakpoint @ 0x8000101486cc
[0]kdb> ss

Entering kdb (current=0xfa852bc0, pid 268) on processor 0 due to SS 
trap @ 0x800010082ab8
[0]kdb>

Fixes: 44679a4f142b ("arm64: KGDB: Add step debugging support")
Signed-off-by: Wei Li 
---
 arch/arm64/kernel/kgdb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index 43119922341f..1a157ca33262 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -252,7 +252,7 @@ static int kgdb_step_brk_fn(struct pt_regs *regs, unsigned 
int esr)
if (!kgdb_single_step)
return DBG_HOOK_ERROR;
 
-   kgdb_handle_exception(1, SIGTRAP, 0, regs);
+   kgdb_handle_exception(0, SIGTRAP, 0, regs);
return DBG_HOOK_HANDLED;
 }
 NOKPROBE_SYMBOL(kgdb_step_brk_fn);
-- 
2.17.1



[PATCH] perf symbols: Don't split kernel symbols when using kallsyms

2019-09-29 Thread Wei Li
If vmlinux addresses overlap with module addresses on a system, the
kernel symbols will be split into unique DSOs ([kernel].N) when using
/proc/kallsyms, as introduced by commit 2e538c4a1847 ("perf tools: Improve
kernel/modules symbol lookup").

When doing perf probe/report on such a system, symbol lookup fails, as
the symbol split happens after the map search, and the created maps
can't be matched by name in kernel_get_module_map().

I think the split is confusing and not very useful, since it uses
[kernel].N instead of the original ELF section names. So remove the
split process when only /proc/kallsyms is used.

On my arm32 system:

V2R7C00_HI1212-OSN9800 ~/liwei # ./perf probe -v -a printk
probe-definition(0): printk
symbol:printk file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
map_groups__set_modules_path_dir: cannot open /lib/modules/5.3.0-rc5+ dir
Problems setting modules path maps, continuing anyway...
No kprobe blacklist support, ignored
Failed to open cache(-1): 
/root/.debug/[kernel.kallsyms]/8eb36727183e955c790f0c7feb22d8306be7ce99/probes
Cache open error: -1
Looking at the vmlinux_path (8 entries long)
symsrc__init: cannot get elf header.
Could not open debuginfo. Try to use symbols.
Looking at the vmlinux_path (8 entries long)
symsrc__init: cannot get elf header.
Failed to open /proc/kcore. Note /proc/kcore requires CAP_SYS_RAWIO capability 
to access.
Using /proc/kallsyms for symbols
Failed to find symbol printk in kernel
  Error: Failed to add events. Reason: No such file or directory (Code: -2)

V2R7C00_HI1212-OSN9800 ~/liwei # ./perf probe -F
vector_addrexcptn
vector_dabt
vector_fiq
vector_irq
vector_pabt
vector_rst
vector_und

V2R7C00_HI1212-OSN9800 ~/liwei # sort -u /proc/kallsyms | head -n 10
01df t __vectors_start
01df1000 t __stubs_start
01df1004 t vector_rst
01df1020 t vector_irq
01df10a0 t vector_dabt
01df1120 t vector_pabt
01df11a0 t vector_und
01df1220 t vector_addrexcptn
01df1240 T vector_fiq
bf00 t $a   [eth]

V2R7C00_HI1212-OSN9800 ~/liwei # sort -u /proc/kallsyms | grep _stext -B 10
c1e04000 t swapper_pg_dir
c1e08000 T stext
c1e08000 t _text
c1e080b8 t __create_page_tables
c1e081bc t __fixup_smp
c1e08224 t __fixup_smp_on_up
c1e08240 t __fixup_pv_table
c1e082b0 t __vet_atags
c1e08300 T __turn_mmu_on
c1e08300 t __idmap_text_start
c1e08300 t _stext

V2R7C00_HI1212-OSN9800 ~/liwei # ./perf record -a sleep 2
Couldn't synthesize bpf events.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.024 MB perf.data (140 samples) ]
V2R7C00_HI1212-OSN9800 ~/liwei # ./perf report
# To display the perf.data header info, please use --header/--header-only 
options.
#
#
# Total Lost Samples: 0
#
# Samples: 140  of event 'cycles'
# Event count (approx.): 9286809
#
# Overhead  Command  Shared Object  Symbol
#   ...  .  .
#
15.89%  swapper  [unknown]  [k] 0xc1e10b2c
 7.83%  swapper  [unknown]  [k] 0xc1e303f4
 6.66%  swapper  [unknown]  [k] 0xc1e84d58
 6.28%  sleepld-2.27.so [.] do_lookup_x
 6.07%  swapper  [unknown]  [k] 0xc23f59dc
 3.19%  swapper  [unknown]  [k] 0xc23f0bfc
 3.18%  swapper  [unknown]  [k] 0xc2194d18
 3.18%  sleeplibc-2.27.so   [.] _dl_addr
 3.17%  sleep[unknown]  [k] 0xc1e23454
 3.16%  sleepld-2.27.so [.] check_match
 3.12%  sleepld-2.27.so [.] strcmp
 3.12%  swapper  [unknown]  [k] 0xc1e52ce8
 3.10%  sleep[unknown]  [k] 0xc1f738a4
 3.08%  sleep[unknown]  [k] 0xc21a1ef8
 3.05%  sleep[unknown]  [k] 0xc1f7109c
 3.02%  sleep[unknown]  [k] 0xc1f59520
 2.99%  sleep[unknown]  [k] 0xc23f54d4
 ...

Signed-off-by: Wei Li 
---
 tools/perf/util/symbol.c | 54 ++--
 1 file changed, 7 insertions(+), 47 deletions(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 4efde7879474..1c0d6d0e4ed1 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -743,9 +743,7 @@ static int map_groups__split_kallsyms_for_kcore(struct 
map_groups *kmaps, struct
 }
 
 /*
- * Split the symbols into maps, making sure there are no overlaps, i.e. the
- * kernel range is broken in several maps, named [kernel].N, as we don't have
- * the original ELF section names vmlinux have.
+ * Remove module and useless symbols from the map derived from /proc/kallsyms.
  */
 static int map_groups__split_kallsyms(struct map_groups *kmaps, struct dso 
*dso, u64 delta,
  struct map *initial_map)
@@ -753,10 +751,9 @@ static int map_groups__split_kallsyms(struct map_groups 
*kmaps, struct dso *dso,
struct machine *machine;
struct map *curr_map = initial_map;
struct symbol *pos;
-   int count = 0, moved = 0;
+   int moved = 0;

[PATCH] clocksource/arm_arch_timer: mark arch_timer_read_counter() as notrace to avoid deadloop

2019-06-11 Thread Wei Li
Following arch_counter_register(), mark every arch_counter_get_*()
function that arch_timer_read_counter() may point to as notrace, to avoid
an infinite recursion when using the function_graph tracer.

 0x80028af23250 0x10195e00 sched_clock+64
 0x80028af23290 0x101e83ec trace_clock_local+12
 0x80028af232a0 0x1020e52c function_graph_enter+116
 0x80028af23300 0x1009af9c prepare_ftrace_return+44
 0x80028af23320 0x1009b0a8 ftrace_graph_caller+28
 0x80028af23330 0x10b01918 arch_counter_get_cntvct+16
 0x80028af23340 0x10195e00 sched_clock+64
 0x80028af23380 0x101e83ec trace_clock_local+12
 0x80028af23390 0x1020e52c function_graph_enter+116
 0x80028af233f0 0x1009af9c prepare_ftrace_return+44
 0x80028af23410 0x1009b0a8 ftrace_graph_caller+28
 0x80028af23420 0x10b01918 arch_counter_get_cntvct+16
 ...

Fixes: 0ea415390cd3 ("clocksource/arm_arch_timer: Use arch_timer_read_counter 
to access stable counters")
Signed-off-by: Wei Li 
---
 drivers/clocksource/arm_arch_timer.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c 
b/drivers/clocksource/arm_arch_timer.c
index b2a951a798e2..f4d5bd8fe906 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -149,22 +149,22 @@ u32 arch_timer_reg_read(int access, enum arch_timer_reg 
reg,
return val;
 }
 
-static u64 arch_counter_get_cntpct_stable(void)
+static u64 notrace arch_counter_get_cntpct_stable(void)
 {
return __arch_counter_get_cntpct_stable();
 }
 
-static u64 arch_counter_get_cntpct(void)
+static u64 notrace arch_counter_get_cntpct(void)
 {
return __arch_counter_get_cntpct();
 }
 
-static u64 arch_counter_get_cntvct_stable(void)
+static u64 notrace arch_counter_get_cntvct_stable(void)
 {
return __arch_counter_get_cntvct_stable();
 }
 
-static u64 arch_counter_get_cntvct(void)
+static u64 notrace arch_counter_get_cntvct(void)
 {
return __arch_counter_get_cntvct();
 }
@@ -947,7 +947,7 @@ bool arch_timer_evtstrm_available(void)
return cpumask_test_cpu(raw_smp_processor_id(), _available);
 }
 
-static u64 arch_counter_get_cntvct_mem(void)
+static u64 notrace arch_counter_get_cntvct_mem(void)
 {
u32 vct_lo, vct_hi, tmp_hi;
 
-- 
2.17.1



[PATCH] ftrace: fix NULL pointer dereference in free_ftrace_func_mapper()

2019-06-05 Thread Wei Li
The mapper may be NULL when called from register_ftrace_function_probe()
with probe->data == NULL.

This issue can be reproduced as follows (it may sometimes be hidden by
compiler optimization):

/ # cat /sys/kernel/debug/tracing/set_ftrace_filter 
 all functions enabled 
/ # echo foo_bar:dump > /sys/kernel/debug/tracing/set_ftrace_filter 
[  206.949100] Unable to handle kernel NULL pointer dereference at virtual 
address 
[  206.952402] Mem abort info:
[  206.952819]   ESR = 0x9606
[  206.955326]   Exception class = DABT (current EL), IL = 32 bits
[  206.955844]   SET = 0, FnV = 0
[  206.956272]   EA = 0, S1PTW = 0
[  206.956652] Data abort info:
[  206.957320]   ISV = 0, ISS = 0x0006
[  206.959271]   CM = 0, WnR = 0
[  206.959938] user pgtable: 4k pages, 48-bit VAs, pgdp=000419f3a000
[  206.960483] [] pgd=000411a87003, pud=000411a83003, 
pmd=
[  206.964953] Internal error: Oops: 9606 [#1] SMP
[  206.971122] Dumping ftrace buffer:
[  206.973677](ftrace buffer empty)
[  206.975258] Modules linked in:
[  206.976631] Process sh (pid: 281, stack limit = 0x(ptrval))
[  206.978449] CPU: 10 PID: 281 Comm: sh Not tainted 5.2.0-rc1+ #17
[  206.978955] Hardware name: linux,dummy-virt (DT)
[  206.979883] pstate: 6005 (nZCv daif -PAN -UAO)
[  206.980499] pc : free_ftrace_func_mapper+0x2c/0x118
[  206.980874] lr : ftrace_count_free+0x68/0x80
[  206.982539] sp : 182f3ab0
[  206.983102] x29: 182f3ab0 x28: 8003d0ec1700 
[  206.983632] x27: 13054b40 x26: 0001 
[  206.984000] x25: 1385f000 x24:  
[  206.984394] x23: 13453000 x22: 13054000 
[  206.984775] x21:  x20: 1385fe28 
[  206.986575] x19: 13872c30 x18:  
[  206.987111] x17:  x16:  
[  206.987491] x15: ffb0 x14:  
[  206.987850] x13: 0017430e x12: 0580 
[  206.988251] x11:  x10:  
[  206.988740] x9 :  x8 : 13917550 
[  206.990198] x7 : 12fac2e8 x6 : 12fac000 
[  206.991008] x5 : 103da588 x4 : 0001 
[  206.991395] x3 : 0001 x2 : 13872a28 
[  206.991771] x1 :  x0 :  
[  206.992557] Call trace:
[  206.993101]  free_ftrace_func_mapper+0x2c/0x118
[  206.994827]  ftrace_count_free+0x68/0x80
[  206.995238]  release_probe+0xfc/0x1d0
[  206.99]  register_ftrace_function_probe+0x4a8/0x868
[  206.995923]  ftrace_trace_probe_callback.isra.4+0xb8/0x180
[  206.996330]  ftrace_dump_callback+0x50/0x70
[  206.996663]  ftrace_regex_write.isra.29+0x290/0x3a8
[  206.997157]  ftrace_filter_write+0x44/0x60
[  206.998971]  __vfs_write+0x64/0xf0
[  206.999285]  vfs_write+0x14c/0x2f0
[  206.999591]  ksys_write+0xbc/0x1b0
[  206.999888]  __arm64_sys_write+0x3c/0x58
[  207.000246]  el0_svc_common.constprop.0+0x408/0x5f0
[  207.000607]  el0_svc_handler+0x144/0x1c8
[  207.000916]  el0_svc+0x8/0xc
[  207.003699] Code: aa0003f8 a9025bf5 aa0103f5 f946ea80 (f9400303) 
[  207.008388] ---[ end trace 7b6d11b5f542bdf1 ]---
[  207.010126] Kernel panic - not syncing: Fatal exception
[  207.011322] SMP: stopping secondary CPUs
[  207.013956] Dumping ftrace buffer:
[  207.014595](ftrace buffer empty)
[  207.015632] Kernel Offset: disabled
[  207.017187] CPU features: 0x002,20006008
[  207.017985] Memory Limit: none
[  207.019825] ---[ end Kernel panic - not syncing: Fatal exception ]---

Signed-off-by: Wei Li 
---
 kernel/trace/ftrace.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a12aff849c04..7e2488da69ac 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -4221,10 +4221,13 @@ void free_ftrace_func_mapper(struct ftrace_func_mapper 
*mapper,
struct ftrace_func_entry *entry;
struct ftrace_func_map *map;
struct hlist_head *hhd;
-   int size = 1 << mapper->hash.size_bits;
-   int i;
+   int size, i;
+
+   if (!mapper)
+   return;
 
if (free_func && mapper->hash.count) {
+   size = 1 << mapper->hash.size_bits;
for (i = 0; i < size; i++) {
hhd = >hash.buckets[i];
hlist_for_each_entry(entry, hhd, hlist) {
-- 
2.17.1



[PATCH v2] fix use-after-free in perf_sched__lat

2019-05-08 Thread Wei Li
After a thread is added to machine->threads[i].dead in
__machine__remove_thread(), the machine->threads[i].dead list is freed
when calling free(session) in perf_session__delete(). So we get a
segmentation fault when accessing it in thread__put().

In this patch, we delay the perf_session__delete() call until all
threads have been deleted.

This can be reproduced by the following steps:
ulimit -c unlimited
export MALLOC_MMAP_THRESHOLD_=0
perf sched record sleep 10
perf sched latency --sort max
Segmentation fault (core dumped)

Signed-off-by: Zhipeng Xie 
Signed-off-by: Wei Li 
---
 tools/perf/builtin-sched.c | 63 ++
 1 file changed, 43 insertions(+), 20 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 275f2d92a7bf..8a4841fa124c 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1774,7 +1774,8 @@ static int perf_sched__process_comm(struct perf_tool 
*tool __maybe_unused,
return 0;
 }
 
-static int perf_sched__read_events(struct perf_sched *sched)
+static int __perf_sched__read_events(struct perf_sched *sched,
+   struct perf_session *session)
 {
const struct perf_evsel_str_handler handlers[] = {
{ "sched:sched_switch",   process_sched_switch_event, },
@@ -1783,30 +1784,17 @@ static int perf_sched__read_events(struct perf_sched 
*sched)
{ "sched:sched_wakeup_new",   process_sched_wakeup_event, },
{ "sched:sched_migrate_task", process_sched_migrate_task_event, 
},
};
-   struct perf_session *session;
-   struct perf_data data = {
-   .path  = input_name,
-   .mode  = PERF_DATA_MODE_READ,
-   .force = sched->force,
-   };
-   int rc = -1;
-
-   session = perf_session__new(&data, false, &sched->tool);
-   if (session == NULL) {
-   pr_debug("No Memory for session\n");
-   return -1;
-   }
 
symbol__init(&session->header.env);
 
if (perf_session__set_tracepoints_handlers(session, handlers))
-   goto out_delete;
+   return -1;
 
if (perf_session__has_traces(session, "record -R")) {
int err = perf_session__process_events(session);
if (err) {
pr_err("Failed to process events, error %d", err);
-   goto out_delete;
+   return -1;
}
 
sched->nr_events  = session->evlist->stats.nr_events[0];
@@ -1814,9 +1802,28 @@ static int perf_sched__read_events(struct perf_sched 
*sched)
sched->nr_lost_chunks = 
session->evlist->stats.nr_events[PERF_RECORD_LOST];
}
 
-   rc = 0;
-out_delete:
+   return 0;
+}
+
+static int perf_sched__read_events(struct perf_sched *sched)
+{
+   struct perf_session *session;
+   struct perf_data data = {
+   .path  = input_name,
+   .mode  = PERF_DATA_MODE_READ,
+   .force = sched->force,
+   };
+   int rc;
+
+   session = perf_session__new(&data, false, &sched->tool);
+   if (session == NULL) {
+   pr_debug("No Memory for session\n");
+   return -1;
+   }
+
+   rc = __perf_sched__read_events(sched, session);
perf_session__delete(session);
+
return rc;
 }
 
@@ -3130,12 +3137,25 @@ static void perf_sched__merge_lat(struct perf_sched 
*sched)
 
 static int perf_sched__lat(struct perf_sched *sched)
 {
+   struct perf_session *session;
+   struct perf_data data = {
+   .path  = input_name,
+   .mode  = PERF_DATA_MODE_READ,
+   .force = sched->force,
+   };
struct rb_node *next;
+   int rc = -1;
 
setup_pager();
 
-   if (perf_sched__read_events(sched))
+   session = perf_session__new(&data, false, &sched->tool);
+   if (session == NULL) {
+   pr_debug("No Memory for session\n");
return -1;
+   }
+
+   if (__perf_sched__read_events(sched, session))
+   goto out_delete;
 
perf_sched__merge_lat(sched);
perf_sched__sort_lat(sched);
@@ -3164,7 +3184,10 @@ static int perf_sched__lat(struct perf_sched *sched)
print_bad_events(sched);
printf("\n");
 
-   return 0;
+   rc = 0;
+out_delete:
+   perf_session__delete(session);
+   return rc;
 }
 
 static int setup_map_cpus(struct perf_sched *sched)
-- 
2.17.1



[PATCH 1/3] arm64: Add pseudo NMI support of GICv3 SGIs

2019-05-06 Thread Wei Li
Currently, only PPIs and SPIs can be set as NMIs. Since IPIs are
hardcoded IRQ numbers, there is no generic interface to set SGIs as NMIs
for now.

In this patch, we do:
1. Add an interface for setting priority of SGIs.
2. Export GICD_INT_NMI_PRI for setting priority of SGIs as NMI.
3. Move gic_enable_nmi_support() earlier so that the gic_supports_nmi()
check works in gic_cpu_init().

Signed-off-by: Wei Li 
Cc: Julien Thierry 
---
 arch/arm64/include/asm/smp.h   |  2 ++
 arch/arm64/kernel/smp.c|  4 +++
 drivers/irqchip/irq-gic-v3.c   | 46 +-
 include/linux/irqchip/arm-gic-common.h |  1 +
 include/linux/irqchip/arm-gic-v3.h |  1 +
 5 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 18553f399e08..84d7ea073d84 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -158,6 +158,8 @@ bool cpus_are_stuck_in_kernel(void);
 extern void crash_smp_send_stop(void);
 extern bool smp_crash_stop_failed(void);
 
+extern void ipi_gic_nmi_setup(void __iomem *base);
+
 #endif /* ifndef __ASSEMBLY__ */
 
 #endif /* ifndef __ASM_SMP_H */
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 824de7038967..bd8fdf6fcd8e 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -1067,3 +1067,7 @@ bool cpus_are_stuck_in_kernel(void)
 
return !!cpus_stuck_in_kernel || smp_spin_tables;
 }
+
+void ipi_gic_nmi_setup(void __iomem *base)
+{
+}
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 15e55d327505..394aa5668dd6 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -42,8 +42,6 @@
 
 #include "irq-gic-common.h"
 
-#define GICD_INT_NMI_PRI   (GICD_INT_DEF_PRI & ~0x80)
-
 #define FLAGS_WORKAROUND_GICR_WAKER_MSM8996(1ULL << 0)
 
 struct redist_region {
@@ -324,6 +322,23 @@ static int gic_irq_get_irqchip_state(struct irq_data *d,
return 0;
 }
 
+void gic_sgi_set_prio(void __iomem *base, u32 irqnr, u8 prio)
+{
+   u32 val, offset;
+
+   offset = GICR_IPRIORITYR0 + ((irqnr / 4) * 4);
+
+   /*
+* Using writeb here may cause a hardware error on some CPUs;
+* avoid this quirk by using writel instead.
+*/
+   val = readl_relaxed(base + offset);
+   val &= ~(0xff << ((irqnr % 4) * 8));
+   val |= prio << ((irqnr % 4) * 8);
+
+   writel_relaxed(val, base + offset);
+}
+
 static void gic_irq_set_prio(struct irq_data *d, u8 prio)
 {
void __iomem *base = gic_dist_base(d);
@@ -474,6 +489,16 @@ static inline void gic_handle_nmi(u32 irqnr, struct 
pt_regs *regs)
 {
int err;
 
+   if (unlikely(irqnr < 16)) {
+   gic_write_eoir(irqnr);
+   if (static_branch_likely(&supports_deactivate_key))
+   gic_write_dir(irqnr);
+#ifdef CONFIG_SMP
+   handle_IPI(irqnr, regs);
+#endif
+   return;
+   }
+
if (static_branch_likely(&supports_deactivate_key))
gic_write_eoir(irqnr);
/*
@@ -859,6 +884,9 @@ static void gic_cpu_init(void)
 
gic_cpu_config(rbase, gic_redist_wait_for_rwp);
 
+   if (gic_supports_nmi())
+   ipi_gic_nmi_setup(rbase);
+
/* initialise system registers */
gic_cpu_sys_reg_init();
 }
@@ -1335,6 +1363,13 @@ static int __init gic_init_bases(void __iomem *dist_base,
 
gic_update_vlpi_properties();
 
+   if (gic_prio_masking_enabled()) {
+   if (!gic_has_group0() || gic_dist_security_disabled())
+   gic_enable_nmi_support();
+   else
+   pr_warn("SCR_EL3.FIQ is cleared, cannot enable use of 
pseudo-NMIs\n");
+   }
+
gic_smp_init();
gic_dist_init();
gic_cpu_init();
@@ -1345,13 +1380,6 @@ static int __init gic_init_bases(void __iomem *dist_base,
its_cpu_init();
}
 
-   if (gic_prio_masking_enabled()) {
-   if (!gic_has_group0() || gic_dist_security_disabled())
-   gic_enable_nmi_support();
-   else
-   pr_warn("SCR_EL3.FIQ is cleared, cannot enable use of 
pseudo-NMIs\n");
-   }
-
return 0;
 
 out_free:
diff --git a/include/linux/irqchip/arm-gic-common.h 
b/include/linux/irqchip/arm-gic-common.h
index 9a1a479a2bf4..d8c973295179 100644
--- a/include/linux/irqchip/arm-gic-common.h
+++ b/include/linux/irqchip/arm-gic-common.h
@@ -18,6 +18,7 @@
(GICD_INT_DEF_PRI << 16) |\
(GICD_INT_DEF_PRI << 8) |\
GICD_INT_DEF_PRI)
+#define GICD_INT_NMI_PRI   (GICD_INT_DEF_PRI & ~0x80)
 
 enum gic_type {
GIC_V2,
diff --git a/include/linux/irqchip/arm-gic-v3.h 
b/include/linux/irqchip/arm-gic-v3.h
index

[PATCH 2/3] arm64: Add support for on-demand backtrace of other CPUs

2019-05-06 Thread Wei Li
To support backtracing of other CPUs in the system on lockups, add the
implementation of arch_trigger_cpumask_backtrace() for arm64.

In this patch, we add an arm64 NMI-like IPI based backtracer, referring
to the implementation on arm by Russell King and Chris Metcalf.

Signed-off-by: Wei Li 
---
 arch/arm64/include/asm/hardirq.h |  2 +-
 arch/arm64/include/asm/irq.h |  6 ++
 arch/arm64/kernel/smp.c  | 22 +-
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
index 89691c86640a..a5d94aa59c7c 100644
--- a/arch/arm64/include/asm/hardirq.h
+++ b/arch/arm64/include/asm/hardirq.h
@@ -24,7 +24,7 @@
 #include 
 #include 
 
-#define NR_IPI 7
+#define NR_IPI 8
 
 typedef struct {
unsigned int __softirq_pending;
diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index b2b0c6405eb0..28471df488c0 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -13,5 +13,11 @@ static inline int nr_legacy_irqs(void)
return 0;
 }
 
+#ifdef CONFIG_SMP
+extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+  bool exclude_self);
+#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
+#endif
+
 #endif /* !__ASSEMBLER__ */
 #endif
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index bd8fdf6fcd8e..7e862f9124f3 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -83,7 +84,8 @@ enum ipi_msg_type {
IPI_CPU_CRASH_STOP,
IPI_TIMER,
IPI_IRQ_WORK,
-   IPI_WAKEUP
+   IPI_WAKEUP,
+   IPI_CPU_BACKTRACE
 };
 
 #ifdef CONFIG_HOTPLUG_CPU
@@ -787,6 +789,7 @@ static const char *ipi_types[NR_IPI] __tracepoint_string = {
S(IPI_TIMER, "Timer broadcast interrupts"),
S(IPI_IRQ_WORK, "IRQ work interrupts"),
S(IPI_WAKEUP, "CPU wake-up interrupts"),
+   S(IPI_CPU_BACKTRACE, "Backtrace interrupts"),
 };
 
 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
@@ -946,6 +949,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
break;
 #endif
 
+   case IPI_CPU_BACKTRACE:
+   nmi_enter();
+   nmi_cpu_backtrace(regs);
+   nmi_exit();
+   break;
+
default:
pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr);
break;
@@ -1070,4 +1079,15 @@ bool cpus_are_stuck_in_kernel(void)
 
 void ipi_gic_nmi_setup(void __iomem *base)
 {
+   gic_sgi_set_prio(base, IPI_CPU_BACKTRACE, GICD_INT_NMI_PRI);
+}
+
+static void raise_nmi(cpumask_t *mask)
+{
+   smp_cross_call(mask, IPI_CPU_BACKTRACE);
+}
+
+void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+{
+   nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_nmi);
 }
-- 
2.17.1



[PATCH 0/3] arm64: Add support for on-demand backtrace by NMI-like IPI

2019-05-06 Thread Wei Li
[  272.190800] Hardware name: linux,dummy-virt (DT)
[  272.190825] pstate: 8005 (Nzcv daif -PAN -UAO)
[  272.190860] pc : hardlockup_test_loop+0xa0/0xe8 [lockup_test]
[  272.190898] lr : hardlockup_test_loop+0x6c/0xe8 [lockup_test]
[  272.190932] sp : 1499be00
[  272.190993] pmr_save: 0070
[  272.191054] x29: 1499be00 x28:  
[  272.191171] x27: 14983a98 x26: 8003eca41eb8 
[  272.191249] x25: 08d671b8 x24:  
[  272.191391] x23: 11081f58 x22: 3b9ac9ff 
[  272.191501] x21: 0003 x20: 08d68110 
[  272.191593] x19: 08d69000 x18:  
[  272.191718] x17:  x16:  
[  272.191895] x15: 117dd708 x14: 9499bb27 
[  272.192062] x13: 1499bb35 x12:  
[  272.192282] x11: 11807000 x10: 05f5e0ff 
[  272.192451] x9 : ffd0 x8 : 000dceb8 
[  272.192555] x7 : 086e x6 : 119f5bf1 
[  272.192715] x5 : 117df7a0 x4 :  
[  272.192887] x3 :  x2 : 3ebac9b03882d300 
[  272.192962] x1 : 04208040 x0 : 39e93260 
[  272.193041] Call trace:
[  272.193077]  hardlockup_test_loop+0xa0/0xe8 [lockup_test]
[  272.193117]  lockup_test+0x50/0x88 [lockup_test]
[  272.193152]  kthread+0x100/0x130
[  272.193225]  ret_from_fork+0x10/0x18
[  272.193421] NMI backtrace for cpu 7
[  272.193475] CPU: 7 PID: 256 Comm: lockup_test7 Tainted: G   OEL
5.1.0-rc7+ #10
[  272.193495] Hardware name: linux,dummy-virt (DT)
[  272.193562] pstate: 8005 (Nzcv daif -PAN -UAO)
[  272.193619] pc : hardlockup_test_loop+0xa0/0xe8 [lockup_test]
[  272.193634] lr : hardlockup_test_loop+0x6c/0xe8 [lockup_test]
[  272.193647] sp : 149bbe00
[  272.193661] pmr_save: 0070
[  272.193710] x29: 149bbe00 x28:  
[  272.193848] x27: 14983a98 x26: 8003ecb5fbb8 
[  272.193921] x25: 08d671b8 x24:  
[  272.194056] x23: 11081f58 x22: 3b9ac9ff 
[  272.194162] x21: 0007 x20: 08d68110 
[  272.194263] x19: 08d69000 x18:  
[  272.194343] x17:  x16:  
[  272.194463] x15: 117dd708 x14: 949bbb27 
[  272.194566] x13: 149bbb35 x12: 1148898f 
[  272.194667] x11: 11807000 x10: 05f5e0ff 
[  272.194809] x9 : ffd0 x8 : 00066bb1 
[  272.194962] x7 : 086d x6 : 119f5bf1 
[  272.195121] x5 : 117df7a0 x4 :  
[  272.195249] x3 :  x2 : 3ebac9b03882d300 
[  272.195344] x1 : 04208040 x0 : 3b728080 
[  272.195428] Call trace:
[  272.195463]  hardlockup_test_loop+0xa0/0xe8 [lockup_test]
[  272.195540]  lockup_test+0x50/0x88 [lockup_test]
[  272.195600]  kthread+0x100/0x130
[  272.195638]  ret_from_fork+0x10/0x18

Wei Li (3):
  arm64: Add pseudo NMI support of GICv3 SGIs
  arm64: Add support for on-demand backtrace of other CPUs
  arm64: Avoid entering NMI context improperly

 arch/arm64/include/asm/arch_gicv3.h|  8 
 arch/arm64/include/asm/hardirq.h   |  2 +-
 arch/arm64/include/asm/irq.h   |  6 +++
 arch/arm64/include/asm/smp.h   |  2 +
 arch/arm64/kernel/smp.c| 36 -
 drivers/irqchip/irq-gic-v3.c   | 54 ++
 include/linux/irqchip/arm-gic-common.h |  1 +
 include/linux/irqchip/arm-gic-v3.h |  1 +
 8 files changed, 92 insertions(+), 18 deletions(-)

-- 
2.17.1



[PATCH 3/3] arm64: Avoid entering NMI context improperly

2019-05-06 Thread Wei Li
As pseudo NMIs can be enabled/disabled by a cmdline parameter,
arch_trigger_cpumask_backtrace() may still have to work through a normal
IPI.

In this patch, we export gic_supports_nmi() and add a check in the
IPI_CPU_BACKTRACE handler to avoid entering NMI context when pseudo
NMI is disabled.

Signed-off-by: Wei Li 
---
 arch/arm64/include/asm/arch_gicv3.h |  8 
 arch/arm64/kernel/smp.c | 14 --
 drivers/irqchip/irq-gic-v3.c|  8 +---
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/arch_gicv3.h 
b/arch/arm64/include/asm/arch_gicv3.h
index 14b41ddc68ba..6655701ea7d4 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -156,6 +156,14 @@ static inline u32 gic_read_rpr(void)
 #define gits_write_vpendbaser(v, c)writeq_relaxed(v, c)
 #define gits_read_vpendbaser(c)readq_relaxed(c)
 
+extern struct static_key_false supports_pseudo_nmis;
+
+static inline bool gic_supports_nmi(void)
+{
+   return IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) &&
+   static_branch_likely(&supports_pseudo_nmis);
+}
+
 static inline bool gic_prio_masking_enabled(void)
 {
return system_uses_irq_prio_masking();
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 7e862f9124f3..5550951527ea 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -950,9 +950,19 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 #endif
 
case IPI_CPU_BACKTRACE:
-   nmi_enter();
+   if (gic_supports_nmi()) {
+   nmi_enter();
+   } else {
+   printk_nmi_enter();
+   irq_enter();
+   }
nmi_cpu_backtrace(regs);
-   nmi_exit();
+   if (gic_supports_nmi()) {
+   nmi_exit();
+   } else {
+   irq_exit();
+   printk_nmi_exit();
+   }
break;
 
default:
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 394aa5668dd6..b701727258b0 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -90,7 +90,7 @@ static DEFINE_STATIC_KEY_TRUE(supports_deactivate_key);
  * For now, we only support pseudo-NMIs if we have non-secure view of
  * priorities.
  */
-static DEFINE_STATIC_KEY_FALSE(supports_pseudo_nmis);
+DEFINE_STATIC_KEY_FALSE(supports_pseudo_nmis);
 
 /* ppi_nmi_refs[n] == number of cpus having ppi[n + 16] set as NMI */
 static refcount_t ppi_nmi_refs[16];
@@ -261,12 +261,6 @@ static void gic_unmask_irq(struct irq_data *d)
gic_poke_irq(d, GICD_ISENABLER);
 }
 
-static inline bool gic_supports_nmi(void)
-{
-   return IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) &&
-  static_branch_likely(&supports_pseudo_nmis);
-}
-
 static int gic_irq_set_irqchip_state(struct irq_data *d,
 enum irqchip_irq_state which, bool val)
 {
-- 
2.17.1



[PATCH] fix use-after-free in perf_sched__lat

2019-05-02 Thread Wei Li
After a thread is added to machine->threads[i].dead in
__machine__remove_thread(), that dead list is freed when free(session)
is called in perf_session__delete(). Accessing the thread afterwards in
thread__put() then triggers a segmentation fault.

In this patch, we delay perf_session__delete() until all threads
have been deleted.

This can be reproduced by following steps:
ulimit -c unlimited
export MALLOC_MMAP_THRESHOLD_=0
perf sched record sleep 10
perf sched latency --sort max
Segmentation fault (core dumped)

Signed-off-by: Zhipeng Xie 
Signed-off-by: Wei Li 
---
 tools/perf/builtin-sched.c | 44 --
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index cbf39dab19c1..17849ae2eb1e 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3130,11 +3130,48 @@ static void perf_sched__merge_lat(struct perf_sched 
*sched)
 static int perf_sched__lat(struct perf_sched *sched)
 {
struct rb_node *next;
+   const struct perf_evsel_str_handler handlers[] = {
+   { "sched:sched_switch",   process_sched_switch_event, },
+   { "sched:sched_stat_runtime", process_sched_runtime_event, },
+   { "sched:sched_wakeup",   process_sched_wakeup_event, },
+   { "sched:sched_wakeup_new",   process_sched_wakeup_event, },
+   { "sched:sched_migrate_task", process_sched_migrate_task_event, 
},
+   };
+   struct perf_session *session;
+   struct perf_data data = {
+   .file  = {
+   .path = input_name,
+   },
+   .mode  = PERF_DATA_MODE_READ,
+   .force = sched->force,
+   };
+   int rc = -1;
 
setup_pager();
 
-   if (perf_sched__read_events(sched))
+   session = perf_session__new(&data, false, &sched->tool);
+   if (session == NULL) {
+   pr_debug("No Memory for session\n");
return -1;
+   }
+
+   symbol__init(&session->header.env);
+
+   if (perf_session__set_tracepoints_handlers(session, handlers))
+   goto out_delete;
+
+   if (perf_session__has_traces(session, "record -R")) {
+   int err = perf_session__process_events(session);
+
+   if (err) {
+   pr_err("Failed to process events, error %d", err);
+   goto out_delete;
+   }
+
+   sched->nr_events  = session->evlist->stats.nr_events[0];
+   sched->nr_lost_events = session->evlist->stats.total_lost;
+   sched->nr_lost_chunks = 
session->evlist->stats.nr_events[PERF_RECORD_LOST];
+   }
 
perf_sched__merge_lat(sched);
perf_sched__sort_lat(sched);
@@ -3163,7 +3200,10 @@ static int perf_sched__lat(struct perf_sched *sched)
print_bad_events(sched);
printf("\n");
 
-   return 0;
+   rc = 0;
+out_delete:
+   perf_session__delete(session);
+   return rc;
 }
 
 static int setup_map_cpus(struct perf_sched *sched)
-- 
2.17.1



[RFC PATCH] arm64: irqflags: fix incomplete save & restore

2019-04-03 Thread Wei Li
To support arm64 pseudo-NMIs, arch_local_irq_save() and
arch_local_irq_restore() now operate on ICC_PMR_EL1 instead of DAIF.
But I found the logic of the save and restore suspicious:

arch_local_irq_save():
daif.i_on   pmr_on  ->  flag.i_on
1   0   |   0
1   1   |   1
0   1   |   0   --[1]
0   0   |   0

arch_local_irq_restore():
daif.i_on   pmr_on  <-  flag.i_on
x   0   |   0
x   1   |   1

As we can see, condition [1] will never be restored faithfully. When
doing a function_graph trace at gic_handle_irq(), calling local_irq_save()
and local_irq_restore() in trace_graph_entry() hits exactly this
condition. The IRQ can therefore never be processed, leading to a hang.

In this patch, we do the save & restore precisely, and make sure that
arch_irqs_disabled_flags() returns the correct result.

Fixes: 4a503217ce37 ("arm64: irqflags: Use ICC_PMR_EL1 for interrupt
masking")
Signed-off-by: Wei Li 
---
 arch/arm64/include/asm/irqflags.h | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/irqflags.h 
b/arch/arm64/include/asm/irqflags.h
index 43d8366c1e87..7bc6a79f31c4 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -76,8 +76,7 @@ static inline unsigned long arch_local_save_flags(void)
 * The asm is logically equivalent to:
 *
 * if (system_uses_irq_prio_masking())
-*  flags = (daif_bits & PSR_I_BIT) ?
-*  GIC_PRIO_IRQOFF :
+*  flags = (daif_bits << 32) |
 *  read_sysreg_s(SYS_ICC_PMR_EL1);
 * else
 *  flags = daif_bits;
@@ -87,11 +86,11 @@ static inline unsigned long arch_local_save_flags(void)
"nop\n"
"nop",
"mrs_s  %0, " __stringify(SYS_ICC_PMR_EL1) "\n"
-   "ands   %1, %1, " __stringify(PSR_I_BIT) "\n"
-   "csel   %0, %0, %2, eq",
+   "lsl%1, %1, #32\n"
+   "orr%0, %0, %1",
ARM64_HAS_IRQ_PRIO_MASKING)
: "=&r" (flags), "+r" (daif_bits)
-   : "r" ((unsigned long) GIC_PRIO_IRQOFF)
+   :
: "memory");
 
return flags;
@@ -119,8 +118,8 @@ static inline void arch_local_irq_restore(unsigned long 
flags)
"msr_s  " __stringify(SYS_ICC_PMR_EL1) ", %0\n"
"dsbsy",
ARM64_HAS_IRQ_PRIO_MASKING)
-   : "+r" (flags)
:
+   : "r" ((int)flags)
: "memory");
 }
 
@@ -130,12 +129,14 @@ static inline int arch_irqs_disabled_flags(unsigned long 
flags)
 
asm volatile(ALTERNATIVE(
"and%w0, %w1, #" __stringify(PSR_I_BIT) "\n"
+   "nop\n"
"nop",
+   "and%w0, %w2, #" __stringify(PSR_I_BIT) "\n"
"cmp%w1, #" __stringify(GIC_PRIO_IRQOFF) "\n"
-   "cset   %w0, ls",
+   "cinc   %w0, %w0, ls",
ARM64_HAS_IRQ_PRIO_MASKING)
: "=&r" (res)
-   : "r" ((int) flags)
+   : "r" ((int) flags), "r" (flags >> 32)
: "memory");
 
return res;
-- 
2.17.1



[tip:perf/urgent] perf machine: Update kernel map address and re-order properly

2019-03-29 Thread tip-bot for Wei Li
Commit-ID:  977c7a6d1e263ff1d755f28595b99e4bc0c48a9f
Gitweb: https://git.kernel.org/tip/977c7a6d1e263ff1d755f28595b99e4bc0c48a9f
Author: Wei Li 
AuthorDate: Thu, 28 Feb 2019 17:20:03 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 28 Mar 2019 14:41:21 -0300

perf machine: Update kernel map address and re-order properly

Since commit 1fb87b8e9599 ("perf machine: Don't search for active kernel
start in __machine__create_kernel_maps"), __machine__create_kernel_maps()
just creates a map whose start and end are both zero. Though the address is
updated later, the order of the map in the rbtree may be incorrect.

Commit ee05d21791db ("perf machine: Set main kernel end address properly")
fixed the logic in machine__create_kernel_maps(), but it is still wrong in
machine__process_kernel_mmap_event().

To reproduce this issue, we need an environment in which the module
addresses are below the kernel text segment. I tested it on an aarch64
machine with kernel 4.19.25:

  [root@localhost hulk]# grep _stext /proc/kallsyms
  08081000 T _stext
  [root@localhost hulk]# grep _etext /proc/kallsyms
  0978 R _etext
  [root@localhost hulk]# tail /proc/modules
  hisi_sas_v2_hw 77824 0 - Live 0x0191d000
  nvme_core 126976 7 nvme, Live 0x018b6000
  mdio 20480 1 ixgbe, Live 0x018ab000
  hisi_sas_main 106496 1 hisi_sas_v2_hw, Live 0x01861000
  hns_mdio 20480 2 - Live 0x01822000
  hnae 28672 3 hns_dsaf,hns_enet_drv, Live 0x01815000
  dm_mirror 40960 0 - Live 0x01804000
  dm_region_hash 32768 1 dm_mirror, Live 0x017f5000
  dm_log 32768 2 dm_mirror,dm_region_hash, Live 0x017e7000
  dm_mod 315392 17 dm_mirror,dm_log, Live 0x0178
  [root@localhost hulk]#

Before fix:

  [root@localhost bin]# perf record sleep 3
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ]
  [root@localhost bin]# perf buildid-list -i perf.data
  4c4e46c971ca935f781e603a09b52a92e8bdfee8 [vdso]
  [root@localhost bin]# perf buildid-list -i perf.data -H
   /proc/kcore
  [root@localhost bin]#

After fix:

  [root@localhost tools]# ./perf/perf record sleep 3
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ]
  [root@localhost tools]# ./perf/perf buildid-list -i perf.data
  28a6c690262896dbd1b5e1011ed81623e6db0610 [kernel.kallsyms]
  106c14ce6e4acea3453e484dc604d6f08a2f [vdso]
  [root@localhost tools]# ./perf/perf buildid-list -i perf.data -H
  28a6c690262896dbd1b5e1011ed81623e6db0610 /proc/kcore

Signed-off-by: Wei Li 
Acked-by: Jiri Olsa 
Acked-by: Namhyung Kim 
Cc: Alexander Shishkin 
Cc: David Ahern 
Cc: Hanjun Guo 
Cc: Kim Phillips 
Cc: Li Bin 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20190228092003.34071-1-liwei...@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/machine.c | 32 
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 61959aba7e27..3c520baa198c 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1421,6 +1421,20 @@ static void machine__set_kernel_mmap(struct machine 
*machine,
machine->vmlinux_map->end = ~0ULL;
 }
 
+static void machine__update_kernel_mmap(struct machine *machine,
+u64 start, u64 end)
+{
+   struct map *map = machine__kernel_map(machine);
+
+   map__get(map);
+   map_groups__remove(&machine->kmaps, map);
+
+   machine__set_kernel_mmap(machine, start, end);
+
+   map_groups__insert(&machine->kmaps, map);
+   map__put(map);
+}
+
 int machine__create_kernel_maps(struct machine *machine)
 {
struct dso *kernel = machine__get_kernel(machine);
@@ -1453,17 +1467,11 @@ int machine__create_kernel_maps(struct machine *machine)
goto out_put;
}
 
-   /* we have a real start address now, so re-order the kmaps */
-   map = machine__kernel_map(machine);
-
-   map__get(map);
-   map_groups__remove(&machine->kmaps, map);
-
-   /* assume it's the last in the kmaps */
-   machine__set_kernel_mmap(machine, addr, ~0ULL);
-
-   map_groups__insert(&machine->kmaps, map);
-   map__put(map);
+   /*
+* we have a real start address now, so re-order the kmaps
+* assume it's the last in the kmaps
+*/
+   machine__update_kernel_mmap(machine, addr, ~0ULL);
}
 
if (machine__create_extra_kernel_maps(machine, kernel))
@@ -1599,7 +1607,7 @@ static int machine__process_kernel_mmap_event(struct 
machine *machine,
if (strstr(kernel->long_name, "vmlinux"))
 

[PATCH] arm64: gic-v3: fix unexpected overwrite of spi/ppi/sgi prio

2019-03-26 Thread Wei Li
The priority configuration is restored to the default value in the GIC
suspend notifiers. As arm64 NMI-like interrupts have been implemented
using interrupt priorities since commit bc3c03ccb464 ("arm64: Enable
the support of pseudo-NMIs"), we need to save and restore the priorities
exactly.

Signed-off-by: Wei Li 
---
 drivers/irqchip/irq-gic.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index fd3110c171ba..5928c4338b28 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -79,10 +79,12 @@ struct gic_chip_data {
u32 saved_spi_enable[DIV_ROUND_UP(1020, 32)];
u32 saved_spi_active[DIV_ROUND_UP(1020, 32)];
u32 saved_spi_conf[DIV_ROUND_UP(1020, 16)];
+   u32 saved_spi_prio[DIV_ROUND_UP(1020, 4)];
u32 saved_spi_target[DIV_ROUND_UP(1020, 4)];
u32 __percpu *saved_ppi_enable;
u32 __percpu *saved_ppi_active;
u32 __percpu *saved_ppi_conf;
+   u32 __percpu *saved_ppi_prio;
 #endif
struct irq_domain *domain;
unsigned int gic_irqs;
@@ -599,6 +601,10 @@ void gic_dist_save(struct gic_chip_data *gic)
for (i = 0; i < DIV_ROUND_UP(gic_irqs, 32); i++)
gic->saved_spi_active[i] =
readl_relaxed(dist_base + GIC_DIST_ACTIVE_SET + i * 4);
+
+   for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
+   gic->saved_spi_prio[i] =
+   readl_relaxed(dist_base + GIC_DIST_PRI + i * 4);
 }
 
 /*
@@ -630,7 +636,7 @@ void gic_dist_restore(struct gic_chip_data *gic)
dist_base + GIC_DIST_CONFIG + i * 4);
 
for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
-   writel_relaxed(GICD_INT_DEF_PRI_X4,
+   writel_relaxed(gic->saved_spi_prio[i],
dist_base + GIC_DIST_PRI + i * 4);
 
for (i = 0; i < DIV_ROUND_UP(gic_irqs, 4); i++)
@@ -682,6 +688,9 @@ void gic_cpu_save(struct gic_chip_data *gic)
for (i = 0; i < DIV_ROUND_UP(32, 16); i++)
ptr[i] = readl_relaxed(dist_base + GIC_DIST_CONFIG + i * 4);
 
+   ptr = raw_cpu_ptr(gic->saved_ppi_prio);
+   for (i = 0; i < DIV_ROUND_UP(32, 4); i++)
+   ptr[i] = readl_relaxed(dist_base + GIC_DIST_PRI + i * 4);
 }
 
 void gic_cpu_restore(struct gic_chip_data *gic)
@@ -718,9 +727,9 @@ void gic_cpu_restore(struct gic_chip_data *gic)
for (i = 0; i < DIV_ROUND_UP(32, 16); i++)
writel_relaxed(ptr[i], dist_base + GIC_DIST_CONFIG + i * 4);
 
+   ptr = raw_cpu_ptr(gic->saved_ppi_prio);
for (i = 0; i < DIV_ROUND_UP(32, 4); i++)
-   writel_relaxed(GICD_INT_DEF_PRI_X4,
-   dist_base + GIC_DIST_PRI + i * 4);
+   writel_relaxed(ptr[i], dist_base + GIC_DIST_PRI + i * 4);
 
writel_relaxed(GICC_INT_PRI_THRESHOLD, cpu_base + GIC_CPU_PRIMASK);
gic_cpu_if_up(gic);
@@ -778,11 +787,18 @@ static int gic_pm_init(struct gic_chip_data *gic)
if (WARN_ON(!gic->saved_ppi_conf))
goto free_ppi_active;
 
+   gic->saved_ppi_prio = __alloc_percpu(DIV_ROUND_UP(32, 4) * 4,
+   sizeof(u32));
+   if (WARN_ON(!gic->saved_ppi_prio))
+   goto free_ppi_prio;
+
if (gic == &gic_data[0])
cpu_pm_register_notifier(&gic_notifier_block);
 
return 0;
 
+free_ppi_prio:
+   free_percpu(gic->saved_ppi_conf);
 free_ppi_active:
free_percpu(gic->saved_ppi_active);
 free_ppi_enable:
-- 
2.17.1



[PATCH] perf machine: Update kernel map address and re-order properly

2019-02-28 Thread Wei Li
Since commit 1fb87b8e9599 ("perf machine: Don't search for active kernel
start in __machine__create_kernel_maps"), __machine__create_kernel_maps()
just creates a map whose start and end are both zero. Though the address is
updated later, the order of the map in the rbtree may be incorrect.

Commit ee05d21791db ("perf machine: Set main kernel end address properly")
fixed the logic in machine__create_kernel_maps(), but it is still wrong in
machine__process_kernel_mmap_event().

To reproduce this issue, we need an environment in which the module
addresses are below the kernel text segment. I tested it on an aarch64
machine with kernel 4.19.25:

[root@localhost hulk]# grep _stext /proc/kallsyms 
08081000 T _stext
[root@localhost hulk]# grep _etext /proc/kallsyms 
0978 R _etext
[root@localhost hulk]# tail /proc/modules 
hisi_sas_v2_hw 77824 0 - Live 0x0191d000
nvme_core 126976 7 nvme, Live 0x018b6000
mdio 20480 1 ixgbe, Live 0x018ab000
hisi_sas_main 106496 1 hisi_sas_v2_hw, Live 0x01861000
hns_mdio 20480 2 - Live 0x01822000
hnae 28672 3 hns_dsaf,hns_enet_drv, Live 0x01815000
dm_mirror 40960 0 - Live 0x01804000
dm_region_hash 32768 1 dm_mirror, Live 0x017f5000
dm_log 32768 2 dm_mirror,dm_region_hash, Live 0x017e7000
dm_mod 315392 17 dm_mirror,dm_log, Live 0x0178
[root@localhost hulk]# 

Before fix:

[root@localhost bin]# perf record sleep 3
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ]
[root@localhost bin]# perf buildid-list -i perf.data 
4c4e46c971ca935f781e603a09b52a92e8bdfee8 [vdso]
[root@localhost bin]# perf buildid-list -i perf.data -H
 /proc/kcore
[root@localhost bin]# 

After fix:

[root@localhost tools]# ./perf/perf record sleep 3
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ]
[root@localhost tools]# ./perf/perf buildid-list -i perf.data 
28a6c690262896dbd1b5e1011ed81623e6db0610 [kernel.kallsyms]
106c14ce6e4acea3453e484dc604d6f08a2f [vdso]
[root@localhost tools]# ./perf/perf buildid-list -i perf.data -H
28a6c690262896dbd1b5e1011ed81623e6db0610 /proc/kcore

Signed-off-by: Wei Li 
---
 tools/perf/util/machine.c | 32 
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index b1508ce3e412..076718a7b3ea 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1358,6 +1358,20 @@ static void machine__set_kernel_mmap(struct machine 
*machine,
machine->vmlinux_map->end = ~0ULL;
 }
 
+static void machine__update_kernel_mmap(struct machine *machine,
+u64 start, u64 end)
+{
+   struct map *map = machine__kernel_map(machine);
+
+   map__get(map);
+   map_groups__remove(&machine->kmaps, map);
+
+   machine__set_kernel_mmap(machine, start, end);
+
+   map_groups__insert(&machine->kmaps, map);
+   map__put(map);
+}
+
 int machine__create_kernel_maps(struct machine *machine)
 {
struct dso *kernel = machine__get_kernel(machine);
@@ -1390,17 +1404,11 @@ int machine__create_kernel_maps(struct machine *machine)
goto out_put;
}
 
-   /* we have a real start address now, so re-order the kmaps */
-   map = machine__kernel_map(machine);
-
-   map__get(map);
-   map_groups__remove(&machine->kmaps, map);
-
-   /* assume it's the last in the kmaps */
-   machine__set_kernel_mmap(machine, addr, ~0ULL);
-
-   map_groups__insert(&machine->kmaps, map);
-   map__put(map);
+   /*
+* we have a real start address now, so re-order the kmaps
+* assume it's the last in the kmaps
+*/
+   machine__update_kernel_mmap(machine, addr, ~0ULL);
}
 
if (machine__create_extra_kernel_maps(machine, kernel))
@@ -1536,7 +1544,7 @@ static int machine__process_kernel_mmap_event(struct machine *machine,
if (strstr(kernel->long_name, "vmlinux"))
dso__set_short_name(kernel, "[kernel.vmlinux]", false);
 
-   machine__set_kernel_mmap(machine, event->mmap.start,
+   machine__update_kernel_mmap(machine, event->mmap.start,
 event->mmap.start + event->mmap.len);
 
/*
-- 
2.17.1



[tip:perf/core] perf annotate: Fix getting source line failure

2019-02-27 Thread tip-bot for Wei Li
Commit-ID:  11db1ad4513d6205d2519e1a30ff4cef746e3243
Gitweb: https://git.kernel.org/tip/11db1ad4513d6205d2519e1a30ff4cef746e3243
Author: Wei Li 
AuthorDate: Thu, 21 Feb 2019 17:57:16 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 21 Feb 2019 17:00:35 -0300

perf annotate: Fix getting source line failure

The output of "perf annotate -l --stdio xxx" changed since commit 425859ff0de33
("perf annotate: No need to calculate notes->start twice") removed the
notes->start assignment in symbol__calc_lines(). It then fails in
find_address_in_section(), called from the symbol__tty_annotate() path, because
a2l->addr is wrong, so the annotate summary does not report the source line
numbers correctly.

Before fix:

  liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ cat common_while_1.c
  void hotspot_1(void)
  {
volatile int i;

for (i = 0; i < 0x1000; i++);
for (i = 0; i < 0x1000; i++);
for (i = 0; i < 0x1000; i++);
  }

  int main(void)
  {
hotspot_1();

return 0;
  }
  liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ gcc common_while_1.c -g -o common_while_1

  liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.488 MB perf.data (12498 samples) ]
  liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio

  Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
  --

   19.30 common_while_1[32]
   19.03 common_while_1[4e]
   19.01 common_while_1[16]
5.04 common_while_1[13]
4.99 common_while_1[4b]
4.78 common_while_1[2c]
4.77 common_while_1[10]
4.66 common_while_1[2f]
4.59 common_while_1[51]
4.59 common_while_1[35]
4.52 common_while_1[19]
4.20 common_while_1[56]
0.51 common_while_1[48]
   Percent |  Source code & Disassembly of common_while_1 for cycles:ppp (12480 samples, percent: local period)
  ------------------------------------------------------------------------------
 :
 :
 :
 : Disassembly of section .text:
 :
 : 05fa :
 : hotspot_1():
 : void hotspot_1(void)
 : {
0.00 :   5fa:   push   %rbp
0.00 :   5fb:   mov%rsp,%rbp
 : volatile int i;
 :
 : for (i = 0; i < 0x1000; i++);
0.00 :   5fe:   movl   $0x0,-0x4(%rbp)
0.00 :   605:   jmp610 
0.00 :   607:   mov-0x4(%rbp),%eax
   common_while_1[10]4.77 :   60a:   add$0x1,%eax
   common_while_1[13]5.04 :   60d:   mov%eax,-0x4(%rbp)
   common_while_1[16]   19.01 :   610:   mov-0x4(%rbp),%eax
   common_while_1[19]4.52 :   613:   cmp$0xfff,%eax
  0.00 :   618:   jle607 
   : for (i = 0; i < 0x1000; i++);
  ...

After fix:

  liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.488 MB perf.data (12500 samples) ]
  liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio

  Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
  --

   33.34 common_while_1.c:5
   33.34 common_while_1.c:6
   33.32 common_while_1.c:7
   Percent |  Source code & Disassembly of common_while_1 for cycles:ppp (12482 samples, percent: local period)
  ------------------------------------------------------------------------------
 :
 :
 :
 : Disassembly of section .text:
 :
 : 05fa :
 : hotspot_1():
 : void hotspot_1(void)
 : {
0.00 :   5fa:   push   %rbp
0.00 :   5fb:   mov%rsp,%rbp
 : volatile int i;
 :
 : for (i = 0; i < 0x1000; i++);
0.00 :   5fe:   movl   $0x0,-0x4(%rbp)
0.00 :   605:   jmp610 
0.00 :   607:   mov-0x4(%rbp),%eax
   common_while_1.c:54.70 :   60a:   add$0x1,%eax
4.89 :   60d:   mov%eax,-0x4(%rbp)
   common_while_1.c:5   19.03 :   610:   mov-0x4(%rbp),%eax
   common_while_1.c:54.72 :   613:   cmp$0xfff,%eax
0.00 :   618:   jle607 
 : for (i = 0; i < 0x1000; i++);
0.00 :   61a:   movl   $0x0,-0x4(%rbp)
0.00 :   621:   jmp62c 
0.00 :   623:   mov-0x4(%rbp),%eax
   common_while_1.c:64.54 :   626:   add$0x1,%eax
4.73 :   629:   mov%eax,-0x4(%rbp)
  ...

[PATCH] perf: fix getting source line failure in annotate

2019-02-21 Thread Wei Li
The output of "perf annotate -l --stdio xxx" changed since commit 425859ff0de33
("perf annotate: No need to calculate notes->start twice") removed the
notes->start assignment in symbol__calc_lines(). It then fails in
find_address_in_section(), called from the symbol__tty_annotate() path, because
a2l->addr is wrong, so the annotate summary does not report the source line
numbers correctly.

Before fix:

liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ cat common_while_1.c 
void hotspot_1(void)
{
volatile int i;

for (i = 0; i < 0x1000; i++);
for (i = 0; i < 0x1000; i++);
for (i = 0; i < 0x1000; i++);
}

int main(void)
{
hotspot_1();

return 0;
}
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ gcc common_while_1.c -g -o common_while_1

liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.488 MB perf.data (12498 samples) ]
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio

Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
--

   19.30 common_while_1[32]
   19.03 common_while_1[4e]
   19.01 common_while_1[16]
5.04 common_while_1[13]
4.99 common_while_1[4b]
4.78 common_while_1[2c]
4.77 common_while_1[10]
4.66 common_while_1[2f]
4.59 common_while_1[51]
4.59 common_while_1[35]
4.52 common_while_1[19]
4.20 common_while_1[56]
0.51 common_while_1[48]
 Percent |  Source code & Disassembly of common_while_1 for cycles:ppp (12480 samples, percent: local period)
------------------------------------------------------------------------------
 :
 :
 :
 : Disassembly of section .text:
 :
 : 05fa :
 : hotspot_1():
 : void hotspot_1(void)
 : {
0.00 :   5fa:   push   %rbp
0.00 :   5fb:   mov%rsp,%rbp
 : volatile int i;
 :
 : for (i = 0; i < 0x1000; i++);
0.00 :   5fe:   movl   $0x0,-0x4(%rbp)
0.00 :   605:   jmp610 
0.00 :   607:   mov-0x4(%rbp),%eax
 common_while_1[10]4.77 :   60a:   add$0x1,%eax
 common_while_1[13]5.04 :   60d:   mov%eax,-0x4(%rbp)
 common_while_1[16]   19.01 :   610:   mov-0x4(%rbp),%eax
 common_while_1[19]4.52 :   613:   cmp$0xfff,%eax
0.00 :   618:   jle607 
 : for (i = 0; i < 0x1000; i++);
...

After fix:

liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.488 MB perf.data (12500 samples) ]
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio

Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
--

   33.34 common_while_1.c:5
   33.34 common_while_1.c:6
   33.32 common_while_1.c:7
 Percent |  Source code & Disassembly of common_while_1 for cycles:ppp (12482 samples, percent: local period)
------------------------------------------------------------------------------
 :
 :
 :
 : Disassembly of section .text:
 :
 : 05fa :
 : hotspot_1():
 : void hotspot_1(void)
 : {
0.00 :   5fa:   push   %rbp
0.00 :   5fb:   mov%rsp,%rbp
 : volatile int i;
 :
 : for (i = 0; i < 0x1000; i++);
0.00 :   5fe:   movl   $0x0,-0x4(%rbp)
0.00 :   605:   jmp610 
0.00 :   607:   mov-0x4(%rbp),%eax
 common_while_1.c:54.70 :   60a:   add$0x1,%eax
4.89 :   60d:   mov%eax,-0x4(%rbp)
 common_while_1.c:5   19.03 :   610:   mov-0x4(%rbp),%eax
 common_while_1.c:54.72 :   613:   cmp$0xfff,%eax
0.00 :   618:   jle607 
 : for (i = 0; i < 0x1000; i++);
0.00 :   61a:   movl   $0x0,-0x4(%rbp)
0.00 :   621:   jmp62c 
0.00 :   623:   mov-0x4(%rbp),%eax
 common_while_1.c:64.54 :   626:   add$0x1,%eax
4.73 :   629:   mov%eax,-0x4(%rbp)
 common_while_1.c:6   19.54 :   62c:   mov-0x4(%rbp),%eax
 common_while_1.c:64.54 :   62f:   cmp$0xfff,%eax
...

Fixes: 425859ff0de33 ("perf annotate: No need to calculate notes->start twice")
Signed-off-by: Wei Li 
---
 tools/perf/util/annotate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/a

[PATCH] arm64: fix potential deadlock in arm64-provide-pseudo-NMI-with-GICv3

2019-01-29 Thread Wei Li
In some exception handlers, interrupts are not re-enabled via daifclr first.
A later step may call local_irq_enable() to enable them, as gic_handle_irq()
does. Note that local_irq_enable() now only changes the PMR. The following
code paths that I found may cause a deadlock or other issues:

do_sp_pc_abort  <- el0_sp_pc
do_el0_ia_bp_hardening  <- el0_ia
kgdb_roundup_cpus   <- el1_dbg

Signed-off-by: Wei Li 
---
 arch/arm64/kernel/kgdb.c | 4 
 arch/arm64/mm/fault.c| 6 ++
 2 files changed, 10 insertions(+)

diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index a20de58061a8..119fbf2c0788 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -291,6 +292,9 @@ static void kgdb_call_nmi_hook(void *ignored)
 
 void kgdb_roundup_cpus(unsigned long flags)
 {
+   if (gic_prio_masking_enabled())
+   gic_arch_enable_irqs();
+
local_irq_enable();
smp_call_function(kgdb_call_nmi_hook, NULL, 0);
local_irq_disable();
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 97ba2ba78aee..f7c39a0b28bc 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -780,6 +781,8 @@ asmlinkage void __exception do_el0_ia_bp_hardening(unsigned long addr,
if (addr > TASK_SIZE)
arm64_apply_bp_hardening();
 
+   if (gic_prio_masking_enabled())
+   gic_arch_enable_irqs();
local_irq_enable();
do_mem_abort(addr, esr, regs);
 }
@@ -794,6 +797,9 @@ asmlinkage void __exception do_sp_pc_abort(unsigned long addr,
if (user_mode(regs)) {
if (instruction_pointer(regs) > TASK_SIZE)
arm64_apply_bp_hardening();
+
+   if (gic_prio_masking_enabled())
+   gic_arch_enable_irqs();
local_irq_enable();
}
 
-- 
2.17.1



[PATCH] driver-core: remove lock for platform devices during probe

2017-04-23 Thread Wei Li
During driver probe procedure, lock on the parent of
platform devices could be removed to make probe in
parallel.

Signed-off-by: Wei Li <we...@codeaurora.org>
---
 drivers/base/dd.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index a1fbf55..e238fbc 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "base.h"
 #include "power/power.h"
@@ -749,13 +750,14 @@ static int __driver_attach(struct device *dev, void *data)
return ret;
} /* ret > 0 means positive match */
 
-   if (dev->parent)    /* Needed for USB */
+   if (dev->parent &&
+   (dev->bus != &platform_bus_type))   /* Needed for USB */
device_lock(dev->parent);
device_lock(dev);
if (!dev->driver)
driver_probe_device(drv, dev);
device_unlock(dev);
-   if (dev->parent)
+   if (dev->parent && (dev->bus != &platform_bus_type))
device_unlock(dev->parent);
 
return 0;
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a
Linux Foundation Collaborative Project


