Re: [PATCH 1/2] powerpc/perf: Fix the threshold compare group constraint for power10

2022-05-08 Thread Athira Rajeev



> On 06-May-2022, at 11:40 AM, Kajol Jain  wrote:
> 
> Thresh compare bits for an event are used to program the thresh compare
> field in Monitor Mode Control Register A (MMCRA: 8-18 bits for power10).
> When scheduling events as a group, all events in that group should
> match the value in the threshold bits. Otherwise, event open for the sibling
> events should fail. But in the current code, in case the thresh compare bits are
> not valid, we are not failing in the group_constraint function, which can result
> in invalid group scheduling.
> 
> Fix the issue by returning -1 in case the event is a threshold event and the
> threshold compare value is not valid, in the group_constraint function.
> 
> The patch also fixes the p10_thresh_cmp_val function to return -1
> in case the threshold bits are not valid, and changes the corresponding check in
> the is_thresh_cmp_valid function to return false only when the thresh_cmp
> value is less than 0.
> 
> Thresh control bits in the event code are used to program the thresh_ctl
> field in Monitor Mode Control Register A (MMCRA: 48-55). In the example below,
> the scheduling of the group events PM_MRK_INST_CMPL (35340401e0) and
> PM_THRESH_MET (34340101ec) is expected to fail as both events
> request different thresh control bits.
> 
> Result before the patch changes:
> 
> [command]# perf stat -e "{r35340401e0,r34340101ec}" sleep 1
> 
> Performance counter stats for 'sleep 1':
> 
> 8,482  r35340401e0
> 0  r34340101ec
> 
>   1.001474838 seconds time elapsed
> 
>   0.001145000 seconds user
>   0.0 seconds sys
> 
> Result after the patch changes:
> 
> [command]# perf stat -e "{r35340401e0,r34340101ec}" sleep 1
> 
> Performance counter stats for 'sleep 1':
> 
>   r35340401e0
> r34340101ec
> 
>   1.001499607 seconds time elapsed
> 
>   0.000204000 seconds user
>   0.00076 seconds sys
> 
> Fixes: 82d2c16b350f7 ("powerpc/perf: Adds support for programming
> of Thresholding in P10")
> Signed-off-by: Kajol Jain 

Reviewed-by: Athira Rajeev 

Thanks
Athira
> ---
> arch/powerpc/perf/isa207-common.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/perf/isa207-common.c 
> b/arch/powerpc/perf/isa207-common.c
> index a74d382ecbb7..013b06af6fe6 100644
> --- a/arch/powerpc/perf/isa207-common.c
> +++ b/arch/powerpc/perf/isa207-common.c
> @@ -108,7 +108,7 @@ static void mmcra_sdar_mode(u64 event, unsigned long 
> *mmcra)
>   *mmcra |= MMCRA_SDAR_MODE_TLB;
> }
> 
> -static u64 p10_thresh_cmp_val(u64 value)
> +static int p10_thresh_cmp_val(u64 value)
> {
>   int exp = 0;
>   u64 result = value;
> @@ -139,7 +139,7 @@ static u64 p10_thresh_cmp_val(u64 value)
>* exponent is also zero.
>*/
>   if (!(value & 0xC0) && exp)
> - result = 0;
> + result = -1;
>   else
>   result = (exp << 8) | value;
>   }
> @@ -187,7 +187,7 @@ static bool is_thresh_cmp_valid(u64 event)
>   unsigned int cmp, exp;
> 
>   if (cpu_has_feature(CPU_FTR_ARCH_31))
> - return p10_thresh_cmp_val(event) != 0;
> + return p10_thresh_cmp_val(event) >= 0;
> 
>   /*
>* Check the mantissa upper two bits are not zero, unless the
> @@ -502,7 +502,8 @@ int isa207_get_constraint(u64 event, unsigned long 
> *maskp, unsigned long *valp,
>   value |= CNST_THRESH_CTL_SEL_VAL(event >> 
> EVENT_THRESH_SHIFT);
>   mask  |= p10_CNST_THRESH_CMP_MASK;
>   value |= 
> p10_CNST_THRESH_CMP_VAL(p10_thresh_cmp_val(event_config1));
> - }
> + } else if (event_is_threshold(event))
> + return -1;
>   } else if (cpu_has_feature(CPU_FTR_ARCH_300))  {
>   if (event_is_threshold(event) && is_thresh_cmp_valid(event)) {
>   mask  |= CNST_THRESH_MASK;
> -- 
> 2.31.1
> 
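
As a rough illustration of the group-constraint mechanism the patch tightens
(a user-space sketch with an invented field position, not the kernel's actual
CNST_* layout): two events can only be scheduled as a group if their constraint
values agree wherever their constraint masks overlap.

#include <stdint.h>
#include <stdio.h>

struct constraint { uint64_t mask, value; };

/* Events may be grouped only if their values agree on overlapping mask bits. */
static int can_group(struct constraint a, struct constraint b)
{
	uint64_t overlap = a.mask & b.mask;

	return (a.value & overlap) == (b.value & overlap);
}

int main(void)
{
	/* Hypothetical 8-bit thresh_ctl field placed at bits 32-39. */
	struct constraint ev1 = { 0xffULL << 32, 0x34ULL << 32 };
	struct constraint ev2 = { 0xffULL << 32, 0x35ULL << 32 };

	printf("group allowed: %d\n", can_group(ev1, ev2)); /* 0: conflicting thresh_ctl */
	return 0;
}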



Re: [PATCH v2 1/3] mm: change huge_ptep_clear_flush() to return the original pte

2022-05-08 Thread Christophe Leroy


On 08/05/2022 at 15:09, Baolin Wang wrote:
> 
> 
> On 5/8/2022 7:09 PM, Muchun Song wrote:
>> On Sun, May 08, 2022 at 05:36:39PM +0800, Baolin Wang wrote:
>>> It is incorrect to use ptep_clear_flush() to nuke a hugetlb page
>>> table when unmapping or migrating a hugetlb page, and will change
>>> to use huge_ptep_clear_flush() instead in the following patches.
>>>
>>> So this is a preparation patch, which changes the 
>>> huge_ptep_clear_flush()
>>> to return the original pte to help to nuke a hugetlb page table.
>>>
>>> Signed-off-by: Baolin Wang 
>>> Acked-by: Mike Kravetz 
>>
>> Reviewed-by: Muchun Song 
> 
> Thanks for reviewing.
> 
>>
>> But one nit below:
>>
>> [...]
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index 8605d7e..61a21af 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -5342,7 +5342,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct 
>>> *mm, struct vm_area_struct *vma,
>>>   ClearHPageRestoreReserve(new_page);
>>>   /* Break COW or unshare */
>>> -    huge_ptep_clear_flush(vma, haddr, ptep);
>>> +    (void)huge_ptep_clear_flush(vma, haddr, ptep);
>>
>> Why add a "(void)" here? Is there any warning if no "(void)"?
>> IIUC, I think we can remove this, right?
> 
> I did not see any warning without the cast, but this is per Mike's 
> comment[1], to keep the code consistent with other functions in the 
> hugetlb.c file that explicitly cast to void.
> 
> [1] 
> https://lore.kernel.org/all/495c4ebe-a5b4-afb6-4cb0-956c1b18d...@oracle.com/ 
> 

As far as I understand, Mike said that it should be accompanied by a 
big fat comment explaining why we ignore the returned value from 
huge_ptep_clear_flush().

By the way, huge_ptep_clear_flush() is not declared 'must_check', so this 
cast is just visual pollution and should be removed.

In the meantime the comment suggested by Mike should be added instead.

Christophe
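
To illustrate the point about the cast (a stand-alone user-space sketch, not
kernel code): a (void) cast only silences something when the callee is
annotated with __must_check / warn_unused_result; on an unannotated function
it changes nothing.

#include <stdio.h>

__attribute__((warn_unused_result)) static int checked(void) { return 1; }
static int unchecked(void) { return 1; }

int main(void)
{
	checked();		/* -Wunused-result warning here */
	(void)checked();	/* the cast really does suppress a warning */
	(void)unchecked();	/* no warning either way: the cast is pure noise */
	return 0;
}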

[PATCH v3 21/25] powerpc/ftrace: Don't use copy_from_kernel_nofault() in module_trampoline_target()

2022-05-08 Thread Christophe Leroy
module_trampoline_target() is quite a hot path used when
activating/deactivating the function tracer.

Avoid the heavy copy_from_kernel_nofault() by doing four calls
to copy_inst_from_kernel_nofault().

Use __copy_inst_from_kernel_nofault() for the last 3 calls. The first call
is done with copy_inst_from_kernel_nofault() to check that the address is
within kernel space. There is no risk of wrapping past the top of kernel
space because the last page is never mapped, so if the address is in the last
page the first copy will fail and the other ones will never be performed.

And also make it notrace just like all functions that call it.

Signed-off-by: Christophe Leroy 
---
v3: Use ppc_inst_t to fix sparse warnings and split trampoline verification in 
one line per instruction.
---
 arch/powerpc/kernel/module_32.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
index a0432ef46967..715a42f383d0 100644
--- a/arch/powerpc/kernel/module_32.c
+++ b/arch/powerpc/kernel/module_32.c
@@ -289,23 +289,32 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
 }
 
 #ifdef CONFIG_DYNAMIC_FTRACE
-int module_trampoline_target(struct module *mod, unsigned long addr,
-unsigned long *target)
+notrace int module_trampoline_target(struct module *mod, unsigned long addr,
+unsigned long *target)
 {
-   unsigned int jmp[4];
+   ppc_inst_t jmp[4];
 
/* Find where the trampoline jumps to */
-   if (copy_from_kernel_nofault(jmp, (void *)addr, sizeof(jmp)))
+   if (copy_inst_from_kernel_nofault(jmp, (void *)addr))
+   return -EFAULT;
+   if (__copy_inst_from_kernel_nofault(jmp + 1, (void *)addr + 4))
+   return -EFAULT;
+   if (__copy_inst_from_kernel_nofault(jmp + 2, (void *)addr + 8))
+   return -EFAULT;
+   if (__copy_inst_from_kernel_nofault(jmp + 3, (void *)addr + 12))
return -EFAULT;
 
/* verify that this is what we expect it to be */
-   if ((jmp[0] & 0xffff0000) != PPC_RAW_LIS(_R12, 0) ||
-   (jmp[1] & 0xffff0000) != PPC_RAW_ADDI(_R12, _R12, 0) ||
-   jmp[2] != PPC_RAW_MTCTR(_R12) ||
-   jmp[3] != PPC_RAW_BCTR())
+   if ((ppc_inst_val(jmp[0]) & 0xffff0000) != PPC_RAW_LIS(_R12, 0))
+   return -EINVAL;
+   if ((ppc_inst_val(jmp[1]) & 0xffff0000) != PPC_RAW_ADDI(_R12, _R12, 0))
+   return -EINVAL;
+   if (ppc_inst_val(jmp[2]) != PPC_RAW_MTCTR(_R12))
+   return -EINVAL;
+   if (ppc_inst_val(jmp[3]) != PPC_RAW_BCTR())
return -EINVAL;
 
-   addr = (jmp[1] & 0xffff) | ((jmp[0] & 0xffff) << 16);
+   addr = (ppc_inst_val(jmp[1]) & 0xffff) | ((ppc_inst_val(jmp[0]) & 0xffff) << 16);
if (addr & 0x8000)
addr -= 0x10000;
 
-- 
2.35.1
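
For readers unfamiliar with the trampoline being verified above, here is a
stand-alone sketch (invented example values, not the kernel code) of how the
32-bit target is rebuilt from the lis/addi immediates; addi sign-extends its
16-bit immediate, hence the 0x10000 correction when bit 15 of the low half is set.

#include <stdint.h>
#include <stdio.h>

static uint32_t trampoline_target(uint32_t lis_insn, uint32_t addi_insn)
{
	uint32_t addr = (addi_insn & 0xffff) | ((lis_insn & 0xffff) << 16);

	if (addr & 0x8000)	/* low halfword goes negative once sign-extended */
		addr -= 0x10000;

	return addr;
}

int main(void)
{
	/* lis r12,0xc000 ; addi r12,r12,-0x7f00  ->  0xbfff8100 */
	printf("%#010x\n", trampoline_target(0x3d80c000, 0x398c8100));
	return 0;
}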



[PATCH v3 23/25] powerpc/modules: Use PPC_LI macros instead of opencoding

2022-05-08 Thread Christophe Leroy
Use PPC_LI_MASK and PPC_LI() instead of opencoding.

Signed-off-by: Christophe Leroy 
---
v2: Use PPC_LI() and PPC_LI_MASK
---
 arch/powerpc/kernel/module_32.c | 11 ---
 arch/powerpc/kernel/module_64.c |  3 +--
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
index 715a42f383d0..3d47e9853f3e 100644
--- a/arch/powerpc/kernel/module_32.c
+++ b/arch/powerpc/kernel/module_32.c
@@ -256,9 +256,8 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
   value, (uint32_t)location);
pr_debug("Location before: %08X.\n",
   *(uint32_t *)location);
-   value = (*(uint32_t *)location & ~0x03fffffc)
-   | ((value - (uint32_t)location)
-  & 0x03fffffc);
+   value = (*(uint32_t *)location & ~PPC_LI_MASK) |
+   PPC_LI(value - (uint32_t)location);
 
if (patch_instruction(location, ppc_inst(value)))
return -EFAULT;
@@ -266,10 +265,8 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
pr_debug("Location after: %08X.\n",
   *(uint32_t *)location);
pr_debug("ie. jump to %08X+%08X = %08X\n",
-  *(uint32_t *)location & 0x03fffffc,
-  (uint32_t)location,
-  (*(uint32_t *)location & 0x03fffffc)
-  + (uint32_t)location);
+*(uint32_t *)PPC_LI((uint32_t)location), 
(uint32_t)location,
+(*(uint32_t *)PPC_LI((uint32_t)location)) + 
(uint32_t)location);
break;
 
case R_PPC_REL32:
diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index c1d87937b962..4c844198185e 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -653,8 +653,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
}
 
/* Only replace bits 2 through 26 */
-   value = (*(uint32_t *)location & ~0x03fffffc)
-   | (value & 0x03fffffc);
+   value = (*(uint32_t *)location & ~PPC_LI_MASK) | 
PPC_LI(value);
 
if (patch_instruction((u32 *)location, ppc_inst(value)))
return -EFAULT;
-- 
2.35.1



[PATCH v3 25/25] powerpc/opcodes: Remove unused PPC_INST_XXX macros

2022-05-08 Thread Christophe Leroy
The following PPC_INST_XXX macros are not used anymore
outside ppc-opcode.h:
- PPC_INST_LD
- PPC_INST_STD
- PPC_INST_ADDIS
- PPC_INST_ADD
- PPC_INST_DIVD

Remove them.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/ppc-opcode.h | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 9ca8996ee1cd..b9d6f95b66e9 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -285,11 +285,6 @@
 #define PPC_INST_TRECHKPT  0x7c0007dd
 #define PPC_INST_TRECLAIM  0x7c00075d
 #define PPC_INST_TSR   0x7c0005dd
-#define PPC_INST_LD    0xe8000000
-#define PPC_INST_STD   0xf8000000
-#define PPC_INST_ADDIS 0x3c000000
-#define PPC_INST_ADD   0x7c000214
-#define PPC_INST_DIVD  0x7c0003d2
 #define PPC_INST_BRANCH_COND   0x40800000
 
 /* Prefixes */
@@ -462,10 +457,10 @@
(0x100000c7 | ___PPC_RT(vrt) | ___PPC_RA(vra) | ___PPC_RB(vrb) | 
__PPC_RC21)
 #define PPC_RAW_VCMPEQUB_RC(vrt, vra, vrb) \
(0x10000006 | ___PPC_RT(vrt) | ___PPC_RA(vra) | ___PPC_RB(vrb) | 
__PPC_RC21)
-#define PPC_RAW_LD(r, base, i) (PPC_INST_LD | ___PPC_RT(r) | 
___PPC_RA(base) | IMM_DS(i))
+#define PPC_RAW_LD(r, base, i) (0xe8000000 | ___PPC_RT(r) | 
___PPC_RA(base) | IMM_DS(i))
 #define PPC_RAW_LWZ(r, base, i)(0x80000000 | ___PPC_RT(r) | 
___PPC_RA(base) | IMM_L(i))
 #define PPC_RAW_LWZX(t, a, b)  (0x7c00002e | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b))
-#define PPC_RAW_STD(r, base, i)(PPC_INST_STD | ___PPC_RS(r) | 
___PPC_RA(base) | IMM_DS(i))
+#define PPC_RAW_STD(r, base, i)(0xf8000000 | ___PPC_RS(r) | 
___PPC_RA(base) | IMM_DS(i))
 #define PPC_RAW_STDCX(s, a, b) (0x7c0001ad | ___PPC_RS(s) | 
___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_RAW_LFSX(t, a, b)  (0x7c00042e | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_RAW_STFSX(s, a, b) (0x7c00052e | ___PPC_RS(s) | 
___PPC_RA(a) | ___PPC_RB(b))
@@ -476,8 +471,8 @@
 #define PPC_RAW_ADDE(t, a, b)  (0x7c000114 | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_RAW_ADDZE(t, a)(0x7c000194 | ___PPC_RT(t) | 
___PPC_RA(a))
 #define PPC_RAW_ADDME(t, a)(0x7c0001d4 | ___PPC_RT(t) | 
___PPC_RA(a))
-#define PPC_RAW_ADD(t, a, b)   (PPC_INST_ADD | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b))
-#define PPC_RAW_ADD_DOT(t, a, b)   (PPC_INST_ADD | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b) | 0x1)
+#define PPC_RAW_ADD(t, a, b)   (0x7c000214 | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_ADD_DOT(t, a, b)   (0x7c000214 | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b) | 0x1)
 #define PPC_RAW_ADDC(t, a, b)  (0x7c000014 | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b))
 #define PPC_RAW_ADDC_DOT(t, a, b)  (0x7c000014 | ___PPC_RT(t) | 
___PPC_RA(a) | ___PPC_RB(b) | 0x1)
 #define PPC_RAW_NOP()  PPC_RAW_ORI(0, 0, 0)
-- 
2.35.1



[PATCH v3 22/25] powerpc/inst: Remove PPC_INST_BRANCH

2022-05-08 Thread Christophe Leroy
Convert last users of PPC_INST_BRANCH to PPC_RAW_BRANCH()

And remove PPC_INST_BRANCH.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/ppc-opcode.h | 3 +--
 arch/powerpc/lib/feature-fixups.c | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 3e9aa96ae74b..1871a86c5436 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -290,7 +290,6 @@
 #define PPC_INST_ADDIS 0x3c000000
 #define PPC_INST_ADD   0x7c000214
 #define PPC_INST_DIVD  0x7c0003d2
-#define PPC_INST_BRANCH    0x48000000
 #define PPC_INST_BL    0x48000001
 #define PPC_INST_BRANCH_COND   0x40800000
 
@@ -575,7 +574,7 @@
 #define PPC_RAW_MTSPR(spr, d)  (0x7c0003a6 | ___PPC_RS(d) | 
__PPC_SPR(spr))
 #define PPC_RAW_EIEIO()(0x7c0006ac)
 
-#define PPC_RAW_BRANCH(addr)   (PPC_INST_BRANCH | ((addr) & 
0x03fffffc))
+#define PPC_RAW_BRANCH(offset) (0x48000000 | PPC_LI(offset))
 #define PPC_RAW_BL(offset) (0x48000001 | PPC_LI(offset))
 
 /* Deal with instructions that older assemblers aren't aware of */
diff --git a/arch/powerpc/lib/feature-fixups.c 
b/arch/powerpc/lib/feature-fixups.c
index 343a78826035..993d3f31832a 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -451,7 +451,7 @@ static int __do_rfi_flush_fixups(void *data)
 
if (types & L1D_FLUSH_FALLBACK)
/* b .+16 to fallback flush */
-   instrs[0] = PPC_INST_BRANCH | 16;
+   instrs[0] = PPC_RAW_BRANCH(16);
 
i = 0;
if (types & L1D_FLUSH_ORI) {
-- 
2.35.1



[PATCH v3 24/25] powerpc/inst: Remove PPC_INST_BL

2022-05-08 Thread Christophe Leroy
Convert last users of PPC_INST_BL to PPC_RAW_BL()

And remove PPC_INST_BL.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/ppc-opcode.h | 1 -
 arch/powerpc/net/bpf_jit.h| 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 1871a86c5436..9ca8996ee1cd 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -290,7 +290,6 @@
 #define PPC_INST_ADDIS 0x3c000000
 #define PPC_INST_ADD   0x7c000214
 #define PPC_INST_DIVD  0x7c0003d2
-#define PPC_INST_BL    0x48000001
 #define PPC_INST_BRANCH_COND   0x40800000
 
 /* Prefixes */
diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 80d973da9093..a4f7880f959d 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -35,7 +35,7 @@
} while (0)
 
 /* bl (unconditional 'branch' with link) */
-#define PPC_BL(dest)   EMIT(PPC_INST_BL | (((dest) - (unsigned long)(image + 
ctx->idx)) & 0x03fffffc))
+#define PPC_BL(dest)   EMIT(PPC_RAW_BL((dest) - (unsigned long)(image + 
ctx->idx)))
 
 /* "cond" here covers BO:BI fields. */
 #define PPC_BCC_SHORT(cond, dest)\
-- 
2.35.1



[PATCH v3 05/25] powerpc/code-patching: Inline create_branch()

2022-05-08 Thread Christophe Leroy
create_branch() is a good candidate for inlining because:
- Flags can be folded in.
- Range tests are likely to be already done.

This reduces create_branch() to only a small set of instructions.

So inline it.

It improves ftrace activation by 10%.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/code-patching.h | 22 --
 arch/powerpc/lib/code-patching.c | 20 
 2 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/code-patching.h 
b/arch/powerpc/include/asm/code-patching.h
index e7c5df50cb4e..4260e89f62b1 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -49,8 +49,26 @@ static inline bool is_offset_in_cond_branch_range(long 
offset)
return offset >= -0x8000 && offset <= 0x7fff && !(offset & 0x3);
 }
 
-int create_branch(ppc_inst_t *instr, const u32 *addr,
- unsigned long target, int flags);
+static inline int create_branch(ppc_inst_t *instr, const u32 *addr,
+   unsigned long target, int flags)
+{
+   long offset;
+
+   *instr = ppc_inst(0);
+   offset = target;
+   if (! (flags & BRANCH_ABSOLUTE))
+   offset = offset - (unsigned long)addr;
+
+   /* Check we can represent the target in the instruction format */
+   if (!is_offset_in_branch_range(offset))
+   return 1;
+
+   /* Mask out the flags and target, so they don't step on each other. */
+   *instr = ppc_inst(0x48000000 | (flags & 0x3) | (offset & 0x03FFFFFC));
+
+   return 0;
+}
+
 int create_cond_branch(ppc_inst_t *instr, const u32 *addr,
   unsigned long target, int flags);
 int patch_branch(u32 *addr, unsigned long target, int flags);
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 58262c7e447c..7adbdb05fee7 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -230,26 +230,6 @@ bool is_conditional_branch(ppc_inst_t instr)
 }
 NOKPROBE_SYMBOL(is_conditional_branch);
 
-int create_branch(ppc_inst_t *instr, const u32 *addr,
- unsigned long target, int flags)
-{
-   long offset;
-
-   *instr = ppc_inst(0);
-   offset = target;
-   if (! (flags & BRANCH_ABSOLUTE))
-   offset = offset - (unsigned long)addr;
-
-   /* Check we can represent the target in the instruction format */
-   if (!is_offset_in_branch_range(offset))
-   return 1;
-
-   /* Mask out the flags and target, so they don't step on each other. */
-   *instr = ppc_inst(0x48000000 | (flags & 0x3) | (offset & 0x03FFFFFC));
-
-   return 0;
-}
-
 int create_cond_branch(ppc_inst_t *instr, const u32 *addr,
   unsigned long target, int flags)
 {
-- 
2.35.1



[PATCH v3 03/25] powerpc/code-patching: Inline is_offset_in_{cond}_branch_range()

2022-05-08 Thread Christophe Leroy
The tests in is_offset_in_branch_range() and is_offset_in_cond_branch_range()
are simple and worth inlining.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/code-patching.h | 29 ++--
 arch/powerpc/lib/code-patching.c | 27 --
 2 files changed, 27 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/include/asm/code-patching.h 
b/arch/powerpc/include/asm/code-patching.h
index 409483b2d0ce..e7c5df50cb4e 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -22,8 +22,33 @@
 #define BRANCH_SET_LINK0x1
 #define BRANCH_ABSOLUTE0x2
 
-bool is_offset_in_branch_range(long offset);
-bool is_offset_in_cond_branch_range(long offset);
+/*
+ * Powerpc branch instruction is :
+ *
+ *  0         6                 30   31
+ *  +---------+----------------+---+---+
+ *  | opcode  |     LI         |AA |LK |
+ *  +---------+----------------+---+---+
+ *  Where AA = 0 and LK = 0
+ *
+ * LI is a signed 24 bits integer. The real branch offset is computed
+ * by: imm32 = SignExtend(LI:'0b00', 32);
+ *
+ * So the maximum forward branch should be:
+ *   (0x007fffff << 2) = 0x01fffffc =  0x1fffffc
+ * The maximum backward branch should be:
+ *   (0xff800000 << 2) = 0xfe000000 = -0x2000000
+ */
+static inline bool is_offset_in_branch_range(long offset)
+{
+   return (offset >= -0x2000000 && offset <= 0x1fffffc && !(offset & 0x3));
+}
+
+static inline bool is_offset_in_cond_branch_range(long offset)
+{
+   return offset >= -0x8000 && offset <= 0x7fff && !(offset & 0x3);
+}
+
 int create_branch(ppc_inst_t *instr, const u32 *addr,
  unsigned long target, int flags);
 int create_cond_branch(ppc_inst_t *instr, const u32 *addr,
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 00c68e7fb11e..58262c7e447c 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -208,33 +208,6 @@ int patch_branch(u32 *addr, unsigned long target, int 
flags)
return patch_instruction(addr, instr);
 }
 
-bool is_offset_in_branch_range(long offset)
-{
-   /*
-* Powerpc branch instruction is :
-*
-*  0         6                 30   31
-*  +---------+----------------+---+---+
-*  | opcode  |     LI         |AA |LK |
-*  +---------+----------------+---+---+
-*  Where AA = 0 and LK = 0
-*
-* LI is a signed 24 bits integer. The real branch offset is computed
-* by: imm32 = SignExtend(LI:'0b00', 32);
-*
-* So the maximum forward branch should be:
-*   (0x007fffff << 2) = 0x01fffffc =  0x1fffffc
-* The maximum backward branch should be:
-*   (0xff800000 << 2) = 0xfe000000 = -0x2000000
-*/
-   return (offset >= -0x2000000 && offset <= 0x1fffffc && !(offset & 0x3));
-}
-
-bool is_offset_in_cond_branch_range(long offset)
-{
-   return offset >= -0x8000 && offset <= 0x7fff && !(offset & 0x3);
-}
-
 /*
  * Helper to check if a given instruction is a conditional branch
  * Derived from the conditional checks in analyse_instr()
-- 
2.35.1
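
As a companion to the instruction-format comment moved into the header above,
a minimal stand-alone sketch (not the kernel code) of the range check and the
encoding of an unconditional "b" with AA = LK = 0:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool offset_in_branch_range(long offset)
{
	/* 24-bit signed LI field shifted left by 2: +/- 32 MB, 4-byte aligned */
	return offset >= -0x2000000 && offset <= 0x1fffffc && !(offset & 0x3);
}

static uint32_t encode_branch(long offset)
{
	return 0x48000000u | ((uint32_t)offset & 0x03fffffc);	/* opcode 18, AA = LK = 0 */
}

int main(void)
{
	long offset = 16;	/* b .+16 */

	if (offset_in_branch_range(offset))
		printf("b .+%ld -> %#010x\n", offset, encode_branch(offset));
	return 0;
}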



[PATCH v3 07/25] powerpc/ftrace: Use patch_instruction() return directly

2022-05-08 Thread Christophe Leroy
Instead of returning -EPERM when patch_instruction() fails,
just return what patch_instruction returns.

That simplifies ftrace_modify_code():

   0:   94 21 ff c0 stwur1,-64(r1)
   4:   93 e1 00 3c stw r31,60(r1)
   8:   7c 7f 1b 79 mr. r31,r3
   c:   40 80 00 30 bge 3c 
  10:   93 c1 00 38 stw r30,56(r1)
  14:   7c 9e 23 78 mr  r30,r4
  18:   7c a4 2b 78 mr  r4,r5
  1c:   80 bf 00 00 lwz r5,0(r31)
  20:   7c 1e 28 40 cmplw   r30,r5
  24:   40 82 00 34 bne 58 
  28:   83 c1 00 38 lwz r30,56(r1)
  2c:   7f e3 fb 78 mr  r3,r31
  30:   83 e1 00 3c lwz r31,60(r1)
  34:   38 21 00 40 addir1,r1,64
  38:   48 00 00 00 b   38 
38: R_PPC_REL24 patch_instruction

Before:

   0:   94 21 ff c0 stwur1,-64(r1)
   4:   93 e1 00 3c stw r31,60(r1)
   8:   7c 7f 1b 79 mr. r31,r3
   c:   40 80 00 4c bge 58 
  10:   93 c1 00 38 stw r30,56(r1)
  14:   7c 9e 23 78 mr  r30,r4
  18:   7c a4 2b 78 mr  r4,r5
  1c:   80 bf 00 00 lwz r5,0(r31)
  20:   7c 08 02 a6 mflrr0
  24:   90 01 00 44 stw r0,68(r1)
  28:   7c 1e 28 40 cmplw   r30,r5
  2c:   40 82 00 48 bne 74 
  30:   7f e3 fb 78 mr  r3,r31
  34:   48 00 00 01 bl  34 
34: R_PPC_REL24 patch_instruction
  38:   80 01 00 44 lwz r0,68(r1)
  3c:   20 63 00 00 subfic  r3,r3,0
  40:   83 c1 00 38 lwz r30,56(r1)
  44:   7c 63 19 10 subfe   r3,r3,r3
  48:   7c 08 03 a6 mtlrr0
  4c:   83 e1 00 3c lwz r31,60(r1)
  50:   38 21 00 40 addir1,r1,64
  54:   4e 80 00 20 blr

It improves ftrace activation/deactivation duration by about 3%.

Modify patch_instruction() return on failure to -EPERM in order to
match with ftrace expectations. Other users of patch_instruction()
do not care about the exact error value returned.

Signed-off-by: Christophe Leroy 
---
v2: Make patch_instruction() return -EPERM in case of failure
---
 arch/powerpc/kernel/trace/ftrace.c | 5 +
 arch/powerpc/lib/code-patching.c   | 2 +-
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index 98e82fa4980f..1b05d33f96c6 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -78,10 +78,7 @@ ftrace_modify_code(unsigned long ip, ppc_inst_t old, 
ppc_inst_t new)
}
 
/* replace the text with the new text */
-   if (patch_instruction((u32 *)ip, new))
-   return -EPERM;
-
-   return 0;
+   return patch_instruction((u32 *)ip, new);
 }
 
 /*
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 7adbdb05fee7..cd25c07df23c 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -32,7 +32,7 @@ static int __patch_instruction(u32 *exec_addr, ppc_inst_t 
instr, u32 *patch_addr
return 0;
 
 failed:
-   return -EFAULT;
+   return -EPERM;
 }
 
 int raw_patch_instruction(u32 *addr, ppc_inst_t instr)
-- 
2.35.1



[PATCH v3 08/25] powerpc: Add CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2

2022-05-08 Thread Christophe Leroy
At the moment, we use CONFIG_CPU_LITTLE_ENDIAN and
CONFIG_CPU_BIG_ENDIAN to pass -mabi=elfv1 or elfv2 to the
compiler, then define a PPC64_ELF_ABI_v1 or PPC64_ELF_ABI_v2
macro in asm/types.h based on the _CALL_ELF define set by the compiler.

Make it more straightforward with a CONFIG option that
is directly usable.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Makefile  | 10 +-
 arch/powerpc/boot/Makefile |  2 ++
 arch/powerpc/platforms/Kconfig.cputype |  6 ++
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index eb541e730d3c..1ba98be84101 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -89,10 +89,10 @@ endif
 
 ifdef CONFIG_PPC64
 ifndef CONFIG_CC_IS_CLANG
-cflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call cc-option,-mabi=elfv1)
-cflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call 
cc-option,-mcall-aixdesc)
-aflags-$(CONFIG_CPU_BIG_ENDIAN)+= $(call cc-option,-mabi=elfv1)
-aflags-$(CONFIG_CPU_LITTLE_ENDIAN) += -mabi=elfv2
+cflags-$(CONFIG_PPC64_ELF_ABI_V1)  += $(call cc-option,-mabi=elfv1)
+cflags-$(CONFIG_PPC64_ELF_ABI_V1)  += $(call cc-option,-mcall-aixdesc)
+aflags-$(CONFIG_PPC64_ELF_ABI_V1)  += $(call cc-option,-mabi=elfv1)
+aflags-$(CONFIG_PPC64_ELF_ABI_V2)  += -mabi=elfv2
 endif
 endif
 
@@ -141,7 +141,7 @@ endif
 
 CFLAGS-$(CONFIG_PPC64) := $(call cc-option,-mtraceback=no)
 ifndef CONFIG_CC_IS_CLANG
-ifdef CONFIG_CPU_LITTLE_ENDIAN
+ifdef CONFIG_PPC64_ELF_ABI_V2
 CFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2,$(call 
cc-option,-mcall-aixdesc))
 AFLAGS-$(CONFIG_PPC64) += $(call cc-option,-mabi=elfv2)
 else
diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 4b4827c475c6..b6d4fe04c594 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -49,6 +49,8 @@ ifdef CONFIG_CPU_BIG_ENDIAN
 BOOTCFLAGS += -mbig-endian
 else
 BOOTCFLAGS += -mlittle-endian
+endif
+ifdef CONFIG_PPC64_ELF_ABI_V2
 BOOTCFLAGS += $(call cc-option,-mabi=elfv2)
 endif
 
diff --git a/arch/powerpc/platforms/Kconfig.cputype 
b/arch/powerpc/platforms/Kconfig.cputype
index e2e1fec91c6e..9bfcf972d21d 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -556,6 +556,12 @@ config CPU_LITTLE_ENDIAN
 
 endchoice
 
+config PPC64_ELF_ABI_V1
+   def_bool PPC64 && CPU_BIG_ENDIAN
+
+config PPC64_ELF_ABI_V2
+   def_bool PPC64 && CPU_LITTLE_ENDIAN
+
 config PPC64_BOOT_WRAPPER
def_bool n
depends on CPU_LITTLE_ENDIAN
-- 
2.35.1



[PATCH v3 11/25] powerpc/ftrace: Make __ftrace_make_{nop/call}() common to PPC32 and PPC64

2022-05-08 Thread Christophe Leroy
Since c93d4f6ecf4b ("powerpc/ftrace: Add module_trampoline_target()
for PPC32"), __ftrace_make_nop() for PPC32 is very similar to the
one for PPC64.

Same for __ftrace_make_call().

Make them common.

Signed-off-by: Christophe Leroy 
---
v2:
- Fixed comment to -mprofile-kernel versus -mkernel_profile
- Replaced a couple of #ifdef with CONFIG_PPC64_ELF_ABI_V1 as suggested by 
Naveen.
---
 arch/powerpc/kernel/trace/ftrace.c | 108 +++--
 1 file changed, 8 insertions(+), 100 deletions(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index 0b199fc9cfd3..531da4d93c58 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -114,7 +114,6 @@ static unsigned long find_bl_target(unsigned long ip, 
ppc_inst_t op)
 }
 
 #ifdef CONFIG_MODULES
-#ifdef CONFIG_PPC64
 static int
 __ftrace_make_nop(struct module *mod,
  struct dyn_ftrace *rec, unsigned long addr)
@@ -154,10 +153,11 @@ __ftrace_make_nop(struct module *mod,
return -EINVAL;
}
 
-#ifdef CONFIG_MPROFILE_KERNEL
-   /* When using -mkernel_profile there is no load to jump over */
+   /* When using -mprofile-kernel or PPC32 there is no load to jump over */
pop = ppc_inst(PPC_RAW_NOP());
 
+#ifdef CONFIG_PPC64
+#ifdef CONFIG_MPROFILE_KERNEL
if (copy_inst_from_kernel_nofault(&op, (void *)(ip - 4))) {
pr_err("Fetching instruction at %lx failed.\n", ip - 4);
return -EFAULT;
@@ -201,6 +201,7 @@ __ftrace_make_nop(struct module *mod,
return -EINVAL;
}
 #endif /* CONFIG_MPROFILE_KERNEL */
+#endif /* PPC64 */
 
if (patch_instruction((u32 *)ip, pop)) {
pr_err("Patching NOP failed.\n");
@@ -209,48 +210,6 @@ __ftrace_make_nop(struct module *mod,
 
return 0;
 }
-
-#else /* !PPC64 */
-static int
-__ftrace_make_nop(struct module *mod,
- struct dyn_ftrace *rec, unsigned long addr)
-{
-   ppc_inst_t op;
-   unsigned long ip = rec->ip;
-   unsigned long tramp, ptr;
-
-   if (copy_from_kernel_nofault(&op, (void *)ip, MCOUNT_INSN_SIZE))
-   return -EFAULT;
-
-   /* Make sure that that this is still a 24bit jump */
-   if (!is_bl_op(op)) {
-   pr_err("Not expected bl: opcode is %s\n", ppc_inst_as_str(op));
-   return -EINVAL;
-   }
-
-   /* lets find where the pointer goes */
-   tramp = find_bl_target(ip, op);
-
-   /* Find where the trampoline jumps to */
-   if (module_trampoline_target(mod, tramp, &ptr)) {
-   pr_err("Failed to get trampoline target\n");
-   return -EFAULT;
-   }
-
-   if (ptr != addr) {
-   pr_err("Trampoline location %08lx does not match addr\n",
-  tramp);
-   return -EINVAL;
-   }
-
-   op = ppc_inst(PPC_RAW_NOP());
-
-   if (patch_instruction((u32 *)ip, op))
-   return -EPERM;
-
-   return 0;
-}
-#endif /* PPC64 */
 #endif /* CONFIG_MODULES */
 
 static unsigned long find_ftrace_tramp(unsigned long ip)
@@ -437,13 +396,12 @@ int ftrace_make_nop(struct module *mod,
 }
 
 #ifdef CONFIG_MODULES
-#ifdef CONFIG_PPC64
 /*
  * Examine the existing instructions for __ftrace_make_call.
  * They should effectively be a NOP, and follow formal constraints,
  * depending on the ABI. Return false if they don't.
  */
-#ifndef CONFIG_MPROFILE_KERNEL
+#ifdef CONFIG_PPC64_ELF_ABI_V1
 static int
 expected_nop_sequence(void *ip, ppc_inst_t op0, ppc_inst_t op1)
 {
@@ -465,7 +423,7 @@ expected_nop_sequence(void *ip, ppc_inst_t op0, ppc_inst_t 
op1)
 static int
 expected_nop_sequence(void *ip, ppc_inst_t op0, ppc_inst_t op1)
 {
-   /* look for patched "NOP" on ppc64 with -mprofile-kernel */
+   /* look for patched "NOP" on ppc64 with -mprofile-kernel or ppc32 */
if (!ppc_inst_equal(op0, ppc_inst(PPC_RAW_NOP(
return 0;
return 1;
@@ -484,8 +442,10 @@ __ftrace_make_call(struct dyn_ftrace *rec, unsigned long 
addr)
if (copy_inst_from_kernel_nofault(op, ip))
return -EFAULT;
 
+#ifdef CONFIG_PPC64_ELF_ABI_V1
if (copy_inst_from_kernel_nofault(op + 1, ip + 4))
return -EFAULT;
+#endif
 
if (!expected_nop_sequence(ip, op[0], op[1])) {
pr_err("Unexpected call sequence at %p: %s %s\n",
@@ -531,58 +491,6 @@ __ftrace_make_call(struct dyn_ftrace *rec, unsigned long 
addr)
 
return 0;
 }
-
-#else  /* !CONFIG_PPC64: */
-static int
-__ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
-{
-   int err;
-   ppc_inst_t op;
-   u32 *ip = (u32 *)rec->ip;
-   struct module *mod = rec->arch.mod;
-   unsigned long tramp;
-
-   /* read where this goes */
-   if (copy_inst_from_kernel_nofault(&op, ip))
-   return -EFAULT;
-
-   /* It should be pointing to a nop */
-   if (!ppc_inst_equal(op,  

[PATCH v3 02/25] powerpc/ftrace: Remove redundant create_branch() calls

2022-05-08 Thread Christophe Leroy
Since commit d5937db114e4 ("powerpc/code-patching: Fix patch_branch()
return on out-of-range failure") patch_branch() fails with -ERANGE
when trying to branch out of range.

No need to perform the test twice. Remove redundant create_branch()
calls.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/trace/ftrace.c | 20 --------------------
 1 file changed, 20 deletions(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index 7a266fd469b7..3ce3697e8a7c 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -301,7 +301,6 @@ static int setup_mcount_compiler_tramp(unsigned long tramp)
int i;
ppc_inst_t op;
unsigned long ptr;
-   ppc_inst_t instr;
static unsigned long ftrace_plt_tramps[NUM_FTRACE_TRAMPS];
 
/* Is this a known long jump tramp? */
@@ -344,12 +343,6 @@ static int setup_mcount_compiler_tramp(unsigned long tramp)
 #else
ptr = ppc_global_function_entry((void *)ftrace_caller);
 #endif
-   if (create_branch(&instr, (void *)tramp, ptr, 0)) {
-   pr_debug("%ps is not reachable from existing mcount tramp\n",
-   (void *)ptr);
-   return -1;
-   }
-
if (patch_branch((u32 *)tramp, ptr, 0)) {
pr_debug("REL24 out of range!\n");
return -1;
@@ -490,7 +483,6 @@ static int
 __ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 {
ppc_inst_t op[2];
-   ppc_inst_t instr;
void *ip = (void *)rec->ip;
unsigned long entry, ptr, tramp;
struct module *mod = rec->arch.mod;
@@ -539,12 +531,6 @@ __ftrace_make_call(struct dyn_ftrace *rec, unsigned long 
addr)
return -EINVAL;
}
 
-   /* Ensure branch is within 24 bits */
-   if (create_branch(&instr, ip, tramp, BRANCH_SET_LINK)) {
-   pr_err("Branch out of range\n");
-   return -EINVAL;
-   }
-
if (patch_branch(ip, tramp, BRANCH_SET_LINK)) {
pr_err("REL24 out of range!\n");
return -EINVAL;
@@ -770,12 +756,6 @@ __ftrace_modify_call(struct dyn_ftrace *rec, unsigned long 
old_addr,
return -EINVAL;
}
 
-   /* Ensure branch is within 24 bits */
-   if (create_branch(&instr, (u32 *)ip, tramp, BRANCH_SET_LINK)) {
-   pr_err("Branch out of range\n");
-   return -EINVAL;
-   }
-
if (patch_branch((u32 *)ip, tramp, BRANCH_SET_LINK)) {
pr_err("REL24 out of range!\n");
return -EINVAL;
-- 
2.35.1



[PATCH v3 01/25] powerpc/ftrace: Refactor prepare_ftrace_return()

2022-05-08 Thread Christophe Leroy
When we have CONFIG_DYNAMIC_FTRACE_WITH_ARGS,
prepare_ftrace_return() is called by ftrace_graph_func()
otherwise prepare_ftrace_return() is called from assembly.

Refactor prepare_ftrace_return() into a static
__prepare_ftrace_return() that will be called by both
prepare_ftrace_return() and ftrace_graph_func().

It will allow GCC to fold __prepare_ftrace_return() inside
ftrace_graph_func().

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/trace/ftrace.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index 4ee04aacf9f1..7a266fd469b7 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -939,8 +939,8 @@ int ftrace_disable_ftrace_graph_caller(void)
  * Hook the return address and push it in the stack of return addrs
  * in current thread info. Return the address we want to divert to.
  */
-unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip,
-   unsigned long sp)
+static unsigned long
+__prepare_ftrace_return(unsigned long parent, unsigned long ip, unsigned long 
sp)
 {
unsigned long return_hooker;
int bit;
@@ -969,7 +969,13 @@ unsigned long prepare_ftrace_return(unsigned long parent, 
unsigned long ip,
 void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
   struct ftrace_ops *op, struct ftrace_regs *fregs)
 {
-   fregs->regs.link = prepare_ftrace_return(parent_ip, ip, 
fregs->regs.gpr[1]);
+   fregs->regs.link = __prepare_ftrace_return(parent_ip, ip, 
fregs->regs.gpr[1]);
+}
+#else
+unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip,
+   unsigned long sp)
+{
+   return __prepare_ftrace_return(parent, ip, sp);
 }
 #endif
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
-- 
2.35.1



[PATCH v3 10/25] powerpc: Finalise cleanup around ABI use

2022-05-08 Thread Christophe Leroy
Now that we have CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2,
get rid of all indirect detection of ABI version.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/Kconfig|  2 +-
 arch/powerpc/Makefile   |  2 +-
 arch/powerpc/include/asm/types.h|  8 
 arch/powerpc/kernel/fadump.c| 13 -
 arch/powerpc/kernel/ptrace/ptrace.c |  6 --
 arch/powerpc/net/bpf_jit_comp64.c   |  4 ++--
 6 files changed, 12 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 174edabb74fa..5514fed3f072 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -208,7 +208,7 @@ config PPC
select HAVE_EFFICIENT_UNALIGNED_ACCESS  if !(CPU_LITTLE_ENDIAN && 
POWER7_CPU)
select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
-   select HAVE_FUNCTION_DESCRIPTORSif PPC64 && !CPU_LITTLE_ENDIAN
+   select HAVE_FUNCTION_DESCRIPTORSif PPC64_ELF_ABI_V1
select HAVE_FUNCTION_ERROR_INJECTION
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 1ba98be84101..8bd3b631f094 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -213,7 +213,7 @@ CHECKFLAGS  += -m$(BITS) -D__powerpc__ -D__powerpc$(BITS)__
 ifdef CONFIG_CPU_BIG_ENDIAN
 CHECKFLAGS += -D__BIG_ENDIAN__
 else
-CHECKFLAGS += -D__LITTLE_ENDIAN__ -D_CALL_ELF=2
+CHECKFLAGS += -D__LITTLE_ENDIAN__
 endif
 
 ifdef CONFIG_476FPE_ERR46
diff --git a/arch/powerpc/include/asm/types.h b/arch/powerpc/include/asm/types.h
index 84078c28c1a2..93157a661dcc 100644
--- a/arch/powerpc/include/asm/types.h
+++ b/arch/powerpc/include/asm/types.h
@@ -11,14 +11,6 @@
 
 #include 
 
-#ifdef __powerpc64__
-#if defined(_CALL_ELF) && _CALL_ELF == 2
-#define PPC64_ELF_ABI_v2 1
-#else
-#define PPC64_ELF_ABI_v1 1
-#endif
-#endif /* __powerpc64__ */
-
 #ifndef __ASSEMBLY__
 
 typedef __vector128 vector128;
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 65562c4a0a69..5f7224d66586 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -968,11 +968,14 @@ static int fadump_init_elfcore_header(char *bufp)
elf->e_entry = 0;
elf->e_phoff = sizeof(struct elfhdr);
elf->e_shoff = 0;
-#if defined(_CALL_ELF)
-   elf->e_flags = _CALL_ELF;
-#else
-   elf->e_flags = 0;
-#endif
+
+   if (IS_ENABLED(CONFIG_PPC64_ELF_ABI_V2))
+   elf->e_flags = 2;
+   else if (IS_ENABLED(CONFIG_PPC64_ELF_ABI_V1))
+   elf->e_flags = 1;
+   else
+   elf->e_flags = 0;
+
elf->e_ehsize = sizeof(struct elfhdr);
elf->e_phentsize = sizeof(struct elf_phdr);
elf->e_phnum = 0;
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c 
b/arch/powerpc/kernel/ptrace/ptrace.c
index 9fbe155a9bd0..4d2dc22d4a2d 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -444,10 +444,4 @@ void __init pt_regs_check(void)
 * real registers.
 */
BUILD_BUG_ON(PT_DSCR < sizeof(struct user_pt_regs) / sizeof(unsigned 
long));
-
-#ifdef CONFIG_PPC64_ELF_ABI_V1
-   BUILD_BUG_ON(!IS_ENABLED(CONFIG_HAVE_FUNCTION_DESCRIPTORS));
-#else
-   BUILD_BUG_ON(IS_ENABLED(CONFIG_HAVE_FUNCTION_DESCRIPTORS));
-#endif
 }
diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
b/arch/powerpc/net/bpf_jit_comp64.c
index d7b42f45669e..594c54931e20 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -126,7 +126,7 @@ void bpf_jit_build_prologue(u32 *image, struct 
codegen_context *ctx)
 {
int i;
 
-   if (__is_defined(CONFIG_PPC64_ELF_ABI_V2))
+   if (IS_ENABLED(CONFIG_PPC64_ELF_ABI_V2))
EMIT(PPC_RAW_LD(_R2, _R13, offsetof(struct paca_struct, 
kernel_toc)));
 
/*
@@ -266,7 +266,7 @@ static int bpf_jit_emit_tail_call(u32 *image, struct 
codegen_context *ctx, u32 o
int b2p_index = bpf_to_ppc(BPF_REG_3);
int bpf_tailcall_prologue_size = 8;
 
-   if (__is_defined(CONFIG_PPC64_ELF_ABI_V2))
+   if (IS_ENABLED(CONFIG_PPC64_ELF_ABI_V2))
bpf_tailcall_prologue_size += 4; /* skip past the toc load */
 
/*
-- 
2.35.1



[PATCH v3 14/25] powerpc/ftrace: Remove ftrace_plt_tramps[]

2022-05-08 Thread Christophe Leroy
ftrace_plt_tramps table is never filled so it is useless.

Remove it.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/trace/ftrace.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index f89bcaa5f0fc..010a8c7ff4ac 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -250,7 +250,6 @@ static int setup_mcount_compiler_tramp(unsigned long tramp)
int i;
ppc_inst_t op;
unsigned long ptr;
-   static unsigned long ftrace_plt_tramps[NUM_FTRACE_TRAMPS];
 
/* Is this a known long jump tramp? */
for (i = 0; i < NUM_FTRACE_TRAMPS; i++)
@@ -259,13 +258,6 @@ static int setup_mcount_compiler_tramp(unsigned long tramp)
else if (ftrace_tramps[i] == tramp)
return 0;
 
-   /* Is this a known plt tramp? */
-   for (i = 0; i < NUM_FTRACE_TRAMPS; i++)
-   if (!ftrace_plt_tramps[i])
-   break;
-   else if (ftrace_plt_tramps[i] == tramp)
-   return -1;
-
/* New trampoline -- read where this goes */
if (copy_inst_from_kernel_nofault(&op, (void *)tramp)) {
pr_debug("Fetching opcode failed.\n");
-- 
2.35.1



[PATCH v3 06/25] powerpc/ftrace: Inline ftrace_modify_code()

2022-05-08 Thread Christophe Leroy
Inlining ftrace_modify_code() increases the size of the ftrace code
a bit, but brings a 5% improvement on ftrace activation.

Usually in C files we let gcc decide what to do, but here it really
helps to 'help' gcc decide to inline, though we don't want to force
it with an __always_inline, which would be too much for
CONFIG_CC_OPTIMIZE_FOR_SIZE.

Signed-off-by: Christophe Leroy 
---
v2: More explanation in commit message
---
 arch/powerpc/kernel/trace/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index 41c45b9c7f39..98e82fa4980f 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -53,7 +53,7 @@ ftrace_call_replace(unsigned long ip, unsigned long addr, int 
link)
return op;
 }
 
-static int
+static inline int
 ftrace_modify_code(unsigned long ip, ppc_inst_t old, ppc_inst_t new)
 {
ppc_inst_t replaced;
-- 
2.35.1



[PATCH v3 00/25] powerpc: ftrace optimisation and cleanup and more [v3]

2022-05-08 Thread Christophe Leroy
This series provides optimisation and cleanup of ftrace on powerpc.

With this series ftrace activation is about 20% faster on an 8xx.

At the end of the series come additional cleanups around ppc-opcode,
that would likely conflict with this series if posted separately.

Changes since v2:
- The only change in v3 is in patch 21, to fix sparse problems reported by the 
Robot.

Main changes since v1 (details after each individual patch description):
- Added 3 patches (8, 9, 10) that convert the PPC64_ELF_ABI_v{1/2} macros to 
CONFIG_PPC64_ELF_ABI_V{1/2}
- Addressed comments from Naveen

Christophe Leroy (25):
  powerpc/ftrace: Refactor prepare_ftrace_return()
  powerpc/ftrace: Remove redundant create_branch() calls
  powerpc/code-patching: Inline is_offset_in_{cond}_branch_range()
  powerpc/ftrace: Use is_offset_in_branch_range()
  powerpc/code-patching: Inline create_branch()
  powerpc/ftrace: Inline ftrace_modify_code()
  powerpc/ftrace: Use patch_instruction() return directly
  powerpc: Add CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2
  powerpc: Replace PPC64_ELF_ABI_v{1/2} by CONFIG_PPC64_ELF_ABI_V{1/2}
  powerpc: Finalise cleanup around ABI use
  powerpc/ftrace: Make __ftrace_make_{nop/call}() common to PPC32 and
PPC64
  powerpc/ftrace: Don't include ftrace.o for CONFIG_FTRACE_SYSCALLS
  powerpc/ftrace: Use CONFIG_FUNCTION_TRACER instead of
CONFIG_DYNAMIC_FTRACE
  powerpc/ftrace: Remove ftrace_plt_tramps[]
  powerpc/ftrace: Use BRANCH_SET_LINK instead of value 1
  powerpc/ftrace: Use PPC_RAW_xxx() macros instead of opencoding.
  powerpc/ftrace: Use size macro instead of opencoding
  powerpc/ftrace: Simplify expected_nop_sequence()
  powerpc/ftrace: Minimise number of #ifdefs
  powerpc/inst: Add __copy_inst_from_kernel_nofault()
  powerpc/ftrace: Don't use copy_from_kernel_nofault() in
module_trampoline_target()
  powerpc/inst: Remove PPC_INST_BRANCH
  powerpc/modules: Use PPC_LI macros instead of opencoding
  powerpc/inst: Remove PPC_INST_BL
  powerpc/opcodes: Remove unused PPC_INST_XXX macros

 arch/powerpc/Kconfig |   2 +-
 arch/powerpc/Makefile|  12 +-
 arch/powerpc/boot/Makefile   |   2 +
 arch/powerpc/include/asm/code-patching.h |  65 +++-
 arch/powerpc/include/asm/ftrace.h|   4 +-
 arch/powerpc/include/asm/inst.h  |  13 +-
 arch/powerpc/include/asm/linkage.h   |   2 +-
 arch/powerpc/include/asm/module.h|   2 -
 arch/powerpc/include/asm/ppc-opcode.h|  22 +-
 arch/powerpc/include/asm/ppc_asm.h   |   4 +-
 arch/powerpc/include/asm/ptrace.h|   2 +-
 arch/powerpc/include/asm/sections.h  |  24 +-
 arch/powerpc/include/asm/types.h |   8 -
 arch/powerpc/kernel/fadump.c |  13 +-
 arch/powerpc/kernel/head_64.S|   2 +-
 arch/powerpc/kernel/interrupt_64.S   |   2 +-
 arch/powerpc/kernel/kprobes.c|   6 +-
 arch/powerpc/kernel/misc_64.S|   2 +-
 arch/powerpc/kernel/module.c |   4 +-
 arch/powerpc/kernel/module_32.c  |  38 ++-
 arch/powerpc/kernel/module_64.c  |   7 +-
 arch/powerpc/kernel/ptrace/ptrace.c  |   6 -
 arch/powerpc/kernel/trace/Makefile   |   5 +-
 arch/powerpc/kernel/trace/ftrace.c   | 375 +++
 arch/powerpc/kvm/book3s_interrupts.S |   2 +-
 arch/powerpc/kvm/book3s_rmhandlers.S |   2 +-
 arch/powerpc/lib/code-patching.c |  49 +--
 arch/powerpc/lib/feature-fixups.c|   2 +-
 arch/powerpc/net/bpf_jit.h   |   4 +-
 arch/powerpc/net/bpf_jit_comp.c  |   2 +-
 arch/powerpc/net/bpf_jit_comp64.c|   4 +-
 arch/powerpc/platforms/Kconfig.cputype   |   6 +
 32 files changed, 271 insertions(+), 422 deletions(-)

-- 
2.35.1



[PATCH v3 18/25] powerpc/ftrace: Simplify expected_nop_sequence()

2022-05-08 Thread Christophe Leroy
Avoid ifdefs around expected_nop_sequence().

While at it make it a bool.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/trace/ftrace.c | 22 ++++++----------------
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index 346b5485e7ef..c34cb394f8a8 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -390,24 +390,14 @@ int ftrace_make_nop(struct module *mod,
  * They should effectively be a NOP, and follow formal constraints,
  * depending on the ABI. Return false if they don't.
  */
-#ifdef CONFIG_PPC64_ELF_ABI_V1
-static int
-expected_nop_sequence(void *ip, ppc_inst_t op0, ppc_inst_t op1)
-{
-   if (!ppc_inst_equal(op0, ppc_inst(PPC_RAW_BRANCH(8))) ||
-   !ppc_inst_equal(op1, ppc_inst(PPC_INST_LD_TOC)))
-   return 0;
-   return 1;
-}
-#else
-static int
-expected_nop_sequence(void *ip, ppc_inst_t op0, ppc_inst_t op1)
+static bool expected_nop_sequence(void *ip, ppc_inst_t op0, ppc_inst_t op1)
 {
-   if (!ppc_inst_equal(op0, ppc_inst(PPC_RAW_NOP(
-   return 0;
-   return 1;
+   if (IS_ENABLED(CONFIG_PPC64_ELF_ABI_V1))
+   return ppc_inst_equal(op0, ppc_inst(PPC_RAW_BRANCH(8))) &&
+  ppc_inst_equal(op1, ppc_inst(PPC_INST_LD_TOC));
+   else
+   return ppc_inst_equal(op0, ppc_inst(PPC_RAW_NOP()));
 }
-#endif
 
 static int
 __ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
-- 
2.35.1
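
A small stand-alone sketch of the #ifdef-to-IS_ENABLED() conversion used above
(IS_ENABLED() is mocked here and the opcodes are only examples): both arms are
always parsed and type-checked, and the dead one is folded away like an if (0).

#include <stdbool.h>
#include <stdio.h>

#define IS_ENABLED(option) (option)		/* stand-in for the kernel macro */
#define CONFIG_PPC64_ELF_ABI_V1 0		/* pretend this is an ELFv2 build */

static bool expected_nop_sequence(unsigned int op0, unsigned int op1)
{
	if (IS_ENABLED(CONFIG_PPC64_ELF_ABI_V1))
		return op0 == 0x48000008 && op1 == 0xe8410028;	/* b +8 ; ld r2,40(r1) */
	else
		return op0 == 0x60000000;			/* nop */
}

int main(void)
{
	printf("%d\n", expected_nop_sequence(0x60000000, 0));	/* 1 on this pretend ELFv2 build */
	return 0;
}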



[PATCH v3 12/25] powerpc/ftrace: Don't include ftrace.o for CONFIG_FTRACE_SYSCALLS

2022-05-08 Thread Christophe Leroy
Since commit 7bea7ac0ca01 ("powerpc/syscalls: Fix syscall tracing")
ftrace.o is not needed anymore for CONFIG_FTRACE_SYSCALLS.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/trace/Makefile | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/trace/Makefile 
b/arch/powerpc/kernel/trace/Makefile
index 542aa7a8b2b4..fc32ec30b297 100644
--- a/arch/powerpc/kernel/trace/Makefile
+++ b/arch/powerpc/kernel/trace/Makefile
@@ -17,7 +17,6 @@ endif
 obj-$(CONFIG_FUNCTION_TRACER)  += ftrace_low.o
 obj-$(CONFIG_DYNAMIC_FTRACE)   += ftrace.o
 obj-$(CONFIG_FUNCTION_GRAPH_TRACER)+= ftrace.o
-obj-$(CONFIG_FTRACE_SYSCALLS)  += ftrace.o
 obj-$(CONFIG_TRACING)  += trace_clock.o
 
 obj-$(CONFIG_PPC64)+= $(obj64-y)
-- 
2.35.1



[PATCH v3 19/25] powerpc/ftrace: Minimise number of #ifdefs

2022-05-08 Thread Christophe Leroy
A lot of #ifdefs can be replaced by IS_ENABLED().

Do so.

This requires having kernel_toc_addr() defined at all times,
as well as PPC_INST_LD_TOC and PPC_INST_STD_LR.

Signed-off-by: Christophe Leroy 
---
v2: Moved the setup of pop outside of the big if()/else() in __ftrace_make_nop()
---
 arch/powerpc/include/asm/code-patching.h |   2 -
 arch/powerpc/include/asm/module.h|   2 -
 arch/powerpc/include/asm/sections.h  |  24 +--
 arch/powerpc/kernel/trace/ftrace.c   | 182 +++
 4 files changed, 103 insertions(+), 107 deletions(-)

diff --git a/arch/powerpc/include/asm/code-patching.h 
b/arch/powerpc/include/asm/code-patching.h
index 8b1a10868275..3f881548fb61 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -217,7 +217,6 @@ static inline unsigned long ppc_kallsyms_lookup_name(const 
char *name)
return addr;
 }
 
-#ifdef CONFIG_PPC64
 /*
  * Some instruction encodings commonly used in dynamic ftracing
  * and function live patching.
@@ -234,6 +233,5 @@ static inline unsigned long ppc_kallsyms_lookup_name(const 
char *name)
 
 /* usually preceded by a mflr r0 */
 #define PPC_INST_STD_LR    PPC_RAW_STD(_R0, _R1, PPC_LR_STKOFF)
-#endif /* CONFIG_PPC64 */
 
 #endif /* _ASM_POWERPC_CODE_PATCHING_H */
diff --git a/arch/powerpc/include/asm/module.h 
b/arch/powerpc/include/asm/module.h
index 857d9ff24295..09e2ffd360bb 100644
--- a/arch/powerpc/include/asm/module.h
+++ b/arch/powerpc/include/asm/module.h
@@ -41,9 +41,7 @@ struct mod_arch_specific {
 
 #ifdef CONFIG_DYNAMIC_FTRACE
unsigned long tramp;
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
unsigned long tramp_regs;
-#endif
 #endif
 
/* List of BUG addresses, source line numbers and filenames */
diff --git a/arch/powerpc/include/asm/sections.h 
b/arch/powerpc/include/asm/sections.h
index 8be2c491c733..6980eaeb16fe 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -29,18 +29,6 @@ extern char start_virt_trampolines[];
 extern char end_virt_trampolines[];
 #endif
 
-/*
- * This assumes the kernel is never compiled -mcmodel=small or
- * the total .toc is always less than 64k.
- */
-static inline unsigned long kernel_toc_addr(void)
-{
-   unsigned long toc_ptr;
-
-   asm volatile("mr %0, 2" : "=r" (toc_ptr));
-   return toc_ptr;
-}
-
 static inline int overlaps_interrupt_vector_text(unsigned long start,
unsigned long end)
 {
@@ -60,5 +48,17 @@ static inline int overlaps_kernel_text(unsigned long start, 
unsigned long end)
 
 #endif
 
+/*
+ * This assumes the kernel is never compiled -mcmodel=small or
+ * the total .toc is always less than 64k.
+ */
+static inline unsigned long kernel_toc_addr(void)
+{
+   unsigned long toc_ptr;
+
+   asm volatile("mr %0, 2" : "=r" (toc_ptr));
+   return toc_ptr;
+}
+
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_SECTIONS_H */
diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index c34cb394f8a8..5e7a4ed7ad22 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -150,26 +150,39 @@ __ftrace_make_nop(struct module *mod,
return -EINVAL;
}
 
-   /* When using -mprofile-kernel or PPC32 there is no load to jump over */
-   pop = ppc_inst(PPC_RAW_NOP());
+   if (IS_ENABLED(CONFIG_MPROFILE_KERNEL)) {
+   if (copy_inst_from_kernel_nofault(&op, (void *)(ip - 4))) {
+   pr_err("Fetching instruction at %lx failed.\n", ip - 4);
+   return -EFAULT;
+   }
 
-#ifdef CONFIG_PPC64
-#ifdef CONFIG_MPROFILE_KERNEL
-   if (copy_inst_from_kernel_nofault(&op, (void *)(ip - 4))) {
-   pr_err("Fetching instruction at %lx failed.\n", ip - 4);
-   return -EFAULT;
-   }
+   /* We expect either a mflr r0, or a std r0, LRSAVE(r1) */
+   if (!ppc_inst_equal(op, ppc_inst(PPC_RAW_MFLR(_R0))) &&
+   !ppc_inst_equal(op, ppc_inst(PPC_INST_STD_LR))) {
+   pr_err("Unexpected instruction %s around bl _mcount\n",
+  ppc_inst_as_str(op));
+   return -EINVAL;
+   }
+   } else if (IS_ENABLED(CONFIG_PPC64)) {
+   /*
+* Check what is in the next instruction. We can see ld 
r2,40(r1), but
+* on first pass after boot we will see mflr r0.
+*/
+   if (copy_inst_from_kernel_nofault(&op, (void *)(ip + 4))) {
+   pr_err("Fetching op failed.\n");
+   return -EFAULT;
+   }
 
-   /* We expect either a mflr r0, or a std r0, LRSAVE(r1) */
-   if (!ppc_inst_equal(op, ppc_inst(PPC_RAW_MFLR(_R0))) &&
-   !ppc_inst_equal(op, ppc_inst(PPC_INST_STD_LR))) {
-   pr_err("Unexpected 

[PATCH v3 17/25] powerpc/ftrace: Use size macro instead of opencoding

2022-05-08 Thread Christophe Leroy
0x80000000 is SZ_2G. Use it.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/trace/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index ac3f97dd1729..346b5485e7ef 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -741,7 +741,7 @@ int __init ftrace_dyn_arch_init(void)
 #endif
long reladdr = addr - kernel_toc_addr();
 
-   if (reladdr > 0x7FFFFFFF || reladdr < -(0x80000000L)) {
+   if (reladdr >= SZ_2G || reladdr < -SZ_2G) {
pr_err("Address of %ps out of range of kernel_toc.\n",
(void *)addr);
return -1;
-- 
2.35.1
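
A quick stand-alone check (not part of the patch) that the SZ_2G form keeps
the same window as the open-coded constants: "reladdr > 0x7FFFFFFF" is the
same condition as "reladdr >= 0x80000000", i.e. "reladdr >= SZ_2G".

#include <stdio.h>

#define SZ_2G 0x80000000LL

static int in_range(long long reladdr)
{
	return !(reladdr >= SZ_2G || reladdr < -SZ_2G);
}

int main(void)
{
	printf("%d %d %d\n",
	       in_range(0x7FFFFFFFLL),		/* 1: largest accepted displacement */
	       in_range(0x80000000LL),		/* 0: rejected, as with the old test */
	       in_range(-0x80000000LL));	/* 1: smallest accepted displacement */
	return 0;
}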



[PATCH v3 16/25] powerpc/ftrace: Use PPC_RAW_xxx() macros instead of opencoding.

2022-05-08 Thread Christophe Leroy
PPC_RAW_xxx() macros are self-explanatory and less error-prone
than open coding.

Use them in ftrace.c

Signed-off-by: Christophe Leroy 
---
v2:
- Replaced PPC_INST_OFFSET24_MASK by PPC_LI_MASK and added PPC_LI().
- Fix ADDI instead of ADDIS
---
 arch/powerpc/include/asm/ppc-opcode.h |  5 +
 arch/powerpc/kernel/trace/ftrace.c| 32 +--
 2 files changed, 16 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 82f1f0041c6f..3e9aa96ae74b 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -352,6 +352,10 @@
 #define PPC_HIGHER(v)  (((v) >> 32) & 0x)
 #define PPC_HIGHEST(v) (((v) >> 48) & 0x)
 
+/* LI Field */
+#define PPC_LI_MASK    0x03fffffc
+#define PPC_LI(v)  ((v) & PPC_LI_MASK)
+
 /*
  * Only use the larx hint bit on 64bit CPUs. e500v1/v2 based CPUs will treat a
  * larx with EH set as an illegal instruction.
@@ -572,6 +576,7 @@
 #define PPC_RAW_EIEIO()(0x7c0006ac)
 
 #define PPC_RAW_BRANCH(addr)   (PPC_INST_BRANCH | ((addr) & 
0x03fffffc))
+#define PPC_RAW_BL(offset) (0x48000001 | PPC_LI(offset))
 
 /* Deal with instructions that older assemblers aren't aware of */
 #definePPC_BCCTR_FLUSH stringify_in_c(.long 
PPC_INST_BCCTR_FLUSH)
diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index c4a68340a351..ac3f97dd1729 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -90,19 +90,19 @@ static int test_24bit_addr(unsigned long ip, unsigned long 
addr)
 
 static int is_bl_op(ppc_inst_t op)
 {
-   return (ppc_inst_val(op) & 0xfc000003) == 0x48000001;
+   return (ppc_inst_val(op) & ~PPC_LI_MASK) == PPC_RAW_BL(0);
 }
 
 static int is_b_op(ppc_inst_t op)
 {
-   return (ppc_inst_val(op) & 0xfc03) == 0x4800;
+   return (ppc_inst_val(op) & ~PPC_LI_MASK) == PPC_RAW_BRANCH(0);
 }
 
 static unsigned long find_bl_target(unsigned long ip, ppc_inst_t op)
 {
int offset;
 
-   offset = (ppc_inst_val(op) & 0x03fffffc);
+   offset = PPC_LI(ppc_inst_val(op));
/* make it signed */
if (offset & 0x02000000)
offset |= 0xfe000000;
@@ -182,7 +182,7 @@ __ftrace_make_nop(struct module *mod,
 * Use a b +8 to jump over the load.
 */
 
-   pop = ppc_inst(PPC_INST_BRANCH | 8);/* b +8 */
+   pop = ppc_inst(PPC_RAW_BRANCH(8));  /* b +8 */
 
/*
 * Check what is in the next instruction. We can see ld r2,40(r1), but
@@ -394,17 +394,8 @@ int ftrace_make_nop(struct module *mod,
 static int
 expected_nop_sequence(void *ip, ppc_inst_t op0, ppc_inst_t op1)
 {
-   /*
-* We expect to see:
-*
-* b +8
-* ld r2,XX(r1)
-*
-* The load offset is different depending on the ABI. For simplicity
-* just mask it out when doing the compare.
-*/
-   if (!ppc_inst_equal(op0, ppc_inst(0x48000008)) ||
-   (ppc_inst_val(op1) & 0xffff0000) != 0xe8410000)
+   if (!ppc_inst_equal(op0, ppc_inst(PPC_RAW_BRANCH(8))) ||
+   !ppc_inst_equal(op1, ppc_inst(PPC_INST_LD_TOC)))
return 0;
return 1;
 }
@@ -412,7 +403,6 @@ expected_nop_sequence(void *ip, ppc_inst_t op0, ppc_inst_t 
op1)
 static int
 expected_nop_sequence(void *ip, ppc_inst_t op0, ppc_inst_t op1)
 {
-   /* look for patched "NOP" on ppc64 with -mprofile-kernel or ppc32 */
if (!ppc_inst_equal(op0, ppc_inst(PPC_RAW_NOP(
return 0;
return 1;
@@ -738,11 +728,11 @@ int __init ftrace_dyn_arch_init(void)
int i;
unsigned int *tramp[] = { ftrace_tramp_text, ftrace_tramp_init };
u32 stub_insns[] = {
-   0xe98d | PACATOC,   /* ld  r12,PACATOC(r13) */
-   0x3d8c, /* addis   r12,r12,   */
-   0x398c, /* addir12,r12,*/
-   0x7d8903a6, /* mtctr   r12  */
-   0x4e800420, /* bctr */
+   PPC_RAW_LD(_R12, _R13, PACATOC),
+   PPC_RAW_ADDIS(_R12, _R12, 0),
+   PPC_RAW_ADDI(_R12, _R12, 0),
+   PPC_RAW_MTCTR(_R12),
+   PPC_RAW_BCTR()
};
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
unsigned long addr = ppc_global_function_entry((void 
*)ftrace_regs_caller);
-- 
2.35.1



[PATCH v3 20/25] powerpc/inst: Add __copy_inst_from_kernel_nofault()

2022-05-08 Thread Christophe Leroy
On the same model as get_user() versus __get_user(),
introduce __copy_inst_from_kernel_nofault() which doesn't
check the address.

To be used by callers that have already checked that the address
is a kernel address.
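
A minimal usage sketch (a hypothetical helper showing the intended split:
the caller validates the address once, then uses the unchecked variant):

    /* Hypothetical caller sketch, not part of this patch. */
    static int read_checked_kernel_insn(unsigned long addr, ppc_inst_t *inst)
    {
            if (!is_kernel_addr(addr))      /* check done on the caller side */
                    return -ERANGE;

            return __copy_inst_from_kernel_nofault(inst, (u32 *)addr);
    }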

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/inst.h | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/inst.h b/arch/powerpc/include/asm/inst.h
index 80b6d74146c6..b49aae9f6f27 100644
--- a/arch/powerpc/include/asm/inst.h
+++ b/arch/powerpc/include/asm/inst.h
@@ -158,13 +158,10 @@ static inline char *__ppc_inst_as_str(char 
str[PPC_INST_STR_LEN], ppc_inst_t x)
__str;  \
 })
 
-static inline int copy_inst_from_kernel_nofault(ppc_inst_t *inst, u32 *src)
+static inline int __copy_inst_from_kernel_nofault(ppc_inst_t *inst, u32 *src)
 {
unsigned int val, suffix;
 
-   if (unlikely(!is_kernel_addr((unsigned long)src)))
-   return -ERANGE;
-
 /* See https://github.com/ClangBuiltLinux/linux/issues/1521 */
 #if defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 14
val = suffix = 0;
@@ -181,4 +178,12 @@ static inline int copy_inst_from_kernel_nofault(ppc_inst_t 
*inst, u32 *src)
return -EFAULT;
 }
 
+static inline int copy_inst_from_kernel_nofault(ppc_inst_t *inst, u32 *src)
+{
+   if (unlikely(!is_kernel_addr((unsigned long)src)))
+   return -ERANGE;
+
+   return __copy_inst_from_kernel_nofault(inst, src);
+}
+
 #endif /* _ASM_POWERPC_INST_H */
-- 
2.35.1



[PATCH v3 13/25] powerpc/ftrace: Use CONFIG_FUNCTION_TRACER instead of CONFIG_DYNAMIC_FTRACE

2022-05-08 Thread Christophe Leroy
Since commit 0c0c52306f47 ("powerpc: Only support DYNAMIC_FTRACE not
static"), CONFIG_DYNAMIC_FTRACE is always selected when
CONFIG_FUNCTION_TRACER is selected.

To avoid confusion, and to spare the reader from wondering what happens when
CONFIG_FUNCTION_TRACER is selected but CONFIG_DYNAMIC_FTRACE is not,
use CONFIG_FUNCTION_TRACER in ifdefs instead of CONFIG_DYNAMIC_FTRACE.

As CONFIG_FUNCTION_GRAPH_TRACER depends on CONFIG_FUNCTION_TRACER,
ftrace.o doesn't need to appear for both symbols in Makefile.

Then, as ftrace.o is built only when CONFIG_FUNCTION_TRACER is selected,
ifdef CONFIG_FUNCTION_TRACER is not needed in ftrace.c, and since it
implies CONFIG_DYNAMIC_FTRACE, CONFIG_DYNAMIC_FTRACE is not needed
in ftrace.c either.

Signed-off-by: Christophe Leroy 
---
v2: Limit the change to the content of arch/powerpc/kernel/trace as suggested 
by Naveen.
---
 arch/powerpc/kernel/trace/Makefile | 4 +---
 arch/powerpc/kernel/trace/ftrace.c | 4 
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/trace/Makefile 
b/arch/powerpc/kernel/trace/Makefile
index fc32ec30b297..af8527538fe4 100644
--- a/arch/powerpc/kernel/trace/Makefile
+++ b/arch/powerpc/kernel/trace/Makefile
@@ -14,9 +14,7 @@ obj64-$(CONFIG_FUNCTION_TRACER)   += 
ftrace_mprofile.o
 else
 obj64-$(CONFIG_FUNCTION_TRACER)+= ftrace_64_pg.o
 endif
-obj-$(CONFIG_FUNCTION_TRACER)  += ftrace_low.o
-obj-$(CONFIG_DYNAMIC_FTRACE)   += ftrace.o
-obj-$(CONFIG_FUNCTION_GRAPH_TRACER)+= ftrace.o
+obj-$(CONFIG_FUNCTION_TRACER)  += ftrace_low.o ftrace.o
 obj-$(CONFIG_TRACING)  += trace_clock.o
 
 obj-$(CONFIG_PPC64)+= $(obj64-y)
diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index 531da4d93c58..f89bcaa5f0fc 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -28,9 +28,6 @@
 #include 
 #include 
 
-
-#ifdef CONFIG_DYNAMIC_FTRACE
-
 /*
  * We generally only have a single long_branch tramp and at most 2 or 3 plt
  * tramps generated. But, we don't use the plt tramps currently. We also allot
@@ -783,7 +780,6 @@ int __init ftrace_dyn_arch_init(void)
return 0;
 }
 #endif
-#endif /* CONFIG_DYNAMIC_FTRACE */
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 
-- 
2.35.1



[PATCH v3 15/25] powerpc/ftrace: Use BRANCH_SET_LINK instead of value 1

2022-05-08 Thread Christophe Leroy
To make it explicit, use BRANCH_SET_LINK instead of value 1
when calling create_branch().

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/trace/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index 010a8c7ff4ac..c4a68340a351 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -45,7 +45,7 @@ ftrace_call_replace(unsigned long ip, unsigned long addr, int 
link)
addr = ppc_function_entry((void *)addr);
 
/* if (link) set op to 'bl' else 'b' */
-   create_branch(, (u32 *)ip, addr, link ? 1 : 0);
+   create_branch(, (u32 *)ip, addr, link ? BRANCH_SET_LINK : 0);
 
return op;
 }
-- 
2.35.1



[PATCH v3 09/25] powerpc: Replace PPC64_ELF_ABI_v{1/2} by CONFIG_PPC64_ELF_ABI_V{1/2}

2022-05-08 Thread Christophe Leroy
Replace all uses of PPC64_ELF_ABI_v1 and PPC64_ELF_ABI_v2 by,
respectively, CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/code-patching.h | 12 ++--
 arch/powerpc/include/asm/ftrace.h|  4 ++--
 arch/powerpc/include/asm/linkage.h   |  2 +-
 arch/powerpc/include/asm/ppc_asm.h   |  4 ++--
 arch/powerpc/include/asm/ptrace.h|  2 +-
 arch/powerpc/kernel/head_64.S|  2 +-
 arch/powerpc/kernel/interrupt_64.S   |  2 +-
 arch/powerpc/kernel/kprobes.c|  6 +++---
 arch/powerpc/kernel/misc_64.S|  2 +-
 arch/powerpc/kernel/module.c |  4 ++--
 arch/powerpc/kernel/module_64.c  |  4 ++--
 arch/powerpc/kernel/ptrace/ptrace.c  |  2 +-
 arch/powerpc/kernel/trace/ftrace.c   |  4 ++--
 arch/powerpc/kvm/book3s_interrupts.S |  2 +-
 arch/powerpc/kvm/book3s_rmhandlers.S |  2 +-
 arch/powerpc/net/bpf_jit.h   |  2 +-
 arch/powerpc/net/bpf_jit_comp.c  |  2 +-
 arch/powerpc/net/bpf_jit_comp64.c|  4 ++--
 18 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/include/asm/code-patching.h 
b/arch/powerpc/include/asm/code-patching.h
index 4260e89f62b1..8b1a10868275 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -130,7 +130,7 @@ bool is_conditional_branch(ppc_inst_t instr);
 
 static inline unsigned long ppc_function_entry(void *func)
 {
-#ifdef PPC64_ELF_ABI_v2
+#ifdef CONFIG_PPC64_ELF_ABI_V2
u32 *insn = func;
 
/*
@@ -155,7 +155,7 @@ static inline unsigned long ppc_function_entry(void *func)
return (unsigned long)(insn + 2);
else
return (unsigned long)func;
-#elif defined(PPC64_ELF_ABI_v1)
+#elif defined(CONFIG_PPC64_ELF_ABI_V1)
/*
 * On PPC64 ABIv1 the function pointer actually points to the
 * function's descriptor. The first entry in the descriptor is the
@@ -169,7 +169,7 @@ static inline unsigned long ppc_function_entry(void *func)
 
 static inline unsigned long ppc_global_function_entry(void *func)
 {
-#ifdef PPC64_ELF_ABI_v2
+#ifdef CONFIG_PPC64_ELF_ABI_V2
/* PPC64 ABIv2 the global entry point is at the address */
return (unsigned long)func;
 #else
@@ -186,7 +186,7 @@ static inline unsigned long ppc_global_function_entry(void 
*func)
 static inline unsigned long ppc_kallsyms_lookup_name(const char *name)
 {
unsigned long addr;
-#ifdef PPC64_ELF_ABI_v1
+#ifdef CONFIG_PPC64_ELF_ABI_V1
/* check for dot variant */
char dot_name[1 + KSYM_NAME_LEN];
bool dot_appended = false;
@@ -207,7 +207,7 @@ static inline unsigned long ppc_kallsyms_lookup_name(const 
char *name)
if (!addr && dot_appended)
/* Let's try the original non-dot symbol lookup */
addr = kallsyms_lookup_name(name);
-#elif defined(PPC64_ELF_ABI_v2)
+#elif defined(CONFIG_PPC64_ELF_ABI_V2)
addr = kallsyms_lookup_name(name);
if (addr)
addr = ppc_function_entry((void *)addr);
@@ -224,7 +224,7 @@ static inline unsigned long ppc_kallsyms_lookup_name(const 
char *name)
  */
 
 /* This must match the definition of STK_GOT in  */
-#ifdef PPC64_ELF_ABI_v2
+#ifdef CONFIG_PPC64_ELF_ABI_V2
 #define R2_STACK_OFFSET 24
 #else
 #define R2_STACK_OFFSET 40
diff --git a/arch/powerpc/include/asm/ftrace.h 
b/arch/powerpc/include/asm/ftrace.h
index d83758acd1c7..b56166b7ea68 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -64,7 +64,7 @@ void ftrace_graph_func(unsigned long ip, unsigned long 
parent_ip,
  * those.
  */
 #define ARCH_HAS_SYSCALL_MATCH_SYM_NAME
-#ifdef PPC64_ELF_ABI_v1
+#ifdef CONFIG_PPC64_ELF_ABI_V1
 static inline bool arch_syscall_match_sym_name(const char *sym, const char 
*name)
 {
/* We need to skip past the initial dot, and the __se_sys alias */
@@ -83,7 +83,7 @@ static inline bool arch_syscall_match_sym_name(const char 
*sym, const char *name
(!strncmp(sym, "ppc32_", 6) && !strcmp(sym + 6, name + 4)) ||
(!strncmp(sym, "ppc64_", 6) && !strcmp(sym + 6, name + 4));
 }
-#endif /* PPC64_ELF_ABI_v1 */
+#endif /* CONFIG_PPC64_ELF_ABI_V1 */
 #endif /* CONFIG_FTRACE_SYSCALLS */
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/include/asm/linkage.h 
b/arch/powerpc/include/asm/linkage.h
index 1f00d2891d69..b71b9582e754 100644
--- a/arch/powerpc/include/asm/linkage.h
+++ b/arch/powerpc/include/asm/linkage.h
@@ -4,7 +4,7 @@
 
 #include 
 
-#ifdef PPC64_ELF_ABI_v1
+#ifdef CONFIG_PPC64_ELF_ABI_V1
 #define cond_syscall(x) \
asm ("\t.weak " #x "\n\t.set " #x ", sys_ni_syscall\n"  \
 "\t.weak ." #x "\n\t.set ." #x ", .sys_ni_syscall\n")
diff --git a/arch/powerpc/include/asm/ppc_asm.h 
b/arch/powerpc/include/asm/ppc_asm.h
index 4dea2d963738..83c02f5a7f2a 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ 

[PATCH v3 04/25] powerpc/ftrace: Use is_offset_in_branch_range()

2022-05-08 Thread Christophe Leroy
Use is_offset_in_branch_range() instead of create_branch()
to check if a target is within branch range.

This patch, together with the previous one, improves
ftrace activation time by 7%.
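
For reference, the range check boils down to a word-aligned offset that
fits in the signed 26-bit LI field of a direct branch; a rough sketch of
the idea (not the exact kernel helper, and offset_fits_direct_branch() is a
made-up name):

    /* Rough sketch: can a direct "b"/"bl" reach this offset? */
    static bool offset_fits_direct_branch(long offset)
    {
            return offset >= -0x2000000 && offset <= 0x1fffffc &&
                   !(offset & 0x3);
    }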

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/trace/ftrace.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c 
b/arch/powerpc/kernel/trace/ftrace.c
index 3ce3697e8a7c..41c45b9c7f39 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -89,11 +89,9 @@ ftrace_modify_code(unsigned long ip, ppc_inst_t old, 
ppc_inst_t new)
  */
 static int test_24bit_addr(unsigned long ip, unsigned long addr)
 {
-   ppc_inst_t op;
addr = ppc_function_entry((void *)addr);
 
-   /* use the create_branch to verify that this offset can be branched */
-   return create_branch(, (u32 *)ip, addr, 0) == 0;
+   return is_offset_in_branch_range(addr - ip);
 }
 
 static int is_bl_op(ppc_inst_t op)
@@ -261,7 +259,6 @@ __ftrace_make_nop(struct module *mod,
 static unsigned long find_ftrace_tramp(unsigned long ip)
 {
int i;
-   ppc_inst_t instr;
 
/*
 * We have the compiler generated long_branch tramps at the end
@@ -270,8 +267,7 @@ static unsigned long find_ftrace_tramp(unsigned long ip)
for (i = NUM_FTRACE_TRAMPS - 1; i >= 0; i--)
if (!ftrace_tramps[i])
continue;
-   else if (create_branch(, (void *)ip,
-  ftrace_tramps[i], 0) == 0)
+   else if (is_offset_in_branch_range(ftrace_tramps[i] - ip))
return ftrace_tramps[i];
 
return 0;
-- 
2.35.1



Re: [PATCH kernel] powerpc/llvm/lto: Allow LLVM LTO builds

2022-05-08 Thread Alexey Kardashevskiy




On 5/4/22 07:21, Nick Desaulniers wrote:

On Thu, Apr 28, 2022 at 11:46 PM Alexey Kardashevskiy  wrote:


This enables LTO_CLANG builds on POWER with the upstream version of
LLVM.

LTO optimizes the output vmlinux binary, and this may affect the FTR
alternative sections if alt branches use "bc" (Branch Conditional), which
is limited to 16-bit offsets.

ld.lld: error: InputSection too large for range extension thunk 
vmlinux.o:(__ftr_alt_97+0xF0)

This works around the issue by replacing "bc" in FTR_SECTION_ELSE with
"b", which allows 26-bit offsets.

This catches the problem instructions in vmlinux.o before it is LTO'ed:

$ objdump -d -M raw -j __ftr_alt_97 vmlinux.o | egrep '\S+\s*\'
   30:   00 00 82 40 bc  4,eq,30 <__ftr_alt_97+0x30>
   f0:   00 00 82 40 bc  4,eq,f0 <__ftr_alt_97+0xf0>

This allows LTO builds for ppc64le_defconfig plus LTO options.
Note that DYNAMIC_FTRACE/FUNCTION_TRACER is not supported by LTO builds
but this is not POWERPC-specific.


$ ARCH=powerpc make LLVM=1 -j72 ppc64le_defconfig
$ ARCH=powerpc make LLVM=1 -j72 menuconfig

$ ARCH=powerpc make LLVM=1 -j72
...
   VDSO64L arch/powerpc/kernel/vdso/vdso64.so.dbg
/usr/bin/powerpc64le-linux-gnu-ld:
/android0/llvm-project/llvm/build/bin/../lib/LLVMgold.so: error
loading plugin:
/android0/llvm-project/llvm/build/bin/../lib/LLVMgold.so: cannot open
shared object file: No such file or directory
clang-15: error: linker command failed with exit code 1 (use -v to see
invocation)
make[1]: *** [arch/powerpc/kernel/vdso/Makefile:67:
arch/powerpc/kernel/vdso/vdso64.so.dbg] Error 1

Looks like LLD isn't being invoked correctly to link the vdso.
Probably need to revisit
https://lore.kernel.org/lkml/20200901222523.1941988-1-ndesaulni...@google.com/

How were you working around this issue? Perhaps you built clang to
default to LLD? (there's a cmake option for that)



What option is that? I only add -DLLVM_ENABLE_LLD=ON, which (I think)
tells cmake to use lld to link the LLVM being built, but does not seem to
control what the built clang should do.


Without -DLLVM_ENABLE_LLD=ON, building just fails:

[fstn1-p1 ~/pbuild/llvm/llvm-lto-latest-cleanbuild]$ ninja -j 100
[619/3501] Linking CXX executable bin/not
FAILED: bin/not
: && /usr/bin/clang++ -fPIC -fvisibility-inlines-hidden 
-Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra 
-Wno-unused-parameter -Wwrite-strings -Wcast-qual 
-Wmissing-field-initializers -pedantic -Wno-long-long 
-Wc++98-compat-extra-semi -Wimplicit-fallthrough 
-Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor 
-Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion 
-Wmisleading-indentation -fdiagnostics-color -ffunction-sections 
-fdata-sections -flto -O3 -DNDEBUG -flto 
-Wl,-rpath-link,/home/aik/pbuild/llvm/llvm-lto-latest-cleanbuild/./lib 
-Wl,--gc-sections utils/not/CMakeFiles/not.dir/not.cpp.o -o bin/not 
-Wl,-rpath,"\$ORIGIN/../lib"  -lpthread  lib/libLLVMSupport.a  -lrt 
-ldl  -lpthread  -lm  /usr/lib/powerpc64le-linux-gnu/libz.so 
/usr/lib/powerpc64le-linux-gnu/libtinfo.so  lib/libLLVMDemangle.a && :
/usr/bin/ld: lib/libLLVMSupport.a: error adding symbols: archive has no 
index; run ranlib to add one
clang: error: linker command failed with exit code 1 (use -v to see 
invocation)
[701/3501] Building CXX object 
utils/TableGen/CMakeFiles/llvm-tblgen.dir/GlobalISelEmitter.cpp.o

ninja: build stopped: subcommand failed.



My head hurts :(
The above example is running on PPC. Now I am trying x86 box:


[2693/3505] Linking CXX shared library lib/libLTO.so.15git
FAILED: lib/libLTO.so.15git
: && /usr/bin/clang++ -fPIC -fPIC -fvisibility-inlines-hidden 
-Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra 
-Wno-unused-parameter -Wwrite-strings -Wcast-qual 
-Wmissing-field-initializers -pedantic -Wno-long-long 
-Wc++98-compat-extra-semi -Wimplicit-fallthrough 
-Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor 
-Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion 
-Wmisleading-indentation -fdiagnostics-color -ffunction-sections 
-fdata-sections -flto -O3 -DNDEBUG  -Wl,-z,defs -Wl,-z,nodelete 
-fuse-ld=ld -flto   -Wl,-rpath-link,/home/aik/llvm-build/./lib 
-Wl,--gc-sections 
-Wl,--version-script,"/home/aik/llvm-build/tools/lto/LTO.exports" 
-shared -Wl,-soname,libLTO.so.15git -o lib/libLTO.so.15git 
tools/lto/CMakeFiles/LTO.dir/LTODisassembler.cpp.o 
tools/lto/CMakeFiles/LTO.dir/lto.cpp.o  -Wl,-rpath,"\$ORIGIN/../lib" 
lib/libLLVMPowerPCAsmParser.a  lib/libLLVMPowerPCCodeGen.a 
lib/libLLVMPowerPCDesc.a  lib/libLLVMPowerPCDisassembler.a 
lib/libLLVMPowerPCInfo.a  lib/libLLVMBitReader.a  lib/libLLVMCore.a 
lib/libLLVMCodeGen.a  lib/libLLVMLTO.a  lib/libLLVMMC.a 
lib/libLLVMMCDisassembler.a  lib/libLLVMSupport.a  lib/libLLVMTarget.a 
lib/libLLVMAsmPrinter.a  lib/libLLVMGlobalISel.a 
lib/libLLVMSelectionDAG.a  lib/libLLVMCodeGen.a  lib/libLLVMExtensions.a 
 lib/libLLVMPasses.a  lib/libLLVMTarget.a  

Re: [PATCH v4 00/14] kbuild: yet another series of cleanups (modpost, LTO, MODULE_REL_CRCS, export.h)

2022-05-08 Thread Masahiro Yamada
On Mon, May 9, 2022 at 4:09 AM Masahiro Yamada  wrote:
>
> This is the third batch of cleanups in this development cycle.
>
> Major changes in v4:
>  - Move static EXPORT_SYMBOL check to a script
>  - Some refactoring
>
> Major changes in v3:
>
>  - Generate symbol CRCs as C code, and remove CONFIG_MODULE_REL_CRCS.
>
> Major changes in v2:
>
>  - V1 did not work with CONFIG_MODULE_REL_CRCS.
>I fixed this for v2.
>
>  - Reflect some review comments in v1
>
>  - Refactor the code more
>
>  - Avoid too long argument error

This series is available at
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git
lto-cleanup-v4




>
>
> Masahiro Yamada (14):
>   modpost: remove left-over cross_compile declaration
>   modpost: change the license of EXPORT_SYMBOL to bool type
>   modpost: split the section mismatch checks into section-check.c
>   modpost: add sym_find_with_module() helper
>   modpost: extract symbol versions from *.cmd files
>   kbuild: link symbol CRCs at final link, removing
> CONFIG_MODULE_REL_CRCS
>   kbuild: stop merging *.symversions
>   genksyms: adjust the output format to modpost
>   kbuild: do not create *.prelink.o for Clang LTO or IBT
>   kbuild: check static EXPORT_SYMBOL* by script instead of modpost
>   kbuild: make built-in.a rule robust against too long argument error
>   kbuild: make *.mod rule robust against too long argument error
>   kbuild: add cmd_and_savecmd macro
>   kbuild: rebuild multi-object modules when objtool is updated
>
>  arch/powerpc/Kconfig|1 -
>  arch/s390/Kconfig   |1 -
>  arch/um/Kconfig |1 -
>  include/asm-generic/export.h|   22 +-
>  include/linux/export-internal.h |   16 +
>  include/linux/export.h  |   30 +-
>  init/Kconfig|4 -
>  kernel/module.c |   10 +-
>  scripts/Kbuild.include  |   10 +-
>  scripts/Makefile.build  |  134 +--
>  scripts/Makefile.lib|7 -
>  scripts/Makefile.modfinal   |5 +-
>  scripts/Makefile.modpost|9 +-
>  scripts/check-local-export  |   48 +
>  scripts/genksyms/genksyms.c |   18 +-
>  scripts/link-vmlinux.sh |   33 +-
>  scripts/mod/Makefile|2 +-
>  scripts/mod/modpost.c   | 1499 ---
>  scripts/mod/modpost.h   |   35 +-
>  scripts/mod/section-check.c | 1222 +
>  20 files changed, 1551 insertions(+), 1556 deletions(-)
>  create mode 100644 include/linux/export-internal.h
>  create mode 100755 scripts/check-local-export
>  create mode 100644 scripts/mod/section-check.c
>
> --
> 2.32.0
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to clang-built-linux+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/clang-built-linux/20220508190631.2386038-1-masahiroy%40kernel.org.



-- 
Best Regards
Masahiro Yamada


Re: [PATCH v2 1/3] mm: change huge_ptep_clear_flush() to return the original pte

2022-05-08 Thread Muchun Song
On Sun, May 08, 2022 at 09:09:55PM +0800, Baolin Wang wrote:
> 
> 
> On 5/8/2022 7:09 PM, Muchun Song wrote:
> > On Sun, May 08, 2022 at 05:36:39PM +0800, Baolin Wang wrote:
> > > It is incorrect to use ptep_clear_flush() to nuke a hugetlb page
> > > table when unmapping or migrating a hugetlb page, and will change
> > > to use huge_ptep_clear_flush() instead in the following patches.
> > > 
> > > So this is a preparation patch, which changes the huge_ptep_clear_flush()
> > > to return the original pte to help to nuke a hugetlb page table.
> > > 
> > > Signed-off-by: Baolin Wang 
> > > Acked-by: Mike Kravetz 
> > 
> > Reviewed-by: Muchun Song 
> 
> Thanks for reviewing.
> 
> > 
> > But one nit below:
> > 
> > [...]
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 8605d7e..61a21af 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -5342,7 +5342,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, 
> > > struct vm_area_struct *vma,
> > >   ClearHPageRestoreReserve(new_page);
> > >   /* Break COW or unshare */
> > > - huge_ptep_clear_flush(vma, haddr, ptep);
> > > + (void)huge_ptep_clear_flush(vma, haddr, ptep);
> > 
> > Why add a "(void)" here? Is there any warning if no "(void)"?
> > IIUC, I think we can remove this, right?
> 
> I did not meet any warning without the casting, but this is per Mike's
> comment[1] to make the code consistent with other functions casting to void
> type explicitly in hugetlb.c file.
>

Got it. I see hugetlb.c follows this rule, while other files do not.
 
> [1]
> https://lore.kernel.org/all/495c4ebe-a5b4-afb6-4cb0-956c1b18d...@oracle.com/
> 


Re: request_module DoS

2022-05-08 Thread Luis Chamberlain
On Sat, May 07, 2022 at 12:14:47PM -0700, Luis Chamberlain wrote:
> On Sat, May 07, 2022 at 01:02:20AM -0700, Luis Chamberlain wrote:
> > You can try to reproduce by adding a new test type for crypto-aegis256
> > to lib/test_kmod.c. These tests, however, can try something similar with
> > other modules.
> > 
> > /tools/testing/selftests/kmod/kmod.sh -t 0008
> > /tools/testing/selftests/kmod/kmod.sh -t 0009
> > 
> > I can't decipher this yet.
> 
> Without testing it... but something like this might be an easier
> reproducer:
> 
> + config_set_driver crypto-aegis256

If the module is not present, though, nothing really happens, so
is it possible this is another issue?

Below a bogus module request.

diff --git a/tools/testing/selftests/kmod/kmod.sh 
b/tools/testing/selftests/kmod/kmod.sh
index afd42387e8b2..a747ad549940 100755
--- a/tools/testing/selftests/kmod/kmod.sh
+++ b/tools/testing/selftests/kmod/kmod.sh
@@ -65,6 +66,7 @@ ALL_TESTS="$ALL_TESTS 0010:1:1"
 ALL_TESTS="$ALL_TESTS 0011:1:1"
 ALL_TESTS="$ALL_TESTS 0012:1:1"
 ALL_TESTS="$ALL_TESTS 0013:1:1"
+ALL_TESTS="$ALL_TESTS 0014:150:1"
 
 # Kselftest framework requirement - SKIP code is 4.
 ksft_skip=4
@@ -504,6 +506,17 @@ kmod_test_0013()
"cat /sys/module/${DEFAULT_KMOD_DRIVER}/sections/.*text | head 
-n1"
 }
 
+kmod_test_0014()
+{
+   kmod_defaults_driver
+   MODPROBE_LIMIT=$(config_get_modprobe_limit)
+   let EXTRA=$MODPROBE_LIMIT/6
+   config_set_driver bogus_module_does_not_exist
+   config_num_thread_limit_extra $EXTRA
+   config_trigger ${FUNCNAME[0]}
+   config_expect_result ${FUNCNAME[0]} MODULE_NOT_FOUND
+}
+
 list_tests()
 {
echo "Test ID list:"
@@ -525,6 +538,7 @@ list_tests()
echo "0011 x $(get_test_count 0011) - test completely disabling module 
autoloading"
echo "0012 x $(get_test_count 0012) - test /proc/modules address 
visibility under CAP_SYSLOG"
echo "0013 x $(get_test_count 0013) - test /sys/module/*/sections/* 
visibility under CAP_SYSLOG"
+   echo "0014 x $(get_test_count 0014) - multithreaded - push 
kmod_concurrent over max_modprobes for request_module() for a missing module"
 }
 
 usage()


Re: [PATCH v2 2/3] mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when migration

2022-05-08 Thread Muchun Song
On Sun, May 08, 2022 at 05:36:40PM +0800, Baolin Wang wrote:
> On some architectures (like ARM64), it can support CONT-PTE/PMD size
> hugetlb, which means it can support not only PMD/PUD size hugetlb:
> 2M and 1G, but also CONT-PTE/PMD size: 64K and 32M if a 4K page
> size specified.
> 
> When migrating a hugetlb page, we will get the relevant page table
> entry by huge_pte_offset() only once to nuke it and remap it with
> a migration pte entry. This is correct for PMD or PUD size hugetlb,
> since they always contain only one pmd entry or pud entry in the
> page table.
> 
> However this is incorrect for CONT-PTE and CONT-PMD size hugetlb,
> since they can contain several continuous pte or pmd entry with
> same page table attributes. So we will nuke or remap only one pte
> or pmd entry for this CONT-PTE/PMD size hugetlb page, which is
> not expected for hugetlb migration. The problem is we can still
> continue to modify the subpages' data of a hugetlb page during
> migrating a hugetlb page, which can cause a serious data consistent
> issue, since we did not nuke the page table entry and set a
> migration pte for the subpages of a hugetlb page.
> 
> To fix this issue, we should change to use huge_ptep_clear_flush()
> to nuke a hugetlb page table, and remap it with set_huge_pte_at()
> and set_huge_swap_pte_at() when migrating a hugetlb page, which
> already considered the CONT-PTE or CONT-PMD size hugetlb.
> 
> Signed-off-by: Baolin Wang 

This looks fine to me.

Reviewed-by: Muchun Song 

Thanks.


Re: [PATCH v2 1/3] mm: change huge_ptep_clear_flush() to return the original pte

2022-05-08 Thread Muchun Song
On Sun, May 08, 2022 at 05:36:39PM +0800, Baolin Wang wrote:
> It is incorrect to use ptep_clear_flush() to nuke a hugetlb page
> table when unmapping or migrating a hugetlb page, and will change
> to use huge_ptep_clear_flush() instead in the following patches.
> 
> So this is a preparation patch, which changes the huge_ptep_clear_flush()
> to return the original pte to help to nuke a hugetlb page table.
> 
> Signed-off-by: Baolin Wang 
> Acked-by: Mike Kravetz 

Reviewed-by: Muchun Song 

But one nit below:

[...]
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 8605d7e..61a21af 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5342,7 +5342,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, 
> struct vm_area_struct *vma,
>   ClearHPageRestoreReserve(new_page);
>  
>   /* Break COW or unshare */
> - huge_ptep_clear_flush(vma, haddr, ptep);
> + (void)huge_ptep_clear_flush(vma, haddr, ptep);

Why add a "(void)" here? Is there any warning if no "(void)"?
IIUC, I think we can remove this, right?

>   mmu_notifier_invalidate_range(mm, range.start, range.end);
>   page_remove_rmap(old_page, vma, true);
>   hugepage_add_new_anon_rmap(new_page, vma, haddr);
> -- 
> 1.8.3.1
> 
> 


Re: [PATCH] ASoC: fsl_sai: fix incorrect mclk number in error message

2022-05-08 Thread Shengjiu Wang
On Sat, May 7, 2022 at 8:31 PM Pieterjan Camerlynck <
pieterjan.camerly...@gmail.com> wrote:

> In commit  ("ASoC: fsl_sai: add sai master mode support")
> the loop was changed to start iterating from 1 instead of 0. The error
> message however was not updated, reporting the wrong clock to the user.
>
> Signed-off-by: Pieterjan Camerlynck 
>

Acked-by: Shengjiu Wang 

Best Regards
Wang Shengjiu

> ---
>  sound/soc/fsl/fsl_sai.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
> index ffc24afb5a7a..f0602077b385 100644
> --- a/sound/soc/fsl/fsl_sai.c
> +++ b/sound/soc/fsl/fsl_sai.c
> @@ -1054,7 +1054,7 @@ static int fsl_sai_probe(struct platform_device
> *pdev)
> sai->mclk_clk[i] = devm_clk_get(>dev, tmp);
> if (IS_ERR(sai->mclk_clk[i])) {
> dev_err(>dev, "failed to get mclk%d clock:
> %ld\n",
> -   i + 1, PTR_ERR(sai->mclk_clk[i]));
> +   i, PTR_ERR(sai->mclk_clk[i]));
> sai->mclk_clk[i] = NULL;
> }
> }
> --
> 2.25.1
>
>


[powerpc:fixes-test] BUILD SUCCESS 348c71344111d7a48892e3e52264ff11956fc196

2022-05-08 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
fixes-test
branch HEAD: 348c71344111d7a48892e3e52264ff11956fc196  powerpc/papr_scm: Fix 
buffer overflow issue with CONFIG_FORTIFY_SOURCE

elapsed time: 739m

configs tested: 153
configs skipped: 100

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm64   defconfig
arm64allyesconfig
arm  allmodconfig
arm defconfig
arm  allyesconfig
i386  randconfig-c001
arm   imxrt_defconfig
arm  footbridge_defconfig
m68k alldefconfig
arm ezx_defconfig
powerpc64   defconfig
h8300alldefconfig
powerpc   motionpro_defconfig
armshmobile_defconfig
sh   sh7770_generic_defconfig
arm lubbock_defconfig
mipsvocore2_defconfig
ia64 bigsur_defconfig
mips  decstation_64_defconfig
h8300   h8s-sim_defconfig
sh  kfr2r09_defconfig
sh   se7712_defconfig
powerpc redwood_defconfig
powerpc mpc837x_rdb_defconfig
powerpc64alldefconfig
arcnsimosci_defconfig
um   x86_64_defconfig
armspear6xx_defconfig
powerpc  cm5200_defconfig
arm  iop32x_defconfig
armtrizeps4_defconfig
sparc   sparc64_defconfig
ia64 alldefconfig
mips   bmips_be_defconfig
powerpc rainier_defconfig
sparc   defconfig
sh   se7750_defconfig
sh  defconfig
mips decstation_r4k_defconfig
shecovec24-romimage_defconfig
arm  lpd270_defconfig
sh   se7721_defconfig
alphaallyesconfig
shapsh4ad0a_defconfig
arm   aspeed_g5_defconfig
sh   se7343_defconfig
powerpc   eiger_defconfig
arm   sunxi_defconfig
powerpc  mgcoge_defconfig
sh   se7751_defconfig
mips   xway_defconfig
sh   se7619_defconfig
arm s3c6400_defconfig
arc  alldefconfig
xtensasmp_lx200_defconfig
powerpc pq2fads_defconfig
m68k allmodconfig
sh  sdk7780_defconfig
powerpcsam440ep_defconfig
sh   se7722_defconfig
openriscdefconfig
arm at91_dt_defconfig
arm   viper_defconfig
x86_64randconfig-c001
arm  randconfig-c002-20220508
ia64defconfig
m68k allyesconfig
m68kdefconfig
nios2   defconfig
arc  allyesconfig
cskydefconfig
nios2allyesconfig
alpha   defconfig
h8300allyesconfig
xtensa   allyesconfig
arc defconfig
sh   allmodconfig
s390defconfig
s390 allmodconfig
parisc  defconfig
parisc64defconfig
parisc   allyesconfig
s390 allyesconfig
i386 allyesconfig
sparcallyesconfig
i386defconfig
i386   debian-10.3-kselftests
i386  debian-10.3
mips allyesconfig
mips allmodconfig
powerpc  allyesconfig
powerpc   allnoconfig
powerpc  allmodconfig
x86_64randconfig-a006
x86_64randconfig-a004
x86_64randconfig-a002
i386  randconfig-a012
i386  randconfig-a014
i386  randconfig-a016
riscv

[PATCH v4 14/14] kbuild: rebuild multi-object modules when objtool is updated

2022-05-08 Thread Masahiro Yamada
When CONFIG_LTO_CLANG or CONFIG_X86_KERNEL_IBT is enabled, objtool for
multi-object modules is postponed until the objects are linked together.

Make sure to re-run objtool and re-link multi-object modules when
objtool is updated.

Signed-off-by: Masahiro Yamada 
Reviewed-by: Kees Cook 
Acked-by: Josh Poimboeuf 
---

Changes in v4:
  - New
Resent of my previous submission

https://lore.kernel.org/linux-kbuild/20210831074004.3195284-11-masahi...@kernel.org/

 scripts/Makefile.build | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index f546b5f1f33f..4e6902e099e8 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -404,13 +404,18 @@ $(obj)/modules.order: $(obj-m) FORCE
 $(obj)/lib.a: $(lib-y) FORCE
$(call if_changed,ar)
 
-quiet_cmd_link_multi-m = LD [M]  $@
-  cmd_link_multi-m = $(LD) $(ld_flags) -r -o $@ @$(patsubst %.o,%.mod,$@) 
$(cmd_objtool)
+quiet_cmd_ld_multi_m = LD [M]  $@
+  cmd_ld_multi_m = $(LD) $(ld_flags) -r -o $@ @$(patsubst %.o,%.mod,$@) 
$(cmd_objtool)
+
+define rule_ld_multi_m
+   $(call cmd_and_savecmd,ld_multi_m)
+   $(call cmd,gen_objtooldep)
+endef
 
 $(multi-obj-m): objtool-enabled := $(delay-objtool)
 $(multi-obj-m): part-of-module := y
 $(multi-obj-m): %.o: %.mod FORCE
-   $(call if_changed,link_multi-m)
+   $(call if_changed_rule,ld_multi_m)
 $(call multi_depend, $(multi-obj-m), .o, -objs -y -m)
 
 targets := $(filter-out $(PHONY), $(targets))
-- 
2.32.0



[PATCH v4 13/14] kbuild: add cmd_and_savecmd macro

2022-05-08 Thread Masahiro Yamada
Separate out the command execution part of if_changed, as we did
for if_changed_dep.

This allows us to reuse it in if_changed_rule.

  define rule_foo
  $(call cmd_and_savecmd,foo)
  $(call cmd,bar)
  endef

Signed-off-by: Masahiro Yamada 
Reviewed-by: Kees Cook 
---

Changes in v4:
  - New.
Resent of my previous submission.
https://lore.kernel.org/all/20210831074004.3195284-10-masahi...@kernel.org/

 scripts/Kbuild.include | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 455a0a6ce12d..ece44b735061 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -142,9 +142,11 @@ check-FORCE = $(if $(filter FORCE, $^),,$(warning FORCE 
prerequisite is missing)
 if-changed-cond = $(newer-prereqs)$(cmd-check)$(check-FORCE)
 
 # Execute command if command has changed or prerequisite(s) are updated.
-if_changed = $(if $(if-changed-cond),\
+if_changed = $(if $(if-changed-cond),$(cmd_and_savecmd),@:)
+
+cmd_and_savecmd =\
$(cmd);  \
-   printf '%s\n' 'cmd_$@ := $(make-cmd)' > $(dot-target).cmd, @:)
+   printf '%s\n' 'cmd_$@ := $(make-cmd)' > $(dot-target).cmd
 
 # Execute the command and also postprocess generated .d dependencies file.
 if_changed_dep = $(if $(if-changed-cond),$(cmd_and_fixdep),@:)
-- 
2.32.0



[PATCH v4 07/14] kbuild: stop merging *.symversions

2022-05-08 Thread Masahiro Yamada
Now modpost reads symbol versions from .*.cmd files.

The merged *.symversions are no longer needed.

Signed-off-by: Masahiro Yamada 
Reviewed-by: Nicolas Schier 
Tested-by: Nathan Chancellor 
---

(no changes since v1)

 scripts/Makefile.build  | 21 ++---
 scripts/link-vmlinux.sh | 15 ---
 2 files changed, 2 insertions(+), 34 deletions(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index ddd9080fc028..dff9220135c4 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -390,17 +390,6 @@ $(obj)/%.asn1.c $(obj)/%.asn1.h: $(src)/%.asn1 
$(objtree)/scripts/asn1_compiler
 $(subdir-builtin): $(obj)/%/built-in.a: $(obj)/% ;
 $(subdir-modorder): $(obj)/%/modules.order: $(obj)/% ;
 
-# combine symversions for later processing
-ifeq ($(CONFIG_LTO_CLANG) $(CONFIG_MODVERSIONS),y y)
-  cmd_update_lto_symversions = \
-   rm -f $@.symversions\
-   $(foreach n, $(filter-out FORCE,$^),\
-   $(if $(shell test -s $(n).symversions && echo y),   \
-   ; cat $(n).symversions >> $@.symversions))
-else
-  cmd_update_lto_symversions = echo >/dev/null
-endif
-
 #
 # Rule to compile a set of .o files into one .a file (without symbol table)
 #
@@ -408,11 +397,8 @@ endif
 quiet_cmd_ar_builtin = AR  $@
   cmd_ar_builtin = rm -f $@; $(AR) cDPrST $@ $(real-prereqs)
 
-quiet_cmd_ar_and_symver = AR  $@
-  cmd_ar_and_symver = $(cmd_update_lto_symversions); $(cmd_ar_builtin)
-
 $(obj)/built-in.a: $(real-obj-y) FORCE
-   $(call if_changed,ar_and_symver)
+   $(call if_changed,ar_builtin)
 
 #
 # Rule to create modules.order file
@@ -432,16 +418,13 @@ $(obj)/modules.order: $(obj-m) FORCE
 #
 # Rule to compile a set of .o files into one .a file (with symbol table)
 #
-quiet_cmd_ar_lib = AR  $@
-  cmd_ar_lib = $(cmd_update_lto_symversions); $(cmd_ar)
 
 $(obj)/lib.a: $(lib-y) FORCE
-   $(call if_changed,ar_lib)
+   $(call if_changed,ar)
 
 ifneq ($(CONFIG_LTO_CLANG)$(CONFIG_X86_KERNEL_IBT),)
 quiet_cmd_link_multi-m = AR [M]  $@
 cmd_link_multi-m = \
-   $(cmd_update_lto_symversions);  \
rm -f $@;   \
$(AR) cDPrsT $@ @$(patsubst %.o,%.mod,$@)
 else
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 6aee2401f3ad..bc94252e920c 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -56,20 +56,6 @@ gen_initcalls()
> .tmp_initcalls.lds
 }
 
-# If CONFIG_LTO_CLANG is selected, collect generated symbol versions into
-# .tmp_symversions.lds
-gen_symversions()
-{
-   info GEN .tmp_symversions.lds
-   rm -f .tmp_symversions.lds
-
-   for o in ${KBUILD_VMLINUX_OBJS} ${KBUILD_VMLINUX_LIBS}; do
-   if [ -f ${o}.symversions ]; then
-   cat ${o}.symversions >> .tmp_symversions.lds
-   fi
-   done
-}
-
 # Link of vmlinux.o used for section mismatch analysis
 # ${1} output file
 modpost_link()
@@ -303,7 +289,6 @@ cleanup()
rm -f .btf.*
rm -f .tmp_System.map
rm -f .tmp_initcalls.lds
-   rm -f .tmp_symversions.lds
rm -f .tmp_vmlinux*
rm -f System.map
rm -f vmlinux
-- 
2.32.0



[PATCH v4 03/14] modpost: split the section mismatch checks into section-check.c

2022-05-08 Thread Masahiro Yamada
modpost.c is too big, and about half of the code is for section checks.
Split it out.

I fixed some style issues in the moved code.

Signed-off-by: Masahiro Yamada 
---

Changes in v4:
  - New patch

 scripts/mod/Makefile|2 +-
 scripts/mod/modpost.c   | 1202 +-
 scripts/mod/modpost.h   |   34 +-
 scripts/mod/section-check.c | 1222 +++
 4 files changed, 1240 insertions(+), 1220 deletions(-)
 create mode 100644 scripts/mod/section-check.c

diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile
index c9e38ad937fd..ca739c6c68a1 100644
--- a/scripts/mod/Makefile
+++ b/scripts/mod/Makefile
@@ -5,7 +5,7 @@ CFLAGS_REMOVE_empty.o += $(CC_FLAGS_LTO)
 hostprogs-always-y += modpost mk_elfconfig
 always-y   += empty.o
 
-modpost-objs   := modpost.o file2alias.o sumversion.o
+modpost-objs   := modpost.o section-check.o file2alias.o sumversion.o
 
 devicetable-offsets-file := devicetable-offsets.h
 
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index a78b75f0eeb0..e7e2c70a98f5 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -31,7 +31,7 @@ static bool external_module;
 /* Only warn about unresolved symbols */
 static bool warn_unresolved;
 
-static int sec_mismatch_count;
+int sec_mismatch_count;
 static bool sec_mismatch_warn_only = true;
 /* ignore missing files */
 static bool ignore_missing_files;
@@ -310,8 +310,8 @@ static void add_namespace(struct list_head *head, const 
char *namespace)
}
 }
 
-static void *sym_get_data_by_offset(const struct elf_info *info,
-   unsigned int secindex, unsigned long offset)
+void *sym_get_data_by_offset(const struct elf_info *info,
+unsigned int secindex, unsigned long offset)
 {
Elf_Shdr *sechdr = >sechdrs[secindex];
 
@@ -327,19 +327,17 @@ static void *sym_get_data(const struct elf_info *info, 
const Elf_Sym *sym)
  sym->st_value);
 }
 
-static const char *sech_name(const struct elf_info *info, Elf_Shdr *sechdr)
+const char *sech_name(const struct elf_info *info, Elf_Shdr *sechdr)
 {
return sym_get_data_by_offset(info, info->secindex_strings,
  sechdr->sh_name);
 }
 
-static const char *sec_name(const struct elf_info *info, int secindex)
+const char *sec_name(const struct elf_info *info, int secindex)
 {
return sech_name(info, >sechdrs[secindex]);
 }
 
-#define strstarts(str, prefix) (strncmp(str, prefix, strlen(prefix)) == 0)
-
 static void sym_update_namespace(const char *symname, const char *namespace)
 {
struct symbol *s = find_symbol(symname);
@@ -741,1196 +739,6 @@ static char *get_modinfo(struct elf_info *info, const 
char *tag)
return get_next_modinfo(info, tag, NULL);
 }
 
-/**
- * Test if string s ends in string sub
- * return 0 if match
- **/
-static int strrcmp(const char *s, const char *sub)
-{
-   int slen, sublen;
-
-   if (!s || !sub)
-   return 1;
-
-   slen = strlen(s);
-   sublen = strlen(sub);
-
-   if ((slen == 0) || (sublen == 0))
-   return 1;
-
-   if (sublen > slen)
-   return 1;
-
-   return memcmp(s + slen - sublen, sub, sublen);
-}
-
-static const char *sym_name(struct elf_info *elf, Elf_Sym *sym)
-{
-   if (sym)
-   return elf->strtab + sym->st_name;
-   else
-   return "(unknown)";
-}
-
-/* The pattern is an array of simple patterns.
- * "foo" will match an exact string equal to "foo"
- * "*foo" will match a string that ends with "foo"
- * "foo*" will match a string that begins with "foo"
- * "*foo*" will match a string that contains "foo"
- */
-static int match(const char *sym, const char * const pat[])
-{
-   const char *p;
-   while (*pat) {
-   const char *endp;
-
-   p = *pat++;
-   endp = p + strlen(p) - 1;
-
-   /* "*foo*" */
-   if (*p == '*' && *endp == '*') {
-   char *bare = NOFAIL(strndup(p + 1, strlen(p) - 2));
-   char *here = strstr(sym, bare);
-
-   free(bare);
-   if (here != NULL)
-   return 1;
-   }
-   /* "*foo" */
-   else if (*p == '*') {
-   if (strrcmp(sym, p + 1) == 0)
-   return 1;
-   }
-   /* "foo*" */
-   else if (*endp == '*') {
-   if (strncmp(sym, p, strlen(p) - 1) == 0)
-   return 1;
-   }
-   /* no wildcards */
-   else {
-   if (strcmp(p, sym) == 0)
-   return 1;
-   }
-   }
-   /* no match */
-   return 0;
-}
-
-/* sections that we do not want to do full section mismatch check 

[PATCH v4 11/14] kbuild: make built-in.a rule robust against too long argument error

2022-05-08 Thread Masahiro Yamada
Kbuild runs at the top of objtree instead of changing the working
directory to subdirectories. I think this design is nice overall but
some commands have a scalability issue.

The build command of built-in.a is one of them whose length scales with:

O(D * N)

Here, D is the length of the directory path (i.e. the $(obj)/ prefix),
N is the number of objects in the Makefile, and O() is big-O notation.

The deeper the directory in which the Makefile is located, the more easily
it will hit the 'too long argument' error.

We can make it better. Trim the $(obj)/ prefix with Make's builtin function,
and restore it with a shell command (sed).

With this, the command length scales with:

O(D + N)

In-tree modules still have some room before the limit (ARG_MAX=2097152),
but this is more future-proof for big modules in a deep directory.

For example, you can build i915 as builtin (CONFIG_DRM_I915=y) and
compare drivers/gpu/drm/i915/.built-in.a.cmd with/without this commit.

Signed-off-by: Masahiro Yamada 
Reviewed-by: Nicolas Schier 
Tested-by: Nathan Chancellor 
---

(no changes since v2)

Changes in v2:
  - New patch

 scripts/Makefile.build | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index c2a173b3fd60..8f1a355df7aa 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -374,7 +374,10 @@ $(subdir-modorder): $(obj)/%/modules.order: $(obj)/% ;
 #
 
 quiet_cmd_ar_builtin = AR  $@
-  cmd_ar_builtin = rm -f $@; $(AR) cDPrST $@ $(real-prereqs)
+  cmd_ar_builtin = rm -f $@; \
+   echo $(patsubst $(obj)/%,%,$(real-prereqs)) | \
+   sed -E 's:([^ ]+):$(obj)/\1:g' | \
+   xargs $(AR) cDPrST $@
 
 $(obj)/built-in.a: $(real-obj-y) FORCE
$(call if_changed,ar_builtin)
-- 
2.32.0



[PATCH v4 10/14] kbuild: check static EXPORT_SYMBOL* by script instead of modpost

2022-05-08 Thread Masahiro Yamada
The 'static' specifier and EXPORT_SYMBOL() are an odd combination.

Commit 15bfc2348d54 ("modpost: check for static EXPORT_SYMBOL*
functions") tried to detect it, but this check has false negatives.

Here is the sample code.

  Makefile:

obj-y += foo1.o foo2.o

  foo1.c:

#include 
static void foo(void) {}
EXPORT_SYMBOL(foo);

  foo2.c:

void foo(void) {}

foo1.c exports the static symbol 'foo', but modpost cannot catch it
because it is fooled by foo2.c, which has a global symbol with the
same name.

s->is_static is cleared if a global symbol with the same name is found
somewhere, but EXPORT_SYMBOL() and the global symbol do not necessarily
belong to the same compilation unit.

This check should be done per compilation unit, but I do not know how
to do it in modpost. modpost runs against vmlinux.o or modules, each of
which merges multiple objects and then forgets their origin.

It is true modpost gets access to the lists of all the member objects
(.vmlinux.objs and *.mod), but it is impossible to parse individual
objects in modpost; they might be LLVM IR instead of ELF when
CONFIG_LTO_CLANG=y.

Add a simple bash script to parse the output from ${NM}. This works for
CONFIG_LTO_CLANG=y because llvm-nm can dump symbols of LLVM bitcode.

Revert 15bfc2348d54.
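
For completeness, when the new check fires, the fix is either to drop the
bogus EXPORT_SYMBOL() or to make the definition global. A hypothetical
corrected foo1.c, assuming the export really is wanted (and the clashing
definition in foo2.c is renamed):

    #include <linux/export.h>

    /* no longer static, so exporting it is legitimate */
    void foo(void) {}
    EXPORT_SYMBOL(foo);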

Signed-off-by: Masahiro Yamada 
---

Changes in v4:
  - New patch

 scripts/Makefile.build |  4 
 scripts/check-local-export | 48 ++
 scripts/mod/modpost.c  | 28 +-
 3 files changed, 53 insertions(+), 27 deletions(-)
 create mode 100755 scripts/check-local-export

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 838ea5e83174..c2a173b3fd60 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -244,9 +244,12 @@ cmd_gen_ksymdeps = \
$(CONFIG_SHELL) $(srctree)/scripts/gen_ksymdeps.sh $@ >> 
$(dot-target).cmd
 endif
 
+cmd_check_local_export = $(srctree)/scripts/check-local-export $@
+
 define rule_cc_o_c
$(call cmd_and_fixdep,cc_o_c)
$(call cmd,gen_ksymdeps)
+   $(call cmd,check_local_export)
$(call cmd,checksrc)
$(call cmd,checkdoc)
$(call cmd,gen_objtooldep)
@@ -257,6 +260,7 @@ endef
 define rule_as_o_S
$(call cmd_and_fixdep,as_o_S)
$(call cmd,gen_ksymdeps)
+   $(call cmd,check_local_export)
$(call cmd,gen_objtooldep)
$(call cmd,gen_symversions_S)
 endef
diff --git a/scripts/check-local-export b/scripts/check-local-export
new file mode 100755
index ..d1721fa63057
--- /dev/null
+++ b/scripts/check-local-export
@@ -0,0 +1,48 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Copyright (C) 2022 Masahiro Yamada
+
+set -e
+set -o pipefail
+
+declare -A symbol_types
+declare -a export_symbols
+
+exit_code=0
+
+while read value type name
+do
+   # to avoid error for clang LTO; $name may be empty
+   if [[ $value = -* && -z $name ]]; then
+   continue
+   fi
+
+   # The first field (value) may be empty. If so, fix it up.
+   if [[ -z $name ]]; then
+  name=${type}
+  type=${value}
+   fi
+
+   # save (name, type) in the associative array
+   symbol_types[$name]=$type
+
+   # append the exported symbol to the array
+   if [[ $name == __ksymtab_* ]]; then
+   export_symbols+=(${name#__ksymtab_})
+   fi
+done < <(${NM} ${1} 2>/dev/null)
+
+# Catch error in the process substitution
+wait $!
+
+for name in "${export_symbols[@]}"
+do
+   # nm(3) says "If lowercase, the symbol is usually local"
+   if [[ ${symbol_types[$name]} =~ [a-z] ]]; then
+   echo "$@: error: local symbol '${name}' was exported" >&2
+   exit_code=1
+   fi
+done
+
+exit ${exit_code}
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 018527d96680..fa73ddb6a6cf 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -212,7 +212,6 @@ struct symbol {
unsigned int crc;
bool crc_valid;
bool weak;
-   bool is_static; /* true if symbol is not global */
bool is_gpl_only;   /* exported by EXPORT_SYMBOL_GPL */
char name[];
 };
@@ -242,7 +241,7 @@ static struct symbol *alloc_symbol(const char *name)
 
memset(s, 0, sizeof(*s));
strcpy(s->name, name);
-   s->is_static = true;
+
return s;
 }
 
@@ -875,20 +874,6 @@ static void read_symbols(const char *modname)
 sym_get_data(, sym));
}
 
-   // check for static EXPORT_SYMBOL_* functions && global vars
-   for (sym = info.symtab_start; sym < info.symtab_stop; sym++) {
-   unsigned char bind = ELF_ST_BIND(sym->st_info);
-
-   if (bind == STB_GLOBAL || bind == STB_WEAK) {
-   struct symbol *s =
-   find_symbol(remove_dot(info.strtab +
-  

[PATCH v4 12/14] kbuild: make *.mod rule robust against too long argument error

2022-05-08 Thread Masahiro Yamada
Like built-in.a, the command length of the *.mod rule scales with
the depth of the directory times the number of objects in the Makefile.

Add the $(obj)/ prefix with a shell command (awk) instead of with Make's
builtin function.

In-tree modules still have some room before the limit (ARG_MAX=2097152),
but this is more future-proof for big modules in a deep directory.

For example, you can build i915 as a module (CONFIG_DRM_I915=m) and
compare drivers/gpu/drm/i915/.i915.mod.cmd with/without this commit.

The issue is more critical for external modules because the M= path
can be very long as Jeff Johnson reported before [1].

[1] 
https://lore.kernel.org/linux-kbuild/4c02050c4e95e4cb8cc04282695f8...@codeaurora.org/

Signed-off-by: Masahiro Yamada 
Reviewed-by: Nicolas Schier 
Tested-by: Nathan Chancellor 
---

(no changes since v2)

Changes in v2:
  - New patch

 scripts/Makefile.build | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 8f1a355df7aa..f546b5f1f33f 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -270,8 +270,8 @@ $(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE
$(call if_changed_rule,cc_o_c)
$(call cmd,force_checksrc)
 
-cmd_mod = echo $(addprefix $(obj)/, $(call real-search, $*.o, .o, -objs -y 
-m)) | \
-   $(AWK) -v RS='( |\n)' '!x[$$0]++' > $@
+cmd_mod = echo $(call real-search, $*.o, .o, -objs -y -m) | \
+   $(AWK) -v RS='( |\n)' '!x[$$0]++ { print("$(obj)/"$$0) }' > $@
 
 $(obj)/%.mod: FORCE
$(call if_changed,mod)
-- 
2.32.0



[PATCH v4 06/14] kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS

2022-05-08 Thread Masahiro Yamada
include/{linux,asm-generic}/export.h defines a weak symbol, __crc_*,
as a placeholder.

Genksyms writes the version CRCs into the linker script, which will be
used for filling the __crc_* symbols. The linker script format depends
on CONFIG_MODULE_REL_CRCS. If it is enabled, __crc_* holds the offset
to the reference of CRC.

It is time to get rid of this complexity.

Now that modpost parses text files (.*.cmd) to collect all the CRCs,
it can generate C code that will be linked to the vmlinux or modules.

Generate a new C file, .vmlinux.export.c, which contains the CRCs of
symbols exported by vmlinux. It is compiled and linked to vmlinux in
scripts/link-vmlinux.sh.

Put the CRCs of symbols exported by modules into the existing *.mod.c
files. No additional build step is needed for modules. As before,
*.mod.c are compiled and linked to *.ko in scripts/Makefile.modfinal.

No linker magic is used here. The new C implementation works in the
same way, whether CONFIG_RELOCATABLE is enabled or not.
CONFIG_MODULE_REL_CRCS is no longer needed.
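
For illustration, the generated C might look roughly like the sketch below;
the macro name and the CRC values are assumptions made up for this example
(the real definitions live in include/linux/export-internal.h):

    /* hypothetical excerpt of a generated .vmlinux.export.c */
    #include <linux/export-internal.h>

    SYMBOL_CRC(memcpy, 0x12345678, "");             /* EXPORT_SYMBOL */
    SYMBOL_CRC(kmalloc, 0x9abcdef0, "_gpl");        /* EXPORT_SYMBOL_GPL */

Each such line would simply define the __crc_<symbol> value in an ordinary
compiled object rather than in a linker script.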

Previously, Kbuild invoked an additional $(LD) step to update the CRCs in
objects, but this step is unneeded too.

Signed-off-by: Masahiro Yamada 
Tested-by: Nathan Chancellor 
---

Changes in v4:
  - Rename .vmlinux-symver.c to .vmlinux.export.c
because I notice this approach is useful for further cleanups,
not only for modversioning but also for overall EXPORT_SYMBOL.

Changes in v3:
  - New patch

 arch/powerpc/Kconfig|  1 -
 arch/s390/Kconfig   |  1 -
 arch/um/Kconfig |  1 -
 include/asm-generic/export.h| 22 --
 include/linux/export-internal.h | 16 
 include/linux/export.h  | 30 --
 init/Kconfig|  4 
 kernel/module.c | 10 +-
 scripts/Makefile.build  | 27 ---
 scripts/genksyms/genksyms.c | 17 -
 scripts/link-vmlinux.sh | 18 +-
 scripts/mod/modpost.c   | 28 
 12 files changed, 78 insertions(+), 97 deletions(-)
 create mode 100644 include/linux/export-internal.h

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 174edabb74fa..a4e8dd889e29 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -566,7 +566,6 @@ config RELOCATABLE
bool "Build a relocatable kernel"
depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
select NONSTATIC_KERNEL
-   select MODULE_REL_CRCS if MODVERSIONS
help
  This builds a kernel image that is capable of running at the
  location the kernel is loaded at. For ppc32, there is no any
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 77b5a03de13a..aa5848004c76 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -567,7 +567,6 @@ endchoice
 
 config RELOCATABLE
bool "Build a relocatable kernel"
-   select MODULE_REL_CRCS if MODVERSIONS
default y
help
  This builds a kernel image that retains relocation information
diff --git a/arch/um/Kconfig b/arch/um/Kconfig
index 4d398b80aea8..e8983d098e73 100644
--- a/arch/um/Kconfig
+++ b/arch/um/Kconfig
@@ -106,7 +106,6 @@ config LD_SCRIPT_DYN
bool
default y
depends on !LD_SCRIPT_STATIC
-   select MODULE_REL_CRCS if MODVERSIONS
 
 config LD_SCRIPT_DYN_RPATH
bool "set rpath in the binary" if EXPERT
diff --git a/include/asm-generic/export.h b/include/asm-generic/export.h
index 07a36a874dca..51ce72ce80fa 100644
--- a/include/asm-generic/export.h
+++ b/include/asm-generic/export.h
@@ -2,6 +2,14 @@
 #ifndef __ASM_GENERIC_EXPORT_H
 #define __ASM_GENERIC_EXPORT_H
 
+/*
+ * This comment block is used by fixdep. Please do not remove.
+ *
+ * When CONFIG_MODVERSIONS is changed from n to y, all source files having
+ * EXPORT_SYMBOL variants must be re-compiled because genksyms is run as a
+ * side effect of the .o build rule.
+ */
+
 #ifndef KSYM_FUNC
 #define KSYM_FUNC(x) x
 #endif
@@ -12,9 +20,6 @@
 #else
 #define KSYM_ALIGN 4
 #endif
-#ifndef KCRC_ALIGN
-#define KCRC_ALIGN 4
-#endif
 
 .macro __put, val, name
 #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
@@ -43,17 +48,6 @@ __ksymtab_\name:
 __kstrtab_\name:
.asciz "\name"
.previous
-#ifdef CONFIG_MODVERSIONS
-   .section ___kcrctab\sec+\name,"a"
-   .balign KCRC_ALIGN
-#if defined(CONFIG_MODULE_REL_CRCS)
-   .long __crc_\name - .
-#else
-   .long __crc_\name
-#endif
-   .weak __crc_\name
-   .previous
-#endif
 #endif
 .endm
 
diff --git a/include/linux/export-internal.h b/include/linux/export-internal.h
new file mode 100644
index ..77175d561058
--- /dev/null
+++ b/include/linux/export-internal.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Please do not include this explicitly.
+ * This is used by C files generated by modpost.
+ */
+
+#ifndef __LINUX_EXPORT_INTERNAL_H__

[PATCH v4 08/14] genksyms: adjust the output format to modpost

2022-05-08 Thread Masahiro Yamada
Make genksyms output symbol versions in the format modpost expects,
so the 'sed' is unneeded.
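
For illustration, a consumer of the new format only needs to scan the
.*.cmd file for those lines; a rough sketch of parsing one of them (not the
actual modpost code, and parse_symver_line() is a made-up name):

    #include <stdio.h>

    /* Rough sketch: extract "<name> <crc>" from a "#SYMVER <name> 0x<crc>" line. */
    static int parse_symver_line(const char *line, char name[64], unsigned int *crc)
    {
            return sscanf(line, "#SYMVER %63s %x", name, crc) == 2 ? 0 : -1;
    }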

This commit makes *.symversions completely unneeded.

I will keep *.symversions in .gitignore and 'make clean' for a while.
Otherwise, 'git status' might be surprising.

Signed-off-by: Masahiro Yamada 
Reviewed-by: Nicolas Schier 
Tested-by: Nathan Chancellor 
---

(no changes since v2)

Changes in v2:
  - New patch

 scripts/Makefile.build  | 6 --
 scripts/genksyms/genksyms.c | 3 +--
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index dff9220135c4..461998a2ad2b 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -165,16 +165,10 @@ ifdef CONFIG_MODVERSIONS
 # o modpost will extract versions from that file and create *.c files that will
 #   be compiled and linked to the kernel and/or modules.
 
-genksyms_format := __crc_\(.*\) = \(.*\);
-
 gen_symversions =  
\
if $(NM) $@ 2>/dev/null | grep -q __ksymtab; then   
\
$(call 
cmd_gensymtypes_$(1),$(KBUILD_SYMTYPES),$(@:.o=.symtypes)) \
-   > $@.symversions;   
\
-   sed -n 's/$(genksyms_format)/$(pound)SYMVER \1 \2/p' 
$@.symversions \
>> $(dot-target).cmd;   
\
-   else
\
-   rm -f $@.symversions;   
\
fi
 
 cmd_gen_symversions_c =$(call gen_symversions,c)
diff --git a/scripts/genksyms/genksyms.c b/scripts/genksyms/genksyms.c
index 6e6933ae7911..f5dfdb9d80e9 100644
--- a/scripts/genksyms/genksyms.c
+++ b/scripts/genksyms/genksyms.c
@@ -680,8 +680,7 @@ void export_symbol(const char *name)
if (flag_dump_defs)
fputs(">\n", debugfile);
 
-   /* Used as a linker script. */
-   printf("__crc_%s = 0x%08lx;\n", name, crc);
+   printf("#SYMVER %s 0x%08lx\n", name, crc);
}
 }
 
-- 
2.32.0



[PATCH v4 09/14] kbuild: do not create *.prelink.o for Clang LTO or IBT

2022-05-08 Thread Masahiro Yamada
When CONFIG_LTO_CLANG=y, an additional intermediate *.prelink.o is created
for each module. Also, objtool is postponed until the LLVM bitcode is
converted to ELF.

CONFIG_X86_KERNEL_IBT works in a similar way to postpone objtool until
objects are merged together.

This commit stops generating *.prelink.o, so the build flow will look
the same with/without LTO.

The following figures show how the LTO build currently works, and
how this commit is changing it.

Current build flow
==================

 [1] single-object module

    foo.c  --$(CC)-->  foo.o (LLVM bitcode)
    foo.o  --$(LD) + objtool-->  foo.prelink.o (ELF)
    foo.prelink.o + foo.mod.o  --$(LD)-->  foo.ko

 [2] multi-object module

    foo1.c  --$(CC)-->  foo1.o (LLVM bitcode)
    foo2.c  --$(CC)-->  foo2.o (LLVM bitcode)
    foo1.o + foo2.o  --$(AR)-->  foo.o (archive)
    foo.o  --$(LD) + objtool-->  foo.prelink.o (ELF)
    foo.prelink.o + foo.mod.o  --$(LD)-->  foo.ko

  One source of confusion is that foo.o in a multi-object module is an
  archive despite its suffix.

New build flow
==============

 [1] single-object module

  Since there is only one object, we do not need to have the LLVM
  bitcode stage. Use $(CC)+$(LD) to generate an ELF object in one
  build rule. When LTO is disabled, $(LD) is unneeded because $(CC)
  produces an ELF object.

    foo.c  --$(CC) + $(LD) + objtool-->  foo.o (ELF)
    foo.o + foo.mod.o  --$(LD)-->  foo.ko

 [2] multi-object module

  Previously, $(AR) was used to combine LLVM bitcode into an archive,
  but there was no technical reason to do so.
  This commit just uses $(LD) to combine and convert them into a single
  ELF object.

    foo1.c  --$(CC)-->  foo1.o (LLVM bitcode)
    foo2.c  --$(CC)-->  foo2.o (LLVM bitcode)
    foo1.o + foo2.o  --$(LD) + objtool-->  foo.o (ELF)
    foo.o + foo.mod.o  --$(LD)-->  foo.ko

Signed-off-by: Masahiro Yamada 
Reviewed-by: Nicolas Schier 
Tested-by: Nathan Chancellor 
---

(no changes since v2)

Changes in v2:
 - replace the chain of $(if ...) with $(and )

 scripts/Kbuild.include|  4 +++
 scripts/Makefile.build| 58 ---
 scripts/Makefile.lib  |  7 -
 scripts/Makefile.modfinal |  5 ++--
 scripts/Makefile.modpost  |  9 ++
 scripts/mod/modpost.c |  7 -
 6 files changed, 25 insertions(+), 65 deletions(-)

diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 3514c2149e9d..455a0a6ce12d 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -15,6 +15,10 @@ pound := \#
 # Name of target with a '.' as filename prefix. foo/bar.o => foo/.bar.o
 dot-target = $(dir $@).$(notdir $@)
 
+###
+# Name of target with a '.tmp_' as filename prefix. foo/bar.o => foo/.tmp_bar.o
+tmp-target = $(dir $@).tmp_$(notdir $@)
+
 ###
 # The temporary file to save gcc -MMD generated dependencies must not
 # contain a comma
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 461998a2ad2b..838ea5e83174 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -88,10 +88,6 @@ endif
 targets-for-modules := $(foreach x, o mod $(if $(CONFIG_TRIM_UNUSED_KSYMS), usyms), \
 				$(patsubst %.o, %.$x, $(filter %.o, $(obj-m))))
 
-ifneq ($(CONFIG_LTO_CLANG)$(CONFIG_X86_KERNEL_IBT),)
-targets-for-modules += $(patsubst %.o, %.prelink.o, $(filter %.o, $(obj-m)))
-endif
-
 ifdef need-modorder
 targets-for-modules += $(obj)/modules.order
 endif
@@ -152,8 +148,16 @@ $(obj)/%.ll: $(src)/%.c FORCE
 # The C file is compiled and updated dependency information is generated.
 # (See cmd_cc_o_c + relevant part of rule_cc_o_c)
 
+is-single-obj-m = $(and $(part-of-module),$(filter $@, $(obj-m)),y)
+
+ifdef CONFIG_LTO_CLANG
+cmd_ld_single_m = $(if $(is-single-obj-m), ; $(LD) $(ld_flags) -r -o $(tmp-target) $@; mv $(tmp-target) $@)
+endif
+
 quiet_cmd_cc_o_c = CC $(quiet_modtag)  $@
-  cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $< $(cmd_objtool)
+  cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $< \
+   $(cmd_ld_single_m) \
+   $(cmd_objtool)
 
 ifdef CONFIG_MODVERSIONS
 # When module versioning is enabled the following steps are executed:
@@ -224,21 +228,16 @@ cmd_gen_objtooldep = $(if $(objtool-enabled), { echo ; echo '$@: $$(wildcard $(o
 
 endif # CONFIG_STACK_VALIDATION
 
-ifneq ($(CONFIG_LTO_CLANG)$(CONFIG_X86_KERNEL_IBT),)
-
-# Skip objtool for LLVM bitcode
-$(obj)/%.o: objtool-enabled :=
-

[PATCH v4 02/14] modpost: change the license of EXPORT_SYMBOL to bool type

2022-05-08 Thread Masahiro Yamada
There were more EXPORT_SYMBOL types in the past. The following commits
removed unused ones.

 - f1c3d73e973c ("module: remove EXPORT_SYMBOL_GPL_FUTURE")
 - 367948220fce ("module: remove EXPORT_UNUSED_SYMBOL*")

There are 3 remaining entries in enum export, but export_unknown does
not make any sense because we never expect a situation like "we do not
know how it was exported".

If the symbol name starts with "__ksymtab_", but the section name
does not start with "___ksymtab+" or "___ksymtab_gpl+", it is not an
exported symbol.

It occurs when a variable starting with "__ksymtab_" is directly defined:

   int __ksymtab_foo;

Presumably, there is no practical issue with using such an odd variable
name (but there is no good reason to do so, either).

Anyway, that is not an exported symbol. Setting export_unknown is not
the right thing to do. Do not call sym_add_exported() in this case.

With pointless export_unknown removed, the export type finally becomes
boolean (either EXPORT_SYMBOL or EXPORT_SYMBOL_GPL).

I renamed the field to is_gpl_only. EXPORT_SYMBOL_GPL sets it to true,
and only GPL-compatible modules can use such a symbol.
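
As a rough standalone illustration (not the modpost code; names and values
are made up), the licence check that the boolean enables reduces to
something like:

#include <stdbool.h>
#include <stdio.h>

struct symbol {
	const char *name;
	bool is_gpl_only;	/* exported by EXPORT_SYMBOL_GPL */
};

/* EXPORT_SYMBOL_GPL symbols are usable only by GPL-compatible modules */
static bool export_usable(const struct symbol *s, bool mod_is_gpl_compatible)
{
	return !s->is_gpl_only || mod_is_gpl_compatible;
}

int main(void)
{
	struct symbol s = { .name = "foo", .is_gpl_only = true };

	printf("%s usable by a non-GPL module: %s\n",
	       s.name, export_usable(&s, false) ? "yes" : "no");
	return 0;
}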

I removed the orphan comment, "How a symbol is exported", which is
unrelated to sec_mismatch_count. It is about enum export.
See commit bd5cbcedf446 ("kbuild: export-type enhancement to modpost.c")

Signed-off-by: Masahiro Yamada 
Reviewed-by: Nicolas Schier 
Tested-by: Nathan Chancellor 
---

Changes in v4:
  - Rebase again because I dropped
 
https://patchwork.kernel.org/project/linux-kbuild/patch/20220501084032.1025918-11-masahi...@kernel.org/
  - Remove warning message because I plan to change this hunk again in a later 
commit
  - Remove orphan comment

Changes in v3:
  - New patch

 scripts/mod/modpost.c | 108 --
 1 file changed, 30 insertions(+), 78 deletions(-)

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index d9efbd5b31a6..a78b75f0eeb0 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -30,7 +30,7 @@ static bool all_versions;
 static bool external_module;
 /* Only warn about unresolved symbols */
 static bool warn_unresolved;
-/* How a symbol is exported */
+
 static int sec_mismatch_count;
 static bool sec_mismatch_warn_only = true;
 /* ignore missing files */
@@ -47,12 +47,6 @@ static bool error_occurred;
 #define MAX_UNRESOLVED_REPORTS 10
 static unsigned int nr_unresolved;
 
-enum export {
-   export_plain,
-   export_gpl,
-   export_unknown
-};
-
 /* In kernel, this size is defined in linux/module.h;
  * here we use Elf_Addr instead of long for covering cross-compile
  */
@@ -219,7 +213,7 @@ struct symbol {
bool crc_valid;
bool weak;
bool is_static; /* true if symbol is not global */
-   enum export  export;   /* Type of export */
+   bool is_gpl_only;   /* exported by EXPORT_SYMBOL_GPL */
char name[];
 };
 
@@ -316,34 +310,6 @@ static void add_namespace(struct list_head *head, const char *namespace)
}
 }
 
-static const struct {
-   const char *str;
-   enum export export;
-} export_list[] = {
-   { .str = "EXPORT_SYMBOL",.export = export_plain },
-   { .str = "EXPORT_SYMBOL_GPL",.export = export_gpl },
-   { .str = "(unknown)",.export = export_unknown },
-};
-
-
-static const char *export_str(enum export ex)
-{
-   return export_list[ex].str;
-}
-
-static enum export export_no(const char *s)
-{
-   int i;
-
-   if (!s)
-   return export_unknown;
-   for (i = 0; export_list[i].export != export_unknown; i++) {
-   if (strcmp(export_list[i].str, s) == 0)
-   return export_list[i].export;
-   }
-   return export_unknown;
-}
-
 static void *sym_get_data_by_offset(const struct elf_info *info,
unsigned int secindex, unsigned long offset)
 {
@@ -374,18 +340,6 @@ static const char *sec_name(const struct elf_info *info, int secindex)
 
 #define strstarts(str, prefix) (strncmp(str, prefix, strlen(prefix)) == 0)
 
-static enum export export_from_secname(struct elf_info *elf, unsigned int sec)
-{
-   const char *secname = sec_name(elf, sec);
-
-   if (strstarts(secname, "___ksymtab+"))
-   return export_plain;
-   else if (strstarts(secname, "___ksymtab_gpl+"))
-   return export_gpl;
-   else
-   return export_unknown;
-}
-
 static void sym_update_namespace(const char *symname, const char *namespace)
 {
struct symbol *s = find_symbol(symname);
@@ -405,7 +359,7 @@ static void sym_update_namespace(const char *symname, const char *namespace)
 }
 
 static struct symbol *sym_add_exported(const char *name, struct module *mod,
-  enum export export)
+  bool gpl_only)
 {
struct symbol *s = find_symbol(name);
 
@@ -417,7 +371,7 @@ static struct symbol 

[PATCH v4 05/14] modpost: extract symbol versions from *.cmd files

2022-05-08 Thread Masahiro Yamada
Currently, CONFIG_MODVERSIONS needs an extra link step to embed the
symbol versions into ELF objects. Then, modpost extracts the version
CRCs from them.

The following figures show how it currently works, and how I am trying
to change it.

Current implementation
==
            $(CC)                  $(LD)
    *.c ----------> *.o ---------------------> *.o ---> modpost ---> *.mod.c ---> final link for
      \                      embed CRC ^                                          vmlinux or module
       \   genksyms                    |
        `---------------> *.symversions

Genksyms outputs the calculated CRCs in the form of a linker script
(*.symversions), which is used by $(LD) to update the object.

If CONFIG_LTO_CLANG=y, the build process is much more complex. Embedding
the CRCs is postponed until the LLVM bitcode is converted into ELF,
creating another intermediate *.prelink.o.

However, this complexity is unneeded. There is no reason why we must
embed version CRCs in objects so early.

There is a final link stage for vmlinux (scripts/link-vmlinux.sh) and
for modules (scripts/Makefile.modfinal). We can link the CRCs at the
very last moment.

New implementation
==
            $(CC)
    *.c ----------> *.o ----------------------> modpost ---> .vmlinux.export.c ---> final link for
      \                                        /        \--> *.mod.c ------------>  vmlinux or module
       \   genksyms                           /
        `---------------> *.cmd ------------'

Pass the symbol versions to modpost as separate text data, which are
available in *.cmd files.

This commit changes modpost to extract CRCs from *.cmd files instead of
from ELF objects.
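
A tiny standalone sketch of the record format being parsed (the symbol
name and CRC are made up; this is not the modpost parser itself):

#include <stdio.h>

int main(void)
{
	const char *line = "#SYMVER foo 0x0000abcd";	/* hypothetical .cmd record */
	char name[64];
	unsigned int crc;

	if (sscanf(line, "#SYMVER %63s %x", name, &crc) == 2)
		printf("symbol %s, CRC 0x%08x\n", name, crc);
	return 0;
}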

Signed-off-by: Masahiro Yamada 
Reviewed-by: Nicolas Schier 
Tested-by: Nathan Chancellor 
---

(no changes since v2)

Changes in v2:
  - Simplify the implementation (parse .cmd files after ELF)

 scripts/mod/modpost.c | 177 ++
 1 file changed, 129 insertions(+), 48 deletions(-)

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index fc5db1f73cf1..54f957952723 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -381,19 +381,10 @@ static struct symbol *sym_add_exported(const char *name, struct module *mod,
return s;
 }
 
-static void sym_set_crc(const char *name, unsigned int crc)
+static void sym_set_crc(struct symbol *sym, unsigned int crc)
 {
-   struct symbol *s = find_symbol(name);
-
-   /*
-* Ignore stand-alone __crc_*, which might be auto-generated symbols
-* such as __*_veneer in ARM ELF.
-*/
-   if (!s)
-   return;
-
-   s->crc = crc;
-   s->crc_valid = true;
+   sym->crc = crc;
+   sym->crc_valid = true;
 }
 
 static void *grab_file(const char *filename, size_t *size)
@@ -616,33 +607,6 @@ static int ignore_undef_symbol(struct elf_info *info, const char *symname)
return 0;
 }
 
-static void handle_modversion(const struct module *mod,
- const struct elf_info *info,
- const Elf_Sym *sym, const char *symname)
-{
-   unsigned int crc;
-
-   if (sym->st_shndx == SHN_UNDEF) {
-   warn("EXPORT symbol \"%s\" [%s%s] version generation failed, 
symbol will not be versioned.\n"
-"Is \"%s\" prototyped in ?\n",
-symname, mod->name, mod->is_vmlinux ? "" : ".ko",
-symname);
-
-   return;
-   }
-
-   if (sym->st_shndx == SHN_ABS) {
-   crc = sym->st_value;
-   } else {
-   unsigned int *crcp;
-
-   /* symbol points to the CRC in the ELF object */
-   crcp = sym_get_data(info, sym);
-   crc = TO_NATIVE(*crcp);
-   }
-   sym_set_crc(symname, crc);
-}
-
 static void handle_symbol(struct module *mod, struct elf_info *info,
  const Elf_Sym *sym, const char *symname)
 {
@@ -760,6 +724,102 @@ static char *remove_dot(char *s)
return s;
 }
 
+/*
+ * The CRCs are recorded in .*.cmd files in the form of:
+ * #SYMVER <name> <crc>
+ */
+static void extract_crcs_for_object(const char *object, struct module *mod)
+{
+   char cmd_file[PATH_MAX];
+   char *buf, *p;
+   const char *base;
+   int dirlen, ret;
+
+   base = strrchr(object, '/');
+   if (base) {
+   base++;
+   dirlen = base - object;
+   } else {
+   dirlen = 0;
+   base = object;
+   }
+
+   ret = snprintf(cmd_file, sizeof(cmd_file), "%.*s.%s.cmd",
+   

[PATCH v4 00/14] kbuild: yet another series of cleanups (modpost, LTO, MODULE_REL_CRCS, export.h)

2022-05-08 Thread Masahiro Yamada
This is the third batch of cleanups in this development cycle.

Major changes in v4:
 - Move static EXPORT_SYMBOL check to a script
 - Some refactoring

Major changes in v3:

 - Generate symbol CRCs as C code, and remove CONFIG_MODULE_REL_CRCS.

Major changes in v2:

 - V1 did not work with CONFIG_MODULE_REL_CRCS.
   I fixed this for v2.

 - Reflect some review comments in v1

 - Refactor the code more

 - Avoid too long argument error



Masahiro Yamada (14):
  modpost: remove left-over cross_compile declaration
  modpost: change the license of EXPORT_SYMBOL to bool type
  modpost: split the section mismatch checks into section-check.c
  modpost: add sym_find_with_module() helper
  modpost: extract symbol versions from *.cmd files
  kbuild: link symbol CRCs at final link, removing
CONFIG_MODULE_REL_CRCS
  kbuild: stop merging *.symversions
  genksyms: adjust the output format to modpost
  kbuild: do not create *.prelink.o for Clang LTO or IBT
  kbuild: check static EXPORT_SYMBOL* by script instead of modpost
  kbuild: make built-in.a rule robust against too long argument error
  kbuild: make *.mod rule robust against too long argument error
  kbuild: add cmd_and_savecmd macro
  kbuild: rebuild multi-object modules when objtool is updated

 arch/powerpc/Kconfig|1 -
 arch/s390/Kconfig   |1 -
 arch/um/Kconfig |1 -
 include/asm-generic/export.h|   22 +-
 include/linux/export-internal.h |   16 +
 include/linux/export.h  |   30 +-
 init/Kconfig|4 -
 kernel/module.c |   10 +-
 scripts/Kbuild.include  |   10 +-
 scripts/Makefile.build  |  134 +--
 scripts/Makefile.lib|7 -
 scripts/Makefile.modfinal   |5 +-
 scripts/Makefile.modpost|9 +-
 scripts/check-local-export  |   48 +
 scripts/genksyms/genksyms.c |   18 +-
 scripts/link-vmlinux.sh |   33 +-
 scripts/mod/Makefile|2 +-
 scripts/mod/modpost.c   | 1499 ---
 scripts/mod/modpost.h   |   35 +-
 scripts/mod/section-check.c | 1222 +
 20 files changed, 1551 insertions(+), 1556 deletions(-)
 create mode 100644 include/linux/export-internal.h
 create mode 100755 scripts/check-local-export
 create mode 100644 scripts/mod/section-check.c

-- 
2.32.0



[PATCH v4 01/14] modpost: remove left-over cross_compile declaration

2022-05-08 Thread Masahiro Yamada
This is a remnant of commit 6543becf26ff ("mod/file2alias: make
modalias generation safe for cross compiling").

Signed-off-by: Masahiro Yamada 
---

Changes in v4:
  - New patch

 scripts/mod/modpost.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/scripts/mod/modpost.h b/scripts/mod/modpost.h
index cfa127d2bb8f..d9daeff07b83 100644
--- a/scripts/mod/modpost.h
+++ b/scripts/mod/modpost.h
@@ -174,7 +174,6 @@ static inline unsigned int get_secindex(const struct 
elf_info *info,
 }
 
 /* file2alias.c */
-extern unsigned int cross_build;
 void handle_moddevtable(struct module *mod, struct elf_info *info,
Elf_Sym *sym, const char *symname);
 void add_moddevtable(struct buffer *buf, struct module *mod);
-- 
2.32.0



[PATCH v4 04/14] modpost: add sym_find_with_module() helper

2022-05-08 Thread Masahiro Yamada
find_symbol() returns the first symbol found in the hash table. This
table is global, so it may return a symbol from an unexpected module.

There is a case where we want to search for a symbol with a given name
in a specified module.

Add sym_find_with_module(), which receives the module pointer as the
second argument. It is equivalent to find_symbol() if NULL is passed
as the module pointer.
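
A toy, userspace-only illustration of the lookup semantics (the real
modpost implementation uses its hash table; everything below is made up
for demonstration):

#include <stdio.h>
#include <string.h>

struct module { const char *name; };

struct symbol {
	const char *name;
	const struct module *module;
};

static const struct module mod_a = { "mod_a" }, mod_b = { "mod_b" };

/* toy stand-in for the global symbol table */
static const struct symbol table[] = {
	{ "foo", &mod_a },
	{ "foo", &mod_b },
};

static const struct symbol *toy_sym_find_with_module(const char *name,
						     const struct module *mod)
{
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (strcmp(table[i].name, name) == 0 &&
		    (!mod || table[i].module == mod))
			return &table[i];
	}
	return NULL;
}

int main(void)
{
	/* NULL module: the first match wins, like find_symbol() */
	printf("any module: %s\n", toy_sym_find_with_module("foo", NULL)->module->name);
	/* explicit module: only that module's entry matches */
	printf("mod_b only: %s\n", toy_sym_find_with_module("foo", &mod_b)->module->name);
	return 0;
}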

Signed-off-by: Masahiro Yamada 
Reviewed-by: Nicolas Schier 
Tested-by: Nathan Chancellor 
---

Changes in v4:
  - Only takes the new helper from

https://patchwork.kernel.org/project/linux-kbuild/patch/20220505072244.1155033-2-masahi...@kernel.org/

Changes in v2:
  - Rename the new func to sym_find_with_module()

 scripts/mod/modpost.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index e7e2c70a98f5..fc5db1f73cf1 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -266,7 +266,7 @@ static void sym_add_unresolved(const char *name, struct module *mod, bool weak)
 	list_add_tail(&sym->list, &mod->unresolved_symbols);
 }
 
-static struct symbol *find_symbol(const char *name)
+static struct symbol *sym_find_with_module(const char *name, struct module *mod)
 {
struct symbol *s;
 
@@ -275,12 +275,17 @@ static struct symbol *find_symbol(const char *name)
name++;
 
 	for (s = symbolhash[tdb_hash(name) % SYMBOL_HASH_SIZE]; s; s = s->next) {
-   if (strcmp(s->name, name) == 0)
+   if (strcmp(s->name, name) == 0 && (!mod || s->module == mod))
return s;
}
return NULL;
 }
 
+static struct symbol *find_symbol(const char *name)
+{
+   return sym_find_with_module(name, NULL);
+}
+
 struct namespace_list {
struct list_head list;
char namespace[];
-- 
2.32.0



Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.18-4 tag

2022-05-08 Thread pr-tracker-bot
The pull request you sent on Sun, 08 May 2022 22:13:14 +1000:

> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
> tags/powerpc-5.18-4

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/e3de3a1cda5fdc3ac42cb0d45321fb254500595f

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [PATCH v3 00/15] kbuild: yet another series of cleanups (modpost, LTO, MODULE_REL_CRCS)

2022-05-08 Thread Masahiro Yamada
On Thu, May 5, 2022 at 4:24 PM Masahiro Yamada  wrote:
>
>
> This is the third batch of cleanups in this development cycle.
>
> Major changes in v3:
>
>  - Generate symbol CRCs as C code, and remove CONFIG_MODULE_REL_CRCS.
>
> Major changes in v2:
>
>  - V1 did not work with CONFIG_MODULE_REL_CRCS.
>I fixed this for v2.
>
>  - Reflect some review comments in v1
>
>  - Refactor the code more
>
>  - Avoid too long argument error
>
>
> Masahiro Yamada (15):
>   modpost: mitigate false-negatives for static EXPORT_SYMBOL checks
>   modpost: change the license of EXPORT_SYMBOL to bool type
>   modpost: merge add_{intree_flag,retpoline,staging_flag} to add_header
>   modpost: move *.mod.c generation to write_mod_c_files()
>   kbuild: generate a list of objects in vmlinux
>   kbuild: record symbol versions in *.cmd files
>   modpost: extract symbol versions from *.cmd files
>   kbuild: link symbol CRCs at final link, removing
> CONFIG_MODULE_REL_CRCS
>   kbuild: stop merging *.symversions
>   genksyms: adjust the output format to modpost
>   kbuild: do not create *.prelink.o for Clang LTO or IBT
>   modpost: simplify the ->is_static initialization
>   modpost: use hlist for hash table implementation
>   kbuild: make built-in.a rule robust against too long argument error
>   kbuild: make *.mod rule robust against too long argument error


Only 03-06 were applied.

I will send v4 for the rest.
(I rewrote the static EXPORT checks).

>
>  arch/powerpc/Kconfig |   1 -
>  arch/s390/Kconfig|   1 -
>  arch/um/Kconfig  |   1 -
>  include/asm-generic/export.h |  22 +-
>  include/linux/export.h   |  30 +--
>  include/linux/symversion.h   |  13 +
>  init/Kconfig |   4 -
>  kernel/module.c  |  10 +-
>  scripts/Kbuild.include   |   4 +
>  scripts/Makefile.build   | 118 +++--
>  scripts/Makefile.lib |   7 -
>  scripts/Makefile.modfinal|   5 +-
>  scripts/Makefile.modpost |   9 +-
>  scripts/genksyms/genksyms.c  |  18 +-
>  scripts/link-vmlinux.sh  |  46 ++--
>  scripts/mod/file2alias.c |   2 -
>  scripts/mod/list.h   |  52 
>  scripts/mod/modpost.c| 449 ---
>  scripts/mod/modpost.h|   2 +
>  19 files changed, 402 insertions(+), 392 deletions(-)
>  create mode 100644 include/linux/symversion.h
>
> --
> 2.32.0
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to clang-built-linux+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/clang-built-linux/20220505072244.1155033-1-masahiroy%40kernel.org.



-- 
Best Regards
Masahiro Yamada


Re: [PATCH v6 00/23] Rust support

2022-05-08 Thread Matthew Wilcox
On Sat, May 07, 2022 at 01:06:18AM -0700, Kees Cook wrote:
> On Sat, May 07, 2022 at 07:23:58AM +0200, Miguel Ojeda wrote:
> > ## Patch series status
> > 
> > The Rust support is still to be considered experimental. However,
> > support is good enough that kernel developers can start working on the
> > Rust abstractions for subsystems and write drivers and other modules.
> 
> I'd really like to see this landed for a few reasons:
> 
> - It's under active development, and I'd rather review the changes
>   "normally", incrementally, etc. Right now it can be hard to re-review
>   some of the "mostly the same each version" patches in the series.
> 
> - I'd like to break the catch-22 of "ask for a new driver to be
>   written in rust but the rust support isn't landed" vs "the rust
>   support isn't landed because there aren't enough drivers". It
>   really feels like "release early, release often" is needed here;
>   it's hard to develop against -next. :)
> 
> Should we give it a try for this coming merge window?

I'm broadly in favour of that.  It's just code, we can always drop it
again or fix it.  There's sufficient development community around it
that it's hardly going to become abandonware.



Re: [PATCH v2 2/3] mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when migration

2022-05-08 Thread Baolin Wang

Hi,

On 5/8/2022 8:01 PM, kernel test robot wrote:

Hi Baolin,

I love your patch! Yet something to improve:

[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on next-20220506]
[cannot apply to hnaz-mm/master arm64/for-next/core linus/master v5.18-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/intel-lab-lkp/linux/commits/Baolin-Wang/Fix-CONT-PTE-PMD-size-hugetlb-issue-when-unmapping-or-migrating/20220508-174036
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git 
mm-everything
config: x86_64-randconfig-a013 
(https://download.01.org/0day-ci/archive/20220508/202205081910.mstoc5rj-...@intel.com/config)
compiler: gcc-11 (Debian 11.2.0-20) 11.2.0
reproduce (this is a W=1 build):
 # 
https://github.com/intel-lab-lkp/linux/commit/907981b27213707fdb2f8a24c107d6752a09a773
 git remote add linux-review https://github.com/intel-lab-lkp/linux
 git fetch --no-tags linux-review 
Baolin-Wang/Fix-CONT-PTE-PMD-size-hugetlb-issue-when-unmapping-or-migrating/20220508-174036
 git checkout 907981b27213707fdb2f8a24c107d6752a09a773
 # save the config file
 mkdir build_dir && cp config build_dir/.config
 make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

mm/rmap.c: In function 'try_to_migrate_one':

mm/rmap.c:1931:34: error: implicit declaration of function 
'huge_ptep_clear_flush'; did you mean 'ptep_clear_flush'? 
[-Werror=implicit-function-declaration]

 1931 | pteval = huge_ptep_clear_flush(vma, 
address, pvmw.pte);
  |  ^
  |  ptep_clear_flush

mm/rmap.c:1931:34: error: incompatible types when assigning to type 'pte_t' 
from type 'int'
mm/rmap.c:2023:41: error: implicit declaration of function 'set_huge_pte_at'; 
did you mean 'set_huge_swap_pte_at'? [-Werror=implicit-function-declaration]

 2023 | set_huge_pte_at(mm, 
address, pvmw.pte, pteval);
  | ^~~
  | set_huge_swap_pte_at
cc1: some warnings being treated as errors


Thanks for reporting. I think I should add some dummy functions in the
hugetlb.h file when CONFIG_HUGETLB_PAGE is not selected. The build passes
with the changes below and your config file.


diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 306d6ef..9f71043 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1093,6 +1093,17 @@ static inline void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr
 					pte_t *ptep, pte_t pte, unsigned long sz)
 {
 }
+
+static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep)
+{
+   return ptep_get(ptep);
+}
+
+static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
+				   pte_t *ptep, pte_t pte)
+{
+}
 #endif /* CONFIG_HUGETLB_PAGE */


Re: [PATCH v2 1/3] mm: change huge_ptep_clear_flush() to return the original pte

2022-05-08 Thread Baolin Wang




On 5/8/2022 7:09 PM, Muchun Song wrote:

On Sun, May 08, 2022 at 05:36:39PM +0800, Baolin Wang wrote:

It is incorrect to use ptep_clear_flush() to nuke a hugetlb page
table when unmapping or migrating a hugetlb page, and will change
to use huge_ptep_clear_flush() instead in the following patches.

So this is a preparation patch, which changes the huge_ptep_clear_flush()
to return the original pte to help to nuke a hugetlb page table.

Signed-off-by: Baolin Wang 
Acked-by: Mike Kravetz 


Reviewed-by: Muchun Song 


Thanks for reviewing.



But one nit below:

[...]

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8605d7e..61a21af 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5342,7 +5342,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct 
vm_area_struct *vma,
ClearHPageRestoreReserve(new_page);
  
  		/* Break COW or unshare */

-   huge_ptep_clear_flush(vma, haddr, ptep);
+   (void)huge_ptep_clear_flush(vma, haddr, ptep);


Why add a "(void)" here? Is there any warning if no "(void)"?
IIUC, I think we can remove this, right?


I did not see any warning without the cast, but this is per Mike's
comment[1], to keep the code consistent with other functions in
hugetlb.c that explicitly cast to void.


[1] 
https://lore.kernel.org/all/495c4ebe-a5b4-afb6-4cb0-956c1b18d...@oracle.com/


[GIT PULL] Please pull powerpc/linux.git powerpc-5.18-4 tag

2022-05-08 Thread Michael Ellerman
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Hi Linus,

Please pull some more powerpc fixes for 5.18:

The following changes since commit bb82c574691daf8f7fa9a160264d15c5804cb769:

  powerpc/perf: Fix 32bit compile (2022-04-21 23:26:47 +1000)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
tags/powerpc-5.18-4

for you to fetch changes up to 348c71344111d7a48892e3e52264ff11956fc196:

  powerpc/papr_scm: Fix buffer overflow issue with CONFIG_FORTIFY_SOURCE 
(2022-05-06 12:44:03 +1000)

- --
powerpc fixes for 5.18 #4

 - Fix the DWARF CFI in our VDSO time functions, allowing gdb to backtrace 
through them
   correctly.

 - Fix a buffer overflow in the papr_scm driver, only triggerable by hypervisor 
input.

 - A fix in the recently added QoS handling for VAS (used for communicating with
   coprocessors).

Thanks to: Alan Modra, Haren Myneni, Kajol Jain, Segher Boessenkool.

- --
Haren Myneni (1):
  powerpc/pseries/vas: Use QoS credits from the userspace

Kajol Jain (1):
  powerpc/papr_scm: Fix buffer overflow issue with CONFIG_FORTIFY_SOURCE

Michael Ellerman (1):
  powerpc/vdso: Fix incorrect CFI in gettimeofday.S


 arch/powerpc/kernel/vdso/gettimeofday.S|  9 ++--
 arch/powerpc/platforms/pseries/papr_scm.c  |  7 ++
 arch/powerpc/platforms/pseries/vas-sysfs.c | 19 +++-
 arch/powerpc/platforms/pseries/vas.c   | 23 ++--
 arch/powerpc/platforms/pseries/vas.h   |  2 +-
 5 files changed, 36 insertions(+), 24 deletions(-)
-BEGIN PGP SIGNATURE-

iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmJ3s60ACgkQUevqPMjh
pYCE+xAAk+ButiF8vXxyO0/sWvW8F2qkGDvUlGn8Dwo8q8AaA70nCvzztcnBMScE
KrUjJFOEAiQUKXCVsczWAcxQwPAkD6myTaoseUBNTc+fdeLiWzpAGRY9FTMR54M6
UtPtiSCUnz2UJnU4gIfAEYGGsnF2PMKnBnEV4ROFNqqIAihmQjW7oU7iLq4kNSX6
YOE5UPUpPSuyJgI1/KlseUuEsH/Hz0Fc3AvSEel+/pfTdPaIxed7Oxr116HsOHqJ
Lda88F+4Tdk0OSC9Q9gzbyqQsvpIe2OTt9FQEuBbSAEV+eUbWuwBI44UVkpDDg/C
HlcmxAGAoulLXTKrnt3RkjonLZuVwGCTgCJe9zTzWG00n1XzO6mvEuphyixlPsow
7Ej5QLSWkGMZhZO+wTcJpgcCcZ4TEYtpf3T5iBR2DlcftIgmlJtmSS99mwgMZ7ct
LaHYJDOlSRCtxQipAeHBtybe/ngsxYIdCjNlumbEbYY6tUg5+6jY8DMkJ6KFHAfk
82h241dByF0YDW1HpG5D+RGpEvxTpQrFYhE9XPdOqQ07mwOzIg9DMmCLXrIofETV
Ywb5+jY3DlpCZz0nxOHA+5SO1fealq8ZC4ZDKO3FErgqsUUCjuZJUbSLtFHGGRsF
HIg+xDoXRpiGWwpIqrgozu2xxYE4AbDhe+sOVvF4APHTXIuP0+U=
=Lyfd
-END PGP SIGNATURE-


Re: [PATCH v2 2/3] mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when migration

2022-05-08 Thread kernel test robot
Hi Baolin,

I love your patch! Yet something to improve:

[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on next-20220506]
[cannot apply to hnaz-mm/master arm64/for-next/core linus/master v5.18-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/intel-lab-lkp/linux/commits/Baolin-Wang/Fix-CONT-PTE-PMD-size-hugetlb-issue-when-unmapping-or-migrating/20220508-174036
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git 
mm-everything
config: x86_64-randconfig-a014 
(https://download.01.org/0day-ci/archive/20220508/202205081950.ipkfnyip-...@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 
a385645b470e2d3a1534aae618ea56b31177639f)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/intel-lab-lkp/linux/commit/907981b27213707fdb2f8a24c107d6752a09a773
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review 
Baolin-Wang/Fix-CONT-PTE-PMD-size-hugetlb-issue-when-unmapping-or-migrating/20220508-174036
git checkout 907981b27213707fdb2f8a24c107d6752a09a773
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> mm/rmap.c:1931:13: error: call to undeclared function 
>> 'huge_ptep_clear_flush'; ISO C99 and later do not support implicit function 
>> declarations [-Wimplicit-function-declaration]
   pteval = huge_ptep_clear_flush(vma, address, 
pvmw.pte);
^
   mm/rmap.c:1931:13: note: did you mean 'ptep_clear_flush'?
   include/linux/pgtable.h:431:14: note: 'ptep_clear_flush' declared here
   extern pte_t ptep_clear_flush(struct vm_area_struct *vma,
^
>> mm/rmap.c:1931:11: error: assigning to 'pte_t' from incompatible type 'int'
   pteval = huge_ptep_clear_flush(vma, address, 
pvmw.pte);
  ^ 
~
>> mm/rmap.c:2023:6: error: call to undeclared function 'set_huge_pte_at'; ISO 
>> C99 and later do not support implicit function declarations 
>> [-Wimplicit-function-declaration]
   set_huge_pte_at(mm, address, 
pvmw.pte, pteval);
   ^
   mm/rmap.c:2035:6: error: call to undeclared function 'set_huge_pte_at'; ISO 
C99 and later do not support implicit function declarations 
[-Wimplicit-function-declaration]
   set_huge_pte_at(mm, address, 
pvmw.pte, pteval);
   ^
   4 errors generated.


vim +/huge_ptep_clear_flush +1931 mm/rmap.c

  1883  
  1884  /* Unexpected PMD-mapped THP? */
  1885  VM_BUG_ON_FOLIO(!pvmw.pte, folio);
  1886  
  1887  subpage = folio_page(folio,
  1888  pte_pfn(*pvmw.pte) - folio_pfn(folio));
  1889  address = pvmw.address;
  1890  anon_exclusive = folio_test_anon(folio) &&
  1891   PageAnonExclusive(subpage);
  1892  
  1893  if (folio_test_hugetlb(folio)) {
  1894  /*
  1895   * huge_pmd_unshare may unmap an entire PMD 
page.
  1896   * There is no way of knowing exactly which 
PMDs may
  1897   * be cached for this mm, so we must flush them 
all.
  1898   * start/end were already adjusted above to 
cover this
  1899   * range.
  1900   */
  1901  flush_cache_range(vma, range.start, range.end);
  1902  
  1903  if (!folio_test_anon(folio)) {
  1904  /*
  1905   * To call huge_pmd_unshare, 
i_mmap_rwsem must be
  1906   * held in write mode.  Caller needs to 
explicitly
  1907   * do this outside rmap routines.
  1908   */
  1909  VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
  1910  
  1911  			if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
  1912   

Re: [PATCH v2 2/3] mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when migration

2022-05-08 Thread kernel test robot
Hi Baolin,

I love your patch! Yet something to improve:

[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on next-20220506]
[cannot apply to hnaz-mm/master arm64/for-next/core linus/master v5.18-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/intel-lab-lkp/linux/commits/Baolin-Wang/Fix-CONT-PTE-PMD-size-hugetlb-issue-when-unmapping-or-migrating/20220508-174036
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git 
mm-everything
config: x86_64-randconfig-a013 
(https://download.01.org/0day-ci/archive/20220508/202205081910.mstoc5rj-...@intel.com/config)
compiler: gcc-11 (Debian 11.2.0-20) 11.2.0
reproduce (this is a W=1 build):
# 
https://github.com/intel-lab-lkp/linux/commit/907981b27213707fdb2f8a24c107d6752a09a773
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review 
Baolin-Wang/Fix-CONT-PTE-PMD-size-hugetlb-issue-when-unmapping-or-migrating/20220508-174036
git checkout 907981b27213707fdb2f8a24c107d6752a09a773
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   mm/rmap.c: In function 'try_to_migrate_one':
>> mm/rmap.c:1931:34: error: implicit declaration of function 
>> 'huge_ptep_clear_flush'; did you mean 'ptep_clear_flush'? 
>> [-Werror=implicit-function-declaration]
1931 | pteval = huge_ptep_clear_flush(vma, address, 
pvmw.pte);
 |  ^
 |  ptep_clear_flush
>> mm/rmap.c:1931:34: error: incompatible types when assigning to type 'pte_t' 
>> from type 'int'
>> mm/rmap.c:2023:41: error: implicit declaration of function 
>> 'set_huge_pte_at'; did you mean 'set_huge_swap_pte_at'? 
>> [-Werror=implicit-function-declaration]
2023 | set_huge_pte_at(mm, address, 
pvmw.pte, pteval);
 | ^~~
 | set_huge_swap_pte_at
   cc1: some warnings being treated as errors


vim +1931 mm/rmap.c

  1883  
  1884  /* Unexpected PMD-mapped THP? */
  1885  VM_BUG_ON_FOLIO(!pvmw.pte, folio);
  1886  
  1887  subpage = folio_page(folio,
  1888  pte_pfn(*pvmw.pte) - folio_pfn(folio));
  1889  address = pvmw.address;
  1890  anon_exclusive = folio_test_anon(folio) &&
  1891   PageAnonExclusive(subpage);
  1892  
  1893  if (folio_test_hugetlb(folio)) {
  1894  /*
  1895   * huge_pmd_unshare may unmap an entire PMD 
page.
  1896   * There is no way of knowing exactly which 
PMDs may
  1897   * be cached for this mm, so we must flush them 
all.
  1898   * start/end were already adjusted above to 
cover this
  1899   * range.
  1900   */
  1901  flush_cache_range(vma, range.start, range.end);
  1902  
  1903  if (!folio_test_anon(folio)) {
  1904  /*
  1905   * To call huge_pmd_unshare, 
i_mmap_rwsem must be
  1906   * held in write mode.  Caller needs to 
explicitly
  1907   * do this outside rmap routines.
  1908   */
  1909  VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
  1910  
  1911  			if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
  1912  flush_tlb_range(vma, 
range.start, range.end);
  1913  
mmu_notifier_invalidate_range(mm, range.start,
  1914
range.end);
  1915  
  1916  /*
  1917   * The ref count of the PMD 
page was dropped
  1918   * which is part of the way map 
counting
  1919   * is done for shared PMDs.  
Return 'true'
  1920   * here.  When there is no 
other sharing,
  1921 

Re: [PATCH] powerpc/pseries/vas: Use QoS credits from the userspace

2022-05-08 Thread Michael Ellerman
On Sat, 19 Mar 2022 02:28:09 -0700, Haren Myneni wrote:
> The user can change the QoS credits dynamically with the
> management console interface which notifies OS with sysfs. After
> returning from the OS interface successfully, the management
> console updates the hypervisor. Since the VAS capabilities in
> the hypervisor is not updated when the OS gets the update,
> the kernel is using the old total credits value from the
> hypervisor. Fix this issue by using the new QoS credits
> from the userspace instead of depending on VAS capabilities
> from the hypervisor.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/pseries/vas: Use QoS credits from the userspace
  https://git.kernel.org/powerpc/c/57831bfb5e78777dc399e351ed68ef77c3aee385

cheers


Re: [PATCH] powerpc/vdso: Fix incorrect CFI in gettimeofday.S

2022-05-08 Thread Michael Ellerman
On Mon, 2 May 2022 22:50:10 +1000, Michael Ellerman wrote:
> As reported by Alan, the CFI (Call Frame Information) in the VDSO time
> routines is incorrect since commit ce7d8056e38b ("powerpc/vdso: Prepare
> for switching VDSO to generic C implementation.").
> 
> In particular the changes to the frame address register (r1) are not
> properly described, which prevents gdb from being able to generate a
> backtrace from inside VDSO functions, eg:
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/vdso: Fix incorrect CFI in gettimeofday.S
  https://git.kernel.org/powerpc/c/6d65028eb67dbb7627651adfc460d64196d38bd8

cheers


Re: [PATCH] powerpc/papr_scm: Fix buffer overflow issue with CONFIG_FORTIFY_SOURCE

2022-05-08 Thread Michael Ellerman
On Thu, 5 May 2022 21:04:51 +0530, Kajol Jain wrote:
> With CONFIG_FORTIFY_SOURCE enabled, string functions will also perform
> dynamic checks for string size which can panic the kernel,
> like incase of overflow detection.
> 
> In papr_scm, papr_scm_pmu_check_events function uses stat->stat_id
> with string operations, to populate the nvdimm_events_map array.
> Since stat_id variable is not NULL terminated, the kernel panics
> with CONFIG_FORTIFY_SOURCE enabled at boot time.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/papr_scm: Fix buffer overflow issue with CONFIG_FORTIFY_SOURCE
  https://git.kernel.org/powerpc/c/348c71344111d7a48892e3e52264ff11956fc196

cheers


[PATCH v2 2/3] mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when migration

2022-05-08 Thread Baolin Wang
Some architectures (like ARM64) support CONT-PTE/PMD size hugetlb,
which means they support not only PMD/PUD size hugetlb pages (2M and
1G), but also CONT-PTE/PMD sizes (64K and 32M) if a 4K base page size
is specified.

When migrating a hugetlb page, we will get the relevant page table
entry by huge_pte_offset() only once to nuke it and remap it with
a migration pte entry. This is correct for PMD or PUD size hugetlb,
since they always contain only one pmd entry or pud entry in the
page table.

However, this is incorrect for CONT-PTE and CONT-PMD size hugetlb,
since they can contain several contiguous pte or pmd entries with the
same page table attributes. So we will nuke or remap only one pte or
pmd entry for a CONT-PTE/PMD size hugetlb page, which is not what is
expected for hugetlb migration. The problem is that the subpages' data
of a hugetlb page can still be modified while the page is being
migrated, which can cause a serious data consistency issue, since we
did not nuke the page table entries and set migration ptes for all the
subpages of the hugetlb page.

To fix this issue, change to use huge_ptep_clear_flush() to nuke a
hugetlb page table, and remap it with set_huge_pte_at() and
set_huge_swap_pte_at() when migrating a hugetlb page; these helpers
already take CONT-PTE and CONT-PMD size hugetlb into account.
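
As a toy, userspace-only sketch of the dispatch this implies (stubs only;
the real kernel helpers take mm/address/pte arguments and handle the
contiguous entries themselves):

#include <stdbool.h>
#include <stdio.h>

/* stand-ins for the kernel helpers, only to show which one gets called */
static void set_pte_at_stub(const char *what)      { printf("%s: set_pte_at\n", what); }
static void set_huge_pte_at_stub(const char *what) { printf("%s: set_huge_pte_at\n", what); }

static void restore_entry(bool folio_is_hugetlb, const char *what)
{
	/* hugetlb folios must use the huge_* helper, which knows how many
	 * contiguous entries (CONT-PTE/CONT-PMD) back the page */
	if (folio_is_hugetlb)
		set_huge_pte_at_stub(what);
	else
		set_pte_at_stub(what);
}

int main(void)
{
	restore_entry(true, "64K CONT-PTE hugetlb page");
	restore_entry(false, "4K page");
	return 0;
}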

Signed-off-by: Baolin Wang 
---
 mm/rmap.c | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 6fdd198..7cf2408 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1924,13 +1924,15 @@ static bool try_to_migrate_one(struct folio *folio, 
struct vm_area_struct *vma,
break;
}
}
+
+   /* Nuke the hugetlb page table entry */
+   pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
} else {
flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
+   /* Nuke the page table entry. */
+   pteval = ptep_clear_flush(vma, address, pvmw.pte);
}
 
-   /* Nuke the page table entry. */
-   pteval = ptep_clear_flush(vma, address, pvmw.pte);
-
/* Set the dirty flag on the folio now the pte is gone. */
if (pte_dirty(pteval))
folio_mark_dirty(folio);
@@ -2015,7 +2017,10 @@ static bool try_to_migrate_one(struct folio *folio, 
struct vm_area_struct *vma,
pte_t swp_pte;
 
if (arch_unmap_one(mm, vma, address, pteval) < 0) {
-   set_pte_at(mm, address, pvmw.pte, pteval);
+   if (folio_test_hugetlb(folio))
+   set_huge_pte_at(mm, address, pvmw.pte, 
pteval);
+   else
+   set_pte_at(mm, address, pvmw.pte, 
pteval);
ret = false;
page_vma_mapped_walk_done(&pvmw);
break;
@@ -2024,7 +2029,10 @@ static bool try_to_migrate_one(struct folio *folio, 
struct vm_area_struct *vma,
   !anon_exclusive, subpage);
if (anon_exclusive &&
page_try_share_anon_rmap(subpage)) {
-   set_pte_at(mm, address, pvmw.pte, pteval);
+   if (folio_test_hugetlb(folio))
+   set_huge_pte_at(mm, address, pvmw.pte, 
pteval);
+   else
+   set_pte_at(mm, address, pvmw.pte, 
pteval);
ret = false;
page_vma_mapped_walk_done(&pvmw);
break;
@@ -2050,7 +2058,11 @@ static bool try_to_migrate_one(struct folio *folio, 
struct vm_area_struct *vma,
swp_pte = pte_swp_mksoft_dirty(swp_pte);
if (pte_uffd_wp(pteval))
swp_pte = pte_swp_mkuffd_wp(swp_pte);
-   set_pte_at(mm, address, pvmw.pte, swp_pte);
+   if (folio_test_hugetlb(folio))
+   set_huge_swap_pte_at(mm, address, pvmw.pte,
+swp_pte, 
vma_mmu_pagesize(vma));
+   else
+   set_pte_at(mm, address, pvmw.pte, swp_pte);
trace_set_migration_pte(address, pte_val(swp_pte),
compound_order(&folio->page));
/*
-- 
1.8.3.1



[PATCH v2 3/3] mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when unmapping

2022-05-08 Thread Baolin Wang
Some architectures (like ARM64) support CONT-PTE/PMD size hugetlb,
which means they support not only PMD/PUD size hugetlb pages (2M and
1G), but also CONT-PTE/PMD sizes (64K and 32M) if a 4K base page size
is specified.

When unmapping a hugetlb page, we will get the relevant page table
entry by huge_pte_offset() only once to nuke it. This is correct
for PMD or PUD size hugetlb, since they always contain only one
pmd entry or pud entry in the page table.

However, this is incorrect for CONT-PTE and CONT-PMD size hugetlb,
since they can contain several contiguous pte or pmd entries with the
same page table attributes, so we will nuke only one pte or pmd entry
for a CONT-PTE/PMD size hugetlb page.

Now try_to_unmap() is only passed a hugetlb page in the case where the
hugetlb page is poisoned. This means we will unmap only one pte entry
for a CONT-PTE or CONT-PMD size poisoned hugetlb page, and the other
subpages of that poisoned hugetlb page can still be accessed, which may
cause serious issues.

So, to fix this issue, change to use huge_ptep_clear_flush() to nuke
the hugetlb page table; it already takes CONT-PTE and CONT-PMD size
hugetlb into account.

We already use set_huge_swap_pte_at() to set a poisoned swap entry for
a poisoned hugetlb page. Meanwhile, add a VM_BUG_ON() in try_to_unmap()
to make sure the hugetlb page passed in is indeed poisoned.

Signed-off-by: Baolin Wang 
---
 mm/rmap.c | 39 ++-
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 7cf2408..37c8fd2 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1530,6 +1530,11 @@ static bool try_to_unmap_one(struct folio *folio, struct 
vm_area_struct *vma,
 
if (folio_test_hugetlb(folio)) {
/*
+* The try_to_unmap() is only passed a hugetlb page
+* in the case where the hugetlb page is poisoned.
+*/
+   VM_BUG_ON_PAGE(!PageHWPoison(subpage), subpage);
+   /*
 * huge_pmd_unshare may unmap an entire PMD page.
 * There is no way of knowing exactly which PMDs may
 * be cached for this mm, so we must flush them all.
@@ -1564,28 +1569,28 @@ static bool try_to_unmap_one(struct folio *folio, 
struct vm_area_struct *vma,
break;
}
}
+   pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
} else {
flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
-   }
-
-   /*
-* Nuke the page table entry. When having to clear
-* PageAnonExclusive(), we always have to flush.
-*/
-   if (should_defer_flush(mm, flags) && !anon_exclusive) {
/*
-* We clear the PTE but do not flush so potentially
-* a remote CPU could still be writing to the folio.
-* If the entry was previously clean then the
-* architecture must guarantee that a clear->dirty
-* transition on a cached TLB entry is written through
-* and traps if the PTE is unmapped.
+* Nuke the page table entry. When having to clear
+* PageAnonExclusive(), we always have to flush.
 */
-   pteval = ptep_get_and_clear(mm, address, pvmw.pte);
+   if (should_defer_flush(mm, flags) && !anon_exclusive) {
+   /*
+* We clear the PTE but do not flush so 
potentially
+* a remote CPU could still be writing to the 
folio.
+* If the entry was previously clean then the
+* architecture must guarantee that a 
clear->dirty
+* transition on a cached TLB entry is written 
through
+* and traps if the PTE is unmapped.
+*/
+   pteval = ptep_get_and_clear(mm, address, 
pvmw.pte);
 
-   set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
-   } else {
-   pteval = ptep_clear_flush(vma, address, pvmw.pte);
+   set_tlb_ubc_flush_pending(mm, 
pte_dirty(pteval));
+   } else {
+   pteval = ptep_clear_flush(vma, address, 
pvmw.pte);
+   }
}
 
/*
-- 
1.8.3.1



[PATCH v2 1/3] mm: change huge_ptep_clear_flush() to return the original pte

2022-05-08 Thread Baolin Wang
It is incorrect to use ptep_clear_flush() to nuke a hugetlb page table
when unmapping or migrating a hugetlb page; the following patches will
change to use huge_ptep_clear_flush() instead.

So this is a preparation patch, which changes huge_ptep_clear_flush()
to return the original pte, to help nuke a hugetlb page table.

Signed-off-by: Baolin Wang 
Acked-by: Mike Kravetz 
---
 arch/arm64/include/asm/hugetlb.h   |  4 ++--
 arch/arm64/mm/hugetlbpage.c| 12 +---
 arch/ia64/include/asm/hugetlb.h|  4 ++--
 arch/mips/include/asm/hugetlb.h|  9 ++---
 arch/parisc/include/asm/hugetlb.h  |  4 ++--
 arch/powerpc/include/asm/hugetlb.h |  9 ++---
 arch/s390/include/asm/hugetlb.h|  6 +++---
 arch/sh/include/asm/hugetlb.h  |  4 ++--
 arch/sparc/include/asm/hugetlb.h   |  4 ++--
 include/asm-generic/hugetlb.h  |  4 ++--
 mm/hugetlb.c   |  2 +-
 11 files changed, 33 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 1242f71..616b2ca 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -39,8 +39,8 @@ extern pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 extern void huge_ptep_set_wrprotect(struct mm_struct *mm,
unsigned long addr, pte_t *ptep);
 #define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
-extern void huge_ptep_clear_flush(struct vm_area_struct *vma,
- unsigned long addr, pte_t *ptep);
+extern pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
+  unsigned long addr, pte_t *ptep);
 #define __HAVE_ARCH_HUGE_PTE_CLEAR
 extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
   pte_t *ptep, unsigned long sz);
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index cbace1c..ca8e65c 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -486,19 +486,17 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot));
 }
 
-void huge_ptep_clear_flush(struct vm_area_struct *vma,
-  unsigned long addr, pte_t *ptep)
+pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
+   unsigned long addr, pte_t *ptep)
 {
size_t pgsize;
int ncontig;
 
-   if (!pte_cont(READ_ONCE(*ptep))) {
-   ptep_clear_flush(vma, addr, ptep);
-   return;
-   }
+   if (!pte_cont(READ_ONCE(*ptep)))
+   return ptep_clear_flush(vma, addr, ptep);
 
ncontig = find_num_contig(vma->vm_mm, addr, ptep, &pgsize);
-   clear_flush(vma->vm_mm, addr, ptep, pgsize, ncontig);
+   return get_clear_flush(vma->vm_mm, addr, ptep, pgsize, ncontig);
 }
 
 static int __init hugetlbpage_init(void)
diff --git a/arch/ia64/include/asm/hugetlb.h b/arch/ia64/include/asm/hugetlb.h
index 7e46ebd..65d3811 100644
--- a/arch/ia64/include/asm/hugetlb.h
+++ b/arch/ia64/include/asm/hugetlb.h
@@ -23,8 +23,8 @@ static inline int is_hugepage_only_range(struct mm_struct *mm,
 #define is_hugepage_only_range is_hugepage_only_range
 
 #define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
-static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
-unsigned long addr, pte_t *ptep)
+static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep)
 {
 }
 
diff --git a/arch/mips/include/asm/hugetlb.h b/arch/mips/include/asm/hugetlb.h
index c214440..fd69c88 100644
--- a/arch/mips/include/asm/hugetlb.h
+++ b/arch/mips/include/asm/hugetlb.h
@@ -43,16 +43,19 @@ static inline pte_t huge_ptep_get_and_clear(struct 
mm_struct *mm,
 }
 
 #define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
-static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
-unsigned long addr, pte_t *ptep)
+static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep)
 {
+   pte_t pte;
+
/*
 * clear the huge pte entry firstly, so that the other smp threads will
 * not get old pte entry after finishing flush_tlb_page and before
 * setting new huge pte entry
 */
-   huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
+   pte = huge_ptep_get_and_clear(vma->vm_mm, addr, ptep);
flush_tlb_page(vma, addr);
+   return pte;
 }
 
 #define __HAVE_ARCH_HUGE_PTE_NONE
diff --git a/arch/parisc/include/asm/hugetlb.h 
b/arch/parisc/include/asm/hugetlb.h
index a69cf9e..25bc560 100644
--- a/arch/parisc/include/asm/hugetlb.h
+++ b/arch/parisc/include/asm/hugetlb.h
@@ -28,8 +28,8 @@ static inline int prepare_hugepage_range(struct file *file,
 }
 
 #define __HAVE_ARCH_HUGE_PTEP_CLEAR_FLUSH
-static inline void 

[PATCH v2 0/3] Fix CONT-PTE/PMD size hugetlb issue when unmapping or migrating

2022-05-08 Thread Baolin Wang
Hi,

When migrating a hugetlb page or unmapping a poisoned hugetlb page, we
currently use ptep_clear_flush() and set_pte_at() to nuke the page table
entry and remap it. This is incorrect for CONT-PTE or CONT-PMD size
hugetlb pages and can cause data consistency issues. This patch set
changes to use the hugetlb-related APIs to fix this issue; please find
details in each patch. Thanks.

Note: Mike pointed out that huge_ptep_get() will only return one specific
value, and it does not take into account the dirty or young bits of
CONT-PTE/PMDs the way huge_ptep_get_and_clear() does [1]. This
inconsistency is not introduced by this patch set and will be addressed
in another thread [2]. Meanwhile, the uffd-for-hugetlb case [3] pointed
out by Gerald also needs another patch to address.

[1] 
https://lore.kernel.org/linux-mm/85bd80b4-b4fd-0d3f-a2e5-149559f2f...@oracle.com/
[2] 
https://lore.kernel.org/all/cover.1651998586.git.baolin.w...@linux.alibaba.com/
[3] https://lore.kernel.org/linux-mm/20220503120343.6264e126@thinkpad/

Changes from v1:
 - Add acked tag from Mike.
 - Update some commit message.
 - Add VM_BUG_ON in try_to_unmap() for hugetlb case.
 - Add an explict void casting for huge_ptep_clear_flush() in hugetlb.c.

Baolin Wang (3):
  mm: change huge_ptep_clear_flush() to return the original pte
  mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when migration
  mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when unmapping

 arch/arm64/include/asm/hugetlb.h   |  4 +--
 arch/arm64/mm/hugetlbpage.c| 12 +++-
 arch/ia64/include/asm/hugetlb.h|  4 +--
 arch/mips/include/asm/hugetlb.h|  9 --
 arch/parisc/include/asm/hugetlb.h  |  4 +--
 arch/powerpc/include/asm/hugetlb.h |  9 --
 arch/s390/include/asm/hugetlb.h|  6 ++--
 arch/sh/include/asm/hugetlb.h  |  4 +--
 arch/sparc/include/asm/hugetlb.h   |  4 +--
 include/asm-generic/hugetlb.h  |  4 +--
 mm/hugetlb.c   |  2 +-
 mm/rmap.c  | 63 --
 12 files changed, 73 insertions(+), 52 deletions(-)

-- 
1.8.3.1



Re: [PATCH 2/3] mm: rmap: Fix CONT-PTE/PMD size hugetlb issue when migration

2022-05-08 Thread Baolin Wang




On 5/7/2022 10:33 AM, Baolin Wang wrote:



On 5/7/2022 1:56 AM, Mike Kravetz wrote:

On 5/5/22 20:39, Baolin Wang wrote:


On 5/6/2022 7:53 AM, Mike Kravetz wrote:

On 4/29/22 01:14, Baolin Wang wrote:

On some architectures (like ARM64), it can support CONT-PTE/PMD size
hugetlb, which means it can support not only PMD/PUD size hugetlb:
2M and 1G, but also CONT-PTE/PMD size: 64K and 32M if a 4K page
size specified.



diff --git a/mm/rmap.c b/mm/rmap.c
index 6fdd198..7cf2408 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1924,13 +1924,15 @@ static bool try_to_migrate_one(struct folio 
*folio, struct vm_area_struct *vma,

   break;
   }
   }
+
+    /* Nuke the hugetlb page table entry */
+    pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
   } else {
   flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
+    /* Nuke the page table entry. */
+    pteval = ptep_clear_flush(vma, address, pvmw.pte);
   }


On arm64 with CONT-PTE/PMD the returned pteval will have dirty or 
young set

if ANY of the PTE/PMDs had dirty or young set.


Right.




-    /* Nuke the page table entry. */
-    pteval = ptep_clear_flush(vma, address, pvmw.pte);
-
   /* Set the dirty flag on the folio now the pte is gone. */
   if (pte_dirty(pteval))
   folio_mark_dirty(folio);
@@ -2015,7 +2017,10 @@ static bool try_to_migrate_one(struct folio 
*folio, struct vm_area_struct *vma,

   pte_t swp_pte;
     if (arch_unmap_one(mm, vma, address, pteval) < 0) {
-    set_pte_at(mm, address, pvmw.pte, pteval);
+    if (folio_test_hugetlb(folio))
+    set_huge_pte_at(mm, address, pvmw.pte, pteval);


And, we will use that pteval for ALL the PTE/PMDs here.  So, we 
would set

the dirty or young bit in ALL PTE/PMDs.

Could that cause any issues?  May be more of a question for the 
arm64 people.


I don't think this will cause any issues, since the hugetlb page can not
be split, and we should not lose the dirty or young state if it was set
for any subpage. Meanwhile, we already do the same in hugetlb.c:


pte = huge_ptep_get_and_clear(mm, address, ptep);
tlb_remove_huge_tlb_entry(h, tlb, ptep, address);
if (huge_pte_dirty(pte))
 set_page_dirty(page);



Agree that it 'should not' cause issues.  It just seems inconsistent.
This is not a problem specifically with your patch, just the handling of
CONT-PTE/PMD entries.

There does not appear to be an arm64 specific version of huge_ptep_get()
that takes CONT-PTE/PMD into account.  So, huge_ptep_get() would only
return the one specific value.  It would not take into account the dirty
or young bits of CONT-PTE/PMDs like your new version of
huge_ptep_get_and_clear.  Is that correct?  Or, am I missing something.


Yes, you are right.



If I am correct, then code like the following may not work:

static int gather_hugetlb_stats(pte_t *pte, unsigned long hmask,
 unsigned long addr, unsigned long end, struct mm_walk 
*walk)

{
 pte_t huge_pte = huge_ptep_get(pte);
 struct numa_maps *md;
 struct page *page;

 if (!pte_present(huge_pte))
 return 0;

 page = pte_page(huge_pte);

 md = walk->private;
 gather_stats(page, md, pte_dirty(huge_pte), 1);
 return 0;
}


Right, this is inconsistent with the current huge_ptep_get() interface, as
you said. So I think we can define an arch-specific huge_ptep_get()
interface for arm64, with some sample code like below. What do you think?
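
(A toy, userspace-only sketch of the idea being discussed, purely for
illustration and not the actual arm64 proposal: fold the dirty/young bits
of every contiguous entry into the value returned for the hugetlb page.)

#include <stdbool.h>
#include <stdio.h>

struct toy_pte { bool dirty, young; };

/* fold the dirty/young bits of all contiguous entries into one value */
static struct toy_pte toy_huge_ptep_get(const struct toy_pte *ptep, int ncontig)
{
	struct toy_pte folded = ptep[0];

	for (int i = 1; i < ncontig; i++) {
		folded.dirty = folded.dirty || ptep[i].dirty;
		folded.young = folded.young || ptep[i].young;
	}
	return folded;
}

int main(void)
{
	struct toy_pte ptes[16] = { { false, false } };	/* 64K = 16 contiguous 4K entries */

	ptes[5].dirty = true;	/* only one subpage's entry is dirty */

	struct toy_pte pte = toy_huge_ptep_get(ptes, 16);
	printf("folded: dirty=%d young=%d\n", pte.dirty, pte.young);
	return 0;
}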


After some investigation, I sent out an RFC patch set [1] to address this
issue. We can discuss it in that thread. Thanks.


[1] 
https://lore.kernel.org/all/cover.1651998586.git.baolin.w...@linux.alibaba.com/