Re: [PATCH v1 0/5] Implement livepatch on PPC32

2021-11-24 Thread Christophe Leroy




On 24/11/2021 at 23:34, Michael Ellerman wrote:

Christophe Leroy  writes:

This series implements livepatch on PPC32.

This is largely copied from what's done on PPC64.

Christophe Leroy (5):
   livepatch: Fix build failure on 32 bits processors
   powerpc/ftrace: No need to read LR from stack in _mcount()
   powerpc/ftrace: Add module_trampoline_target() for PPC32
   powerpc/ftrace: Activate HAVE_DYNAMIC_FTRACE_WITH_REGS on PPC32
   powerpc/ftrace: Add support for livepatch to PPC32


I think we know patch 5 will need a respin because of the STRICT RWX vs
livepatching issue (https://github.com/linuxppc/issues/issues/375).

So should I take patches 2,3,4 for now?



Yes you can take them now I think.

Thanks
Christophe


Re: [PATCH] powerpc/eeh: Delay slot presence check once driver is notified about the pci error.

2021-11-24 Thread Mahesh J Salgaonkar
On 2021-11-24 22:57:13 Wed, Oliver O'Halloran wrote:
> On Wed, Nov 24, 2021 at 7:45 PM Mahesh J Salgaonkar
>  wrote:
> >
> > No it doesn't. We will still do a presence check before the recovery
> > process starts. This patch moves the check after notifying the driver to
> > stop active I/O operations. If a presence check finds the device isn't
> > present, we will skip the EEH recovery. However, on a surprise hotplug,
> > the user will see the EEH messages on the console before it finds there
> > is nothing to recover.
> 
> Suppressing the spurious EEH messages was part of why I added that
> check in the first place. If you want to defer the presence check
> until later you should move the stack trace printing, etc to after
> we've confirmed there are still devices present. Considering the

That will help suppress the spurious EEH messages.

> motivation for this patch is to avoid spurious warnings from the
> driver I don't think printing spurious EEH messages is much of an
> improvement.

Agree.

> 
> The other option would be returning an error from the pseries hotplug
> driver. IIRC that's what pnv_php / OPAL does if the PHB is fenced and
> we can't check the slot presence state.

Yeah. I can change rpaphp_get_sensor_state() to use the
rtas_get_sensor_fast() variant, which returns immediately with an
error on an extended busy error. That way we don't need to move the slot
presence check at all. I did test that and it does fix the problem. But
I wasn't sure if that would have any implications on hotplug driver
behaviour. If pnv_php / OPAL does the same thing then this would be a
cleaner approach to fix this issue. Let me send out the patch with this
other option to fix the pseries hotplug driver instead.

Thanks,
-Mahesh.

-- 
Mahesh J Salgaonkar


Re: [PATCH v2 2/2] powerpc:85xx: fix timebase sync issue when CONFIG_HOTPLUG_CPU=n

2021-11-24 Thread Martin Kennedy
Hi there,

I have bisected OpenWrt master, and then the Linux kernel down to this
change, to confirm that this change causes a kernel panic on my
P1020RDB-based, dual-core Aerohive HiveAP 370, at initialization of
the second CPU:

:
[0.00] Linux version 5.10.80 (labby@lobon)
(powerpc-openwrt-linux-musl-gcc (OpenWrt GCC 11.2.0
r18111+1-ebb6f9287e) 11.2.0, GNU ld (GNU Binutils) 2.37) #0 SMP Thu
Nov 25 02:49:35 2021
[0.00] Using P1020 RDB machine description
:
[0.627233] smp: Bringing up secondary CPUs ...
[0.681659] kernel tried to execute user page (0) - exploit attempt? (uid: 0)
[0.766618] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[0.848899] Faulting instruction address: 0x
[0.908273] Oops: Kernel access of bad area, sig: 11 [#1]
[0.972851] BE PAGE_SIZE=4K SMP NR_CPUS=2 P1020 RDB
[1.031179] Modules linked in:
[1.067640] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.80 #0
[1.139507] NIP:   LR: c0021d2c CTR: 
[1.199921] REGS: c1051cf0 TRAP: 0400   Not tainted  (5.10.80)
[1.269705] MSR:  00021000   CR: 84020202  XER: 
[1.340534]
[1.340534] GPR00: c0021cb8 c1051da8 c1048000 0001 00029000
 0001 
[1.340534] GPR08: 0001  c08b 0040 22000208
 c00032c4 
[1.340534] GPR16:     
 00029000 0001
[1.340534] GPR24: 1240 2000 d240 c080a1f4 0001
c08ae0a8 0001 d240
[1.758220] NIP [] 0x0
[1.794688] LR [c0021d2c] smp_85xx_kick_cpu+0xe8/0x568
[1.856126] Call Trace:
[1.885295] [c1051da8] [c0021cb8] smp_85xx_kick_cpu+0x74/0x568 (unreliable)
[1.968633] [c1051de8] [c0011460] __cpu_up+0xc0/0x228
[2.029038] [c1051e18] [c0031bbc] bringup_cpu+0x30/0x224
[2.092572] [c1051e48] [c0031f3c] cpu_up.constprop.0+0x180/0x33c
[2.164443] [c1051e88] [c00322e8] bringup_nonboot_cpus+0x88/0xc8
[2.236326] [c1051eb8] [c07e67bc] smp_init+0x30/0x78
[2.295698] [c1051ed8] [c07d9e28] kernel_init_freeable+0x118/0x2a8
[2.369641] [c1051f18] [c00032d8] kernel_init+0x14/0x124
[2.433176] [c1051f38] [c0010278] ret_from_kernel_thread+0x14/0x1c
[2.507125] Instruction dump:
[2.542541]      
 
[2.635242]      
 
[2.727952] ---[ end trace 9b796a4bafb6bc14 ]---
[2.783149]
[3.800879] Kernel panic - not syncing: Fatal exception
[3.862353] Rebooting in 1 seconds..
[5.905097] System Halted, OK to turn off power

Without this patch, the kernel no longer panics:

[0.627232] smp: Bringing up secondary CPUs ...
[0.681857] smp: Brought up 1 node, 2 CPUs

Here is the kernel configuration for this built kernel:
https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob_plain;f=target/linux/mpc85xx/config-5.10;hb=HEAD

In case a force-push is needed for the source repository
(https://github.com/Hurricos/openwrt/commit/ad19bdfc77d60ee1c52b41bb4345fdd02284c4cf),
here is the device tree for this board:
https://paste.c-net.org/TrousersSliced

Martin


[PATCH 2/2] tools/perf: Update global/local variants for p_stage_cyc in powerpc

2021-11-24 Thread Athira Rajeev
Update the arch_support_sort_key() function in powerpc
to enable presenting local and global variants of sort
key: p_stage_cyc. Update the "se_header" strings for
these in arch_perf_header_entry() function along with
instruction latency.

Signed-off-by: Athira Rajeev 
Reported-by: Namhyung Kim 
---
 tools/perf/arch/powerpc/util/event.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/powerpc/util/event.c 
b/tools/perf/arch/powerpc/util/event.c
index 3bf441257466..cf430a4c55b9 100644
--- a/tools/perf/arch/powerpc/util/event.c
+++ b/tools/perf/arch/powerpc/util/event.c
@@ -40,8 +40,12 @@ const char *arch_perf_header_entry(const char *se_header)
 {
if (!strcmp(se_header, "Local INSTR Latency"))
return "Finish Cyc";
-   else if (!strcmp(se_header, "Pipeline Stage Cycle"))
+   else if (!strcmp(se_header, "INSTR Latency"))
+   return "Global Finish_cyc";
+   else if (!strcmp(se_header, "Local Pipeline Stage Cycle"))
return "Dispatch Cyc";
+   else if (!strcmp(se_header, "Pipeline Stage Cycle"))
+   return "Global Dispatch_cyc";
return se_header;
 }
 
@@ -49,5 +53,7 @@ int arch_support_sort_key(const char *sort_key)
 {
if (!strcmp(sort_key, "p_stage_cyc"))
return 1;
+   if (!strcmp(sort_key, "local_p_stage_cyc"))
+   return 1;
return 0;
 }
-- 
2.27.0



[PATCH 1/2] tools/perf: Include global and local variants for p_stage_cyc sort key

2021-11-24 Thread Athira Rajeev
Sort key p_stage_cyc is used to present the latency
cycles spent in pipeline stages. The perf tool has a local
p_stage_cyc sort key to display this info. There is no
global variant available for this sort key. The local variant
shows the latency in a single sample, whereas the global value
is useful to present the total latency (sum of
latencies) in the hist entry. It represents the latency
number multiplied by the number of samples.

Add global (p_stage_cyc) and local variant
(local_p_stage_cyc) for this sort key. Use the
local_p_stage_cyc as default option for "mem" sort mode.
Also add this to list of dynamic sort keys.
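
To make the local/global distinction above concrete, here is a minimal
sketch. It is not the perf implementation: the struct and function names
are hypothetical stand-ins, and only the two roles (a per-sample
p_stage_cyc value and the number of samples in the entry) and the
multiplication for the global value are taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of perf's hist entry used here. */
struct sample_entry {
	uint16_t p_stage_cyc;	/* latency of a single sample, in cycles */
	uint32_t nr_events;	/* number of samples folded into the entry */
};

/* Local variant: the latency of one sample. */
uint64_t local_p_stage_cyc(const struct sample_entry *e)
{
	return e->p_stage_cyc;
}

/* Global variant: latency multiplied by the number of samples. */
uint64_t global_p_stage_cyc(const struct sample_entry *e)
{
	return (uint64_t)e->p_stage_cyc * e->nr_events;
}
```

With an entry holding a latency of 12 cycles over 5 samples, the local
value is 12 and the global value is 60.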

Signed-off-by: Athira Rajeev 
Reported-by: Namhyung Kim 
---
 tools/perf/util/hist.c |  4 +++-
 tools/perf/util/hist.h |  3 ++-
 tools/perf/util/sort.c | 34 +-
 tools/perf/util/sort.h |  3 ++-
 4 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index b776465e04ef..0a8033b09e28 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -211,7 +211,9 @@ void hists__calc_col_len(struct hists *hists, struct 
hist_entry *h)
hists__new_col_len(hists, HISTC_MEM_BLOCKED, 10);
hists__new_col_len(hists, HISTC_LOCAL_INS_LAT, 13);
hists__new_col_len(hists, HISTC_GLOBAL_INS_LAT, 13);
-   hists__new_col_len(hists, HISTC_P_STAGE_CYC, 13);
+   hists__new_col_len(hists, HISTC_LOCAL_P_STAGE_CYC, 13);
+   hists__new_col_len(hists, HISTC_GLOBAL_P_STAGE_CYC, 13);
+
if (symbol_conf.nanosecs)
hists__new_col_len(hists, HISTC_TIME, 16);
else
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 5343b62476e6..2752ce681108 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -75,7 +75,8 @@ enum hist_column {
HISTC_MEM_BLOCKED,
HISTC_LOCAL_INS_LAT,
HISTC_GLOBAL_INS_LAT,
-   HISTC_P_STAGE_CYC,
+   HISTC_LOCAL_P_STAGE_CYC,
+   HISTC_GLOBAL_P_STAGE_CYC,
HISTC_NR_COLS, /* Last entry */
 };
 
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index e9216a292a04..e978f7883e07 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -37,7 +37,7 @@ const char	default_parent_pattern[] = 
"^sys_|^do_page_fault";
 const char *parent_pattern = default_parent_pattern;
 const char *default_sort_order = "comm,dso,symbol";
 const char default_branch_sort_order[] = 
"comm,dso_from,symbol_from,symbol_to,cycles";
-const char default_mem_sort_order[] = 
"local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,p_stage_cyc";
+const char default_mem_sort_order[] = 
"local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc";
 const char default_top_sort_order[] = "dso,symbol";
 const char default_diff_sort_order[] = "dso,symbol";
 const char default_tracepoint_sort_order[] = "trace";
@@ -46,8 +46,8 @@ const char	*field_order;
 regex_t		ignore_callees_regex;
 int		have_ignore_callees = 0;
 enum sort_mode sort__mode = SORT_MODE__NORMAL;
-const char *dynamic_headers[] = {"local_ins_lat", "p_stage_cyc"};
-const char *arch_specific_sort_keys[] = {"p_stage_cyc"};
+const char *dynamic_headers[] = {"local_ins_lat", "ins_lat", 
"local_p_stage_cyc", "p_stage_cyc"};
+const char *arch_specific_sort_keys[] = {"local_p_stage_cyc", 
"p_stage_cyc"};
 
 /*
  * Replaces all occurrences of a char used with the:
@@ -1392,22 +1392,37 @@ struct sort_entry sort_global_ins_lat = {
 };
 
 static int64_t
-sort__global_p_stage_cyc_cmp(struct hist_entry *left, struct hist_entry *right)
+sort__p_stage_cyc_cmp(struct hist_entry *left, struct hist_entry *right)
 {
return left->p_stage_cyc - right->p_stage_cyc;
 }
 
+static int hist_entry__global_p_stage_cyc_snprintf(struct hist_entry *he, char 
*bf,
+   size_t size, unsigned int width)
+{
+   return repsep_snprintf(bf, size, "%-*u", width,
+   he->p_stage_cyc * he->stat.nr_events);
+}
+
+
 static int hist_entry__p_stage_cyc_snprintf(struct hist_entry *he, char *bf,
size_t size, unsigned int width)
 {
return repsep_snprintf(bf, size, "%-*u", width, he->p_stage_cyc);
 }
 
-struct sort_entry sort_p_stage_cyc = {
-   .se_header  = "Pipeline Stage Cycle",
-   .se_cmp = sort__global_p_stage_cyc_cmp,
+struct sort_entry sort_local_p_stage_cyc = {
+   .se_header  = "Local Pipeline Stage Cycle",
+   .se_cmp = sort__p_stage_cyc_cmp,
.se_snprintf= hist_entry__p_stage_cyc_snprintf,
-   .se_width_idx   = HISTC_P_STAGE_CYC,
+   .se_width_idx   = HISTC_LOCAL_P_STAGE_CYC,
+};
+
+struct sort_entry sort_global_p_stage_cyc = {
+   .se_header  = "Pipeline Stage Cycle",
+   .se_cmp = sort__p_stage_cyc_cmp,
+   .se_snprintf= 

Re: [PATCH 1/1] futex: Wireup futex_waitv syscall

2021-11-24 Thread Thomas Gleixner
On Wed, Nov 24 2021 at 15:29, Arnd Bergmann wrote:
> On Wed, Nov 24, 2021 at 2:21 PM André Almeida  
> wrote:
>>
>> Wireup futex_waitv syscall for all remaining archs.
>>
>> Signed-off-by: André Almeida 
>
> Reviewed-by: Arnd Bergmann 
>
> I double-checked that futex_waitv() doesn't need any architecture specific
> hacks, and that the list above is complete.
>
> Should I take this through the asm-generic tree, or would you send it
> through the tip tree?

Feel free to pick it up.

Thanks,

tglx


Re: [PATCH 1/1] futex: Wireup futex_waitv syscall

2021-11-24 Thread Michael Ellerman
André Almeida  writes:
> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl 
> b/arch/powerpc/kernel/syscalls/syscall.tbl
> index 7bef917cc84e..15109af9d075 100644
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@ -528,3 +528,4 @@
>  446  common  landlock_restrict_self  sys_landlock_restrict_self
>  # 447 reserved for memfd_secret
>  448  common  process_mrelease  sys_process_mrelease
> +449  common  futex_waitv  sys_futex_waitv

Tested-by: Michael Ellerman  (powerpc)

The selftest doesn't build with old headers, I needed this:

diff --git a/tools/testing/selftests/futex/include/futex2test.h 
b/tools/testing/selftests/futex/include/futex2test.h
index 9d305520e849..e6422321e9d0 100644
--- a/tools/testing/selftests/futex/include/futex2test.h
+++ b/tools/testing/selftests/futex/include/futex2test.h
@@ -8,6 +8,10 @@

 #define u64_to_ptr(x) ((void *)(uintptr_t)(x))

+#ifndef __NR_futex_waitv
+#define __NR_futex_waitv 449
+#endif
+
 /**
  * futex_waitv - Wait at multiple futexes, wake on any
  * @waiters:    Array of waiters


cheers


Re: [PATCH v1 0/5] Implement livepatch on PPC32

2021-11-24 Thread Michael Ellerman
Christophe Leroy  writes:
> This series implements livepatch on PPC32.
>
> This is largely copied from what's done on PPC64.
>
> Christophe Leroy (5):
>   livepatch: Fix build failure on 32 bits processors
>   powerpc/ftrace: No need to read LR from stack in _mcount()
>   powerpc/ftrace: Add module_trampoline_target() for PPC32
>   powerpc/ftrace: Activate HAVE_DYNAMIC_FTRACE_WITH_REGS on PPC32
>   powerpc/ftrace: Add support for livepatch to PPC32

I think we know patch 5 will need a respin because of the STRICT RWX vs
livepatching issue (https://github.com/linuxppc/issues/issues/375).

So should I take patches 2,3,4 for now?

cheers


Re: [PATCH 0/8] Convert powerpc to default topdown mmap layout

2021-11-24 Thread Christophe Leroy




On 24/11/2021 at 14:40, Christophe Leroy wrote:



On 24/11/2021 at 14:21, Nicholas Piggin wrote:

Excerpts from Christophe Leroy's message of November 22, 2021 6:48 pm:

This series converts powerpc to default topdown mmap layout.

powerpc provides its own arch_get_unmapped_area() only when
slices are needed, which is only for book3s/64. First part of
the series moves slices into book3s/64 specific directories
and cleans up other subarchitectures.

Then a small modification is done to core mm to allow
powerpc to still provide its own arch_randomize_brk()

Last part converts to default topdown mmap layout.


A nice series but will clash badly with the CONFIG_HASH_MMU
series of course. One will have to be rebased if they are
both to be merged.



No worries, it shouldn't be an issue.

If you already foresee that series being merged soon, I can rebase my 
series on top of it right now.




In patchwork, v3 is flagged as superseded and I can't find a v4. Do you 
have it somewhere?


Christophe


Re: [PATCH 1/1] futex: Wireup futex_waitv syscall

2021-11-24 Thread Max Filippov
On Wed, Nov 24, 2021 at 5:21 AM André Almeida  wrote:
>
> Wireup futex_waitv syscall for all remaining archs.
>
> Signed-off-by: André Almeida 
> ---
>  arch/alpha/kernel/syscalls/syscall.tbl  | 1 +
>  arch/ia64/kernel/syscalls/syscall.tbl   | 1 +
>  arch/m68k/kernel/syscalls/syscall.tbl   | 1 +
>  arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
>  arch/powerpc/kernel/syscalls/syscall.tbl| 1 +
>  arch/sh/kernel/syscalls/syscall.tbl | 1 +
>  arch/sparc/kernel/syscalls/syscall.tbl  | 1 +
>  arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
>  8 files changed, 8 insertions(+)

For xtensa:
Acked-by: Max Filippov 

-- 
Thanks.
-- Max


Re: [PATCH 0/3] of/fdt: Rework early FDT scanning functions

2021-11-24 Thread Frank Rowand
On 11/18/21 1:12 PM, Rob Herring wrote:
> The early FDT scanning functions use of_scan_flat_dt() which implements 
> its own node walking method. This function predates libfdt and is an 
> unnecessary indirection. This series reworks 
> early_init_dt_scan_chosen(), early_init_dt_scan_root(), and 
> early_init_dt_scan_memory() to be called directly and use libfdt calls.
> 
> Ultimately, I want to remove of_scan_flat_dt(). Most of the remaining 
> of_scan_flat_dt() users are in powerpc.
> 
> Rob
> 
> 
> Rob Herring (3):
>   of/fdt: Rework early_init_dt_scan_chosen() to call directly
>   of/fdt: Rework early_init_dt_scan_root() to call directly
>   of/fdt: Rework early_init_dt_scan_memory() to call directly
> 
>  arch/mips/ralink/of.c|  16 +---
>  arch/powerpc/kernel/prom.c   |  22 ++---
>  arch/powerpc/mm/nohash/kaslr_booke.c |   4 +-
>  drivers/of/fdt.c | 121 ++-
>  include/linux/of_fdt.h   |   9 +-
>  5 files changed, 79 insertions(+), 93 deletions(-)
> 


"checkpatch --strict" reports some "CHECK" issues, but review of the patches
for correctness becomes much more difficult if they are addressed, so they
should be ignored for this series.

-Frank
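
For readers unfamiliar with the two styles contrasted in the cover
letter quoted above, the following is a toy sketch only. It uses no
kernel or libfdt code; every name in it (toy_node, toy_scan,
match_chosen, toy_find_child) is hypothetical. It shows a callback-driven
walk over all nodes next to a direct lookup that returns a negative
value when the node is missing, which is the shape the reworked
functions in this series take.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened node: a name and its depth below the root. */
struct toy_node {
	const char *name;
	int depth;
};

typedef int (*toy_scan_cb)(const struct toy_node *node, void *data);

/* Callback style: visit every node until the callback returns non-zero. */
int toy_scan(const struct toy_node *nodes, size_t n, toy_scan_cb cb, void *data)
{
	for (size_t i = 0; i < n; i++) {
		int rc = cb(&nodes[i], data);

		if (rc)
			return rc;
	}
	return 0;
}

/* Example callback: filter on depth and name, then stop the walk. */
int match_chosen(const struct toy_node *node, void *data)
{
	if (node->depth != 1 || strcmp(node->name, "chosen") != 0)
		return 0;
	*(int *)data = 1;
	return 1;
}

/* Direct style: return the index of a top-level node, or -1 if absent. */
int toy_find_child(const struct toy_node *nodes, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (nodes[i].depth == 1 && strcmp(nodes[i].name, name) == 0)
			return (int)i;
	return -1;
}
```

In the direct style the caller asks for the node it wants and checks
the result, so the depth and name filtering no longer has to live in
every callback.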


Re: [PATCH 3/3] of/fdt: Rework early_init_dt_scan_memory() to call directly

2021-11-24 Thread Frank Rowand
On 11/18/21 1:12 PM, Rob Herring wrote:
> Use of the of_scan_flat_dt() function predates libfdt and is discouraged
> as libfdt provides a nicer set of APIs. Rework
> early_init_dt_scan_memory() to be called directly and use libfdt.
> 
> Cc: John Crispin 
> Cc: Thomas Bogendoerfer 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Frank Rowand 
> Cc: linux-m...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Rob Herring 
> ---
>  arch/mips/ralink/of.c  | 16 ++---
>  arch/powerpc/kernel/prom.c | 16 -
>  drivers/of/fdt.c   | 68 --
>  include/linux/of_fdt.h |  3 +-
>  4 files changed, 47 insertions(+), 56 deletions(-)
> 
> diff --git a/arch/mips/ralink/of.c b/arch/mips/ralink/of.c
> index 0135376c5de5..e1d79523343a 100644
> --- a/arch/mips/ralink/of.c
> +++ b/arch/mips/ralink/of.c
> @@ -53,17 +53,6 @@ void __init device_tree_init(void)
>   unflatten_and_copy_device_tree();
>  }
>  
> -static int memory_dtb;
> -
> -static int __init early_init_dt_find_memory(unsigned long node,
> - const char *uname, int depth, void *data)
> -{
> - if (depth == 1 && !strcmp(uname, "memory@0"))
> - memory_dtb = 1;
> -
> - return 0;
> -}
> -
>  void __init plat_mem_setup(void)
>  {
>   void *dtb;
> @@ -77,9 +66,8 @@ void __init plat_mem_setup(void)
>   dtb = get_fdt();
>   __dt_setup_arch(dtb);
>  
> - of_scan_flat_dt(early_init_dt_find_memory, NULL);
> - if (memory_dtb)
> - of_scan_flat_dt(early_init_dt_scan_memory, NULL);
> + if (!early_init_dt_scan_memory())
> + return;
>   else if (soc_info.mem_detect)

The previous chunk is now:

   if (XXX)
  return;

instead of:

   if (XXX)
  YYY();

so "else if (soc_info.mem_detect)" should be:

  if (soc_info.mem_detect)
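
As a standalone illustration of the point above (a sketch with
hypothetical names, not the ralink code): once a branch ends in a
return, nothing after it runs on that path, so the following test can
be a plain "if" rather than an "else if".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes, standing in for the three memory sources. */
enum mem_source { MEM_FROM_DT, MEM_FROM_DETECT, MEM_FROM_SIZE, MEM_NONE };

enum mem_source pick_mem_source(bool dt_has_memory, bool has_detect,
				bool has_size)
{
	if (dt_has_memory)
		return MEM_FROM_DT;	/* early return: nothing below runs */

	if (has_detect)			/* plain "if": "else" would be redundant */
		return MEM_FROM_DETECT;

	if (has_size)
		return MEM_FROM_SIZE;

	return MEM_NONE;
}
```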

>   soc_info.mem_detect();
>   else if (soc_info.mem_size)
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 6e1a106f02eb..63762a3b75e8 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -532,19 +532,19 @@ static int  __init early_init_drmem_lmb(struct 
> drmem_lmb *lmb,
>  }
>  #endif /* CONFIG_PPC_PSERIES */
>  
> -static int __init early_init_dt_scan_memory_ppc(unsigned long node,
> - const char *uname,
> - int depth, void *data)
> +static int __init early_init_dt_scan_memory_ppc(void)
>  {
>  #ifdef CONFIG_PPC_PSERIES
> - if (depth == 1 &&
> - strcmp(uname, "ibm,dynamic-reconfiguration-memory") == 0) {
> + const void *fdt = initial_boot_params;
> + int node = fdt_path_offset(fdt, "/ibm,dynamic-reconfiguration-memory");
> +
> + if (node > 0) {
>   walk_drmem_lmbs_early(node, NULL, early_init_drmem_lmb);
>   return 0;
>   }
>  #endif
>   
> - return early_init_dt_scan_memory(node, uname, depth, data);
> + return early_init_dt_scan_memory();
>  }
>  
>  /*
> @@ -749,7 +749,7 @@ void __init early_init_devtree(void *params)
>  
>   /* Scan memory nodes and rebuild MEMBLOCKs */
>   early_init_dt_scan_root();
> - of_scan_flat_dt(early_init_dt_scan_memory_ppc, NULL);
> + early_init_dt_scan_memory_ppc();
>  
>   parse_early_param();
>  
> @@ -858,7 +858,7 @@ void __init early_get_first_memblock_info(void *params, 
> phys_addr_t *size)
>*/
>   add_mem_to_memblock = 0;
>   early_init_dt_scan_root();
> - of_scan_flat_dt(early_init_dt_scan_memory_ppc, NULL);
> + early_init_dt_scan_memory_ppc();
>   add_mem_to_memblock = 1;
>  
>   if (size)
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index 5e216555fe4f..a799117886f4 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -1078,49 +1078,53 @@ u64 __init dt_mem_next_cell(int s, const __be32 
> **cellp)
>  /*
>   * early_init_dt_scan_memory - Look for and parse memory nodes
>   */
> -int __init early_init_dt_scan_memory(unsigned long node, const char *uname,
> -  int depth, void *data)
> +int __init early_init_dt_scan_memory(void)
>  {
> - const char *type = of_get_flat_dt_prop(node, "device_type", NULL);
> - const __be32 *reg, *endp;
> - int l;
> - bool hotpluggable;
> + int node;
> + const void *fdt = initial_boot_params;
>  
> - /* We are scanning "memory" nodes only */
> - if (type == NULL || strcmp(type, "memory") != 0)
> - return 0;
> + fdt_for_each_subnode(node, fdt, 0) {
> + const char *type = of_get_flat_dt_prop(node, "device_type", 
> NULL);
> + const __be32 *reg, *endp;
> + int l;
> + bool hotpluggable;
>  
> - reg = of_get_flat_dt_prop(node, "linux,usable-memory", );
> - if (reg == NULL)
> - reg = of_get_flat_dt_prop(node, "reg", );
> - if (reg == NULL)
> 

Re: [PATCH 2/3] of/fdt: Rework early_init_dt_scan_root() to call directly

2021-11-24 Thread Frank Rowand
On 11/18/21 1:12 PM, Rob Herring wrote:
> Use of the of_scan_flat_dt() function predates libfdt and is discouraged
> as libfdt provides a nicer set of APIs. Rework early_init_dt_scan_root()
> to be called directly and use libfdt.
> 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Frank Rowand 
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Rob Herring 
> ---
>  arch/powerpc/kernel/prom.c |  4 ++--
>  drivers/of/fdt.c   | 14 +++---
>  include/linux/of_fdt.h |  3 +--
>  3 files changed, 10 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index c6c398ccd98a..6e1a106f02eb 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -748,7 +748,7 @@ void __init early_init_devtree(void *params)
>   of_scan_flat_dt(early_init_dt_scan_chosen_ppc, boot_command_line);
>  
>   /* Scan memory nodes and rebuild MEMBLOCKs */
> - of_scan_flat_dt(early_init_dt_scan_root, NULL);
> + early_init_dt_scan_root();
>   of_scan_flat_dt(early_init_dt_scan_memory_ppc, NULL);
>  
>   parse_early_param();
> @@ -857,7 +857,7 @@ void __init early_get_first_memblock_info(void *params, 
> phys_addr_t *size)
>* mess the memblock.
>*/
>   add_mem_to_memblock = 0;
> - of_scan_flat_dt(early_init_dt_scan_root, NULL);
> + early_init_dt_scan_root();
>   of_scan_flat_dt(early_init_dt_scan_memory_ppc, NULL);
>   add_mem_to_memblock = 1;
>  
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index 1f1705f76263..5e216555fe4f 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -1042,13 +1042,14 @@ int __init early_init_dt_scan_chosen_stdout(void)
>  /*
>   * early_init_dt_scan_root - fetch the top level address and size cells
>   */
> -int __init early_init_dt_scan_root(unsigned long node, const char *uname,
> -int depth, void *data)
> +int __init early_init_dt_scan_root(void)
>  {
>   const __be32 *prop;
> + const void *fdt = initial_boot_params;
> + int node = fdt_path_offset(fdt, "/");
>  
> - if (depth != 0)
> - return 0;
> + if (node < 0)
> + return -ENODEV;
>  
>   dt_root_size_cells = OF_ROOT_NODE_SIZE_CELLS_DEFAULT;
>   dt_root_addr_cells = OF_ROOT_NODE_ADDR_CELLS_DEFAULT;
> @@ -1063,8 +1064,7 @@ int __init early_init_dt_scan_root(unsigned long node, 
> const char *uname,
>   dt_root_addr_cells = be32_to_cpup(prop);
>   pr_debug("dt_root_addr_cells = %x\n", dt_root_addr_cells);
>  
> - /* break now */
> - return 1;
> + return 0;
>  }
>  
>  u64 __init dt_mem_next_cell(int s, const __be32 **cellp)
> @@ -1263,7 +1263,7 @@ void __init early_init_dt_scan_nodes(void)
>   int rc;
>  
>   /* Initialize {size,address}-cells info */
> - of_scan_flat_dt(early_init_dt_scan_root, NULL);
> + early_init_dt_scan_root();
>  
>   /* Retrieve various information from the /chosen node */
>   rc = early_init_dt_scan_chosen(boot_command_line);
> diff --git a/include/linux/of_fdt.h b/include/linux/of_fdt.h
> index 654722235df6..df3d31926c3c 100644
> --- a/include/linux/of_fdt.h
> +++ b/include/linux/of_fdt.h
> @@ -68,8 +68,7 @@ extern void early_init_dt_add_memory_arch(u64 base, u64 
> size);
>  extern u64 dt_mem_next_cell(int s, const __be32 **cellp);
>  
>  /* Early flat tree scan hooks */
> -extern int early_init_dt_scan_root(unsigned long node, const char *uname,
> -int depth, void *data);
> +extern int early_init_dt_scan_root(void);
>  
>  extern bool early_init_dt_scan(void *params);
>  extern bool early_init_dt_verify(void *params);
> 

Reviewed-by: Frank Rowand 


Re: [PATCH 1/3] of/fdt: Rework early_init_dt_scan_chosen() to call directly

2021-11-24 Thread Frank Rowand
On 11/18/21 1:12 PM, Rob Herring wrote:
> Use of the of_scan_flat_dt() function predates libfdt and is discouraged
> as libfdt provides a nicer set of APIs. Rework
> early_init_dt_scan_chosen() to be called directly and use libfdt.
> 
> Cc: Michael Ellerman 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Frank Rowand 
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Rob Herring 
> ---
>  arch/powerpc/kernel/prom.c   |  2 +-
>  arch/powerpc/mm/nohash/kaslr_booke.c |  4 +--
>  drivers/of/fdt.c | 39 ++--
>  include/linux/of_fdt.h   |  3 +--
>  4 files changed, 22 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 2e67588f6f6e..c6c398ccd98a 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -402,7 +402,7 @@ static int __init early_init_dt_scan_chosen_ppc(unsigned 
> long node,
>   const unsigned long *lprop; /* All these set by kernel, so no need to 
> convert endian */
>  
>   /* Use common scan routine to determine if this is the chosen node */
> - if (early_init_dt_scan_chosen(node, uname, depth, data) == 0)
> + if (early_init_dt_scan_chosen(data) < 0)
>   return 0;
>  
>  #ifdef CONFIG_PPC64
> diff --git a/arch/powerpc/mm/nohash/kaslr_booke.c 
> b/arch/powerpc/mm/nohash/kaslr_booke.c
> index 8fc49b1b4a91..90debe19ab4c 100644
> --- a/arch/powerpc/mm/nohash/kaslr_booke.c
> +++ b/arch/powerpc/mm/nohash/kaslr_booke.c
> @@ -44,9 +44,7 @@ struct regions __initdata regions;
>  
>  static __init void kaslr_get_cmdline(void *fdt)
>  {
> - int node = fdt_path_offset(fdt, "/chosen");
> -
> - early_init_dt_scan_chosen(node, "chosen", 1, boot_command_line);
> + early_init_dt_scan_chosen(boot_command_line);
>  }
>  
>  static unsigned long __init rotate_xor(unsigned long hash, const void *area,
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index bdca35284ceb..1f1705f76263 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -1124,18 +1124,18 @@ int __init early_init_dt_scan_memory(unsigned long 
> node, const char *uname,
>   return 0;
>  }
>  
> -int __init early_init_dt_scan_chosen(unsigned long node, const char *uname,
> -  int depth, void *data)
> +int __init early_init_dt_scan_chosen(char *cmdline)
>  {
> - int l;
> + int l, node;
>   const char *p;
>   const void *rng_seed;
> + const void *fdt = initial_boot_params;
>  
> - pr_debug("search \"chosen\", depth: %d, uname: %s\n", depth, uname);
> -
> - if (depth != 1 || !data ||
> - (strcmp(uname, "chosen") != 0 && strcmp(uname, "chosen@0") != 0))
> - return 0;
> + node = fdt_path_offset(fdt, "/chosen");
> + if (node < 0)
> + node = fdt_path_offset(fdt, "/chosen@0");
> + if (node < 0)
> + return -ENOENT;
>  
>   early_init_dt_check_for_initrd(node);
>   early_init_dt_check_for_elfcorehdr(node);
> @@ -1144,7 +1144,7 @@ int __init early_init_dt_scan_chosen(unsigned long 
> node, const char *uname,
>   /* Retrieve command line */
>   p = of_get_flat_dt_prop(node, "bootargs", );
>   if (p != NULL && l > 0)
> - strlcpy(data, p, min(l, COMMAND_LINE_SIZE));
> + strlcpy(cmdline, p, min(l, COMMAND_LINE_SIZE));
>  
>   /*
>* CONFIG_CMDLINE is meant to be a default in case nothing else
> @@ -1153,18 +1153,18 @@ int __init early_init_dt_scan_chosen(unsigned long 
> node, const char *uname,
>*/
>  #ifdef CONFIG_CMDLINE
>  #if defined(CONFIG_CMDLINE_EXTEND)
> - strlcat(data, " ", COMMAND_LINE_SIZE);
> - strlcat(data, CONFIG_CMDLINE, COMMAND_LINE_SIZE);
> + strlcat(cmdline, " ", COMMAND_LINE_SIZE);
> + strlcat(cmdline, CONFIG_CMDLINE, COMMAND_LINE_SIZE);
>  #elif defined(CONFIG_CMDLINE_FORCE)
> - strlcpy(data, CONFIG_CMDLINE, COMMAND_LINE_SIZE);
> + strlcpy(cmdline, CONFIG_CMDLINE, COMMAND_LINE_SIZE);
>  #else
>   /* No arguments from boot loader, use kernel's  cmdl*/
> - if (!((char *)data)[0])
> - strlcpy(data, CONFIG_CMDLINE, COMMAND_LINE_SIZE);
> + if (!((char *)cmdline)[0])
> + strlcpy(cmdline, CONFIG_CMDLINE, COMMAND_LINE_SIZE);
>  #endif
>  #endif /* CONFIG_CMDLINE */
>  
> - pr_debug("Command line is: %s\n", (char *)data);
> + pr_debug("Command line is: %s\n", (char *)cmdline);
>  
>   rng_seed = of_get_flat_dt_prop(node, "rng-seed", );
>   if (rng_seed && l > 0) {
> @@ -1178,8 +1178,7 @@ int __init early_init_dt_scan_chosen(unsigned long 
> node, const char *uname,
>   fdt_totalsize(initial_boot_params));
>   }
>  
> - /* break now */
> - return 1;
> + return 0;
>  }
>  
>  #ifndef MIN_MEMBLOCK_ADDR
> @@ -1261,14 +1260,14 @@ bool __init early_init_dt_verify(void *params)
>  
>  void __init early_init_dt_scan_nodes(void)
>  {
> - int rc = 0;
> +   

Re: [PATCH 1/1] futex: Wireup futex_waitv syscall

2021-11-24 Thread André Almeida
At 11:29 on 24/11/21, Arnd Bergmann wrote:
> On Wed, Nov 24, 2021 at 2:21 PM André Almeida  
> wrote:
>>
>> Wireup futex_waitv syscall for all remaining archs.
>>
>> Signed-off-by: André Almeida 
> 
> Reviewed-by: Arnd Bergmann 
> 
> I double-checked that futex_waitv() doesn't need any architecture specific
> hacks, and that the list above is complete.

Thanks!

> 
> Should I take this through the asm-generic tree, or would you send it
> through the
> tip tree?
> 
I think adding it to the asm-generic tree makes sense.


[PATCH] recordmcount: Support empty section from recent binutils

2021-11-24 Thread Christophe Leroy
It looks like recent binutils (2.36 and later?) may empty some sections,
leading to failures like:

Cannot find symbol for section 11: .text.unlikely.
kernel/kexec_file.o: failed
make[1]: *** [scripts/Makefile.build:287: kernel/kexec_file.o] Error 1

In order to avoid that, ensure that the section has content before
returning its name in has_rel_mcount().

Suggested-by: Steven Rostedt 
Link: https://github.com/linuxppc/issues/issues/388
Link: https://lore.kernel.org/all/20210215162209.5e2a4...@gandalf.local.home/
Signed-off-by: Christophe Leroy 
---
 scripts/recordmcount.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
index 1e9baa5c4fc6..cc6600b729ae 100644
--- a/scripts/recordmcount.h
+++ b/scripts/recordmcount.h
@@ -575,6 +575,8 @@ static char const *has_rel_mcount(Elf_Shdr const *const 
relhdr,
  char const *const shstrtab,
  char const *const fname)
 {
+   if (!shdr0->sh_size)
+   return NULL;
if (w(relhdr->sh_type) != SHT_REL && w(relhdr->sh_type) != SHT_RELA)
return NULL;
return __has_rel_mcount(relhdr, shdr0, shstrtab, fname);
-- 
2.33.1
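
The idea of treating a zero-sized section as having nothing to record
can be sketched in isolation as below. This is an illustration under
assumptions, not recordmcount code: the helper name is hypothetical,
and it uses Elf64_Shdr from <elf.h> directly, whereas recordmcount.h
works through its own Elf_Shdr type. A comparison against zero gives
the same answer in either byte order, so no endian conversion is needed
for this particular test.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>

/* Hypothetical helper: true when the section header describes any bytes. */
bool section_has_content(const Elf64_Shdr *shdr)
{
	return shdr->sh_size != 0;
}
```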



Re: [PATCH 1/1] futex: Wireup futex_waitv syscall

2021-11-24 Thread Arnd Bergmann
On Wed, Nov 24, 2021 at 2:21 PM André Almeida  wrote:
>
> Wireup futex_waitv syscall for all remaining archs.
>
> Signed-off-by: André Almeida 

Reviewed-by: Arnd Bergmann 

I double-checked that futex_waitv() doesn't need any architecture specific
hacks, and that the list above is complete.

Should I take this through the asm-generic tree, or would you send it
through the tip tree?

Arnd


Re: [PATCH 1/1] futex: Wireup futex_waitv syscall

2021-11-24 Thread Geert Uytterhoeven
On Wed, Nov 24, 2021 at 2:21 PM André Almeida  wrote:
> Wireup futex_waitv syscall for all remaining archs.
>
> Signed-off-by: André Almeida 

>  arch/m68k/kernel/syscalls/syscall.tbl   | 1 +

Acked-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH 1/8] powerpc/mm: Make slice specific to book3s/64

2021-11-24 Thread Christophe Leroy




On 24/11/2021 at 13:10, Christophe Leroy wrote:



On 22/11/2021 at 15:48, kernel test robot wrote:

Hi Christophe,

I love your patch! Perhaps something to improve:

[auto build test WARNING on powerpc/next]
[also build test WARNING on hnaz-mm/master linus/master v5.16-rc2 next-2028]

[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Christophe-Leroy/Convert-powerpc-to-default-topdown-mmap-layout/20211122-165115
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc64-randconfig-s031-20211122 (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 11.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.4-dirty
        # https://github.com/0day-ci/linux/commit/1d0b7cc86d08f25f595b52d8c39ba9ca1d29a30a
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Christophe-Leroy/Convert-powerpc-to-default-topdown-mmap-layout/20211122-165115
        git checkout 1d0b7cc86d08f25f595b52d8c39ba9ca1d29a30a
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=powerpc64


If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

arch/powerpc/mm/book3s64/slice.c: In function 'slice_get_unmapped_area':
arch/powerpc/mm/book3s64/slice.c:639:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=]
  639 | }
      | ^



The problem was already existing when slice.c was in arch/powerpc/mm/

This patch doesn't introduce the problem.



In fact the problem is really added by yourself mister 'kernel test robot'.

CONFIG_FRAME_WARN is supposed to be 2048 on 64 bit architectures.

If the robot starts to reduce that value, it is on its own.


config FRAME_WARN
        int "Warn for stack frames larger than"
        range 0 8192
        default 2048 if GCC_PLUGIN_LATENT_ENTROPY
        default 1536 if (!64BIT && (PARISC || XTENSA))
        default 1024 if (!64BIT && !PARISC)
        default 2048 if 64BIT
        help
          Tell gcc to warn at build time for stack frames larger than this.
          Setting this too low will cause a lot of warnings.
          Setting it to 0 disables the warning.



[PATCH 1/1] futex: Wireup futex_waitv syscall

2021-11-24 Thread André Almeida
Wireup futex_waitv syscall for all remaining archs.

Signed-off-by: André Almeida 
---
 arch/alpha/kernel/syscalls/syscall.tbl      | 1 +
 arch/ia64/kernel/syscalls/syscall.tbl       | 1 +
 arch/m68k/kernel/syscalls/syscall.tbl       | 1 +
 arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    | 1 +
 arch/sh/kernel/syscalls/syscall.tbl         | 1 +
 arch/sparc/kernel/syscalls/syscall.tbl      | 1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     | 1 +
 8 files changed, 8 insertions(+)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index e4a041cd5715..ca5a32228cd6 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -488,3 +488,4 @@
 556    common  landlock_restrict_self  sys_landlock_restrict_self
 # 557 reserved for memfd_secret
 558    common  process_mrelease        sys_process_mrelease
+559    common  futex_waitv             sys_futex_waitv
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index 6fea1844fb95..707ae121f6d3 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -369,3 +369,4 @@
 446    common  landlock_restrict_self  sys_landlock_restrict_self
 # 447 reserved for memfd_secret
 448    common  process_mrelease        sys_process_mrelease
+449    common  futex_waitv             sys_futex_waitv
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 7976dff8f879..45bc32a41b90 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -448,3 +448,4 @@
 446    common  landlock_restrict_self  sys_landlock_restrict_self
 # 447 reserved for memfd_secret
 448    common  process_mrelease        sys_process_mrelease
+449    common  futex_waitv             sys_futex_waitv
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 6b0e11362bd2..2204bde3ce4a 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -454,3 +454,4 @@
 446    common  landlock_restrict_self  sys_landlock_restrict_self
 # 447 reserved for memfd_secret
 448    common  process_mrelease        sys_process_mrelease
+449    common  futex_waitv             sys_futex_waitv
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 7bef917cc84e..15109af9d075 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -528,3 +528,4 @@
 446    common  landlock_restrict_self  sys_landlock_restrict_self
 # 447 reserved for memfd_secret
 448    common  process_mrelease        sys_process_mrelease
+449    common  futex_waitv             sys_futex_waitv
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index 208f131659c5..d9539d28bdaa 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -451,3 +451,4 @@
 446    common  landlock_restrict_self  sys_landlock_restrict_self
 # 447 reserved for memfd_secret
 448    common  process_mrelease        sys_process_mrelease
+449    common  futex_waitv             sys_futex_waitv
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index c37764dc764d..46adabcb1720 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -494,3 +494,4 @@
 446    common  landlock_restrict_self  sys_landlock_restrict_self
 # 447 reserved for memfd_secret
 448    common  process_mrelease        sys_process_mrelease
+449    common  futex_waitv             sys_futex_waitv
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 104b327f8ac9..3e3e1a506bed 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -419,3 +419,4 @@
 446    common  landlock_restrict_self  sys_landlock_restrict_self
 # 447 reserved for memfd_secret
 448    common  process_mrelease        sys_process_mrelease
+449    common  futex_waitv             sys_futex_waitv
-- 
2.33.1
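
For readers who want to exercise the newly wired-up syscall from user space, here is a minimal sketch, assuming a libc that may not yet provide a wrapper or a SYS_futex_waitv constant. The fallback number 449 is the one the tables above assign (alpha uses 559 instead). The struct waitv_entry, the WAITV_FLAG_32BIT macro and the wait_one() helper are illustrative names only; the struct is intended to mirror the layout of the kernel's uapi struct futex_waitv.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * 449 is the number assigned on ia64, m68k, microblaze, powerpc, sh,
 * sparc and xtensa in the patch above; alpha uses 559 instead.
 */
#ifndef SYS_futex_waitv
#define SYS_futex_waitv 449
#endif

#define WAITV_FLAG_32BIT 2      /* same value as FUTEX_32 in the uapi header */

/* Local mirror of the uapi struct futex_waitv layout. */
struct waitv_entry {
        uint64_t val;           /* expected value of the futex word */
        uint64_t uaddr;         /* user address of the futex word */
        uint32_t flags;         /* size/shared flags for this entry */
        uint32_t reserved;      /* must be zero */
};

/*
 * Wait on a single 32-bit futex word with no timeout. Returns the raw
 * syscall result: the index of the woken entry, or -1 with errno set
 * (EAGAIN when *word != expected, ENOSYS on kernels without the call).
 */
static long wait_one(uint32_t *word, uint32_t expected)
{
        struct waitv_entry w = {
                .val = expected,
                .uaddr = (uint64_t)(uintptr_t)word,
                .flags = WAITV_FLAG_32BIT,
                .reserved = 0,
        };

        return syscall(SYS_futex_waitv, &w, 1, 0, NULL, 0);
}
```

Calling wait_one() with an expected value that differs from the current contents of the word returns immediately instead of sleeping, which makes it a cheap way to probe whether the running kernel has the syscall.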



Re: [PATCH v2 6/8] inotify: simplify subdirectory registration with register_sysctl()

2021-11-24 Thread Luis Chamberlain
On Wed, Nov 24, 2021 at 10:44:09AM +0100, Jan Kara wrote:
> On Tue 23-11-21 12:24:20, Luis Chamberlain wrote:
> > From: Xiaoming Ni 
> > 
> > There is no need to use boilerplate code to specify a set of base
> > directories we're going to stuff sysctls under. Simplify this by using
> > register_sysctl() and specifying the directory path directly.
> > 
> > Move inotify_user sysctl to inotify_user.c while at it to remove clutter
> > from kernel/sysctl.c.
> > 
> > Signed-off-by: Xiaoming Ni 
> > [mcgrof: update commit log to reflect new path we decided to take]
> > Signed-off-by: Luis Chamberlain 
> 
> This looks fishy. You register inotify_table but not fanotify_table and
> remove both...

Indeed, the following was missing, I'll roll it in:

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 559bc1e9926d..a35693eb1f36 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -59,7 +59,7 @@ static int fanotify_max_queued_events __read_mostly;
 static long ft_zero = 0;
 static long ft_int_max = INT_MAX;
 
-struct ctl_table fanotify_table[] = {
+static struct ctl_table fanotify_table[] = {
         {
                 .procname       = "max_user_groups",
                 .data   = &init_user_ns.ucount_max[UCOUNT_FANOTIFY_GROUPS],
@@ -88,6 +88,13 @@ struct ctl_table fanotify_table[] = {
         },
         { }
 };
+
+static void __init fanotify_sysctls_init(void)
+{
+        register_sysctl("fs/fanotify", fanotify_table);
+}
+#else
+#define fanotify_sysctls_init() do { } while (0)
 #endif /* CONFIG_SYSCTL */
 
 /*
@@ -1685,6 +1692,7 @@ static int __init fanotify_user_setup(void)
         init_user_ns.ucount_max[UCOUNT_FANOTIFY_GROUPS] =
                                         FANOTIFY_DEFAULT_MAX_GROUPS;
         init_user_ns.ucount_max[UCOUNT_FANOTIFY_MARKS] = max_marks;
+        fanotify_sysctls_init();
 
         return 0;
 }
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index 616af2ea20f3..556cc63c88ee 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -5,8 +5,6 @@
 #include 
 #include 
 
-extern struct ctl_table fanotify_table[]; /* for sysctl */
-
 #define FAN_GROUP_FLAG(group, flag) \
((group)->fanotify_data.flags & (flag))
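
The #else branch in the fanotify_user.c hunk relies on a common C idiom: when the feature is compiled out, the init function becomes a macro that expands to an empty statement, so the caller needs no #ifdef. A stand-alone sketch of that idiom, assuming a made-up DEMO_HAVE_SYSCTL switch in place of CONFIG_SYSCTL and a counter in place of the real register_sysctl() call, might look like this:

```c
static int registered_tables;   /* counts calls, for illustration only */

#ifdef DEMO_HAVE_SYSCTL         /* hypothetical stand-in for CONFIG_SYSCTL */
static void demo_sysctls_init(void)
{
        registered_tables++;    /* the kernel code calls register_sysctl() here */
}
#else
/* Expands to a single empty statement, so call sites need no #ifdef. */
#define demo_sysctls_init() do { } while (0)
#endif

static int demo_setup(void)
{
        demo_sysctls_init();    /* compiles with or without the feature */
        return 0;
}
```

Built without -DDEMO_HAVE_SYSCTL, demo_setup() leaves the counter untouched; built with it, the counter is incremented once per call.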
 


Re: [PATCH 0/8] Convert powerpc to default topdown mmap layout

2021-11-24 Thread Christophe Leroy




On 24/11/2021 at 14:21, Nicholas Piggin wrote:

Excerpts from Christophe Leroy's message of November 22, 2021 6:48 pm:

This series converts powerpc to default topdown mmap layout.

powerpc provides its own arch_get_unmapped_area() only when
slices are needed, which is only for book3s/64. First part of
the series moves slices into book3s/64 specific directories
and cleans up other subarchitectures.

Then a small modification is done to core mm to allow
powerpc to still provide its own arch_randomize_brk()

Last part converts to default topdown mmap layout.


A nice series but will clash badly with the CONFIG_HASH_MMU
series of course. One will have to be rebased if they are
both to be merged.



No worries, it shouldn't be an issue.

If you already foresee that series being merged soon, I can rebase my 
series on top of it right now.


Christophe


Re: [PATCH 0/3] KEXEC_SIG with appended signature

2021-11-24 Thread Michal Suchánek
On Wed, Nov 24, 2021 at 08:10:10AM -0500, Mimi Zohar wrote:
> On Wed, 2021-11-24 at 12:09 +0100, Philipp Rudo wrote:
> > Now Michal wants to adapt KEXEC_SIG for ppc too so distros can rely on all
> > architectures using the same mechanism and thus reduce maintenance cost.
> > On the way there he even makes some absolutely reasonable improvements
> > for everybody.
> > 
> > Why is that so controversial? What is the real problem that should be
> > discussed here?
> 
> Nothing is controversial with what Michal wants to do.  I've already
> said, "As for adding KEXEC_SIG appended signature support on PowerPC
> based on the s390 code, it sounds reasonable."

Ok, I will resend the series with the arch-specific changes first to be
independent of the core cleanup.

Thanks

Michal


Re: [PATCH 0/8] Convert powerpc to default topdown mmap layout

2021-11-24 Thread Nicholas Piggin
Excerpts from Christophe Leroy's message of November 22, 2021 6:48 pm:
> This series converts powerpc to default topdown mmap layout.
> 
> powerpc provides its own arch_get_unmapped_area() only when
> slices are needed, which is only for book3s/64. First part of
> the series moves slices into book3s/64 specific directories
> and cleans up other subarchitectures.
> 
> Then a small modification is done to core mm to allow
> powerpc to still provide its own arch_randomize_brk()
> 
> Last part converts to default topdown mmap layout.

A nice series but will clash badly with the CONFIG_HASH_MMU 
series of course. One will have to be rebased if they are
both to be merged.

Thanks,
Nick


[PATCH 5.15 258/279] signal/powerpc: On swapcontext failure force SIGSEGV

2021-11-24 Thread Greg Kroah-Hartman
From: Eric W. Biederman 

commit 83a1f27ad773b1d8f0460d3a676114c7651918cc upstream.

If the register state may be partial and corrupted instead of calling
do_exit, call force_sigsegv(SIGSEGV).  Which properly kills the
process with SIGSEGV and does not let any more userspace code execute,
instead of just killing one thread of the process and potentially
confusing everything.

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: linuxppc-dev@lists.ozlabs.org
History-tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: 756f1ae8a44e ("PPC32: Rework signal code and add a swapcontext system call.")
Fixes: 04879b04bf50 ("[PATCH] ppc64: VMX (Altivec) support & signal32 rework, from Ben Herrenschmidt")
Link: https://lkml.kernel.org/r/20211020174406.17889-7-ebied...@xmission.com
Signed-off-by: Eric W. Biederman 
Cc: Thomas Backlund 
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/kernel/signal_32.c |    6 ++++--
 arch/powerpc/kernel/signal_64.c |    9 ++++++---
 2 files changed, 10 insertions(+), 5 deletions(-)

--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -1062,8 +1062,10 @@ SYSCALL_DEFINE3(swapcontext, struct ucon
          * or if another thread unmaps the region containing the context.
          * We kill the task with a SIGSEGV in this situation.
          */
-        if (do_setcontext(new_ctx, regs, 0))
-                do_exit(SIGSEGV);
+        if (do_setcontext(new_ctx, regs, 0)) {
+                force_sigsegv(SIGSEGV);
+                return -EFAULT;
+        }
 
         set_thread_flag(TIF_RESTOREALL);
         return 0;
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -703,15 +703,18 @@ SYSCALL_DEFINE3(swapcontext, struct ucon
          * We kill the task with a SIGSEGV in this situation.
          */
 
-        if (__get_user_sigset(&set, &new_ctx->uc_sigmask))
-                do_exit(SIGSEGV);
+        if (__get_user_sigset(&set, &new_ctx->uc_sigmask)) {
+                force_sigsegv(SIGSEGV);
+                return -EFAULT;
+        }
         set_current_blocked(&set);
 
         if (!user_read_access_begin(new_ctx, ctx_size))
                 return -EFAULT;
         if (__unsafe_restore_sigcontext(current, NULL, 0, &new_ctx->uc_mcontext)) {
                 user_read_access_end();
-                do_exit(SIGSEGV);
+                force_sigsegv(SIGSEGV);
+                return -EFAULT;
         }
         user_read_access_end();
 




Re: [PATCH v3 2/2] pseries/mce: Refactor the pseries mce handling code

2021-11-24 Thread Nicholas Piggin
Excerpts from Ganesh Goudar's message of November 24, 2021 7:55 pm:
> Now that we are no longer switching on the mmu in realmode
> mce handler, Revert the commit 4ff753feab02("powerpc/pseries:
> Avoid using addr_to_pfn in real mode") partially, which
> introduced functions mce_handle_err_virtmode/realmode() to
> separate mce handler code which needed translation to be enabled.
> 
> Signed-off-by: Ganesh Goudar 
> ---
>  arch/powerpc/platforms/pseries/ras.c | 122 +++
>  1 file changed, 49 insertions(+), 73 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
> index 8613f9cc5798..62e1519b8355 100644
> --- a/arch/powerpc/platforms/pseries/ras.c
> +++ b/arch/powerpc/platforms/pseries/ras.c
> @@ -511,58 +511,17 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
>   return 0; /* need to perform reset */
>  }
>  
> -static int mce_handle_err_realmode(int disposition, u8 error_type)
> -{
> -#ifdef CONFIG_PPC_BOOK3S_64
> - if (disposition == RTAS_DISP_NOT_RECOVERED) {
> - switch (error_type) {
> - case MC_ERROR_TYPE_ERAT:
> - flush_erat();
> - disposition = RTAS_DISP_FULLY_RECOVERED;
> - break;
> - case MC_ERROR_TYPE_SLB:
> - /*
> -  * Store the old slb content in paca before flushing.
> -  * Print this when we go to virtual mode.
> -  * There are chances that we may hit MCE again if there
> -  * is a parity error on the SLB entry we trying to read
> -  * for saving. Hence limit the slb saving to single
> -  * level of recursion.
> -  */
> - if (local_paca->in_mce == 1)
> - slb_save_contents(local_paca->mce_faulty_slbs);
> - flush_and_reload_slb();
> - disposition = RTAS_DISP_FULLY_RECOVERED;
> - break;
> - default:
> - break;
> - }
> - } else if (disposition == RTAS_DISP_LIMITED_RECOVERY) {
> - /* Platform corrected itself but could be degraded */
> - pr_err("MCE: limited recovery, system may be degraded\n");
> - disposition = RTAS_DISP_FULLY_RECOVERED;
> - }
> -#endif
> - return disposition;
> -}
> -
> -static int mce_handle_err_virtmode(struct pt_regs *regs,
> -struct rtas_error_log *errp,
> -struct pseries_mc_errorlog *mce_log,
> -int disposition)
> +static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
>  {
>   struct mce_error_info mce_err = { 0 };
> + unsigned long eaddr = 0, paddr = 0;
> + struct pseries_errorlog *pseries_log;
> + struct pseries_mc_errorlog *mce_log;
> + int disposition = rtas_error_disposition(errp);
>   int initiator = rtas_error_initiator(errp);
>   int severity = rtas_error_severity(errp);
> - unsigned long eaddr = 0, paddr = 0;
>   u8 error_type, err_sub_type;
>  
> - if (!mce_log)
> - goto out;
> -
> - error_type = mce_log->error_type;
> - err_sub_type = rtas_mc_error_sub_type(mce_log);
> -
>   if (initiator == RTAS_INITIATOR_UNKNOWN)
>   mce_err.initiator = MCE_INITIATOR_UNKNOWN;
>   else if (initiator == RTAS_INITIATOR_CPU)
> @@ -588,6 +547,8 @@ static int mce_handle_err_virtmode(struct pt_regs *regs,
>   mce_err.severity = MCE_SEV_SEVERE;
>   else if (severity == RTAS_SEVERITY_ERROR)
>   mce_err.severity = MCE_SEV_SEVERE;
> + else if (severity == RTAS_SEVERITY_FATAL)
> + mce_err.severity = MCE_SEV_FATAL;
>   else
>   mce_err.severity = MCE_SEV_FATAL;
>  

What's this hunk for?

> @@ -599,7 +560,18 @@ static int mce_handle_err_virtmode(struct pt_regs *regs,
>   mce_err.error_type = MCE_ERROR_TYPE_UNKNOWN;
>   mce_err.error_class = MCE_ECLASS_UNKNOWN;
>  
> - switch (error_type) {
> + if (!rtas_error_extended(errp))
> + goto out;
> +
> + pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE);
> + if (!pseries_log)
> + goto out;
> +
> + mce_log = (struct pseries_mc_errorlog *)pseries_log->data;
> + error_type = mce_log->error_type;
> + err_sub_type = rtas_mc_error_sub_type(mce_log);
> +
> + switch (mce_log->error_type) {
>   case MC_ERROR_TYPE_UE:
>   mce_err.error_type = MCE_ERROR_TYPE_UE;
>   mce_common_process_ue(regs, &mce_err);
> @@ -692,41 +664,45 @@ static int mce_handle_err_virtmode(struct pt_regs *regs,
>   mce_err.error_type = MCE_ERROR_TYPE_DCACHE;
>   break;
>   case MC_ERROR_TYPE_I_CACHE:
> - mce_err.error_type = 

Re: [PATCH 0/3] KEXEC_SIG with appended signature

2021-11-24 Thread Mimi Zohar
On Wed, 2021-11-24 at 12:09 +0100, Philipp Rudo wrote:
> Now Michal wants to adapt KEXEC_SIG for ppc too so distros can rely on all
> architectures using the same mechanism and thus reduce maintenance cost.
> On the way there he even makes some absolutely reasonable improvements
> for everybody.
> 
> Why is that so controversial? What is the real problem that should be
> discussed here?

Nothing is controversial with what Michal wants to do.  I've already
said, "As for adding KEXEC_SIG appended signature support on PowerPC
based on the s390 code, it sounds reasonable."

thanks,

Mimi



Re: [PATCH v3 1/2] powerpc/mce: Avoid using irq_work_queue() in realmode

2021-11-24 Thread Nicholas Piggin
Excerpts from Ganesh Goudar's message of November 24, 2021 7:54 pm:
> In realmode mce handler we use irq_work_queue() to defer
> the processing of mce events, irq_work_queue() can only
> be called when translation is enabled because it touches
> memory outside RMA, hence we enable translation before
> calling irq_work_queue and disable on return, though it
> is not safe to do in realmode.
> 
> To avoid this, program the decrementer and call the event
> processing functions from timer handler.
> 
> Signed-off-by: Ganesh Goudar 
> ---
> V2:
> * Use arch_irq_work_raise to raise decrementer interrupt.
> * Avoid having atomic variable.
> 
> V3:
> * Fix build error.
>   Reported by kernel test bot.
> ---
>  arch/powerpc/include/asm/machdep.h   |  2 +
>  arch/powerpc/include/asm/mce.h   |  2 +
>  arch/powerpc/include/asm/paca.h  |  1 +
>  arch/powerpc/kernel/mce.c| 51 +++-
>  arch/powerpc/kernel/time.c   |  3 ++
>  arch/powerpc/platforms/pseries/pseries.h |  1 +
>  arch/powerpc/platforms/pseries/ras.c | 31 +-
>  arch/powerpc/platforms/pseries/setup.c   |  1 +
>  8 files changed, 34 insertions(+), 58 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
> index 9c3c9f04129f..d22b222ba471 100644
> --- a/arch/powerpc/include/asm/machdep.h
> +++ b/arch/powerpc/include/asm/machdep.h
> @@ -99,6 +99,8 @@ struct machdep_calls {
>   /* Called during machine check exception to retrive fixup address. */
>   bool(*mce_check_early_recovery)(struct pt_regs *regs);
>  
> + void(*machine_check_log_err)(void);
> +
>   /* Motherboard/chipset features. This is a kind of general purpose
>* hook used to control some machine specific features (like reset
>* lines, chip power control, etc...).
> diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
> index 331d944280b8..6e306aaf58aa 100644
> --- a/arch/powerpc/include/asm/mce.h
> +++ b/arch/powerpc/include/asm/mce.h
> @@ -235,8 +235,10 @@ extern void machine_check_print_event_info(struct machine_check_event *evt,
>  unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
>  extern void mce_common_process_ue(struct pt_regs *regs,
> struct mce_error_info *mce_err);
> +void machine_check_raise_dec_intr(void);
>  int mce_register_notifier(struct notifier_block *nb);
>  int mce_unregister_notifier(struct notifier_block *nb);
> +void mce_run_late_handlers(void);
>  #ifdef CONFIG_PPC_BOOK3S_64
>  void flush_and_reload_slb(void);
>  void flush_erat(void);
> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
> index dc05a862e72a..d463c796f7fa 100644
> --- a/arch/powerpc/include/asm/paca.h
> +++ b/arch/powerpc/include/asm/paca.h
> @@ -280,6 +280,7 @@ struct paca_struct {
>  #endif
>  #ifdef CONFIG_PPC_BOOK3S_64
>   struct mce_info *mce_info;
> + u32 mces_to_process;
>  #endif /* CONFIG_PPC_BOOK3S_64 */
>  } cacheline_aligned;
>  
> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index fd829f7f25a4..8e17f29472a0 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -28,19 +28,9 @@
>  
>  #include "setup.h"
>  
> -static void machine_check_process_queued_event(struct irq_work *work);
> -static void machine_check_ue_irq_work(struct irq_work *work);
>  static void machine_check_ue_event(struct machine_check_event *evt);
>  static void machine_process_ue_event(struct work_struct *work);
>  
> -static struct irq_work mce_event_process_work = {
> -.func = machine_check_process_queued_event,
> -};
> -
> -static struct irq_work mce_ue_event_irq_work = {
> - .func = machine_check_ue_irq_work,
> -};
> -
>  static DECLARE_WORK(mce_ue_event_work, machine_process_ue_event);
>  
>  static BLOCKING_NOTIFIER_HEAD(mce_notifier_list);
> @@ -89,6 +79,12 @@ static void mce_set_error_info(struct machine_check_event *mce,
>   }
>  }
>  
> +/* Raise decrementer interrupt */
> +void machine_check_raise_dec_intr(void)
> +{
> + arch_irq_work_raise();
> +}

It would be better if the name specifically related to irq work, which 
is more than just dec interrupt. It might be good to set mces_to_process
here as well.

I would name it something like mce_irq_work_queue, and the paca variable
to mce_pending_irq_work...


> +void mce_run_late_handlers(void)
> +{
> + if (unlikely(local_paca->mces_to_process)) {
> + if (ppc_md.machine_check_log_err)
> + ppc_md.machine_check_log_err();
> + machine_check_process_queued_event();
> + machine_check_ue_work();
> + local_paca->mces_to_process--;
> + }
> +}

The problem with a counter is that you're clearing the irq work pending
in the timer interrupt, so you'll never call in here again to clear that
(until something else sets irq work).
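
To make the concern concrete, here is a small user-space model, offered only as a sketch under assumptions: struct cpu_state and both functions are hypothetical names, and the drain variant is just one possible way to avoid leaving events behind, not something proposed in this thread. It contrasts handling one queued event per invocation, as the quoted hunk does, with draining the counter in a loop.

```c
/* Hypothetical model: names are illustrative, not the kernel's. */
struct cpu_state {
        unsigned int mces_to_process;   /* events queued by the early handler */
        unsigned int handled;           /* events the late handler has processed */
};

/* Mirrors the quoted hunk: handles at most one event per invocation. */
static void run_late_handlers_once(struct cpu_state *cpu)
{
        if (cpu->mces_to_process) {
                cpu->handled++;
                cpu->mces_to_process--;
        }
}

/* Alternative: drain everything that is pending in one invocation. */
static void run_late_handlers_drain(struct cpu_state *cpu)
{
        while (cpu->mces_to_process) {
                cpu->handled++;
                cpu->mces_to_process--;
        }
}
```

With two events queued, the first variant leaves one event pending until something else triggers the handler again, while the second leaves none.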

But as far 

Re: [PATCH 1/8] powerpc/mm: Make slice specific to book3s/64

2021-11-24 Thread Christophe Leroy




On 22/11/2021 at 22:10, kernel test robot wrote:

Hi Christophe,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on hnaz-mm/master linus/master v5.16-rc2 next-2028]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Christophe-Leroy/Convert-powerpc-to-default-topdown-mmap-layout/20211122-165115
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc64-randconfig-r021-20211122 (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/1d0b7cc86d08f25f595b52d8c39ba9ca1d29a30a
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Christophe-Leroy/Convert-powerpc-to-default-topdown-mmap-layout/20211122-165115
        git checkout 1d0b7cc86d08f25f595b52d8c39ba9ca1d29a30a
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=powerpc

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

arch/powerpc/mm/book3s64/slice.c: In function 'slice_get_unmapped_area':
arch/powerpc/mm/book3s64/slice.c:639:1: error: the frame size of 1056 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
  639 | }
      | ^
cc1: all warnings being treated as errors




The problem was already existing when slice.c was in arch/powerpc/mm/

This patch doesn't introduce the problem.

Christophe



Re: [PATCH 1/8] powerpc/mm: Make slice specific to book3s/64

2021-11-24 Thread Christophe Leroy




On 22/11/2021 at 15:48, kernel test robot wrote:

Hi Christophe,

I love your patch! Perhaps something to improve:

[auto build test WARNING on powerpc/next]
[also build test WARNING on hnaz-mm/master linus/master v5.16-rc2 next-2028]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Christophe-Leroy/Convert-powerpc-to-default-topdown-mmap-layout/20211122-165115
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc64-randconfig-s031-20211122 (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 11.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.4-dirty
        # https://github.com/0day-ci/linux/commit/1d0b7cc86d08f25f595b52d8c39ba9ca1d29a30a
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Christophe-Leroy/Convert-powerpc-to-default-topdown-mmap-layout/20211122-165115
        git checkout 1d0b7cc86d08f25f595b52d8c39ba9ca1d29a30a
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=powerpc64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

arch/powerpc/mm/book3s64/slice.c: In function 'slice_get_unmapped_area':
arch/powerpc/mm/book3s64/slice.c:639:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=]
  639 | }
      | ^



The problem was already existing when slice.c was in arch/powerpc/mm/

This patch doesn't introduce the problem.

Christophe


Re: [PATCH] powerpc/eeh: Delay slot presence check once driver is notified about the pci error.

2021-11-24 Thread Oliver O'Halloran
On Wed, Nov 24, 2021 at 12:05 AM Mahesh Salgaonkar  wrote:
>
> *snip*
>
> This causes the EEH handler to get stuck for ~6
> seconds before it could notify that the pci error has been detected and
> stop any active operations. Hence with running I/O traffic, during this 6
> seconds, the network driver continues its operation and hits a timeout
> (netdev watchdog). On timeouts, the network driver goes into ffdc capture mode
> and reset path, assuming the PCI device is in fatal condition. This causes
> EEH recovery to fail and sometimes it leads to system hang or crash.

Whatever is causing that crash is the real issue IMO. PCI error
reporting is fundamentally asynchronous and the driver always has to
tolerate some amount of latency between the error occurring and being
reported. Six seconds is admittedly an eternity, but it should not
cause a system crash under any circumstances. Printing a warning due
to a timeout is annoying, but it's not the end of the world.


Re: [PATCH] powerpc/eeh: Delay slot presence check once driver is notified about the pci error.

2021-11-24 Thread Oliver O'Halloran
On Wed, Nov 24, 2021 at 7:45 PM Mahesh J Salgaonkar wrote:
>
> No it doesn't. We will still do a presence check before the recovery
> process starts. This patch moves the check after notifying the driver to
> stop active I/O operations. If a presence check finds the device isn't
> present, we will skip the EEH recovery. However, on a surprise hotplug,
> the user will see the EEH messages on the console before it finds there
> is nothing to recover.

Suppressing the spurious EEH messages was part of why I added that
check in the first place. If you want to defer the presence check
until later you should move the stack trace printing, etc to after
we've confirmed there are still devices present. Considering the
motivation for this patch is to avoid spurious warnings from the
driver I don't think printing spurious EEH messages is much of an
improvement.

The other option would be returning an error from the pseries hotplug
driver. IIRC that's what pnv_php / OPAL does if the PHB is fenced and
we can't check the slot presence state.


Re: [PATCH 0/3] KEXEC_SIG with appended signature

2021-11-24 Thread Philipp Rudo
Hi Mimi,

On Fri, 19 Nov 2021 13:16:20 -0500
Mimi Zohar  wrote:

> On Fri, 2021-11-19 at 12:18 +0100, Michal Suchánek wrote:
> > Maybe I was not clear enough. If you happen to focus on an architecture
> > that supports IMA fully it's great.
> > 
> > My point of view is maintaining multiple architectures. Both end users
> > and people concerned with security are rarely familiar with
> > architecture specifics. Portability of documentation and debugging
> > instructions across architectures is a concern.
> > 
> > IMA has a large number of options with varying availability across
> > architectures for no apparent reason. The situation is complex and hard
> > to grasp.  
> 
> IMA measures, verifies, and audits the integrity of files based on a
> system wide policy.  The known "good" integrity value may be stored in
> the security.ima xattr or more recently as an appended signature.
> 
> With both IMA kexec appraise and measurement policy rules, not only is
> the kernel image signature verified and the file hash included in the
> IMA measurement list, but the signature used to verify the integrity of
> the kexec kernel image is also included in the IMA measurement list
> (ima_template=ima-sig).
> 
> Even without PECOFF support in IMA, IMA kexec measurement policy rules
> can be defined to supplement the KEXEC_SIG signature verification.
>
> > In comparison the *_SIG options are widely available. The missing
> > support for KEXEC_SIG on POWER is trivial to add by cut from s390.
> > With that all the documentation that exists already is also trivially
> > applicable to POWER. Any additional code cleanup is a bonus but not
> > really needed to enable the kexec lockdown on POWER.  
> 
> Before lockdown was upstreamed, Matthew made sure that IMA signature
> verification could co-exist.   Refer to commit 29d3c1c8dfe7 ("kexec:
> Allow kexec_file() with appropriate IMA policy when locked down").   If
> there is a problem with the downstream kexec lockdown patches, they
> should be fixed.
> 
> The kexec kselftest might provide some insight into how the different
> signature verification methods and lockdown co-exist.
> 
> As for adding KEXEC_SIG appended signature support on PowerPC based on
> the s390 code, it sounds reasonable.

Heiko contacted me as you apparently requested that someone from s390
take part in this discussion. I have now spent over a day trying to make
sense of this discussion but failed. The way I see it is as follows.

On one hand there is KEXEC_SIG which is specifically designed to verify
the signatures of kernel images in kexec_file_load. From the beginning
it was designed in a way that every architecture (in fact even every
image type) can define its own callback function as needed. Its design
is simple and easy to extend and thus was adopted by all architectures,
except ppc, so far.

On the other hand there is IMA which is much more general and also
includes the same functionality as KEXEC_SIG but only on some
architectures in some special cases and without proper documentation.

Now Michal wants to adapt KEXEC_SIG for ppc too so distros can rely on all
architectures using the same mechanism and thus reduce maintenance cost.
On the way there he even makes some absolutely reasonable improvements
for everybody.

Why is that so controversial? What is the real problem that should be
discussed here?

Thanks
Philipp
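
The per-image-type callback design described above can be sketched in
plain C. This is a userspace model under stated assumptions, not kernel
code: struct image_ops, raw_probe(), verify_appended_sig(), load_image()
and the "~sig~" trailer are hypothetical names invented for illustration.
In the kernel the corresponding hook is the verify_sig member of
struct kexec_file_ops.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-image-type operations: each type brings its own checks. */
struct image_ops {
	const char *name;
	int (*probe)(const char *buf, size_t len);      /* 0 if this type matches */
	int (*verify_sig)(const char *buf, size_t len); /* 0 if signature is accepted */
};

#define SIG_MARKER "~sig~"	/* illustrative trailer, not a real format */

static int raw_probe(const char *buf, size_t len)
{
	(void)buf;
	return len > 0 ? 0 : -1;
}

/* Accept the image only if it ends with the marker. */
static int verify_appended_sig(const char *buf, size_t len)
{
	size_t mlen = strlen(SIG_MARKER);

	if (len < mlen)
		return -1;
	return memcmp(buf + len - mlen, SIG_MARKER, mlen) == 0 ? 0 : -1;
}

static const struct image_ops raw_image_ops = {
	.name = "raw",
	.probe = raw_probe,
	.verify_sig = verify_appended_sig,
};

static const struct image_ops *const image_types[] = { &raw_image_ops };

/*
 * Generic loader: find the first matching image type, then call that
 * type's own verifier. Returns 0 on success, -1 if no type matches,
 * the type has no verifier, or verification fails.
 */
int load_image(const char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < sizeof(image_types) / sizeof(image_types[0]); i++) {
		const struct image_ops *ops = image_types[i];

		if (ops->probe(buf, len) != 0)
			continue;
		if (!ops->verify_sig)
			return -1;	/* enforcing: no verifier means reject */
		return ops->verify_sig(buf, len);
	}
	return -1;
}
```

Supporting a new architecture or image type in this model means adding
one more image_ops entry with its own verify_sig; the generic loader
stays unchanged.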



[PATCH v3 1/2] powerpc/mce: Avoid using irq_work_queue() in realmode

2021-11-24 Thread Ganesh Goudar
In the realmode MCE handler we use irq_work_queue() to defer
the processing of MCE events. irq_work_queue() can only be
called when translation is enabled because it touches memory
outside the RMA, so we enable translation before calling
irq_work_queue() and disable it on return, even though that
is not safe to do in realmode.

To avoid this, program the decrementer and call the event
processing functions from the timer handler.

Signed-off-by: Ganesh Goudar 
---
V2:
* Use arch_irq_work_raise to raise decrementer interrupt.
* Avoid having atomic variable.

V3:
* Fix build error.
  Reported by kernel test bot.
---
 arch/powerpc/include/asm/machdep.h   |  2 +
 arch/powerpc/include/asm/mce.h   |  2 +
 arch/powerpc/include/asm/paca.h  |  1 +
 arch/powerpc/kernel/mce.c| 51 +++-
 arch/powerpc/kernel/time.c   |  3 ++
 arch/powerpc/platforms/pseries/pseries.h |  1 +
 arch/powerpc/platforms/pseries/ras.c | 31 +-
 arch/powerpc/platforms/pseries/setup.c   |  1 +
 8 files changed, 34 insertions(+), 58 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h 
b/arch/powerpc/include/asm/machdep.h
index 9c3c9f04129f..d22b222ba471 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -99,6 +99,8 @@ struct machdep_calls {
/* Called during machine check exception to retrive fixup address. */
bool (*mce_check_early_recovery)(struct pt_regs *regs);
 
+   void (*machine_check_log_err)(void);
+
/* Motherboard/chipset features. This is a kind of general purpose
 * hook used to control some machine specific features (like reset
 * lines, chip power control, etc...).
diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 331d944280b8..6e306aaf58aa 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -235,8 +235,10 @@ extern void machine_check_print_event_info(struct 
machine_check_event *evt,
 unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
 extern void mce_common_process_ue(struct pt_regs *regs,
  struct mce_error_info *mce_err);
+void machine_check_raise_dec_intr(void);
 int mce_register_notifier(struct notifier_block *nb);
 int mce_unregister_notifier(struct notifier_block *nb);
+void mce_run_late_handlers(void);
 #ifdef CONFIG_PPC_BOOK3S_64
 void flush_and_reload_slb(void);
 void flush_erat(void);
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index dc05a862e72a..d463c796f7fa 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -280,6 +280,7 @@ struct paca_struct {
 #endif
 #ifdef CONFIG_PPC_BOOK3S_64
struct mce_info *mce_info;
+   u32 mces_to_process;
 #endif /* CONFIG_PPC_BOOK3S_64 */
 } cacheline_aligned;
 
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index fd829f7f25a4..8e17f29472a0 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -28,19 +28,9 @@
 
 #include "setup.h"
 
-static void machine_check_process_queued_event(struct irq_work *work);
-static void machine_check_ue_irq_work(struct irq_work *work);
 static void machine_check_ue_event(struct machine_check_event *evt);
 static void machine_process_ue_event(struct work_struct *work);
 
-static struct irq_work mce_event_process_work = {
-.func = machine_check_process_queued_event,
-};
-
-static struct irq_work mce_ue_event_irq_work = {
-   .func = machine_check_ue_irq_work,
-};
-
 static DECLARE_WORK(mce_ue_event_work, machine_process_ue_event);
 
 static BLOCKING_NOTIFIER_HEAD(mce_notifier_list);
@@ -89,6 +79,12 @@ static void mce_set_error_info(struct machine_check_event 
*mce,
}
 }
 
+/* Raise decrementer interrupt */
+void machine_check_raise_dec_intr(void)
+{
+   arch_irq_work_raise();
+}
+
 /*
  * Decode and save high level MCE information into per cpu buffer which
  * is an array of machine_check_event structure.
@@ -135,6 +131,8 @@ void save_mce_event(struct pt_regs *regs, long handled,
if (mce->error_type == MCE_ERROR_TYPE_UE)
mce->u.ue_error.ignore_event = mce_err->ignore_event;
 
+   local_paca->mces_to_process++;
+
if (!addr)
return;
 
@@ -217,7 +215,7 @@ void release_mce_event(void)
get_mce_event(NULL, true);
 }
 
-static void machine_check_ue_irq_work(struct irq_work *work)
+static void machine_check_ue_work(void)
 {
schedule_work(&mce_ue_event_work);
 }
@@ -239,7 +237,7 @@ static void machine_check_ue_event(struct 
machine_check_event *evt)
   evt, sizeof(*evt));
 
/* Queue work to process this event later. */
-   irq_work_queue(&mce_ue_event_irq_work);
+   machine_check_raise_dec_intr();
 }
 
 /*
@@ -249,7 +247,6 @@ void machine_check_queue_event(void)
 {
int index;
struct machine_check_event evt;
-   unsigned long msr;
 
 

[PATCH v3 2/2] pseries/mce: Refactor the pseries mce handling code

2021-11-24 Thread Ganesh Goudar
Now that we no longer switch on the MMU in the realmode MCE
handler, partially revert commit 4ff753feab02 ("powerpc/pseries:
Avoid using addr_to_pfn in real mode"), which introduced the
functions mce_handle_err_virtmode/realmode() to separate out
the MCE handler code that needed translation to be enabled.

Signed-off-by: Ganesh Goudar 
---
 arch/powerpc/platforms/pseries/ras.c | 122 +++
 1 file changed, 49 insertions(+), 73 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c 
b/arch/powerpc/platforms/pseries/ras.c
index 8613f9cc5798..62e1519b8355 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -511,58 +511,17 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
return 0; /* need to perform reset */
 }
 
-static int mce_handle_err_realmode(int disposition, u8 error_type)
-{
-#ifdef CONFIG_PPC_BOOK3S_64
-   if (disposition == RTAS_DISP_NOT_RECOVERED) {
-   switch (error_type) {
-   case MC_ERROR_TYPE_ERAT:
-   flush_erat();
-   disposition = RTAS_DISP_FULLY_RECOVERED;
-   break;
-   case MC_ERROR_TYPE_SLB:
-   /*
-* Store the old slb content in paca before flushing.
-* Print this when we go to virtual mode.
-* There are chances that we may hit MCE again if there
-* is a parity error on the SLB entry we trying to read
-* for saving. Hence limit the slb saving to single
-* level of recursion.
-*/
-   if (local_paca->in_mce == 1)
-   slb_save_contents(local_paca->mce_faulty_slbs);
-   flush_and_reload_slb();
-   disposition = RTAS_DISP_FULLY_RECOVERED;
-   break;
-   default:
-   break;
-   }
-   } else if (disposition == RTAS_DISP_LIMITED_RECOVERY) {
-   /* Platform corrected itself but could be degraded */
-   pr_err("MCE: limited recovery, system may be degraded\n");
-   disposition = RTAS_DISP_FULLY_RECOVERED;
-   }
-#endif
-   return disposition;
-}
-
-static int mce_handle_err_virtmode(struct pt_regs *regs,
-  struct rtas_error_log *errp,
-  struct pseries_mc_errorlog *mce_log,
-  int disposition)
+static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
 {
struct mce_error_info mce_err = { 0 };
+   unsigned long eaddr = 0, paddr = 0;
+   struct pseries_errorlog *pseries_log;
+   struct pseries_mc_errorlog *mce_log;
+   int disposition = rtas_error_disposition(errp);
int initiator = rtas_error_initiator(errp);
int severity = rtas_error_severity(errp);
-   unsigned long eaddr = 0, paddr = 0;
u8 error_type, err_sub_type;
 
-   if (!mce_log)
-   goto out;
-
-   error_type = mce_log->error_type;
-   err_sub_type = rtas_mc_error_sub_type(mce_log);
-
if (initiator == RTAS_INITIATOR_UNKNOWN)
mce_err.initiator = MCE_INITIATOR_UNKNOWN;
else if (initiator == RTAS_INITIATOR_CPU)
@@ -588,6 +547,8 @@ static int mce_handle_err_virtmode(struct pt_regs *regs,
mce_err.severity = MCE_SEV_SEVERE;
else if (severity == RTAS_SEVERITY_ERROR)
mce_err.severity = MCE_SEV_SEVERE;
+   else if (severity == RTAS_SEVERITY_FATAL)
+   mce_err.severity = MCE_SEV_FATAL;
else
mce_err.severity = MCE_SEV_FATAL;
 
@@ -599,7 +560,18 @@ static int mce_handle_err_virtmode(struct pt_regs *regs,
mce_err.error_type = MCE_ERROR_TYPE_UNKNOWN;
mce_err.error_class = MCE_ECLASS_UNKNOWN;
 
-   switch (error_type) {
+   if (!rtas_error_extended(errp))
+   goto out;
+
+   pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE);
+   if (!pseries_log)
+   goto out;
+
+   mce_log = (struct pseries_mc_errorlog *)pseries_log->data;
+   error_type = mce_log->error_type;
+   err_sub_type = rtas_mc_error_sub_type(mce_log);
+
+   switch (mce_log->error_type) {
case MC_ERROR_TYPE_UE:
mce_err.error_type = MCE_ERROR_TYPE_UE;
mce_common_process_ue(regs, &mce_err);
@@ -692,41 +664,45 @@ static int mce_handle_err_virtmode(struct pt_regs *regs,
mce_err.error_type = MCE_ERROR_TYPE_DCACHE;
break;
case MC_ERROR_TYPE_I_CACHE:
-   mce_err.error_type = MCE_ERROR_TYPE_DCACHE;
+   mce_err.error_type = MCE_ERROR_TYPE_ICACHE;
break;
case MC_ERROR_TYPE_UNKNOWN:
default:

Re: [PATCH v2 4/8] ocfs2: simplify subdirectory registration with register_sysctl()

2021-11-24 Thread Jan Kara
On Tue 23-11-21 12:24:18, Luis Chamberlain wrote:
> There is no need to use boilerplate code to specify a set of base
> directories we're going to stuff sysctls under. Simplify this by using
> register_sysctl() and specifying the directory path directly.
> 
> // pycocci sysctl-subdir-register-sysctl-simplify.cocci PATH

Heh, nice example of using Coccinelle. The result looks good. Feel free to
add:

Reviewed-by: Jan Kara 

Honza


> 
> @c1@
> expression E1;
> identifier subdir, sysctls;
> @@
> 
> static struct ctl_table subdir[] = {
>   {
>   .procname = E1,
>   .maxlen = 0,
>   .mode = 0555,
>   .child = sysctls,
>   },
>   { }
> };
> 
> @c2@
> identifier c1.subdir;
> 
> expression E2;
> identifier base;
> @@
> 
> static struct ctl_table base[] = {
>   {
>   .procname = E2,
>   .maxlen = 0,
>   .mode = 0555,
>   .child = subdir,
>   },
>   { }
> };
> 
> @c3@
> identifier c2.base;
> identifier header;
> @@
> 
> header = register_sysctl_table(base);
> 
> @r1 depends on c1 && c2 && c3@
> expression c1.E1;
> identifier c1.subdir, c1.sysctls;
> @@
> 
> -static struct ctl_table subdir[] = {
> - {
> - .procname = E1,
> - .maxlen = 0,
> - .mode = 0555,
> - .child = sysctls,
> - },
> - { }
> -};
> 
> @r2 depends on c1 && c2 && c3@
> identifier c1.subdir;
> 
> expression c2.E2;
> identifier c2.base;
> @@
> -static struct ctl_table base[] = {
> - {
> - .procname = E2,
> - .maxlen = 0,
> - .mode = 0555,
> - .child = subdir,
> - },
> - { }
> -};
> 
> @initialize:python@
> @@
> 
> def make_my_fresh_expression(s1, s2):
>   return '"' + s1.strip('"') + "/" + s2.strip('"') + '"'
> 
> @r3 depends on c1 && c2 && c3@
> expression c1.E1;
> identifier c1.sysctls;
> expression c2.E2;
> identifier c2.base;
> identifier c3.header;
> fresh identifier E3 = script:python(E2, E1) { make_my_fresh_expression(E2, 
> E1) };
> @@
> 
> header =
> -register_sysctl_table(base);
> +register_sysctl(E3, sysctls);
> 
> Generated-by: Coccinelle SmPL
> Signed-off-by: Luis Chamberlain 
> ---
>  fs/ocfs2/stackglue.c | 25 +
>  1 file changed, 1 insertion(+), 24 deletions(-)
> 
> diff --git a/fs/ocfs2/stackglue.c b/fs/ocfs2/stackglue.c
> index 16f1bfc407f2..731558a6f27d 100644
> --- a/fs/ocfs2/stackglue.c
> +++ b/fs/ocfs2/stackglue.c
> @@ -672,31 +672,8 @@ static struct ctl_table ocfs2_mod_table[] = {
>   { }
>  };
>  
> -static struct ctl_table ocfs2_kern_table[] = {
> - {
> - .procname   = "ocfs2",
> - .data   = NULL,
> - .maxlen = 0,
> - .mode   = 0555,
> - .child  = ocfs2_mod_table
> - },
> - { }
> -};
> -
> -static struct ctl_table ocfs2_root_table[] = {
> - {
> - .procname   = "fs",
> - .data   = NULL,
> - .maxlen = 0,
> - .mode   = 0555,
> - .child  = ocfs2_kern_table
> - },
> - { }
> -};
> -
>  static struct ctl_table_header *ocfs2_table_header;
>  
> -
>  /*
>   * Initialization
>   */
> @@ -705,7 +682,7 @@ static int __init ocfs2_stack_glue_init(void)
>  {
>   strcpy(cluster_stack_name, OCFS2_STACK_PLUGIN_O2CB);
>  
> - ocfs2_table_header = register_sysctl_table(ocfs2_root_table);
> + ocfs2_table_header = register_sysctl("fs/ocfs2", ocfs2_mod_table);
>   if (!ocfs2_table_header) {
>   printk(KERN_ERR
>  "ocfs2 stack glue: unable to register sysctl\n");
> -- 
> 2.33.0
> 
-- 
Jan Kara 
SUSE Labs, CR
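
The effect of the quoted transformation, collapsing a chain of
single-entry directory tables ("fs" -> "ocfs2") into the one path
string passed to register_sysctl(), can be sketched as follows. This is
a standalone model, not kernel code: struct dir_node and
flatten_dir_path() are hypothetical names, and the real conversion is
done at patch time by the Coccinelle rule, not at run time.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a directory-only sysctl table entry. */
struct dir_node {
	const char *procname;
	const struct dir_node *child;	/* NULL at the last directory level */
};

/*
 * Walk the chain from the top-level directory and write "a/b/c" into out.
 * Returns 0 on success, -1 if the path does not fit in outlen bytes.
 */
int flatten_dir_path(const struct dir_node *top, char *out, size_t outlen)
{
	size_t used = 0;
	const struct dir_node *n;

	assert(top && out);
	if (outlen == 0)
		return -1;
	out[0] = '\0';
	for (n = top; n; n = n->child) {
		/* Prefix every component except the first with '/'. */
		int w = snprintf(out + used, outlen - used, "%s%s",
				 used ? "/" : "", n->procname);

		if (w < 0 || (size_t)w >= outlen - used)
			return -1;	/* truncated or failed */
		used += (size_t)w;
	}
	return 0;
}
```

For the ocfs2 tables removed by the patch, the chain "fs" -> "ocfs2"
flattens to "fs/ocfs2", which is the string the new register_sysctl()
call uses.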


Re: [PATCH v2 6/8] inotify: simplify subdirectory registration with register_sysctl()

2021-11-24 Thread Jan Kara
On Tue 23-11-21 12:24:20, Luis Chamberlain wrote:
> From: Xiaoming Ni 
> 
> There is no need to use boilerplate code to specify a set of base
> directories we're going to stuff sysctls under. Simplify this by using
> register_sysctl() and specifying the directory path directly.
> 
> Move inotify_user sysctl to inotify_user.c while at it to remove clutter
> from kernel/sysctl.c.
> 
> Signed-off-by: Xiaoming Ni 
> [mcgrof: update commit log to reflect new path we decided to take]
> Signed-off-by: Luis Chamberlain 

This looks fishy. You register inotify_table but not fanotify_table and
remove both...

Honza

> ---
>  fs/notify/inotify/inotify_user.c | 11 ++-
>  include/linux/inotify.h  |  3 ---
>  kernel/sysctl.c  | 21 -
>  3 files changed, 10 insertions(+), 25 deletions(-)
> 
> diff --git a/fs/notify/inotify/inotify_user.c 
> b/fs/notify/inotify/inotify_user.c
> index 29fca3284bb5..54583f62dc44 100644
> --- a/fs/notify/inotify/inotify_user.c
> +++ b/fs/notify/inotify/inotify_user.c
> @@ -58,7 +58,7 @@ struct kmem_cache *inotify_inode_mark_cachep __read_mostly;
>  static long it_zero = 0;
>  static long it_int_max = INT_MAX;
>  
> -struct ctl_table inotify_table[] = {
> +static struct ctl_table inotify_table[] = {
>   {
>   .procname   = "max_user_instances",
>   .data   = 
> &init_user_ns.ucount_max[UCOUNT_INOTIFY_INSTANCES],
> @@ -87,6 +87,14 @@ struct ctl_table inotify_table[] = {
>   },
>   { }
>  };
> +
> +static void __init inotify_sysctls_init(void)
> +{
> + register_sysctl("fs/inotify", inotify_table);
> +}
> +
> +#else
> +#define inotify_sysctls_init() do { } while (0)
>  #endif /* CONFIG_SYSCTL */
>  
>  static inline __u32 inotify_arg_to_mask(struct inode *inode, u32 arg)
> @@ -849,6 +857,7 @@ static int __init inotify_user_setup(void)
>   inotify_max_queued_events = 16384;
>   init_user_ns.ucount_max[UCOUNT_INOTIFY_INSTANCES] = 128;
>   init_user_ns.ucount_max[UCOUNT_INOTIFY_WATCHES] = watches_max;
> + inotify_sysctls_init();
>  
>   return 0;
>  }
> diff --git a/include/linux/inotify.h b/include/linux/inotify.h
> index 6a24905f6e1e..8d20caa1b268 100644
> --- a/include/linux/inotify.h
> +++ b/include/linux/inotify.h
> @@ -7,11 +7,8 @@
>  #ifndef _LINUX_INOTIFY_H
>  #define _LINUX_INOTIFY_H
>  
> -#include 
>  #include 
>  
> -extern struct ctl_table inotify_table[]; /* for sysctl */
> -
>  #define ALL_INOTIFY_BITS (IN_ACCESS | IN_MODIFY | IN_ATTRIB | IN_CLOSE_WRITE 
> | \
> IN_CLOSE_NOWRITE | IN_OPEN | IN_MOVED_FROM | \
> IN_MOVED_TO | IN_CREATE | IN_DELETE | \
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 7a90a12b9ea4..6aa67c737e4e 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -125,13 +125,6 @@ static const int maxolduid = 65535;
>  static const int ngroups_max = NGROUPS_MAX;
>  static const int cap_last_cap = CAP_LAST_CAP;
>  
> -#ifdef CONFIG_INOTIFY_USER
> -#include 
> -#endif
> -#ifdef CONFIG_FANOTIFY
> -#include 
> -#endif
> -
>  #ifdef CONFIG_PROC_SYSCTL
>  
>  /**
> @@ -3099,20 +3092,6 @@ static struct ctl_table fs_table[] = {
>   .proc_handler   = proc_dointvec,
>   },
>  #endif
> -#ifdef CONFIG_INOTIFY_USER
> - {
> - .procname   = "inotify",
> - .mode   = 0555,
> - .child  = inotify_table,
> - },
> -#endif
> -#ifdef CONFIG_FANOTIFY
> - {
> - .procname   = "fanotify",
> - .mode   = 0555,
> - .child  = fanotify_table,
> - },
> -#endif
>  #ifdef CONFIG_EPOLL
>   {
>   .procname   = "epoll",
> -- 
> 2.33.0
> 
-- 
Jan Kara 
SUSE Labs, CR


[PATCH 6/6] powerpc: Mark probe_machine() __init and static

2021-11-24 Thread Michael Ellerman
Prior to commit b1923caa6e64 ("powerpc: Merge 32-bit and 64-bit
setup_arch()") probe_machine() was called from setup_32/64.c and lived
in setup-common.c. But now it's only called from setup-common.c so it
can be static and __init, and we don't need the declaration in
machdep.h either.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/include/asm/machdep.h | 2 --
 arch/powerpc/kernel/setup-common.c | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h 
b/arch/powerpc/include/asm/machdep.h
index 9c3c9f04129f..e821037f74f0 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -235,8 +235,6 @@ extern struct machdep_calls *machine_id;
machine_id == &mach_##name; \
})
 
-extern void probe_machine(void);
-
 #ifdef CONFIG_PPC_PMAC
 /*
  * Power macintoshes have either a CUDA, PMU or SMU controlling
diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index 4f1322b65760..f8da937df918 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -582,7 +582,7 @@ static __init int add_pcspkr(void)
 device_initcall(add_pcspkr);
 #endif /* CONFIG_PCSPKR_PLATFORM */
 
-void probe_machine(void)
+static __init void probe_machine(void)
 {
extern struct machdep_calls __machine_desc_start;
extern struct machdep_calls __machine_desc_end;
-- 
2.31.1



[PATCH 5/6] powerpc/smp: Move setup_profiling_timer() under CONFIG_PROFILING

2021-11-24 Thread Michael Ellerman
setup_profiling_timer() is only needed when CONFIG_PROFILING is enabled.

Fixes the following W=1 warning when CONFIG_PROFILING=n:
  linux/arch/powerpc/kernel/smp.c:1638:5: error: no previous prototype for 
‘setup_profiling_timer’

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/smp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index c23ee842c4c3..aee3a7119f97 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1635,10 +1635,12 @@ void start_secondary(void *unused)
BUG();
 }
 
+#ifdef CONFIG_PROFILING
 int setup_profiling_timer(unsigned int multiplier)
 {
return 0;
 }
+#endif
 
 static void fixup_topology(void)
 {
-- 
2.31.1



[PATCH 4/6] powerpc/mm: Move tlbcam_sz() and make it static

2021-11-24 Thread Michael Ellerman
Building with W=1 we see a warning:
  linux/arch/powerpc/mm/nohash/fsl_book3e.c:63:15: error: no previous prototype 
for ‘tlbcam_sz’

tlbcam_sz() is not used outside this file, so we can make it static.
However it's only used inside #ifdef CONFIG_PPC32, so move it within
that ifdef, otherwise we would get a defined but not used error.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/mm/nohash/fsl_book3e.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/mm/nohash/fsl_book3e.c 
b/arch/powerpc/mm/nohash/fsl_book3e.c
index b231a54f540c..7f71bc3bf85f 100644
--- a/arch/powerpc/mm/nohash/fsl_book3e.c
+++ b/arch/powerpc/mm/nohash/fsl_book3e.c
@@ -60,11 +60,6 @@ struct tlbcamrange {
phys_addr_t phys;
 } tlbcam_addrs[NUM_TLBCAMS];
 
-unsigned long tlbcam_sz(int idx)
-{
-   return tlbcam_addrs[idx].limit - tlbcam_addrs[idx].start + 1;
-}
-
 #ifdef CONFIG_FSL_BOOKE
 /*
  * Return PA for this VA if it is mapped by a CAM, or 0
@@ -264,6 +259,11 @@ void __init MMU_init_hw(void)
flush_instruction_cache();
 }
 
+static unsigned long tlbcam_sz(int idx)
+{
+   return tlbcam_addrs[idx].limit - tlbcam_addrs[idx].start + 1;
+}
+
 void __init adjust_total_lowmem(void)
 {
unsigned long ram;
-- 
2.31.1



[PATCH 3/6] powerpc/85xx: Make c293_pcie_pic_init() static

2021-11-24 Thread Michael Ellerman
To fix the W=1 warning:
  linux/arch/powerpc/platforms/85xx/c293pcie.c:22:13: error: no previous 
prototype for ‘c293_pcie_pic_init’

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/85xx/c293pcie.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/85xx/c293pcie.c 
b/arch/powerpc/platforms/85xx/c293pcie.c
index 8d9a2503dd0f..58a398c89e97 100644
--- a/arch/powerpc/platforms/85xx/c293pcie.c
+++ b/arch/powerpc/platforms/85xx/c293pcie.c
@@ -19,7 +19,7 @@
 
 #include "mpc85xx.h"
 
-void __init c293_pcie_pic_init(void)
+static void __init c293_pcie_pic_init(void)
 {
struct mpic *mpic = mpic_alloc(NULL, 0, MPIC_BIG_ENDIAN |
  MPIC_SINGLE_DEST_CPU, 0, 256, " OpenPIC  ");
-- 
2.31.1



[PATCH 1/6] powerpc/85xx: Fix no previous prototype warning for mpc85xx_setup_pmc()

2021-11-24 Thread Michael Ellerman
Fixes the following W=1 warning:
  arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c:89:12: warning: no previous 
prototype for 'mpc85xx_setup_pmc'

Reported-by: kernel test robot 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c 
b/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
index 4a8af80011a6..f7ac92a8ae97 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_pm_ops.c
@@ -15,6 +15,8 @@
 #include 
 #include 
 
+#include "smp.h"
+
 static struct ccsr_guts __iomem *guts;
 
 #ifdef CONFIG_FSL_PMC
-- 
2.31.1



[PATCH 2/6] powerpc/85xx: Make mpc85xx_smp_kexec_cpu_down() static

2021-11-24 Thread Michael Ellerman
To fix the W=1 warning:
  arch/powerpc/platforms/85xx/smp.c:369:6: error: no previous prototype for 
‘mpc85xx_smp_kexec_cpu_down’

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/85xx/smp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/smp.c 
b/arch/powerpc/platforms/85xx/smp.c
index 83f4a6389a28..0abc1da2c14f 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -366,7 +366,7 @@ struct smp_ops_t smp_85xx_ops = {
 #ifdef CONFIG_PPC32
 atomic_t kexec_down_cpus = ATOMIC_INIT(0);
 
-void mpc85xx_smp_kexec_cpu_down(int crash_shutdown, int secondary)
+static void mpc85xx_smp_kexec_cpu_down(int crash_shutdown, int secondary)
 {
local_irq_disable();
 
@@ -384,7 +384,7 @@ static void mpc85xx_smp_kexec_down(void *arg)
ppc_md.kexec_cpu_down(0,1);
 }
 #else
-void mpc85xx_smp_kexec_cpu_down(int crash_shutdown, int secondary)
+static void mpc85xx_smp_kexec_cpu_down(int crash_shutdown, int secondary)
 {
int cpu = smp_processor_id();
int sibling = cpu_last_thread_sibling(cpu);
-- 
2.31.1



Re: [PATCH v2 1/2] powerpc/mce: Avoid using irq_work_queue() in realmode

2021-11-24 Thread kernel test robot
Hi Ganesh,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.16-rc2 next-20211124]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Ganesh-Goudar/powerpc-mce-Avoid-using-irq_work_queue-in-realmode/20211124-130459
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allnoconfig 
(https://download.01.org/0day-ci/archive/20211124/202111241736.zgco0sk3-...@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/bac24ec52edd7013115ad594974f64a30565266d
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Ganesh-Goudar/powerpc-mce-Avoid-using-irq_work_queue-in-realmode/20211124-130459
git checkout bac24ec52edd7013115ad594974f64a30565266d
# save the config file to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross 
ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/time.c: In function 'timer_interrupt':
>> arch/powerpc/kernel/time.c:598:25: error: implicit declaration of function 
>> 'mce_run_late_handlers' [-Werror=implicit-function-declaration]
 598 | mce_run_late_handlers();
 | ^
   cc1: all warnings being treated as errors


vim +/mce_run_late_handlers +598 arch/powerpc/kernel/time.c

   590  
   591  old_regs = set_irq_regs(regs);
   592  
   593  trace_timer_interrupt_entry(regs);
   594  
   595  if (test_irq_work_pending()) {
   596  clear_irq_work_pending();
   597  if (IS_ENABLED(CONFIG_PPC_BOOK3S_64))
 > 598  mce_run_late_handlers();
   599  irq_work_run();
   600  }
   601  
   602  now = get_tb();
   603  if (now >= *next_tb) {
   604  *next_tb = ~(u64)0;
   605  if (evt->event_handler)
   606  evt->event_handler(evt);
   607  __this_cpu_inc(irq_stat.timer_irqs_event);
   608  } else {
   609  now = *next_tb - now;
   610  if (now <= decrementer_max)
   611  set_dec(now);
   612  /* We may have raced with new irq work */
   613  if (test_irq_work_pending())
   614  set_dec(1);
   615  __this_cpu_inc(irq_stat.timer_irqs_others);
   616  }
   617  
   618  trace_timer_interrupt_exit(regs);
   619  
   620  set_irq_regs(old_regs);
   621  }
   622  EXPORT_SYMBOL(timer_interrupt);
   623  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
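
The error above is a common pitfall with constant conditions such as
IS_ENABLED(): the guarded call is still parsed and type-checked even
when the condition is compile-time false, so a declaration must be
visible in every configuration. A minimal sketch of one usual remedy, an
empty stub for the disabled case, follows. FEATURE_ENABLED,
run_late_handlers() and timer_tick() are hypothetical stand-ins, not the
kernel's names. The v3 posting in this thread appears to take the other
usual route, placing the mce_run_late_handlers() prototype outside the
CONFIG_PPC_BOOK3S_64 #ifdef in asm/mce.h.

```c
#include <assert.h>

/* Hypothetical stand-in for a Kconfig symbol; 0 models the option being off. */
#define FEATURE_ENABLED 0

static int late_handler_runs;

#if FEATURE_ENABLED
/* Real implementation, built only when the feature is on. */
static void run_late_handlers(void)
{
	late_handler_runs++;
}
#else
/* Empty stub so that callers still compile when the feature is off. */
static inline void run_late_handlers(void) { }
#endif

/* Models the timer path; returns how many times the handler has run so far. */
int timer_tick(void)
{
	assert(late_handler_runs >= 0);
	if (FEATURE_ENABLED)	/* constant condition, but the call is still compiled */
		run_late_handlers();
	return late_handler_runs;
}
```

Removing both definitions of run_late_handlers() while keeping the call
in timer_tick() reproduces the same class of failure as the report: an
implicit function declaration, which -Werror turns into a build error.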


Re: [PATCH] powerpc/eeh: Delay slot presence check once driver is notified about the pci error.

2021-11-24 Thread Mahesh J Salgaonkar
On 2021-11-24 10:14:30 Wed, Michael Ellerman wrote:
> Mahesh Salgaonkar  writes:
> > When a certain PHB HW failure causes phyp to recover the PHB, it marks the PE
> > state as temporarily unavailable until recovery is complete. This also
> > triggers an EEH handler in Linux which needs to notify drivers, and perform
> > recovery. But before notifying the driver about the pci error it uses
> > get_adapter_status()->get-sensor-state() operation of the hotplug_slot to
> > determine if the slot contains a device or not. If the slot is empty, the
> > recovery is skipped entirely.
> >
> > However, on certain PHB failures, the rtas call get-sensor-state() returns
> > extended busy error (9902) until PHB is recovered by phyp. Once PHB is
> > recovered, the get-sensor-state() returns success with correct presence
> > status. The rtas call interface rtas_get_sensor() loops over the rtas call
> > on extended delay return code (9902) until the return value is either
> > success (0) or error (-1). This causes the EEH handler to get stuck for ~6
> > seconds before it could notify that the pci error has been detected and
> > stop any active operations. Hence with running I/O traffic, during this 6
> > seconds, the network driver continues its operation and hits a timeout
> > (netdev watchdog). On timeout, the network driver goes into FFDC capture mode
> > and the reset path, assuming the PCI device is in a fatal condition. This causes
> > EEH recovery to fail and sometimes it leads to system hang or crash.
> >
> > 
> > [52732.244731] DEBUG: ibm_read_slot_reset_state2()
> > [52732.244762] DEBUG: ret = 0, rets[0]=5, rets[1]=1, rets[2]=4000, 
> > rets[3]=0x0
> > [52732.244798] DEBUG: in eeh_slot_presence_check
> > [52732.244804] DEBUG: error state check
> > [52732.244807] DEBUG: Is slot hotpluggable
> > [52732.244810] DEBUG: hotpluggable ops ?
> > [52732.244953] DEBUG: Calling ops->get_adapter_status
> > [52732.244958] DEBUG: calling rpaphp_get_sensor_state
> > [52736.564262] [ cut here ]
> > [52736.564299] NETDEV WATCHDOG: enP64p1s0f3 (tg3): transmit queue 0 timed 
> > out
> > [52736.564324] WARNING: CPU: 1442 PID: 0 at net/sched/sch_generic.c:478 
> > dev_watchdog+0x438/0x440
> > [...]
> > [52736.564505] NIP [c0c32368] dev_watchdog+0x438/0x440
> > [52736.564513] LR [c0c32364] dev_watchdog+0x434/0x440
> > 
> >
> > To fix this issue, delay the slot presence check until after the driver has
> > been notified about the PCI error.
> 
> How does this interact with the commit that put the slot presence check
> there in the first place:
> 
>   b104af5a7687 ("powerpc/eeh: Check slot presence state in 
> eeh_handle_normal_event()")
> 
> 
> It seems like delaying the slot presence check will effectively revert
> that commit?

No, it doesn't. We will still do a presence check before the recovery
process starts. This patch moves the check to after notifying the driver to
stop active I/O operations. If a presence check finds the device isn't
present, we will skip the EEH recovery. However, on a surprise hotplug,
the user will see the EEH messages on the console before it finds there
is nothing to recover.

Current EEH behaviour:

EEH event -> eeh_handle_normal_event
/* Check for adapter status */
eeh_slot_presence_check()
if (!present)
bail out early

/* Report the error */
eeh_report_error() <- notify driver about error
driver->err_handler->error_detected()
/* Any active I/O will be stopped now */

/* Start the recovery process */
eeh_reset_device()
eeh_report_resume()
/* Recovery done */

With this patch:

EEH event -> eeh_handle_normal_event
/* Report the error */
eeh_report_error() <- notify driver about error
driver->err_handler->error_detected()
/* Any active I/O will be stopped now */

/* Check for adapter status */
eeh_slot_presence_check()
if (!present)
bail out early

/* Start the recovery process */
eeh_reset_device()
eeh_report_resume()
/* Recovery done */

Thanks,
-Mahesh.
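
The two orderings above can be compared with a small model. This is a
sketch under stated assumptions, not the EEH code: handle_event(),
struct eeh_outcome and the enum are hypothetical, time is counted in
whole seconds, and the presence check is modelled as simply blocking for
check_delay seconds (as rtas_get_sensor() does while the PHB is being
recovered). The watchdog is assumed to fire only if I/O is still running
when the timeout is reached.

```c
#include <assert.h>
#include <stdbool.h>

enum eeh_order { CHECK_FIRST, NOTIFY_FIRST };

struct eeh_outcome {
	int notified_at;	/* seconds after the event; -1 if never notified */
	bool watchdog_fired;	/* I/O kept running until the driver timeout */
	bool recovery_started;
};

struct eeh_outcome handle_event(enum eeh_order order, bool present,
				int check_delay, int watchdog_timeout)
{
	struct eeh_outcome out = { -1, false, false };
	int now = 0;		/* seconds since the EEH event */

	assert(check_delay >= 0 && watchdog_timeout > 0);

	if (order == NOTIFY_FIRST) {
		out.notified_at = now;	/* driver stops I/O immediately */
		/* The presence check may still block, but no I/O is active. */
		out.recovery_started = present;
		return out;
	}

	now += check_delay;		/* presence check blocks first */
	if (!present)
		return out;		/* bail out early, driver never notified */
	out.notified_at = now;
	out.watchdog_fired = now >= watchdog_timeout;
	out.recovery_started = true;
	return out;
}
```

With a 6 second check and an assumed 5 second watchdog, CHECK_FIRST
notifies the driver only after the watchdog has fired, while
NOTIFY_FIRST notifies it at time 0. The model also shows the cost raised
earlier in the thread: with NOTIFY_FIRST the driver is notified even
when the slot turns out to be empty.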