Re: [PATCH 1/5] cpuidle-pseries: Set the latency-hint before entering CEDE

2020-07-19 Thread Vaidyanathan Srinivasan
* Gautham R Shenoy  [2020-07-07 16:41:35]:

> From: "Gautham R. Shenoy" 
> 
> As per the PAPR, each H_CEDE call is associated with a latency-hint to
> be passed in the VPA field "cede_latency_hint". The CEDE state that
> we have been implicitly entering so far is CEDE with latency-hint = 0.
> 
> This patch explicitly sets the latency hint corresponding to the CEDE
> state that we are currently entering. While at it, we save the
> previous hint, to be restored once we wake up from CEDE. This will be
> required in the future when we expose extended-cede states through the
> cpuidle framework, where each of them will have a different
> cede-latency hint.
> 
> Signed-off-by: Gautham R. Shenoy 

Reviewed-by: Vaidyanathan Srinivasan 


> ---
>  drivers/cpuidle/cpuidle-pseries.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpuidle/cpuidle-pseries.c 
> b/drivers/cpuidle/cpuidle-pseries.c
> index 4a37252..39d4bb6 100644
> --- a/drivers/cpuidle/cpuidle-pseries.c
> +++ b/drivers/cpuidle/cpuidle-pseries.c
> @@ -105,19 +105,27 @@ static void check_and_cede_processor(void)
>   }
>  }
> 
> +#define NR_CEDE_STATES   1  /* CEDE with latency-hint 0 */
> +#define NR_DEDICATED_STATES  (NR_CEDE_STATES + 1) /* Includes snooze */
> +
> +u8 cede_latency_hint[NR_DEDICATED_STATES];
>  static int dedicated_cede_loop(struct cpuidle_device *dev,
>   struct cpuidle_driver *drv,
>   int index)
>  {
> + u8 old_latency_hint;
> 
>   pseries_idle_prolog();
>   get_lppaca()->donate_dedicated_cpu = 1;
> + old_latency_hint = get_lppaca()->cede_latency_hint;
> + get_lppaca()->cede_latency_hint = cede_latency_hint[index];
> 
>   HMT_medium();
>   check_and_cede_processor();
> 
>   local_irq_disable();
>   get_lppaca()->donate_dedicated_cpu = 0;
> + get_lppaca()->cede_latency_hint = old_latency_hint;
> 
>   pseries_idle_epilog();
> 
> @@ -149,7 +157,7 @@ static int shared_cede_loop(struct cpuidle_device *dev,
>  /*
>   * States for dedicated partition case.
>   */
> -static struct cpuidle_state dedicated_states[] = {
> +static struct cpuidle_state dedicated_states[NR_DEDICATED_STATES] = {
>   { /* Snooze */
>   .name = "snooze",
>   .desc = "snooze",


Saving and restoring the current cede hint value helps in maintaining
compatibility with other parts of the kernel.  Over the long term we can
make the cpuidle driver deterministically set the CEDE hint at each
invocation of the H_CEDE call so that we do not have to do multiple
redundant save-restores.
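
Something like the following, for instance (untested sketch; it assumes
every H_CEDE call site programs its own hint, so nothing relies on the
previous value being preserved):

static int dedicated_cede_loop(struct cpuidle_device *dev,
			       struct cpuidle_driver *drv,
			       int index)
{
	pseries_idle_prolog();
	get_lppaca()->donate_dedicated_cpu = 1;
	/*
	 * Program the hint for this state; no save/restore is needed if
	 * every H_CEDE caller sets its own hint.
	 */
	get_lppaca()->cede_latency_hint = cede_latency_hint[index];

	HMT_medium();
	check_and_cede_processor();

	local_irq_disable();
	get_lppaca()->donate_dedicated_cpu = 0;

	pseries_idle_epilog();

	return index;
}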

This is a reasonable start to cleaning up the cpuidle subsystem on PAPR
guests.

--Vaidy



Re: [PATCH v2 2/2] selftest/cpuidle: Add support for cpuidle latency measurement

2020-07-19 Thread Gautham R Shenoy
Hi Pratik,


On Fri, Jul 17, 2020 at 02:48:01PM +0530, Pratik Rajesh Sampat wrote:
> This patch adds support to trace IPI based and timer based wakeup
> latency from idle states
> 
> Latches onto the test-cpuidle_latency kernel module using the debugfs
> interface to send IPIs or schedule a timer based event, which in-turn
> populates the debugfs with the latency measurements.
> 
> Currently, for the IPI and timer tests, we first disable all idle states
> and then take latency measurements while incrementally enabling each state
> 
> Signed-off-by: Pratik Rajesh Sampat 

A few comments below.

> ---
>  tools/testing/selftests/Makefile   |   1 +
>  tools/testing/selftests/cpuidle/Makefile   |   6 +
>  tools/testing/selftests/cpuidle/cpuidle.sh | 257 +
>  tools/testing/selftests/cpuidle/settings   |   1 +
>  4 files changed, 265 insertions(+)
>  create mode 100644 tools/testing/selftests/cpuidle/Makefile
>  create mode 100755 tools/testing/selftests/cpuidle/cpuidle.sh
>  create mode 100644 tools/testing/selftests/cpuidle/settings
> 

[..skip..]

> +
> +ins_mod()
> +{
> + if [ ! -f "$MODULE" ]; then
> + printf "$MODULE module does not exist. Exitting\n"

If the module has been compiled into the kernel (due to a
localyesconfig, for instance), then it is unlikely that we will find
it in /lib/modules. Perhaps you want to check if the debugfs
directories created by the module exist, and if so, print a message
saying that the module is already loaded or some such?
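
For example, something along these lines (sketch only; the debugfs path
below is a placeholder for whatever directory the module actually
creates, I haven't checked the name):

	if [ -d /sys/kernel/debug/latency_test ]; then
		printf "Latency test module already loaded (or built-in)\n"
		return 0
	fi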

> + exit $ksft_skip
> + fi
> + printf "Inserting $MODULE module\n\n"
> + insmod $MODULE
> + if [ $? != 0 ]; then
> + printf "Insmod $MODULE failed\n"
> + exit $ksft_skip
> + fi
> +}
> +
> +compute_average()
> +{
> + arr=("$@")
> + sum=0
> + size=${#arr[@]}
> + for i in "${arr[@]}"
> + do
> + sum=$((sum + i))
> + done
> + avg=$((sum/size))

It would be good to assert that "size" isn't 0 here.
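
i.e. something like (sketch):

	if [ "$size" -eq 0 ]; then
		printf "compute_average: no samples\n"
		avg=0
		return 1
	fi
	avg=$((sum/size))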

> +}
> +
> +# Disable all stop states
> +disable_idle()
> +{
> + for ((cpu=0; cpu + do
> + for ((state=0; state + do
> + echo 1 > 
> /sys/devices/system/cpu/cpu$cpu/cpuidle/state$state/disable

So, on offlined CPUs, we won't see
/sys/devices/system/cpu/cpu$cpu/cpuidle/state$state directory. You
should probably perform this operation only on online CPUs.
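
A sketch of the kind of check I mean (NR_CPUS here is a stand-in for
however the script derives the CPU count; note cpu0 usually has no
"online" file, which the test below treats as online):

	for ((cpu=0; cpu<NR_CPUS; cpu++))
	do
		online_file=/sys/devices/system/cpu/cpu$cpu/online
		if [ -f "$online_file" ] && [ "$(cat $online_file)" -eq 0 ]; then
			continue	# skip offline CPUs
		fi
		# ... existing per-state disable loop goes here ...
	done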


> + done
> + done
> +}
> +
> +# Perform operation on each CPU for the given state
> +# $1 - Operation: enable (0) / disable (1)
> +# $2 - State to enable
> +op_state()
> +{
> + for ((cpu=0; cpu + do
> + echo $1 > 
> /sys/devices/system/cpu/cpu$cpu/cpuidle/state$2/disable


Ditto

> + done
> +}

This is a helper function. For better readability of the main code, you
can define the following wrappers and use them (note that op_state takes
0 to enable a state and 1 to disable it):


cpuidle_enable_state()
{
state=$1
op_state 0 $state
}

cpuidle_disable_state()
{
state=$1
op_state 1 $state
}


> +

[..snip..]

> +run_ipi_tests()
> +{
> +extract_latency
> +disable_idle
> +declare -a avg_arr
> +echo -e "--IPI Latency Test---" >> $LOG
> +
> + echo -e "--Baseline IPI Latency measurement: CPU Busy--" >> $LOG
> + printf "%s %10s %12s\n" "SRC_CPU" "DEST_CPU" "IPI_Latency(ns)" 
> >> $LOG
> + for ((cpu=0; cpu + do
> + ipi_test_once "baseline" $cpu
> + printf "%-3s %10s %12s\n" $src_cpu $cpu $ipi_latency >> 
> $LOG
> + avg_arr+=($ipi_latency)
> + done
> + compute_average "${avg_arr[@]}"
> + echo -e "Baseline Average IPI latency(ns): $avg" >> $LOG
> +
> +for ((state=0; state +do
> + unset avg_arr
> + echo -e "---Enabling state: $state---" >> $LOG
> + op_state 0 $state
> + printf "%s %10s %12s\n" "SRC_CPU" "DEST_CPU" 
> "IPI_Latency(ns)" >> $LOG
> + for ((cpu=0; cpu + do

If a CPU is offline, then we should skip it here.

> + # Running IPI test and logging results
> + sleep 1
> + ipi_test_once "test" $cpu
> + printf "%-3s %10s %12s\n" $src_cpu $cpu 
> $ipi_latency >> $LOG
> + avg_arr+=($ipi_latency)
> + done
> + compute_average "${avg_arr[@]}"
> + echo -e "Expected IPI latency(ns): 
> ${latency_arr[$state]}" >> $LOG
> + echo -e "Observed Average IPI latency(ns): $avg" >> $LOG
> + op_state 1 $state
> +done
> +}
> +
> +# Extract the residency in microseconds and convert to nanoseconds.
> +# Add 100 ns so that the timer stays for a little longer than the residency
> +extract_residency()
> +{
> 

Re: [PATCH 0/5] cpuidle-pseries: Parse extended CEDE information for idle.

2020-07-19 Thread Vaidyanathan Srinivasan
* Gautham R Shenoy  [2020-07-07 16:41:34]:

> From: "Gautham R. Shenoy" 
> 
> Hi,
> 
> On pseries Dedicated Linux LPARs, apart from the polling snooze idle
> state, we currently have the CEDE idle state which cedes the CPU to
> the hypervisor with latency-hint = 0.
> 
> However, the PowerVM hypervisor supports additional extended CEDE
> states, which can be queried through the "ibm,get-systems-parameter"
> rtas-call with the CEDE_LATENCY_TOKEN. The hypervisor maps these
> extended CEDE states to appropriate platform idle-states in order to
> provide energy-savings as well as shifting power to the active
> units. On existing pseries LPARs today we have extended CEDE with
> latency-hints {1,2} supported.
> 
> In Patches 1-3 of this patchset, we add the code to parse the CEDE
> latency records provided by the hypervisor. We use this information to
> determine the wakeup latency of the regular CEDE (which we have so far
> been hardcoding to 10us, while experimentally it is much lower, ~1us),
> by looking at the wakeup latency provided by the hypervisor for
> Extended CEDE states. Since the platform currently advertises Extended
> CEDE 1 to have wakeup latency of 2us, we can be sure that the wakeup
> latency of the regular CEDE is no more than this.
> 
> Patch 4 (currently marked as RFC) exposes the extended CEDE states
> parsed above to the cpuidle framework, provided that they can wakeup
> on an interrupt. On current platforms only Extended CEDE 1 fits the
> bill, but this is going to change in future platforms where even
> Extended CEDE 2 may be responsive to external interrupts.
> 
> Patch 5 (currently marked as RFC) filters out Extended CEDE 1 since
> it offers no added advantage over the normal CEDE.
> 
> With Patches 1-3, we see an improvement in the single-threaded
> performance on ebizzy.
> 
> 2 ebizzy threads bound to the same big-core. 25% improvement in the
> avg records/s (higher the better) with patches 1-3.
> x without_patches
> * with_patches
> N           Min       Max    Median       Avg    Stddev
> x  10   2491089   5834307   5398375   4244335 1596244.9
> *  10   2893813   5834474   5832448 5327281.3 1055941.4
> 
> We do not observe any major regression in either the context_switch2
> benchmark or the schbench benchmark
> 
> context_switch2 across CPU0 CPU1 (Both belong to same big-core, but different
> small cores). We observe a minor 0.14% regression in the number of
> context-switches (higher is better).
> x without_patch
> * with_patch
> N        Min       Max    Median        Avg     Stddev
> x 500    348872    362236    354712    354745.69   2711.827
> * 500    349422    361452    353942    354215.4    2576.9258
> 
> context_switch2 across CPU0 CPU8 (Different big-cores). We observe a 0.37%
> improvement in the number of context-switches (higher is better).
> x without_patch
> * with_patch
> N        Min       Max    Median        Avg     Stddev
> x 500    287956    294940    288896    288977.23   646.59295
> * 500    288300    294646    289582    290064.76   1161.9992
> 
> schbench:
> No major difference could be seen until the 99.9th percentile.
> 
> Without-patch
> Latency percentiles (usec)
>   50.0th: 29
>   75.0th: 39
>   90.0th: 49
>   95.0th: 59
>   *99.0th: 13104
>   99.5th: 14672
>   99.9th: 15824
>   min=0, max=17993
> 
> With-patch:
> Latency percentiles (usec)
>   50.0th: 29
>   75.0th: 40
>   90.0th: 50
>   95.0th: 61
>   *99.0th: 13648
>   99.5th: 14768
>   99.9th: 15664
>   min=0, max=29812

This patch series mainly cleans up the CEDE latency discovery and
prepares to add different cpuidle states in a virtualised environment.
This helps in improving SMT folding speed as well as power savings and
power shifting with newer platform firmware.

The current benefit is primarily from faster SMT folding and the
resulting single-thread performance, achieved by updating the
platform-firmware-provided heuristics in the cpuidle states.

--Vaidy





Re: [PATCH 10/11] powerpc/smp: Implement cpu_to_coregroup_id

2020-07-19 Thread Srikar Dronamraju
* Gautham R Shenoy  [2020-07-17 13:56:53]:

> On Tue, Jul 14, 2020 at 10:06:23AM +0530, Srikar Dronamraju wrote:
> > Lookup the coregroup id from the associativity array.
> > 
> > If unable to detect the coregroup id, fallback on the core id.
> > This way, ensure sched_domain degenerates and an extra sched domain is
> > not created.
> > 
> > Ideally this function should have been implemented in
> > arch/powerpc/kernel/smp.c. However if its implemented in mm/numa.c, we
> > don't need to find the primary domain again.
> > 
> > If the device-tree mentions more than one coregroup, then kernel
> > implements only the last or the smallest coregroup, which currently
> > corresponds to the penultimate domain in the device-tree.
> > 
> > Cc: linuxppc-dev 
> > Cc: Michael Ellerman 
> > Cc: Nick Piggin 
> > Cc: Oliver OHalloran 
> > Cc: Nathan Lynch 
> > Cc: Michael Neuling 
> > Cc: Anton Blanchard 
> > Cc: Gautham R Shenoy 
> > Cc: Vaidyanathan Srinivasan 
> > Signed-off-by: Srikar Dronamraju 
> > ---
> >  arch/powerpc/mm/numa.c | 17 +
> >  1 file changed, 17 insertions(+)
> > 
> > diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> > index d9ab9da85eab..4e85564ef62a 100644
> > --- a/arch/powerpc/mm/numa.c
> > +++ b/arch/powerpc/mm/numa.c
> > @@ -1697,6 +1697,23 @@ static const struct proc_ops topology_proc_ops = {
> > 
> >  int cpu_to_coregroup_id(int cpu)
> >  {
> > +   __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
> > +   int index;
> > +
> 
> It would be good to have an assert here to ensure that we are calling
> this function only when coregroups are enabled.
> 
> Else, we may end up returning the penultimate index which maps to the
> chip-id.
> 

We have a check below exactly for the same reason. Please look below.

> 
> 
> > +   if (cpu < 0 || cpu > nr_cpu_ids)
> > +   return -1;
> > +
> > +   if (!firmware_has_feature(FW_FEATURE_VPHN))
> > +   goto out;
> > +
> > +   if (vphn_get_associativity(cpu, associativity))
> > +   goto out;
> > +
> > +   index = of_read_number(associativity, 1);
> > +   if ((index > min_common_depth + 1) && coregroup_enabled)
> > +   return of_read_number([index - 1], 1);

See ^above.

index would be the number of domains in the associativity array;
min_common_depth would be where the primary domain or the chip-id is
defined. So we are reading the penultimate domain if and only if the
min_common_depth isn't the primary domain aka chip-id.

What other checks/assertions can we add?


> > +
> > +out:
> > return cpu_to_core_id(cpu);
> >  }
> > 
> > -- 
> > 2.17.1
> > 

-- 
Thanks and Regards
Srikar Dronamraju


Re: [FIX PATCH] powerpc/prom: Enable Radix GTSE in cpu pa-features

2020-07-19 Thread Nicholas Piggin
Excerpts from Bharata B Rao's message of July 20, 2020 2:42 pm:
> From: Nicholas Piggin 
> 
> When '029ab30b4c0a ("powerpc/mm: Enable radix GTSE only if supported.")'
> made GTSE an MMU feature, it was enabled by default in
> powerpc-cpu-features but was missed in pa-features. This causes
> random memory corruption during boot of PowerNV kernels where
> CONFIG_PPC_DT_CPU_FTRS isn't enabled.

Thanks for writing this up, I got a bit bogged down with other things.

> Fixes: 029ab30b4c0a ("powerpc/mm: Enable radix GTSE only if supported.")
> Reported-by: Qian Cai 
> Signed-off-by: Nicholas Piggin 
> Signed-off-by: Bharata B Rao 
> ---
>  arch/powerpc/kernel/prom.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 9cc49f265c86..a9594bad572a 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -163,7 +163,8 @@ static struct ibm_pa_feature {
>   { .pabyte = 0,  .pabit = 6, .cpu_features  = CPU_FTR_NOEXECUTE },
>   { .pabyte = 1,  .pabit = 2, .mmu_features  = MMU_FTR_CI_LARGE_PAGE },
>  #ifdef CONFIG_PPC_RADIX_MMU
> - { .pabyte = 40, .pabit = 0, .mmu_features  = MMU_FTR_TYPE_RADIX },
> + { .pabyte = 40, .pabit = 0,
> +   .mmu_features  = (MMU_FTR_TYPE_RADIX | MMU_FTR_GTSE) },

It might look better like this:

{ .pabyte = 1,  .pabit = 2, .mmu_features  = MMU_FTR_CI_LARGE_PAGE },
#ifdef CONFIG_PPC_RADIX_MMU
{ .pabyte = 40, .pabit = 0, .mmu_features  = MMU_FTR_TYPE_RADIX },
{ .pabyte = 40, .pabit = 0, .mmu_features  = MMU_FTR_TYPE_RADIX |
 MMU_FTR_GTSE },
#endif
{ .pabyte = 1,  .pabit = 1, .invert = 1, .cpu_features = 
CPU_FTR_NODSISRALIGN },

But that's bikeshedding a bit and the optional bits already put it out 
of alignment.

Thanks,
Nick



Re: [PATCH v2 13/16] scripts/kallsyms: move ignored symbol types to is_ignored_symbol()

2020-07-19 Thread Masahiro Yamada
On Mon, Jul 20, 2020 at 10:46 AM Finn Thain  wrote:
>
> On Sun, 24 Nov 2019, Masahiro Yamada wrote:
>
> > Collect the ignored patterns to is_ignored_symbol().
> >
> > Signed-off-by: Masahiro Yamada 
>
> This commit (887df76de67f5) caused a regression in my powerpc builds as it
> causes symbol names to disappear from backtraces:
>
> [ cut here ]
> WARNING: CPU: 0 PID: 0 at kernel/smp.c:433 _einittext+0x3f9e5120/0x3feb71b8
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc7-pmac-00055-g887df76de67f5 
> #18
> NIP:  c00aef68 LR: c00af114 CTR: c001272c
> REGS: c0705c40 TRAP: 0700   Not tainted  (5.4.0-rc7-pmac-00055-g887df76de67f5)
> MSR:  00029032   CR: 4244  XER: 
>
> GPR00: 001f0100 c0705cf8 c06dc300 c070af1c c001258c   ef7fb5bc
> GPR08: 0880 0100 0001 0100 4244  c0709040 0004
> GPR16: 0001 c06022b4 c058297c 0022 8cb9  c06d84a0 c071
> GPR24: c071   c070af1c c070af1c  c001258c 
> NIP [c00aef68] _einittext+0x3f9e5120/0x3feb71b8
> LR [c00af114] _einittext+0x3f9e52cc/0x3feb71b8
> Call Trace:
> [c0705cf8] [ef006320] 0xef006320 (unreliable)
> [c0705d38] [c00af114] _einittext+0x3f9e52cc/0x3feb71b8
> [c0705d48] [c00af158] _einittext+0x3f9e5310/0x3feb71b8
> [c0705d68] [c0012768] _einittext+0x3f948920/0x3feb71b8
> [c0705d78] [c0092c04] _einittext+0x3f9c8dbc/0x3feb71b8
> [c0705d88] [c0092d18] _einittext+0x3f9c8ed0/0x3feb71b8
> [c0705da8] [c0093a2c] _einittext+0x3f9c9be4/0x3feb71b8
> [c0705de8] [c0580224] _einittext+0x3feb63dc/0x3feb71b8
> [c0705e48] [c00382ec] _einittext+0x3f96e4a4/0x3feb71b8
> [c0705e58] [c000d2a0] _einittext+0x3f943458/0x3feb71b8
> [c0705e88] [c001353c] _einittext+0x3f9496f4/0x3feb71b8
> --- interrupt: 901 at _einittext+0x3f941058/0x3feb71b8
> LR = _einittext+0x3f941058/0x3feb71b8
> [c0705f50] [c06cc214] 0xc06cc214 (unreliable)
> [c0705f60] [c057fa20] _einittext+0x3feb5bd8/0x3feb71b8
> [c0705f70] [c005de48] _einittext+0x3f994000/0x3feb71b8
> [c0705f90] [c005e050] _einittext+0x3f994208/0x3feb71b8
> [c0705fa0] [c0004cc8] _einittext+0x3f93ae80/0x3feb71b8
> [c0705fb0] [c069a36c] _einittext+0x3ffd0524/0x4000
> [c0705ff0] [3500] 0x3500
> Instruction dump:
> 7c0803a6 7fa5eb78 7d808120 7ea6ab78 baa10014 38210040 4bfffbb0 7f64db78
> 7f85e378 484b31b1 7c601b78 4bfffdf4 <0fe0> 4bfffd60 9421ffe0 7c0802a6
> ---[ end trace a06fef4788747c72 ]---
>
>
> Prior to that (e.g. 97261e1e2240f), I get backtraces like this:
>
> [ cut here ]
> WARNING: CPU: 0 PID: 0 at kernel/smp.c:433 smp_call_function_many+0x318/0x320
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc7-pmac-00054-g97261e1e2240f 
> #20
> NIP:  c00aef68 LR: c00af114 CTR: c001272c
> REGS: c075dc40 TRAP: 0700   Not tainted  (5.4.0-rc7-pmac-00054-g97261e1e2240f)
> MSR:  00029032   CR: 4244  XER: 
>
> GPR00: 001f0100 c075dcf8 c0733300 c0762f1c c001258c   ef7fb5bc
> GPR08: 0480 0100 0001 0100 4244  c0761040 0004
> GPR16: 0001 c0658e58 c058297c 0022 8cb9  c072f4a0 c076
> GPR24: c076   c0762f1c c0762f1c  c001258c 
> NIP [c00aef68] smp_call_function_many+0x318/0x320
> LR [c00af114] smp_call_function+0x34/0x44
> Call Trace:
> [c075dcf8] [ef006320] 0xef006320 (unreliable)
> [c075dd38] [c00af114] smp_call_function+0x34/0x44
> [c075dd48] [c00af158] on_each_cpu+0x1c/0x4c
> [c075dd68] [c0012768] tau_timeout_smp+0x3c/0x4c
> [c075dd78] [c0092c04] call_timer_fn.isra.26+0x20/0x84
> [c075dd88] [c0092d18] expire_timers+0xb0/0xc0
> [c075dda8] [c0093a2c] run_timer_softirq+0xa4/0x1a4
> [c075dde8] [c0580224] __do_softirq+0x11c/0x280
> [c075de48] [c00382ec] irq_exit+0xc0/0xd4
> [c075de58] [c000d2a0] timer_interrupt+0x154/0x260
> [c075de88] [c001353c] ret_from_except+0x0/0x14
> --- interrupt: 901 at arch_cpu_idle+0x24/0x78
> LR = arch_cpu_idle+0x24/0x78
> [c075df50] [c0723214] 0xc0723214 (unreliable)
> [c075df60] [c057fa20] default_idle_call+0x38/0x58
> [c075df70] [c005de48] do_idle+0xd4/0x17c
> [c075df90] [c005e054] cpu_startup_entry+0x24/0x28
> [c075dfa0] [c0004cc8] rest_init+0xa8/0xbc
> [c075dfb0] [c06f136c] start_kernel+0x40c/0x420
> [c075dff0] [3500] 0x3500
> Instruction dump:
> 7c0803a6 7fa5eb78 7d808120 7ea6ab78 baa10014 38210040 4bfffbb0 7f64db78
> 7f85e378 484b31b1 7c601b78 4bfffdf4 <0fe0> 4bfffd60 9421ffe0 7c0802a6
> ---[ end trace 784c7f15ecd23941 ]---
>
> Has anyone else observed these problems (either the WARNING from
> smp_call_function_many() or the missing symbol names)?
>
> What is the best way to fix this? Should I upgrade binutils?



I got a similar report before.

I'd like to know whether or not
this is the same issue as fixed by
7883a14339299773b2ce08dcfd97c63c199a9289


Does your problem happen on the latest kernel?
Which version of binutils are you using?


-- 
Best Regards
Masahiro Yamada


Re: [PATCH v2 1/2] cpuidle: Trace IPI based and timer based wakeup latency from idle states

2020-07-19 Thread Gautham R Shenoy
On Fri, Jul 17, 2020 at 02:48:00PM +0530, Pratik Rajesh Sampat wrote:
> Fire directed smp_call_function_single IPIs from a specified source
> CPU to the specified target CPU to reduce the noise we have to wade
> through in the trace log.
> The module is based on the idea written by Srivatsa Bhat and maintained
> by Vaidyanathan Srinivasan internally.
> 
> Queue HR timer and measure jitter. Wakeup latency measurement for idle
> states using hrtimer.  Echo a value in ns to timer_test_function and
> watch trace. A HRtimer will be queued and when it fires, the expected
> wakeup vs actual wakeup is computed and the delay printed in ns.
> 
> Implemented as a module which utilizes debugfs so that it can be
> integrated with selftests.
> 
> To include the module, check option and include as module
> kernel hacking -> Cpuidle latency selftests
> 
> [srivatsa.b...@linux.vnet.ibm.com: Initial implementation in
>  cpidle/sysfs]
> 
> [sva...@linux.vnet.ibm.com: wakeup latency measurements using hrtimer
>  and fix some of the time calculation]
> 
> [e...@linux.vnet.ibm.com: Fix some whitespace and tab errors and
>  increase the resolution of IPI wakeup]
> 
> Signed-off-by: Pratik Rajesh Sampat 


The debugfs module looks good to me.

Reviewed-by: Gautham R. Shenoy 


> ---
>  drivers/cpuidle/Makefile   |   1 +
>  drivers/cpuidle/test-cpuidle_latency.c | 150 +
>  lib/Kconfig.debug  |  10 ++
>  3 files changed, 161 insertions(+)
>  create mode 100644 drivers/cpuidle/test-cpuidle_latency.c
> 
> diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile
> index f07800cbb43f..2ae05968078c 100644
> --- a/drivers/cpuidle/Makefile
> +++ b/drivers/cpuidle/Makefile
> @@ -8,6 +8,7 @@ obj-$(CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED) += coupled.o
>  obj-$(CONFIG_DT_IDLE_STATES)   += dt_idle_states.o
>  obj-$(CONFIG_ARCH_HAS_CPU_RELAX)   += poll_state.o
>  obj-$(CONFIG_HALTPOLL_CPUIDLE) += cpuidle-haltpoll.o
> +obj-$(CONFIG_IDLE_LATENCY_SELFTEST)+= test-cpuidle_latency.o
> 
>  
> ##
>  # ARM SoC drivers
> diff --git a/drivers/cpuidle/test-cpuidle_latency.c 
> b/drivers/cpuidle/test-cpuidle_latency.c
> new file mode 100644
> index ..61574665e972
> --- /dev/null
> +++ b/drivers/cpuidle/test-cpuidle_latency.c
> @@ -0,0 +1,150 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Module-based API test facility for cpuidle latency using IPIs and timers
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +/* IPI based wakeup latencies */
> +struct latency {
> + unsigned int src_cpu;
> + unsigned int dest_cpu;
> + ktime_t time_start;
> + ktime_t time_end;
> + u64 latency_ns;
> +} ipi_wakeup;
> +
> +static void measure_latency(void *info)
> +{
> + struct latency *v;
> + ktime_t time_diff;
> +
> + v = (struct latency *)info;
> + v->time_end = ktime_get();
> + time_diff = ktime_sub(v->time_end, v->time_start);
> + v->latency_ns = ktime_to_ns(time_diff);
> +}
> +
> +void run_smp_call_function_test(unsigned int cpu)
> +{
> + ipi_wakeup.src_cpu = smp_processor_id();
> + ipi_wakeup.dest_cpu = cpu;
> + ipi_wakeup.time_start = ktime_get();
> + smp_call_function_single(cpu, measure_latency, &ipi_wakeup, 1);
> +}
> +
> +/* Timer based wakeup latencies */
> +struct timer_data {
> + unsigned int src_cpu;
> + u64 timeout;
> + ktime_t time_start;
> + ktime_t time_end;
> + struct hrtimer timer;
> + u64 timeout_diff_ns;
> +} timer_wakeup;
> +
> +static enum hrtimer_restart timer_called(struct hrtimer *hrtimer)
> +{
> + struct timer_data *w;
> + ktime_t time_diff;
> +
> + w = container_of(hrtimer, struct timer_data, timer);
> + w->time_end = ktime_get();
> +
> + time_diff = ktime_sub(w->time_end, w->time_start);
> + time_diff = ktime_sub(time_diff, ns_to_ktime(w->timeout));
> + w->timeout_diff_ns = ktime_to_ns(time_diff);
> + return HRTIMER_NORESTART;
> +}
> +
> +static void run_timer_test(unsigned int ns)
> +{
> + hrtimer_init(&timer_wakeup.timer, CLOCK_MONOTONIC,
> +  HRTIMER_MODE_REL);
> + timer_wakeup.timer.function = timer_called;
> + timer_wakeup.time_start = ktime_get();
> + timer_wakeup.src_cpu = smp_processor_id();
> + timer_wakeup.timeout = ns;
> +
> + hrtimer_start(&timer_wakeup.timer, ns_to_ktime(ns),
> +   HRTIMER_MODE_REL_PINNED);
> +}
> +
> +static struct dentry *dir;
> +
> +static int cpu_read_op(void *data, u64 *value)
> +{
> + *value = ipi_wakeup.dest_cpu;
> + return 0;
> +}
> +
> +static int cpu_write_op(void *data, u64 value)
> +{
> + run_smp_call_function_test(value);
> + return 0;
> +}
> +DEFINE_SIMPLE_ATTRIBUTE(ipi_ops, cpu_read_op, cpu_write_op, "%llu\n");
> +
> +static int timeout_read_op(void *data, u64 *value)
> +{
> + *value = timer_wakeup.timeout;
> + return 0;
> +}
> 

[FIX PATCH] powerpc/prom: Enable Radix GTSE in cpu pa-features

2020-07-19 Thread Bharata B Rao
From: Nicholas Piggin 

When '029ab30b4c0a ("powerpc/mm: Enable radix GTSE only if supported.")'
made GTSE an MMU feature, it was enabled by default in
powerpc-cpu-features but was missed in pa-features. This causes
random memory corruption during boot of PowerNV kernels where
CONFIG_PPC_DT_CPU_FTRS isn't enabled.

Fixes: 029ab30b4c0a ("powerpc/mm: Enable radix GTSE only if supported.")
Reported-by: Qian Cai 
Signed-off-by: Nicholas Piggin 
Signed-off-by: Bharata B Rao 
---
 arch/powerpc/kernel/prom.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 9cc49f265c86..a9594bad572a 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -163,7 +163,8 @@ static struct ibm_pa_feature {
{ .pabyte = 0,  .pabit = 6, .cpu_features  = CPU_FTR_NOEXECUTE },
{ .pabyte = 1,  .pabit = 2, .mmu_features  = MMU_FTR_CI_LARGE_PAGE },
 #ifdef CONFIG_PPC_RADIX_MMU
-   { .pabyte = 40, .pabit = 0, .mmu_features  = MMU_FTR_TYPE_RADIX },
+   { .pabyte = 40, .pabit = 0,
+ .mmu_features  = (MMU_FTR_TYPE_RADIX | MMU_FTR_GTSE) },
 #endif
{ .pabyte = 1,  .pabit = 1, .invert = 1, .cpu_features = 
CPU_FTR_NODSISRALIGN },
{ .pabyte = 5,  .pabit = 0, .cpu_features  = CPU_FTR_REAL_LE,
-- 
2.26.2



Re: [PATCH v2 2/3] powerpc/powernv/idle: save-restore DAWR0, DAWRX0 for P10

2020-07-19 Thread Ravi Bangoria

Hi Nick,

On 7/13/20 11:22 AM, Nicholas Piggin wrote:

Excerpts from Pratik Rajesh Sampat's message of July 10, 2020 3:22 pm:

Additional registers DAWR0, DAWRX0 may be lost on Power 10 for
stop levels < 4.
Therefore save the values of these SPRs before entering a  "stop"
state and restore their values on wakeup.


Hmm, where do you get this from? Documentation I see says DAWR is lost
on POWER9 but not P10.

Does idle thread even need to save DAWR, or does it get switched when
going to a thread that has a watchpoint set?


I don't know how idle states work internally but IIUC, we need to save/restore
DAWRs. This is needed when a user creates a per-cpu watchpoint event.

Ravi


Re: [PATCH v2 2/3] powerpc/powernv/idle: save-restore DAWR0,DAWRX0 for P10

2020-07-19 Thread Ravi Bangoria

Hi Pratik,

On 7/10/20 10:52 AM, Pratik Rajesh Sampat wrote:

Additional registers DAWR0, DAWRX0 may be lost on Power 10 for
stop levels < 4.


p10 has one more pair DAWR1/DAWRX1. Please include that as well.
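
For illustration, the save side would then look roughly like the
following (the sprs.* field names are made up for this sketch, and the
SPRN_DAWR1/SPRN_DAWRX1 definitions are assumed to come from the 2nd-DAWR
series):

	if (cpu_has_feature(CPU_FTR_DAWR)) {
		sprs.dawr0  = mfspr(SPRN_DAWR0);
		sprs.dawrx0 = mfspr(SPRN_DAWRX0);
	}
	if (cpu_has_feature(CPU_FTR_DAWR1)) {
		sprs.dawr1  = mfspr(SPRN_DAWR1);
		sprs.dawrx1 = mfspr(SPRN_DAWRX1);
	}

with the corresponding mtspr()s on the restore path.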

Ravi


Re: [PATCH v4 09/10] powerpc/watchpoint: Return available watchpoints dynamically

2020-07-19 Thread Jordan Niethe
On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
 wrote:
>
> So far Book3S Powerpc supported only one watchpoint. Power10 is
> introducing 2nd DAWR. Enable 2nd DAWR support for Power10.
> Availability of 2nd DAWR will depend on CPU_FTR_DAWR1.
>
> Signed-off-by: Ravi Bangoria 
> ---
>  arch/powerpc/include/asm/cputable.h  | 4 +++-
>  arch/powerpc/include/asm/hw_breakpoint.h | 5 +++--
>  2 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/cputable.h 
> b/arch/powerpc/include/asm/cputable.h
> index 3445c86e1f6f..36a0851a7a9b 100644
> --- a/arch/powerpc/include/asm/cputable.h
> +++ b/arch/powerpc/include/asm/cputable.h
> @@ -633,7 +633,9 @@ enum {
>   * Maximum number of hw breakpoint supported on powerpc. Number of
>   * breakpoints supported by actual hw might be less than this.
>   */
> -#define HBP_NUM_MAX1
> +#define HBP_NUM_MAX2
> +#define HBP_NUM_ONE1
> +#define HBP_NUM_TWO2
I wonder if these defines are necessary - is there any advantage over
just using the literal?
>
>  #endif /* !__ASSEMBLY__ */
>
> diff --git a/arch/powerpc/include/asm/hw_breakpoint.h 
> b/arch/powerpc/include/asm/hw_breakpoint.h
> index cb424799da0d..d4eab1694bcd 100644
> --- a/arch/powerpc/include/asm/hw_breakpoint.h
> +++ b/arch/powerpc/include/asm/hw_breakpoint.h
> @@ -5,10 +5,11 @@
>   * Copyright 2010, IBM Corporation.
>   * Author: K.Prasad 
>   */
> -
Was removing this line deliberate?
>  #ifndef _PPC_BOOK3S_64_HW_BREAKPOINT_H
>  #define _PPC_BOOK3S_64_HW_BREAKPOINT_H
>
> +#include 
> +
>  #ifdef __KERNEL__
>  struct arch_hw_breakpoint {
> unsigned long   address;
> @@ -46,7 +47,7 @@ struct arch_hw_breakpoint {
>
>  static inline int nr_wp_slots(void)
>  {
> -   return HBP_NUM_MAX;
> +   return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_TWO : HBP_NUM_ONE;
So it'd be something like:
+   return cpu_has_feature(CPU_FTR_DAWR1) ? HBP_NUM_MAX : 1;
But thinking that more slots might be added in the future, it may be
better to make the number of slots a variable that is set during init
and then have this function return that.
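
Something like this, for instance (sketch; nr_wp_slots_avail and the
init hook are made-up names, and the variable would have to live in a .c
file with an extern declaration in the header):

static int nr_wp_slots_avail;

static int __init init_nr_wp_slots(void)
{
	/* Discover the slot count once, based on the DAWR1 feature bit. */
	nr_wp_slots_avail = cpu_has_feature(CPU_FTR_DAWR1) ? 2 : 1;
	return 0;
}
arch_initcall(init_nr_wp_slots);

static inline int nr_wp_slots(void)
{
	return nr_wp_slots_avail;
}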
>  }
>
>  #ifdef CONFIG_HAVE_HW_BREAKPOINT
> --
> 2.26.2
>


Re: [RFC PATCH 4/7] x86: use exit_lazy_tlb rather than membarrier_mm_sync_core_before_usermode

2020-07-19 Thread Nicholas Piggin
Excerpts from Mathieu Desnoyers's message of July 17, 2020 11:42 pm:
> - On Jul 16, 2020, at 7:26 PM, Nicholas Piggin npig...@gmail.com wrote:
> [...]
>> 
>> membarrier does replace barrier instructions on remote CPUs, which do
>> order accesses performed by the kernel on the user address space. So
>> membarrier should too I guess.
>> 
>> Normal process context accesses like read(2) will do so because they
>> don't get filtered out from IPIs, but kernel threads using the mm may
>> not.
> 
> But it should not be an issue, because membarrier's ordering is only with 
> respect
> to submit and completion of io_uring requests, which are performed through
> system calls from the context of user-space threads, which are called from the
> right mm.

Is that true? Can io completions be written into an address space via a
kernel thread? I don't know the io_uring code well but it looks like 
that's asynchronously using the user mm context.

How about other memory accesses via kthread_use_mm? Presumably there is
still an ordering requirement there for membarrier, and there is no real
way for the user to know how kernel threads may use its mm for any
particular reason, so I really think it's a fragile interface and
membarrier should synchronize all possible kernel users as well.

Thanks,
Nick


Re: [PATCH v2 4/4] mm/vmalloc: Hugepage vmalloc mappings

2020-07-19 Thread Nicholas Piggin
Excerpts from Zefan Li's message of July 20, 2020 12:02 pm:
>> +static int vmap_pages_range_noflush(unsigned long start, unsigned long end,
>> +pgprot_t prot, struct page **pages,
>> +unsigned int page_shift)
>> +{
>> +if (page_shift == PAGE_SIZE) {
> 
> Is this a typo of PAGE_SHIFT?

Oh good catch, yeah that'll always be going via the one-at-a-time route 
and slow down the small page vmaps. Will fix.

Thanks,
Nick

> 
>> +return vmap_small_pages_range_noflush(start, end, prot, pages);
>> +} else {
>> +unsigned long addr = start;
>> +unsigned int i, nr = (end - start) >> page_shift;
>> +
>> +for (i = 0; i < nr; i++) {
>> +int err;
>> +
>> +err = vmap_range_noflush(addr,
>> +addr + (1UL << page_shift),
>> +__pa(page_address(pages[i])), prot,
>> +page_shift);
>> +if (err)
>> +return err;
>> +
>> +addr += 1UL << page_shift;
>> +}
>> +
>> +return 0;
>> +}
>> +}
>> +
> 


Re: [PATCH v2 4/4] mm/vmalloc: Hugepage vmalloc mappings

2020-07-19 Thread Zefan Li
> +static int vmap_pages_range_noflush(unsigned long start, unsigned long end,
> + pgprot_t prot, struct page **pages,
> + unsigned int page_shift)
> +{
> + if (page_shift == PAGE_SIZE) {

Is this a typo of PAGE_SHIFT?

> + return vmap_small_pages_range_noflush(start, end, prot, pages);
> + } else {
> + unsigned long addr = start;
> + unsigned int i, nr = (end - start) >> page_shift;
> +
> + for (i = 0; i < nr; i++) {
> + int err;
> +
> + err = vmap_range_noflush(addr,
> + addr + (1UL << page_shift),
> + __pa(page_address(pages[i])), prot,
> + page_shift);
> + if (err)
> + return err;
> +
> + addr += 1UL << page_shift;
> + }
> +
> + return 0;
> + }
> +}
> +


Re: [PATCH v4 07/10] powerpc/watchpoint: Rename current H_SET_MODE DAWR macro

2020-07-19 Thread Jordan Niethe
On Fri, Jul 17, 2020 at 2:11 PM Ravi Bangoria
 wrote:
>
> Current H_SET_MODE hcall macro name for setting/resetting DAWR0 is
> H_SET_MODE_RESOURCE_SET_DAWR. Add suffix 0 to macro name as well.
>
> Signed-off-by: Ravi Bangoria 
Reviewed-by: Jordan Niethe 
> ---
>  arch/powerpc/include/asm/hvcall.h | 2 +-
>  arch/powerpc/include/asm/plpar_wrappers.h | 2 +-
>  arch/powerpc/kvm/book3s_hv.c  | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/hvcall.h 
> b/arch/powerpc/include/asm/hvcall.h
> index 43486e773bd6..b785e9f0071c 100644
> --- a/arch/powerpc/include/asm/hvcall.h
> +++ b/arch/powerpc/include/asm/hvcall.h
> @@ -355,7 +355,7 @@
>
>  /* Values for 2nd argument to H_SET_MODE */
>  #define H_SET_MODE_RESOURCE_SET_CIABR  1
> -#define H_SET_MODE_RESOURCE_SET_DAWR   2
> +#define H_SET_MODE_RESOURCE_SET_DAWR0  2
>  #define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE3
>  #define H_SET_MODE_RESOURCE_LE 4
>
> diff --git a/arch/powerpc/include/asm/plpar_wrappers.h 
> b/arch/powerpc/include/asm/plpar_wrappers.h
> index 4293c5d2ddf4..d12c3680d946 100644
> --- a/arch/powerpc/include/asm/plpar_wrappers.h
> +++ b/arch/powerpc/include/asm/plpar_wrappers.h
> @@ -312,7 +312,7 @@ static inline long plpar_set_ciabr(unsigned long ciabr)
>
>  static inline long plpar_set_watchpoint0(unsigned long dawr0, unsigned long 
> dawrx0)
>  {
> -   return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR, dawr0, dawrx0);
> +   return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR0, dawr0, 
> dawrx0);
>  }
>
>  static inline long plpar_signal_sys_reset(long cpu)
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 6bf66649ab92..7ad692c2d7c7 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -764,7 +764,7 @@ static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, 
> unsigned long mflags,
> return H_P3;
> vcpu->arch.ciabr  = value1;
> return H_SUCCESS;
> -   case H_SET_MODE_RESOURCE_SET_DAWR:
> +   case H_SET_MODE_RESOURCE_SET_DAWR0:
> if (!kvmppc_power8_compatible(vcpu))
> return H_P2;
> if (!ppc_breakpoint_available())
> --
> 2.26.2
>


Re: [PATCH v2 13/16] scripts/kallsyms: move ignored symbol types to is_ignored_symbol()

2020-07-19 Thread Finn Thain
On Sun, 24 Nov 2019, Masahiro Yamada wrote:

> Collect the ignored patterns to is_ignored_symbol().
> 
> Signed-off-by: Masahiro Yamada 

This commit (887df76de67f5) caused a regression in my powerpc builds as it 
causes symbol names to disappear from backtraces:

[ cut here ]
WARNING: CPU: 0 PID: 0 at kernel/smp.c:433 _einittext+0x3f9e5120/0x3feb71b8
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc7-pmac-00055-g887df76de67f5 
#18
NIP:  c00aef68 LR: c00af114 CTR: c001272c
REGS: c0705c40 TRAP: 0700   Not tainted  (5.4.0-rc7-pmac-00055-g887df76de67f5)
MSR:  00029032   CR: 4244  XER: 

GPR00: 001f0100 c0705cf8 c06dc300 c070af1c c001258c   ef7fb5bc 
GPR08: 0880 0100 0001 0100 4244  c0709040 0004 
GPR16: 0001 c06022b4 c058297c 0022 8cb9  c06d84a0 c071 
GPR24: c071   c070af1c c070af1c  c001258c  
NIP [c00aef68] _einittext+0x3f9e5120/0x3feb71b8
LR [c00af114] _einittext+0x3f9e52cc/0x3feb71b8
Call Trace:
[c0705cf8] [ef006320] 0xef006320 (unreliable)
[c0705d38] [c00af114] _einittext+0x3f9e52cc/0x3feb71b8
[c0705d48] [c00af158] _einittext+0x3f9e5310/0x3feb71b8
[c0705d68] [c0012768] _einittext+0x3f948920/0x3feb71b8
[c0705d78] [c0092c04] _einittext+0x3f9c8dbc/0x3feb71b8
[c0705d88] [c0092d18] _einittext+0x3f9c8ed0/0x3feb71b8
[c0705da8] [c0093a2c] _einittext+0x3f9c9be4/0x3feb71b8
[c0705de8] [c0580224] _einittext+0x3feb63dc/0x3feb71b8
[c0705e48] [c00382ec] _einittext+0x3f96e4a4/0x3feb71b8
[c0705e58] [c000d2a0] _einittext+0x3f943458/0x3feb71b8
[c0705e88] [c001353c] _einittext+0x3f9496f4/0x3feb71b8
--- interrupt: 901 at _einittext+0x3f941058/0x3feb71b8
LR = _einittext+0x3f941058/0x3feb71b8
[c0705f50] [c06cc214] 0xc06cc214 (unreliable)
[c0705f60] [c057fa20] _einittext+0x3feb5bd8/0x3feb71b8
[c0705f70] [c005de48] _einittext+0x3f994000/0x3feb71b8
[c0705f90] [c005e050] _einittext+0x3f994208/0x3feb71b8
[c0705fa0] [c0004cc8] _einittext+0x3f93ae80/0x3feb71b8
[c0705fb0] [c069a36c] _einittext+0x3ffd0524/0x4000
[c0705ff0] [3500] 0x3500
Instruction dump:
7c0803a6 7fa5eb78 7d808120 7ea6ab78 baa10014 38210040 4bfffbb0 7f64db78 
7f85e378 484b31b1 7c601b78 4bfffdf4 <0fe0> 4bfffd60 9421ffe0 7c0802a6 
---[ end trace a06fef4788747c72 ]---


Prior to that (e.g. 97261e1e2240f), I get backtraces like this:

[ cut here ]
WARNING: CPU: 0 PID: 0 at kernel/smp.c:433 smp_call_function_many+0x318/0x320
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc7-pmac-00054-g97261e1e2240f 
#20
NIP:  c00aef68 LR: c00af114 CTR: c001272c
REGS: c075dc40 TRAP: 0700   Not tainted  (5.4.0-rc7-pmac-00054-g97261e1e2240f)
MSR:  00029032   CR: 4244  XER: 

GPR00: 001f0100 c075dcf8 c0733300 c0762f1c c001258c   ef7fb5bc 
GPR08: 0480 0100 0001 0100 4244  c0761040 0004 
GPR16: 0001 c0658e58 c058297c 0022 8cb9  c072f4a0 c076 
GPR24: c076   c0762f1c c0762f1c  c001258c  
NIP [c00aef68] smp_call_function_many+0x318/0x320
LR [c00af114] smp_call_function+0x34/0x44
Call Trace:
[c075dcf8] [ef006320] 0xef006320 (unreliable)
[c075dd38] [c00af114] smp_call_function+0x34/0x44
[c075dd48] [c00af158] on_each_cpu+0x1c/0x4c
[c075dd68] [c0012768] tau_timeout_smp+0x3c/0x4c
[c075dd78] [c0092c04] call_timer_fn.isra.26+0x20/0x84
[c075dd88] [c0092d18] expire_timers+0xb0/0xc0
[c075dda8] [c0093a2c] run_timer_softirq+0xa4/0x1a4
[c075dde8] [c0580224] __do_softirq+0x11c/0x280
[c075de48] [c00382ec] irq_exit+0xc0/0xd4
[c075de58] [c000d2a0] timer_interrupt+0x154/0x260
[c075de88] [c001353c] ret_from_except+0x0/0x14
--- interrupt: 901 at arch_cpu_idle+0x24/0x78
LR = arch_cpu_idle+0x24/0x78
[c075df50] [c0723214] 0xc0723214 (unreliable)
[c075df60] [c057fa20] default_idle_call+0x38/0x58
[c075df70] [c005de48] do_idle+0xd4/0x17c
[c075df90] [c005e054] cpu_startup_entry+0x24/0x28
[c075dfa0] [c0004cc8] rest_init+0xa8/0xbc
[c075dfb0] [c06f136c] start_kernel+0x40c/0x420
[c075dff0] [3500] 0x3500
Instruction dump:
7c0803a6 7fa5eb78 7d808120 7ea6ab78 baa10014 38210040 4bfffbb0 7f64db78 
7f85e378 484b31b1 7c601b78 4bfffdf4 <0fe0> 4bfffd60 9421ffe0 7c0802a6 
---[ end trace 784c7f15ecd23941 ]---

Has anyone else observed these problems (either the WARNING from 
smp_call_function_many() or the missing symbol names)?

What is the best way to fix this? Should I upgrade binutils?


Re: [PATCH v4 06/10] powerpc/watchpoint: Set CPU_FTR_DAWR1 based on pa-features bit

2020-07-19 Thread Jordan Niethe
On Fri, Jul 17, 2020 at 2:10 PM Ravi Bangoria
 wrote:
>
> As per the PAPR, bit 0 of byte 64 in pa-features property indicates
> availability of 2nd DAWR registers. i.e. If this bit is set, 2nd
> DAWR is present, otherwise not. Host generally uses "cpu-features",
> which masks "pa-features". But "cpu-features" are still not used for
> guests and thus this change is mostly applicable for guests only.
>
> Signed-off-by: Ravi Bangoria 
I checked that those PAPR values are correct, and running a powernv
kernel in p10 mambo (using p10 skiboot) with dt_cpu_ftrs=off does set
the CPU_FTR_DAWR1 bit.
Tested-by: Jordan Niethe 
> ---
>  arch/powerpc/kernel/prom.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 9cc49f265c86..c76c09b97bc8 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -175,6 +175,8 @@ static struct ibm_pa_feature {
>  */
> { .pabyte = 22, .pabit = 0, .cpu_features = CPU_FTR_TM_COMP,
>   .cpu_user_ftrs2 = PPC_FEATURE2_HTM_COMP | 
> PPC_FEATURE2_HTM_NOSC_COMP },
> +
> +   { .pabyte = 64, .pabit = 0, .cpu_features = CPU_FTR_DAWR1 },
>  };
>
>  static void __init scan_features(unsigned long node, const unsigned char 
> *ftrs,
> --
> 2.26.2
>


Re: [PATCH v2] powerpc/powernv/pci: use ifdef to avoid dead code

2020-07-19 Thread Oliver O'Halloran
On Sun, Jul 19, 2020 at 5:13 AM Greg Thelen  wrote:
>
> Oliver O'Halloran  wrote:
>
> > On Mon, Jun 15, 2020 at 9:33 AM Greg Thelen  wrote:
> >>
> >> Commit dc3d8f85bb57 ("powerpc/powernv/pci: Re-work bus PE
> >> configuration") removed a couple pnv_ioda_setup_bus_dma() calls.  The
> >> only remaining calls are behind CONFIG_IOMMU_API.  Thus builds without
> >> CONFIG_IOMMU_API see:
> >>   arch/powerpc/platforms/powernv/pci-ioda.c:1888:13: error: 
> >> 'pnv_ioda_setup_bus_dma' defined but not used
> >>
> >> Move pnv_ioda_setup_bus_dma() under CONFIG_IOMMU_API to avoid dead code.
> >
> > Doh! Thanks for the fix.
> >
> > Reviewed-by: Oliver O'Halloran 
>
> Is there anything else needed from me on this patch?
> Given that it fixes a 5.8 commit I figured it'd be 5.8 material.

Oh sorry, I completely forgot about this patch. I sent another series
that included a more-or-less identical fix after the kbuild robot sent
a reminder:

http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=187630=*

That's currently in powerpc/next, but if it's causing a build break then
I agree it should probably go into 5.8 too.

Oliver


io_uring kthread_use_mm / mmget_not_zero possible abuse

2020-07-19 Thread Nicholas Piggin
When I last looked at this (predating io_uring), as far as I remember it was 
not permitted to actually switch to (use_mm) an mm user context that was 
pinned with mmget_not_zero. Those pins were only allowed to look at page 
tables, vmas, etc., but not actually run the CPU in that mm context.

sparc/kernel/smp_64.c depends heavily on this, e.g.,

void smp_flush_tlb_mm(struct mm_struct *mm)
{
u32 ctx = CTX_HWBITS(mm->context);
int cpu = get_cpu();

if (atomic_read(&mm->mm_users) == 1) {
cpumask_copy(mm_cpumask(mm), cpumask_of(cpu));
goto local_flush_and_out;
}

smp_cross_call_masked(&xcall_flush_tlb_mm,
  ctx, 0, 0,
  mm_cpumask(mm));

local_flush_and_out:
__flush_tlb_mm(ctx, SECONDARY_CONTEXT);

put_cpu();
}

If a kthread comes in concurrently between the mm_users test and the 
mm_cpumask reset, and does mmget_not_zero(); kthread_use_mm() then we have 
another CPU switched to mm context but not in the mm_cpumask. It's then 
possible for our thread to schedule on that CPU and not go through a 
switch_mm (because kthread_unuse_mm will make it lazy, then we can switch 
back to our user thread and un-lazy it).

powerpc has something similar.

I don't think this is documented anywhere and certainly isn't checked for 
unfortunately, so I don't really blame io_uring.

The simplest fix is for io_uring to carry mm_users references. If that can't 
be done or we decide to lift the limitation on mmget_not_zero references, we 
can come up with a way to synchronize things.

On powerpc for example, we IPI all targets in mm_cpumask before clearing 
them, so we could disable interrupts while kthread_use_mm does the mm switch 
sequence, and have the IPI handler check that current->mm hasn't been set to 
mm, for example.
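
Roughly like this (very much a sketch, not the real kthread_use_mm body;
task locking, mmgrab/mmdrop of the old active_mm and the lazy-mm
handling are all omitted):

static void use_mm_sketch(struct mm_struct *mm)
{
	struct task_struct *tsk = current;
	struct mm_struct *active_mm = tsk->active_mm;

	/*
	 * Publish current->mm and do the hardware switch with IRQs off, so
	 * the mm_cpumask-trimming IPI either sees current->mm == mm or runs
	 * before this CPU has actually switched to mm.
	 */
	local_irq_disable();
	tsk->mm = mm;
	tsk->active_mm = mm;
	switch_mm_irqs_off(active_mm, mm, tsk);
	local_irq_enable();
}

with the IPI handler bailing out (or putting the bit back in mm_cpumask)
when it finds current->mm == mm.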

sparc is a bit harder because it doesn't IPI targets if it thinks it can 
avoid it. But powerpc found that just doing one IPI isn't a big burden here 
so maybe we change sparc to do that too. I would be inclined to fix this
mmget_not_zero quirk if we can; unless someone has a very good way to test
and enforce it, it'll just happen again.

Comments?

Thanks,
Nick


Re: [PATCH v3 3/3] powerpc/powernv/idle: Exclude mfspr on HID1, 4, 5 on P9 and above

2020-07-19 Thread Nicholas Piggin
Excerpts from Pratik Rajesh Sampat's message of July 18, 2020 4:53 am:
POWER9 onwards, support for the registers HID1, HID4, HID5 has been
dropped.
Although mfspr on these registers worked on Power9, on the Power10
simulator it is unrecognized. Move their assignment under the
check for machines older than Power9.
> 
> Signed-off-by: Pratik Rajesh Sampat 
> Reviewed-by: Gautham R. Shenoy 

Reviewed-by: Nicholas Piggin 

> ---
>  arch/powerpc/platforms/powernv/idle.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/idle.c 
> b/arch/powerpc/platforms/powernv/idle.c
> index d439e11af101..d24d6671f3e8 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -73,9 +73,6 @@ static int pnv_save_sprs_for_deep_states(void)
>*/
>   uint64_t lpcr_val   = mfspr(SPRN_LPCR);
>   uint64_t hid0_val   = mfspr(SPRN_HID0);
> - uint64_t hid1_val   = mfspr(SPRN_HID1);
> - uint64_t hid4_val   = mfspr(SPRN_HID4);
> - uint64_t hid5_val   = mfspr(SPRN_HID5);
>   uint64_t hmeer_val  = mfspr(SPRN_HMEER);
>   uint64_t msr_val = MSR_IDLE;
>   uint64_t psscr_val = pnv_deepest_stop_psscr_val;
> @@ -117,6 +114,9 @@ static int pnv_save_sprs_for_deep_states(void)
>  
>   /* Only p8 needs to set extra HID regiters */
>   if (!pvr_version_is(PVR_POWER9)) {
> + uint64_t hid1_val = mfspr(SPRN_HID1);
> + uint64_t hid4_val = mfspr(SPRN_HID4);
> + uint64_t hid5_val = mfspr(SPRN_HID5);
>  
>   rc = opal_slw_set_reg(pir, SPRN_HID1, hid1_val);
>   if (rc != 0)
> -- 
> 2.25.4
> 
> 


Re: [PATCH v3 1/3] powerpc/powernv/idle: Replace CPU features checks with PVR checks

2020-07-19 Thread Nicholas Piggin
Excerpts from Pratik Rajesh Sampat's message of July 18, 2020 4:53 am:
> As the idle framework's architecture is incomplete, instead of
> checking just the processor type advertised in the device-tree CPU
> features, check the Processor Version Register (PVR) so that finer
> granularity can be leveraged while making processor checks.
> 
> Signed-off-by: Pratik Rajesh Sampat 
> ---
>  arch/powerpc/platforms/powernv/idle.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/idle.c 
> b/arch/powerpc/platforms/powernv/idle.c
> index 2dd467383a88..f62904f70fc6 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -92,7 +92,7 @@ static int pnv_save_sprs_for_deep_states(void)
>   if (rc != 0)
>   return rc;
>  
> - if (cpu_has_feature(CPU_FTR_ARCH_300)) {
> + if (pvr_version_is(PVR_POWER9)) {
>   rc = opal_slw_set_reg(pir, P9_STOP_SPR_MSR, msr_val);
>   if (rc)
>   return rc;
> @@ -116,7 +116,7 @@ static int pnv_save_sprs_for_deep_states(void)
>   return rc;
>  
>   /* Only p8 needs to set extra HID regiters */
> - if (!cpu_has_feature(CPU_FTR_ARCH_300)) {
> + if (!pvr_version_is(PVR_POWER9)) {
>  
>   rc = opal_slw_set_reg(pir, SPRN_HID1, hid1_val);
>   if (rc != 0)

What I think you should do is keep using CPU_FTR_ARCH_300 for this stuff
which is written for power9 and we know is running on power9, because
that's a faster test (static branch and does not have to read the PVR).
And then...

> @@ -1205,7 +1205,7 @@ static void __init pnv_probe_idle_states(void)
>   return;
>   }
>  
> - if (cpu_has_feature(CPU_FTR_ARCH_300))
> + if (pvr_version_is(PVR_POWER9))
>   pnv_power9_idle_init();
>  
>   for (i = 0; i < nr_pnv_idle_states; i++)

Here is where you would put the version check. Once we have code that 
can also handle P10 (either by testing CPU_FTR_ARCH_31, or by adding
an entirely new power10 idle function), then you can add the P10 version
check here.
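
i.e. something like (sketch; PVR_POWER10 and a separate
pnv_power10_idle_init() are assumptions about what the eventual P10
support might look like, they are not part of this series):

	if (pvr_version_is(PVR_POWER9))
		pnv_power9_idle_init();
	else if (pvr_version_is(PVR_POWER10))
		pnv_power10_idle_init();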

Thanks,
Nick



Re: [PATCH v3 2/3] powerpc/powernv/idle: Rename pnv_first_spr_loss_level variable

2020-07-19 Thread Nicholas Piggin
Excerpts from Pratik Rajesh Sampat's message of July 18, 2020 4:53 am:
> Rename the variable "pnv_first_spr_loss_level" to
> "pnv_first_fullstate_loss_level".
> 
> pnv_first_spr_loss_level is supposed to be the earliest state that
> has OPAL_PM_LOSE_FULL_CONTEXT set; however, as shallow states too lose
> SPR values, the current name is incorrect terminology.

It also doesn't lose "full" state at this loss level though. From the 
architecture it could be called "hv state loss level", but in POWER10 
even that is not strictly true.


> 
> Signed-off-by: Pratik Rajesh Sampat 
> ---
>  arch/powerpc/platforms/powernv/idle.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/idle.c 
> b/arch/powerpc/platforms/powernv/idle.c
> index f62904f70fc6..d439e11af101 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -48,7 +48,7 @@ static bool default_stop_found;
>   * First stop state levels when SPR and TB loss can occur.
>   */
>  static u64 pnv_first_tb_loss_level = MAX_STOP_STATE + 1;
> -static u64 pnv_first_spr_loss_level = MAX_STOP_STATE + 1;
> +static u64 pnv_first_fullstate_loss_level = MAX_STOP_STATE + 1;
>  
>  /*
>   * psscr value and mask of the deepest stop idle state.
> @@ -657,7 +657,7 @@ static unsigned long power9_idle_stop(unsigned long 
> psscr, bool mmu_on)
> */
>   mmcr0   = mfspr(SPRN_MMCR0);
>   }
> - if ((psscr & PSSCR_RL_MASK) >= pnv_first_spr_loss_level) {
> + if ((psscr & PSSCR_RL_MASK) >= pnv_first_fullstate_loss_level) {
>   sprs.lpcr   = mfspr(SPRN_LPCR);
>   sprs.hfscr  = mfspr(SPRN_HFSCR);
>   sprs.fscr   = mfspr(SPRN_FSCR);
> @@ -741,7 +741,7 @@ static unsigned long power9_idle_stop(unsigned long 
> psscr, bool mmu_on)
>* just always test PSSCR for SPR/TB state loss.
>*/
>   pls = (psscr & PSSCR_PLS) >> PSSCR_PLS_SHIFT;
> - if (likely(pls < pnv_first_spr_loss_level)) {
> + if (likely(pls < pnv_first_fullstate_loss_level)) {
>   if (sprs_saved)
>   atomic_stop_thread_idle();
>   goto out;
> @@ -1088,7 +1088,7 @@ static void __init pnv_power9_idle_init(void)
>* the deepest loss-less (OPAL_PM_STOP_INST_FAST) stop state.
>*/
>   pnv_first_tb_loss_level = MAX_STOP_STATE + 1;
> - pnv_first_spr_loss_level = MAX_STOP_STATE + 1;
> + pnv_first_fullstate_loss_level = MAX_STOP_STATE + 1;
>   for (i = 0; i < nr_pnv_idle_states; i++) {
>   int err;
> 	struct pnv_idle_states_t *state = &pnv_idle_states[i];
> @@ -1099,8 +1099,8 @@ static void __init pnv_power9_idle_init(void)
>   pnv_first_tb_loss_level = psscr_rl;
>  
>   if ((state->flags & OPAL_PM_LOSE_FULL_CONTEXT) &&
> -  (pnv_first_spr_loss_level > psscr_rl))
> - pnv_first_spr_loss_level = psscr_rl;
> +  (pnv_first_fullstate_loss_level > psscr_rl))
> + pnv_first_fullstate_loss_level = psscr_rl;
>  
>   /*
>* The idle code does not deal with TB loss occurring
> @@ -,8 +,8 @@ static void __init pnv_power9_idle_init(void)
>* compatibility.
>*/
>   if ((state->flags & OPAL_PM_TIMEBASE_STOP) &&
> -  (pnv_first_spr_loss_level > psscr_rl))
> - pnv_first_spr_loss_level = psscr_rl;
> +  (pnv_first_fullstate_loss_level > psscr_rl))
> + pnv_first_fullstate_loss_level = psscr_rl;
>  
> 	err = validate_psscr_val_mask(&state->psscr_val,
> 				      &state->psscr_mask,
> @@ -1158,7 +1158,7 @@ static void __init pnv_power9_idle_init(void)
>   }
>  
>   pr_info("cpuidle-powernv: First stop level that may lose SPRs = 
> 0x%llx\n",
> - pnv_first_spr_loss_level);
> + pnv_first_fullstate_loss_level);
>  
>   pr_info("cpuidle-powernv: First stop level that may lose timebase = 
> 0x%llx\n",
>   pnv_first_tb_loss_level);
> -- 
> 2.25.4
> 
> 


Re: [v3 12/15] powerpc/perf: Add support for outputting extended regs in perf intr_regs

2020-07-19 Thread kernel test robot
Hi Athira,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on tip/perf/core v5.8-rc5 next-20200717]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Athira-Rajeev/powerpc-perf-Add-support-for-power10-PMU-Hardware/20200717-224353
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc64-randconfig-r024-20200719 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 
ed6b578040a85977026c93bf4188f996148f3218)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc64 cross compiling tool for clang build
# apt-get install binutils-powerpc64-linux-gnu
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross 
ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   arch/powerpc/include/asm/io-defs.h:45:1: warning: performing pointer 
arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insw, (unsigned long p, void *b, unsigned long c),
   ^~~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 
'DEF_PCI_AC_NORET'
   __do_##name al; \
   ^~
   :221:1: note: expanded from here
   __do_insw
   ^
   arch/powerpc/include/asm/io.h:542:56: note: expanded from macro '__do_insw'
   #define __do_insw(p, b, n)  readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
  ~^
   In file included from arch/powerpc/perf/perf_regs.c:10:
   In file included from include/linux/perf_event.h:57:
   In file included from include/linux/cgroup.h:26:
   In file included from include/linux/kernel_stat.h:9:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:47:1: warning: performing pointer 
arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insl, (unsigned long p, void *b, unsigned long c),
   ^~~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 
'DEF_PCI_AC_NORET'
   __do_##name al; \
   ^~
   :223:1: note: expanded from here
   __do_insl
   ^
   arch/powerpc/include/asm/io.h:543:56: note: expanded from macro '__do_insl'
   #define __do_insl(p, b, n)  readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
  ~^
   In file included from arch/powerpc/perf/perf_regs.c:10:
   In file included from include/linux/perf_event.h:57:
   In file included from include/linux/cgroup.h:26:
   In file included from include/linux/kernel_stat.h:9:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:49:1: warning: performing pointer 
arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(outsb, (unsigned long p, const void *b, unsigned long c),
   ^~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 
'DEF_PCI_AC_NORET'
   __do_##name al; \
   ^~
   :225:1: note: expanded from here
   __do_outsb
   ^
   arch/powerpc/include/asm/io.h:544:58: note: expanded from macro '__do_outsb'
   #define __do_outsb(p, b, n) writesb((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
   ~^
   In file included from arch/powerpc/perf/perf_regs.c:10:
   In file included from include/linux/perf_event.h:57:
   In file included from include/linux/cgroup.h:26:
   In file included from include/linux/kernel_stat.h:9:
   In file included from include/linux/interrup

[powerpc:next-test] BUILD SUCCESS 5fed3b3e21db21f9a7002426f456fd3a8a8c0772

2020-07-19 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  
next-test
branch HEAD: 5fed3b3e21db21f9a7002426f456fd3a8a8c0772  papr/scm: Add bad memory 
ranges to nvdimm bad ranges

elapsed time: 1279m

configs tested: 141
configs skipped: 6

The following configs have been built successfully.
More configs may be tested in the coming days.

arm defconfig
arm  allyesconfig
arm  allmodconfig
arm64allyesconfig
arm64   defconfig
arm64allmodconfig
arm64 allnoconfig
arm   allnoconfig
nds32 allnoconfig
powerpc  ppc64e_defconfig
arm   viper_defconfig
ia64 alldefconfig
sh   se7721_defconfig
arc haps_hs_defconfig
powerpc  ppc6xx_defconfig
powerpcgamecube_defconfig
mips  cavium_octeon_defconfig
mipsmalta_qemu_32r6_defconfig
armvexpress_defconfig
sh shx3_defconfig
arm  integrator_defconfig
ia64  tiger_defconfig
mipsjmr3927_defconfig
armxcep_defconfig
c6x dsk6455_defconfig
m68k amcore_defconfig
arm  simpad_defconfig
openrisc allyesconfig
mips   sb1250_swarm_defconfig
armspear6xx_defconfig
armhisi_defconfig
powerpc64alldefconfig
armspear3xx_defconfig
h8300   h8s-sim_defconfig
m68k   m5475evb_defconfig
sh  r7780mp_defconfig
armmps2_defconfig
umkunit_defconfig
powerpc   allnoconfig
mips tb0219_defconfig
csky alldefconfig
shedosk7705_defconfig
i386  allnoconfig
i386 allyesconfig
i386defconfig
i386  debian-10.3
ia64 allmodconfig
ia64defconfig
ia64  allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k  allnoconfig
m68k   sun3_defconfig
m68kdefconfig
m68k allyesconfig
nios2   defconfig
nios2allyesconfig
openriscdefconfig
c6x  allyesconfig
c6x   allnoconfig
csky allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
nds32   defconfig
xtensa   allyesconfig
h8300allyesconfig
h8300allmodconfig
xtensa  defconfig
arc defconfig
arc  allyesconfig
sh   allmodconfig
shallnoconfig
microblazeallnoconfig
mips allyesconfig
mips  allnoconfig
mips allmodconfig
pariscallnoconfig
parisc  defconfig
parisc   allyesconfig
parisc   allmodconfig
powerpc  allyesconfig
powerpc  rhel-kconfig
powerpc  allmodconfig
powerpc defconfig
i386 randconfig-a001-20200717
i386 randconfig-a005-20200717
i386 randconfig-a002-20200717
i386 randconfig-a006-20200717
i386 randconfig-a003-20200717
i386 randconfig-a004-20200717
i386 randconfig-a001-20200719
i386 randconfig-a006-20200719
i386 randconfig-a002-20200719
i386 randconfig-a005-20200719
i386 randconfig-a003-20200719
i386 randconfig-a004-20200719
x86_64   randconfig-a012-20200716
x86_64   randconfig-a011-20200716
x86_64   randconfig-a016-20200716
x86_64   randconfig-a014

Re: [powerpc:next-test 103/106] arch/powerpc/mm/book3s64/radix_pgtable.c:513:21: error: use of undeclared identifier 'SECTION_SIZE_BITS'

2020-07-19 Thread Aneesh Kumar K.V
kernel test robot  writes:

> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
> next-test
> head:   5fed3b3e21db21f9a7002426f456fd3a8a8c0772
> commit: 21407f39b9d547da527ad5224c4323e1f62bb514 [103/106] powerpc/mm/radix: 
> Create separate mappings for hot-plugged memory
> config: powerpc-randconfig-r016-20200719 (attached as .config)
> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 
> ed6b578040a85977026c93bf4188f996148f3218)
> reproduce (this is a W=1 build):
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # install powerpc cross compiling tool for clang build
> # apt-get install binutils-powerpc-linux-gnu
> git checkout 21407f39b9d547da527ad5224c4323e1f62bb514
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross 
> ARCH=powerpc 
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
>
> All errors (new ones prefixed by >>):
>
>>> arch/powerpc/mm/book3s64/radix_pgtable.c:513:21: error: use of undeclared 
>>> identifier 'SECTION_SIZE_BITS'
>*mem_block_size = MIN_MEMORY_BLOCK_SIZE;
>  ^
>include/linux/memory.h:24:43: note: expanded from macro 
> 'MIN_MEMORY_BLOCK_SIZE'
>#define MIN_MEMORY_BLOCK_SIZE (1UL << SECTION_SIZE_BITS)
>  ^
>arch/powerpc/mm/book3s64/radix_pgtable.c:521:33: error: use of undeclared 
> identifier 'SECTION_SIZE_BITS'
>unsigned long mem_block_size = MIN_MEMORY_BLOCK_SIZE;
>   ^
>include/linux/memory.h:24:43: note: expanded from macro 
> 'MIN_MEMORY_BLOCK_SIZE'
>#define MIN_MEMORY_BLOCK_SIZE (1UL << SECTION_SIZE_BITS)
>  ^
>2 errors generated.
>
> vim +/SECTION_SIZE_BITS +513 arch/powerpc/mm/book3s64/radix_pgtable.c
>
>494
>495static int __init probe_memory_block_size(unsigned long node, 
> const char *uname, int
>496  depth, void *data)
>497{
>498unsigned long *mem_block_size = (unsigned long *)data;
>499const __be64 *prop;
>500int len;
>501
>502if (depth != 1)
>503return 0;
>504
>505if (strcmp(uname, "ibm,dynamic-reconfiguration-memory"))
>506return 0;
>507
>    508		prop = of_get_flat_dt_prop(node, "ibm,lmb-size", &len);
>509if (!prop || len < sizeof(__be64))
>510/*
>511 * Nothing in the device tree
>512 */
>  > 513*mem_block_size = MIN_MEMORY_BLOCK_SIZE;
>514else
>515*mem_block_size = be64_to_cpup(prop);
>516return 1;
>517}
>518
>

 arch/powerpc/mm/book3s64/radix_pgtable.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c 
b/arch/powerpc/mm/book3s64/radix_pgtable.c
index bba45fc0b7b2..c5bf2ef73c36 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -492,6 +492,7 @@ static int __init radix_dt_scan_page_sizes(unsigned long 
node,
return 1;
 }
 
+#ifdef CONFIG_MEMORY_HOTPLUG
 static int __init probe_memory_block_size(unsigned long node, const char 
*uname, int
  depth, void *data)
 {
@@ -532,6 +533,15 @@ static unsigned long radix_memory_block_size(void)
return mem_block_size;
 }
 
+#else   /* CONFIG_MEMORY_HOTPLUG */
+
+static unsigned long radix_memory_block_size(void)
+{
+   return 1UL * 1024 * 1024 * 1024;
+}
+
+#endif /* CONFIG_MEMORY_HOTPLUG */
+
 
 void __init radix__early_init_devtree(void)
 {
-- 
2.26.2


-aneesh