Re: [PATCH v3 1/3] cpuidle-powernv : forced wakeup for stop states
Hi Nick, Will post next version with the changes you have suggested. There is a comment below. On 07/07/2019 03:43 PM, Nicholas Piggin wrote: Abhishek Goel's on July 4, 2019 7:18 pm: Currently, the cpuidle governors determine what idle state a idling CPU should enter into based on heuristics that depend on the idle history on that CPU. Given that no predictive heuristic is perfect, there are cases where the governor predicts a shallow idle state, hoping that the CPU will be busy soon. However, if no new workload is scheduled on that CPU in the near future, the CPU may end up in the shallow state. This is problematic, when the predicted state in the aforementioned scenario is a shallow stop state on a tickless system. As we might get stuck into shallow states for hours, in absence of ticks or interrupts. To address this, We forcefully wakeup the cpu by setting the decrementer. The decrementer is set to a value that corresponds with the residency of the next available state. Thus firing up a timer that will forcefully wakeup the cpu. Few such iterations will essentially train the governor to select a deeper state for that cpu, as the timer here corresponds to the next available cpuidle state residency. Thus, cpu will eventually end up in the deepest possible state. Signed-off-by: Abhishek Goel --- Auto-promotion v1 : started as auto promotion logic for cpuidle states in generic driver v2 : Removed timeout_needed and rebased the code to upstream kernel Forced-wakeup v1 : New patch with name of forced wakeup started v2 : Extending the forced wakeup logic for all states. Setting the decrementer instead of queuing up a hrtimer to implement the logic. v3 : Cleanly handle setting/resetting of decrementer so as to not break irq work arch/powerpc/include/asm/time.h | 2 ++ arch/powerpc/kernel/time.c| 40 +++ drivers/cpuidle/cpuidle-powernv.c | 32 + 3 files changed, 74 insertions(+) diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h index 54f4ec1f9..a3bd4f3c0 100644 --- a/arch/powerpc/include/asm/time.h +++ b/arch/powerpc/include/asm/time.h @@ -188,6 +188,8 @@ static inline unsigned long tb_ticks_since(unsigned long tstamp) extern u64 mulhdu(u64, u64); #endif +extern int set_dec_before_idle(u64 timeout); +extern void reset_dec_after_idle(void); extern void div128_by_32(u64 dividend_high, u64 dividend_low, unsigned divisor, struct div_result *dr); diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 694522308..814de3469 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -576,6 +576,46 @@ void arch_irq_work_raise(void) #endif /* CONFIG_IRQ_WORK */ +/* + * Returns 1 if we have reprogrammed the decrementer for idle. + * Returns 0 if the decrementer is unchanged. + */ +int set_dec_before_idle(u64 timeout) +{ + u64 *next_tb = this_cpu_ptr(_next_tb); + u64 now = get_tb_or_rtc(); + + /* +* Ensure that the timeout is at least one microsecond +* before the current decrement value. Else, we will +* unnecesarily wakeup again within a microsecond. +*/ + if (now + timeout + 512 > *next_tb) I would pass this 512 in as a parameter and put the comment in the idle code. Timer code does not know/care. Maybe return bool and call it try_set_dec_before_idle. + return 0; + + set_dec(timeout); This needs to have if (test_irq_work_pending()) set_dec(1); here AFAIKS + + return 1; +} + +void reset_dec_after_idle(void) +{ + u64 now; + u64 *next_tb; + + if (test_irq_work_pending()) + return; + + now = get_tb_or_rtc(); + next_tb = this_cpu_ptr(_next_tb); + if (now >= *next_tb) + return; Are you sure it's okay to escape early in this case? Yeah, It looks safe. In power9_idle_type, we call irq_set_pending_from_srr1 which sets the irq_happened. If reason is IRQ_DEC, in __check_irq_replay, decrementer_check_overflow will be called which will set dec to a positive valid value. Also, we typically disable MSR EE before entering stop. And if a decrementer wakes us up, before we enable EE, check for pending interrupt will be done. And we finally reset dec to a positive value before we set EE=1. Thanks, Nick Thanks, Abhishek
Re: [PATCH 4/4] powerpc/64: reuse PPC32 static inline flush_dcache_range()
On Tue, Jul 9, 2019 at 12:52 PM Aneesh Kumar K.V wrote: > > On 7/9/19 7:50 AM, Oliver O'Halloran wrote: > > On Tue, Jul 9, 2019 at 12:22 AM Aneesh Kumar K.V > > wrote: > >> > >> Christophe Leroy writes: > >> > >>> *snip* > >>> + if (IS_ENABLED(CONFIG_PPC64)) > >>> + isync(); > >>> } > >> > >> > >> Was checking with Michael about why we need that extra isync. Michael > >> pointed this came via > >> > >> https://github.com/mpe/linux-fullhistory/commit/faa5ee3743ff9b6df9f9a03600e34fdae596cfb2#diff-67c7ffa8e420c7d4206cae4a9e888e14 > >> > >> for 970 which doesn't have coherent icache. So possibly isync there is > >> to flush the prefetch instructions? But even so we would need an icbi > >> there before that isync. > > > > I don't think it's that, there's some magic in flush_icache_range() to > > handle dropping prefetched instructions on 970. > > > >> So overall wondering why we need that extra barriers there. > > > > I think the isync is needed there because the architecture only > > requires sync to provide ordering. A sync alone doesn't guarantee the > > dcbfs have actually completed so the isync is necessary to ensure the > > flushed cache lines are back in memory. That said, as far as I know > > all the IBM book3s chips from power4 onwards will wait for pending > > dcbfs when they hit a sync, but that might change in the future. > > > > ISA doesn't list that as the sequence. Only place where isync was > mentioned was w.r.t icbi where want to discards the prefetch. doesn't list that as the sequence for what? > > If it's a problem we could add a cpu-feature section around the isync > > to no-op it in the common case. However, when I had a look with perf > > it always showed that the sync was the hotspot so I don't think it'll > > help much. > > > > What about the preceding barriers (sync; isync;) before dcbf? Why are > they needed? Dunno, the sync might just be to ensure ordering between prior stores and the dcbf. > > -aneesh
Re: [PATCH v2] powerpc: slightly improve cache helpers
Le 08/07/2019 à 21:14, Nathan Chancellor a écrit : On Mon, Jul 08, 2019 at 11:19:30AM +1000, Michael Ellerman wrote: On Fri, 2019-05-10 at 09:24:48 UTC, Christophe Leroy wrote: Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers that are summed to obtain the target address. Using 'Z' constraint and '%y0' argument gives GCC the opportunity to use both registers instead of only one with the second being forced to 0. Suggested-by: Segher Boessenkool Signed-off-by: Christophe Leroy Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/6c5875843b87c3adea2beade9d1b8b3d4523900a cheers This patch causes a regression with clang: Is that a Clang bug ? Do you have a disassembly of the code both with and without this patch in order to compare ? Segher, any idea ? Christophe https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/213944668 I've attached my local bisect/build log. Cheers, Nathan
Re: [PATCH v3] cpufreq/pasemi: fix an use-after-free in pas_cpufreq_cpu_init()
Wen Yang writes: > The cpu variable is still being used in the of_get_property() call > after the of_node_put() call, which may result in use-after-free. > > Fixes: a9acc26b75f ("cpufreq/pasemi: fix possible object reference leak") > Signed-off-by: Wen Yang > Cc: "Rafael J. Wysocki" > Cc: Viresh Kumar > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux...@vger.kernel.org > Cc: linux-ker...@vger.kernel.org > --- > v3: fix a leaked reference. > v2: clean up the code according to the advice of viresh. > > drivers/cpufreq/pasemi-cpufreq.c | 12 +++- > 1 file changed, 7 insertions(+), 5 deletions(-) > > diff --git a/drivers/cpufreq/pasemi-cpufreq.c > b/drivers/cpufreq/pasemi-cpufreq.c > index 6b1e4ab..9dc5163 100644 > --- a/drivers/cpufreq/pasemi-cpufreq.c > +++ b/drivers/cpufreq/pasemi-cpufreq.c > @@ -128,20 +128,20 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > int cur_astate, idx; > struct resource res; > struct device_node *cpu, *dn; > - int err = -ENODEV; > + int err; > > cpu = of_get_cpu_node(policy->cpu, NULL); > - > - of_node_put(cpu); > if (!cpu) > - goto out; > + return -ENODEV; I don't think introducing another exit path is an improvement. > dn = of_find_compatible_node(NULL, NULL, "1682m-sdc"); > if (!dn) > dn = of_find_compatible_node(NULL, NULL, >"pasemi,pwrficient-sdc"); > - if (!dn) > + if (!dn) { > + err = -ENODEV; > goto out; > + } > err = of_address_to_resource(dn, 0, ); > of_node_put(dn); > if (err) > @@ -196,6 +196,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > policy->cur = pas_freqs[cur_astate].frequency; > ppc_proc_freq = policy->cur * 1000ul; > > + of_node_put(cpu); > return cpufreq_generic_init(policy, pas_freqs, get_gizmo_latency()); > > out_unmap_sdcpwr: > @@ -204,6 +205,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > out_unmap_sdcasr: > iounmap(sdcasr_mapbase); > out: > + of_node_put(cpu); > return err; > } Notice that cpu is only used for the max_freq calculation, so we could instead do: diff --git a/drivers/cpufreq/pasemi-cpufreq.c b/drivers/cpufreq/pasemi-cpufreq.c index 6b1e4abe3248..42a0a4b8e87d 100644 --- a/drivers/cpufreq/pasemi-cpufreq.c +++ b/drivers/cpufreq/pasemi-cpufreq.c @@ -131,11 +131,20 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) int err = -ENODEV; cpu = of_get_cpu_node(policy->cpu, NULL); - - of_node_put(cpu); if (!cpu) goto out; + max_freqp = of_get_property(cpu, "clock-frequency", NULL); + if (!max_freqp) { + of_node_put(cpu); + err = -EINVAL; + goto out; + } + + /* we need the freq in kHz */ + max_freq = *max_freqp / 1000; + of_node_put(cpu); + dn = of_find_compatible_node(NULL, NULL, "1682m-sdc"); if (!dn) dn = of_find_compatible_node(NULL, NULL, @@ -172,15 +181,6 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) pr_debug("init cpufreq on CPU %d\n", policy->cpu); - max_freqp = of_get_property(cpu, "clock-frequency", NULL); - if (!max_freqp) { - err = -EINVAL; - goto out_unmap_sdcpwr; - } - - /* we need the freq in kHz */ - max_freq = *max_freqp / 1000; - pr_debug("max clock-frequency is at %u kHz\n", max_freq); pr_debug("initializing frequency table\n"); Though arguably this function should hold a reference to cpu anyway, because it doesn't want the CPU to removed out from under it. It's a CPUfreq driver after all. cheers
Re: [PATCH using git format-patch -C] kexec: add generic support for elf kernel images
Christophe Leroy writes: > Hi Sven, > > Le 08/07/2019 à 12:06, Christophe Leroy a écrit : >> From: Sven Schnelle > > Please describe you patch. > > All the changes you are doing to the ppc64 version in order to make it > generic should be described. > > Those changes are also maybe worth splitting the patch in several parts, > either preparing the ppc64 for generic then moving to kernel/kexec_elf.c > or moving to kernel/kexec_elf.c without modifications, then modifying it > to make it generic. > > Note that your patch only applies on Linux 5.1, it doesn't apply on > powerpc/next. Don't worry about that. By the time it's merged powerpc/next will probably have moved on, and if not I can deal with any conflicts. Using master or powerpc/merge as a base is fine. cheers
Re: [PATCH V2 2/2] ASoC: fsl_esai: recover the channel swap after xrun
> > > > + > > > > + /* restore registers by regcache_sync */ > > > > + fsl_esai_register_restore(esai_priv); > > > > + > > > > + regmap_update_bits(esai_priv->regmap, REG_ESAI_TCR, > > > > +ESAI_xCR_xPR_MASK, 0); > > > > + regmap_update_bits(esai_priv->regmap, REG_ESAI_RCR, > > > > +ESAI_xCR_xPR_MASK, 0); > > > > > > And just for curious, can (or shall) we stuff this personal reset to > > > the reset() function? I found this one is a part of the reset > > > routine being mentioned in the RM -- it was done after ESAI reset is > done via ECR register. > > > > > > > There is a problem to do this, TPR/RPR need to be clear after > > configure the control register. (TCCR, TCR). So it seems not only one > > place (reset function) need to be changed. > > Do you know (or remember) why we suddenly involve this TPR/PRP? > The driver has no problem so far, even if we don't have them. > > The "personal reset" sounds like a feature that we would use to reset TX or > RX individually, while this hw_reset() does a full reset for both TX and RX. > So I wonder whether they're necessary. The hw_reset flow is suggested by design team, so involve TRP/RPP is from them, I don't know the detail. Best regards Wang shengjiu
Re: [RFC PATCH] powerpc/powernv: Add ultravisor message log interface
On Tue, Jul 9, 2019 at 6:12 AM Claudio Carvalho wrote: > > From: Madhavan Srinivasan > > Ultravisor provides an in-memory circular buffer containing a message > log populated with various runtime message produced by firmware. > > Based on "powernv/opal-msglog.c", this patch provides a sysfs interface > /sys/firmware/opal/uv_msglog for userspace to view the messages. No, it's a copy and paste of the existing memcons code with "uv" sprinked around the place. I don't mind stuff being c+p since it's occasionally justified, but be honest about it. > diff --git a/arch/powerpc/platforms/powernv/opal-uv-msglog.c > b/arch/powerpc/platforms/powernv/opal-uv-msglog.c > new file mode 100644 > index ..87d665d7e6ad > --- /dev/null > +++ b/arch/powerpc/platforms/powernv/opal-uv-msglog.c > @@ -0,0 +1,141 @@ > +// SPDX-License-Identifier: GPL-2.0+ > +/* > + * PowerNV OPAL in-memory ultravisor console interface > + * > + * Copyright 2018 IBM Corp. > + */ > +#include > +#include > +#include > +#include > +#include > +#include > + > +/* OPAL in-memory console. Defined in OPAL source at core/console.c */ > +struct memcons { > + __be64 magic; > +#define MEMCONS_MAGIC 0x6630696567726173L > + __be64 obuf_phys; > + __be64 ibuf_phys; > + __be32 obuf_size; > + __be32 ibuf_size; > + __be32 out_pos; > +#define MEMCONS_OUT_POS_WRAP 0x8000u > +#define MEMCONS_OUT_POS_MASK 0x00ffu > + __be32 in_prod; > + __be32 in_cons; > +}; > + > +static struct memcons *opal_uv_memcons; > + > +ssize_t opal_uv_msglog_copy(char *to, loff_t pos, size_t count) > +{ > + const char *conbuf; > + ssize_t ret; > + size_t first_read = 0; > + uint32_t out_pos, avail; > + > + if (!opal_uv_memcons) > + return -ENODEV; > + > + out_pos = be32_to_cpu(READ_ONCE(opal_uv_memcons->out_pos)); > + > + /* > +* Now we've read out_pos, put a barrier in before reading the new > data > +* it points to in conbuf. > +*/ > + smp_rmb(); > + > + conbuf = phys_to_virt(be64_to_cpu(opal_uv_memcons->obuf_phys)); > + > + /* > +* When the buffer has wrapped, read from the out_pos marker to the > end > +* of the buffer, and then read the remaining data as in the > un-wrapped > +* case. > +*/ > + if (out_pos & MEMCONS_OUT_POS_WRAP) { > + > + out_pos &= MEMCONS_OUT_POS_MASK; > + avail = be32_to_cpu(opal_uv_memcons->obuf_size) - out_pos; > + > + ret = memory_read_from_buffer(to, count, , > + conbuf + out_pos, avail); > + > + if (ret < 0) > + goto out; > + > + first_read = ret; > + to += first_read; > + count -= first_read; > + pos -= avail; > + > + if (count <= 0) > + goto out; > + } > + > + /* Sanity check. The firmware should not do this to us. */ > + if (out_pos > be32_to_cpu(opal_uv_memcons->obuf_size)) { > + pr_err("OPAL: memory console corruption. Aborting read.\n"); > + return -EINVAL; > + } > + > + ret = memory_read_from_buffer(to, count, , conbuf, out_pos); > + > + if (ret < 0) > + goto out; > + > + ret += first_read; > +out: > + return ret; > +} Make this take an struct memcons as an argument and use the same function for the opal and UV consoles. Two copies of the same code with tricky barrier crap in them is not a good idea. > +static struct bin_attribute opal_uv_msglog_attr = { > + .attr = {.name = "uv_msglog", .mode = 0444}, We made the OPAL console only readable to root recently, so the mode should be 0400. > + .read = opal_uv_msglog_read > +}; > + > +void __init opal_uv_msglog_init(void) > +{ > + u64 mcaddr; > + struct memcons *mc; Declarations are reverse-christmas-tree, so these should be the other way around. > + > + if (of_property_read_u64(opal_node, "ibm,opal-uv-memcons", )) { > + pr_warn("OPAL: Property ibm,opal-uv-memcons not found\n"); > + return; > + } > + > + mc = phys_to_virt(mcaddr); > + if (!mc) { > + pr_warn("OPAL: uv memory console address is invalid\n"); > + return; > + } > + > + if (be64_to_cpu(mc->magic) != MEMCONS_MAGIC) { > + pr_warn("OPAL: uv memory console version is invalid\n"); > + return; > + } > + > + /* Report maximum size */ > + opal_uv_msglog_attr.size = be32_to_cpu(mc->ibuf_size) + > + be32_to_cpu(mc->obuf_size); > + > + opal_uv_memcons = mc; > +} You can probably share this too if you make it take the DT property name of the memcons address as an argument, e.g: struct memcons *opal_uv_msglog_init(const char *dt_prop_name) > + > +void __init
Re: [PATCH 4/4] powerpc/64: reuse PPC32 static inline flush_dcache_range()
On 7/9/19 7:50 AM, Oliver O'Halloran wrote: On Tue, Jul 9, 2019 at 12:22 AM Aneesh Kumar K.V wrote: Christophe Leroy writes: *snip* + if (IS_ENABLED(CONFIG_PPC64)) + isync(); } Was checking with Michael about why we need that extra isync. Michael pointed this came via https://github.com/mpe/linux-fullhistory/commit/faa5ee3743ff9b6df9f9a03600e34fdae596cfb2#diff-67c7ffa8e420c7d4206cae4a9e888e14 for 970 which doesn't have coherent icache. So possibly isync there is to flush the prefetch instructions? But even so we would need an icbi there before that isync. I don't think it's that, there's some magic in flush_icache_range() to handle dropping prefetched instructions on 970. So overall wondering why we need that extra barriers there. I think the isync is needed there because the architecture only requires sync to provide ordering. A sync alone doesn't guarantee the dcbfs have actually completed so the isync is necessary to ensure the flushed cache lines are back in memory. That said, as far as I know all the IBM book3s chips from power4 onwards will wait for pending dcbfs when they hit a sync, but that might change in the future. ISA doesn't list that as the sequence. Only place where isync was mentioned was w.r.t icbi where want to discards the prefetch. If it's a problem we could add a cpu-feature section around the isync to no-op it in the common case. However, when I had a look with perf it always showed that the sync was the hotspot so I don't think it'll help much. What about the preceding barriers (sync; isync;) before dcbf? Why are they needed? -aneesh
Re: [PATCH v4] cpufreq/pasemi: fix an use-after-free in pas_cpufreq_cpu_init()
On 09-07-19, 07:39, Wen Yang wrote: > The cpu variable is still being used in the of_get_property() call > after the of_node_put() call, which may result in use-after-free. > > Fixes: a9acc26b75f ("cpufreq/pasemi: fix possible object reference leak") > Signed-off-by: Wen Yang > Cc: "Rafael J. Wysocki" > Cc: Viresh Kumar > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux...@vger.kernel.org > Cc: linux-ker...@vger.kernel.org > --- > v4: restore the blank line. I don't find it restored in the below code. > v3: fix a leaked reference. > v2: clean up the code according to the advice of viresh. > > drivers/cpufreq/pasemi-cpufreq.c | 11 +++ > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/drivers/cpufreq/pasemi-cpufreq.c > b/drivers/cpufreq/pasemi-cpufreq.c > index 6b1e4ab..f0c98fc 100644 > --- a/drivers/cpufreq/pasemi-cpufreq.c > +++ b/drivers/cpufreq/pasemi-cpufreq.c > @@ -128,20 +128,21 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > int cur_astate, idx; > struct resource res; > struct device_node *cpu, *dn; > - int err = -ENODEV; > + int err; > > cpu = of_get_cpu_node(policy->cpu, NULL); > > - of_node_put(cpu); > if (!cpu) > - goto out; > + return -ENODEV; > > dn = of_find_compatible_node(NULL, NULL, "1682m-sdc"); > if (!dn) > dn = of_find_compatible_node(NULL, NULL, >"pasemi,pwrficient-sdc"); > - if (!dn) > + if (!dn) { > + err = -ENODEV; > goto out; > + } > err = of_address_to_resource(dn, 0, ); > of_node_put(dn); > if (err) > @@ -196,6 +197,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > policy->cur = pas_freqs[cur_astate].frequency; > ppc_proc_freq = policy->cur * 1000ul; > > + of_node_put(cpu); > return cpufreq_generic_init(policy, pas_freqs, get_gizmo_latency()); > > out_unmap_sdcpwr: > @@ -204,6 +206,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > out_unmap_sdcasr: > iounmap(sdcasr_mapbase); > out: > + of_node_put(cpu); > return err; > } > > -- > 2.9.5 -- viresh
Re: powerpc/83xx: fix use-after-free on mpc831x_usb_cfg()
> > The np variable is still being used after the of_node_put() call, > > > which may result in use-after-free. > > > We fix this issue by calling of_node_put() after the last usage. > > > I imagine that this commit description can be improved a bit more > (by mentioning the influence of “immr_node”?). > How do you think about to omit the word “We” here? > > > > This patatch also do some cleanup. > > > Should the renaming of a jump label be contributed in a separate update > step of a small patch series besides a wording without a typo? Thank you for your comments. We will improve the commit description and send v2 as soon as possible. -- Regards, Wen
Re: [PATCH 4/4] powerpc/64: reuse PPC32 static inline flush_dcache_range()
On Tue, Jul 9, 2019 at 12:22 AM Aneesh Kumar K.V wrote: > > Christophe Leroy writes: > > > *snip* > > + if (IS_ENABLED(CONFIG_PPC64)) > > + isync(); > > } > > > Was checking with Michael about why we need that extra isync. Michael > pointed this came via > > https://github.com/mpe/linux-fullhistory/commit/faa5ee3743ff9b6df9f9a03600e34fdae596cfb2#diff-67c7ffa8e420c7d4206cae4a9e888e14 > > for 970 which doesn't have coherent icache. So possibly isync there is > to flush the prefetch instructions? But even so we would need an icbi > there before that isync. I don't think it's that, there's some magic in flush_icache_range() to handle dropping prefetched instructions on 970. > So overall wondering why we need that extra barriers there. I think the isync is needed there because the architecture only requires sync to provide ordering. A sync alone doesn't guarantee the dcbfs have actually completed so the isync is necessary to ensure the flushed cache lines are back in memory. That said, as far as I know all the IBM book3s chips from power4 onwards will wait for pending dcbfs when they hit a sync, but that might change in the future. If it's a problem we could add a cpu-feature section around the isync to no-op it in the common case. However, when I had a look with perf it always showed that the sync was the hotspot so I don't think it'll help much. Oliver
Re:[PATCH v3] cpufreq/pasemi: fix an use-after-free inpas_cpufreq_cpu_init()
> Hello Wen, > > Thanks for your patch! > > Did you test your patch with a P.A. Semi board? > Hello Christian, thank you. We don't have a P.A. Semi board yet, so we didn't test it. If you have such a board, could you please kindly help to test it? -- Thanks and regards, Wen
Re: [PATCH V2 2/2] ASoC: fsl_esai: recover the channel swap after xrun
On Fri, Jul 05, 2019 at 07:03:47AM +, S.j. Wang wrote: > > > > > + > > > + /* restore registers by regcache_sync */ > > > + fsl_esai_register_restore(esai_priv); > > > + > > > + regmap_update_bits(esai_priv->regmap, REG_ESAI_TCR, > > > +ESAI_xCR_xPR_MASK, 0); > > > + regmap_update_bits(esai_priv->regmap, REG_ESAI_RCR, > > > +ESAI_xCR_xPR_MASK, 0); > > > > And just for curious, can (or shall) we stuff this personal reset to the > > reset() > > function? I found this one is a part of the reset routine being mentioned in > > the RM -- it was done after ESAI reset is done via ECR register. > > > > There is a problem to do this, TPR/RPR need to be clear after configure the > control > register. (TCCR, TCR). So it seems not only one place (reset function) need > to be > changed. Do you know (or remember) why we suddenly involve this TPR/PRP? The driver has no problem so far, even if we don't have them. The "personal reset" sounds like a feature that we would use to reset TX or RX individually, while this hw_reset() does a full reset for both TX and RX. So I wonder whether they're necessary.
Re: [PATCH kernel v3 3/3] powerpc/powernv/ioda2: Create bigger default window with 64k IOMMU pages
On 08/07/2019 17:01, alist...@popple.id.au wrote: Hi Alexey, Couple of comment/questions below. - /* - * Reserve page 0 so it will not be used for any mappings. - * This avoids buggy drivers that consider page 0 to be invalid - * to crash the machine or even lose data. - */ - if (tbl->it_offset == 0) - set_bit(0, tbl->it_map); + tbl->it_reserved_start = res_start; + tbl->it_reserved_end = res_end; + iommu_table_reserve_pages(tbl); Personally I think it would be cleaner to set tbl->it_reserved_start/end in iommu_table_reserve_pages() rather than separating the setup like this. Yeah I suppose... /* We only split the IOMMU table if we have 1GB or more of space */ if ((tbl->it_size << tbl->it_page_shift) >= (1UL * 1024 * 1024 * 1024)) @@ -727,12 +755,7 @@ static void iommu_table_free(struct kref *kref) return; } - /* - * In case we have reserved the first bit, we should not emit - * the warning below. - */ - if (tbl->it_offset == 0) - clear_bit(0, tbl->it_map); + iommu_table_release_pages(tbl); /* verify that table contains no entries */ if (!bitmap_empty(tbl->it_map, tbl->it_size)) @@ -1037,8 +1060,7 @@ int iommu_take_ownership(struct iommu_table *tbl) for (i = 0; i < tbl->nr_pools; i++) spin_lock(>pools[i].lock); - if (tbl->it_offset == 0) - clear_bit(0, tbl->it_map); + iommu_table_reserve_pages(tbl); if (!bitmap_empty(tbl->it_map, tbl->it_size)) { pr_err("iommu_tce: it_map is not empty"); @@ -1068,9 +1090,7 @@ void iommu_release_ownership(struct iommu_table *tbl) memset(tbl->it_map, 0, sz); - /* Restore bit#0 set by iommu_init_table() */ - if (tbl->it_offset == 0) - set_bit(0, tbl->it_map); + iommu_table_release_pages(tbl); Are the above two changes correct? No they are not, this is a bug. Thanks for spotting, repost is coming. From the perspective of code behaviour this seems to swap the set/clear_bit() calls. For example above you are replacing set_bit(0, tbl->it_map) with a call to iommu_table_release_pages() which does clear_bit(0, tbl->it_map) instead. - Alistair for (i = 0; i < tbl->nr_pools; i++) spin_unlock(>pools[i].lock); diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c index 126602b4e399..ce2efdb3900d 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -2422,6 +2422,7 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe) { struct iommu_table *tbl = NULL; long rc; + unsigned long res_start, res_end; /* * crashkernel= specifies the kdump kernel's maximum memory at @@ -2435,19 +2436,46 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe) * DMA window can be larger than available memory, which will * cause errors later. */ - const u64 window_size = min((u64)pe->table_group.tce32_size, max_memory); + const u64 maxblock = 1UL << (PAGE_SHIFT + MAX_ORDER - 1); - rc = pnv_pci_ioda2_create_table(>table_group, 0, - IOMMU_PAGE_SHIFT_4K, - window_size, - POWERNV_IOMMU_DEFAULT_LEVELS, false, ); + /* + * We create the default window as big as we can. The constraint is + * the max order of allocation possible. The TCE tableis likely to + * end up being multilevel and with on-demand allocation in place, + * the initial use is not going to be huge as the default window aims + * to support cripplied devices (i.e. not fully 64bit DMAble) only. + */ + /* iommu_table::it_map uses 1 bit per IOMMU page, hence 8 */ + const u64 window_size = min((maxblock * 8) << PAGE_SHIFT, max_memory); + /* Each TCE level cannot exceed maxblock so go multilevel if needed */ + unsigned long tces_order = ilog2(window_size >> PAGE_SHIFT); + unsigned long tcelevel_order = ilog2(maxblock >> 3); + unsigned int levels = tces_order / tcelevel_order; + + if (tces_order % tcelevel_order) + levels += 1; + /* + * We try to stick to default levels (which is >1 at the moment) in + * order to save memory by relying on on-demain TCE level allocation. + */ + levels = max_t(unsigned int, levels, POWERNV_IOMMU_DEFAULT_LEVELS); + + rc = pnv_pci_ioda2_create_table(>table_group, 0, PAGE_SHIFT, + window_size, levels, false, ); if (rc) { pe_err(pe, "Failed to create 32-bit TCE table, err %ld", rc); return rc; } - iommu_init_table(tbl, pe->phb->hose->node); + /* We use top part of 32bit space for MMIO so exclude it from DMA */ + res_start = 0; + res_end = 0; + if (window_size > pe->phb->ioda.m32_pci_base) { + res_start = pe->phb->ioda.m32_pci_base >> tbl->it_page_shift; + res_end = min(window_size, SZ_4G) >> tbl->it_page_shift; + } +
[PATCH v4] cpufreq/pasemi: fix an use-after-free in pas_cpufreq_cpu_init()
The cpu variable is still being used in the of_get_property() call after the of_node_put() call, which may result in use-after-free. Fixes: a9acc26b75f ("cpufreq/pasemi: fix possible object reference leak") Signed-off-by: Wen Yang Cc: "Rafael J. Wysocki" Cc: Viresh Kumar Cc: linuxppc-dev@lists.ozlabs.org Cc: linux...@vger.kernel.org Cc: linux-ker...@vger.kernel.org --- v4: restore the blank line. v3: fix a leaked reference. v2: clean up the code according to the advice of viresh. drivers/cpufreq/pasemi-cpufreq.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/cpufreq/pasemi-cpufreq.c b/drivers/cpufreq/pasemi-cpufreq.c index 6b1e4ab..f0c98fc 100644 --- a/drivers/cpufreq/pasemi-cpufreq.c +++ b/drivers/cpufreq/pasemi-cpufreq.c @@ -128,20 +128,21 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) int cur_astate, idx; struct resource res; struct device_node *cpu, *dn; - int err = -ENODEV; + int err; cpu = of_get_cpu_node(policy->cpu, NULL); - of_node_put(cpu); if (!cpu) - goto out; + return -ENODEV; dn = of_find_compatible_node(NULL, NULL, "1682m-sdc"); if (!dn) dn = of_find_compatible_node(NULL, NULL, "pasemi,pwrficient-sdc"); - if (!dn) + if (!dn) { + err = -ENODEV; goto out; + } err = of_address_to_resource(dn, 0, ); of_node_put(dn); if (err) @@ -196,6 +197,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) policy->cur = pas_freqs[cur_astate].frequency; ppc_proc_freq = policy->cur * 1000ul; + of_node_put(cpu); return cpufreq_generic_init(policy, pas_freqs, get_gizmo_latency()); out_unmap_sdcpwr: @@ -204,6 +206,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) out_unmap_sdcasr: iounmap(sdcasr_mapbase); out: + of_node_put(cpu); return err; } -- 2.9.5
Re: [PATCH v2] tpm: tpm_ibm_vtpm: Fix unallocated banks
On Mon, Jul 08, 2019 at 06:24:04PM -0400, Mimi Zohar wrote: > Hi Jarkko, > > On Mon, 2019-07-08 at 18:11 +0300, Jarkko Sakkinen wrote: > > On Sat, 2019-07-06 at 20:18 -0400, Nayna Jain wrote: > > > +/* > > > + * tpm_get_pcr_allocation() - initialize the chip allocated banks for > > > PCRs > > > + * @chip: TPM chip to use. > > > + */ > > > +static int tpm_get_pcr_allocation(struct tpm_chip *chip) > > > +{ > > > + int rc; > > > + > > > + if (chip->flags & TPM_CHIP_FLAG_TPM2) > > > + rc = tpm2_get_pcr_allocation(chip); > > > + else > > > + rc = tpm1_get_pcr_allocation(chip); > > > + > > > + return rc; > > > +} > > > > It is just a trivial static function, which means that kdoc comment is > > not required and neither it is useful. Please remove that. I would > > rewrite the function like: > > > > static int tpm_get_pcr_allocation(struct tpm_chip *chip) > > { > > int rc; > > > > rc = (chip->flags & TPM_CHIP_FLAG_TPM2) ? > > tpm2_get_pcr_allocation(chip) : > > tpm1_get_pcr_allocation(chip); > > > > > return rc > 0 ? -ENODEV : rc; > > } > > > > This addresses the issue that Stefan also pointed out. You have to > > deal with the TPM error codes. > > Hm, in the past I was told by Christoph not to use the ternary > operator. Have things changed? Other than removing the comment, the > only other difference is the return. I also dislike most use of ternary expressions.. Also, it is not so nice that TPM error codes and errno are multiplexed on the same 'int' type - very hard to get this right. I'm not sure anything actually needs the tpm error, and if it does maybe we should be mapping those special cases to errno instead. Jason
Re: [PATCH v2] tpm: tpm_ibm_vtpm: Fix unallocated banks
On Mon, Jul 08, 2019 at 06:24:04PM -0400, Mimi Zohar wrote: > > static int tpm_get_pcr_allocation(struct tpm_chip *chip) > > { > > int rc; > > > > rc = (chip->flags & TPM_CHIP_FLAG_TPM2) ? > > tpm2_get_pcr_allocation(chip) : > > tpm1_get_pcr_allocation(chip); > > > > > return rc > 0 ? -ENODEV : rc; > > } > > > > This addresses the issue that Stefan also pointed out. You have to > > deal with the TPM error codes. > > Hm, in the past I was told by Christoph not to use the ternary > operator. Have things changed? Other than removing the comment, the > only other difference is the return. In the end it is a matter of personal preference, but I find the quote version above using the ternary horribly obsfucated.
Re: [PATCH v2] tpm: tpm_ibm_vtpm: Fix unallocated banks
Hi Jarkko, On Mon, 2019-07-08 at 18:11 +0300, Jarkko Sakkinen wrote: > On Sat, 2019-07-06 at 20:18 -0400, Nayna Jain wrote: > > +/* > > + * tpm_get_pcr_allocation() - initialize the chip allocated banks for PCRs > > + * @chip: TPM chip to use. > > + */ > > +static int tpm_get_pcr_allocation(struct tpm_chip *chip) > > +{ > > + int rc; > > + > > + if (chip->flags & TPM_CHIP_FLAG_TPM2) > > + rc = tpm2_get_pcr_allocation(chip); > > + else > > + rc = tpm1_get_pcr_allocation(chip); > > + > > + return rc; > > +} > > It is just a trivial static function, which means that kdoc comment is > not required and neither it is useful. Please remove that. I would > rewrite the function like: > > static int tpm_get_pcr_allocation(struct tpm_chip *chip) > { > int rc; > > rc = (chip->flags & TPM_CHIP_FLAG_TPM2) ? >tpm2_get_pcr_allocation(chip) : >tpm1_get_pcr_allocation(chip); > > return rc > 0 ? -ENODEV : rc; > } > > This addresses the issue that Stefan also pointed out. You have to > deal with the TPM error codes. Hm, in the past I was told by Christoph not to use the ternary operator. Have things changed? Other than removing the comment, the only other difference is the return. Mimi
[PATCH v3] cpufreq/pasemi: fix an use-after-free in pas_cpufreq_cpu_init()
Hello Wen, Thanks for your patch! Did you test your patch with a P.A. Semi board? Cheers, Christian
Re: [PATCH v4 7/8] KVM: PPC: Ultravisor: Enter a secure guest
On 7/8/19 5:53 PM, janani wrote: > On 2019-06-28 15:08, Claudio Carvalho wrote: >> From: Sukadev Bhattiprolu >> >> To enter a secure guest, we have to go through the ultravisor, therefore >> we do a ucall when we are entering a secure guest. >> >> This change is needed for any sort of entry to the secure guest from the >> hypervisor, whether it is a return from an hcall, a return from a >> hypervisor interrupt, or the first time that a secure guest vCPU is run. >> >> If we are returning from an hcall, the results are already in the >> appropriate registers R3:12, except for R3, R6 and R7. R3 has the status >> of the reflected hcall, therefore we move it to R0 for the ultravisor and >> set R3 to the UV_RETURN ucall number. R6,7 were used as temporary >> registers, hence we restore them. >> >> Have fast_guest_return check the kvm_arch.secure_guest field so that a >> new CPU enters UV when started (in response to a RTAS start-cpu call). >> >> Thanks to input from Paul Mackerras, Ram Pai and Mike Anderson. >> >> Signed-off-by: Sukadev Bhattiprolu >> [ Pass SRR1 in r11 for UV_RETURN, fix kvmppc_msr_interrupt to preserve >> the MSR_S bit ] >> Signed-off-by: Paul Mackerras >> [ Fix UV_RETURN ucall number and arch.secure_guest check ] >> Signed-off-by: Ram Pai >> [ Save the actual R3 in R0 for the ultravisor and use R3 for the >> UV_RETURN ucall number. Update commit message and ret_to_ultra comment ] >> Signed-off-by: Claudio Carvalho > Reviewed-by: Janani Janakiraman Thanks Janani for reviewing the patchset. Claudio >> --- >> arch/powerpc/include/asm/kvm_host.h | 1 + >> arch/powerpc/include/asm/ultravisor-api.h | 1 + >> arch/powerpc/kernel/asm-offsets.c | 1 + >> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 40 +++ >> 4 files changed, 37 insertions(+), 6 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/kvm_host.h >> b/arch/powerpc/include/asm/kvm_host.h >> index 013c76a0a03e..184becb62ea4 100644 >> --- a/arch/powerpc/include/asm/kvm_host.h >> +++ b/arch/powerpc/include/asm/kvm_host.h >> @@ -294,6 +294,7 @@ struct kvm_arch { >> cpumask_t cpu_in_guest; >> u8 radix; >> u8 fwnmi_enabled; >> + u8 secure_guest; >> bool threads_indep; >> bool nested_enable; >> pgd_t *pgtable; >> diff --git a/arch/powerpc/include/asm/ultravisor-api.h >> b/arch/powerpc/include/asm/ultravisor-api.h >> index 141940771add..7c4d0b4ced12 100644 >> --- a/arch/powerpc/include/asm/ultravisor-api.h >> +++ b/arch/powerpc/include/asm/ultravisor-api.h >> @@ -19,5 +19,6 @@ >> >> /* opcodes */ >> #define UV_WRITE_PATE 0xF104 >> +#define UV_RETURN 0xF11C >> >> #endif /* _ASM_POWERPC_ULTRAVISOR_API_H */ >> diff --git a/arch/powerpc/kernel/asm-offsets.c >> b/arch/powerpc/kernel/asm-offsets.c >> index 8e02444e9d3d..44742724513e 100644 >> --- a/arch/powerpc/kernel/asm-offsets.c >> +++ b/arch/powerpc/kernel/asm-offsets.c >> @@ -508,6 +508,7 @@ int main(void) >> OFFSET(KVM_VRMA_SLB_V, kvm, arch.vrma_slb_v); >> OFFSET(KVM_RADIX, kvm, arch.radix); >> OFFSET(KVM_FWNMI, kvm, arch.fwnmi_enabled); >> + OFFSET(KVM_SECURE_GUEST, kvm, arch.secure_guest); >> OFFSET(VCPU_DSISR, kvm_vcpu, arch.shregs.dsisr); >> OFFSET(VCPU_DAR, kvm_vcpu, arch.shregs.dar); >> OFFSET(VCPU_VPA, kvm_vcpu, arch.vpa.pinned_addr); >> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> index cffb365d9d02..89813ca987c2 100644 >> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> @@ -36,6 +36,7 @@ >> #include >> #include >> #include >> +#include >> >> /* Sign-extend HDEC if not on POWER9 */ >> #define EXTEND_HDEC(reg) \ >> @@ -1092,16 +1093,12 @@ BEGIN_FTR_SECTION >> END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) >> >> ld r5, VCPU_LR(r4) >> - ld r6, VCPU_CR(r4) >> mtlr r5 >> - mtcr r6 >> >> ld r1, VCPU_GPR(R1)(r4) >> ld r2, VCPU_GPR(R2)(r4) >> ld r3, VCPU_GPR(R3)(r4) >> ld r5, VCPU_GPR(R5)(r4) >> - ld r6, VCPU_GPR(R6)(r4) >> - ld r7, VCPU_GPR(R7)(r4) >> ld r8, VCPU_GPR(R8)(r4) >> ld r9, VCPU_GPR(R9)(r4) >> ld r10, VCPU_GPR(R10)(r4) >> @@ -1119,10 +1116,38 @@ BEGIN_FTR_SECTION >> mtspr SPRN_HDSISR, r0 >> END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300) >> >> + ld r6, VCPU_KVM(r4) >> + lbz r7, KVM_SECURE_GUEST(r6) >> + cmpdi r7, 0 >> + bne ret_to_ultra >> + >> + lwz r6, VCPU_CR(r4) >> + mtcr r6 >> + >> + ld r7, VCPU_GPR(R7)(r4) >> + ld r6, VCPU_GPR(R6)(r4) >> ld r0, VCPU_GPR(R0)(r4) >> ld r4, VCPU_GPR(R4)(r4) >> HRFI_TO_GUEST >> b . >> +/* >> + * We are entering a secure guest, so we have to invoke the ultravisor >> to do >> + * that. If we are returning from a hcall, the results are already in the >> + * appropriate registers R3:12, except for R3, R6 and R7. R3 has
Re: [PATCH v4 8/8] KVM: PPC: Ultravisor: Check for MSR_S during hv_reset_msr
On 2019-06-28 15:08, Claudio Carvalho wrote: From: Michael Anderson - Check for MSR_S so that kvmppc_set_msr will include it. Prior to this change return to guest would not have the S bit set. - Patch based on comment from Paul Mackerras Signed-off-by: Michael Anderson Signed-off-by: Claudio Carvalho Acked-by: Paul Mackerras Reviewed-by: Janani Janakiraman --- arch/powerpc/kvm/book3s_64_mmu_hv.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index ab3d484c5e2e..ab62a66f9b4e 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -295,6 +295,7 @@ static void kvmppc_mmu_book3s_64_hv_reset_msr(struct kvm_vcpu *vcpu) msr |= MSR_TS_S; else msr |= vcpu->arch.shregs.msr & MSR_TS_MASK; + msr |= vcpu->arch.shregs.msr & MSR_S; kvmppc_set_msr(vcpu, msr); }
Re: [PATCH v4 7/8] KVM: PPC: Ultravisor: Enter a secure guest
On 2019-06-28 15:08, Claudio Carvalho wrote: From: Sukadev Bhattiprolu To enter a secure guest, we have to go through the ultravisor, therefore we do a ucall when we are entering a secure guest. This change is needed for any sort of entry to the secure guest from the hypervisor, whether it is a return from an hcall, a return from a hypervisor interrupt, or the first time that a secure guest vCPU is run. If we are returning from an hcall, the results are already in the appropriate registers R3:12, except for R3, R6 and R7. R3 has the status of the reflected hcall, therefore we move it to R0 for the ultravisor and set R3 to the UV_RETURN ucall number. R6,7 were used as temporary registers, hence we restore them. Have fast_guest_return check the kvm_arch.secure_guest field so that a new CPU enters UV when started (in response to a RTAS start-cpu call). Thanks to input from Paul Mackerras, Ram Pai and Mike Anderson. Signed-off-by: Sukadev Bhattiprolu [ Pass SRR1 in r11 for UV_RETURN, fix kvmppc_msr_interrupt to preserve the MSR_S bit ] Signed-off-by: Paul Mackerras [ Fix UV_RETURN ucall number and arch.secure_guest check ] Signed-off-by: Ram Pai [ Save the actual R3 in R0 for the ultravisor and use R3 for the UV_RETURN ucall number. Update commit message and ret_to_ultra comment ] Signed-off-by: Claudio Carvalho Reviewed-by: Janani Janakiraman --- arch/powerpc/include/asm/kvm_host.h | 1 + arch/powerpc/include/asm/ultravisor-api.h | 1 + arch/powerpc/kernel/asm-offsets.c | 1 + arch/powerpc/kvm/book3s_hv_rmhandlers.S | 40 +++ 4 files changed, 37 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 013c76a0a03e..184becb62ea4 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -294,6 +294,7 @@ struct kvm_arch { cpumask_t cpu_in_guest; u8 radix; u8 fwnmi_enabled; + u8 secure_guest; bool threads_indep; bool nested_enable; pgd_t *pgtable; diff --git a/arch/powerpc/include/asm/ultravisor-api.h b/arch/powerpc/include/asm/ultravisor-api.h index 141940771add..7c4d0b4ced12 100644 --- a/arch/powerpc/include/asm/ultravisor-api.h +++ b/arch/powerpc/include/asm/ultravisor-api.h @@ -19,5 +19,6 @@ /* opcodes */ #define UV_WRITE_PATE 0xF104 +#define UV_RETURN 0xF11C #endif /* _ASM_POWERPC_ULTRAVISOR_API_H */ diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 8e02444e9d3d..44742724513e 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -508,6 +508,7 @@ int main(void) OFFSET(KVM_VRMA_SLB_V, kvm, arch.vrma_slb_v); OFFSET(KVM_RADIX, kvm, arch.radix); OFFSET(KVM_FWNMI, kvm, arch.fwnmi_enabled); + OFFSET(KVM_SECURE_GUEST, kvm, arch.secure_guest); OFFSET(VCPU_DSISR, kvm_vcpu, arch.shregs.dsisr); OFFSET(VCPU_DAR, kvm_vcpu, arch.shregs.dar); OFFSET(VCPU_VPA, kvm_vcpu, arch.vpa.pinned_addr); diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index cffb365d9d02..89813ca987c2 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -36,6 +36,7 @@ #include #include #include +#include /* Sign-extend HDEC if not on POWER9 */ #define EXTEND_HDEC(reg) \ @@ -1092,16 +1093,12 @@ BEGIN_FTR_SECTION END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r5, VCPU_LR(r4) - ld r6, VCPU_CR(r4) mtlrr5 - mtcrr6 ld r1, VCPU_GPR(R1)(r4) ld r2, VCPU_GPR(R2)(r4) ld r3, VCPU_GPR(R3)(r4) ld r5, VCPU_GPR(R5)(r4) - ld r6, VCPU_GPR(R6)(r4) - ld r7, VCPU_GPR(R7)(r4) ld r8, VCPU_GPR(R8)(r4) ld r9, VCPU_GPR(R9)(r4) ld r10, VCPU_GPR(R10)(r4) @@ -1119,10 +1116,38 @@ BEGIN_FTR_SECTION mtspr SPRN_HDSISR, r0 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300) + ld r6, VCPU_KVM(r4) + lbz r7, KVM_SECURE_GUEST(r6) + cmpdi r7, 0 + bne ret_to_ultra + + lwz r6, VCPU_CR(r4) + mtcrr6 + + ld r7, VCPU_GPR(R7)(r4) + ld r6, VCPU_GPR(R6)(r4) ld r0, VCPU_GPR(R0)(r4) ld r4, VCPU_GPR(R4)(r4) HRFI_TO_GUEST b . +/* + * We are entering a secure guest, so we have to invoke the ultravisor to do + * that. If we are returning from a hcall, the results are already in the + * appropriate registers R3:12, except for R3, R6 and R7. R3 has the status of + * the reflected hcall, therefore we move it to R0 for the ultravisor and set + * R3 to the UV_RETURN ucall number. R6,7 were used as temporary registers + * above, hence we restore them. + */ +ret_to_ultra: + lwz r6, VCPU_CR(r4) +
Re: [PATCH v4 6/8] KVM: PPC: Ultravisor: Restrict LDBAR access
On 2019-06-28 15:08, Claudio Carvalho wrote: When the ultravisor firmware is available, it takes control over the LDBAR register. In this case, thread-imc updates and save/restore operations on the LDBAR register are handled by ultravisor. Signed-off-by: Claudio Carvalho Reviewed-by: Ram Pai Reviewed-by: Ryan Grimm Acked-by: Madhavan Srinivasan Acked-by: Paul Mackerras Reviewed-by: Janani Janakiraman --- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 ++ arch/powerpc/platforms/powernv/idle.c | 6 -- arch/powerpc/platforms/powernv/opal-imc.c | 4 3 files changed, 10 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index f9b2620fbecd..cffb365d9d02 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -375,8 +375,10 @@ BEGIN_FTR_SECTION mtspr SPRN_RPR, r0 ld r0, KVM_SPLIT_PMMAR(r6) mtspr SPRN_PMMAR, r0 +BEGIN_FW_FTR_SECTION_NESTED(70) ld r0, KVM_SPLIT_LDBAR(r6) mtspr SPRN_LDBAR, r0 +END_FW_FTR_SECTION_NESTED(FW_FEATURE_ULTRAVISOR, 0, 70) isync FTR_SECTION_ELSE /* On P9 we use the split_info for coordinating LPCR changes */ diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c index 77f2e0a4ee37..5593a2d55959 100644 --- a/arch/powerpc/platforms/powernv/idle.c +++ b/arch/powerpc/platforms/powernv/idle.c @@ -679,7 +679,8 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on) sprs.ptcr = mfspr(SPRN_PTCR); sprs.rpr= mfspr(SPRN_RPR); sprs.tscr = mfspr(SPRN_TSCR); - sprs.ldbar = mfspr(SPRN_LDBAR); + if (!firmware_has_feature(FW_FEATURE_ULTRAVISOR)) + sprs.ldbar = mfspr(SPRN_LDBAR); sprs_saved = true; @@ -762,7 +763,8 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on) mtspr(SPRN_PTCR,sprs.ptcr); mtspr(SPRN_RPR, sprs.rpr); mtspr(SPRN_TSCR,sprs.tscr); - mtspr(SPRN_LDBAR, sprs.ldbar); + if (!firmware_has_feature(FW_FEATURE_ULTRAVISOR)) + mtspr(SPRN_LDBAR, sprs.ldbar); if (pls >= pnv_first_tb_loss_level) { /* TB loss */ diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c index 1b6932890a73..5fe2d4526cbc 100644 --- a/arch/powerpc/platforms/powernv/opal-imc.c +++ b/arch/powerpc/platforms/powernv/opal-imc.c @@ -254,6 +254,10 @@ static int opal_imc_counters_probe(struct platform_device *pdev) bool core_imc_reg = false, thread_imc_reg = false; u32 type; + /* Disable IMC devices, when Ultravisor is enabled. */ + if (firmware_has_feature(FW_FEATURE_ULTRAVISOR)) + return -EACCES; + /* * Check whether this is kdump kernel. If yes, force the engines to * stop and return.
[RFC PATCH] powerpc/powernv: Add ultravisor message log interface
From: Madhavan Srinivasan Ultravisor provides an in-memory circular buffer containing a message log populated with various runtime message produced by firmware. Based on "powernv/opal-msglog.c", this patch provides a sysfs interface /sys/firmware/opal/uv_msglog for userspace to view the messages. CC: Joel Stanley CC: Oliver O'Halloran Signed-off-by: Madhavan Srinivasan [ Read ibm,opal-uv-memcons instead of OPAL's ] Signed-off-by: Ryan Grimm [ Fix license, update the commit message ] Signed-off-by: Claudio Carvalho --- arch/powerpc/include/asm/opal.h | 2 + arch/powerpc/platforms/powernv/Makefile | 1 + .../platforms/powernv/opal-uv-msglog.c| 141 ++ arch/powerpc/platforms/powernv/opal.c | 2 + 4 files changed, 146 insertions(+) create mode 100644 arch/powerpc/platforms/powernv/opal-uv-msglog.c diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h index 4cc37e708bc7..86299c5d866b 100644 --- a/arch/powerpc/include/asm/opal.h +++ b/arch/powerpc/include/asm/opal.h @@ -351,6 +351,8 @@ extern void opal_platform_dump_init(void); extern void opal_sys_param_init(void); extern void opal_msglog_init(void); extern void opal_msglog_sysfs_init(void); +extern void opal_uv_msglog_init(void); +extern void opal_uv_msglog_sysfs_init(void); extern int opal_async_comp_init(void); extern int opal_sensor_init(void); extern int opal_hmi_handler_init(void); diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile index da2e99efbd04..216629c63e1d 100644 --- a/arch/powerpc/platforms/powernv/Makefile +++ b/arch/powerpc/platforms/powernv/Makefile @@ -4,6 +4,7 @@ obj-y += idle.o opal-rtc.o opal-nvram.o opal-lpc.o opal-flash.o obj-y += rng.o opal-elog.o opal-dump.o opal-sysparam.o opal-sensor.o obj-y += opal-msglog.o opal-hmi.o opal-power.o opal-irqchip.o obj-y += opal-kmsg.o opal-powercap.o opal-psr.o opal-sensor-groups.o +obj-y += opal-uv-msglog.o obj-$(CONFIG_SMP) += smp.o subcore.o subcore-asm.o obj-$(CONFIG_PCI) += pci.o pci-ioda.o npu-dma.o pci-ioda-tce.o diff --git a/arch/powerpc/platforms/powernv/opal-uv-msglog.c b/arch/powerpc/platforms/powernv/opal-uv-msglog.c new file mode 100644 index ..87d665d7e6ad --- /dev/null +++ b/arch/powerpc/platforms/powernv/opal-uv-msglog.c @@ -0,0 +1,141 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * PowerNV OPAL in-memory ultravisor console interface + * + * Copyright 2018 IBM Corp. + */ +#include +#include +#include +#include +#include +#include + +/* OPAL in-memory console. Defined in OPAL source at core/console.c */ +struct memcons { + __be64 magic; +#define MEMCONS_MAGIC 0x6630696567726173L + __be64 obuf_phys; + __be64 ibuf_phys; + __be32 obuf_size; + __be32 ibuf_size; + __be32 out_pos; +#define MEMCONS_OUT_POS_WRAP 0x8000u +#define MEMCONS_OUT_POS_MASK 0x00ffu + __be32 in_prod; + __be32 in_cons; +}; + +static struct memcons *opal_uv_memcons; + +ssize_t opal_uv_msglog_copy(char *to, loff_t pos, size_t count) +{ + const char *conbuf; + ssize_t ret; + size_t first_read = 0; + uint32_t out_pos, avail; + + if (!opal_uv_memcons) + return -ENODEV; + + out_pos = be32_to_cpu(READ_ONCE(opal_uv_memcons->out_pos)); + + /* +* Now we've read out_pos, put a barrier in before reading the new data +* it points to in conbuf. +*/ + smp_rmb(); + + conbuf = phys_to_virt(be64_to_cpu(opal_uv_memcons->obuf_phys)); + + /* +* When the buffer has wrapped, read from the out_pos marker to the end +* of the buffer, and then read the remaining data as in the un-wrapped +* case. +*/ + if (out_pos & MEMCONS_OUT_POS_WRAP) { + + out_pos &= MEMCONS_OUT_POS_MASK; + avail = be32_to_cpu(opal_uv_memcons->obuf_size) - out_pos; + + ret = memory_read_from_buffer(to, count, , + conbuf + out_pos, avail); + + if (ret < 0) + goto out; + + first_read = ret; + to += first_read; + count -= first_read; + pos -= avail; + + if (count <= 0) + goto out; + } + + /* Sanity check. The firmware should not do this to us. */ + if (out_pos > be32_to_cpu(opal_uv_memcons->obuf_size)) { + pr_err("OPAL: memory console corruption. Aborting read.\n"); + return -EINVAL; + } + + ret = memory_read_from_buffer(to, count, , conbuf, out_pos); + + if (ret < 0) + goto out; + + ret += first_read; +out: + return ret; +} + +static ssize_t opal_uv_msglog_read(struct file *file, struct kobject *kobj, +
Re: [PATCH v4 5/8] KVM: PPC: Ultravisor: Restrict flush of the partition tlb cache
On 7/1/19 2:54 AM, Alexey Kardashevskiy wrote: > > On 29/06/2019 06:08, Claudio Carvalho wrote: >> From: Ram Pai >> >> Ultravisor is responsible for flushing the tlb cache, since it manages >> the PATE entries. Hence skip tlb flush, if the ultravisor firmware is >> available. >> >> Signed-off-by: Ram Pai >> Signed-off-by: Claudio Carvalho >> --- >> arch/powerpc/mm/book3s64/pgtable.c | 33 +- >> 1 file changed, 19 insertions(+), 14 deletions(-) >> >> diff --git a/arch/powerpc/mm/book3s64/pgtable.c >> b/arch/powerpc/mm/book3s64/pgtable.c >> index 224c5c7c2e3d..bc8eb2bf9810 100644 >> --- a/arch/powerpc/mm/book3s64/pgtable.c >> +++ b/arch/powerpc/mm/book3s64/pgtable.c >> @@ -224,6 +224,23 @@ void __init mmu_partition_table_init(void) >> powernv_set_nmmu_ptcr(ptcr); >> } >> >> +static void flush_partition(unsigned int lpid, unsigned long dw0) >> +{ >> +if (dw0 & PATB_HR) { >> +asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 1) : : >> + "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); >> +asm volatile(PPC_TLBIE_5(%0, %1, 2, 1, 1) : : >> + "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); >> +trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 1); >> +} else { >> +asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 0) : : >> + "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); >> +trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 0); >> +} >> +/* do we need fixup here ?*/ >> +asm volatile("eieio; tlbsync; ptesync" : : : "memory"); >> +} >> + >> static void __mmu_partition_table_set_entry(unsigned int lpid, >> unsigned long dw0, >> unsigned long dw1) >> @@ -238,20 +255,8 @@ static void __mmu_partition_table_set_entry(unsigned >> int lpid, >> * The type of flush (hash or radix) depends on what the previous >> * use of this partition ID was, not the new use. >> */ >> -asm volatile("ptesync" : : : "memory"); >> -if (old & PATB_HR) { >> -asm volatile(PPC_TLBIE_5(%0,%1,2,0,1) : : >> - "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); >> -asm volatile(PPC_TLBIE_5(%0,%1,2,1,1) : : >> - "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); >> -trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 1); >> -} else { >> -asm volatile(PPC_TLBIE_5(%0,%1,2,0,0) : : >> - "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); >> -trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 0); >> -} >> -/* do we need fixup here ?*/ >> -asm volatile("eieio; tlbsync; ptesync" : : : "memory"); >> +if (!firmware_has_feature(FW_FEATURE_ULTRAVISOR)) > > __mmu_partition_table_set_entry() checks for UV and > mmu_partition_table_set_entry() (the caller) checks for UV and the whole > point of having separate flush_partition() and > __mmu_partition_table_set_entry() is not really clear. > > > 4/8 and 5/8 make more sense as one patch imho. Makes sense. I merged them in the next version. Thanks. Claudio > > >> +flush_partition(lpid, old); >> } >> >> void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0, >>
Re: [PATCH v4 5/8] KVM: PPC: Ultravisor: Restrict flush of the partition tlb cache
On 2019-06-28 15:08, Claudio Carvalho wrote: From: Ram Pai Ultravisor is responsible for flushing the tlb cache, since it manages the PATE entries. Hence skip tlb flush, if the ultravisor firmware is available. Signed-off-by: Ram Pai Signed-off-by: Claudio Carvalho --- arch/powerpc/mm/book3s64/pgtable.c | 33 +- 1 file changed, 19 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c index 224c5c7c2e3d..bc8eb2bf9810 100644 --- a/arch/powerpc/mm/book3s64/pgtable.c +++ b/arch/powerpc/mm/book3s64/pgtable.c @@ -224,6 +224,23 @@ void __init mmu_partition_table_init(void) powernv_set_nmmu_ptcr(ptcr); } +static void flush_partition(unsigned int lpid, unsigned long dw0) +{ + if (dw0 & PATB_HR) { + asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 1) : : +"r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); + asm volatile(PPC_TLBIE_5(%0, %1, 2, 1, 1) : : +"r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); + trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 1); + } else { + asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 0) : : +"r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); + trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 0); + } + /* do we need fixup here ?*/ + asm volatile("eieio; tlbsync; ptesync" : : : "memory"); +} + static void __mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0, unsigned long dw1) @@ -238,20 +255,8 @@ static void __mmu_partition_table_set_entry(unsigned int lpid, * The type of flush (hash or radix) depends on what the previous * use of this partition ID was, not the new use. */ - asm volatile("ptesync" : : : "memory"); Doesn't the line above that was deleted need to be added to the beginning of flush_partition() - if (old & PATB_HR) { - asm volatile(PPC_TLBIE_5(%0,%1,2,0,1) : : -"r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); - asm volatile(PPC_TLBIE_5(%0,%1,2,1,1) : : -"r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); - trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 1); - } else { - asm volatile(PPC_TLBIE_5(%0,%1,2,0,0) : : -"r" (TLBIEL_INVAL_SET_LPID), "r" (lpid)); - trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 0); - } - /* do we need fixup here ?*/ - asm volatile("eieio; tlbsync; ptesync" : : : "memory"); + if (!firmware_has_feature(FW_FEATURE_ULTRAVISOR)) + flush_partition(lpid, old); } void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
Re: [PATCH v2] powerpc: slightly improve cache helpers
On Mon, Jul 08, 2019 at 11:19:30AM +1000, Michael Ellerman wrote: > On Fri, 2019-05-10 at 09:24:48 UTC, Christophe Leroy wrote: > > Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers > > that are summed to obtain the target address. Using 'Z' constraint > > and '%y0' argument gives GCC the opportunity to use both registers > > instead of only one with the second being forced to 0. > > > > Suggested-by: Segher Boessenkool > > Signed-off-by: Christophe Leroy > > Applied to powerpc next, thanks. > > https://git.kernel.org/powerpc/c/6c5875843b87c3adea2beade9d1b8b3d4523900a > > cheers This patch causes a regression with clang: https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/213944668 I've attached my local bisect/build log. Cheers, Nathan git bisect start # good: [46713c3d2f8da5e3d8ddd2249bcb1d9974fb5d28] Merge tag 'for-linus-20190706' of git://git.kernel.dk/linux-block git bisect good 46713c3d2f8da5e3d8ddd2249bcb1d9974fb5d28 # bad: [d58b5ab90ee7528126fd5833df7fc5bda8331ce8] Add linux-next specific files for 20190708 git bisect bad d58b5ab90ee7528126fd5833df7fc5bda8331ce8 # bad: [ba30fb6d5d6464bd7d3759408ea7f178d8c9fe87] Merge remote-tracking branch 'crypto/master' git bisect bad ba30fb6d5d6464bd7d3759408ea7f178d8c9fe87 # bad: [eaa0d0d3b269695df5d682d3dfcfb5c6e8f19fa8] Merge remote-tracking branch 'i3c/i3c/next' git bisect bad eaa0d0d3b269695df5d682d3dfcfb5c6e8f19fa8 # good: [e41aad4a290783ec7d3730542cbed0e99b2dcb4a] Merge remote-tracking branch 'tegra/for-next' git bisect good e41aad4a290783ec7d3730542cbed0e99b2dcb4a # bad: [c5a28b5f954e769decf4b69c06ecfd27ebeaeb5b] Merge remote-tracking branch 'cifs/for-next' git bisect bad c5a28b5f954e769decf4b69c06ecfd27ebeaeb5b # bad: [8e8fefda572360f00854547f3458a9c2cf932ff5] Merge remote-tracking branch 'powerpc/next' git bisect bad 8e8fefda572360f00854547f3458a9c2cf932ff5 # good: [01fd0e565283d69adf0ff1da95cab5bb4cb58acb] Merge remote-tracking branch 'm68k/for-next' git bisect good 01fd0e565283d69adf0ff1da95cab5bb4cb58acb # good: [7505a13f85bdcb8713551a067dfc92ac3c7ba902] powerpc/configs: Disable latencytop git bisect good 7505a13f85bdcb8713551a067dfc92ac3c7ba902 # good: [958ace9b9edae56953190fdbdddc55d6506ec6f7] Merge remote-tracking branch 'nios2/for-next' git bisect good 958ace9b9edae56953190fdbdddc55d6506ec6f7 # bad: [1cfb725fb1899dc6fdc88f8b5354a65e8ad260c6] powerpc/64: flush_inval_dcache_range() becomes flush_dcache_range() git bisect bad 1cfb725fb1899dc6fdc88f8b5354a65e8ad260c6 # good: [89a3496e0664577043666791ec07fb731d57c950] powerpc/mm/radix: Use the right page size for vmemmap mapping git bisect good 89a3496e0664577043666791ec07fb731d57c950 # good: [259a948c4ba1829ae4a3c31bb6e40ad458a21254] powerpc/pseries/scm: Use a specific endian format for storing uuid from the device tree git bisect good 259a948c4ba1829ae4a3c31bb6e40ad458a21254 # good: [2230ebf6e6dd0b7751e2921b40f6cfe34f09bb16] powerpc/mm: Handle page table allocation failures git bisect good 2230ebf6e6dd0b7751e2921b40f6cfe34f09bb16 # good: [ac25ba68fa4001c85395f0488b1c7a2421c5aada] powerpc/mm/hugetlb: Don't enable HugeTLB if we don't have a page table cache git bisect good ac25ba68fa4001c85395f0488b1c7a2421c5aada # bad: [6c5875843b87c3adea2beade9d1b8b3d4523900a] powerpc: slightly improve cache helpers git bisect bad 6c5875843b87c3adea2beade9d1b8b3d4523900a # first bad commit: [6c5875843b87c3adea2beade9d1b8b3d4523900a] powerpc: slightly improve cache helpers make[1]: Entering directory '/mnt/build/kernel' CLEAN . CLEAN arch/powerpc/kernel/vdso32 CLEAN arch/powerpc/kernel CLEAN lib CLEAN drivers/scsi CLEAN usr CLEAN scripts/basic CLEAN scripts/dtc rm -f .tmp_symbols.txt CLEAN scripts/mod CLEAN scripts/kconfig CLEAN scripts CLEAN arch/powerpc/boot CLEAN .tmp_versions CLEAN modules.builtin.modinfo CLEAN include/config include/generated arch/powerpc/include/generated CLEAN .config .config.old .version Module.symvers HOSTCC scripts/basic/fixdep GEN Makefile HOSTCC scripts/kconfig/conf.o HOSTCC scripts/kconfig/confdata.o LEX scripts/kconfig/lexer.lex.c YACCscripts/kconfig/parser.tab.h YACCscripts/kconfig/parser.tab.c HOSTCC scripts/kconfig/expr.o HOSTCC scripts/kconfig/symbol.o HOSTCC scripts/kconfig/preprocess.o HOSTCC scripts/kconfig/parser.tab.o HOSTCC scripts/kconfig/lexer.lex.o HOSTLD scripts/kconfig/conf # # configuration written to .config # SYSTBL arch/powerpc/include/generated/asm/syscall_table_64.h SYSTBL arch/powerpc/include/generated/asm/syscall_table_c32.h SYSTBL arch/powerpc/include/generated/asm/syscall_table_32.h SYSTBL arch/powerpc/include/generated/asm/syscall_table_spu.h SYSHDR arch/powerpc/include/generated/uapi/asm/unistd_64.h SYSHDR arch/powerpc/include/generated/uapi/asm/unistd_32.h GEN Makefile WRAParch/powerpc/i
Re: [PATCH v4 4/8] KVM: PPC: Ultravisor: Use UV_WRITE_PATE ucall to register a PATE
On 2019-06-28 15:08, Claudio Carvalho wrote: From: Michael Anderson When running under an ultravisor, the ultravisor controls the real partition table and has it in secure memory where the hypervisor can't access it, and therefore we (the HV) have to do a ucall whenever we want to update an entry. The HV still keeps a copy of its view of the partition table in normal memory so that the nest MMU can access it. Both partition tables will have PATE entries for HV and normal virtual machines. Suggested-by: Ryan Grimm Signed-off-by: Michael Anderson Signed-off-by: Madhavan Srinivasan Signed-off-by: Ram Pai [ Write the pate in HV's table before doing that in UV's ] Signed-off-by: Claudio Carvalho Reviewed-by: Janani Janakiraman --- arch/powerpc/include/asm/ultravisor-api.h | 5 +++- arch/powerpc/include/asm/ultravisor.h | 14 ++ arch/powerpc/mm/book3s64/hash_utils.c | 3 +- arch/powerpc/mm/book3s64/pgtable.c| 34 +-- arch/powerpc/mm/book3s64/radix_pgtable.c | 9 -- 5 files changed, 57 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/include/asm/ultravisor-api.h b/arch/powerpc/include/asm/ultravisor-api.h index 49e766adabc7..141940771add 100644 --- a/arch/powerpc/include/asm/ultravisor-api.h +++ b/arch/powerpc/include/asm/ultravisor-api.h @@ -15,6 +15,9 @@ #define U_SUCCESS H_SUCCESS #define U_FUNCTION H_FUNCTION #define U_PARAMETERH_PARAMETER +#define U_PERMISSION H_PERMISSION -#endif /* _ASM_POWERPC_ULTRAVISOR_API_H */ +/* opcodes */ +#define UV_WRITE_PATE 0xF104 +#endif /* _ASM_POWERPC_ULTRAVISOR_API_H */ diff --git a/arch/powerpc/include/asm/ultravisor.h b/arch/powerpc/include/asm/ultravisor.h index a78a2dacfd0b..996c1efd6c6d 100644 --- a/arch/powerpc/include/asm/ultravisor.h +++ b/arch/powerpc/include/asm/ultravisor.h @@ -12,6 +12,8 @@ #if !defined(__ASSEMBLY__) +#include + /* Internal functions */ extern int early_init_dt_scan_ultravisor(unsigned long node, const char *uname, int depth, void *data); @@ -28,8 +30,20 @@ extern int early_init_dt_scan_ultravisor(unsigned long node, const char *uname, */ #if defined(CONFIG_PPC_POWERNV) long ucall(unsigned long opcode, unsigned long *retbuf, ...); +#else +static long ucall(unsigned long opcode, unsigned long *retbuf, ...) +{ + return U_NOT_AVAILABLE; +} #endif +static inline int uv_register_pate(u64 lpid, u64 dw0, u64 dw1) +{ + unsigned long retbuf[UCALL_BUFSIZE]; + + return ucall(UV_WRITE_PATE, retbuf, lpid, dw0, dw1); +} + #endif /* !__ASSEMBLY__ */ #endif /* _ASM_POWERPC_ULTRAVISOR_H */ diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c index 1ff451892d7f..220a4e133240 100644 --- a/arch/powerpc/mm/book3s64/hash_utils.c +++ b/arch/powerpc/mm/book3s64/hash_utils.c @@ -1080,9 +1080,10 @@ void hash__early_init_mmu_secondary(void) if (!cpu_has_feature(CPU_FTR_ARCH_300)) mtspr(SPRN_SDR1, _SDR1); - else + else if (!firmware_has_feature(FW_FEATURE_ULTRAVISOR)) mtspr(SPRN_PTCR, __pa(partition_tb) | (PATB_SIZE_SHIFT - 12)); + } /* Initialize SLB */ slb_initialize(); diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c index ad3dd977c22d..224c5c7c2e3d 100644 --- a/arch/powerpc/mm/book3s64/pgtable.c +++ b/arch/powerpc/mm/book3s64/pgtable.c @@ -16,6 +16,8 @@ #include #include #include +#include +#include #include #include @@ -206,12 +208,25 @@ void __init mmu_partition_table_init(void) * 64 K size. */ ptcr = __pa(partition_tb) | (PATB_SIZE_SHIFT - 12); - mtspr(SPRN_PTCR, ptcr); + /* +* If ultravisor is available, it is responsible for creating and +* managing partition table +*/ + if (!firmware_has_feature(FW_FEATURE_ULTRAVISOR)) + mtspr(SPRN_PTCR, ptcr); + + /* +* Since nestMMU cannot access secure memory. Create +* and manage our own partition table. This table +* contains entries for nonsecure and hypervisor +* partition. +*/ powernv_set_nmmu_ptcr(ptcr); } -void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0, - unsigned long dw1) +static void __mmu_partition_table_set_entry(unsigned int lpid, + unsigned long dw0, + unsigned long dw1) { unsigned long old = be64_to_cpu(partition_tb[lpid].patb0); @@ -238,6 +253,19 @@ void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0, /* do we need fixup here ?*/ asm volatile("eieio; tlbsync; ptesync" : : : "memory"); } + +void mmu_partition_table_set_entry(unsigned int
Re: [PATCH v4 3/8] KVM: PPC: Ultravisor: Add generic ultravisor call handler
On 2019-06-28 15:08, Claudio Carvalho wrote: From: Ram Pai Add the ucall() function, which can be used to make ultravisor calls with varied number of in and out arguments. Ultravisor calls can be made from the host or guests. This copies the implementation of plpar_hcall(). Signed-off-by: Ram Pai [ Change ucall.S to not save CR, rename and move headers, build ucall.S if CONFIG_PPC_POWERNV set, use R3 for the ucall number and add some comments in the code ] Signed-off-by: Claudio Carvalho Reviewed-by: Janani Janakiraman --- arch/powerpc/include/asm/ultravisor-api.h | 20 +++ arch/powerpc/include/asm/ultravisor.h | 20 +++ arch/powerpc/kernel/Makefile | 2 +- arch/powerpc/kernel/ucall.S | 30 +++ arch/powerpc/kernel/ultravisor.c | 4 +++ 5 files changed, 75 insertions(+), 1 deletion(-) create mode 100644 arch/powerpc/include/asm/ultravisor-api.h create mode 100644 arch/powerpc/kernel/ucall.S diff --git a/arch/powerpc/include/asm/ultravisor-api.h b/arch/powerpc/include/asm/ultravisor-api.h new file mode 100644 index ..49e766adabc7 --- /dev/null +++ b/arch/powerpc/include/asm/ultravisor-api.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Ultravisor API. + * + * Copyright 2019, IBM Corporation. + * + */ +#ifndef _ASM_POWERPC_ULTRAVISOR_API_H +#define _ASM_POWERPC_ULTRAVISOR_API_H + +#include + +/* Return codes */ +#define U_NOT_AVAILABLEH_NOT_AVAILABLE +#define U_SUCCESS H_SUCCESS +#define U_FUNCTION H_FUNCTION +#define U_PARAMETERH_PARAMETER + +#endif /* _ASM_POWERPC_ULTRAVISOR_API_H */ + diff --git a/arch/powerpc/include/asm/ultravisor.h b/arch/powerpc/include/asm/ultravisor.h index e5009b0d84ea..a78a2dacfd0b 100644 --- a/arch/powerpc/include/asm/ultravisor.h +++ b/arch/powerpc/include/asm/ultravisor.h @@ -8,8 +8,28 @@ #ifndef _ASM_POWERPC_ULTRAVISOR_H #define _ASM_POWERPC_ULTRAVISOR_H +#include + +#if !defined(__ASSEMBLY__) + /* Internal functions */ extern int early_init_dt_scan_ultravisor(unsigned long node, const char *uname, int depth, void *data); +/* API functions */ +#define UCALL_BUFSIZE 4 +/** + * ucall: Make a powerpc ultravisor call. + * @opcode: The ultravisor call to make. + * @retbuf: Buffer to store up to 4 return arguments in. + * + * This call supports up to 6 arguments and 4 return arguments. Use + * UCALL_BUFSIZE to size the return argument buffer. + */ +#if defined(CONFIG_PPC_POWERNV) +long ucall(unsigned long opcode, unsigned long *retbuf, ...); +#endif + +#endif /* !__ASSEMBLY__ */ + #endif /* _ASM_POWERPC_ULTRAVISOR_H */ diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index f0caa302c8c0..f28baccc0a79 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -154,7 +154,7 @@ endif obj-$(CONFIG_EPAPR_PARAVIRT) += epapr_paravirt.o epapr_hcalls.o obj-$(CONFIG_KVM_GUEST)+= kvm.o kvm_emul.o -obj-$(CONFIG_PPC_POWERNV) += ultravisor.o +obj-$(CONFIG_PPC_POWERNV) += ultravisor.o ucall.o # Disable GCOV, KCOV & sanitizers in odd or sensitive code GCOV_PROFILE_prom_init.o := n diff --git a/arch/powerpc/kernel/ucall.S b/arch/powerpc/kernel/ucall.S new file mode 100644 index ..1678f6eb7230 --- /dev/null +++ b/arch/powerpc/kernel/ucall.S @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Generic code to perform an ultravisor call. + * + * Copyright 2019, IBM Corporation. + * + */ +#include + +/* + * This function is based on the plpar_hcall() + */ +_GLOBAL_TOC(ucall) + std r4,STK_PARAM(R4)(r1)/* Save ret buffer */ + mr r4,r5 + mr r5,r6 + mr r6,r7 + mr r7,r8 + mr r8,r9 + mr r9,r10 + + sc 2/* Invoke the ultravisor */ + + ld r12,STK_PARAM(R4)(r1) + std r4, 0(r12) + std r5, 8(r12) + std r6, 16(r12) + std r7, 24(r12) + + blr /* Return r3 = status */ diff --git a/arch/powerpc/kernel/ultravisor.c b/arch/powerpc/kernel/ultravisor.c index dc6021f63c97..02ddf79a9522 100644 --- a/arch/powerpc/kernel/ultravisor.c +++ b/arch/powerpc/kernel/ultravisor.c @@ -8,10 +8,14 @@ #include #include #include +#include #include #include +/* in ucall.S */ +EXPORT_SYMBOL_GPL(ucall); + int __init early_init_dt_scan_ultravisor(unsigned long node, const char *uname, int depth, void *data) {
Re: [PATCH using git format-patch -C] kexec: add generic support for elf kernel images
Hi Christophe, On Mon, Jul 08, 2019 at 12:40:52PM +0200, Christophe Leroy wrote: > > From: Sven Schnelle > > Please describe you patch. > > All the changes you are doing to the ppc64 version in order to make it > generic should be described. > > Those changes are also maybe worth splitting the patch in several parts, > either preparing the ppc64 for generic then moving to kernel/kexec_elf.c or > moving to kernel/kexec_elf.c without modifications, then modifying it to > make it generic. > > Note that your patch only applies on Linux 5.1, it doesn't apply on > powerpc/next. > > I think it also be worth taking the opportunity to fix the sparse warnings, > see https://patchwork.ozlabs.org/patch/1128720/ > > checkpatch comments should also be fixed, see > https://openpower.xyz/job/snowpatch/job/snowpatch-linux-checkpatch/8044//artifact/linux/checkpatch.log Actually this patch shouldn't have been sent out, that was my fault. Appologies for that. Thanks for your review, i will split up the patch and resent it. Regards Sven
Re: [PATCH v3] cpufreq/pasemi: fix an use-after-free in pas_cpufreq_cpu_init()
> @@ -204,6 +205,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > out_unmap_sdcasr: > iounmap(sdcasr_mapbase); > out: > + of_node_put(cpu); I would find the usage of a jump label like “put_node” nicer at such a source code place. Regards, Markus
Re: [PATCH v4 2/8] powerpc: Introduce FW_FEATURE_ULTRAVISOR
On 2019-06-28 15:08, Claudio Carvalho wrote: This feature tells if the ultravisor firmware is available to handle ucalls. Signed-off-by: Claudio Carvalho [ Device node name to "ibm,ultravisor" ] Signed-off-by: Michael Anderson Reviewed-by: Janani Janakiraman --- arch/powerpc/include/asm/firmware.h | 5 +++-- arch/powerpc/include/asm/ultravisor.h | 15 +++ arch/powerpc/kernel/Makefile | 1 + arch/powerpc/kernel/prom.c| 4 arch/powerpc/kernel/ultravisor.c | 24 5 files changed, 47 insertions(+), 2 deletions(-) create mode 100644 arch/powerpc/include/asm/ultravisor.h create mode 100644 arch/powerpc/kernel/ultravisor.c diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h index 00bc42d95679..43b48c4d3ca9 100644 --- a/arch/powerpc/include/asm/firmware.h +++ b/arch/powerpc/include/asm/firmware.h @@ -54,6 +54,7 @@ #define FW_FEATURE_DRC_INFOASM_CONST(0x0008) #define FW_FEATURE_BLOCK_REMOVE ASM_CONST(0x0010) #define FW_FEATURE_PAPR_SCMASM_CONST(0x0020) +#define FW_FEATURE_ULTRAVISOR ASM_CONST(0x0040) #ifndef __ASSEMBLY__ @@ -72,9 +73,9 @@ enum { FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN | FW_FEATURE_HPT_RESIZE | FW_FEATURE_DRMEM_V2 | FW_FEATURE_DRC_INFO | FW_FEATURE_BLOCK_REMOVE | - FW_FEATURE_PAPR_SCM, + FW_FEATURE_PAPR_SCM | FW_FEATURE_ULTRAVISOR, FW_FEATURE_PSERIES_ALWAYS = 0, - FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL, + FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL | FW_FEATURE_ULTRAVISOR, FW_FEATURE_POWERNV_ALWAYS = 0, FW_FEATURE_PS3_POSSIBLE = FW_FEATURE_LPAR | FW_FEATURE_PS3_LV1, FW_FEATURE_PS3_ALWAYS = FW_FEATURE_LPAR | FW_FEATURE_PS3_LV1, diff --git a/arch/powerpc/include/asm/ultravisor.h b/arch/powerpc/include/asm/ultravisor.h new file mode 100644 index ..e5009b0d84ea --- /dev/null +++ b/arch/powerpc/include/asm/ultravisor.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Ultravisor definitions + * + * Copyright 2019, IBM Corporation. + * + */ +#ifndef _ASM_POWERPC_ULTRAVISOR_H +#define _ASM_POWERPC_ULTRAVISOR_H + +/* Internal functions */ +extern int early_init_dt_scan_ultravisor(unsigned long node, const char *uname, +int depth, void *data); + +#endif /* _ASM_POWERPC_ULTRAVISOR_H */ diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index 0ea6c4aa3a20..f0caa302c8c0 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -154,6 +154,7 @@ endif obj-$(CONFIG_EPAPR_PARAVIRT) += epapr_paravirt.o epapr_hcalls.o obj-$(CONFIG_KVM_GUEST)+= kvm.o kvm_emul.o +obj-$(CONFIG_PPC_POWERNV) += ultravisor.o # Disable GCOV, KCOV & sanitizers in odd or sensitive code GCOV_PROFILE_prom_init.o := n diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index 4221527b082f..67a2c1b39252 100644 --- a/arch/powerpc/kernel/prom.c +++ b/arch/powerpc/kernel/prom.c @@ -59,6 +59,7 @@ #include #include #include +#include #include @@ -706,6 +707,9 @@ void __init early_init_devtree(void *params) #ifdef CONFIG_PPC_POWERNV /* Some machines might need OPAL info for debugging, grab it now. */ of_scan_flat_dt(early_init_dt_scan_opal, NULL); + + /* Scan tree for ultravisor feature */ + of_scan_flat_dt(early_init_dt_scan_ultravisor, NULL); #endif #ifdef CONFIG_FA_DUMP diff --git a/arch/powerpc/kernel/ultravisor.c b/arch/powerpc/kernel/ultravisor.c new file mode 100644 index ..dc6021f63c97 --- /dev/null +++ b/arch/powerpc/kernel/ultravisor.c @@ -0,0 +1,24 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Ultravisor high level interfaces + * + * Copyright 2019, IBM Corporation. + * + */ +#include +#include +#include + +#include +#include + +int __init early_init_dt_scan_ultravisor(unsigned long node, const char *uname, +int depth, void *data) +{ + if (depth != 1 || strcmp(uname, "ibm,ultravisor") != 0) + return 0; + + powerpc_firmware_features |= FW_FEATURE_ULTRAVISOR; + pr_debug("Ultravisor detected!\n"); + return 1; +}
Re: [PATCH v4 1/8] KVM: PPC: Ultravisor: Introduce the MSR_S bit
On 2019-06-28 15:08, Claudio Carvalho wrote: From: Sukadev Bhattiprolu The ultravisor processor mode is introduced in POWER platforms that supports the Protected Execution Facility (PEF). Ultravisor is higher privileged than hypervisor mode. In PEF enabled platforms, the MSR_S bit is used to indicate if the thread is in secure state. With the MSR_S bit, the privilege state of the thread is now determined by MSR_S, MSR_HV and MSR_PR, as follows: S HV PR --- 0 x 1 problem 1 0 1 problem x x 0 privileged x 1 0 hypervisor 1 1 0 ultravisor 1 1 1 reserved The hypervisor doesn't (and can't) run with the MSR_S bit set, but a secure guest and the ultravisor firmware do. Signed-off-by: Sukadev Bhattiprolu Signed-off-by: Ram Pai [ Update the commit message ] Signed-off-by: Claudio Carvalho Reviewed-by: Janani Janakiraman --- arch/powerpc/include/asm/reg.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h index 10caa145f98b..39b4c0a519f5 100644 --- a/arch/powerpc/include/asm/reg.h +++ b/arch/powerpc/include/asm/reg.h @@ -38,6 +38,7 @@ #define MSR_TM_LG 32 /* Trans Mem Available */ #define MSR_VEC_LG 25 /* Enable AltiVec */ #define MSR_VSX_LG 23 /* Enable VSX */ +#define MSR_S_LG 22 /* Secure VM bit */ #define MSR_POW_LG 18 /* Enable Power Management */ #define MSR_WE_LG 18 /* Wait State Enable */ #define MSR_TGPR_LG17 /* TLB Update registers in use */ @@ -71,11 +72,13 @@ #define MSR_SF __MASK(MSR_SF_LG) /* Enable 64 bit mode */ #define MSR_ISF __MASK(MSR_ISF_LG) /* Interrupt 64b mode valid on 630 */ #define MSR_HV __MASK(MSR_HV_LG) /* Hypervisor state */ +#define MSR_S __MASK(MSR_S_LG)/* Secure state */ #else /* so tests for these bits fail on 32-bit */ #define MSR_SF 0 #define MSR_ISF0 #define MSR_HV 0 +#define MSR_S 0 #endif /*
Re: powerpc/83xx: fix use-after-free on mpc831x_usb_cfg()
> The np variable is still being used after the of_node_put() call, > which may result in use-after-free. > We fix this issue by calling of_node_put() after the last usage. I imagine that this commit description can be improved a bit more (by mentioning the influence of “immr_node”?). How do you think about to omit the word “We” here? > This patatch also do some cleanup. Should the renaming of a jump label be contributed in a separate update step of a small patch series besides a wording without a typo? Regards, Markus
Re: [v2 09/12] powerpc/mce: Enable MCE notifiers in external modules
On Fri, Jul 05, 2019 at 03:29:39PM +1000, Nicholas Piggin wrote: Okay. It might be possible to save the address in the kernel and then notify the driver afterward. For user-mode and any non-atomic user copy AFAIK the irq_work should practically run synchronously after the machine check returns so it might be enough to have a notifier in the irq work processing. We can pick up this thread later, but if I remember correctly the sticking point we ran into was that we never got that far. Instead of returning from the MCE, we went down the fatal codepath. -- Reza Arbab
Re: [PATCH] tpm: fixes uninitialized allocated banks for IBM vtpm driver
On Sat, 2019-07-06 at 20:25 -0400, Nayna wrote: > Thanks Jarkko. I just now posted the v2 version that includes your and > Stefan's feedbacks. Looks almost legit :-) /Jarkko
Re: [PATCH v2] tpm: tpm_ibm_vtpm: Fix unallocated banks
On Sat, 2019-07-06 at 20:18 -0400, Nayna Jain wrote: > +/* > + * tpm_get_pcr_allocation() - initialize the chip allocated banks for PCRs > + * @chip: TPM chip to use. > + */ > +static int tpm_get_pcr_allocation(struct tpm_chip *chip) > +{ > + int rc; > + > + if (chip->flags & TPM_CHIP_FLAG_TPM2) > + rc = tpm2_get_pcr_allocation(chip); > + else > + rc = tpm1_get_pcr_allocation(chip); > + > + return rc; > +} It is just a trivial static function, which means that kdoc comment is not required and neither it is useful. Please remove that. I would rewrite the function like: static int tpm_get_pcr_allocation(struct tpm_chip *chip) { int rc; rc = (chip->flags & TPM_CHIP_FLAG_TPM2) ? tpm2_get_pcr_allocation(chip) : tpm1_get_pcr_allocation(chip); return rc > 0 ? -ENODEV : rc; } This addresses the issue that Stefan also pointed out. You have to deal with the TPM error codes. /Jarkko
Re: [v3 7/7] powerpc/64s: save r13 in MCE handler (simulator workaroud)
On Sat, Jul 06, 2019 at 07:56:39PM +1000, Nicholas Piggin wrote: Santosh Sivaraj's on July 6, 2019 7:26 am: From: Reza Arbab Testing my memcpy_mcsafe() work in progress with an injected UE, I get an error like this immediately after the function returns: BUG: Unable to handle kernel data access at 0x7fff84dec8f8 Faulting instruction address: 0xc008009c00b0 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: mce(O+) vmx_crypto crc32c_vpmsum CPU: 0 PID: 1375 Comm: modprobe Tainted: G O 5.1.0-rc6 #267 NIP: c008009c00b0 LR: c008009c00a8 CTR: c0095f90 REGS: c000ee197790 TRAP: 0300 Tainted: G O (5.1.0-rc6) MSR: 9280b033 CR: 88002826 XER: 0004 CFAR: c0095f8c DAR: 7fff84dec8f8 DSISR: 4000 IRQMASK: 0 GPR00: 6c6c6568 c000ee197a20 c008009c8400 fff2 GPR04: c008009c02e0 0006 c3c834c8 GPR08: 0080 776a6681b7fb5100 c008009c01c8 GPR12: c0095f90 7fff84debc00 4d071440 GPR16: 00010601 c008009e c0c98dd8 c0c98d98 GPR20: c3bba970 c008009c04d0 c008009c0618 c01e5820 GPR24: 0100 0001 c3bba958 GPR28: c008009c02e8 c008009c0318 c008009c02e0 NIP [c008009c00b0] cause_ue+0xa8/0xe8 [mce] LR [c008009c00a8] cause_ue+0xa0/0xe8 [mce] After debugging we see that the first instruction at vector 200 is skipped by the simulator, due to which r13 is not saved. Adding a nop at 0x200 fixes the issue. (This commit is needed for testing this series. This should not be taken into the tree) Would be good if this was testable in simulator upstream, did you report it? What does cause_ue do? exc_mce in mambo seems to do the right thing AFAIKS. I think I posted this earlier, but cause_ue() is just a test function telling me where to set up the error injection: static noinline void cause_ue(void) { static const char src[] = "hello"; char dst[10]; int rc; /* During the pause, break into mambo and run the following */ pr_info("inject_mce_ue_on_addr 0x%px\n", src); pause(10); rc = memcpy_mcsafe(dst, src, sizeof(src)); pr_info("memcpy_mcsafe() returns %d\n", rc); if (!rc) pr_info("dst=\"%s\"\n", dst); } Can't speak for the others, but I haven't chased this upstream. I didn't know it was a simulator issue. -- Reza Arbab
Re: [PATCH v2] powerpc/mm: Implement STRICT_MODULE_RWX
Russell Currey writes: > Strict module RWX is just like strict kernel RWX, but for modules - so > loadable modules aren't marked both writable and executable at the same > time. This is handled by the generic code in kernel/module.c, and > simply requires the architecture to implement the set_memory() set of > functions, declared with ARCH_HAS_SET_MEMORY. > > There's nothing other than these functions required to turn > ARCH_HAS_STRICT_MODULE_RWX on, so turn that on too. > > With STRICT_MODULE_RWX enabled, there are as many W+X pages at runtime > as there are with CONFIG_MODULES=n (none), so in Russel's testing it works > well on both Hash and Radix book3s64. > > There's a TODO in the code for also applying the page permission changes > to the backing pages in the linear mapping: this is pretty simple for > Radix and (seemingly) a lot harder for Hash, so I've left it for now > since there's still a notable security benefit for the patch as-is. > > Technically can be enabled without STRICT_KERNEL_RWX, but > that doesn't gets you a whole lot, so we should leave it off by default > until we can get STRICT_KERNEL_RWX to the point where it's enabled by > default. > > Signed-off-by: Russell Currey > Signed-off-by: Christophe Leroy > --- > Changes from v1 (sent by Christophe): > - return if VM_FLUSH_RESET_PERMS is set > > arch/powerpc/Kconfig | 2 + > arch/powerpc/include/asm/set_memory.h | 32 ++ > arch/powerpc/mm/Makefile | 2 +- > arch/powerpc/mm/pageattr.c| 85 +++ > 4 files changed, 120 insertions(+), 1 deletion(-) > create mode 100644 arch/powerpc/include/asm/set_memory.h > create mode 100644 arch/powerpc/mm/pageattr.c > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index 8c1c636308c8..3d98240ce965 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -131,7 +131,9 @@ config PPC > select ARCH_HAS_PTE_SPECIAL > select ARCH_HAS_MEMBARRIER_CALLBACKS > select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE > && PPC64 > + select ARCH_HAS_SET_MEMORY > select ARCH_HAS_STRICT_KERNEL_RWX if ((PPC_BOOK3S_64 || PPC32) && > !RELOCATABLE && !HIBERNATION) > + select ARCH_HAS_STRICT_MODULE_RWX if PPC_BOOK3S_64 || PPC32 > select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST > select ARCH_HAS_UACCESS_FLUSHCACHE if PPC64 > select ARCH_HAS_UBSAN_SANITIZE_ALL > diff --git a/arch/powerpc/include/asm/set_memory.h > b/arch/powerpc/include/asm/set_memory.h > new file mode 100644 > index ..4b9683f3b3dd > --- /dev/null > +++ b/arch/powerpc/include/asm/set_memory.h > @@ -0,0 +1,32 @@ > +/* SPDX-License-Identifier: GPL-2.0+ */ > +#ifndef _ASM_POWERPC_SET_MEMORY_H > +#define _ASM_POWERPC_SET_MEMORY_H > + > +#define SET_MEMORY_RO1 > +#define SET_MEMORY_RW2 > +#define SET_MEMORY_NX3 > +#define SET_MEMORY_X 4 > + > +int change_memory(unsigned long addr, int numpages, int action); > + > +static inline int set_memory_ro(unsigned long addr, int numpages) > +{ > + return change_memory(addr, numpages, SET_MEMORY_RO); > +} > + > +static inline int set_memory_rw(unsigned long addr, int numpages) > +{ > + return change_memory(addr, numpages, SET_MEMORY_RW); > +} > + > +static inline int set_memory_nx(unsigned long addr, int numpages) > +{ > + return change_memory(addr, numpages, SET_MEMORY_NX); > +} > + > +static inline int set_memory_x(unsigned long addr, int numpages) > +{ > + return change_memory(addr, numpages, SET_MEMORY_X); > +} > + > +#endif > diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile > index 0f499db315d6..b683d1c311b3 100644 > --- a/arch/powerpc/mm/Makefile > +++ b/arch/powerpc/mm/Makefile > @@ -7,7 +7,7 @@ ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC) > > obj-y:= fault.o mem.o pgtable.o mmap.o \ > init_$(BITS).o pgtable_$(BITS).o \ > -pgtable-frag.o \ > +pgtable-frag.o pageattr.o \ > init-common.o mmu_context.o drmem.o > obj-$(CONFIG_PPC_MMU_NOHASH) += nohash/ > obj-$(CONFIG_PPC_BOOK3S_32) += book3s32/ > diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c > new file mode 100644 > index ..41baf92f632b > --- /dev/null > +++ b/arch/powerpc/mm/pageattr.c > @@ -0,0 +1,85 @@ > +// SPDX-License-Identifier: GPL-2.0+ > + > +/* > + * Page attribute and set_memory routines > + * > + * Derived from the arm64 implementation. > + * > + * Author: Russell Currey > + * > + * Copyright 2019, IBM Corporation. > + * > + */ > + > +#include > +#include > +#include > + > +#include > +#include > +#include > + > +static int change_page_ro(pte_t *ptep, pgtable_t token, unsigned long addr, > void *data) > +{ > + set_pte_at(_mm, addr, ptep,
Re: [PATCH 4/4] powerpc/64: reuse PPC32 static inline flush_dcache_range()
Christophe Leroy writes: > This patch drops the assembly PPC64 version of flush_dcache_range() > and re-uses the PPC32 static inline version. > > With GCC 8.1, the following code is generated: > > void flush_test(unsigned long start, unsigned long stop) > { > flush_dcache_range(start, stop); > } > > 0130 <.flush_test>: > 130: 3d 22 00 00 addis r9,r2,0 > 132: R_PPC64_TOC16_HA .data+0x8 > 134: 81 09 00 00 lwz r8,0(r9) > 136: R_PPC64_TOC16_LO .data+0x8 > 138: 3d 22 00 00 addis r9,r2,0 > 13a: R_PPC64_TOC16_HA .data+0xc > 13c: 80 e9 00 00 lwz r7,0(r9) > 13e: R_PPC64_TOC16_LO .data+0xc > 140: 7d 48 00 d0 neg r10,r8 > 144: 7d 43 18 38 and r3,r10,r3 > 148: 7c 00 04 ac hwsync > 14c: 4c 00 01 2c isync > 150: 39 28 ff ff addir9,r8,-1 > 154: 7c 89 22 14 add r4,r9,r4 > 158: 7c 83 20 50 subfr4,r3,r4 > 15c: 7c 89 3c 37 srd.r9,r4,r7 > 160: 41 82 00 1c beq 17c <.flush_test+0x4c> > 164: 7d 29 03 a6 mtctr r9 > 168: 60 00 00 00 nop > 16c: 60 00 00 00 nop > 170: 7c 00 18 ac dcbf0,r3 > 174: 7c 63 42 14 add r3,r3,r8 > 178: 42 00 ff f8 bdnz170 <.flush_test+0x40> > 17c: 7c 00 04 ac hwsync > 180: 4c 00 01 2c isync > 184: 4e 80 00 20 blr > 188: 60 00 00 00 nop > 18c: 60 00 00 00 nop > > Signed-off-by: Christophe Leroy > --- > arch/powerpc/include/asm/cache.h | 10 ++ > arch/powerpc/include/asm/cacheflush.h | 14 -- > arch/powerpc/kernel/misc_64.S | 29 - > 3 files changed, 18 insertions(+), 35 deletions(-) > > diff --git a/arch/powerpc/include/asm/cache.h > b/arch/powerpc/include/asm/cache.h > index 0009a0a82e86..45e3137ccd71 100644 > --- a/arch/powerpc/include/asm/cache.h > +++ b/arch/powerpc/include/asm/cache.h > @@ -54,6 +54,16 @@ struct ppc64_caches { > }; > > extern struct ppc64_caches ppc64_caches; > + > +static inline u32 l1_cache_shift(void) > +{ > + return ppc64_caches.l1d.log_block_size; > +} > + > +static inline u32 l1_cache_bytes(void) > +{ > + return ppc64_caches.l1d.block_size; > +} > #else > static inline u32 l1_cache_shift(void) > { > diff --git a/arch/powerpc/include/asm/cacheflush.h > b/arch/powerpc/include/asm/cacheflush.h > index d405f18441cd..3cd7ce3dec8b 100644 > --- a/arch/powerpc/include/asm/cacheflush.h > +++ b/arch/powerpc/include/asm/cacheflush.h > @@ -57,7 +57,6 @@ static inline void __flush_dcache_icache_phys(unsigned long > physaddr) > } > #endif > > -#ifdef CONFIG_PPC32 > /* > * Write any modified data cache blocks out to memory and invalidate them. > * Does not invalidate the corresponding instruction cache blocks. > @@ -70,9 +69,17 @@ static inline void flush_dcache_range(unsigned long start, > unsigned long stop) > unsigned long size = stop - (unsigned long)addr + (bytes - 1); > unsigned long i; > > + if (IS_ENABLED(CONFIG_PPC64)) { > + mb(); /* sync */ > + isync(); > + } > + > for (i = 0; i < size >> shift; i++, addr += bytes) > dcbf(addr); > mb(); /* sync */ > + > + if (IS_ENABLED(CONFIG_PPC64)) > + isync(); > } Was checking with Michael about why we need that extra isync. Michael pointed this came via https://github.com/mpe/linux-fullhistory/commit/faa5ee3743ff9b6df9f9a03600e34fdae596cfb2#diff-67c7ffa8e420c7d4206cae4a9e888e14 for 970 which doesn't have coherent icache. So possibly isync there is to flush the prefetch instructions? But even so we would need an icbi there before that isync. So overall wondering why we need that extra barriers there. > > /* > @@ -112,11 +119,6 @@ static inline void invalidate_dcache_range(unsigned long > start, > mb(); /* sync */ > } > -aneesh
[Bug 203839] Kernel 5.2-rc3 fails to boot on a PowerMac G4 3,6: systemd[1]: Failed to bump fs.file-max, ignoring: invalid argument
https://bugzilla.kernel.org/show_bug.cgi?id=203839 Michael Ellerman (mich...@ellerman.id.au) changed: What|Removed |Added Status|RESOLVED|CLOSED CC||mich...@ellerman.id.au --- Comment #11 from Michael Ellerman (mich...@ellerman.id.au) --- Fixed in b7f8b440f300 ("powerpc/32s: fix initial setup of segment registers on secondary CPU") -- You are receiving this mail because: You are watching the assignee of the bug.
linux-next: manual merge of the akpm-current tree with the powerpc tree
Hi all, Today's linux-next merge of the akpm-current tree got a conflict in: arch/powerpc/include/asm/pgtable.h between commit: d6eacedd1f0e ("powerpc/book3s: Use config independent helpers for page table walk") from the powerpc tree and commit: be66a174b253 ("mm/nvdimm: add is_ioremap_addr and use that to check ioremap address") from the akpm-current tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc arch/powerpc/include/asm/pgtable.h index bf7d771f342e,64145751b2fd.. --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@@ -140,30 -140,20 +140,44 @@@ static inline void pte_frag_set(mm_cont } #endif +#ifndef pmd_is_leaf +#define pmd_is_leaf pmd_is_leaf +static inline bool pmd_is_leaf(pmd_t pmd) +{ + return false; +} +#endif + +#ifndef pud_is_leaf +#define pud_is_leaf pud_is_leaf +static inline bool pud_is_leaf(pud_t pud) +{ + return false; +} +#endif + +#ifndef pgd_is_leaf +#define pgd_is_leaf pgd_is_leaf +static inline bool pgd_is_leaf(pgd_t pgd) +{ + return false; +} +#endif + + #ifdef CONFIG_PPC64 + #define is_ioremap_addr is_ioremap_addr + static inline bool is_ioremap_addr(const void *x) + { + #ifdef CONFIG_MMU + unsigned long addr = (unsigned long)x; + + return addr >= IOREMAP_BASE && addr < IOREMAP_END; + #else + return false; + #endif + } + #endif /* CONFIG_PPC64 */ + #endif /* __ASSEMBLY__ */ #endif /* _ASM_POWERPC_PGTABLE_H */ pgpmNygbuEfG0.pgp Description: OpenPGP digital signature
Re: [PATCH using git format-patch -C] kexec: add generic support for elf kernel images
Hi Sven, Le 08/07/2019 à 12:06, Christophe Leroy a écrit : From: Sven Schnelle Please describe you patch. All the changes you are doing to the ppc64 version in order to make it generic should be described. Those changes are also maybe worth splitting the patch in several parts, either preparing the ppc64 for generic then moving to kernel/kexec_elf.c or moving to kernel/kexec_elf.c without modifications, then modifying it to make it generic. Note that your patch only applies on Linux 5.1, it doesn't apply on powerpc/next. I think it also be worth taking the opportunity to fix the sparse warnings, see https://patchwork.ozlabs.org/patch/1128720/ checkpatch comments should also be fixed, see https://openpower.xyz/job/snowpatch/job/snowpatch-linux-checkpatch/8044//artifact/linux/checkpatch.log Other comments below, Signed-off-by: Sven Schnelle --- patch generated with 'git format-patch -C' in order to see the modifications done in kexec_elf.c in addition to copying it from arch/powerpc/kernel/kexec_elf_64.c arch/Kconfig | 3 + arch/powerpc/Kconfig | 1 + arch/powerpc/kernel/kexec_elf_64.c | 547 + include/linux/kexec.h | 35 ++ kernel/Makefile| 1 + .../kernel/kexec_elf_64.c => kernel/kexec_elf.c| 264 +++--- 6 files changed, 118 insertions(+), 733 deletions(-) copy arch/powerpc/kernel/kexec_elf_64.c => kernel/kexec_elf.c (64%) diff --git a/arch/Kconfig b/arch/Kconfig index 3368786a..c7c75fbc0b79 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -18,6 +18,9 @@ config KEXEC_CORE select CRASH_CORE bool +config KEXEC_ELF + bool + config HAVE_IMA_KEXEC bool diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 2d0be82c3061..447b6e3ad999 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -501,6 +501,7 @@ config KEXEC_FILE select KEXEC_CORE select HAVE_IMA_KEXEC select BUILD_BIN2C + select KEXEC_ELF depends on PPC64 depends on CRYPTO=y depends on CRYPTO_SHA256=y diff --git a/arch/powerpc/kernel/kexec_elf_64.c b/arch/powerpc/kernel/kexec_elf_64.c index ba4f18a43ee8..d062c5991722 100644 --- a/arch/powerpc/kernel/kexec_elf_64.c +++ b/arch/powerpc/kernel/kexec_elf_64.c @@ -31,539 +31,7 @@ #include #include -#define PURGATORY_STACK_SIZE (16 * 1024) [snip] +#define PURGATORY_STACK_SIZE (16 * 1024) This line shouldn't be modified by your patch. I see a tab was replaced by space. static void *elf64_load(struct kimage *image, char *kernel_buf, unsigned long kernel_len, char *initrd, @@ -577,17 +45,17 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, void *fdt; const void *slave_code; struct elfhdr ehdr; - struct elf_info elf_info; + struct kexec_elf_info elf_info; struct kexec_buf kbuf = { .image = image, .buf_min = 0, .buf_max = ppc64_rma_size }; struct kexec_buf pbuf = { .image = image, .buf_min = 0, .buf_max = ppc64_rma_size, .top_down = true }; - ret = build_elf_exec_info(kernel_buf, kernel_len, , _info); + ret = kexec_build_elf_info(kernel_buf, kernel_len, , _info); if (ret) goto out; - ret = elf_exec_load(image, , _info, _load_addr); + ret = kexec_elf_load(image, , _info, , _load_addr); if (ret) goto out; @@ -606,6 +74,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, kbuf.bufsz = kbuf.memsz = initrd_len; kbuf.buf_align = PAGE_SIZE; kbuf.top_down = false; + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; Is that correct ? I see: - kbuf.mem is unsigned - KEXEC_BUF_MEM_UNKNOWN is -1 on s390 ret = kexec_add_buffer(); if (ret) goto out; @@ -638,6 +107,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, kbuf.bufsz = kbuf.memsz = fdt_size; kbuf.buf_align = PAGE_SIZE; kbuf.top_down = true; + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; ret = kexec_add_buffer(); if (ret) goto out; @@ -652,13 +122,12 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, pr_err("Error setting up the purgatory.\n"); out: - elf_free_info(_info); - + kexec_free_elf_info(_info); /* Make kimage_file_post_load_cleanup free the fdt buffer for us. */ return ret ? ERR_PTR(ret) : fdt; } const struct kexec_file_ops kexec_elf64_ops = { - .probe = elf64_probe, + .probe = kexec_elf_probe, .load = elf64_load, }; diff --git a/include/linux/kexec.h b/include/linux/kexec.h index
Re: [PATCH] kexec: add generic support for elf kernel images
Le 07/07/2019 à 21:21, Sven Schnelle a écrit : Signed-off-by: Sven Schnelle --- Please add here a description of the changes done since RFC. arch/Kconfig | 3 + arch/powerpc/Kconfig | 1 + arch/powerpc/kernel/kexec_elf_64.c | 547 + include/linux/kexec.h | 35 ++ kernel/Makefile| 1 + kernel/kexec_elf.c | 540 6 files changed, 588 insertions(+), 539 deletions(-) create mode 100644 kernel/kexec_elf.c kernel/kexec_elf.c is a modified copy of arch/powerpc/kernel/kexec_elf_64.c. You should generate your patch usign 'git format-patch -C' in order to let git identify the copy and the changes to the copy. That would ease the review. I have regenerated your patch with -C and resent it. Christophe diff --git a/arch/Kconfig b/arch/Kconfig index c47b328eada0..30694aca4316 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -18,6 +18,9 @@ config KEXEC_CORE select CRASH_CORE bool +config KEXEC_ELF + bool + config HAVE_IMA_KEXEC bool diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 8c1c636308c8..97aa81622452 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -502,6 +502,7 @@ config KEXEC_FILE select KEXEC_CORE select HAVE_IMA_KEXEC select BUILD_BIN2C + select KEXEC_ELF depends on PPC64 depends on CRYPTO=y depends on CRYPTO_SHA256=y diff --git a/arch/powerpc/kernel/kexec_elf_64.c b/arch/powerpc/kernel/kexec_elf_64.c index ba4f18a43ee8..d062c5991722 100644 --- a/arch/powerpc/kernel/kexec_elf_64.c +++ b/arch/powerpc/kernel/kexec_elf_64.c @@ -31,539 +31,7 @@ #include #include -#define PURGATORY_STACK_SIZE (16 * 1024) - -#define elf_addr_to_cpuelf64_to_cpu - -#ifndef Elf_Rel -#define Elf_RelElf64_Rel -#endif /* Elf_Rel */ - -struct elf_info { - /* -* Where the ELF binary contents are kept. -* Memory managed by the user of the struct. -*/ - const char *buffer; - - const struct elfhdr *ehdr; - const struct elf_phdr *proghdrs; - struct elf_shdr *sechdrs; -}; - -static inline bool elf_is_elf_file(const struct elfhdr *ehdr) -{ - return memcmp(ehdr->e_ident, ELFMAG, SELFMAG) == 0; -} - -static uint64_t elf64_to_cpu(const struct elfhdr *ehdr, uint64_t value) -{ - if (ehdr->e_ident[EI_DATA] == ELFDATA2LSB) - value = le64_to_cpu(value); - else if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB) - value = be64_to_cpu(value); - - return value; -} - -static uint16_t elf16_to_cpu(const struct elfhdr *ehdr, uint16_t value) -{ - if (ehdr->e_ident[EI_DATA] == ELFDATA2LSB) - value = le16_to_cpu(value); - else if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB) - value = be16_to_cpu(value); - - return value; -} - -static uint32_t elf32_to_cpu(const struct elfhdr *ehdr, uint32_t value) -{ - if (ehdr->e_ident[EI_DATA] == ELFDATA2LSB) - value = le32_to_cpu(value); - else if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB) - value = be32_to_cpu(value); - - return value; -} - -/** - * elf_is_ehdr_sane - check that it is safe to use the ELF header - * @buf_len: size of the buffer in which the ELF file is loaded. - */ -static bool elf_is_ehdr_sane(const struct elfhdr *ehdr, size_t buf_len) -{ - if (ehdr->e_phnum > 0 && ehdr->e_phentsize != sizeof(struct elf_phdr)) { - pr_debug("Bad program header size.\n"); - return false; - } else if (ehdr->e_shnum > 0 && - ehdr->e_shentsize != sizeof(struct elf_shdr)) { - pr_debug("Bad section header size.\n"); - return false; - } else if (ehdr->e_ident[EI_VERSION] != EV_CURRENT || - ehdr->e_version != EV_CURRENT) { - pr_debug("Unknown ELF version.\n"); - return false; - } - - if (ehdr->e_phoff > 0 && ehdr->e_phnum > 0) { - size_t phdr_size; - - /* -* e_phnum is at most 65535 so calculating the size of the -* program header cannot overflow. -*/ - phdr_size = sizeof(struct elf_phdr) * ehdr->e_phnum; - - /* Sanity check the program header table location. */ - if (ehdr->e_phoff + phdr_size < ehdr->e_phoff) { - pr_debug("Program headers at invalid location.\n"); - return false; - } else if (ehdr->e_phoff + phdr_size > buf_len) { - pr_debug("Program headers truncated.\n"); - return false; - } - } - - if (ehdr->e_shoff > 0 && ehdr->e_shnum > 0) { - size_t shdr_size; - - /* -* e_shnum is at most
[PATCH using git format-patch -C] kexec: add generic support for elf kernel images
From: Sven Schnelle Signed-off-by: Sven Schnelle --- patch generated with 'git format-patch -C' in order to see the modifications done in kexec_elf.c in addition to copying it from arch/powerpc/kernel/kexec_elf_64.c arch/Kconfig | 3 + arch/powerpc/Kconfig | 1 + arch/powerpc/kernel/kexec_elf_64.c | 547 + include/linux/kexec.h | 35 ++ kernel/Makefile| 1 + .../kernel/kexec_elf_64.c => kernel/kexec_elf.c| 264 +++--- 6 files changed, 118 insertions(+), 733 deletions(-) copy arch/powerpc/kernel/kexec_elf_64.c => kernel/kexec_elf.c (64%) diff --git a/arch/Kconfig b/arch/Kconfig index 3368786a..c7c75fbc0b79 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -18,6 +18,9 @@ config KEXEC_CORE select CRASH_CORE bool +config KEXEC_ELF + bool + config HAVE_IMA_KEXEC bool diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 2d0be82c3061..447b6e3ad999 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -501,6 +501,7 @@ config KEXEC_FILE select KEXEC_CORE select HAVE_IMA_KEXEC select BUILD_BIN2C + select KEXEC_ELF depends on PPC64 depends on CRYPTO=y depends on CRYPTO_SHA256=y diff --git a/arch/powerpc/kernel/kexec_elf_64.c b/arch/powerpc/kernel/kexec_elf_64.c index ba4f18a43ee8..d062c5991722 100644 --- a/arch/powerpc/kernel/kexec_elf_64.c +++ b/arch/powerpc/kernel/kexec_elf_64.c @@ -31,539 +31,7 @@ #include #include -#define PURGATORY_STACK_SIZE (16 * 1024) - -#define elf_addr_to_cpuelf64_to_cpu - -#ifndef Elf_Rel -#define Elf_RelElf64_Rel -#endif /* Elf_Rel */ - -struct elf_info { - /* -* Where the ELF binary contents are kept. -* Memory managed by the user of the struct. -*/ - const char *buffer; - - const struct elfhdr *ehdr; - const struct elf_phdr *proghdrs; - struct elf_shdr *sechdrs; -}; - -static inline bool elf_is_elf_file(const struct elfhdr *ehdr) -{ - return memcmp(ehdr->e_ident, ELFMAG, SELFMAG) == 0; -} - -static uint64_t elf64_to_cpu(const struct elfhdr *ehdr, uint64_t value) -{ - if (ehdr->e_ident[EI_DATA] == ELFDATA2LSB) - value = le64_to_cpu(value); - else if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB) - value = be64_to_cpu(value); - - return value; -} - -static uint16_t elf16_to_cpu(const struct elfhdr *ehdr, uint16_t value) -{ - if (ehdr->e_ident[EI_DATA] == ELFDATA2LSB) - value = le16_to_cpu(value); - else if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB) - value = be16_to_cpu(value); - - return value; -} - -static uint32_t elf32_to_cpu(const struct elfhdr *ehdr, uint32_t value) -{ - if (ehdr->e_ident[EI_DATA] == ELFDATA2LSB) - value = le32_to_cpu(value); - else if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB) - value = be32_to_cpu(value); - - return value; -} - -/** - * elf_is_ehdr_sane - check that it is safe to use the ELF header - * @buf_len: size of the buffer in which the ELF file is loaded. - */ -static bool elf_is_ehdr_sane(const struct elfhdr *ehdr, size_t buf_len) -{ - if (ehdr->e_phnum > 0 && ehdr->e_phentsize != sizeof(struct elf_phdr)) { - pr_debug("Bad program header size.\n"); - return false; - } else if (ehdr->e_shnum > 0 && - ehdr->e_shentsize != sizeof(struct elf_shdr)) { - pr_debug("Bad section header size.\n"); - return false; - } else if (ehdr->e_ident[EI_VERSION] != EV_CURRENT || - ehdr->e_version != EV_CURRENT) { - pr_debug("Unknown ELF version.\n"); - return false; - } - - if (ehdr->e_phoff > 0 && ehdr->e_phnum > 0) { - size_t phdr_size; - - /* -* e_phnum is at most 65535 so calculating the size of the -* program header cannot overflow. -*/ - phdr_size = sizeof(struct elf_phdr) * ehdr->e_phnum; - - /* Sanity check the program header table location. */ - if (ehdr->e_phoff + phdr_size < ehdr->e_phoff) { - pr_debug("Program headers at invalid location.\n"); - return false; - } else if (ehdr->e_phoff + phdr_size > buf_len) { - pr_debug("Program headers truncated.\n"); - return false; - } - } - - if (ehdr->e_shoff > 0 && ehdr->e_shnum > 0) { - size_t shdr_size; - - /* -* e_shnum is at most 65536 so calculating -* the size of the section header cannot overflow. -*/ - shdr_size =
Re: [PATCH v3 3/3] powerpc/module64: Use symbolic instructions names.
Le 08/07/2019 à 02:56, Michael Ellerman a écrit : Christophe Leroy writes: To increase readability/maintainability, replace hard coded instructions values by symbolic names. Signed-off-by: Christophe Leroy --- v3: fixed warning by adding () in an 'if' around X | Y (unlike said in v2 history, this change was forgotten in v2) v2: rearranged comments arch/powerpc/kernel/module_64.c | 53 +++-- 1 file changed, 35 insertions(+), 18 deletions(-) diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c index c2e1b06253b8..b33a5d5e2d35 100644 --- a/arch/powerpc/kernel/module_64.c +++ b/arch/powerpc/kernel/module_64.c @@ -704,18 +711,21 @@ int apply_relocate_add(Elf64_Shdr *sechdrs, ... /* * If found, replace it with: * addis r2, r12, (.TOC.-func)@ha * addi r2, r12, (.TOC.-func)@l */ - ((uint32_t *)location)[0] = 0x3c4c + PPC_HA(value); - ((uint32_t *)location)[1] = 0x3842 + PPC_LO(value); + ((uint32_t *)location)[0] = PPC_INST_ADDIS | __PPC_RT(R2) | + __PPC_RA(R12) | PPC_HA(value); + ((uint32_t *)location)[1] = PPC_INST_ADDI | __PPC_RT(R2) | + __PPC_RA(R12) | PPC_LO(value); break; This was crashing and it's amazing how long you can stare at a disassembly and not see the difference between `r2` and `r12` :) Argh, yes. I was misleaded by the comment I guess. Sorry for that and thanks for fixing. Christophe Fixed now. cheers
Re: [PATCH v3] cpufreq/pasemi: fix an use-after-free in pas_cpufreq_cpu_init()
On 08-07-19, 16:48, Wen Yang wrote: > The cpu variable is still being used in the of_get_property() call > after the of_node_put() call, which may result in use-after-free. > > Fixes: a9acc26b75f ("cpufreq/pasemi: fix possible object reference leak") > Signed-off-by: Wen Yang > Cc: "Rafael J. Wysocki" > Cc: Viresh Kumar > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux...@vger.kernel.org > Cc: linux-ker...@vger.kernel.org > --- > v3: fix a leaked reference. > v2: clean up the code according to the advice of viresh. > > drivers/cpufreq/pasemi-cpufreq.c | 12 +++- > 1 file changed, 7 insertions(+), 5 deletions(-) > > diff --git a/drivers/cpufreq/pasemi-cpufreq.c > b/drivers/cpufreq/pasemi-cpufreq.c > index 6b1e4ab..9dc5163 100644 > --- a/drivers/cpufreq/pasemi-cpufreq.c > +++ b/drivers/cpufreq/pasemi-cpufreq.c > @@ -128,20 +128,20 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > int cur_astate, idx; > struct resource res; > struct device_node *cpu, *dn; > - int err = -ENODEV; > + int err; > > cpu = of_get_cpu_node(policy->cpu, NULL); > - > - of_node_put(cpu); > if (!cpu) > - goto out; > + return -ENODEV; > > dn = of_find_compatible_node(NULL, NULL, "1682m-sdc"); > if (!dn) > dn = of_find_compatible_node(NULL, NULL, >"pasemi,pwrficient-sdc"); > - if (!dn) > + if (!dn) { > + err = -ENODEV; > goto out; > + } Please restore the blank line here. > err = of_address_to_resource(dn, 0, ); > of_node_put(dn); > if (err) > @@ -196,6 +196,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > policy->cur = pas_freqs[cur_astate].frequency; > ppc_proc_freq = policy->cur * 1000ul; > > + of_node_put(cpu); > return cpufreq_generic_init(policy, pas_freqs, get_gizmo_latency()); > > out_unmap_sdcpwr: > @@ -204,6 +205,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > out_unmap_sdcasr: > iounmap(sdcasr_mapbase); > out: > + of_node_put(cpu); > return err; > } > > -- > 2.9.5 -- viresh
[PATCH v3] cpufreq/pasemi: fix an use-after-free in pas_cpufreq_cpu_init()
The cpu variable is still being used in the of_get_property() call after the of_node_put() call, which may result in use-after-free. Fixes: a9acc26b75f ("cpufreq/pasemi: fix possible object reference leak") Signed-off-by: Wen Yang Cc: "Rafael J. Wysocki" Cc: Viresh Kumar Cc: linuxppc-dev@lists.ozlabs.org Cc: linux...@vger.kernel.org Cc: linux-ker...@vger.kernel.org --- v3: fix a leaked reference. v2: clean up the code according to the advice of viresh. drivers/cpufreq/pasemi-cpufreq.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/cpufreq/pasemi-cpufreq.c b/drivers/cpufreq/pasemi-cpufreq.c index 6b1e4ab..9dc5163 100644 --- a/drivers/cpufreq/pasemi-cpufreq.c +++ b/drivers/cpufreq/pasemi-cpufreq.c @@ -128,20 +128,20 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) int cur_astate, idx; struct resource res; struct device_node *cpu, *dn; - int err = -ENODEV; + int err; cpu = of_get_cpu_node(policy->cpu, NULL); - - of_node_put(cpu); if (!cpu) - goto out; + return -ENODEV; dn = of_find_compatible_node(NULL, NULL, "1682m-sdc"); if (!dn) dn = of_find_compatible_node(NULL, NULL, "pasemi,pwrficient-sdc"); - if (!dn) + if (!dn) { + err = -ENODEV; goto out; + } err = of_address_to_resource(dn, 0, ); of_node_put(dn); if (err) @@ -196,6 +196,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) policy->cur = pas_freqs[cur_astate].frequency; ppc_proc_freq = policy->cur * 1000ul; + of_node_put(cpu); return cpufreq_generic_init(policy, pas_freqs, get_gizmo_latency()); out_unmap_sdcpwr: @@ -204,6 +205,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) out_unmap_sdcasr: iounmap(sdcasr_mapbase); out: + of_node_put(cpu); return err; } -- 2.9.5
Re: [PATCH v2] cpufreq/pasemi: fix an use-after-free inpas_cpufreq_cpu_init()
> > The cpu variable is still being used in the of_get_property() call > > after the of_node_put() call, which may result in use-after-free. > > > > Fixes: a9acc26b75f ("cpufreq/pasemi: fix possible object reference leak") > > Signed-off-by: Wen Yang > > Cc: "Rafael J. Wysocki" > > Cc: Viresh Kumar > > Cc: linuxppc-dev@lists.ozlabs.org > > Cc: linux...@vger.kernel.org > > Cc: linux-ker...@vger.kernel.org > > --- > > v2: clean up the code according to the advice of viresh. > > > > drivers/cpufreq/pasemi-cpufreq.c | 10 +- > > 1 file changed, 5 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/cpufreq/pasemi-cpufreq.c > > b/drivers/cpufreq/pasemi-cpufreq.c > > index 6b1e4ab..c6d464b 100644 > > --- a/drivers/cpufreq/pasemi-cpufreq.c > > +++ b/drivers/cpufreq/pasemi-cpufreq.c > > @@ -128,20 +128,18 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > > *policy) > > int cur_astate, idx; > > struct resource res; > > struct device_node *cpu, *dn; > > -int err = -ENODEV; > > +int err; > > > > cpu = of_get_cpu_node(policy->cpu, NULL); > > - > > -of_node_put(cpu); > > if (!cpu) > > -goto out; > > +return -ENODEV; > > > > > > dn = of_find_compatible_node(NULL, NULL, "1682m-sdc"); > > if (!dn) > > dn = of_find_compatible_node(NULL, NULL, > > "pasemi,pwrficient-sdc"); > > if (!dn) > > -goto out; > > +return -ENODEV; > > This change looks incorrect. You still need to drop reference to cpu ? > Thanks! We will fix it immediately. -- Thanks and regards, Wen
Re: [v3 4/7] powerpc/mce: Handle UE event for memcpy_mcsafe
On 7/6/19 3:23 PM, Nicholas Piggin wrote: > Santosh Sivaraj's on July 6, 2019 7:26 am: >> If we take a UE on one of the instructions with a fixup entry, set nip >> to continue exucution at the fixup entry. Stop processing the event >> further or print it. > > Minor nit, but can you instead a field in the mce data structure that > describes the property of the event, and then the code that intends to > ignore such events can test for it (with an appropriate comment). > > So it would be has_fixup_handler or similar. Then queue_event can have > the logic > > /* > * Don't report this machine check because the caller has a fixup > * handler which will do the appropriate error handling and reporting. > */ > > >> @@ -565,9 +567,18 @@ static int mce_handle_derror(struct pt_regs *regs, >> return 0; >> } >> >> -static long mce_handle_ue_error(struct pt_regs *regs) >> +static long mce_handle_ue_error(struct pt_regs *regs, >> +struct mce_error_info *mce_err) >> { >> long handled = 0; >> +const struct exception_table_entry *entry; >> + >> +entry = search_exception_tables(regs->nip); > > Uh oh, this searches module exception tables too... we can't do that > in real mode, can we? Yeah, we can not do that in real mode. Should we directly call search_extable() for kernel exception table ? > > Thanks, > Nick >
Re: [PATCH v2] cpufreq/pasemi: fix an use-after-free in pas_cpufreq_cpu_init()
On 08-07-19, 15:19, Wen Yang wrote: > The cpu variable is still being used in the of_get_property() call > after the of_node_put() call, which may result in use-after-free. > > Fixes: a9acc26b75f ("cpufreq/pasemi: fix possible object reference leak") > Signed-off-by: Wen Yang > Cc: "Rafael J. Wysocki" > Cc: Viresh Kumar > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux...@vger.kernel.org > Cc: linux-ker...@vger.kernel.org > --- > v2: clean up the code according to the advice of viresh. > > drivers/cpufreq/pasemi-cpufreq.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/cpufreq/pasemi-cpufreq.c > b/drivers/cpufreq/pasemi-cpufreq.c > index 6b1e4ab..c6d464b 100644 > --- a/drivers/cpufreq/pasemi-cpufreq.c > +++ b/drivers/cpufreq/pasemi-cpufreq.c > @@ -128,20 +128,18 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > int cur_astate, idx; > struct resource res; > struct device_node *cpu, *dn; > - int err = -ENODEV; > + int err; > > cpu = of_get_cpu_node(policy->cpu, NULL); > - > - of_node_put(cpu); > if (!cpu) > - goto out; > + return -ENODEV; > > dn = of_find_compatible_node(NULL, NULL, "1682m-sdc"); > if (!dn) > dn = of_find_compatible_node(NULL, NULL, >"pasemi,pwrficient-sdc"); > if (!dn) > - goto out; > + return -ENODEV; This change looks incorrect. You still need to drop reference to cpu ? > err = of_address_to_resource(dn, 0, ); > of_node_put(dn); > if (err) > @@ -196,6 +194,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > policy->cur = pas_freqs[cur_astate].frequency; > ppc_proc_freq = policy->cur * 1000ul; > > + of_node_put(cpu); > return cpufreq_generic_init(policy, pas_freqs, get_gizmo_latency()); > > out_unmap_sdcpwr: > @@ -204,6 +203,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > out_unmap_sdcasr: > iounmap(sdcasr_mapbase); > out: > + of_node_put(cpu); > return err; > } > > -- > 2.9.5 -- viresh
[PATCH v2] cpufreq/pasemi: fix an use-after-free in pas_cpufreq_cpu_init()
The cpu variable is still being used in the of_get_property() call after the of_node_put() call, which may result in use-after-free. Fixes: a9acc26b75f ("cpufreq/pasemi: fix possible object reference leak") Signed-off-by: Wen Yang Cc: "Rafael J. Wysocki" Cc: Viresh Kumar Cc: linuxppc-dev@lists.ozlabs.org Cc: linux...@vger.kernel.org Cc: linux-ker...@vger.kernel.org --- v2: clean up the code according to the advice of viresh. drivers/cpufreq/pasemi-cpufreq.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/cpufreq/pasemi-cpufreq.c b/drivers/cpufreq/pasemi-cpufreq.c index 6b1e4ab..c6d464b 100644 --- a/drivers/cpufreq/pasemi-cpufreq.c +++ b/drivers/cpufreq/pasemi-cpufreq.c @@ -128,20 +128,18 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) int cur_astate, idx; struct resource res; struct device_node *cpu, *dn; - int err = -ENODEV; + int err; cpu = of_get_cpu_node(policy->cpu, NULL); - - of_node_put(cpu); if (!cpu) - goto out; + return -ENODEV; dn = of_find_compatible_node(NULL, NULL, "1682m-sdc"); if (!dn) dn = of_find_compatible_node(NULL, NULL, "pasemi,pwrficient-sdc"); if (!dn) - goto out; + return -ENODEV; err = of_address_to_resource(dn, 0, ); of_node_put(dn); if (err) @@ -196,6 +194,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) policy->cur = pas_freqs[cur_astate].frequency; ppc_proc_freq = policy->cur * 1000ul; + of_node_put(cpu); return cpufreq_generic_init(policy, pas_freqs, get_gizmo_latency()); out_unmap_sdcpwr: @@ -204,6 +203,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) out_unmap_sdcasr: iounmap(sdcasr_mapbase); out: + of_node_put(cpu); return err; } -- 2.9.5
Re: [PATCH] cpufreq/pasemi: fix an use-after-free inpas_cpufreq_cpu_init()
> > The cpu variable is still being used in the of_get_property() call > > after the of_node_put() call, which may result in use-after-free. > > > > Fixes: a9acc26b75f ("cpufreq/pasemi: fix possible object reference leak") > > Signed-off-by: Wen Yang > > Cc: "Rafael J. Wysocki" > > Cc: Viresh Kumar > > Cc: linuxppc-dev@lists.ozlabs.org > > Cc: linux...@vger.kernel.org > > Cc: linux-ker...@vger.kernel.org > > --- > > drivers/cpufreq/pasemi-cpufreq.c | 10 ++ > > 1 file changed, 6 insertions(+), 4 deletions(-) > > I will suggest some changes here. Hello viresh, thank you for your comments. We will make changes as soon as possible. -- Thanks and regards, Wen
Re: [PATCH kernel v3 3/3] powerpc/powernv/ioda2: Create bigger default window with 64k IOMMU pages
Hi Alexey, Couple of comment/questions below. - /* -* Reserve page 0 so it will not be used for any mappings. -* This avoids buggy drivers that consider page 0 to be invalid -* to crash the machine or even lose data. -*/ - if (tbl->it_offset == 0) - set_bit(0, tbl->it_map); + tbl->it_reserved_start = res_start; + tbl->it_reserved_end = res_end; + iommu_table_reserve_pages(tbl); Personally I think it would be cleaner to set tbl->it_reserved_start/end in iommu_table_reserve_pages() rather than separating the setup like this. /* We only split the IOMMU table if we have 1GB or more of space */ if ((tbl->it_size << tbl->it_page_shift) >= (1UL * 1024 * 1024 * 1024)) @@ -727,12 +755,7 @@ static void iommu_table_free(struct kref *kref) return; } - /* -* In case we have reserved the first bit, we should not emit -* the warning below. -*/ - if (tbl->it_offset == 0) - clear_bit(0, tbl->it_map); + iommu_table_release_pages(tbl); /* verify that table contains no entries */ if (!bitmap_empty(tbl->it_map, tbl->it_size)) @@ -1037,8 +1060,7 @@ int iommu_take_ownership(struct iommu_table *tbl) for (i = 0; i < tbl->nr_pools; i++) spin_lock(>pools[i].lock); - if (tbl->it_offset == 0) - clear_bit(0, tbl->it_map); + iommu_table_reserve_pages(tbl); if (!bitmap_empty(tbl->it_map, tbl->it_size)) { pr_err("iommu_tce: it_map is not empty"); @@ -1068,9 +1090,7 @@ void iommu_release_ownership(struct iommu_table *tbl) memset(tbl->it_map, 0, sz); - /* Restore bit#0 set by iommu_init_table() */ - if (tbl->it_offset == 0) - set_bit(0, tbl->it_map); + iommu_table_release_pages(tbl); Are the above two changes correct? From the perspective of code behaviour this seems to swap the set/clear_bit() calls. For example above you are replacing set_bit(0, tbl->it_map) with a call to iommu_table_release_pages() which does clear_bit(0, tbl->it_map) instead. - Alistair for (i = 0; i < tbl->nr_pools; i++) spin_unlock(>pools[i].lock); diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c index 126602b4e399..ce2efdb3900d 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -2422,6 +2422,7 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe) { struct iommu_table *tbl = NULL; long rc; + unsigned long res_start, res_end; /* * crashkernel= specifies the kdump kernel's maximum memory at @@ -2435,19 +2436,46 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe) * DMA window can be larger than available memory, which will * cause errors later. */ - const u64 window_size = min((u64)pe->table_group.tce32_size, max_memory); + const u64 maxblock = 1UL << (PAGE_SHIFT + MAX_ORDER - 1); - rc = pnv_pci_ioda2_create_table(>table_group, 0, - IOMMU_PAGE_SHIFT_4K, - window_size, - POWERNV_IOMMU_DEFAULT_LEVELS, false, ); + /* +* We create the default window as big as we can. The constraint is +* the max order of allocation possible. The TCE tableis likely to +* end up being multilevel and with on-demand allocation in place, +* the initial use is not going to be huge as the default window aims +* to support cripplied devices (i.e. not fully 64bit DMAble) only. +*/ + /* iommu_table::it_map uses 1 bit per IOMMU page, hence 8 */ + const u64 window_size = min((maxblock * 8) << PAGE_SHIFT, max_memory); + /* Each TCE level cannot exceed maxblock so go multilevel if needed */ + unsigned long tces_order = ilog2(window_size >> PAGE_SHIFT); + unsigned long tcelevel_order = ilog2(maxblock >> 3); + unsigned int levels = tces_order / tcelevel_order; + + if (tces_order % tcelevel_order) + levels += 1; + /* +* We try to stick to default levels (which is >1 at the moment) in +* order to save memory by relying on on-demain TCE level allocation. +*/ + levels = max_t(unsigned int, levels, POWERNV_IOMMU_DEFAULT_LEVELS); + + rc = pnv_pci_ioda2_create_table(>table_group, 0, PAGE_SHIFT, + window_size, levels, false, ); if (rc) { pe_err(pe, "Failed to create 32-bit TCE table, err %ld", rc); return rc; } - iommu_init_table(tbl, pe->phb->hose->node); + /* We use top part of 32bit space for MMIO so exclude it from DMA */ + res_start = 0; + res_end = 0; + if (window_size > pe->phb->ioda.m32_pci_base) { +
Re: [PATCH kernel v3 2/3] powerpc/powernv/ioda2: Allocate TCE table levels on demand for default DMA window
It seems this is mostly just enabling already existing code used by KVM for on-demand TCE level allocation on baremetal as well. Given that I suppose the implementation of the on-demand allocation itself is already used and therefore somewhat tested by KVM. I took a look at pnv_tce() which does the on-demand allocation and the changes here seem like they should work with that so: Reviewed-by: Alistair Popple On Thursday, 30 May 2019 5:03:54 PM AEST Alexey Kardashevskiy wrote: We allocate only the first level of multilevel TCE tables for KVM already (alloc_userspace_copy==true), and the rest is allocated on demand. This is not enabled though for baremetal. This removes the KVM limitation (implicit, via the alloc_userspace_copy parameter) and always allocates just the first level. The on-demand allocation of missing levels is already implemented. As from now on DMA map might happen with disabled interrupts, this allocates TCEs with GFP_ATOMIC. To save time when creating a new clean table, this skips non-allocated indirect TCE entries in pnv_tce_free just like we already do in the VFIO IOMMU TCE driver. This changes the default level number from 1 to 2 to reduce the amount of memory required for the default 32bit DMA window at the boot time. The default window size is up to 2GB which requires 4MB of TCEs which is unlikely to be used entirely or at all as most devices these days are 64bit capable so by switching to 2 levels by default we save 4032KB of RAM per a device. While at this, add __GFP_NOWARN to alloc_pages_node() as the userspace can trigger this path via VFIO, see the failure and try creating a table again with different parameters which might succeed. Signed-off-by: Alexey Kardashevskiy --- Changes: v2: * added __GFP_NOWARN to alloc_pages_node --- arch/powerpc/platforms/powernv/pci.h | 2 +- arch/powerpc/platforms/powernv/pci-ioda-tce.c | 20 +-- 2 files changed, 11 insertions(+), 11 deletions(-) diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h index 1a51e7bfc541..a55dabc8d057 100644 --- a/arch/powerpc/platforms/powernv/pci.h +++ b/arch/powerpc/platforms/powernv/pci.h @@ -225,7 +225,7 @@ extern struct iommu_table_group *pnv_npu_compound_attach( struct pnv_ioda_pe *pe); /* pci-ioda-tce.c */ -#define POWERNV_IOMMU_DEFAULT_LEVELS 1 +#define POWERNV_IOMMU_DEFAULT_LEVELS 2 #define POWERNV_IOMMU_MAX_LEVELS 5 extern int pnv_tce_build(struct iommu_table *tbl, long index, long npages, diff --git a/arch/powerpc/platforms/powernv/pci-ioda-tce.c b/arch/powerpc/platforms/powernv/pci-ioda-tce.c index e28f03e1eb5e..c75ec37bf0cd 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda-tce.c +++ b/arch/powerpc/platforms/powernv/pci-ioda-tce.c @@ -36,7 +36,8 @@ static __be64 *pnv_alloc_tce_level(int nid, unsigned int shift) struct page *tce_mem = NULL; __be64 *addr; - tce_mem = alloc_pages_node(nid, GFP_KERNEL, shift - PAGE_SHIFT); + tce_mem = alloc_pages_node(nid, GFP_ATOMIC | __GFP_NOWARN, + shift - PAGE_SHIFT); if (!tce_mem) { pr_err("Failed to allocate a TCE memory, level shift=%d\n", shift); @@ -161,6 +162,9 @@ void pnv_tce_free(struct iommu_table *tbl, long index, long npages) if (ptce) *ptce = cpu_to_be64(0); + else + /* Skip the rest of the level */ + i |= tbl->it_level_size - 1; } } @@ -260,7 +264,6 @@ long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset, unsigned int table_shift = max_t(unsigned int, entries_shift + 3, PAGE_SHIFT); const unsigned long tce_table_size = 1UL << table_shift; - unsigned int tmplevels = levels; if (!levels || (levels > POWERNV_IOMMU_MAX_LEVELS)) return -EINVAL; @@ -268,9 +271,6 @@ long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset, if (!is_power_of_2(window_size)) return -EINVAL; - if (alloc_userspace_copy && (window_size > (1ULL << 32))) - tmplevels = 1; - /* Adjust direct table size from window_size and levels */ entries_shift = (entries_shift + levels - 1) / levels; level_shift = entries_shift + 3; @@ -281,7 +281,7 @@ long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset, /* Allocate TCE table */ addr = pnv_pci_ioda2_table_do_alloc_pages(nid, level_shift, - tmplevels, tce_table_size, , _allocated); + 1, tce_table_size, , _allocated); /* addr==NULL means that the first level allocation failed */ if (!addr) @@ -292,18 +292,18 @@ long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset, * we did not allocate as much as we wanted, * release partially allocated table. */ - if (tmplevels == levels && offset < tce_table_size) +
[PATCH V3 2/2] ASoC: fsl_esai: recover the channel swap after xrun
From: Shengjiu Wang There is chip errata ERR008000, the reference doc is (https://www.nxp.com/docs/en/errata/IMX6DQCE.pdf), The issue is "While using ESAI transmit or receive and an underrun/overrun happens, channel swap may occur. The only recovery mechanism is to reset the ESAI." This issue exist in imx3/imx5/imx6(partial) series. In this commit add a tasklet to handle reset of ESAI after xrun happens to recover the channel swap. Signed-off-by: Shengjiu Wang --- sound/soc/fsl/fsl_esai.c | 78 1 file changed, 78 insertions(+) diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c index ab460d6d7432..416bec424fd6 100644 --- a/sound/soc/fsl/fsl_esai.c +++ b/sound/soc/fsl/fsl_esai.c @@ -32,6 +32,7 @@ * @extalclk: esai clock source to derive HCK, SCK and FS * @fsysclk: system clock source to derive HCK, SCK and FS * @spbaclk: SPBA clock (optional, depending on SoC design) + * @task: tasklet to handle the reset operation * @fifo_depth: depth of tx/rx FIFO * @slot_width: width of each DAI slot * @slots: number of slots @@ -42,6 +43,7 @@ * @sck_div: if using PSR/PM dividers for SCKx clock * @slave_mode: if fully using DAI slave mode * @synchronous: if using tx/rx synchronous mode + * @reset_at_xrun: flags for enable reset operaton * @name: driver name */ struct fsl_esai { @@ -53,6 +55,7 @@ struct fsl_esai { struct clk *extalclk; struct clk *fsysclk; struct clk *spbaclk; + struct tasklet_struct task; u32 fifo_depth; u32 slot_width; u32 slots; @@ -65,6 +68,7 @@ struct fsl_esai { bool sck_div[2]; bool slave_mode; bool synchronous; + bool reset_at_xrun; char name[32]; }; @@ -73,8 +77,16 @@ static irqreturn_t esai_isr(int irq, void *devid) struct fsl_esai *esai_priv = (struct fsl_esai *)devid; struct platform_device *pdev = esai_priv->pdev; u32 esr; + u32 saisr; regmap_read(esai_priv->regmap, REG_ESAI_ESR, ); + regmap_read(esai_priv->regmap, REG_ESAI_SAISR, ); + + if ((saisr & (ESAI_SAISR_TUE | ESAI_SAISR_ROE)) && + esai_priv->reset_at_xrun) { + dev_dbg(>dev, "reset module for xrun\n"); + tasklet_schedule(_priv->task); + } if (esr & ESAI_ESR_TINIT_MASK) dev_dbg(>dev, "isr: Transmission Initialized\n"); @@ -635,10 +647,17 @@ static void fsl_esai_trigger_start(struct fsl_esai *esai_priv, bool tx) ESAI_xSMB_xS_MASK, ESAI_xSMB_xS(mask)); regmap_update_bits(esai_priv->regmap, REG_ESAI_xSMA(tx), ESAI_xSMA_xS_MASK, ESAI_xSMA_xS(mask)); + + /* Enable Exception interrupt */ + regmap_update_bits(esai_priv->regmap, REG_ESAI_xCR(tx), + ESAI_xCR_xEIE_MASK, ESAI_xCR_xEIE); } static void fsl_esai_trigger_stop(struct fsl_esai *esai_priv, bool tx) { + regmap_update_bits(esai_priv->regmap, REG_ESAI_xCR(tx), + ESAI_xCR_xEIE_MASK, 0); + regmap_update_bits(esai_priv->regmap, REG_ESAI_xCR(tx), tx ? ESAI_xCR_TE_MASK : ESAI_xCR_RE_MASK, 0); regmap_update_bits(esai_priv->regmap, REG_ESAI_xSMA(tx), @@ -653,6 +672,55 @@ static void fsl_esai_trigger_stop(struct fsl_esai *esai_priv, bool tx) ESAI_xFCR_xFR, 0); } +static void fsl_esai_hw_reset(unsigned long arg) +{ + struct fsl_esai *esai_priv = (struct fsl_esai *)arg; + u32 saisr, tfcr, rfcr; + bool tx = true, rx = false, enabled[2]; + + /* Save the registers */ + regmap_read(esai_priv->regmap, REG_ESAI_TFCR, ); + regmap_read(esai_priv->regmap, REG_ESAI_RFCR, ); + enabled[tx] = tfcr & ESAI_xFCR_xFEN; + enabled[rx] = rfcr & ESAI_xFCR_xFEN; + + /* Stop the tx & rx */ + fsl_esai_trigger_stop(esai_priv, tx); + fsl_esai_trigger_stop(esai_priv, rx); + + /* Reset the esai, and ignore return value */ + fsl_esai_hw_init(esai_priv); + + regmap_update_bits(esai_priv->regmap, REG_ESAI_TCR, + ESAI_xCR_xPR_MASK, ESAI_xCR_xPR); + regmap_update_bits(esai_priv->regmap, REG_ESAI_RCR, + ESAI_xCR_xPR_MASK, ESAI_xCR_xPR); + + /* +* Restore registers by regcache_sync, and ignore +* return value +*/ + fsl_esai_register_restore(esai_priv); + + regmap_update_bits(esai_priv->regmap, REG_ESAI_TCR, + ESAI_xCR_xPR_MASK, 0); + regmap_update_bits(esai_priv->regmap, REG_ESAI_RCR, + ESAI_xCR_xPR_MASK, 0); + + regmap_update_bits(esai_priv->regmap, REG_ESAI_PRRC, + ESAI_PRRC_PDC_MASK, ESAI_PRRC_PDC(ESAI_GPIO)); + regmap_update_bits(esai_priv->regmap, REG_ESAI_PCRC, + ESAI_PCRC_PC_MASK, ESAI_PCRC_PC(ESAI_GPIO)); + +
[PATCH V3 1/2] ASoC: fsl_esai: Wrap some operations to be functions
From: Shengjiu Wang Extract the operation to be functions, to improve the readability. In this patch, fsl_esai_hw_init, fsl_esai_register_restore, fsl_esai_trigger_start and fsl_esai_trigger_stop are extracted. Signed-off-by: Shengjiu Wang Acked-by: Nicolin Chen --- sound/soc/fsl/fsl_esai.c | 192 --- 1 file changed, 119 insertions(+), 73 deletions(-) diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c index 10d2210c91ef..ab460d6d7432 100644 --- a/sound/soc/fsl/fsl_esai.c +++ b/sound/soc/fsl/fsl_esai.c @@ -35,6 +35,7 @@ * @fifo_depth: depth of tx/rx FIFO * @slot_width: width of each DAI slot * @slots: number of slots + * @channels: channel num for tx or rx * @hck_rate: clock rate of desired HCKx clock * @sck_rate: clock rate of desired SCKx clock * @hck_dir: the direction of HCKx pads @@ -57,6 +58,7 @@ struct fsl_esai { u32 slots; u32 tx_mask; u32 rx_mask; + u32 channels[2]; u32 hck_rate[2]; u32 sck_rate[2]; bool hck_dir[2]; @@ -543,64 +545,132 @@ static int fsl_esai_hw_params(struct snd_pcm_substream *substream, return 0; } -static int fsl_esai_trigger(struct snd_pcm_substream *substream, int cmd, - struct snd_soc_dai *dai) +static int fsl_esai_hw_init(struct fsl_esai *esai_priv) { - struct fsl_esai *esai_priv = snd_soc_dai_get_drvdata(dai); - bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK; - u8 i, channels = substream->runtime->channels; + struct platform_device *pdev = esai_priv->pdev; + int ret; + + /* Reset ESAI unit */ + ret = regmap_update_bits(esai_priv->regmap, REG_ESAI_ECR, +ESAI_ECR_ESAIEN_MASK | ESAI_ECR_ERST_MASK, +ESAI_ECR_ESAIEN | ESAI_ECR_ERST); + if (ret) { + dev_err(>dev, "failed to reset ESAI: %d\n", ret); + return ret; + } + + /* +* We need to enable ESAI so as to access some of its registers. +* Otherwise, we would fail to dump regmap from user space. +*/ + ret = regmap_update_bits(esai_priv->regmap, REG_ESAI_ECR, +ESAI_ECR_ESAIEN_MASK | ESAI_ECR_ERST_MASK, +ESAI_ECR_ESAIEN); + if (ret) { + dev_err(>dev, "failed to enable ESAI: %d\n", ret); + return ret; + } + + regmap_update_bits(esai_priv->regmap, REG_ESAI_PRRC, + ESAI_PRRC_PDC_MASK, 0); + regmap_update_bits(esai_priv->regmap, REG_ESAI_PCRC, + ESAI_PCRC_PC_MASK, 0); + + return 0; +} + +static int fsl_esai_register_restore(struct fsl_esai *esai_priv) +{ + int ret; + + /* FIFO reset for safety */ + regmap_update_bits(esai_priv->regmap, REG_ESAI_TFCR, + ESAI_xFCR_xFR, ESAI_xFCR_xFR); + regmap_update_bits(esai_priv->regmap, REG_ESAI_RFCR, + ESAI_xFCR_xFR, ESAI_xFCR_xFR); + + regcache_mark_dirty(esai_priv->regmap); + ret = regcache_sync(esai_priv->regmap); + if (ret) + return ret; + + /* FIFO reset done */ + regmap_update_bits(esai_priv->regmap, REG_ESAI_TFCR, ESAI_xFCR_xFR, 0); + regmap_update_bits(esai_priv->regmap, REG_ESAI_RFCR, ESAI_xFCR_xFR, 0); + + return 0; +} + +static void fsl_esai_trigger_start(struct fsl_esai *esai_priv, bool tx) +{ + u8 i, channels = esai_priv->channels[tx]; u32 pins = DIV_ROUND_UP(channels, esai_priv->slots); u32 mask; + regmap_update_bits(esai_priv->regmap, REG_ESAI_xFCR(tx), + ESAI_xFCR_xFEN_MASK, ESAI_xFCR_xFEN); + + /* Write initial words reqiured by ESAI as normal procedure */ + for (i = 0; tx && i < channels; i++) + regmap_write(esai_priv->regmap, REG_ESAI_ETDR, 0x0); + + /* +* When set the TE/RE in the end of enablement flow, there +* will be channel swap issue for multi data line case. +* In order to workaround this issue, we switch the bit +* enablement sequence to below sequence +* 1) clear the xSMB & xSMA: which is done in probe and +* stop state. +* 2) set TE/RE +* 3) set xSMB +* 4) set xSMA: xSMA is the last one in this flow, which +* will trigger esai to start. +*/ + regmap_update_bits(esai_priv->regmap, REG_ESAI_xCR(tx), + tx ? ESAI_xCR_TE_MASK : ESAI_xCR_RE_MASK, + tx ? ESAI_xCR_TE(pins) : ESAI_xCR_RE(pins)); + mask = tx ? esai_priv->tx_mask : esai_priv->rx_mask; + + regmap_update_bits(esai_priv->regmap, REG_ESAI_xSMB(tx), + ESAI_xSMB_xS_MASK, ESAI_xSMB_xS(mask)); + regmap_update_bits(esai_priv->regmap, REG_ESAI_xSMA(tx), +
[PATCH V3 0/2] recover the channel swap after xrun
From: Shengjiu Wang recover the channel swap after xrun Shengjiu Wang (2): ASoC: fsl_esai: Wrap some operations to be functions ASoC: fsl_esai: recover the channel swap after xrun sound/soc/fsl/fsl_esai.c | 270 --- 1 file changed, 197 insertions(+), 73 deletions(-) Changes in V3 - update code sytle. Changes in v2 - add one patch for wrap operations to functions. - fix some coding style issue -- 2.21.0
Re: [PATCH] cpufreq/pasemi: fix an use-after-free in pas_cpufreq_cpu_init()
On 08-07-19, 14:19, Wen Yang wrote: > The cpu variable is still being used in the of_get_property() call > after the of_node_put() call, which may result in use-after-free. > > Fixes: a9acc26b75f ("cpufreq/pasemi: fix possible object reference leak") > Signed-off-by: Wen Yang > Cc: "Rafael J. Wysocki" > Cc: Viresh Kumar > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux...@vger.kernel.org > Cc: linux-ker...@vger.kernel.org > --- > drivers/cpufreq/pasemi-cpufreq.c | 10 ++ > 1 file changed, 6 insertions(+), 4 deletions(-) I will suggest some changes here. > diff --git a/drivers/cpufreq/pasemi-cpufreq.c > b/drivers/cpufreq/pasemi-cpufreq.c > index 6b1e4ab..d2dd47b 100644 > --- a/drivers/cpufreq/pasemi-cpufreq.c > +++ b/drivers/cpufreq/pasemi-cpufreq.c > @@ -132,7 +132,6 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) Don't initialize "err" anymore. > cpu = of_get_cpu_node(policy->cpu, NULL); > > - of_node_put(cpu); > if (!cpu) > goto out; Do return -ENODEV; here. > > @@ -141,15 +140,15 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > dn = of_find_compatible_node(NULL, NULL, >"pasemi,pwrficient-sdc"); > if (!dn) > - goto out; > + goto out_put_cpu_node; > err = of_address_to_resource(dn, 0, ); > of_node_put(dn); > if (err) > - goto out; > + goto out_put_cpu_node; > sdcasr_mapbase = ioremap(res.start + SDCASR_OFFSET, 0x2000); > if (!sdcasr_mapbase) { > err = -EINVAL; > - goto out; > + goto out_put_cpu_node; > } Don't do above changes. > > dn = of_find_compatible_node(NULL, NULL, "1682m-gizmo"); > @@ -177,6 +176,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > err = -EINVAL; > goto out_unmap_sdcpwr; > } > + of_node_put(cpu); > > /* we need the freq in kHz */ > max_freq = *max_freqp / 1000; > @@ -203,6 +203,8 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy > *policy) > > out_unmap_sdcasr: > iounmap(sdcasr_mapbase); > +out_put_cpu_node: Don't add this label, instead use "out" for also having the below code. > + of_node_put(cpu); > out: > return err; > } > -- > 2.9.5 -- viresh
[PATCH] cpufreq/pasemi: fix an use-after-free in pas_cpufreq_cpu_init()
The cpu variable is still being used in the of_get_property() call after the of_node_put() call, which may result in use-after-free. Fixes: a9acc26b75f ("cpufreq/pasemi: fix possible object reference leak") Signed-off-by: Wen Yang Cc: "Rafael J. Wysocki" Cc: Viresh Kumar Cc: linuxppc-dev@lists.ozlabs.org Cc: linux...@vger.kernel.org Cc: linux-ker...@vger.kernel.org --- drivers/cpufreq/pasemi-cpufreq.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/cpufreq/pasemi-cpufreq.c b/drivers/cpufreq/pasemi-cpufreq.c index 6b1e4ab..d2dd47b 100644 --- a/drivers/cpufreq/pasemi-cpufreq.c +++ b/drivers/cpufreq/pasemi-cpufreq.c @@ -132,7 +132,6 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) cpu = of_get_cpu_node(policy->cpu, NULL); - of_node_put(cpu); if (!cpu) goto out; @@ -141,15 +140,15 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) dn = of_find_compatible_node(NULL, NULL, "pasemi,pwrficient-sdc"); if (!dn) - goto out; + goto out_put_cpu_node; err = of_address_to_resource(dn, 0, ); of_node_put(dn); if (err) - goto out; + goto out_put_cpu_node; sdcasr_mapbase = ioremap(res.start + SDCASR_OFFSET, 0x2000); if (!sdcasr_mapbase) { err = -EINVAL; - goto out; + goto out_put_cpu_node; } dn = of_find_compatible_node(NULL, NULL, "1682m-gizmo"); @@ -177,6 +176,7 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) err = -EINVAL; goto out_unmap_sdcpwr; } + of_node_put(cpu); /* we need the freq in kHz */ max_freq = *max_freqp / 1000; @@ -203,6 +203,8 @@ static int pas_cpufreq_cpu_init(struct cpufreq_policy *policy) out_unmap_sdcasr: iounmap(sdcasr_mapbase); +out_put_cpu_node: + of_node_put(cpu); out: return err; } -- 2.9.5
[PATCH] powerpc/irq: Don't WARN continuously in arch_local_irq_restore()
When CONFIG_PPC_IRQ_SOFT_MASK_DEBUG is enabled (uncommon), we have a series of WARN_ON's in arch_local_irq_restore(). These are "should never happen" conditions, but if they do happen they can flood the console and render the system unusable. So switch them to WARN_ON_ONCE(). Fixes: e2b36d591720 ("powerpc/64: Don't trace code that runs with the soft irq mask unreconciled") Fixes: 9b81c0211c24 ("powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closely") Fixes: 7c0482e3d055 ("powerpc/irq: Fix another case of lazy IRQ state getting out of sync") Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/irq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index ada901af4950..c9a6eac3075c 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -259,7 +259,7 @@ notrace void arch_local_irq_restore(unsigned long mask) irq_happened = get_irq_happened(); if (!irq_happened) { #ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG - WARN_ON(!(mfmsr() & MSR_EE)); + WARN_ON_ONCE(!(mfmsr() & MSR_EE)); #endif return; } @@ -272,7 +272,7 @@ notrace void arch_local_irq_restore(unsigned long mask) */ if (!(irq_happened & PACA_IRQ_HARD_DIS)) { #ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG - WARN_ON(!(mfmsr() & MSR_EE)); + WARN_ON_ONCE(!(mfmsr() & MSR_EE)); #endif __hard_irq_disable(); #ifdef CONFIG_PPC_IRQ_SOFT_MASK_DEBUG @@ -283,7 +283,7 @@ notrace void arch_local_irq_restore(unsigned long mask) * warn if we are wrong. Only do that when IRQ tracing * is enabled as mfmsr() can be costly. */ - if (WARN_ON(mfmsr() & MSR_EE)) + if (WARN_ON_ONCE(mfmsr() & MSR_EE)) __hard_irq_disable(); #endif } -- 2.20.1