Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save
On Tue Nov 21, 2023 at 11:23 AM AEST, Timothy Pearson wrote: > > > - Original Message - > > From: "Michael Ellerman" > > To: "Timothy Pearson" > > Cc: "Jens Axboe" , "regressions" > > , "npiggin" , > > "christophe leroy" , "linuxppc-dev" > > > > Sent: Monday, November 20, 2023 5:39:52 PM > > Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec > > register save > > > Timothy Pearson writes: > >> - Original Message - > >>> From: "Michael Ellerman" > > ... > >>> > >>> But we now have a new path, because io-uring can call copy_process() via > >>> create_io_thread() from the signal handling path. That's OK if the signal > >>> is > >>> handled as we return from a syscall, but it's not OK if the signal is > >>> handled > >>> due to some other interrupt. > >>> > >>> Which is: > >>> > >>> interrupt_return_srr_user() > >>> interrupt_exit_user_prepare() > >>>interrupt_exit_user_prepare_main() > >>> do_notify_resume() > >>>get_signal() > >>> task_work_run() > >>>create_worker_cb() > >>> create_io_worker() > >>>copy_process() > >>> dup_task_struct() > >>>arch_dup_task_struct() > >>> flush_all_to_thread() > >>>save_all() > >>> if (tsk->thread.regs->msr & MSR_FP) > >>>save_fpu() > >>># fr0 is clobbered and potentially live in > >>> userspace > >>> > >>> > >>> So tldr I think the corruption is only an issue since io-uring started > >>> doing > >>> the clone via signal, which I think matches the observed timeline of this > >>> bug > >>> appearing. > >> > >> I agree the corruption really only started showing up in earnest on > >> io_uring clone-via-signal, as this was confirmed several times in the > >> course of debugging. > > > > Thanks. > > > >> Note as well that I may very well have a wrong call order in the > >> commit message, since I was relying on a couple of WARN_ON() macros I > >> inserted to check for a similar (but not identical) condition and > >> didn't spend much time getting new traces after identifying the root > >> cause. 
> > Yep no worries. I'll reword it to incorporate the full path from my mail.
> >
> >> I went back and grabbed some real world system-wide stack traces, since I
> >> now know what to trigger on. A typical example is:
> >>
> >> interrupt_return_srr_user()
> >>   interrupt_exit_user_prepare()
> >>     interrupt_exit_user_prepare_main()
> >>       schedule()
> >>         __schedule()
> >>           __switch_to()
> >>             giveup_all()
> >>               # tsk->thread.regs->msr MSR_FP is still set here
> >>               __giveup_fpu()
> >>                 save_fpu()
> >>                 # fr0 is clobbered and potentially live in userspace
> >
> > fr0 is not live there.
> >
> > __giveup_fpu() does roughly:
> >
> >	msr = tsk->thread.regs->msr;
> >	msr &= ~(MSR_FP|MSR_FE0|MSR_FE1);
> >	msr &= ~MSR_VSX;
> >	tsk->thread.regs->msr = msr;
> >
> > ie. it clears the FP etc. bits from the task's MSR. That means the FP
> > state will be reloaded from the thread struct before the task is run again.
> >
> > Also on that path we're switching to another task, so we'll be reloading
> > the other task's FP state before returning to userspace.
> >
> > So I don't see any bug there.
>
> Yeah, you're right. I was trying to get traces while doing something else,
> and didn't think that all the way through, sorry. :) It's not going to be
> super easy to get a real trace (I was triggering the WARN_ON() from fr0
> getting set to the FPSCR), so let's just assume it's mainly the path you
> manually found above and update the commit message accordingly.
>
> > There's only two places that call save_fpu() and skip the giveup logic,
> > which is save_all() and kvmppc_save_user_regs().
>
> Now that's interesting as well, since it might explain some issues I've
> seen for years on a specific QEMU workload. Once this is backported to
> stable I'll need to get the kernels updated on those boxes and see if the
> issues disappear...

KVM issue is actually slightly different. You need this (lightly verified
to solve issue so far).
---

From c12fbed0e12207058282a2411da59b43b1f96c49 Mon Sep 17 00:00:00 2001
From: Nicholas Piggin
Date: Tue, 21 Nov 2023 17:03:55 +1000
Subject: [PATCH] KVM: PPC: Book3S HV: Fix KVM_RUN ioctl clobbering FP/VEC
 registers

Before running a guest, the host process's FP/VEC registers are saved
away like a context switch or a kernel use of those regs. The guest
FP/VEC registers can then be loaded as needed. The host process
registers would be restored lazily when it uses FP/VEC again.

KVM HV did not do this correctly. The registers are saved in the thread
struct, but the FP/VEC/VSX bits remain enabled in the user MSR, leading
the kernel to think they are still valid, even after they are clobbered
by guest registers.
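The failure mode the commit message describes can be captured in a toy model — plain Python with illustrative names, not kernel code: the host registers are saved, but because MSR_FP stays set the kernel never performs the lazy reload after the guest clobbers the register file.

```python
# Toy model of the KVM HV bug: saving host FP regs without clearing
# MSR_FP means the (guest-clobbered) live registers are treated as
# still valid. Names are illustrative, not real kernel structures.

MSR_FP = 0x2000

class Host:
    def __init__(self):
        self.live = {"fr14": 3.25}   # live register file
        self.saved = {}              # thread-struct save area
        self.msr = MSR_FP            # FP state currently live in regs

def kvmppc_save_user_regs_model(host, giveup):
    # Save host FP regs into the thread struct. The bug: the real code
    # saved but left MSR_FP set (modeled by giveup=False).
    host.saved = dict(host.live)
    if giveup:
        host.msr &= ~MSR_FP          # the fix: a proper "giveup"

def run_guest(host):
    host.live["fr14"] = "guest"      # guest run clobbers host registers

def return_to_user(host):
    # Lazy restore: regs are reloaded from the thread struct only if
    # MSR_FP says the live copy is invalid.
    if not (host.msr & MSR_FP):
        host.live = dict(host.saved)
    return host.live["fr14"]

buggy = Host()
kvmppc_save_user_regs_model(buggy, giveup=False)
run_guest(buggy)
assert return_to_user(buggy) == "guest"   # host fr14 corrupted

good = Host()
kvmppc_save_user_regs_model(good, giveup=True)
run_guest(good)
assert return_to_user(good) == 3.25       # reloaded from thread struct
```

This matches Nick's later observation that the fix is to do a "giveup" on FP/VEC rather than a bare save.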
Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save
On Tue Nov 21, 2023 at 2:10 PM AEST, Timothy Pearson wrote: > - Original Message - > > From: "Michael Ellerman" > > To: "Timothy Pearson" > > Cc: "Jens Axboe" , "regressions" > > , "npiggin" , > > "christophe leroy" , "linuxppc-dev" > > > > Sent: Monday, November 20, 2023 5:39:52 PM > > Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec > > register save > > > Timothy Pearson writes: > >> - Original Message - > >>> From: "Michael Ellerman" > > ... > >>> > >>> But we now have a new path, because io-uring can call copy_process() via > >>> create_io_thread() from the signal handling path. That's OK if the signal > >>> is > >>> handled as we return from a syscall, but it's not OK if the signal is > >>> handled > >>> due to some other interrupt. > >>> > >>> Which is: > >>> > >>> interrupt_return_srr_user() > >>> interrupt_exit_user_prepare() > >>>interrupt_exit_user_prepare_main() > >>> do_notify_resume() > >>>get_signal() > >>> task_work_run() > >>>create_worker_cb() > >>> create_io_worker() > >>>copy_process() > >>> dup_task_struct() > >>>arch_dup_task_struct() > >>> flush_all_to_thread() > >>>save_all() > >>> if (tsk->thread.regs->msr & MSR_FP) > >>>save_fpu() > >>># fr0 is clobbered and potentially live in > >>> userspace > >>> > >>> > >>> So tldr I think the corruption is only an issue since io-uring started > >>> doing > >>> the clone via signal, which I think matches the observed timeline of this > >>> bug > >>> appearing. > >> > >> I agree the corruption really only started showing up in earnest on > >> io_uring clone-via-signal, as this was confirmed several times in the > >> course of debugging. > > > > Thanks. > > > >> Note as well that I may very well have a wrong call order in the > >> commit message, since I was relying on a couple of WARN_ON() macros I > >> inserted to check for a similar (but not identical) condition and > >> didn't spend much time getting new traces after identifying the root > >> cause. > > > > Yep no worries. 
I'll reword it to incorporate the full path from my mail. > > > >> I went back and grabbed some real world system-wide stack traces, since I > >> now > >> know what to trigger on. A typical example is: > >> > >> interrupt_return_srr_user() > >> interrupt_exit_user_prepare() > >> interrupt_exit_user_prepare_main() > >>schedule() > >> __schedule() > >> __switch_to() > >> giveup_all() > >># tsk->thread.regs->msr MSR_FP is still set here > >>__giveup_fpu() > >> save_fpu() > >> # fr0 is clobbered and potentially live in userspace > > > > fr0 is not live there. > > > ie. it clears the FP etc. bits from the task's MSR. That means the FP > > state will be reloaded from the thread struct before the task is run again. > > So a little more detail on this, just to put it to rest properly vs. assuming > hand analysis caught every possible pathway. :) > > The debugging that generates this stack trace also verifies the following in > __giveup_fpu(): > > 1.) tsk->thread.fp_state.fpr doesn't contain the FPSCR contents prior to > calling save_fpu() > 2.) tsk->thread.fp_state.fpr contains the FPSCR contents directly after > calling save_fpu() > 3.) MSR_FP is set both in the task struct and in the live MSR. > > Only if all three conditions are met will it generate the trace. This is a > generalization of the hack I used to find the problem in the first place. > > If the state will subsequently be reloaded from the thread struct, that means > we're reloading the registers from the thread struct that we just verified > was corrupted by the earlier save_fpu() call. There are only two ways I can > see for that to be true -- one is if the registers were already clobbered > when giveup_all() was entered, and the other is if save_fpu() went ahead and > clobbered them right here inside giveup_all(). > > To see which scenario we were dealing with, I added a bit more > instrumentation to dump the current register state if MSR_FP bit was already > set in registers (i.e. 
not dumping data from task struct, but using the live > FPU registers instead), and sure enough the registers are corrupt on entry, > so something else has already called save_fpu() before we even hit > giveup_all() in this call chain. > > Unless I'm missing something, doesn't this effectively mean that anything > interrupting a task can hit this bug? Or, put another way, I'm seeing > several processes hit this exact call chain with the corrupt register going > back out to userspace without io_uring even in the mix, so there seems to be > another pathway in play. These traces are from a qemu guest, in case it > matters given the kvm path is possibly susceptible. > >
Re: [PATCH v9 00/27] Add support for QMC HDLC, framer infrastructure and PEF2256 framer
Hi Mark, Jakub, Qiang, Li,

On Mon, 20 Nov 2023 13:30:08 + Mark Brown wrote:
> On Fri, Nov 17, 2023 at 04:47:46PM -0800, Jakub Kicinski wrote:
> > On Wed, 15 Nov 2023 15:39:36 +0100 Herve Codina wrote:
> > > - Removed Patches 6, 7 and 8 (patches applied)
> > >
> > > - Patches 7, 20, 21, 23 (patches 10, 23, 24, 26 in v8)
> > >   Add 'Acked-by: Jakub Kicinski'
> >
> > I thought someone (Mark?) asked for the networking stuff to be put
> > on a branch. If that's still the preference - is it possible to factor
> > these out as a standalone series, too? Will they build on their own?
>
> Yes, can we *please* at least get the generic non-driver bits of this
> series moving - they seem uncontroversial as far as I can see and are a
> tiny portion of the overall 20 patches. Patches 21-23 look like they
> can go on a branch in the net tree?

Patch 21 is the framer infrastructure.
Patches 22-25 are the driver for the PEF2256 framer.
Note that patch 24 is the pinmux part of the framer and, IMHO, can be
taken too.

Patch 23 needs to be fixed (kernel test robot). The fix will be quite
minor (depends on HAS_IOMEM in the Kconfig file).

For the SoC part (QUICC ENGINE QMC and TSA), what will be the plan?
Qiang, Li, any opinion?

I plan to send the v10 with patch 23 fixed. Based on that v10, some
patches (21 to 25 at least) could be applied, and I will remove them
from the future v11. I think it will be easier to follow if I iterate
on the series, removing patches as soon as they are applied.

Of course, please let me know if this is not the right way to do it.

Best regards,
Hervé
[PATCH v5 1/2] ASoC: dt-bindings: sound-card-common: List sound widgets ignoring system suspend
Add a property to list audio sound widgets which are marked ignoring
system suspend. Paths between these endpoints are still active over
suspend of the main application processor that the current operating
system is running.

Signed-off-by: Chancel Liu

---
 .../devicetree/bindings/sound/sound-card-common.yaml | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/sound/sound-card-common.yaml b/Documentation/devicetree/bindings/sound/sound-card-common.yaml
index 3a941177f684..721950f65748 100644
--- a/Documentation/devicetree/bindings/sound/sound-card-common.yaml
+++ b/Documentation/devicetree/bindings/sound/sound-card-common.yaml
@@ -17,6 +17,13 @@ properties:
       pair of strings, the first being the connection's sink, the second
       being the connection's source.

+  ignore-suspend-widgets:
+    $ref: /schemas/types.yaml#/definitions/non-unique-string-array
+    description: |
+      A list of audio sound widgets which are marked ignoring system suspend.
+      Paths between these endpoints are still active over suspend of the main
+      application processor that the current operating system is running.
+
   model:
     $ref: /schemas/types.yaml#/definitions/string
     description: User specified audio sound card name
--
2.42.0
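A hypothetical usage sketch of the new property (node name and layout are made up for illustration; the widget names match those registered by the imx-rpmsg machine driver in patch 2/2, and "fsl,enable-lpa" comes from that same patch):

```dts
sound-rpmsg {
	/* other card properties elided; node name is hypothetical */
	fsl,enable-lpa;
	ignore-suspend-widgets = "Headphone Jack", "Ext Spk";
};
```

Paths through "Headphone Jack" and "Ext Spk" would then stay active while the application processor is suspended.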
[PATCH v5 2/2] ASoC: imx-rpmsg: Force codec power on in low power audio mode
Low power audio mode requires the binding codec to stay powered on while
the Acore enters suspend, so the Mcore can continue music playback. The
ASoC machine driver acquires DAPM endpoints through reading the
"ignore-suspend-widgets" property from DT and then forces the path
between these endpoints to ignore suspend. If the rpmsg sound card is in
low power audio mode, the suspend/resume callbacks of the binding codec
are overridden to disable suspend/resume.

Signed-off-by: Chancel Liu

---
 sound/soc/fsl/imx-rpmsg.c | 61 +--
 1 file changed, 59 insertions(+), 2 deletions(-)

diff --git a/sound/soc/fsl/imx-rpmsg.c b/sound/soc/fsl/imx-rpmsg.c
index a0c5c35817dd..e5bd63dab10c 100644
--- a/sound/soc/fsl/imx-rpmsg.c
+++ b/sound/soc/fsl/imx-rpmsg.c
@@ -2,9 +2,8 @@
 // Copyright 2017-2020 NXP

 #include
-#include
+#include
 #include
-#include
 #include
 #include
 #include
@@ -21,8 +20,11 @@ struct imx_rpmsg {
 	struct snd_soc_dai_link dai;
 	struct snd_soc_card card;
 	unsigned long sysclk;
+	bool lpa;
 };

+static struct dev_pm_ops lpa_pm;
+
 static const struct snd_soc_dapm_widget imx_rpmsg_dapm_widgets[] = {
 	SND_SOC_DAPM_HP("Headphone Jack", NULL),
 	SND_SOC_DAPM_SPK("Ext Spk", NULL),
@@ -39,6 +41,58 @@ static int imx_rpmsg_late_probe(struct snd_soc_card *card)
 	struct device *dev = card->dev;
 	int ret;

+	if (data->lpa) {
+		struct snd_soc_component *codec_comp;
+		struct device_node *codec_np;
+		struct device_driver *codec_drv;
+		struct device *codec_dev = NULL;
+
+		codec_np = data->dai.codecs->of_node;
+		if (codec_np) {
+			struct platform_device *codec_pdev;
+			struct i2c_client *codec_i2c;
+
+			codec_i2c = of_find_i2c_device_by_node(codec_np);
+			if (codec_i2c)
+				codec_dev = &codec_i2c->dev;
+			if (!codec_dev) {
+				codec_pdev = of_find_device_by_node(codec_np);
+				if (codec_pdev)
+					codec_dev = &codec_pdev->dev;
+			}
+		}
+		if (codec_dev) {
+			codec_comp = snd_soc_lookup_component_nolocked(codec_dev, NULL);
+			if (codec_comp) {
+				int i, num_widgets;
+				const char *widgets;
+				struct snd_soc_dapm_context *dapm;
+
+				num_widgets = of_property_count_strings(data->card.dev->of_node,
+									"ignore-suspend-widgets");
+				for (i = 0; i < num_widgets; i++) {
+					of_property_read_string_index(data->card.dev->of_node,
+								      "ignore-suspend-widgets",
+								      i, &widgets);
+					dapm = snd_soc_component_get_dapm(codec_comp);
+					snd_soc_dapm_ignore_suspend(dapm, widgets);
+				}
+			}
+			codec_drv = codec_dev->driver;
+			if (codec_drv->pm) {
+				memcpy(&lpa_pm, codec_drv->pm, sizeof(lpa_pm));
+				lpa_pm.suspend = NULL;
+				lpa_pm.resume = NULL;
+				lpa_pm.freeze = NULL;
+				lpa_pm.thaw = NULL;
+				lpa_pm.poweroff = NULL;
+				lpa_pm.restore = NULL;
+				codec_drv->pm = &lpa_pm;
+			}
+			put_device(codec_dev);
+		}
+	}
+
 	if (!data->sysclk)
 		return 0;
@@ -138,6 +192,9 @@ static int imx_rpmsg_probe(struct platform_device *pdev)
 		goto fail;
 	}

+	if (of_property_read_bool(np, "fsl,enable-lpa"))
+		data->lpa = true;
+
 	data->card.dev = &pdev->dev;
 	data->card.owner = THIS_MODULE;
 	data->card.dapm_widgets = imx_rpmsg_dapm_widgets;
--
2.42.0
Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save
- Original Message - > From: "Timothy Pearson" > To: "Michael Ellerman" > Cc: "Jens Axboe" , "regressions" > , "npiggin" , > "christophe leroy" , "linuxppc-dev" > > Sent: Monday, November 20, 2023 10:10:32 PM > Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec > register save > - Original Message - >> From: "Michael Ellerman" >> To: "Timothy Pearson" >> Cc: "Jens Axboe" , "regressions" >> , >> "npiggin" , >> "christophe leroy" , "linuxppc-dev" >> >> Sent: Monday, November 20, 2023 5:39:52 PM >> Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec >> register save > >> Timothy Pearson writes: >>> - Original Message - From: "Michael Ellerman" >> ... But we now have a new path, because io-uring can call copy_process() via create_io_thread() from the signal handling path. That's OK if the signal is handled as we return from a syscall, but it's not OK if the signal is handled due to some other interrupt. Which is: interrupt_return_srr_user() interrupt_exit_user_prepare() interrupt_exit_user_prepare_main() do_notify_resume() get_signal() task_work_run() create_worker_cb() create_io_worker() copy_process() dup_task_struct() arch_dup_task_struct() flush_all_to_thread() save_all() if (tsk->thread.regs->msr & MSR_FP) save_fpu() # fr0 is clobbered and potentially live in userspace So tldr I think the corruption is only an issue since io-uring started doing the clone via signal, which I think matches the observed timeline of this bug appearing. >>> >>> I agree the corruption really only started showing up in earnest on >>> io_uring clone-via-signal, as this was confirmed several times in the >>> course of debugging. >> >> Thanks. >> >>> Note as well that I may very well have a wrong call order in the >>> commit message, since I was relying on a couple of WARN_ON() macros I >>> inserted to check for a similar (but not identical) condition and >>> didn't spend much time getting new traces after identifying the root >>> cause. 
>> >> Yep no worries. I'll reword it to incorporate the full path from my mail. >> >>> I went back and grabbed some real world system-wide stack traces, since I >>> now >>> know what to trigger on. A typical example is: >>> >>> interrupt_return_srr_user() >>> interrupt_exit_user_prepare() >>> interrupt_exit_user_prepare_main() >>>schedule() >>> __schedule() >>> __switch_to() >>> giveup_all() >>># tsk->thread.regs->msr MSR_FP is still set here >>>__giveup_fpu() >>> save_fpu() >>> # fr0 is clobbered and potentially live in userspace >> >> fr0 is not live there. > >> ie. it clears the FP etc. bits from the task's MSR. That means the FP >> state will be reloaded from the thread struct before the task is run again. > > So a little more detail on this, just to put it to rest properly vs. assuming > hand analysis caught every possible pathway. :) > > The debugging that generates this stack trace also verifies the following in > __giveup_fpu(): > > 1.) tsk->thread.fp_state.fpr doesn't contain the FPSCR contents prior to > calling > save_fpu() > 2.) tsk->thread.fp_state.fpr contains the FPSCR contents directly after > calling > save_fpu() > 3.) MSR_FP is set both in the task struct and in the live MSR. > > Only if all three conditions are met will it generate the trace. This is a > generalization of the hack I used to find the problem in the first place. > > If the state will subsequently be reloaded from the thread struct, that means > we're reloading the registers from the thread struct that we just verified was > corrupted by the earlier save_fpu() call. There are only two ways I can see > for that to be true -- one is if the registers were already clobbered when > giveup_all() was entered, and the other is if save_fpu() went ahead and > clobbered them right here inside giveup_all(). > > To see which scenario we were dealing with, I added a bit more instrumentation > to dump the current register state if MSR_FP bit was already set in registers > (i.e. 
not dumping data from task struct, but using the live FPU registers > instead), and sure enough the registers are corrupt on entry, so something > else > has already called save_fpu() before we even hit giveup_all() in this call > chain. > > Unless I'm missing something, doesn't this effectively mean that anything > interrupting a task can hit this bug? Or, put another way, I'm seeing several > processes hit this exact call chain with the corrupt register going back out > to > userspace without
Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save
- Original Message - > From: "Michael Ellerman" > To: "Timothy Pearson" > Cc: "Jens Axboe" , "regressions" > , "npiggin" , > "christophe leroy" , "linuxppc-dev" > > Sent: Monday, November 20, 2023 5:39:52 PM > Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec > register save > Timothy Pearson writes: >> - Original Message - >>> From: "Michael Ellerman" > ... >>> >>> But we now have a new path, because io-uring can call copy_process() via >>> create_io_thread() from the signal handling path. That's OK if the signal is >>> handled as we return from a syscall, but it's not OK if the signal is >>> handled >>> due to some other interrupt. >>> >>> Which is: >>> >>> interrupt_return_srr_user() >>> interrupt_exit_user_prepare() >>>interrupt_exit_user_prepare_main() >>> do_notify_resume() >>>get_signal() >>> task_work_run() >>>create_worker_cb() >>> create_io_worker() >>>copy_process() >>> dup_task_struct() >>>arch_dup_task_struct() >>> flush_all_to_thread() >>>save_all() >>> if (tsk->thread.regs->msr & MSR_FP) >>>save_fpu() >>># fr0 is clobbered and potentially live in >>> userspace >>> >>> >>> So tldr I think the corruption is only an issue since io-uring started doing >>> the clone via signal, which I think matches the observed timeline of this >>> bug >>> appearing. >> >> I agree the corruption really only started showing up in earnest on >> io_uring clone-via-signal, as this was confirmed several times in the >> course of debugging. > > Thanks. > >> Note as well that I may very well have a wrong call order in the >> commit message, since I was relying on a couple of WARN_ON() macros I >> inserted to check for a similar (but not identical) condition and >> didn't spend much time getting new traces after identifying the root >> cause. > > Yep no worries. I'll reword it to incorporate the full path from my mail. > >> I went back and grabbed some real world system-wide stack traces, since I now >> know what to trigger on. 
A typical example is: >> >> interrupt_return_srr_user() >> interrupt_exit_user_prepare() >> interrupt_exit_user_prepare_main() >>schedule() >> __schedule() >> __switch_to() >> giveup_all() >># tsk->thread.regs->msr MSR_FP is still set here >>__giveup_fpu() >> save_fpu() >> # fr0 is clobbered and potentially live in userspace > > fr0 is not live there. > ie. it clears the FP etc. bits from the task's MSR. That means the FP > state will be reloaded from the thread struct before the task is run again. So a little more detail on this, just to put it to rest properly vs. assuming hand analysis caught every possible pathway. :) The debugging that generates this stack trace also verifies the following in __giveup_fpu(): 1.) tsk->thread.fp_state.fpr doesn't contain the FPSCR contents prior to calling save_fpu() 2.) tsk->thread.fp_state.fpr contains the FPSCR contents directly after calling save_fpu() 3.) MSR_FP is set both in the task struct and in the live MSR. Only if all three conditions are met will it generate the trace. This is a generalization of the hack I used to find the problem in the first place. If the state will subsequently be reloaded from the thread struct, that means we're reloading the registers from the thread struct that we just verified was corrupted by the earlier save_fpu() call. There are only two ways I can see for that to be true -- one is if the registers were already clobbered when giveup_all() was entered, and the other is if save_fpu() went ahead and clobbered them right here inside giveup_all(). To see which scenario we were dealing with, I added a bit more instrumentation to dump the current register state if MSR_FP bit was already set in registers (i.e. not dumping data from task struct, but using the live FPU registers instead), and sure enough the registers are corrupt on entry, so something else has already called save_fpu() before we even hit giveup_all() in this call chain. 
Unless I'm missing something, doesn't this effectively mean that anything interrupting a task can hit this bug? Or, put another way, I'm seeing several processes hit this exact call chain with the corrupt register going back out to userspace without io_uring even in the mix, so there seems to be another pathway in play. These traces are from a qemu guest, in case it matters given the kvm path is possibly susceptible. Just a few things to think about. The FPU patch itself definitely resolves the problems; I used a sledgehammer approach *specifically* so that there is no place for a rare call sequence we didn't consider to hit it again down the line. :)
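The distinction Michael draws in this thread — __giveup_fpu() clears MSR_FP so the thread-struct copy gets reloaded, while save_all() leaves MSR_FP set — can be captured in a toy model. This is plain Python with illustrative names, not kernel code; it models only the invariant, not the real register file.

```python
# Toy model of powerpc lazy FP state handling. A save that clobbers
# fr0 (the pre-fix save_fpu() used fr0 as scratch for the FPSCR) is
# only harmless if MSR_FP is cleared afterwards, forcing a reload.

MSR_FP = 0x2000

class Task:
    def __init__(self):
        self.live_fpr = {"fr0": 1.5}    # registers as userspace sees them
        self.saved_fpr = {"fr0": None}  # thread-struct save area
        self.msr = MSR_FP               # FP state currently live

def save_fpu(task):
    # Save FP regs to the thread struct; models the pre-fix bug where
    # fr0 is used as scratch to store the FPSCR, clobbering it.
    task.saved_fpr["fr0"] = task.live_fpr["fr0"]
    task.live_fpr["fr0"] = "FPSCR"

def giveup_fpu(task):
    # save_fpu() plus clearing MSR_FP, as __giveup_fpu() does.
    save_fpu(task)
    task.msr &= ~MSR_FP

def return_to_user(task):
    # If MSR_FP was cleared, FP state is reloaded from the thread
    # struct before userspace runs; otherwise the live (possibly
    # clobbered) registers are what userspace sees.
    if not (task.msr & MSR_FP):
        task.live_fpr = dict(task.saved_fpr)
    return task.live_fpr["fr0"]

# giveup path (__switch_to -> giveup_all): safe, fr0 is reloaded
t1 = Task()
giveup_fpu(t1)
assert return_to_user(t1) == 1.5

# save_all path (io_uring clone via signal): MSR_FP stays set,
# so the clobbered fr0 goes back out to userspace
t2 = Task()
save_fpu(t2)
assert return_to_user(t2) == "FPSCR"
```

The sledgehammer fix — never clobbering fr0/vs0 in the save path at all — makes both branches safe regardless of which caller skips the giveup logic.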
Re: Potential config regression after 89cde455 ("kexec: consolidate kexec and crash options into kernel/Kconfig.kexec")
Eric DeVolder's Oracle mail address is not available anymore; I've added
his current mail address, which he told me.

On 11/20/23 at 10:52pm, Ignat Korchagin wrote:
> Good day!
>
> We have recently started to evaluate Linux 6.6 and noticed that we
> cannot disable CONFIG_KEXEC anymore, but keep CONFIG_CRASH_DUMP
> enabled. It seems to be related to commit 89cde455 ("kexec:
> consolidate kexec and crash options into kernel/Kconfig.kexec"), where
> a CONFIG_KEXEC dependency was added to CONFIG_CRASH_DUMP.
>
> In our current kernel (Linux 6.1) we only enable CONFIG_KEXEC_FILE
> with enforced signature check to support the kernel crash dumping
> functionality and would like to keep CONFIG_KEXEC disabled for
> security reasons [1].
>
> I was reading the long commit message, but the reason for adding
> CONFIG_KEXEC as a dependency for CONFIG_CRASH_DUMP evaded me. And I
> believe from the implementation perspective CONFIG_KEXEC_FILE should
> suffice here (as we successfully used it for crashdumps on Linux 6.1).
>
> Is there a reason for adding this dependency or is it just an
> oversight? Would some solution of requiring either CONFIG_KEXEC or
> CONFIG_KEXEC_FILE work here?

I searched the patch history and found that Eric didn't add the
dependency on CONFIG_KEXEC at the beginning. Later a linux-next build
failure with randconfig was reported, in which CONFIG_CRASH_DUMP was
enabled while CONFIG_KEXEC was disabled. Finally Eric added the KEXEC
dependency for CRASH_DUMP. Please see the link below for more details:

https://lore.kernel.org/all/3e8eecd1-a277-2cfb-690e-5de2eb7b9...@oracle.com/T/#u

And besides, the newly added CONFIG_CRASH_HOTPLUG also needs
CONFIG_KEXEC if the elfcorehdr is allowed to be manipulated when
cpu/memory hotplug happened.

Thanks
Baoquan
Re: Potential config regression after 89cde455 ("kexec: consolidate kexec and crash options into kernel/Kconfig.kexec")
Ignat Korchagin writes:
> Good day!
>
> We have recently started to evaluate Linux 6.6 and noticed that we
> cannot disable CONFIG_KEXEC anymore, but keep CONFIG_CRASH_DUMP
> enabled. It seems to be related to commit 89cde455 ("kexec:
> consolidate kexec and crash options into kernel/Kconfig.kexec"), where
> a CONFIG_KEXEC dependency was added to CONFIG_CRASH_DUMP.
>
> In our current kernel (Linux 6.1) we only enable CONFIG_KEXEC_FILE
> with enforced signature check to support the kernel crash dumping
> functionality and would like to keep CONFIG_KEXEC disabled for
> security reasons [1].
>
> I was reading the long commit message, but the reason for adding
> CONFIG_KEXEC as a dependency for CONFIG_CRASH_DUMP evaded me. And I
> believe from the implementation perspective CONFIG_KEXEC_FILE should
> suffice here (as we successfully used it for crashdumps on Linux 6.1).
>
> Is there a reason for adding this dependency or is it just an
> oversight? Would some solution of requiring either CONFIG_KEXEC or
> CONFIG_KEXEC_FILE work here?

I don't actually see any reason for CRASH_DUMP to depend on KEXEC or
KEXEC_FILE. None of the old CRASH_DUMP symbols depended on KEXEC AFAICS,
using something like:

  $ git diff 89cde455..95d1fef5 | grep -A 3 "^-.*config CRASH_DUMP"

It's reasonable to want to build a kernel that supports CRASH_DUMP (ie.
can be a dump kernel), but doesn't support kexec and requires a regular
reboot. Though I doubt anyone does that in practice?

cheers
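One shape the relaxation discussed above could take — a sketch only, against kernel/Kconfig.kexec, with the entry's remaining dependencies elided and untested:

```
config CRASH_DUMP
	bool "kernel crash dumps"
	# Sketch: accept either kexec flavour, so a kexec_file_load()-only
	# kernel (KEXEC_FILE with enforced signature checks) can still take
	# crash dumps without enabling the kexec_load() syscall.
	depends on KEXEC || KEXEC_FILE
	# (other dependencies and selects unchanged)
```

This matches Ignat's observation that CONFIG_KEXEC_FILE sufficed for crash dumps on 6.1.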
Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save
- Original Message -
> From: "Michael Ellerman"
> To: "Timothy Pearson"
> Cc: "Jens Axboe", "regressions", "npiggin", "christophe leroy", "linuxppc-dev"
> Sent: Monday, November 20, 2023 5:39:52 PM
> Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

> Timothy Pearson writes:
>> - Original Message -
>>> From: "Michael Ellerman"
> ...
>>>
>>> But we now have a new path, because io-uring can call copy_process() via
>>> create_io_thread() from the signal handling path. That's OK if the signal
>>> is handled as we return from a syscall, but it's not OK if the signal is
>>> handled due to some other interrupt.
>>>
>>> Which is:
>>>
>>> interrupt_return_srr_user()
>>>   interrupt_exit_user_prepare()
>>>     interrupt_exit_user_prepare_main()
>>>       do_notify_resume()
>>>         get_signal()
>>>           task_work_run()
>>>             create_worker_cb()
>>>               create_io_worker()
>>>                 copy_process()
>>>                   dup_task_struct()
>>>                     arch_dup_task_struct()
>>>                       flush_all_to_thread()
>>>                         save_all()
>>>                           if (tsk->thread.regs->msr & MSR_FP)
>>>                             save_fpu()
>>>                             # fr0 is clobbered and potentially live in userspace
>>>
>>> So tldr I think the corruption is only an issue since io-uring started
>>> doing the clone via signal, which I think matches the observed timeline
>>> of this bug appearing.
>>
>> I agree the corruption really only started showing up in earnest on
>> io_uring clone-via-signal, as this was confirmed several times in the
>> course of debugging.
>
> Thanks.
>
>> Note as well that I may very well have a wrong call order in the
>> commit message, since I was relying on a couple of WARN_ON() macros I
>> inserted to check for a similar (but not identical) condition and
>> didn't spend much time getting new traces after identifying the root
>> cause.
>
> Yep no worries. I'll reword it to incorporate the full path from my mail.
>
>> I went back and grabbed some real world system-wide stack traces, since I
>> now know what to trigger on. A typical example is:
>>
>> interrupt_return_srr_user()
>>   interrupt_exit_user_prepare()
>>     interrupt_exit_user_prepare_main()
>>       schedule()
>>         __schedule()
>>           __switch_to()
>>             giveup_all()
>>               # tsk->thread.regs->msr MSR_FP is still set here
>>               __giveup_fpu()
>>                 save_fpu()
>>                 # fr0 is clobbered and potentially live in userspace
>
> fr0 is not live there.
>
> __giveup_fpu() does roughly:
>
>	msr = tsk->thread.regs->msr;
>	msr &= ~(MSR_FP|MSR_FE0|MSR_FE1);
>	msr &= ~MSR_VSX;
>	tsk->thread.regs->msr = msr;
>
> ie. it clears the FP etc. bits from the task's MSR. That means the FP
> state will be reloaded from the thread struct before the task is run again.
>
> Also on that path we're switching to another task, so we'll be reloading
> the other task's FP state before returning to userspace.
>
> So I don't see any bug there.

Yeah, you're right. I was trying to get traces while doing something else,
and didn't think that all the way through, sorry. :) It's not going to be
super easy to get a real trace (I was triggering the WARN_ON() from fr0
getting set to the FPSCR), so let's just assume it's mainly the path you
manually found above and update the commit message accordingly.

> There's only two places that call save_fpu() and skip the giveup logic,
> which is save_all() and kvmppc_save_user_regs().

Now that's interesting as well, since it might explain some issues I've
seen for years on a specific QEMU workload. Once this is backported to
stable I'll need to get the kernels updated on those boxes and see if the
issues disappear...
Re: [PATCH] powerpc: add crtsavres.o to always-y instead of extra-y
On Tue Nov 21, 2023 at 9:23 AM AEST, Masahiro Yamada wrote:
> crtsavres.o is linked to modules. However, as explained in commit
> d0e628cd817f ("kbuild: doc: clarify the difference between extra-y
> and always-y"), 'make modules' does not build extra-y.
>
> For example, the following command fails:
>
>   $ make ARCH=powerpc LLVM=1 KBUILD_MODPOST_WARN=1 mrproper ps3_defconfig modules
>   [snip]
>     LD [M]  arch/powerpc/platforms/cell/spufs/spufs.ko
>   ld.lld: error: cannot open arch/powerpc/lib/crtsavres.o: No such file or directory
>   make[3]: *** [scripts/Makefile.modfinal:56: arch/powerpc/platforms/cell/spufs/spufs.ko] Error 1
>   make[2]: *** [Makefile:1844: modules] Error 2
>   make[1]: *** [/home/masahiro/workspace/linux-kbuild/Makefile:350: __build_one_by_one] Error 2
>   make: *** [Makefile:234: __sub-make] Error 2

Thanks. Is this the correct Fixes tag?

Fixes: d0e628cd817f ("powerpc/64: Do not link crtsavres.o in vmlinux")

Hmm, looks like LLD might just do this now automatically for us too
without --save-restore-funcs (https://reviews.llvm.org/D79977). But we
probably still need it for older versions, so we still need your patch.

Reviewed-by: Nicholas Piggin

> Signed-off-by: Masahiro Yamada
> ---
>
>  arch/powerpc/lib/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
> index 51ad0397c17a..6eac63e79a89 100644
> --- a/arch/powerpc/lib/Makefile
> +++ b/arch/powerpc/lib/Makefile
> @@ -45,7 +45,7 @@ obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
>  # so it is only needed for modules, and only for older linkers which
>  # do not support --save-restore-funcs
>  ifndef CONFIG_LD_IS_BFD
> -extra-$(CONFIG_PPC64)	+= crtsavres.o
> +always-$(CONFIG_PPC64)	+= crtsavres.o
>  endif
>
>  obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \
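The kbuild semantics the patch relies on can be summed up in a minimal illustrative fragment (object names are made up, not from the patch):

```
# extra-y objects are prerequisites of the vmlinux target only, so
# 'make modules' never builds them; always-y objects are built on every
# invocation that visits the directory, including 'make modules'.
extra-y  += only-when-building-vmlinux.o
always-y += on-every-build-including-modules.o
```

Since crtsavres.o is linked into modules rather than vmlinux, always-y is the listing it needed all along.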
Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save
On Tue Nov 21, 2023 at 9:39 AM AEST, Michael Ellerman wrote: > Timothy Pearson writes: > > - Original Message - > >> From: "Michael Ellerman" > ... > >> > >> But we now have a new path, because io-uring can call copy_process() via > >> create_io_thread() from the signal handling path. That's OK if the signal > >> is > >> handled as we return from a syscall, but it's not OK if the signal is > >> handled > >> due to some other interrupt. > >> > >> Which is: > >> > >> interrupt_return_srr_user() > >> interrupt_exit_user_prepare() > >>interrupt_exit_user_prepare_main() > >> do_notify_resume() > >>get_signal() > >> task_work_run() > >>create_worker_cb() > >> create_io_worker() > >>copy_process() > >> dup_task_struct() > >>arch_dup_task_struct() > >> flush_all_to_thread() > >>save_all() > >> if (tsk->thread.regs->msr & MSR_FP) > >>save_fpu() > >># fr0 is clobbered and potentially live in > >> userspace > >> > >> > >> So tldr I think the corruption is only an issue since io-uring started > >> doing > >> the clone via signal, which I think matches the observed timeline of this > >> bug > >> appearing. > > > > I agree the corruption really only started showing up in earnest on > > io_uring clone-via-signal, as this was confirmed several times in the > > course of debugging. > > Thanks. > > > Note as well that I may very well have a wrong call order in the > > commit message, since I was relying on a couple of WARN_ON() macros I > > inserted to check for a similar (but not identical) condition and > > didn't spend much time getting new traces after identifying the root > > cause. > > Yep no worries. I'll reword it to incorporate the full path from my mail. > > > I went back and grabbed some real world system-wide stack traces, since I > > now know what to trigger on. 
> > A typical example is:
> >
> > interrupt_return_srr_user()
> >  interrupt_exit_user_prepare()
> >   interrupt_exit_user_prepare_main()
> >    schedule()
> >     __schedule()
> >      __switch_to()
> >       giveup_all()
> >        # tsk->thread.regs->msr MSR_FP is still set here
> >        __giveup_fpu()
> >         save_fpu()
> >          # fr0 is clobbered and potentially live in userspace
>
> fr0 is not live there.
>
> __giveup_fpu() does roughly:
>
>	msr = tsk->thread.regs->msr;
>	msr &= ~(MSR_FP|MSR_FE0|MSR_FE1);
>	msr &= ~MSR_VSX;
>	tsk->thread.regs->msr = msr;
>
> ie. it clears the FP etc. bits from the task's MSR. That means the FP
> state will be reloaded from the thread struct before the task is run again.
>
> Also on that path we're switching to another task, so we'll be reloading
> the other task's FP state before returning to userspace.
>
> So I don't see any bug there.
>
> There's only two places that call save_fpu() and skip the giveup logic,
> which is save_all() and kvmppc_save_user_regs().
>
> save_all() is only called via clone() so I think that's unable to
> actually cause visible register corruption as I described in my previous
> mail.
>
> I thought the KVM case was similar, as it's called via an ioctl, but
> I'll have to talk to Nick as his mail indicates otherwise.

Yeah it can, because later on it runs the guest and that will clobber other
regs. I reproduced fr14 corruption in the host easily with KVM selftests.

It should just do a "giveup" on FP/VEC (as it does with TM).

Thanks,
Nick
Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save
On Tue Nov 21, 2023 at 2:45 AM AEST, Timothy Pearson wrote: > > > - Original Message - > > From: "Michael Ellerman" > > To: "Timothy Pearson" , "Jens Axboe" > > , "regressions" > > , "npiggin" , "christophe > > leroy" , > > "linuxppc-dev" > > Sent: Monday, November 20, 2023 1:10:06 AM > > Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec > > register save > > > Hi Timothy, > > > > Great work debugging this. I think your fix is good, but I want to > > understand it > > a bit more to make sure I can explain why we haven't seen it outside of > > io-uring. > > If this can be triggered outside of io-uring then I have even more > > backporting > > in my future :} > > > > Typically save_fpu() is called from __giveup_fpu() which saves the FP regs > > and > > also *turns off FP* in the tasks MSR, meaning the kernel will reload the FP > > regs > > from the thread struct before letting the task use FP again. So in that case > > save_fpu() is free to clobber fr0 because the FP regs no longer hold live > > values > > for the task. > > > > There is another case though, which is the path via: > > copy_process() > >dup_task_struct() > > arch_dup_task_struct() > >flush_all_to_thread() > > save_all() > > > > That path saves the FP regs but leaves them live. That's meant as an > > optimisation for a process that's using FP/VSX and then calls fork(), > > leaving > > the regs live means the parent process doesn't have to take a fault after > > the > > fork to get its FP regs back. > > > > That path does clobber fr0, but fr0 is volatile across a syscall, and the > > only > > way to reach copy_process() from userspace is via a syscall. So in normal > > usage > > fr0 being clobbered across a syscall shouldn't cause data corruption. 
> > > > Even if we handle a signal on the return from the fork() syscall, the worst > > that > > happens is that the task's thread struct holds the clobbered fr0, but the > > task > > doesn't care (because fr0 is volatile across the syscall anyway). > > > > That path is something like: > > > > system_call_vectored_common() > > system_call_exception() > >sys_fork() > > kernel_clone() > >copy_process() > > dup_task_struct() > >arch_dup_task_struct() > > flush_all_to_thread() > >save_all() > > if (tsk->thread.regs->msr & MSR_FP) > >save_fpu() > ># does not clear MSR_FP from regs->msr > > syscall_exit_prepare() > >interrupt_exit_user_prepare_main() > > do_notify_resume() > >get_signal() > >handle_rt_signal64() > > prepare_setup_sigcontext() > >flush_fp_to_thread() > > if (tsk->thread.regs->msr & MSR_FP) > >giveup_fpu() > > __giveup_fpu > >save_fpu() > ># clobbered fr0 is saved, but task considers it volatile > ># across syscall anyway > > > > > > But we now have a new path, because io-uring can call copy_process() via > > create_io_thread() from the signal handling path. That's OK if the signal is > > handled as we return from a syscall, but it's not OK if the signal is > > handled > > due to some other interrupt. > > > > Which is: > > > > interrupt_return_srr_user() > > interrupt_exit_user_prepare() > >interrupt_exit_user_prepare_main() > > do_notify_resume() > >get_signal() > > task_work_run() > >create_worker_cb() > > create_io_worker() > >copy_process() > > dup_task_struct() > >arch_dup_task_struct() > > flush_all_to_thread() > >save_all() > > if (tsk->thread.regs->msr & MSR_FP) > >save_fpu() > ># fr0 is clobbered and potentially live in > > userspace > > > > > > So tldr I think the corruption is only an issue since io-uring started doing > > the clone via signal, which I think matches the observed timeline of this > > bug > > appearing. 
> > I agree the corruption really only started showing up in earnest on io_uring > clone-via-signal, as this was confirmed several times in the course of > debugging. Bear in mind however that we have seen very, very rare crashes > over several years in other tasks, and I am starting to think this bug might > be the root cause (see below). > > Note as well that I may very well have a wrong call order in the commit > message, since I was relying on a couple of WARN_ON() macros I inserted to > check for a similar (but not identical) condition and didn't spend much time > getting new traces after identifying the root cause. > > I went back and grabbed some real world system-wide stack traces, since I now > know what to trigger on. A typical example is: > >
Re: [PATCH] powerpc/lib: Avoid array bounds warnings in vec ops
On 11/20/23 17:54, Michael Ellerman wrote:
> Building with GCC 13 (which has -Warray-bounds enabled) there are several
> warnings in sstep.c along the lines of:
>
>   In function ‘do_byte_reverse’,
>       inlined from ‘do_vec_load’ at arch/powerpc/lib/sstep.c:691:3,
>       inlined from ‘emulate_loadstore’ at arch/powerpc/lib/sstep.c:3439:9:
>   arch/powerpc/lib/sstep.c:289:23: error: array subscript 2 is outside array bounds of ‘u8[16]’ {aka ‘unsigned char[16]’} [-Werror=array-bounds=]
>     289 |                 up[2] = byterev_8(up[1]);
>         |                 ~~^~
>   arch/powerpc/lib/sstep.c: In function ‘emulate_loadstore’:
>   arch/powerpc/lib/sstep.c:681:11: note: at offset 16 into object ‘u’ of size 16
>     681 |         } u = {};
>         |           ^
>
> do_byte_reverse() supports a size up to 32 bytes, but in these cases the
> caller is only passing a 16 byte buffer. In practice there is no bug,
> do_vec_load() is only called from the LOAD_VMX case in emulate_loadstore().
> That in turn is only reached when analyse_instr() recognises VMX ops, and
> in all cases the size is no greater than 16:
>
>   $ git grep -w LOAD_VMX arch/powerpc/lib/sstep.c
>   arch/powerpc/lib/sstep.c:       op->type = MKOP(LOAD_VMX, 0, 1);
>   arch/powerpc/lib/sstep.c:       op->type = MKOP(LOAD_VMX, 0, 2);
>   arch/powerpc/lib/sstep.c:       op->type = MKOP(LOAD_VMX, 0, 4);
>   arch/powerpc/lib/sstep.c:       op->type = MKOP(LOAD_VMX, 0, 16);
>
> Similarly for do_vec_store().
>
> Although the warning is incorrect, the code would be safer if it clamped
> the size from the caller to the known size of the buffer. Do that using
> min_t().
>
> Reported-by: Bagas Sanjaya
> Reported-by: Jan-Benedict Glaw
> Reported-by: Gustavo A. R. Silva
> Signed-off-by: Michael Ellerman

Reviewed-by: Gustavo A. R. Silva
Build-tested-by: Gustavo A. R. Silva

This indeed makes all those warnings go away. :)

Thanks, Michael!
--
Gustavo

---
 arch/powerpc/lib/sstep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index a4ab8625061a..a13f05cfc7db 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -688,7 +688,7 @@ static nokprobe_inline int do_vec_load(int rn, unsigned long ea,
 	if (err)
 		return err;
 	if (unlikely(cross_endian))
-		do_byte_reverse(&u.b[ea & 0xf], size);
+		do_byte_reverse(&u.b[ea & 0xf], min_t(size_t, size, sizeof(u)));
 	preempt_disable();
 	if (regs->msr & MSR_VEC)
 		put_vr(rn, &u.v);
@@ -719,7 +719,7 @@ static nokprobe_inline int do_vec_store(int rn, unsigned long ea,
 		u.v = current->thread.vr_state.vr[rn];
 	preempt_enable();
 	if (unlikely(cross_endian))
-		do_byte_reverse(&u.b[ea & 0xf], size);
+		do_byte_reverse(&u.b[ea & 0xf], min_t(size_t, size, sizeof(u)));
 	return copy_mem_out(&u.b[ea & 0xf], ea, size, regs);
 }
 #endif /* CONFIG_ALTIVEC */
[PATCH] powerpc/lib: Avoid array bounds warnings in vec ops
Building with GCC 13 (which has -Warray-bounds enabled) there are several
warnings in sstep.c along the lines of:

  In function ‘do_byte_reverse’,
      inlined from ‘do_vec_load’ at arch/powerpc/lib/sstep.c:691:3,
      inlined from ‘emulate_loadstore’ at arch/powerpc/lib/sstep.c:3439:9:
  arch/powerpc/lib/sstep.c:289:23: error: array subscript 2 is outside array bounds of ‘u8[16]’ {aka ‘unsigned char[16]’} [-Werror=array-bounds=]
    289 |                 up[2] = byterev_8(up[1]);
        |                 ~~^~
  arch/powerpc/lib/sstep.c: In function ‘emulate_loadstore’:
  arch/powerpc/lib/sstep.c:681:11: note: at offset 16 into object ‘u’ of size 16
    681 |         } u = {};
        |           ^

do_byte_reverse() supports a size up to 32 bytes, but in these cases the
caller is only passing a 16 byte buffer. In practice there is no bug,
do_vec_load() is only called from the LOAD_VMX case in emulate_loadstore().
That in turn is only reached when analyse_instr() recognises VMX ops, and
in all cases the size is no greater than 16:

  $ git grep -w LOAD_VMX arch/powerpc/lib/sstep.c
  arch/powerpc/lib/sstep.c:       op->type = MKOP(LOAD_VMX, 0, 1);
  arch/powerpc/lib/sstep.c:       op->type = MKOP(LOAD_VMX, 0, 2);
  arch/powerpc/lib/sstep.c:       op->type = MKOP(LOAD_VMX, 0, 4);
  arch/powerpc/lib/sstep.c:       op->type = MKOP(LOAD_VMX, 0, 16);

Similarly for do_vec_store().

Although the warning is incorrect, the code would be safer if it clamped
the size from the caller to the known size of the buffer. Do that using
min_t().

Reported-by: Bagas Sanjaya
Reported-by: Jan-Benedict Glaw
Reported-by: Gustavo A. R. Silva
Signed-off-by: Michael Ellerman
---
 arch/powerpc/lib/sstep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index a4ab8625061a..a13f05cfc7db 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -688,7 +688,7 @@ static nokprobe_inline int do_vec_load(int rn, unsigned long ea,
 	if (err)
 		return err;
 	if (unlikely(cross_endian))
-		do_byte_reverse(&u.b[ea & 0xf], size);
+		do_byte_reverse(&u.b[ea & 0xf], min_t(size_t, size, sizeof(u)));
 	preempt_disable();
 	if (regs->msr & MSR_VEC)
 		put_vr(rn, &u.v);
@@ -719,7 +719,7 @@ static nokprobe_inline int do_vec_store(int rn, unsigned long ea,
 		u.v = current->thread.vr_state.vr[rn];
 	preempt_enable();
 	if (unlikely(cross_endian))
-		do_byte_reverse(&u.b[ea & 0xf], size);
+		do_byte_reverse(&u.b[ea & 0xf], min_t(size_t, size, sizeof(u)));
 	return copy_mem_out(&u.b[ea & 0xf], ea, size, regs);
 }
 #endif /* CONFIG_ALTIVEC */
-- 
2.41.0
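The clamping idea in the patch is easy to exercise in plain userspace C. The sketch below is illustrative only, not the kernel's code: the function name `clamp_reverse()` and the simple in-place reversal are invented for the example; the only part that mirrors the patch is clamping the requested size to the buffer's capacity, as `min_t(size_t, size, sizeof(u))` does.

```c
#include <stddef.h>

/* Clamp the requested size to the buffer's capacity before touching it,
 * mirroring the min_t() guard added in the patch. Returns the number of
 * bytes actually reversed. All names here are made up for illustration. */
static size_t clamp_reverse(unsigned char *buf, size_t bufsize, size_t size)
{
    size_t n = size < bufsize ? size : bufsize;

    /* reverse buf[0..n-1] in place */
    for (size_t i = 0; i < n / 2; i++) {
        unsigned char tmp = buf[i];
        buf[i] = buf[n - 1 - i];
        buf[n - 1 - i] = tmp;
    }
    return n;
}
```

Even with a caller passing size == 32 against a 16-byte buffer (the situation GCC warns about), the function never writes past the buffer.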
Re: [RFC - is this a bug?] powerpc/lib/sstep: Asking for some light on this, please. :)
"Gustavo A. R. Silva" writes: > On 11/20/23 09:21, Naveen N Rao wrote: >> On Mon, Nov 20, 2023 at 08:33:45AM -0600, Gustavo A. R. Silva wrote: >>> On 11/20/23 08:25, Naveen N Rao wrote: On Fri, Nov 17, 2023 at 12:36:01PM -0600, Gustavo A. R. Silva wrote: > Hi all, > > I'm trying to fix the following -Wstringop-overflow warnings, and I'd like > to get your feedback on this, please: > > In function 'do_byte_reverse', > inlined from 'do_vec_store' at > /home/gus/linux/arch/powerpc/lib/sstep.c:722:3, > inlined from 'emulate_loadstore' at > /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9: > arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of > size 0 [-Werror=stringop-overflow=] > 287 | up[3] = tmp; > | ~~^ > arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': > arch/powerpc/lib/sstep.c:708:11: note: at offset [24, 39] into > destination object 'u' of size 16 > 708 | } u; > | ^ > In function 'do_byte_reverse', > inlined from 'do_vec_store' at > /home/gus/linux/arch/powerpc/lib/sstep.c:722:3, > inlined from 'emulate_loadstore' at > /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9: > arch/powerpc/lib/sstep.c:289:23: error: writing 8 bytes into a region of > size 0 [-Werror=stringop-overflow=] > 289 | up[2] = byterev_8(up[1]); > | ~~^~ > arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': > arch/powerpc/lib/sstep.c:708:11: note: at offset 16 into destination > object 'u' of size 16 > 708 | } u; > | ^ > In function 'do_byte_reverse', > inlined from 'do_vec_load' at > /home/gus/linux/arch/powerpc/lib/sstep.c:691:3, > inlined from 'emulate_loadstore' at > /home/gus/linux/arch/powerpc/lib/sstep.c:3439:9: > arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of > size 0 [-Werror=stringop-overflow=] > 287 | up[3] = tmp; > | ~~^ > arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': > arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into > destination object 'u' of size 16 > 681 | } u = {}; > | ^ > 
arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into
> destination object 'u' of size 16
> arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into
> destination object 'u' of size 16
>
> The origin of the issue seems to be the following calls to function
> `do_vec_store()`, when
> `size > 16`, or more precisely when `size == 32`
>
> arch/powerpc/lib/sstep.c:
> 3436                 case LOAD_VMX:
> 3437                         if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
> 3438                                 return 0;
> 3439                         err = do_vec_load(op->reg, ea, size, regs,
>                                               cross_endian);
> 3440                         break;
> ...
> 3507                 case STORE_VMX:
> 3508                         if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
> 3509                                 return 0;
> 3510                         err = do_vec_store(op->reg, ea, size, regs,
>                                               cross_endian);
> 3511                         break;

LOAD_VMX and STORE_VMX are set in analyse_instr() and size does not
exceed 16. So, the warning looks incorrect to me.

>>> Does that mean that this piece of code is never actually executed:
>>>
>>> 281        case 32: {
>>> 282                unsigned long *up = (unsigned long *)ptr;
>>> 283                unsigned long tmp;
>>> 284
>>> 285                tmp = byterev_8(up[0]);
>>> 286                up[0] = byterev_8(up[3]);
>>> 287                up[3] = tmp;
>>> 288                tmp = byterev_8(up[2]);
>>> 289                up[2] = byterev_8(up[1]);
>>> 290                up[1] = tmp;
>>> 291                break;
>>> 292        }
>>>
>>> or rather, under which conditions the above code is executed, if any?
>>
>> Please see git history to understand the commit that introduced that
>> code. You can do that by starting with a 'git blame' on the file. You'll
>> find that the commit that introduced the above hunk was af99da74333b
>> ("powerpc/sstep: Support VSX vector paired storage access
>> instructions").
>>
>> The above code path is hit via
>> do_vsx_load()->emulate_vsx_load()->do_byte_reverse()

> It seems there is another path (at least the one that triggers the
> -Wstringop-overflow warnings):
>
> emulate_step()->emulate_loadstore()->do_vec_load()->do_byte_reverse()
>
> emulate_step() {
> ...
> if (OP_IS_LOAD_STORE(type)) { The type comes from the op, which is determined in analyse_instr() > err =
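Since the question in this thread is when the 32-byte case runs at all, it can help to run the quoted hunk in isolation. The sketch below is a standalone userspace transcription of it, not the kernel source: byterev_8() is stood in for by a compiler builtin, and the wrapper name byte_reverse_32() is invented for the example.

```c
#include <stdint.h>

/* Swap the bytes within one 64-bit word, as the kernel's byterev_8()
 * effectively does. */
static uint64_t byterev_8(uint64_t x)
{
    return __builtin_bswap64(x);
}

/* Transcription of the quoted "case 32" hunk: reverse a 32-byte region
 * by byte-swapping each 64-bit word and exchanging the outer and inner
 * word pairs, so the whole buffer ends up byte-reversed end to end. */
static void byte_reverse_32(uint64_t *up)
{
    uint64_t tmp;

    tmp = byterev_8(up[0]);
    up[0] = byterev_8(up[3]);
    up[3] = tmp;
    tmp = byterev_8(up[2]);
    up[2] = byterev_8(up[1]);
    up[1] = tmp;
}
```

Feeding it {1, 2, 3, 4} shows the full 32-byte reversal: word order flips and each word is byte-swapped.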
Potential config regression after 89cde455 ("kexec: consolidate kexec and crash options into kernel/Kconfig.kexec")
Good day! We have recently started to evaluate Linux 6.6 and noticed that we cannot disable CONFIG_KEXEC anymore, but keep CONFIG_CRASH_DUMP enabled. It seems to be related to commit 89cde455 ("kexec: consolidate kexec and crash options into kernel/Kconfig.kexec"), where a CONFIG_KEXEC dependency was added to CONFIG_CRASH_DUMP. In our current kernel (Linux 6.1) we only enable CONFIG_KEXEC_FILE with enforced signature check to support the kernel crash dumping functionality and would like to keep CONFIG_KEXEC disabled for security reasons [1]. I was reading the long commit message, but the reason for adding CONFIG_KEXEC as a dependency for CONFIG_CRASH_DUMP evaded me. And I believe from the implementation perspective CONFIG_KEXEC_FILE should suffice here (as we successfully used it for crashdumps on Linux 6.1). Is there a reason for adding this dependency or is it just an oversight? Would some solution of requiring either CONFIG_KEXEC or CONFIG_KEXEC_FILE work here? Ignat [1]: https://mjg59.dreamwidth.org/28746.html
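For illustration, the relaxation suggested above might look something like the following Kconfig fragment. This is a sketch of the idea only, not the actual kernel/Kconfig.kexec text, and the real option has further selects and prompts that are omitted here:

```kconfig
# Hypothetical sketch: allow crash dumping with either kexec mechanism,
# so CONFIG_KEXEC can stay disabled when only KEXEC_FILE (e.g. with
# enforced signature checking) is wanted.
config CRASH_DUMP
	bool "kernel crash dumps"
	depends on KEXEC || KEXEC_FILE
```

Whether `depends on` or a reworked `select` is the right mechanism is exactly the question posed to the Kconfig consolidation authors above.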
Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save
Timothy Pearson writes: > - Original Message - >> From: "Michael Ellerman" ... >> >> But we now have a new path, because io-uring can call copy_process() via >> create_io_thread() from the signal handling path. That's OK if the signal is >> handled as we return from a syscall, but it's not OK if the signal is handled >> due to some other interrupt. >> >> Which is: >> >> interrupt_return_srr_user() >> interrupt_exit_user_prepare() >>interrupt_exit_user_prepare_main() >> do_notify_resume() >>get_signal() >> task_work_run() >>create_worker_cb() >> create_io_worker() >>copy_process() >> dup_task_struct() >>arch_dup_task_struct() >> flush_all_to_thread() >>save_all() >> if (tsk->thread.regs->msr & MSR_FP) >>save_fpu() >># fr0 is clobbered and potentially live in >> userspace >> >> >> So tldr I think the corruption is only an issue since io-uring started doing >> the clone via signal, which I think matches the observed timeline of this bug >> appearing. > > I agree the corruption really only started showing up in earnest on > io_uring clone-via-signal, as this was confirmed several times in the > course of debugging. Thanks. > Note as well that I may very well have a wrong call order in the > commit message, since I was relying on a couple of WARN_ON() macros I > inserted to check for a similar (but not identical) condition and > didn't spend much time getting new traces after identifying the root > cause. Yep no worries. I'll reword it to incorporate the full path from my mail. > I went back and grabbed some real world system-wide stack traces, since I now > know what to trigger on. A typical example is: > > interrupt_return_srr_user() > interrupt_exit_user_prepare() > interrupt_exit_user_prepare_main() >schedule() > __schedule() > __switch_to() > giveup_all() ># tsk->thread.regs->msr MSR_FP is still set here >__giveup_fpu() > save_fpu() > # fr0 is clobbered and potentially live in userspace fr0 is not live there. 
__giveup_fpu() does roughly:

	msr = tsk->thread.regs->msr;
	msr &= ~(MSR_FP|MSR_FE0|MSR_FE1);
	msr &= ~MSR_VSX;
	tsk->thread.regs->msr = msr;

ie. it clears the FP etc. bits from the task's MSR. That means the FP
state will be reloaded from the thread struct before the task is run again.

Also on that path we're switching to another task, so we'll be reloading
the other task's FP state before returning to userspace.

So I don't see any bug there.

There's only two places that call save_fpu() and skip the giveup logic,
which is save_all() and kvmppc_save_user_regs().

save_all() is only called via clone() so I think that's unable to
actually cause visible register corruption as I described in my previous
mail.

I thought the KVM case was similar, as it's called via an ioctl, but
I'll have to talk to Nick as his mail indicates otherwise.

cheers
[PATCH] powerpc: add crtsavres.o to always-y instead of extra-y
crtsavres.o is linked to modules. However, as explained in commit
d0e628cd817f ("kbuild: doc: clarify the difference between extra-y
and always-y"), 'make modules' does not build extra-y.

For example, the following command fails:

  $ make ARCH=powerpc LLVM=1 KBUILD_MODPOST_WARN=1 mrproper ps3_defconfig modules
  [snip]
    LD [M]  arch/powerpc/platforms/cell/spufs/spufs.ko
  ld.lld: error: cannot open arch/powerpc/lib/crtsavres.o: No such file or directory
  make[3]: *** [scripts/Makefile.modfinal:56: arch/powerpc/platforms/cell/spufs/spufs.ko] Error 1
  make[2]: *** [Makefile:1844: modules] Error 2
  make[1]: *** [/home/masahiro/workspace/linux-kbuild/Makefile:350: __build_one_by_one] Error 2
  make: *** [Makefile:234: __sub-make] Error 2

Signed-off-by: Masahiro Yamada
---
 arch/powerpc/lib/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index 51ad0397c17a..6eac63e79a89 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -45,7 +45,7 @@ obj-$(CONFIG_FUNCTION_ERROR_INJECTION)	+= error-inject.o
 # so it is only needed for modules, and only for older linkers which
 # do not support --save-restore-funcs
 ifndef CONFIG_LD_IS_BFD
-extra-$(CONFIG_PPC64)	+= crtsavres.o
+always-$(CONFIG_PPC64)	+= crtsavres.o
 endif
 
 obj-$(CONFIG_PPC_BOOK3S_64)	+= copyuser_power7.o copypage_power7.o \
-- 
2.40.1
Re: [PATCH v4 2/2] ASoC: imx-rpmsg: Force codec power on in low power audio mode
On Mon, Oct 23, 2023 at 10:07:18AM +0800, Chancel Liu wrote:
> Low power audio mode requires binding codec still power on while Acore
> enters into suspend so Mcore can continue playback music.
>
> ASoC machine driver acquires DAPM endpoints through reading
> "ignore-suspend-widgets" property from DT and then forces the path
> between these endpoints ignoring suspend.

This breaks an x86 allmodconfig build:

/build/stage/linux/sound/soc/fsl/imx-rpmsg.c: In function ‘imx_rpmsg_late_probe’:
/build/stage/linux/sound/soc/fsl/imx-rpmsg.c:60:46: error: implicit declaration of function ‘of_find_device_by_node’; did you mean ‘of_find_i2c_device_by_node’? [-Werror=implicit-function-declaration]
   60 |         codec_pdev = of_find_device_by_node(codec_np);
      |                      ^~~~~~~~~~~~~~~~~~~~~~
      |                      of_find_i2c_device_by_node
/build/stage/linux/sound/soc/fsl/imx-rpmsg.c:60:44: error: assignment to ‘struct platform_device *’ from ‘int’ makes pointer from integer without a cast [-Werror=int-conversion]
   60 |         codec_pdev = of_find_device_by_node(codec_np);
      |                    ^
cc1: all warnings being treated as errors
[PATCH 0/5] usb: gadget: udc: Convert to platform remove callback returning void
Hello, this patch set converts the platform drivers below drivers/usb/gadget to use .remove_new. These drivers all have an error path if the driver is still in use. Returning there early leaks resources, but fixing this isn't trivial, so I just added an error message. The patches don't make a difference to the drivers apart from the improved error message. See commit 5c5a7680e67b ("platform: Provide a remove callback that returns no value") for an extended explanation and the eventual goal of .remove_new(). Best regards Uwe Uwe Kleine-König (5): usb: gadget: at91_udc: Convert to platform remove callback returning void usb: gadget: fsl_udc: Convert to platform remove callback returning void usb: gadget: gr_udc: Convert to platform remove callback returning void usb: gadget: lpc32xx_udc: Convert to platform remove callback returning void usb: gadget: pxa25x_udc: Convert to platform remove callback returning void drivers/usb/gadget/udc/at91_udc.c | 13 +++-- drivers/usb/gadget/udc/fsl_udc_core.c | 13 +++-- drivers/usb/gadget/udc/gr_udc.c | 13 +++-- drivers/usb/gadget/udc/lpc32xx_udc.c | 13 +++-- drivers/usb/gadget/udc/pxa25x_udc.c | 12 +++- 5 files changed, 35 insertions(+), 29 deletions(-) base-commit: 5a82d69d48c82e89aef44483d2a129f869f3506a -- 2.42.0
[PATCH 2/5] usb: gadget: fsl_udc: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

In the error path emit an error message replacing the (less useful)
message by the core.

Apart from the improved error message there is no change in behaviour.

Signed-off-by: Uwe Kleine-König
---
 drivers/usb/gadget/udc/fsl_udc_core.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/gadget/udc/fsl_udc_core.c b/drivers/usb/gadget/udc/fsl_udc_core.c
index 2693a10eb0c7..535b6e79a198 100644
--- a/drivers/usb/gadget/udc/fsl_udc_core.c
+++ b/drivers/usb/gadget/udc/fsl_udc_core.c
@@ -2532,15 +2532,18 @@ static int fsl_udc_probe(struct platform_device *pdev)
 /* Driver removal function
  * Free resources and finish pending transactions
  */
-static int fsl_udc_remove(struct platform_device *pdev)
+static void fsl_udc_remove(struct platform_device *pdev)
 {
 	struct resource *res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 	struct fsl_usb2_platform_data *pdata = dev_get_platdata(&pdev->dev);
 	DECLARE_COMPLETION_ONSTACK(done);
 
-	if (!udc_controller)
-		return -ENODEV;
+	if (!udc_controller) {
+		dev_err(&pdev->dev,
+			"Driver still in use but removing anyhow\n");
+		return;
+	}
 
 	udc_controller->done = &done;
 	usb_del_gadget_udc(&udc_controller->gadget);
@@ -2568,8 +2571,6 @@ static int fsl_udc_remove(struct platform_device *pdev)
 	 */
 	if (pdata->exit)
 		pdata->exit(pdev);
-
-	return 0;
 }
 
 /*-
@@ -2667,7 +2668,7 @@ static const struct platform_device_id fsl_udc_devtype[] = {
 MODULE_DEVICE_TABLE(platform, fsl_udc_devtype);
 
 static struct platform_driver udc_driver = {
 	.probe		= fsl_udc_probe,
-	.remove		= fsl_udc_remove,
+	.remove_new	= fsl_udc_remove,
 	.id_table	= fsl_udc_devtype,
 	/* these suspend and resume are not usb suspend and resume */
 	.suspend	= fsl_udc_suspend,
-- 
2.42.0
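The rationale in the commit message, that the int returned from remove is simply dropped by its caller, can be shown outside the kernel. In this userspace sketch (all names invented for the example) the "core" discards the status from the legacy callback, so converting it to void plus an explicit error message, as the patch does, loses nothing:

```c
#include <stdio.h>

/* Legacy style: returns an error the caller never looks at, which gives
 * a false sense of error handling. -19 stands in for -ENODEV. */
static int legacy_remove(int bound)
{
    if (!bound)
        return -19; /* silently dropped by the caller */
    return 0;
}

static int removed_count;

/* Converted style: return void and report the problem directly,
 * mirroring the dev_err() added in the patch. The unbind proceeds
 * regardless, which is what actually happens in the driver core. */
static void new_remove(int bound)
{
    if (!bound) {
        fprintf(stderr, "Driver still in use but removing anyhow\n");
        return;
    }
    removed_count++;
}
```

Either way the device is unbound; the void variant is just honest about it and logs its own diagnostic.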
Re: [PATCH v4 07/13] powerpc/rtas: Warn if per-function lock isn't held
Nathan Lynch via B4 Relay writes: > From: Nathan Lynch > > If the function descriptor has a populated lock member, then callers > are required to hold it across calls. Now that the firmware activation > sequence is appropriately guarded, we can warn when the requirement > isn't satisfied. > > __do_enter_rtas_trace() gets reorganized a bit as a result of > performing the function descriptor lookup unconditionally now. > Reviewed-by: Aneesh Kumar K.V (IBM) > Signed-off-by: Nathan Lynch > --- > arch/powerpc/kernel/rtas.c | 21 + > 1 file changed, 9 insertions(+), 12 deletions(-) > > diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c > index e38ba05ad613..deb6289fcf9c 100644 > --- a/arch/powerpc/kernel/rtas.c > +++ b/arch/powerpc/kernel/rtas.c > @@ -685,28 +685,25 @@ static void __do_enter_rtas(struct rtas_args *args) > > static void __do_enter_rtas_trace(struct rtas_args *args) > { > - const char *name = NULL; > + struct rtas_function *func = > rtas_token_to_function(be32_to_cpu(args->token)); > > - if (args == _args) > - lockdep_assert_held(_lock); > /* > - * If the tracepoints that consume the function name aren't > - * active, avoid the lookup. > + * If there is a per-function lock, it must be held by the > + * caller. >*/ > - if ((trace_rtas_input_enabled() || trace_rtas_output_enabled())) { > - const s32 token = be32_to_cpu(args->token); > - const struct rtas_function *func = > rtas_token_to_function(token); > + if (func->lock) > + WARN_ON(!mutex_is_locked(func->lock)); > > - name = func->name; > - } > + if (args == _args) > + lockdep_assert_held(_lock); > > - trace_rtas_input(args, name); > + trace_rtas_input(args, func->name); > trace_rtas_ll_entry(args); > > __do_enter_rtas(args); > > trace_rtas_ll_exit(args); > - trace_rtas_output(args, name); > + trace_rtas_output(args, func->name); > } > > static void do_enter_rtas(struct rtas_args *args) > > -- > 2.41.0
Re: [PATCH v4 06/13] powerpc/rtas: Serialize firmware activation sequences
Nathan Lynch via B4 Relay writes: > From: Nathan Lynch > > Use the function lock API to prevent interleaving call sequences of > the ibm,activate-firmware RTAS function, which typically requires > multiple calls to complete the update. While the spec does not > specifically prohibit interleaved sequences, there's almost certainly > no advantage to allowing them. > Can we document what is the equivalent thing the userspace does? Reviewed-by: Aneesh Kumar K.V (IBM) > Signed-off-by: Nathan Lynch > --- > arch/powerpc/kernel/rtas.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c > index 52f2242d0c28..e38ba05ad613 100644 > --- a/arch/powerpc/kernel/rtas.c > +++ b/arch/powerpc/kernel/rtas.c > @@ -1753,10 +1753,14 @@ void rtas_activate_firmware(void) > return; > } > > + rtas_function_lock(RTAS_FN_IBM_ACTIVATE_FIRMWARE); > + > do { > fwrc = rtas_call(token, 0, 1, NULL); > } while (rtas_busy_delay(fwrc)); > > + rtas_function_unlock(RTAS_FN_IBM_ACTIVATE_FIRMWARE); > + > if (fwrc) > pr_err("ibm,activate-firmware failed (%i)\n", fwrc); > } > > -- > 2.41.0
Re: [PATCH v4 05/13] powerpc/rtas: Facilitate high-level call sequences
Nathan Lynch via B4 Relay writes:
> From: Nathan Lynch
>
> On RTAS platforms there is a general restriction that the OS must not
> enter RTAS on more than one CPU at a time. This low-level
> serialization requirement is satisfied by holding a spin
> lock (rtas_lock) across most RTAS function invocations.
>
> However, some pseries RTAS functions require multiple successive calls
> to complete a logical operation. Beginning a new call sequence for such a
> function may disrupt any other sequences of that function already in
> progress. Safe and reliable use of these functions effectively
> requires higher-level serialization beyond what is already done at the
> level of RTAS entry and exit.
>
> Where a sequence-based RTAS function is invoked only through
> sys_rtas(), with no in-kernel users, there is no issue as far as the
> kernel is concerned. User space is responsible for appropriately
> serializing its call sequences. (Whether user space code actually
> takes measures to prevent sequence interleaving is another matter.)
> Examples of such functions currently include ibm,platform-dump and
> ibm,get-vpd.
>
> But where a sequence-based RTAS function has both user space and
> in-kernel users, there is a hazard. Even if the in-kernel call sites
> of such a function serialize their sequences correctly, a user of
> sys_rtas() can invoke the same function at any time, potentially
> disrupting a sequence in progress.
>
> So in order to prevent disruption of kernel-based RTAS call sequences,
> they must serialize not only with themselves but also with sys_rtas()
> users, somehow. Preferably without adding global locks or adding more
> function-specific hacks to sys_rtas(). This is a prerequisite for
> adding an in-kernel call sequence of ibm,get-vpd, which is in a change
> to follow.
>
> Note that it has never been feasible for the kernel to prevent
> sys_rtas()-based sequences from being disrupted because control
> returns to user space on every call. sys_rtas()-based users of these
> functions have always been, and continue to be, responsible for
> coordinating their call sequences with other users, even those which
> may invoke the RTAS functions through less direct means than
> sys_rtas(). This is an unavoidable consequence of exposing
> sequence-based RTAS functions through sys_rtas().
>
> * Add new rtas_function_lock() and rtas_function_unlock() APIs for use
>   with sequence-based RTAS functions.
>
> * Add an optional per-function mutex to struct rtas_function. When this
>   member is set, kernel-internal callers of the RTAS function are
>   required to guard their call sequences with rtas_function_lock() and
>   rtas_function_unlock(). This requirement will be enforced in a later
>   change, after all affected call sites are updated.
>
> * Populate the lock members of function table entries where
>   serialization of call sequences is known to be necessary, along with
>   justifying commentary.
>
> * In sys_rtas(), acquire the per-function mutex when it is present.
>
> There should be no perceivable change introduced here except that
> concurrent callers of the same RTAS function via sys_rtas() may block
> on a mutex instead of spinning on rtas_lock. Changes to follow will
> add rtas_function_lock()/unlock() pairs to kernel-based call
> sequences.

Can you add an example of the last part.
I did look at patch 06 to find the details: rtas_function_lock(RTAS_FN_IBM_ACTIVATE_FIRMWARE); do { fwrc = rtas_call(token, 0, 1, NULL); } while (rtas_busy_delay(fwrc)); rtas_function_unlock(RTAS_FN_IBM_ACTIVATE_FIRMWARE); Reviewed-by: Aneesh Kumar K.V (IBM) > > Signed-off-by: Nathan Lynch > --- > arch/powerpc/include/asm/rtas.h | 2 + > arch/powerpc/kernel/rtas.c | 101 > +++- > 2 files changed, 101 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h > index b73010583a8d..9a20caba6858 100644 > --- a/arch/powerpc/include/asm/rtas.h > +++ b/arch/powerpc/include/asm/rtas.h > @@ -421,6 +421,8 @@ static inline bool rtas_function_implemented(const > rtas_fn_handle_t handle) > { > return rtas_function_token(handle) != RTAS_UNKNOWN_SERVICE; > } > +void rtas_function_lock(rtas_fn_handle_t handle); > +void rtas_function_unlock(rtas_fn_handle_t handle); > extern int rtas_token(const char *service); > extern int rtas_service_present(const char *service); > extern int rtas_call(int token, int, int, int *, ...); > diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c > index 1fc0b3fffdd1..52f2242d0c28 100644 > --- a/arch/powerpc/kernel/rtas.c > +++ b/arch/powerpc/kernel/rtas.c > @@ -18,6 +18,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -70,14 +71,34 @@ struct rtas_filter { > *ppc64le, and we want to keep it that way. It > does > *
Re: [PATCH v4 04/13] powerpc/rtas: Factor out function descriptor lookup
Nathan Lynch via B4 Relay writes: > From: Nathan Lynch > > Move the function descriptor table lookup out of rtas_function_token() > into a separate routine for use in new code to follow. No functional > change. > Reviewed-by: Aneesh Kumar K.V (IBM) > Signed-off-by: Nathan Lynch > --- > arch/powerpc/kernel/rtas.c | 31 +++ > 1 file changed, 19 insertions(+), 12 deletions(-) > > diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c > index f0051881348a..1fc0b3fffdd1 100644 > --- a/arch/powerpc/kernel/rtas.c > +++ b/arch/powerpc/kernel/rtas.c > @@ -469,29 +469,36 @@ static struct rtas_function rtas_function_table[] > __ro_after_init = { > static DEFINE_RAW_SPINLOCK(rtas_lock); > static struct rtas_args rtas_args; > > -/** > - * rtas_function_token() - RTAS function token lookup. > - * @handle: Function handle, e.g. RTAS_FN_EVENT_SCAN. > - * > - * Context: Any context. > - * Return: the token value for the function if implemented by this platform, > - * otherwise RTAS_UNKNOWN_SERVICE. > - */ > -s32 rtas_function_token(const rtas_fn_handle_t handle) > +static struct rtas_function *rtas_function_lookup(const rtas_fn_handle_t > handle) > { > const size_t index = handle.index; > const bool out_of_bounds = index >= ARRAY_SIZE(rtas_function_table); > > if (WARN_ONCE(out_of_bounds, "invalid function index %zu", index)) > - return RTAS_UNKNOWN_SERVICE; > + return NULL; > /* >* Various drivers attempt token lookups on non-RTAS >* platforms. >*/ > if (!rtas.dev) > - return RTAS_UNKNOWN_SERVICE; > + return NULL; > + > + return &rtas_function_table[index]; > +} > + > +/** > + * rtas_function_token() - RTAS function token lookup. > + * @handle: Function handle, e.g. RTAS_FN_EVENT_SCAN. > + * > + * Context: Any context. > + * Return: the token value for the function if implemented by this platform, > + * otherwise RTAS_UNKNOWN_SERVICE. 
> + */ > +s32 rtas_function_token(const rtas_fn_handle_t handle) > +{ > + const struct rtas_function *func = rtas_function_lookup(handle); > > - return rtas_function_table[index].token; > + return func ? func->token : RTAS_UNKNOWN_SERVICE; > } > EXPORT_SYMBOL_GPL(rtas_function_token); > > > -- > 2.41.0
Re: [PATCH v4 03/13] powerpc/rtas: Add function return status constants
Nathan Lynch via B4 Relay writes: > From: Nathan Lynch > > Not all of the generic RTAS function statuses specified in PAPR have > symbolic constants and descriptions in rtas.h. Fix this, providing a > little more background, slightly updating the existing wording, and > improving the formatting. > Reviewed-by: Aneesh Kumar K.V (IBM) > Signed-off-by: Nathan Lynch > --- > arch/powerpc/include/asm/rtas.h | 25 +++-- > 1 file changed, 19 insertions(+), 6 deletions(-) > > diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h > index c697c3c74694..b73010583a8d 100644 > --- a/arch/powerpc/include/asm/rtas.h > +++ b/arch/powerpc/include/asm/rtas.h > @@ -201,12 +201,25 @@ typedef struct { > /* Memory set aside for sys_rtas to use with calls that need a work area. */ > #define RTAS_USER_REGION_SIZE (64 * 1024) > > -/* RTAS return status codes */ > -#define RTAS_HARDWARE_ERROR -1 /* Hardware Error */ > -#define RTAS_BUSY -2 /* RTAS Busy */ > -#define RTAS_INVALID_PARAMETER -3 /* Invalid indicator/domain/sensor etc. */ > -#define RTAS_EXTENDED_DELAY_MIN 9900 > -#define RTAS_EXTENDED_DELAY_MAX 9905 > +/* > + * Common RTAS function return values, derived from the table "RTAS > + * Status Word Values" in PAPR+ 7.2.8: "Return Codes". If a function > + * can return a value in this table then generally it has the meaning > + * listed here. More extended commentary in the documentation for > + * rtas_call(). > + * > + * RTAS functions may use negative and positive numbers not in this > + * set for function-specific error and success conditions, > + * respectively. > + */ > +#define RTAS_SUCCESS 0 /* Success. */ > +#define RTAS_HARDWARE_ERROR -1 /* Hardware or other unspecified > error. */ > +#define RTAS_BUSY -2 /* Retry immediately. */ > +#define RTAS_INVALID_PARAMETER -3 /* Invalid indicator/domain/sensor etc. */ > +#define RTAS_UNEXPECTED_STATE_CHANGE -7 /* Seems limited to EEH and slot > reset. 
*/ > +#define RTAS_EXTENDED_DELAY_MIN 9900 /* Retry after delaying for ~1ms. > */ > +#define RTAS_EXTENDED_DELAY_MAX 9905 /* Retry after delaying for > ~100s. */ > +#define RTAS_ML_ISOLATION_ERROR -9000 /* Multi-level isolation error. */ > > /* statuses specific to ibm,suspend-me */ > #define RTAS_SUSPEND_ABORTED 9000 /* Suspension aborted */ > > -- > 2.41.0
Re: [PATCH v4 02/13] powerpc/rtas: Fall back to linear search on failed token->function lookup
Nathan Lynch via B4 Relay writes: > From: Nathan Lynch > > Enabling any of the powerpc:rtas_* tracepoints at boot is likely to > result in an oops on RTAS platforms. For example, booting a QEMU > pseries model with 'trace_event=powerpc:rtas_input' in the command > line leads to: > > BUG: Kernel NULL pointer dereference on read at 0x0008 > Oops: Kernel access of bad area, sig: 7 [#1] > NIP [c004231c] do_enter_rtas+0x1bc/0x460 > LR [c004231c] do_enter_rtas+0x1bc/0x460 > Call Trace: > do_enter_rtas+0x1bc/0x460 (unreliable) > rtas_call+0x22c/0x4a0 > rtas_get_boot_time+0x80/0x14c > read_persistent_clock64+0x124/0x150 > read_persistent_wall_and_boot_offset+0x28/0x58 > timekeeping_init+0x70/0x348 > start_kernel+0xa0c/0xc1c > start_here_common+0x1c/0x20 > > (This is preceded by a warning for the failed lookup in > rtas_token_to_function().) > > This happens when __do_enter_rtas_trace() attempts a token to function > descriptor lookup before the xarray containing the mappings has been > set up. > > Fall back to linear scan of the table if rtas_token_to_function_xarray > is empty. > Reviewed-by: Aneesh Kumar K.V (IBM) > Signed-off-by: Nathan Lynch > Fixes: 24098f580e2b ("powerpc/rtas: add tracepoints around RTAS entry") > --- > arch/powerpc/kernel/rtas.c | 18 ++ > 1 file changed, 14 insertions(+), 4 deletions(-) > > diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c > index 1ad1869e2e96..f0051881348a 100644 > --- a/arch/powerpc/kernel/rtas.c > +++ b/arch/powerpc/kernel/rtas.c > @@ -557,11 +557,21 @@ static const struct rtas_function > *rtas_token_to_function(s32 token) > return NULL; > > func = xa_load(&rtas_token_to_function_xarray, token); > + if (func) > + return func; > + /* > + * Fall back to linear scan in case the reverse mapping hasn't > + * been initialized yet. 
> + */ > + if (xa_empty(&rtas_token_to_function_xarray)) { > + for_each_rtas_function(func) { > + if (func->token == token) > + return func; > + } > + } > > - if (WARN_ONCE(!func, "unexpected failed lookup for token %d", token)) > - return NULL; > - > - return func; > + WARN_ONCE(true, "unexpected failed lookup for token %d", token); > + return NULL; > } > > /* This is here deliberately so it's only used in this file */ > > -- > 2.41.0
Re: [PATCH v4 01/13] powerpc/rtas: Add for_each_rtas_function() iterator
Nathan Lynch via B4 Relay writes: > From: Nathan Lynch > > Add a convenience macro for iterating over every element of the > internal function table and convert the one site that can use it. An > additional user of the macro is anticipated in changes to follow. > Reviewed-by: Aneesh Kumar K.V (IBM) > Signed-off-by: Nathan Lynch > --- > arch/powerpc/kernel/rtas.c | 9 +++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c > index eddc031c4b95..1ad1869e2e96 100644 > --- a/arch/powerpc/kernel/rtas.c > +++ b/arch/powerpc/kernel/rtas.c > @@ -454,6 +454,11 @@ static struct rtas_function rtas_function_table[] > __ro_after_init = { > }, > }; > > +#define for_each_rtas_function(funcp) \ > + for (funcp = &rtas_function_table[0]; \ > + funcp < &rtas_function_table[ARRAY_SIZE(rtas_function_table)]; \ > + ++funcp) > + > /* > * Nearly all RTAS calls need to be serialized. All uses of the > * default rtas_args block must hold rtas_lock. > @@ -525,10 +530,10 @@ static DEFINE_XARRAY(rtas_token_to_function_xarray); > > static int __init rtas_token_to_function_xarray_init(void) > { > + const struct rtas_function *func; > int err = 0; > > - for (size_t i = 0; i < ARRAY_SIZE(rtas_function_table); ++i) { > - const struct rtas_function *func = &rtas_function_table[i]; > + for_each_rtas_function(func) { > const s32 token = func->token; > > if (token == RTAS_UNKNOWN_SERVICE) > > -- > 2.41.0
Re: [RFC - is this a bug?] powerpc/lib/sstep: Asking for some light on this, please. :)
On 11/20/23 09:21, Naveen N Rao wrote: On Mon, Nov 20, 2023 at 08:33:45AM -0600, Gustavo A. R. Silva wrote: On 11/20/23 08:25, Naveen N Rao wrote: On Fri, Nov 17, 2023 at 12:36:01PM -0600, Gustavo A. R. Silva wrote: Hi all, I'm trying to fix the following -Wstringop-overflow warnings, and I'd like to get your feedback on this, please: In function 'do_byte_reverse', inlined from 'do_vec_store' at /home/gus/linux/arch/powerpc/lib/sstep.c:722:3, inlined from 'emulate_loadstore' at /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9: arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=] 287 | up[3] = tmp; | ~~^ arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': arch/powerpc/lib/sstep.c:708:11: note: at offset [24, 39] into destination object 'u' of size 16 708 | } u; | ^ In function 'do_byte_reverse', inlined from 'do_vec_store' at /home/gus/linux/arch/powerpc/lib/sstep.c:722:3, inlined from 'emulate_loadstore' at /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9: arch/powerpc/lib/sstep.c:289:23: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=] 289 | up[2] = byterev_8(up[1]); | ~~^~ arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': arch/powerpc/lib/sstep.c:708:11: note: at offset 16 into destination object 'u' of size 16 708 | } u; | ^ In function 'do_byte_reverse', inlined from 'do_vec_load' at /home/gus/linux/arch/powerpc/lib/sstep.c:691:3, inlined from 'emulate_loadstore' at /home/gus/linux/arch/powerpc/lib/sstep.c:3439:9: arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=] 287 | up[3] = tmp; | ~~^ arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination object 'u' of size 16 681 | } u = {}; | ^ arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination object 'u' of size 16 arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 
39] into destination object 'u' of size 16 The origin of the issue seems to be the following calls to function `do_vec_store()`, when `size > 16`, or more precisely when `size == 32` arch/powerpc/lib/sstep.c: 3436 case LOAD_VMX: 3437 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC)) 3438 return 0; 3439 err = do_vec_load(op->reg, ea, size, regs, cross_endian); 3440 break; ... 3507 case STORE_VMX: 3508 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC)) 3509 return 0; 3510 err = do_vec_store(op->reg, ea, size, regs, cross_endian); 3511 break; LOAD_VMX and STORE_VMX are set in analyse_instr() and size does not exceed 16. So, the warning looks incorrect to me. Does that mean that this piece of code is never actually executed: 281 case 32: { 282 unsigned long *up = (unsigned long *)ptr; 283 unsigned long tmp; 284 285 tmp = byterev_8(up[0]); 286 up[0] = byterev_8(up[3]); 287 up[3] = tmp; 288 tmp = byterev_8(up[2]); 289 up[2] = byterev_8(up[1]); 290 up[1] = tmp; 291 break; 292 } or rather, under which conditions the above code is executed, if any? Please see git history to understand the commit that introduced that code. You can do that by starting with a 'git blame' on the file. You'll find that the commit that introduced the above hunk was af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions"). The above code path is hit via do_vsx_load()->emulate_vsx_load()->do_byte_reverse() It seems there is another path (at least the one that triggers the -Wstringop-overflow warnings): emulate_step()->emulate_loadstore()->do_vec_load()->do_byte_reverse() emulate_step() { ... if (OP_IS_LOAD_STORE(type)) { err = emulate_loadstore(regs, &op); if (err) return 0; goto instr_done; } ... } > emulate_loadstore() { ... #ifdef CONFIG_ALTIVEC case LOAD_VMX: if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC)) return 0; err = do_vec_load(op->reg, ea, size, regs, cross_endian); // with size == 32 break; #endif ... 
#ifdef CONFIG_ALTIVEC case STORE_VMX: if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC)) return 0; err = do_vec_store(op->reg, ea, size,
Re: [PATCH] ASoC: fsl_mqs: Remove duplicate linux/of.h header
On Sun, 19 Nov 2023 10:45:14 +, Lucas Tanure wrote: > Remove linux/of.h as is included more than once. > Reported by make includecheck. > > Applied to https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next Thanks! [1/1] ASoC: fsl_mqs: Remove duplicate linux/of.h header commit: cac15dc25f416972f8dd0b6a8d74daa79dc3a998 All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark
Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save
- Original Message - > From: "Michael Ellerman" > To: "Timothy Pearson" , "Jens Axboe" > , "regressions" > , "npiggin" , "christophe > leroy" , > "linuxppc-dev" > Sent: Monday, November 20, 2023 1:10:06 AM > Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec > register save > Hi Timothy, > > Great work debugging this. I think your fix is good, but I want to understand > it > a bit more to make sure I can explain why we haven't seen it outside of > io-uring. > If this can be triggered outside of io-uring then I have even more backporting > in my future :} > > Typically save_fpu() is called from __giveup_fpu() which saves the FP regs and > also *turns off FP* in the tasks MSR, meaning the kernel will reload the FP > regs > from the thread struct before letting the task use FP again. So in that case > save_fpu() is free to clobber fr0 because the FP regs no longer hold live > values > for the task. > > There is another case though, which is the path via: > copy_process() >dup_task_struct() > arch_dup_task_struct() >flush_all_to_thread() > save_all() > > That path saves the FP regs but leaves them live. That's meant as an > optimisation for a process that's using FP/VSX and then calls fork(), leaving > the regs live means the parent process doesn't have to take a fault after the > fork to get its FP regs back. > > That path does clobber fr0, but fr0 is volatile across a syscall, and the only > way to reach copy_process() from userspace is via a syscall. So in normal > usage > fr0 being clobbered across a syscall shouldn't cause data corruption. > > Even if we handle a signal on the return from the fork() syscall, the worst > that > happens is that the task's thread struct holds the clobbered fr0, but the task > doesn't care (because fr0 is volatile across the syscall anyway). 
> > That path is something like: > > system_call_vectored_common() > system_call_exception() >sys_fork() > kernel_clone() >copy_process() > dup_task_struct() >arch_dup_task_struct() > flush_all_to_thread() >save_all() > if (tsk->thread.regs->msr & MSR_FP) >save_fpu() ># does not clear MSR_FP from regs->msr > syscall_exit_prepare() >interrupt_exit_user_prepare_main() > do_notify_resume() >get_signal() >handle_rt_signal64() > prepare_setup_sigcontext() >flush_fp_to_thread() > if (tsk->thread.regs->msr & MSR_FP) >giveup_fpu() > __giveup_fpu >save_fpu() ># clobbered fr0 is saved, but task considers it volatile ># across syscall anyway > > > But we now have a new path, because io-uring can call copy_process() via > create_io_thread() from the signal handling path. That's OK if the signal is > handled as we return from a syscall, but it's not OK if the signal is handled > due to some other interrupt. > > Which is: > > interrupt_return_srr_user() > interrupt_exit_user_prepare() >interrupt_exit_user_prepare_main() > do_notify_resume() >get_signal() > task_work_run() >create_worker_cb() > create_io_worker() >copy_process() > dup_task_struct() >arch_dup_task_struct() > flush_all_to_thread() >save_all() > if (tsk->thread.regs->msr & MSR_FP) >save_fpu() ># fr0 is clobbered and potentially live in > userspace > > > So tldr I think the corruption is only an issue since io-uring started doing > the clone via signal, which I think matches the observed timeline of this bug > appearing. I agree the corruption really only started showing up in earnest on io_uring clone-via-signal, as this was confirmed several times in the course of debugging. Bear in mind however that we have seen very, very rare crashes over several years in other tasks, and I am starting to think this bug might be the root cause (see below). 
Note as well that I may very well have a wrong call order in the commit message, since I was relying on a couple of WARN_ON() macros I inserted to check for a similar (but not identical) condition and didn't spend much time getting new traces after identifying the root cause. I went back and grabbed some real world system-wide stack traces, since I now know what to trigger on. A typical example is: interrupt_return_srr_user() interrupt_exit_user_prepare() interrupt_exit_user_prepare_main() schedule() __schedule() __switch_to() giveup_all() # tsk->thread.regs->msr MSR_FP is still set here __giveup_fpu() save_fpu() # fr0 is clobbered and potentially live in userspace This indicates that the pending
Re: [PATCH] ASoC: fsl_sai: Fix no frame sync clock issue on i.MX8MP
On Mon, 20 Nov 2023 18:05:35 +0800, Shengjiu Wang wrote: > On i.MX8MP, when the TERE and FSD_MSTR enabled before configuring > the word width, there will be no frame sync clock issue, because > old word width impact the generation of frame sync. > > TERE enabled earlier only for i.MX8MP case for the hardware limitation, > So need to disable FSD_MSTR before configuring word width, then enable > FSD_MSTR bit for this specific case. > > [...] Applied to https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next Thanks! [1/1] ASoC: fsl_sai: Fix no frame sync clock issue on i.MX8MP commit: 14e8442e0789598514f3c9de014950de9feda7a4 All being well this means that it will be integrated into the linux-next tree (usually sometime in the next 24 hours) and sent to Linus during the next merge window (or sooner if it is a bug fix), however if problems are discovered then the patch may be dropped or reverted. You may get further e-mails resulting from automated or manual testing and review of the tree, please engage with people reporting problems and send followup patches addressing any issues that are reported if needed. If any updates are required or you are submitting further changes they should be sent as incremental updates against current git, existing patches will not be replaced. Please add any relevant lists and maintainers to the CCs when replying to this mail. Thanks, Mark
Re: [RFC - is this a bug?] powerpc/lib/sstep: Asking for some light on this, please. :)
On Mon, Nov 20, 2023 at 08:33:45AM -0600, Gustavo A. R. Silva wrote: > > > On 11/20/23 08:25, Naveen N Rao wrote: > > On Fri, Nov 17, 2023 at 12:36:01PM -0600, Gustavo A. R. Silva wrote: > > > Hi all, > > > > > > I'm trying to fix the following -Wstringop-overflow warnings, and I'd like > > > to get your feedback on this, please: > > > > > > In function 'do_byte_reverse', > > > inlined from 'do_vec_store' at > > > /home/gus/linux/arch/powerpc/lib/sstep.c:722:3, > > > inlined from 'emulate_loadstore' at > > > /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9: > > > arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of > > > size 0 [-Werror=stringop-overflow=] > > >287 | up[3] = tmp; > > >| ~~^ > > > arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': > > > arch/powerpc/lib/sstep.c:708:11: note: at offset [24, 39] into > > > destination object 'u' of size 16 > > >708 | } u; > > >| ^ > > > In function 'do_byte_reverse', > > > inlined from 'do_vec_store' at > > > /home/gus/linux/arch/powerpc/lib/sstep.c:722:3, > > > inlined from 'emulate_loadstore' at > > > /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9: > > > arch/powerpc/lib/sstep.c:289:23: error: writing 8 bytes into a region of > > > size 0 [-Werror=stringop-overflow=] > > >289 | up[2] = byterev_8(up[1]); > > >| ~~^~ > > > arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': > > > arch/powerpc/lib/sstep.c:708:11: note: at offset 16 into destination > > > object 'u' of size 16 > > >708 | } u; > > >| ^ > > > In function 'do_byte_reverse', > > > inlined from 'do_vec_load' at > > > /home/gus/linux/arch/powerpc/lib/sstep.c:691:3, > > > inlined from 'emulate_loadstore' at > > > /home/gus/linux/arch/powerpc/lib/sstep.c:3439:9: > > > arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of > > > size 0 [-Werror=stringop-overflow=] > > >287 | up[3] = tmp; > > >| ~~^ > > > arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': > > > arch/powerpc/lib/sstep.c:681:11: note: at 
offset [24, 39] into > > > destination object 'u' of size 16 > > >681 | } u = {}; > > >| ^ > > > arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into > > > destination object 'u' of size 16 > > > arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into > > > destination object 'u' of size 16 > > > > > > The origing of the issue seems to be the following calls to function > > > `do_vec_store()`, when > > > `size > 16`, or more precisely when `size == 32` > > > > > > arch/powerpc/lib/sstep.c: > > > 3436 case LOAD_VMX: > > > 3437 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC)) > > > 3438 return 0; > > > 3439 err = do_vec_load(op->reg, ea, size, regs, > > > cross_endian); > > > 3440 break; > > > ... > > > 3507 case STORE_VMX: > > > 3508 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC)) > > > 3509 return 0; > > > 3510 err = do_vec_store(op->reg, ea, size, regs, > > > cross_endian); > > > 3511 break; > > > > LOAD_VMX and STORE_VMX are set in analyse_instr() and size does not > > exceed 16. So, the warning looks incorrect to me. > > Does that mean that this piece of code is never actually executed: > > 281 case 32: { > 282 unsigned long *up = (unsigned long *)ptr; > 283 unsigned long tmp; > 284 > 285 tmp = byterev_8(up[0]); > 286 up[0] = byterev_8(up[3]); > 287 up[3] = tmp; > 288 tmp = byterev_8(up[2]); > 289 up[2] = byterev_8(up[1]); > 290 up[1] = tmp; > 291 break; > 292 } > > or rather, under which conditions the above code is executed, if any? Please see git history to understand the commit that introduced that code. You can do that by starting with a 'git blame' on the file. You'll find that the commit that introduced the above hunk was af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions"). The above code path is hit via do_vsx_load()->emulate_vsx_load()->do_byte_reverse() - Naveen
Re: [RFC - is this a bug?] powerpc/lib/sstep: Asking for some light on this, please. :)
On 11/20/23 08:25, Naveen N Rao wrote: On Fri, Nov 17, 2023 at 12:36:01PM -0600, Gustavo A. R. Silva wrote: Hi all, I'm trying to fix the following -Wstringop-overflow warnings, and I'd like to get your feedback on this, please: In function 'do_byte_reverse', inlined from 'do_vec_store' at /home/gus/linux/arch/powerpc/lib/sstep.c:722:3, inlined from 'emulate_loadstore' at /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9: arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=] 287 | up[3] = tmp; | ~~^ arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': arch/powerpc/lib/sstep.c:708:11: note: at offset [24, 39] into destination object 'u' of size 16 708 | } u; | ^ In function 'do_byte_reverse', inlined from 'do_vec_store' at /home/gus/linux/arch/powerpc/lib/sstep.c:722:3, inlined from 'emulate_loadstore' at /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9: arch/powerpc/lib/sstep.c:289:23: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=] 289 | up[2] = byterev_8(up[1]); | ~~^~ arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': arch/powerpc/lib/sstep.c:708:11: note: at offset 16 into destination object 'u' of size 16 708 | } u; | ^ In function 'do_byte_reverse', inlined from 'do_vec_load' at /home/gus/linux/arch/powerpc/lib/sstep.c:691:3, inlined from 'emulate_loadstore' at /home/gus/linux/arch/powerpc/lib/sstep.c:3439:9: arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=] 287 | up[3] = tmp; | ~~^ arch/powerpc/lib/sstep.c: In function 'emulate_loadstore': arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination object 'u' of size 16 681 | } u = {}; | ^ arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination object 'u' of size 16 arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination object 'u' of size 16 The origing of the issue seems to be the following calls to 
function `do_vec_store()`, when `size > 16`, or more precisely when `size == 32` arch/powerpc/lib/sstep.c: 3436 case LOAD_VMX: 3437 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC)) 3438 return 0; 3439 err = do_vec_load(op->reg, ea, size, regs, cross_endian); 3440 break; ... 3507 case STORE_VMX: 3508 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC)) 3509 return 0; 3510 err = do_vec_store(op->reg, ea, size, regs, cross_endian); 3511 break; LOAD_VMX and STORE_VMX are set in analyse_instr() and size does not exceed 16. So, the warning looks incorrect to me. Does that mean that this piece of code is never actually executed: 281 case 32: { 282 unsigned long *up = (unsigned long *)ptr; 283 unsigned long tmp; 284 285 tmp = byterev_8(up[0]); 286 up[0] = byterev_8(up[3]); 287 up[3] = tmp; 288 tmp = byterev_8(up[2]); 289 up[2] = byterev_8(up[1]); 290 up[1] = tmp; 291 break; 292 } or rather, under which conditions the above code is executed, if any? Thanks! -- Gustavo
Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save
Yeah, awesome. On Mon Nov 20, 2023 at 5:10 PM AEST, Michael Ellerman wrote: > Hi Timothy, > > Great work debugging this. I think your fix is good, but I want to understand > it > a bit more to make sure I can explain why we haven't seen it outside of > io-uring. Analysis seems right to me. Probably the best minimal fix. But I wonder if we should just use the one path for saving/flushing/giving up, just use giveup instead of save? KVM looks odd too, and actually gets this wrong. In a way that's not fixed by Timothy's patch, because it's just not restoring userspace registers at all. Fortunately QEMU isn't in the habit of using non volatile FP/VEC registers over a VCPU ioctl, but there's no reason it couldn't do since GCC/LLVM can easily use them. KVM really wants to be using giveup. Thanks, Nick > If this can be triggered outside of io-uring then I have even more backporting > in my future :} > > Typically save_fpu() is called from __giveup_fpu() which saves the FP regs and > also *turns off FP* in the tasks MSR, meaning the kernel will reload the FP > regs > from the thread struct before letting the task use FP again. So in that case > save_fpu() is free to clobber fr0 because the FP regs no longer hold live > values > for the task. > > There is another case though, which is the path via: > copy_process() > dup_task_struct() > arch_dup_task_struct() > flush_all_to_thread() > save_all() > > That path saves the FP regs but leaves them live. That's meant as an > optimisation for a process that's using FP/VSX and then calls fork(), leaving > the regs live means the parent process doesn't have to take a fault after the > fork to get its FP regs back. > > That path does clobber fr0, but fr0 is volatile across a syscall, and the only > way to reach copy_process() from userspace is via a syscall. So in normal > usage > fr0 being clobbered across a syscall shouldn't cause data corruption. 
> > Even if we handle a signal on the return from the fork() syscall, the worst > that > happens is that the task's thread struct holds the clobbered fr0, but the task > doesn't care (because fr0 is volatile across the syscall anyway). > > That path is something like: > > system_call_vectored_common() > system_call_exception() > sys_fork() > kernel_clone() > copy_process() > dup_task_struct() > arch_dup_task_struct() > flush_all_to_thread() > save_all() > if (tsk->thread.regs->msr & MSR_FP) > save_fpu() > # does not clear MSR_FP from regs->msr > syscall_exit_prepare() > interrupt_exit_user_prepare_main() > do_notify_resume() > get_signal() > handle_rt_signal64() > prepare_setup_sigcontext() > flush_fp_to_thread() > if (tsk->thread.regs->msr & MSR_FP) > giveup_fpu() > __giveup_fpu > save_fpu() > # clobbered fr0 is saved, but task considers it volatile > # across syscall anyway > > > But we now have a new path, because io-uring can call copy_process() via > create_io_thread() from the signal handling path. That's OK if the signal is > handled as we return from a syscall, but it's not OK if the signal is handled > due to some other interrupt. > > Which is: > > interrupt_return_srr_user() > interrupt_exit_user_prepare() > interrupt_exit_user_prepare_main() > do_notify_resume() > get_signal() > task_work_run() > create_worker_cb() > create_io_worker() > copy_process() > dup_task_struct() > arch_dup_task_struct() > flush_all_to_thread() > save_all() > if (tsk->thread.regs->msr & MSR_FP) > save_fpu() > # fr0 is clobbered and potentially live in > userspace > > > So tldr I think the corruption is only an issue since io-uring started doing > the clone via signal, which I think matches the observed timeline of this bug > appearing. > > Gotta run home, will have a closer look at the actual patch later on. > > cheers > > > Timothy Pearson writes: > > During floating point and vector save to thread data fr0/vs0 are clobbered > > by the FPSCR/VSCR store routine. 
> > This leads to userspace register corruption and application data
> > corruption / crash under the following rare conditions:
> >
> > * A userspace thread is executing with VSX/FP mode enabled
> > * The userspace thread is making active use of fr0 and/or vs0
> > * An IPI is taken in kernel mode, forcing the userspace thread to reschedule
> > * The userspace thread is interrupted by the IPI before accessing data it
> >   previously stored in fr0/vs0
> > * The thread being switched in by the IPI has a pending
Re: [RFC - is this a bug?] powerpc/lib/sstep: Asking for some light on this, please. :)
On Fri, Nov 17, 2023 at 12:36:01PM -0600, Gustavo A. R. Silva wrote:
> Hi all,
>
> I'm trying to fix the following -Wstringop-overflow warnings, and I'd like
> to get your feedback on this, please:
>
> In function 'do_byte_reverse',
>     inlined from 'do_vec_store' at /home/gus/linux/arch/powerpc/lib/sstep.c:722:3,
>     inlined from 'emulate_loadstore' at /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9:
> arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=]
>   287 |                 up[3] = tmp;
>       |                 ~~^
> arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
> arch/powerpc/lib/sstep.c:708:11: note: at offset [24, 39] into destination object 'u' of size 16
>   708 |         } u;
>       |           ^
> In function 'do_byte_reverse',
>     inlined from 'do_vec_store' at /home/gus/linux/arch/powerpc/lib/sstep.c:722:3,
>     inlined from 'emulate_loadstore' at /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9:
> arch/powerpc/lib/sstep.c:289:23: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=]
>   289 |                 up[2] = byterev_8(up[1]);
>       |                 ~~^~
> arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
> arch/powerpc/lib/sstep.c:708:11: note: at offset 16 into destination object 'u' of size 16
>   708 |         } u;
>       |           ^
> In function 'do_byte_reverse',
>     inlined from 'do_vec_load' at /home/gus/linux/arch/powerpc/lib/sstep.c:691:3,
>     inlined from 'emulate_loadstore' at /home/gus/linux/arch/powerpc/lib/sstep.c:3439:9:
> arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=]
>   287 |                 up[3] = tmp;
>       |                 ~~^
> arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
> arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination object 'u' of size 16
>   681 |         } u = {};
>       |           ^
> arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination object 'u' of size 16
> arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination object 'u' of size 16
>
> The origin of the issue seems to be the following calls to function
> `do_vec_store()`, when `size > 16`, or more precisely when `size == 32`:
>
> arch/powerpc/lib/sstep.c:
> 3436         case LOAD_VMX:
> 3437                 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
> 3438                         return 0;
> 3439                 err = do_vec_load(op->reg, ea, size, regs, cross_endian);
> 3440                 break;
> ...
> 3507         case STORE_VMX:
> 3508                 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
> 3509                         return 0;
> 3510                 err = do_vec_store(op->reg, ea, size, regs, cross_endian);
> 3511                 break;

LOAD_VMX and STORE_VMX are set in analyse_instr(), and size does not exceed 16
there. So the warning looks incorrect to me.

- Naveen
Re: [PATCH 34/34] KVM: selftests: Add a memory region subtest to validate invalid flags
On Wed, Nov 08, 2023 at 05:08:01PM -0800, Anish Moorthy wrote: > Applying [1] and [2] reveals that this also breaks non-x86 builds- the > MEM_REGION_GPA/SLOT definitions are guarded behind an #ifdef > __x86_64__, while the usages introduced here aren't. > > Should > > On Sun, Nov 5, 2023 at 8:35 AM Paolo Bonzini wrote: > > > > + test_invalid_memory_region_flags(); > > be #ifdef'd, perhaps? I'm not quite sure what the intent is. This has been broken in -next for a week now, do we have any progress on a fix or should we just revert the patch?
Re: [PATCH v9 00/27] Add support for QMC HDLC, framer infrastructure and PEF2256 framer
On Fri, Nov 17, 2023 at 04:47:46PM -0800, Jakub Kicinski wrote: > On Wed, 15 Nov 2023 15:39:36 +0100 Herve Codina wrote: > >- Removed Patches 6, 7 and 8 (patches applied) > > > >- Patches 7, 20, 21, 23 (patches 10, 23, 24, 26 in v8) > > Add 'Acked-by: Jakub Kicinski ' > I thought someone (Mark?) asked for the networking stuff to be put > on a branch. If that's still the preference - is it possible to factor > these out as a standalone series, too? Will they build on their own? Yes, can we *please* at least get the generic non-driver bits of this series moving - they seem uncontroversial as far as I can see and are a tiny portion of the overall 20 patches. Patches 21-23 look like they can go on a branch in the net tree?
[PATCH v4 4/4] KVM: PPC: selftests: powerpc enable kvm_create_max_vcpus test
powerpc's maximum permitted vCPU ID depends on the VM's SMT mode, and the maximum reported by KVM_CAP_MAX_VCPU_ID exceeds a simple non-SMT VM's limit. The powerpc KVM selftest port uses non-SMT VMs, so add a workaround to the kvm_create_max_vcpus test case to limit vCPU IDs to KVM_CAP_MAX_VCPUS on powerpc. And enable the test case. Signed-off-by: Nicholas Piggin --- tools/testing/selftests/kvm/Makefile | 1 + tools/testing/selftests/kvm/kvm_create_max_vcpus.c | 9 + 2 files changed, 10 insertions(+) diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index a198fe6136c8..1e904d8871d7 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -211,6 +211,7 @@ TEST_GEN_PROGS_powerpc += dirty_log_test TEST_GEN_PROGS_powerpc += dirty_log_perf_test TEST_GEN_PROGS_powerpc += guest_print_test TEST_GEN_PROGS_powerpc += hardware_disable_test +TEST_GEN_PROGS_powerpc += kvm_create_max_vcpus TEST_GEN_PROGS_powerpc += kvm_page_table_test TEST_GEN_PROGS_powerpc += max_guest_memory_test TEST_GEN_PROGS_powerpc += memslot_modification_stress_test diff --git a/tools/testing/selftests/kvm/kvm_create_max_vcpus.c b/tools/testing/selftests/kvm/kvm_create_max_vcpus.c index 31b3cb24b9a7..330ede73c147 100644 --- a/tools/testing/selftests/kvm/kvm_create_max_vcpus.c +++ b/tools/testing/selftests/kvm/kvm_create_max_vcpus.c @@ -51,6 +51,15 @@ int main(int argc, char *argv[]) pr_info("KVM_CAP_MAX_VCPU_ID: %d\n", kvm_max_vcpu_id); pr_info("KVM_CAP_MAX_VCPUS: %d\n", kvm_max_vcpus); +#ifdef __powerpc64__ + /* +* powerpc has a particular format for the vcpu ID that depends on +* the guest SMT mode, and the max ID cap is too large for non-SMT +* modes, where the maximum ID is the same as the maximum vCPUs. +*/ + kvm_max_vcpu_id = kvm_max_vcpus; +#endif + /* * Check that we're allowed to open nr_fds_wanted file descriptors and * try raising the limits if needed. -- 2.42.0
[PATCH v4 3/4] KVM: PPC: selftests: add support for powerpc
Implement KVM selftests support for powerpc (Book3S-64). ucalls are implemented with an unsupported PAPR hcall number which will always cause KVM to exit to userspace. Virtual memory is implemented for the radix MMU, and only a base page size is supported (both 4K and 64K). Guest interrupts are taken in real-mode, so require a page allocated at gRA 0x0. Interrupt entry is complicated because gVA:gRA is not 1:1 mapped (like the kernel is), so the MMU can not just be switched on and off. Acked-by: Michael Ellerman (powerpc) Signed-off-by: Nicholas Piggin --- MAINTAINERS | 2 + tools/testing/selftests/kvm/Makefile | 19 + .../selftests/kvm/include/kvm_util_base.h | 24 + .../selftests/kvm/include/powerpc/hcall.h | 17 + .../selftests/kvm/include/powerpc/ppc_asm.h | 32 ++ .../selftests/kvm/include/powerpc/processor.h | 39 ++ .../selftests/kvm/include/powerpc/ucall.h | 21 + tools/testing/selftests/kvm/lib/guest_modes.c | 27 +- tools/testing/selftests/kvm/lib/kvm_util.c| 12 + .../selftests/kvm/lib/powerpc/handlers.S | 93  .../testing/selftests/kvm/lib/powerpc/hcall.c | 45 ++ .../selftests/kvm/lib/powerpc/processor.c | 468 ++ .../testing/selftests/kvm/lib/powerpc/ucall.c | 21 + tools/testing/selftests/kvm/powerpc/helpers.h | 46 ++ 14 files changed, 862 insertions(+), 4 deletions(-) create mode 100644 tools/testing/selftests/kvm/include/powerpc/hcall.h create mode 100644 tools/testing/selftests/kvm/include/powerpc/ppc_asm.h create mode 100644 tools/testing/selftests/kvm/include/powerpc/processor.h create mode 100644 tools/testing/selftests/kvm/include/powerpc/ucall.h create mode 100644 tools/testing/selftests/kvm/lib/powerpc/handlers.S create mode 100644 tools/testing/selftests/kvm/lib/powerpc/hcall.c create mode 100644 tools/testing/selftests/kvm/lib/powerpc/processor.c create mode 100644 tools/testing/selftests/kvm/lib/powerpc/ucall.c create mode 100644 tools/testing/selftests/kvm/powerpc/helpers.h diff --git a/MAINTAINERS b/MAINTAINERS index 97f51d5ec1cf..3f430d517462
100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11705,6 +11705,8 @@ F: arch/powerpc/include/asm/kvm* F: arch/powerpc/include/uapi/asm/kvm* F: arch/powerpc/kernel/kvm* F: arch/powerpc/kvm/ +F: tools/testing/selftests/kvm/*/powerpc/ +F: tools/testing/selftests/kvm/powerpc/ KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv) M: Anup Patel diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index a5963ab9215b..a198fe6136c8 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -65,6 +65,11 @@ LIBKVM_s390x += lib/s390x/ucall.c LIBKVM_riscv += lib/riscv/processor.c LIBKVM_riscv += lib/riscv/ucall.c +LIBKVM_powerpc += lib/powerpc/handlers.S +LIBKVM_powerpc += lib/powerpc/processor.c +LIBKVM_powerpc += lib/powerpc/ucall.c +LIBKVM_powerpc += lib/powerpc/hcall.c + # Non-compiled test targets TEST_PROGS_x86_64 += x86_64/nx_huge_pages_test.sh @@ -200,6 +205,20 @@ TEST_GEN_PROGS_riscv += kvm_page_table_test TEST_GEN_PROGS_riscv += set_memory_region_test TEST_GEN_PROGS_riscv += kvm_binary_stats_test +TEST_GEN_PROGS_powerpc += access_tracking_perf_test +TEST_GEN_PROGS_powerpc += demand_paging_test +TEST_GEN_PROGS_powerpc += dirty_log_test +TEST_GEN_PROGS_powerpc += dirty_log_perf_test +TEST_GEN_PROGS_powerpc += guest_print_test +TEST_GEN_PROGS_powerpc += hardware_disable_test +TEST_GEN_PROGS_powerpc += kvm_page_table_test +TEST_GEN_PROGS_powerpc += max_guest_memory_test +TEST_GEN_PROGS_powerpc += memslot_modification_stress_test +TEST_GEN_PROGS_powerpc += memslot_perf_test +TEST_GEN_PROGS_powerpc += rseq_test +TEST_GEN_PROGS_powerpc += set_memory_region_test +TEST_GEN_PROGS_powerpc += kvm_binary_stats_test + SPLIT_TESTS += get-reg-list TEST_PROGS += $(TEST_PROGS_$(ARCH_DIR)) diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h index 46edefabac85..3ce0bd0299fa 100644 --- a/tools/testing/selftests/kvm/include/kvm_util_base.h +++ 
b/tools/testing/selftests/kvm/include/kvm_util_base.h @@ -87,6 +87,12 @@ enum kvm_mem_region_type { NR_MEM_REGIONS, }; +/* Page table fragment cache for guest page table < page size */ +struct vm_pt_frag_cache { + vm_paddr_t page; + size_t page_nr_used; +}; + struct kvm_vm { int mode; unsigned long type; @@ -106,6 +112,10 @@ struct kvm_vm { bool pgd_created; vm_paddr_t ucall_mmio_addr; vm_paddr_t pgd; +#if defined(__powerpc64__) + vm_paddr_t prtb; // process table + struct vm_pt_frag_cache pt_frag_cache[2]; // 256B and 4KB PT caches +#endif vm_vaddr_t gdt; vm_vaddr_t tss; vm_vaddr_t idt; @@ -181,6 +191,8 @@ enum vm_guest_mode { VM_MODE_PXXV48_4K, /* For 48bits VA but ANY bits PA */
[PATCH v4 2/4] KVM: selftests: Add aligned guest physical page allocator
powerpc will require this to allocate MMU tables in guest memory that are larger than guest base page size. Signed-off-by: Nicholas Piggin --- .../selftests/kvm/include/kvm_util_base.h | 2 + tools/testing/selftests/kvm/lib/kvm_util.c| 44 --- 2 files changed, 29 insertions(+), 17 deletions(-) diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h index e592a75ec052..46edefabac85 100644 --- a/tools/testing/selftests/kvm/include/kvm_util_base.h +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h @@ -703,6 +703,8 @@ const char *exit_reason_str(unsigned int exit_reason); vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min, uint32_t memslot); +vm_paddr_t vm_phy_pages_alloc_align(struct kvm_vm *vm, size_t num, size_t align, + vm_paddr_t paddr_min, uint32_t memslot); vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, vm_paddr_t paddr_min, uint32_t memslot); vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm); diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 7a8af1821f5d..22cf6c0ca89f 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -1924,6 +1924,7 @@ const char *exit_reason_str(unsigned int exit_reason) * Input Args: * vm - Virtual Machine * num - number of pages + * align - pages alignment * paddr_min - Physical address minimum * memslot - Memory region to allocate page from * @@ -1937,7 +1938,7 @@ const char *exit_reason_str(unsigned int exit_reason) * and their base address is returned. A TEST_ASSERT failure occurs if * not enough pages are available at or above paddr_min. 
*/ -vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, +vm_paddr_t vm_phy_pages_alloc_align(struct kvm_vm *vm, size_t num, size_t align, vm_paddr_t paddr_min, uint32_t memslot) { struct userspace_mem_region *region; @@ -1951,24 +1952,27 @@ vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, paddr_min, vm->page_size); region = memslot2region(vm, memslot); - base = pg = paddr_min >> vm->page_shift; - - do { - for (; pg < base + num; ++pg) { - if (!sparsebit_is_set(region->unused_phy_pages, pg)) { - base = pg = sparsebit_next_set(region->unused_phy_pages, pg); - break; + base = paddr_min >> vm->page_shift; + +again: + base = (base + align - 1) & ~(align - 1); + for (pg = base; pg < base + num; ++pg) { + if (!sparsebit_is_set(region->unused_phy_pages, pg)) { + base = sparsebit_next_set(region->unused_phy_pages, pg); + if (!base) { + fprintf(stderr, "No guest physical pages " + "available, paddr_min: 0x%lx " + "page_size: 0x%x memslot: %u " + "num_pages: %lu align: %lu\n", + paddr_min, vm->page_size, memslot, + num, align); + fputs(" vm dump \n", stderr); + vm_dump(stderr, vm, 2); + TEST_ASSERT(false, "false"); + abort(); } + goto again; } - } while (pg && pg != base + num); - - if (pg == 0) { - fprintf(stderr, "No guest physical page available, " - "paddr_min: 0x%lx page_size: 0x%x memslot: %u\n", - paddr_min, vm->page_size, memslot); - fputs(" vm dump \n", stderr); - vm_dump(stderr, vm, 2); - abort(); } for (pg = base; pg < base + num; ++pg) @@ -1977,6 +1981,12 @@ vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, return base * vm->page_size; } +vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num, + vm_paddr_t paddr_min, uint32_t memslot) +{ + return vm_phy_pages_alloc_align(vm, num, 1, paddr_min, memslot); +} + vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min, uint32_t memslot) { -- 2.42.0
[PATCH v4 1/4] KVM: selftests: Move pgd_created check into virt_pgd_alloc
virt_arch_pgd_alloc all do the same test and set of pgd_created. Move this into common code. Signed-off-by: Nicholas Piggin --- tools/testing/selftests/kvm/include/kvm_util_base.h | 5 + tools/testing/selftests/kvm/lib/aarch64/processor.c | 4 tools/testing/selftests/kvm/lib/riscv/processor.c | 4 tools/testing/selftests/kvm/lib/s390x/processor.c | 4 tools/testing/selftests/kvm/lib/x86_64/processor.c | 7 ++- 5 files changed, 7 insertions(+), 17 deletions(-) diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h index a18db6a7b3cf..e592a75ec052 100644 --- a/tools/testing/selftests/kvm/include/kvm_util_base.h +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h @@ -846,7 +846,12 @@ void virt_arch_pgd_alloc(struct kvm_vm *vm); static inline void virt_pgd_alloc(struct kvm_vm *vm) { + if (vm->pgd_created) + return; + virt_arch_pgd_alloc(vm); + + vm->pgd_created = true; } /* diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c index 6fe12e985ba5..69743d4c6748 100644 --- a/tools/testing/selftests/kvm/lib/aarch64/processor.c +++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c @@ -96,13 +96,9 @@ void virt_arch_pgd_alloc(struct kvm_vm *vm) { size_t nr_pages = page_align(vm, ptrs_per_pgd(vm) * 8) / vm->page_size; - if (vm->pgd_created) - return; - vm->pgd = vm_phy_pages_alloc(vm, nr_pages, KVM_GUEST_PAGE_TABLE_MIN_PADDR, vm->memslots[MEM_REGION_PT]); - vm->pgd_created = true; } static void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, diff --git a/tools/testing/selftests/kvm/lib/riscv/processor.c b/tools/testing/selftests/kvm/lib/riscv/processor.c index d146ca71e0c0..7695ba2cd369 100644 --- a/tools/testing/selftests/kvm/lib/riscv/processor.c +++ b/tools/testing/selftests/kvm/lib/riscv/processor.c @@ -57,13 +57,9 @@ void virt_arch_pgd_alloc(struct kvm_vm *vm) { size_t nr_pages = page_align(vm, ptrs_per_pte(vm) * 8) / 
vm->page_size; - if (vm->pgd_created) - return; - vm->pgd = vm_phy_pages_alloc(vm, nr_pages, KVM_GUEST_PAGE_TABLE_MIN_PADDR, vm->memslots[MEM_REGION_PT]); - vm->pgd_created = true; } void virt_arch_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr) diff --git a/tools/testing/selftests/kvm/lib/s390x/processor.c b/tools/testing/selftests/kvm/lib/s390x/processor.c index 15945121daf1..358e03f09c7a 100644 --- a/tools/testing/selftests/kvm/lib/s390x/processor.c +++ b/tools/testing/selftests/kvm/lib/s390x/processor.c @@ -17,16 +17,12 @@ void virt_arch_pgd_alloc(struct kvm_vm *vm) TEST_ASSERT(vm->page_size == 4096, "Unsupported page size: 0x%x", vm->page_size); - if (vm->pgd_created) - return; - paddr = vm_phy_pages_alloc(vm, PAGES_PER_REGION, KVM_GUEST_PAGE_TABLE_MIN_PADDR, vm->memslots[MEM_REGION_PT]); memset(addr_gpa2hva(vm, paddr), 0xff, PAGES_PER_REGION * vm->page_size); vm->pgd = paddr; - vm->pgd_created = true; } /* diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c index d8288374078e..2ec64bb45db2 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/processor.c +++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c @@ -127,11 +127,8 @@ void virt_arch_pgd_alloc(struct kvm_vm *vm) TEST_ASSERT(vm->mode == VM_MODE_PXXV48_4K, "Attempt to use " "unknown or unsupported guest mode, mode: 0x%x", vm->mode); - /* If needed, create page map l4 table. */ - if (!vm->pgd_created) { - vm->pgd = vm_alloc_page_table(vm); - vm->pgd_created = true; - } + /* Create page map l4 table. */ + vm->pgd = vm_alloc_page_table(vm); } static void *virt_get_pte(struct kvm_vm *vm, uint64_t *parent_pte, -- 2.42.0
[PATCH v4 0/4] KVM: selftests: add powerpc support
This series adds initial KVM selftests support for powerpc (64-bit, BookS, radix MMU). Since v3: https://lore.kernel.org/linuxppc-dev/20230608032425.59796-1-npig...@gmail.com/ - Rebased to upstream (on top of Sean's ucall and guest assert rework). - Drop powerpc specific tests for now, concentrate on base enablement. - Fix a bunch of bugs and issues including the ones noticed by Joel in v3: - Work around powerpc's max VCPU ID complexity to fix test case failure. - Use TEST_REQUIRE for radix mode so hash hosts don't assert. - Pack page table "fragments" in pages (like Linux kernel does), which fixes PTE memory consumption estimation and prevents the max memory test from failing with no memory for page tables. Since v2: - Added a couple of new tests (patch 5,6) - Make default page size match host page size. - Check for radix MMU capability. - Build a few more of the generic tests. Since v1: - Update MAINTAINERS KVM PPC entry to include kvm selftests. - Fixes and cleanups from Sean's review including new patch 1. - Add 4K guest page support requiring new patch 2. 
Thanks, Nick Nicholas Piggin (4): KVM: selftests: Move pgd_created check into virt_pgd_alloc KVM: selftests: Add aligned guest physical page allocator KVM: PPC: selftests: add support for powerpc KVM: PPC: selftests: powerpc enable kvm_create_max_vcpus test MAINTAINERS | 2 + tools/testing/selftests/kvm/Makefile | 20 + .../selftests/kvm/include/kvm_util_base.h | 31 ++ .../selftests/kvm/include/powerpc/hcall.h | 17 + .../selftests/kvm/include/powerpc/ppc_asm.h | 32 ++ .../selftests/kvm/include/powerpc/processor.h | 39 ++ .../selftests/kvm/include/powerpc/ucall.h | 21 + .../selftests/kvm/kvm_create_max_vcpus.c | 9 + .../selftests/kvm/lib/aarch64/processor.c | 4 - tools/testing/selftests/kvm/lib/guest_modes.c | 27 +- tools/testing/selftests/kvm/lib/kvm_util.c| 56 ++- .../selftests/kvm/lib/powerpc/handlers.S | 93 .../testing/selftests/kvm/lib/powerpc/hcall.c | 45 ++ .../selftests/kvm/lib/powerpc/processor.c | 468 ++ .../testing/selftests/kvm/lib/powerpc/ucall.c | 21 + .../selftests/kvm/lib/riscv/processor.c | 4 - .../selftests/kvm/lib/s390x/processor.c | 4 - .../selftests/kvm/lib/x86_64/processor.c | 7 +- tools/testing/selftests/kvm/powerpc/helpers.h | 46 ++ 19 files changed, 908 insertions(+), 38 deletions(-) create mode 100644 tools/testing/selftests/kvm/include/powerpc/hcall.h create mode 100644 tools/testing/selftests/kvm/include/powerpc/ppc_asm.h create mode 100644 tools/testing/selftests/kvm/include/powerpc/processor.h create mode 100644 tools/testing/selftests/kvm/include/powerpc/ucall.h create mode 100644 tools/testing/selftests/kvm/lib/powerpc/handlers.S create mode 100644 tools/testing/selftests/kvm/lib/powerpc/hcall.c create mode 100644 tools/testing/selftests/kvm/lib/powerpc/processor.c create mode 100644 tools/testing/selftests/kvm/lib/powerpc/ucall.c create mode 100644 tools/testing/selftests/kvm/powerpc/helpers.h -- 2.42.0
[PATCH] ASoC: fsl_sai: Fix no frame sync clock issue on i.MX8MP
On i.MX8MP, when TERE and FSD_MSTR are enabled before the word width is configured, the "no frame sync clock" issue can occur, because the old word width impacts the generation of frame sync. TERE is enabled earlier only in the i.MX8MP case, due to a hardware limitation, so we need to disable FSD_MSTR before configuring the word width, then re-enable the FSD_MSTR bit for this specific case.

Fixes: 3e4a82612998 ("ASoC: fsl_sai: MCLK bind with TX/RX enable bit")
Signed-off-by: Shengjiu Wang
---
 sound/soc/fsl/fsl_sai.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
index 79e7c6b98a75..32bbe5056a63 100644
--- a/sound/soc/fsl/fsl_sai.c
+++ b/sound/soc/fsl/fsl_sai.c
@@ -673,6 +673,20 @@ static int fsl_sai_hw_params(struct snd_pcm_substream *substream,
 			   FSL_SAI_CR3_TRCE_MASK,
 			   FSL_SAI_CR3_TRCE((dl_cfg[dl_cfg_idx].mask[tx] & trce_mask)));
 
+	/*
+	 * When TERE and FSD_MSTR are enabled before the word width is
+	 * configured, the "no frame sync clock" issue can occur, because
+	 * the word width impacts the generation of the frame sync clock.
+	 *
+	 * TERE is enabled earlier only in the i.MX8MP case due to a hardware
+	 * limitation, so we need to disable FSD_MSTR before configuring the
+	 * word width, then enable the FSD_MSTR bit for this specific case.
+	 */
+	if (sai->soc_data->mclk_with_tere && sai->mclk_direction_output &&
+	    !sai->is_consumer_mode)
+		regmap_update_bits(sai->regmap, FSL_SAI_xCR4(tx, ofs),
+				   FSL_SAI_CR4_FSD_MSTR, 0);
+
 	regmap_update_bits(sai->regmap, FSL_SAI_xCR4(tx, ofs),
 			   FSL_SAI_CR4_SYWD_MASK | FSL_SAI_CR4_FRSZ_MASK |
 			   FSL_SAI_CR4_CHMOD_MASK,
@@ -680,6 +694,13 @@ static int fsl_sai_hw_params(struct snd_pcm_substream *substream,
 	regmap_update_bits(sai->regmap, FSL_SAI_xCR5(tx, ofs),
 			   FSL_SAI_CR5_WNW_MASK | FSL_SAI_CR5_W0W_MASK |
 			   FSL_SAI_CR5_FBT_MASK, val_cr5);
+
+	/* Enable FSD_MSTR after configuring word width */
+	if (sai->soc_data->mclk_with_tere && sai->mclk_direction_output &&
+	    !sai->is_consumer_mode)
+		regmap_update_bits(sai->regmap, FSL_SAI_xCR4(tx, ofs),
+				   FSL_SAI_CR4_FSD_MSTR, FSL_SAI_CR4_FSD_MSTR);
+
 	regmap_write(sai->regmap, FSL_SAI_xMR(tx),
 		     ~0UL - ((1 << min(channels, slots)) - 1));
-- 
2.34.1
Re: [PATCH] ASoC: fsl_mqs: Remove duplicate linux/of.h header
On Sun, Nov 19, 2023 at 6:45 PM Lucas Tanure wrote: > > Remove linux/of.h as it is included more than once. > Reported by make includecheck. > > Signed-off-by: Lucas Tanure Acked-by: Shengjiu Wang best regards Wang shengjiu > --- > sound/soc/fsl/fsl_mqs.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/sound/soc/fsl/fsl_mqs.c b/sound/soc/fsl/fsl_mqs.c > index f2d74ec05cdf..86704ba5f6f0 100644 > --- a/sound/soc/fsl/fsl_mqs.c > +++ b/sound/soc/fsl/fsl_mqs.c > @@ -10,7 +10,6 @@ > #include > #include > #include > -#include <linux/of.h> > #include > #include > #include > -- > 2.42.1 >
Re: [PATCH v9 10/15] media: uapi: Add V4L2_CTRL_TYPE_FIXED_POINT
On Fri, Nov 17, 2023 at 8:07 PM Hans Verkuil wrote: > > Here is an RFC patch adding support for 'fraction_bits'. It's lacking > documentation, but it can be used for testing. > > It was rather a pain logging fixed point number in a reasonable format, > but I think it is OK. > > In userspace (where you can use floating point) it is a lot easier: > > printf("%.*g\n", fraction_bits, (double)v * (1.0 / (1ULL << fraction_bits))); > > I decided to only expose fraction_bits in struct v4l2_query_ext_ctrl. > I could add it to struct v4l2_queryctrl, but I did not think that was > necessary. Other opinions are welcome. > > In the meantime, let me know if this works for your patch series. If it > does, then I can clean this up. Thanks. It works for me. What I have done are: 1. drop FIXED_POINT 2. use v4l2_ctrl_new_custom Best regards Wang shengjiu > > Regards, > > Hans > > Signed-off-by: Hans Verkuil > --- > drivers/media/v4l2-core/v4l2-ctrls-api.c | 1 + > drivers/media/v4l2-core/v4l2-ctrls-core.c | 72 +++ > include/media/v4l2-ctrls.h| 7 ++- > include/uapi/linux/videodev2.h| 20 ++- > 4 files changed, 85 insertions(+), 15 deletions(-) > > diff --git a/drivers/media/v4l2-core/v4l2-ctrls-api.c > b/drivers/media/v4l2-core/v4l2-ctrls-api.c > index 002ea6588edf..3132df315b17 100644 > --- a/drivers/media/v4l2-core/v4l2-ctrls-api.c > +++ b/drivers/media/v4l2-core/v4l2-ctrls-api.c > @@ -1101,6 +1101,7 @@ int v4l2_query_ext_ctrl(struct v4l2_ctrl_handler *hdl, > struct v4l2_query_ext_ctr > qc->elems = ctrl->elems; > qc->nr_of_dims = ctrl->nr_of_dims; > memcpy(qc->dims, ctrl->dims, qc->nr_of_dims * sizeof(qc->dims[0])); > + qc->fraction_bits = ctrl->fraction_bits; > qc->minimum = ctrl->minimum; > qc->maximum = ctrl->maximum; > qc->default_value = ctrl->default_value; > diff --git a/drivers/media/v4l2-core/v4l2-ctrls-core.c > b/drivers/media/v4l2-core/v4l2-ctrls-core.c > index a662fb60f73f..0e08a371af5c 100644 > --- a/drivers/media/v4l2-core/v4l2-ctrls-core.c > +++ 
b/drivers/media/v4l2-core/v4l2-ctrls-core.c > @@ -252,12 +252,42 @@ void v4l2_ctrl_type_op_init(const struct v4l2_ctrl > *ctrl, u32 from_idx, > } > EXPORT_SYMBOL(v4l2_ctrl_type_op_init); > > +static void v4l2_ctrl_log_fp(s64 v, unsigned int fraction_bits) > +{ > + s64 i = v4l2_fp_integer(v, fraction_bits); > + s64 f = v4l2_fp_fraction(v, fraction_bits); > + > + if (!f) { > + pr_cont("%lld", i); > + } else if (fraction_bits < 20) { > + u64 div = 1ULL << fraction_bits; > + > + if (!i && f < 0) > + pr_cont("-%lld/%llu", -f, div); > + else if (!i) > + pr_cont("%lld/%llu", f, div); > + else if (i < 0 || f < 0) > + pr_cont("-%lld-%llu/%llu", -i, -f, div); > + else > + pr_cont("%lld+%llu/%llu", i, f, div); > + } else { > + if (!i && f < 0) > + pr_cont("-%lld/(2^%u)", -f, fraction_bits); > + else if (!i) > + pr_cont("%lld/(2^%u)", f, fraction_bits); > + else if (i < 0 || f < 0) > + pr_cont("-%lld-%llu/(2^%u)", -i, -f, fraction_bits); > + else > + pr_cont("%lld+%llu/(2^%u)", i, f, fraction_bits); > + } > +} > + > void v4l2_ctrl_type_op_log(const struct v4l2_ctrl *ctrl) > { > union v4l2_ctrl_ptr ptr = ctrl->p_cur; > > if (ctrl->is_array) { > - unsigned i; > + unsigned int i; > > for (i = 0; i < ctrl->nr_of_dims; i++) > pr_cont("[%u]", ctrl->dims[i]); > @@ -266,7 +296,10 @@ void v4l2_ctrl_type_op_log(const struct v4l2_ctrl *ctrl) > > switch (ctrl->type) { > case V4L2_CTRL_TYPE_INTEGER: > - pr_cont("%d", *ptr.p_s32); > + if (!ctrl->fraction_bits) > + pr_cont("%d", *ptr.p_s32); > + else > + v4l2_ctrl_log_fp(*ptr.p_s32, ctrl->fraction_bits); > break; > case V4L2_CTRL_TYPE_BOOLEAN: > pr_cont("%s", *ptr.p_s32 ? 
"true" : "false"); > @@ -281,19 +314,31 @@ void v4l2_ctrl_type_op_log(const struct v4l2_ctrl *ctrl) > pr_cont("0x%08x", *ptr.p_s32); > break; > case V4L2_CTRL_TYPE_INTEGER64: > - pr_cont("%lld", *ptr.p_s64); > + if (!ctrl->fraction_bits) > + pr_cont("%lld", *ptr.p_s64); > + else > + v4l2_ctrl_log_fp(*ptr.p_s64, ctrl->fraction_bits); > break; > case V4L2_CTRL_TYPE_STRING: > pr_cont("%s", ptr.p_char); > break; > case V4L2_CTRL_TYPE_U8: > - pr_cont("%u", (unsigned)*ptr.p_u8); > +
Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing
On Wed, Nov 15, 2023 at 08:29:02PM -0500, Peter Xu wrote:
> Hugepd format is only used in PowerPC with hugetlbfs. In commit
> a6e79df92e4a ("mm/gup: disallow FOLL_LONGTERM GUP-fast writing to
> file-backed mappings"), we added a check to fail gup-fast if there's
> potential risk of violating GUP over writeback file systems. That should
> never apply to hugepd.
>
> Drop that check, not only because it'll never be true for hugepd, but also
> because it paves the way for reusing the function outside fast-gup.

What prevents us from ever using hugepd with file mappings? I think it
would naturally fit in with how large folios for the pagecache work.

So keeping this check and generalizing it seems like the better idea to me.
Re: [PATCH v4 2/5] RISC-V: Add SBI debug console helper routines
On Sat, Nov 18, 2023 at 09:08:56AM +0530, Anup Patel wrote: > Let us provide SBI debug console helper routines which can be > shared by serial/earlycon-riscv-sbi.c and hvc/hvc_riscv_sbi.c. > > Signed-off-by: Anup Patel > --- > arch/riscv/include/asm/sbi.h | 5 + > arch/riscv/kernel/sbi.c | 43 > 2 files changed, 48 insertions(+) > > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h > index 66f3933c14f6..ee7aef5f6233 100644 > --- a/arch/riscv/include/asm/sbi.h > +++ b/arch/riscv/include/asm/sbi.h > @@ -334,6 +334,11 @@ static inline unsigned long sbi_mk_version(unsigned long > major, > } > > int sbi_err_map_linux_errno(int err); > + > +extern bool sbi_debug_console_available; > +int sbi_debug_console_write(unsigned int num_bytes, phys_addr_t base_addr); > +int sbi_debug_console_read(unsigned int num_bytes, phys_addr_t base_addr); > + > #else /* CONFIG_RISCV_SBI */ > static inline int sbi_remote_fence_i(const struct cpumask *cpu_mask) { > return -1; } > static inline void sbi_init(void) {} > diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c > index 5a62ed1da453..73a9c22c3945 100644 > --- a/arch/riscv/kernel/sbi.c > +++ b/arch/riscv/kernel/sbi.c > @@ -571,6 +571,44 @@ long sbi_get_mimpid(void) > } > EXPORT_SYMBOL_GPL(sbi_get_mimpid); > > +bool sbi_debug_console_available; > + > +int sbi_debug_console_write(unsigned int num_bytes, phys_addr_t base_addr) > +{ > + struct sbiret ret; > + > + if (!sbi_debug_console_available) > + return -EOPNOTSUPP; > + > + if (IS_ENABLED(CONFIG_32BIT)) > + ret = sbi_ecall(SBI_EXT_DBCN, SBI_EXT_DBCN_CONSOLE_WRITE, > + num_bytes, lower_32_bits(base_addr), > + upper_32_bits(base_addr), 0, 0, 0); > + else > + ret = sbi_ecall(SBI_EXT_DBCN, SBI_EXT_DBCN_CONSOLE_WRITE, > + num_bytes, base_addr, 0, 0, 0, 0); > + > + return ret.error ? 
sbi_err_map_linux_errno(ret.error) : ret.value; We can't get perfect mappings, but I wonder if we can do better than returning ENOTSUPP for "Failed to write the byte due to I/O errors." How about if (ret.error == SBI_ERR_FAILURE) return -EIO; return ret.error ? sbi_err_map_linux_errno(ret.error) : ret.value; > +} > + > +int sbi_debug_console_read(unsigned int num_bytes, phys_addr_t base_addr) > +{ > + struct sbiret ret; > + > + if (!sbi_debug_console_available) > + return -EOPNOTSUPP; > + > + if (IS_ENABLED(CONFIG_32BIT)) > + ret = sbi_ecall(SBI_EXT_DBCN, SBI_EXT_DBCN_CONSOLE_READ, > + num_bytes, lower_32_bits(base_addr), > + upper_32_bits(base_addr), 0, 0, 0); > + else > + ret = sbi_ecall(SBI_EXT_DBCN, SBI_EXT_DBCN_CONSOLE_READ, > + num_bytes, base_addr, 0, 0, 0, 0); > + > + return ret.error ? sbi_err_map_linux_errno(ret.error) : ret.value; Same comment as above. > +} > + > void __init sbi_init(void) > { > int ret; > @@ -612,6 +650,11 @@ void __init sbi_init(void) > sbi_srst_reboot_nb.priority = 192; > register_restart_handler(&sbi_srst_reboot_nb); > } > + if ((sbi_spec_version >= sbi_mk_version(2, 0)) && > + (sbi_probe_extension(SBI_EXT_DBCN) > 0)) { > + pr_info("SBI DBCN extension detected\n"); > + sbi_debug_console_available = true; > + } > } else { > __sbi_set_timer = __sbi_set_timer_v01; > __sbi_send_ipi = __sbi_send_ipi_v01; > -- > 2.34.1 > Otherwise, Reviewed-by: Andrew Jones Thanks, drew