Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

2023-11-20 Thread Nicholas Piggin
On Tue Nov 21, 2023 at 11:23 AM AEST, Timothy Pearson wrote:
>
>
> - Original Message -
> > From: "Michael Ellerman" 
> > To: "Timothy Pearson" 
> > Cc: "Jens Axboe" , "regressions" 
> > , "npiggin" ,
> > "christophe leroy" , "linuxppc-dev" 
> > 
> > Sent: Monday, November 20, 2023 5:39:52 PM
> > Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec 
> > register  save
>
> > Timothy Pearson  writes:
> >> - Original Message -
> >>> From: "Michael Ellerman" 
> > ...
> >>> 
> >>> But we now have a new path, because io-uring can call copy_process() via
> >>> create_io_thread() from the signal handling path. That's OK if the signal 
> >>> is
> >>> handled as we return from a syscall, but it's not OK if the signal is 
> >>> handled
> >>> due to some other interrupt.
> >>> 
> >>> Which is:
> >>> 
> >>> interrupt_return_srr_user()
> >>>  interrupt_exit_user_prepare()
> >>>interrupt_exit_user_prepare_main()
> >>>  do_notify_resume()
> >>>get_signal()
> >>>  task_work_run()
> >>>create_worker_cb()
> >>>  create_io_worker()
> >>>copy_process()
> >>>  dup_task_struct()
> >>>arch_dup_task_struct()
> >>>  flush_all_to_thread()
> >>>save_all()
> >>>  if (tsk->thread.regs->msr & MSR_FP)
> >>>save_fpu()
> >>># fr0 is clobbered and potentially live in userspace
> >>> 
> >>> 
> >>> So tldr I think the corruption is only an issue since io-uring started 
> >>> doing
> >>> the clone via signal, which I think matches the observed timeline of this 
> >>> bug
> >>> appearing.
> >>
> >> I agree the corruption really only started showing up in earnest on
> >> io_uring clone-via-signal, as this was confirmed several times in the
> >> course of debugging.
> > 
> > Thanks.
> > 
> >> Note as well that I may very well have a wrong call order in the
> >> commit message, since I was relying on a couple of WARN_ON() macros I
> >> inserted to check for a similar (but not identical) condition and
> >> didn't spend much time getting new traces after identifying the root
> >> cause.
> > 
> > Yep no worries. I'll reword it to incorporate the full path from my mail.
> > 
> >> I went back and grabbed some real world system-wide stack traces, since I 
> >> now
> >> know what to trigger on.  A typical example is:
> >>
> >> interrupt_return_srr_user()
> >>  interrupt_exit_user_prepare()
> >>   interrupt_exit_user_prepare_main()
> >>schedule()
> >> __schedule()
> >>  __switch_to()
> >>   giveup_all()
> >># tsk->thread.regs->msr MSR_FP is still set here
> >>__giveup_fpu()
> >> save_fpu()
> >> # fr0 is clobbered and potentially live in userspace
> > 
> > fr0 is not live there.
> > 
> > __giveup_fpu() does roughly:
> > 
> > 	msr = tsk->thread.regs->msr;
> > 	msr &= ~(MSR_FP|MSR_FE0|MSR_FE1);
> > 	msr &= ~MSR_VSX;
> > 	tsk->thread.regs->msr = msr;
> > 
> > ie. it clears the FP etc. bits from the task's MSR. That means the FP
> > state will be reloaded from the thread struct before the task is run again.
> > 
> > Also on that path we're switching to another task, so we'll be reloading
> > the other task's FP state before returning to userspace.
> > 
> > So I don't see any bug there.
>
> Yeah, you're right.  I was trying to get traces while doing something else, 
> and didn't think that all the way through, sorry. :)  It's not going to be 
> super easy to get a real trace (I was triggering the WARN_ON() from fr0
> getting set to the FPSCR), so let's just assume it's mainly the path you 
> manually found above and update the commit message accordingly.
>
> > There's only two places that call save_fpu() and skip the giveup logic,
> > which is save_all() and kvmppc_save_user_regs().
>
> Now that's interesting as well, since it might explain some issues I've seen 
> for years on a specific QEMU workload.  Once this is backported to stable 
> I'll need to get the kernels updated on those boxes and see if the issues 
> disappear...

The KVM issue is actually slightly different. You need this (lightly
verified to solve the issue so far).

---

From c12fbed0e12207058282a2411da59b43b1f96c49 Mon Sep 17 00:00:00 2001
From: Nicholas Piggin 
Date: Tue, 21 Nov 2023 17:03:55 +1000
Subject: [PATCH] KVM: PPC: Book3S HV: Fix KVM_RUN ioctl clobbering FP/VEC
 registers

Before running a guest, the host process's FP/VEC registers are saved
away like a context switch or a kernel use of those regs. The guest
FP/VEC registers can then be loaded as needed. The host process
registers would be restored lazily when it uses FP/VEC again.

KVM HV did not do this correctly. The registers are saved in the
thread struct, but the FP/VEC/VSX bits remain enabled in the user
MSR, leading the kernel to think they are still valid even after they
are clobbered by guest registers.

Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

2023-11-20 Thread Nicholas Piggin
On Tue Nov 21, 2023 at 2:10 PM AEST, Timothy Pearson wrote:
> - Original Message -
> > From: "Michael Ellerman" 
> > To: "Timothy Pearson" 
> > Cc: "Jens Axboe" , "regressions" 
> > , "npiggin" ,
> > "christophe leroy" , "linuxppc-dev" 
> > 
> > Sent: Monday, November 20, 2023 5:39:52 PM
> > Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec 
> > register  save
>
> > Timothy Pearson  writes:
> >> - Original Message -
> >>> From: "Michael Ellerman" 
> > ...
> >>> 
> >>> But we now have a new path, because io-uring can call copy_process() via
> >>> create_io_thread() from the signal handling path. That's OK if the signal 
> >>> is
> >>> handled as we return from a syscall, but it's not OK if the signal is 
> >>> handled
> >>> due to some other interrupt.
> >>> 
> >>> Which is:
> >>> 
> >>> interrupt_return_srr_user()
> >>>  interrupt_exit_user_prepare()
> >>>interrupt_exit_user_prepare_main()
> >>>  do_notify_resume()
> >>>get_signal()
> >>>  task_work_run()
> >>>create_worker_cb()
> >>>  create_io_worker()
> >>>copy_process()
> >>>  dup_task_struct()
> >>>arch_dup_task_struct()
> >>>  flush_all_to_thread()
> >>>save_all()
> >>>  if (tsk->thread.regs->msr & MSR_FP)
> >>>save_fpu()
> >>># fr0 is clobbered and potentially live in userspace
> >>> 
> >>> 
> >>> So tldr I think the corruption is only an issue since io-uring started 
> >>> doing
> >>> the clone via signal, which I think matches the observed timeline of this 
> >>> bug
> >>> appearing.
> >>
> >> I agree the corruption really only started showing up in earnest on
> >> io_uring clone-via-signal, as this was confirmed several times in the
> >> course of debugging.
> > 
> > Thanks.
> > 
> >> Note as well that I may very well have a wrong call order in the
> >> commit message, since I was relying on a couple of WARN_ON() macros I
> >> inserted to check for a similar (but not identical) condition and
> >> didn't spend much time getting new traces after identifying the root
> >> cause.
> > 
> > Yep no worries. I'll reword it to incorporate the full path from my mail.
> > 
> >> I went back and grabbed some real world system-wide stack traces, since I 
> >> now
> >> know what to trigger on.  A typical example is:
> >>
> >> interrupt_return_srr_user()
> >>  interrupt_exit_user_prepare()
> >>   interrupt_exit_user_prepare_main()
> >>schedule()
> >> __schedule()
> >>  __switch_to()
> >>   giveup_all()
> >># tsk->thread.regs->msr MSR_FP is still set here
> >>__giveup_fpu()
> >> save_fpu()
> >> # fr0 is clobbered and potentially live in userspace
> > 
> > fr0 is not live there.
>  
> > ie. it clears the FP etc. bits from the task's MSR. That means the FP
> > state will be reloaded from the thread struct before the task is run again.
>
> So a little more detail on this, just to put it to rest properly vs. assuming 
> hand analysis caught every possible pathway. :)
>
> The debugging that generates this stack trace also verifies the following in 
> __giveup_fpu():
>
> 1.) tsk->thread.fp_state.fpr doesn't contain the FPSCR contents prior to 
> calling save_fpu()
> 2.) tsk->thread.fp_state.fpr contains the FPSCR contents directly after 
> calling save_fpu()
> 3.) MSR_FP is set both in the task struct and in the live MSR.
>
> Only if all three conditions are met will it generate the trace.  This is a 
> generalization of the hack I used to find the problem in the first place.
>
> If the state will subsequently be reloaded from the thread struct, that means 
> we're reloading the registers from the thread struct that we just verified 
> was corrupted by the earlier save_fpu() call.  There are only two ways I can 
> see for that to be true -- one is if the registers were already clobbered 
> when giveup_all() was entered, and the other is if save_fpu() went ahead and 
> clobbered them right here inside giveup_all().
>
> To see which scenario we were dealing with, I added a bit more 
> instrumentation to dump the current register state if MSR_FP bit was already 
> set in registers (i.e. not dumping data from task struct, but using the live 
> FPU registers instead), and sure enough the registers are corrupt on entry, 
> so something else has already called save_fpu() before we even hit 
> giveup_all() in this call chain.
>
> Unless I'm missing something, doesn't this effectively mean that anything 
> interrupting a task can hit this bug?  Or, put another way, I'm seeing 
> several processes hit this exact call chain with the corrupt register going 
> back out to userspace without io_uring even in the mix, so there seems to be 
> another pathway in play.  These traces are from a qemu guest, in case it 
> matters given the kvm path is possibly susceptible.
>
> 

Re: [PATCH v9 00/27] Add support for QMC HDLC, framer infrastructure and PEF2256 framer

2023-11-20 Thread Herve Codina
Hi Mark, Jakub, Qiang, Li,

On Mon, 20 Nov 2023 13:30:08 +
Mark Brown  wrote:

> On Fri, Nov 17, 2023 at 04:47:46PM -0800, Jakub Kicinski wrote:
> > On Wed, 15 Nov 2023 15:39:36 +0100 Herve Codina wrote:  
> > >- Removed Patches 6, 7 and 8 (patches applied)
> > > 
> > >- Patches 7, 20, 21, 23 (patches 10, 23, 24, 26 in v8)
> > >  Add 'Acked-by: Jakub Kicinski '  
> 
> > I thought someone (Mark?) asked for the networking stuff to be put 
> > on a branch. If that's still the preference - is it possible to factor
> > these out as a standalone series, too?  Will they build on their own?  
> 
> Yes, can we *please* at least get the generic non-driver bits of this
> series moving - they seem uncontroversial as far as I can see and are a
> tiny portion of the overall 20 patches.  Patches 21-23 look like they
> can go on a branch in the net tree?

Patch 21 is the framer infrastructure.
Patches 22-25 are the driver for the PEF2256 framer.
Note that patch 24 is the pinmux part of the framer and, IMHO, can be
taken too.
Patch 23 needs to be fixed (kernel test robot). The fix will be quite
minor (a dependency on HAS_IOMEM in the Kconfig file).

For the SoC part (QUICC ENGINE QMC and TSA), what will be the plan?
Qiang, Li, any opinion?

I plan to send v10 with patch 23 fixed.
Based on that v10, some patches (21 to 25 at least) could be applied and
I will remove them from the future v11.
I think it will be easier to follow if I iterate on the series, removing
patches as soon as they are applied.

Of course, please let me know if this is not the right way to do it.

Best regards,
Hervé


[PATCH v5 1/2] ASoC: dt-bindings: sound-card-common: List sound widgets ignoring system suspend

2023-11-20 Thread Chancel Liu
Add a property to list audio sound widgets which are marked ignoring
system suspend. Paths between these endpoints are still active over
suspend of the main application processor that the current operating
system is running.

Signed-off-by: Chancel Liu 
---
 .../devicetree/bindings/sound/sound-card-common.yaml   | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/sound/sound-card-common.yaml 
b/Documentation/devicetree/bindings/sound/sound-card-common.yaml
index 3a941177f684..721950f65748 100644
--- a/Documentation/devicetree/bindings/sound/sound-card-common.yaml
+++ b/Documentation/devicetree/bindings/sound/sound-card-common.yaml
@@ -17,6 +17,13 @@ properties:
   pair of strings, the first being the connection's sink, the second
   being the connection's source.
 
+  ignore-suspend-widgets:
+$ref: /schemas/types.yaml#/definitions/non-unique-string-array
+description: |
+  A list of audio sound widgets which are marked ignoring system suspend.
+  Paths between these endpoints are still active over suspend of the main
+  application processor that the current operating system is running.
+
   model:
 $ref: /schemas/types.yaml#/definitions/string
 description: User specified audio sound card name
-- 
2.42.0
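
As a usage illustration (a hypothetical card node; the widget names match
the imx-rpmsg machine driver in the following patch, the rest is made up):

	sound-rpmsg {
		compatible = "fsl,imx-audio-rpmsg";	/* hypothetical */
		model = "imx-audio-rpmsg";
		fsl,enable-lpa;
		ignore-suspend-widgets = "Headphone Jack", "Ext Spk";
	};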



[PATCH v5 2/2] ASoC: imx-rpmsg: Force codec power on in low power audio mode

2023-11-20 Thread Chancel Liu
Low power audio mode requires the binding codec to stay powered on while
the Acore enters suspend, so the Mcore can continue music playback.

The ASoC machine driver acquires DAPM endpoints by reading the
"ignore-suspend-widgets" property from DT and then forces the paths
between these endpoints to ignore suspend.

If the rpmsg sound card is in low power audio mode, the suspend/resume
callbacks of the binding codec are overridden to disable suspend/resume.

Signed-off-by: Chancel Liu 
---
 sound/soc/fsl/imx-rpmsg.c | 61 +--
 1 file changed, 59 insertions(+), 2 deletions(-)

diff --git a/sound/soc/fsl/imx-rpmsg.c b/sound/soc/fsl/imx-rpmsg.c
index a0c5c35817dd..e5bd63dab10c 100644
--- a/sound/soc/fsl/imx-rpmsg.c
+++ b/sound/soc/fsl/imx-rpmsg.c
@@ -2,9 +2,8 @@
 // Copyright 2017-2020 NXP
 
 #include 
-#include 
+#include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -21,8 +20,11 @@ struct imx_rpmsg {
 	struct snd_soc_dai_link dai;
 	struct snd_soc_card card;
 	unsigned long sysclk;
+	bool lpa;
 };
 
+static struct dev_pm_ops lpa_pm;
+
 static const struct snd_soc_dapm_widget imx_rpmsg_dapm_widgets[] = {
SND_SOC_DAPM_HP("Headphone Jack", NULL),
SND_SOC_DAPM_SPK("Ext Spk", NULL),
@@ -39,6 +41,58 @@ static int imx_rpmsg_late_probe(struct snd_soc_card *card)
 	struct device *dev = card->dev;
 	int ret;
 
+	if (data->lpa) {
+		struct snd_soc_component *codec_comp;
+		struct device_node *codec_np;
+		struct device_driver *codec_drv;
+		struct device *codec_dev = NULL;
+
+		codec_np = data->dai.codecs->of_node;
+		if (codec_np) {
+			struct platform_device *codec_pdev;
+			struct i2c_client *codec_i2c;
+
+			codec_i2c = of_find_i2c_device_by_node(codec_np);
+			if (codec_i2c)
+				codec_dev = &codec_i2c->dev;
+			if (!codec_dev) {
+				codec_pdev = of_find_device_by_node(codec_np);
+				if (codec_pdev)
+					codec_dev = &codec_pdev->dev;
+			}
+		}
+		if (codec_dev) {
+			codec_comp = snd_soc_lookup_component_nolocked(codec_dev, NULL);
+			if (codec_comp) {
+				int i, num_widgets;
+				const char *widgets;
+				struct snd_soc_dapm_context *dapm;
+
+				num_widgets = of_property_count_strings(data->card.dev->of_node,
+									"ignore-suspend-widgets");
+				for (i = 0; i < num_widgets; i++) {
+					of_property_read_string_index(data->card.dev->of_node,
+								      "ignore-suspend-widgets",
+								      i, &widgets);
+					dapm = snd_soc_component_get_dapm(codec_comp);
+					snd_soc_dapm_ignore_suspend(dapm, widgets);
+				}
+			}
+			codec_drv = codec_dev->driver;
+			if (codec_drv->pm) {
+				memcpy(&lpa_pm, codec_drv->pm, sizeof(lpa_pm));
+				lpa_pm.suspend = NULL;
+				lpa_pm.resume = NULL;
+				lpa_pm.freeze = NULL;
+				lpa_pm.thaw = NULL;
+				lpa_pm.poweroff = NULL;
+				lpa_pm.restore = NULL;
+				codec_drv->pm = &lpa_pm;
+			}
+			put_device(codec_dev);
+		}
+	}
+
 	if (!data->sysclk)
 		return 0;
 
@@ -138,6 +192,9 @@ static int imx_rpmsg_probe(struct platform_device *pdev)
 		goto fail;
 	}
 
+	if (of_property_read_bool(np, "fsl,enable-lpa"))
+		data->lpa = true;
+
 	data->card.dev = &pdev->dev;
 	data->card.owner = THIS_MODULE;
 	data->card.dapm_widgets = imx_rpmsg_dapm_widgets;
-- 
2.42.0



Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

2023-11-20 Thread Timothy Pearson



- Original Message -
> From: "Timothy Pearson" 
> To: "Michael Ellerman" 
> Cc: "Jens Axboe" , "regressions" 
> , "npiggin" ,
> "christophe leroy" , "linuxppc-dev" 
> 
> Sent: Monday, November 20, 2023 10:10:32 PM
> Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec 
> register  save

> - Original Message -
>> From: "Michael Ellerman" 
>> To: "Timothy Pearson" 
>> Cc: "Jens Axboe" , "regressions" 
>> ,
>> "npiggin" ,
>> "christophe leroy" , "linuxppc-dev"
>> 
>> Sent: Monday, November 20, 2023 5:39:52 PM
>> Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec
>> register  save
> 
>> Timothy Pearson  writes:
>>> - Original Message -
 From: "Michael Ellerman" 
>> ...
 
 But we now have a new path, because io-uring can call copy_process() via
 create_io_thread() from the signal handling path. That's OK if the signal 
 is
 handled as we return from a syscall, but it's not OK if the signal is 
 handled
 due to some other interrupt.
 
 Which is:
 
 interrupt_return_srr_user()
  interrupt_exit_user_prepare()
interrupt_exit_user_prepare_main()
  do_notify_resume()
get_signal()
  task_work_run()
create_worker_cb()
  create_io_worker()
copy_process()
  dup_task_struct()
arch_dup_task_struct()
  flush_all_to_thread()
save_all()
  if (tsk->thread.regs->msr & MSR_FP)
save_fpu()
 # fr0 is clobbered and potentially live in userspace
 
 
 So tldr I think the corruption is only an issue since io-uring started 
 doing
 the clone via signal, which I think matches the observed timeline of this 
 bug
 appearing.
>>>
>>> I agree the corruption really only started showing up in earnest on
>>> io_uring clone-via-signal, as this was confirmed several times in the
>>> course of debugging.
>> 
>> Thanks.
>> 
>>> Note as well that I may very well have a wrong call order in the
>>> commit message, since I was relying on a couple of WARN_ON() macros I
>>> inserted to check for a similar (but not identical) condition and
>>> didn't spend much time getting new traces after identifying the root
>>> cause.
>> 
>> Yep no worries. I'll reword it to incorporate the full path from my mail.
>> 
>>> I went back and grabbed some real world system-wide stack traces, since I 
>>> now
>>> know what to trigger on.  A typical example is:
>>>
>>> interrupt_return_srr_user()
>>>  interrupt_exit_user_prepare()
>>>   interrupt_exit_user_prepare_main()
>>>schedule()
>>> __schedule()
>>>  __switch_to()
>>>   giveup_all()
>>># tsk->thread.regs->msr MSR_FP is still set here
>>>__giveup_fpu()
>>> save_fpu()
>>> # fr0 is clobbered and potentially live in userspace
>> 
>> fr0 is not live there.
> 
>> ie. it clears the FP etc. bits from the task's MSR. That means the FP
>> state will be reloaded from the thread struct before the task is run again.
> 
> So a little more detail on this, just to put it to rest properly vs. assuming
> hand analysis caught every possible pathway. :)
> 
> The debugging that generates this stack trace also verifies the following in
> __giveup_fpu():
> 
> 1.) tsk->thread.fp_state.fpr doesn't contain the FPSCR contents prior to 
> calling
> save_fpu()
> 2.) tsk->thread.fp_state.fpr contains the FPSCR contents directly after 
> calling
> save_fpu()
> 3.) MSR_FP is set both in the task struct and in the live MSR.
> 
> Only if all three conditions are met will it generate the trace.  This is a
> generalization of the hack I used to find the problem in the first place.
> 
> If the state will subsequently be reloaded from the thread struct, that means
> we're reloading the registers from the thread struct that we just verified was
> corrupted by the earlier save_fpu() call.  There are only two ways I can see
> for that to be true -- one is if the registers were already clobbered when
> giveup_all() was entered, and the other is if save_fpu() went ahead and
> clobbered them right here inside giveup_all().
> 
> To see which scenario we were dealing with, I added a bit more instrumentation
> to dump the current register state if MSR_FP bit was already set in registers
> (i.e. not dumping data from task struct, but using the live FPU registers
> instead), and sure enough the registers are corrupt on entry, so something 
> else
> has already called save_fpu() before we even hit giveup_all() in this call
> chain.
> 
> Unless I'm missing something, doesn't this effectively mean that anything
> interrupting a task can hit this bug?  Or, put another way, I'm seeing several
> processes hit this exact call chain with the corrupt register going back out 
> to
> userspace without 

Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

2023-11-20 Thread Timothy Pearson
- Original Message -
> From: "Michael Ellerman" 
> To: "Timothy Pearson" 
> Cc: "Jens Axboe" , "regressions" 
> , "npiggin" ,
> "christophe leroy" , "linuxppc-dev" 
> 
> Sent: Monday, November 20, 2023 5:39:52 PM
> Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec 
> register  save

> Timothy Pearson  writes:
>> - Original Message -
>>> From: "Michael Ellerman" 
> ...
>>> 
>>> But we now have a new path, because io-uring can call copy_process() via
>>> create_io_thread() from the signal handling path. That's OK if the signal is
>>> handled as we return from a syscall, but it's not OK if the signal is 
>>> handled
>>> due to some other interrupt.
>>> 
>>> Which is:
>>> 
>>> interrupt_return_srr_user()
>>>  interrupt_exit_user_prepare()
>>>interrupt_exit_user_prepare_main()
>>>  do_notify_resume()
>>>get_signal()
>>>  task_work_run()
>>>create_worker_cb()
>>>  create_io_worker()
>>>copy_process()
>>>  dup_task_struct()
>>>arch_dup_task_struct()
>>>  flush_all_to_thread()
>>>save_all()
>>>  if (tsk->thread.regs->msr & MSR_FP)
>>>save_fpu()
>>># fr0 is clobbered and potentially live in userspace
>>> 
>>> 
>>> So tldr I think the corruption is only an issue since io-uring started doing
>>> the clone via signal, which I think matches the observed timeline of this 
>>> bug
>>> appearing.
>>
>> I agree the corruption really only started showing up in earnest on
>> io_uring clone-via-signal, as this was confirmed several times in the
>> course of debugging.
> 
> Thanks.
> 
>> Note as well that I may very well have a wrong call order in the
>> commit message, since I was relying on a couple of WARN_ON() macros I
>> inserted to check for a similar (but not identical) condition and
>> didn't spend much time getting new traces after identifying the root
>> cause.
> 
> Yep no worries. I'll reword it to incorporate the full path from my mail.
> 
>> I went back and grabbed some real world system-wide stack traces, since I now
>> know what to trigger on.  A typical example is:
>>
>> interrupt_return_srr_user()
>>  interrupt_exit_user_prepare()
>>   interrupt_exit_user_prepare_main()
>>schedule()
>> __schedule()
>>  __switch_to()
>>   giveup_all()
>># tsk->thread.regs->msr MSR_FP is still set here
>>__giveup_fpu()
>> save_fpu()
>> # fr0 is clobbered and potentially live in userspace
> 
> fr0 is not live there.
 
> ie. it clears the FP etc. bits from the task's MSR. That means the FP
> state will be reloaded from the thread struct before the task is run again.

So a little more detail on this, just to put it to rest properly vs. assuming 
hand analysis caught every possible pathway. :)

The debugging that generates this stack trace also verifies the following in 
__giveup_fpu():

1.) tsk->thread.fp_state.fpr doesn't contain the FPSCR contents prior to 
calling save_fpu()
2.) tsk->thread.fp_state.fpr contains the FPSCR contents directly after calling 
save_fpu()
3.) MSR_FP is set both in the task struct and in the live MSR.

Only if all three conditions are met will it generate the trace.  This is a 
generalization of the hack I used to find the problem in the first place.

If the state will subsequently be reloaded from the thread struct, that means 
we're reloading the registers from the thread struct that we just verified was 
corrupted by the earlier save_fpu() call.  There are only two ways I can see 
for that to be true -- one is if the registers were already clobbered when 
giveup_all() was entered, and the other is if save_fpu() went ahead and 
clobbered them right here inside giveup_all().

To see which scenario we were dealing with, I added a bit more instrumentation 
to dump the current register state if MSR_FP bit was already set in registers 
(i.e. not dumping data from task struct, but using the live FPU registers 
instead), and sure enough the registers are corrupt on entry, so something else 
has already called save_fpu() before we even hit giveup_all() in this call 
chain.

Unless I'm missing something, doesn't this effectively mean that anything 
interrupting a task can hit this bug?  Or, put another way, I'm seeing several 
processes hit this exact call chain with the corrupt register going back out to 
userspace without io_uring even in the mix, so there seems to be another 
pathway in play.  These traces are from a qemu guest, in case it matters given 
the kvm path is possibly susceptible.

Just a few things to think about.  The FPU patch itself definitely resolves the 
problems; I used a sledgehammer approach *specifically* so that there is no 
place for a rare call sequence we didn't consider to hit it again down the 
line. :)
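
For the record, a rough sketch of the kind of check described above
(hypothetical debug code, not the actual instrumentation I used):

	static void __giveup_fpu(struct task_struct *tsk)
	{
		u64 fr0_before = tsk->thread.fp_state.fpr[0][0];

		save_fpu(tsk);

		/* Fire only when all three conditions hold: fr0 did not hold
		 * the FPSCR value before save_fpu(), holds it afterwards, and
		 * MSR_FP is set both in the task struct and in the live MSR. */
		WARN_ON(fr0_before != tsk->thread.fp_state.fpscr &&
			tsk->thread.fp_state.fpr[0][0] == tsk->thread.fp_state.fpscr &&
			(tsk->thread.regs->msr & MSR_FP) &&
			(mfmsr() & MSR_FP));

		tsk->thread.regs->msr &= ~(MSR_FP|MSR_FE0|MSR_FE1|MSR_VSX);
	}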


Re: Potential config regression after 89cde455 ("kexec: consolidate kexec and crash options into kernel/Kconfig.kexec")

2023-11-20 Thread Baoquan He
Eric DeVolder's Oracle mail address is not available anymore; adding the
current mail address he gave me.

On 11/20/23 at 10:52pm, Ignat Korchagin wrote:
> Good day!
> 
> We have recently started to evaluate Linux 6.6 and noticed that we
> cannot disable CONFIG_KEXEC anymore, but keep CONFIG_CRASH_DUMP
> enabled. It seems to be related to commit 89cde455 ("kexec:
> consolidate kexec and crash options into kernel/Kconfig.kexec"), where
> a CONFIG_KEXEC dependency was added to CONFIG_CRASH_DUMP.
> 
> In our current kernel (Linux 6.1) we only enable CONFIG_KEXEC_FILE
> with enforced signature check to support the kernel crash dumping
> functionality and would like to keep CONFIG_KEXEC disabled for
> security reasons [1].
> 
> I was reading the long commit message, but the reason for adding
> CONFIG_KEXEC as a dependency for CONFIG_CRASH_DUMP evaded me. And I
> believe from the implementation perspective CONFIG_KEXEC_FILE should
> suffice here (as we successfully used it for crashdumps on Linux 6.1).
> 
> Is there a reason for adding this dependency or is it just an
> oversight? Would some solution of requiring either CONFIG_KEXEC or
> CONFIG_KEXEC_FILE work here?

I searched the patch history and found that Eric didn't add the dependency
on CONFIG_KEXEC at the beginning. Later a linux-next build failure with
randconfig was reported, where CONFIG_CRASH_DUMP was enabled while
CONFIG_KEXEC was disabled. Eric then added the KEXEC dependency for
CRASH_DUMP. Please see the link below for more details:

https://lore.kernel.org/all/3e8eecd1-a277-2cfb-690e-5de2eb7b9...@oracle.com/T/#u

Besides, the newly added CONFIG_CRASH_HOTPLUG also needs
CONFIG_KEXEC if the elfcorehdr is allowed to be manipulated when
cpu/memory hotplug happens.

Thanks
Baoquan



Re: Potential config regression after 89cde455 ("kexec: consolidate kexec and crash options into kernel/Kconfig.kexec")

2023-11-20 Thread Michael Ellerman
Ignat Korchagin  writes:
> Good day!
>
> We have recently started to evaluate Linux 6.6 and noticed that we
> cannot disable CONFIG_KEXEC anymore, but keep CONFIG_CRASH_DUMP
> enabled. It seems to be related to commit 89cde455 ("kexec:
> consolidate kexec and crash options into kernel/Kconfig.kexec"), where
> a CONFIG_KEXEC dependency was added to CONFIG_CRASH_DUMP.
>
> In our current kernel (Linux 6.1) we only enable CONFIG_KEXEC_FILE
> with enforced signature check to support the kernel crash dumping
> functionality and would like to keep CONFIG_KEXEC disabled for
> security reasons [1].
>
> I was reading the long commit message, but the reason for adding
> CONFIG_KEXEC as a dependency for CONFIG_CRASH_DUMP evaded me. And I
> believe from the implementation perspective CONFIG_KEXEC_FILE should
> suffice here (as we successfully used it for crashdumps on Linux 6.1).
>
> Is there a reason for adding this dependency or is it just an
> oversight? Would some solution of requiring either CONFIG_KEXEC or
> CONFIG_KEXEC_FILE work here?

I don't actually see any reason for CRASH_DUMP to depend on KEXEC or
KEXEC_FILE.

None of the old CRASH_DUMP symbols depended on KEXEC AFAICS. Using
something like:

 $ git diff 89cde455..95d1fef5 | grep -A 3 "^-.*config CRASH_DUMP"

It's reasonable to want to build a kernel that supports CRASH_DUMP (ie.
can be a dump kernel), but doesn't support kexec and requires a regular
reboot. Though I doubt anyone does that in practice?
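
If we did want to allow that configuration, an untested sketch of how the
requirement could be relaxed (the exact select/depends structure in
kernel/Kconfig.kexec may differ):

	config CRASH_DUMP
		bool "kernel crash dumps"
		depends on KEXEC || KEXEC_FILE
		select KEXEC_CORE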

cheers


Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

2023-11-20 Thread Timothy Pearson



- Original Message -
> From: "Michael Ellerman" 
> To: "Timothy Pearson" 
> Cc: "Jens Axboe" , "regressions" 
> , "npiggin" ,
> "christophe leroy" , "linuxppc-dev" 
> 
> Sent: Monday, November 20, 2023 5:39:52 PM
> Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec 
> register  save

> Timothy Pearson  writes:
>> - Original Message -
>>> From: "Michael Ellerman" 
> ...
>>> 
>>> But we now have a new path, because io-uring can call copy_process() via
>>> create_io_thread() from the signal handling path. That's OK if the signal is
>>> handled as we return from a syscall, but it's not OK if the signal is 
>>> handled
>>> due to some other interrupt.
>>> 
>>> Which is:
>>> 
>>> interrupt_return_srr_user()
>>>  interrupt_exit_user_prepare()
>>>interrupt_exit_user_prepare_main()
>>>  do_notify_resume()
>>>get_signal()
>>>  task_work_run()
>>>create_worker_cb()
>>>  create_io_worker()
>>>copy_process()
>>>  dup_task_struct()
>>>arch_dup_task_struct()
>>>  flush_all_to_thread()
>>>save_all()
>>>  if (tsk->thread.regs->msr & MSR_FP)
>>>save_fpu()
>>># fr0 is clobbered and potentially live in userspace
>>> 
>>> 
>>> So tldr I think the corruption is only an issue since io-uring started doing
>>> the clone via signal, which I think matches the observed timeline of this 
>>> bug
>>> appearing.
>>
>> I agree the corruption really only started showing up in earnest on
>> io_uring clone-via-signal, as this was confirmed several times in the
>> course of debugging.
> 
> Thanks.
> 
>> Note as well that I may very well have a wrong call order in the
>> commit message, since I was relying on a couple of WARN_ON() macros I
>> inserted to check for a similar (but not identical) condition and
>> didn't spend much time getting new traces after identifying the root
>> cause.
> 
> Yep no worries. I'll reword it to incorporate the full path from my mail.
> 
>> I went back and grabbed some real world system-wide stack traces, since I now
>> know what to trigger on.  A typical example is:
>>
>> interrupt_return_srr_user()
>>  interrupt_exit_user_prepare()
>>   interrupt_exit_user_prepare_main()
>>schedule()
>> __schedule()
>>  __switch_to()
>>   giveup_all()
>># tsk->thread.regs->msr MSR_FP is still set here
>>__giveup_fpu()
>> save_fpu()
>> # fr0 is clobbered and potentially live in userspace
> 
> fr0 is not live there.
> 
> __giveup_fpu() does roughly:
> 
> 	msr = tsk->thread.regs->msr;
> 	msr &= ~(MSR_FP|MSR_FE0|MSR_FE1);
> 	msr &= ~MSR_VSX;
> 	tsk->thread.regs->msr = msr;
> 
> ie. it clears the FP etc. bits from the task's MSR. That means the FP
> state will be reloaded from the thread struct before the task is run again.
> 
> Also on that path we're switching to another task, so we'll be reloading
> the other task's FP state before returning to userspace.
> 
> So I don't see any bug there.

Yeah, you're right.  I was trying to get traces while doing something else, and 
didn't think that all the way through, sorry. :)  It's not going to be super 
easy to get a real trace (I was triggering the WARN_ON() from fr0 getting
set to the FPSCR), so let's just assume it's mainly the path you manually found 
above and update the commit message accordingly.

> There's only two places that call save_fpu() and skip the giveup logic,
> which is save_all() and kvmppc_save_user_regs().

Now that's interesting as well, since it might explain some issues I've seen 
for years on a specific QEMU workload.  Once this is backported to stable I'll 
need to get the kernels updated on those boxes and see if the issues 
disappear...


Re: [PATCH] powerpc: add crtsavres.o to always-y instead of extra-y

2023-11-20 Thread Nicholas Piggin
On Tue Nov 21, 2023 at 9:23 AM AEST, Masahiro Yamada wrote:
> crtsavres.o is linked to modules. However, as explained in commit
> d0e628cd817f ("kbuild: doc: clarify the difference between extra-y
> and always-y"), 'make modules' does not build extra-y.
>
> For example, the following command fails:
>
>   $ make ARCH=powerpc LLVM=1 KBUILD_MODPOST_WARN=1 mrproper ps3_defconfig modules
> [snip]
> LD [M]  arch/powerpc/platforms/cell/spufs/spufs.ko
>   ld.lld: error: cannot open arch/powerpc/lib/crtsavres.o: No such file or directory
>   make[3]: *** [scripts/Makefile.modfinal:56: arch/powerpc/platforms/cell/spufs/spufs.ko] Error 1
>   make[2]: *** [Makefile:1844: modules] Error 2
>   make[1]: *** [/home/masahiro/workspace/linux-kbuild/Makefile:350: __build_one_by_one] Error 2
>   make: *** [Makefile:234: __sub-make] Error 2
>

Thanks. Is this the correct Fixes tag?

Fixes: d0e628cd817f ("powerpc/64: Do not link crtsavres.o in vmlinux")

Hmm, looks like LLD might just do this now automatically for us
too without --save-restore-funcs (https://reviews.llvm.org/D79977).
But we probably still need it for older versions, so we still need
your patch.

Reviewed-by: Nicholas Piggin 


> Signed-off-by: Masahiro Yamada 
> ---
>
>  arch/powerpc/lib/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
> index 51ad0397c17a..6eac63e79a89 100644
> --- a/arch/powerpc/lib/Makefile
> +++ b/arch/powerpc/lib/Makefile
> @@ -45,7 +45,7 @@ obj-$(CONFIG_FUNCTION_ERROR_INJECTION)	+= error-inject.o
>  # so it is only needed for modules, and only for older linkers which
>  # do not support --save-restore-funcs
>  ifndef CONFIG_LD_IS_BFD
> -extra-$(CONFIG_PPC64)+= crtsavres.o
> +always-$(CONFIG_PPC64)   += crtsavres.o
>  endif
>  
>  obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \



Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

2023-11-20 Thread Nicholas Piggin
On Tue Nov 21, 2023 at 9:39 AM AEST, Michael Ellerman wrote:
> Timothy Pearson  writes:
> > - Original Message -
> >> From: "Michael Ellerman" 
> ...
> >> 
> >> But we now have a new path, because io-uring can call copy_process() via
> >> create_io_thread() from the signal handling path. That's OK if the signal 
> >> is
> >> handled as we return from a syscall, but it's not OK if the signal is 
> >> handled
> >> due to some other interrupt.
> >> 
> >> Which is:
> >> 
> >> interrupt_return_srr_user()
> >>  interrupt_exit_user_prepare()
> >>interrupt_exit_user_prepare_main()
> >>  do_notify_resume()
> >>get_signal()
> >>  task_work_run()
> >>create_worker_cb()
> >>  create_io_worker()
> >>copy_process()
> >>  dup_task_struct()
> >>arch_dup_task_struct()
> >>  flush_all_to_thread()
> >>save_all()
> >>  if (tsk->thread.regs->msr & MSR_FP)
> >>save_fpu()
> >># fr0 is clobbered and potentially live in userspace
> >> 
> >> 
> >> So tldr I think the corruption is only an issue since io-uring started 
> >> doing
> >> the clone via signal, which I think matches the observed timeline of this 
> >> bug
> >> appearing.
> >
> > I agree the corruption really only started showing up in earnest on
> > io_uring clone-via-signal, as this was confirmed several times in the
> > course of debugging.
>
> Thanks.
>
> > Note as well that I may very well have a wrong call order in the
> > commit message, since I was relying on a couple of WARN_ON() macros I
> > inserted to check for a similar (but not identical) condition and
> > didn't spend much time getting new traces after identifying the root
> > cause.
>
> Yep no worries. I'll reword it to incorporate the full path from my mail.
>
> > I went back and grabbed some real world system-wide stack traces, since I 
> > now know what to trigger on.  A typical example is:
> >
> > interrupt_return_srr_user()
> >  interrupt_exit_user_prepare()
> >   interrupt_exit_user_prepare_main()
> >schedule()
> > __schedule()
> >  __switch_to()
> >   giveup_all()
> ># tsk->thread.regs->msr MSR_FP is still set here
> >__giveup_fpu()
> > save_fpu()
> > # fr0 is clobbered and potentially live in userspace
>
> fr0 is not live there.
>
> __giveup_fpu() does roughly:
>
> 	msr = tsk->thread.regs->msr;
> 	msr &= ~(MSR_FP|MSR_FE0|MSR_FE1);
> 	msr &= ~MSR_VSX;
> 	tsk->thread.regs->msr = msr;
>
> ie. it clears the FP etc. bits from the task's MSR. That means the FP
> state will be reloaded from the thread struct before the task is run again.
>
> Also on that path we're switching to another task, so we'll be reloading
> the other task's FP state before returning to userspace.
>
> So I don't see any bug there.
>
> There's only two places that call save_fpu() and skip the giveup logic,
> which is save_all() and kvmppc_save_user_regs().
>
> save_all() is only called via clone() so I think that's unable to
> actually cause visible register corruption as I described in my previous
> mail.
>
> I thought the KVM case was similar, as it's called via an ioctl, but
> I'll have to talk to Nick as his mail indicates otherwise.

Yeah it can, because later on it runs the guest and that will clobber
other regs. I reproduced fr14 corruption in the host easily with KVM
selftests.

It should just do a "giveup" on FP/VEC (as it does with TM).
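
Roughly (a sketch of the direction only; the actual patch is the one
earlier in the thread):

	/* In kvmppc_save_user_regs(): use giveup instead of a bare save,
	 * so the MSR bits are cleared and the state is reloaded later. */
	usermsr = current->thread.regs->msr;

	if (usermsr & MSR_FP)
		__giveup_fpu(current);		/* was: save_fpu(current) */

	if (usermsr & MSR_VEC)
		__giveup_altivec(current);	/* was: save_altivec(current) */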

Thanks,
Nick


Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

2023-11-20 Thread Nicholas Piggin
On Tue Nov 21, 2023 at 2:45 AM AEST, Timothy Pearson wrote:
>
>
> - Original Message -
> > From: "Michael Ellerman" 
> > To: "Timothy Pearson" , "Jens Axboe" 
> > , "regressions"
> > , "npiggin" , "christophe 
> > leroy" ,
> > "linuxppc-dev" 
> > Sent: Monday, November 20, 2023 1:10:06 AM
> > Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec 
> > register  save
>
> > Hi Timothy,
> > 
> > Great work debugging this. I think your fix is good, but I want to 
> > understand it
> > a bit more to make sure I can explain why we haven't seen it outside of
> > io-uring.
> > If this can be triggered outside of io-uring then I have even more 
> > backporting
> > in my future :}
> > 
> > Typically save_fpu() is called from __giveup_fpu() which saves the FP regs 
> > and
> > also *turns off FP* in the tasks MSR, meaning the kernel will reload the FP 
> > regs
> > from the thread struct before letting the task use FP again. So in that case
> > save_fpu() is free to clobber fr0 because the FP regs no longer hold live 
> > values
> > for the task.
> > 
> > There is another case though, which is the path via:
> >  copy_process()
> >dup_task_struct()
> >  arch_dup_task_struct()
> >flush_all_to_thread()
> >  save_all()
> > 
> > That path saves the FP regs but leaves them live. That's meant as an
> > optimisation for a process that's using FP/VSX and then calls fork(), 
> > leaving
> > the regs live means the parent process doesn't have to take a fault after 
> > the
> > fork to get its FP regs back.
> > 
> > That path does clobber fr0, but fr0 is volatile across a syscall, and the 
> > only
> > way to reach copy_process() from userspace is via a syscall. So in normal 
> > usage
> > fr0 being clobbered across a syscall shouldn't cause data corruption.
> > 
> > Even if we handle a signal on the return from the fork() syscall, the worst 
> > that
> > happens is that the task's thread struct holds the clobbered fr0, but the 
> > task
> > doesn't care (because fr0 is volatile across the syscall anyway).
> > 
> > That path is something like:
> > 
> > system_call_vectored_common()
> >  system_call_exception()
> >sys_fork()
> >  kernel_clone()
> >copy_process()
> >  dup_task_struct()
> >arch_dup_task_struct()
> >  flush_all_to_thread()
> >save_all()
> >  if (tsk->thread.regs->msr & MSR_FP)
> >save_fpu()
> ># does not clear MSR_FP from regs->msr
> >  syscall_exit_prepare()
> >interrupt_exit_user_prepare_main()
> >  do_notify_resume()
> >get_signal()
> >handle_rt_signal64()
> >  prepare_setup_sigcontext()
> >flush_fp_to_thread()
> >  if (tsk->thread.regs->msr & MSR_FP)
> >giveup_fpu()
> >  __giveup_fpu
> >save_fpu()
> ># clobbered fr0 is saved, but task considers it volatile
> ># across syscall anyway
> > 
> > 
> > But we now have a new path, because io-uring can call copy_process() via
> > create_io_thread() from the signal handling path. That's OK if the signal is
> > handled as we return from a syscall, but it's not OK if the signal is 
> > handled
> > due to some other interrupt.
> > 
> > Which is:
> > 
> > interrupt_return_srr_user()
> >  interrupt_exit_user_prepare()
> >interrupt_exit_user_prepare_main()
> >  do_notify_resume()
> >get_signal()
> >  task_work_run()
> >create_worker_cb()
> >  create_io_worker()
> >copy_process()
> >  dup_task_struct()
> >arch_dup_task_struct()
> >  flush_all_to_thread()
> >save_all()
> >  if (tsk->thread.regs->msr & MSR_FP)
> >save_fpu()
> ># fr0 is clobbered and potentially live in userspace
> > 
> > 
> > So tldr I think the corruption is only an issue since io-uring started doing
> > the clone via signal, which I think matches the observed timeline of this 
> > bug
> > appearing.
>
> I agree the corruption really only started showing up in earnest on io_uring 
> clone-via-signal, as this was confirmed several times in the course of 
> debugging.  Bear in mind however that we have seen very, very rare crashes 
> over several years in other tasks, and I am starting to think this bug might 
> be the root cause (see below).
>
> Note as well that I may very well have a wrong call order in the commit 
> message, since I was relying on a couple of WARN_ON() macros I inserted to 
> check for a similar (but not identical) condition and didn't spend much time 
> getting new traces after identifying the root cause.
>
> I went back and grabbed some real world system-wide stack traces, since I now 
> know what to trigger on.  A typical example is:
>
> 

Re: [PATCH] powerpc/lib: Avoid array bounds warnings in vec ops

2023-11-20 Thread Gustavo A. R. Silva




On 11/20/23 17:54, Michael Ellerman wrote:

Building with GCC 13 (which has -Warray-bounds enabled) there are several
warnings in sstep.c along the lines of:

   In function ‘do_byte_reverse’,
   inlined from ‘do_vec_load’ at arch/powerpc/lib/sstep.c:691:3,
   inlined from ‘emulate_loadstore’ at arch/powerpc/lib/sstep.c:3439:9:
   arch/powerpc/lib/sstep.c:289:23: error: array subscript 2 is outside array bounds of ‘u8[16]’ {aka ‘unsigned char[16]’} [-Werror=array-bounds=]
 289 | up[2] = byterev_8(up[1]);
 | ~~^~
   arch/powerpc/lib/sstep.c: In function ‘emulate_loadstore’:
   arch/powerpc/lib/sstep.c:681:11: note: at offset 16 into object ‘u’ of size 16
 681 | } u = {};
 |   ^

do_byte_reverse() supports a size up to 32 bytes, but in these cases the
caller is only passing a 16 byte buffer. In practice there is no bug,
do_vec_load() is only called from the LOAD_VMX case in emulate_loadstore().
That in turn is only reached when analyse_instr() recognises VMX ops,
and in all cases the size is no greater than 16:

   $ git grep -w LOAD_VMX arch/powerpc/lib/sstep.c
   arch/powerpc/lib/sstep.c:		op->type = MKOP(LOAD_VMX, 0, 1);
   arch/powerpc/lib/sstep.c:		op->type = MKOP(LOAD_VMX, 0, 2);
   arch/powerpc/lib/sstep.c:		op->type = MKOP(LOAD_VMX, 0, 4);
   arch/powerpc/lib/sstep.c:		op->type = MKOP(LOAD_VMX, 0, 16);

Similarly for do_vec_store().

Although the warning is incorrect, the code would be safer if it clamped
the size from the caller to the known size of the buffer. Do that using
min_t().

Reported-by: Bagas Sanjaya 
Reported-by: Jan-Benedict Glaw 
Reported-by: Gustavo A. R. Silva 
Signed-off-by: Michael Ellerman 


Reviewed-by: Gustavo A. R. Silva 
Build-tested-by: Gustavo A. R. Silva 

This indeed makes all those warnings go away. :)

Thanks, Michael!
--
Gustavo


---
  arch/powerpc/lib/sstep.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index a4ab8625061a..a13f05cfc7db 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -688,7 +688,7 @@ static nokprobe_inline int do_vec_load(int rn, unsigned long ea,
 	if (err)
 		return err;
 	if (unlikely(cross_endian))
-		do_byte_reverse(&u.b[ea & 0xf], size);
+		do_byte_reverse(&u.b[ea & 0xf], min_t(size_t, size, sizeof(u)));
 	preempt_disable();
 	if (regs->msr & MSR_VEC)
 		put_vr(rn, &u.v);
@@ -719,7 +719,7 @@ static nokprobe_inline int do_vec_store(int rn, unsigned long ea,
 	u.v = current->thread.vr_state.vr[rn];
 	preempt_enable();
 	if (unlikely(cross_endian))
-		do_byte_reverse(&u.b[ea & 0xf], size);
+		do_byte_reverse(&u.b[ea & 0xf], min_t(size_t, size, sizeof(u)));
 	return copy_mem_out(&u.b[ea & 0xf], ea, size, regs);
 }
 #endif /* CONFIG_ALTIVEC */


[PATCH] powerpc/lib: Avoid array bounds warnings in vec ops

2023-11-20 Thread Michael Ellerman
Building with GCC 13 (which has -Warray-bounds enabled) there are several
warnings in sstep.c along the lines of:

  In function ‘do_byte_reverse’,
  inlined from ‘do_vec_load’ at arch/powerpc/lib/sstep.c:691:3,
  inlined from ‘emulate_loadstore’ at arch/powerpc/lib/sstep.c:3439:9:
  arch/powerpc/lib/sstep.c:289:23: error: array subscript 2 is outside array bounds of ‘u8[16]’ {aka ‘unsigned char[16]’} [-Werror=array-bounds=]
289 | up[2] = byterev_8(up[1]);
| ~~^~
  arch/powerpc/lib/sstep.c: In function ‘emulate_loadstore’:
  arch/powerpc/lib/sstep.c:681:11: note: at offset 16 into object ‘u’ of size 16
681 | } u = {};
|   ^

do_byte_reverse() supports a size up to 32 bytes, but in these cases the
caller is only passing a 16 byte buffer. In practice there is no bug,
do_vec_load() is only called from the LOAD_VMX case in emulate_loadstore().
That in turn is only reached when analyse_instr() recognises VMX ops,
and in all cases the size is no greater than 16:

  $ git grep -w LOAD_VMX arch/powerpc/lib/sstep.c
  arch/powerpc/lib/sstep.c:		op->type = MKOP(LOAD_VMX, 0, 1);
  arch/powerpc/lib/sstep.c:		op->type = MKOP(LOAD_VMX, 0, 2);
  arch/powerpc/lib/sstep.c:		op->type = MKOP(LOAD_VMX, 0, 4);
  arch/powerpc/lib/sstep.c:		op->type = MKOP(LOAD_VMX, 0, 16);

Similarly for do_vec_store().

Although the warning is incorrect, the code would be safer if it clamped
the size from the caller to the known size of the buffer. Do that using
min_t().

Reported-by: Bagas Sanjaya 
Reported-by: Jan-Benedict Glaw 
Reported-by: Gustavo A. R. Silva 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/lib/sstep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index a4ab8625061a..a13f05cfc7db 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -688,7 +688,7 @@ static nokprobe_inline int do_vec_load(int rn, unsigned long ea,
 	if (err)
 		return err;
 	if (unlikely(cross_endian))
-		do_byte_reverse(&u.b[ea & 0xf], size);
+		do_byte_reverse(&u.b[ea & 0xf], min_t(size_t, size, sizeof(u)));
 	preempt_disable();
 	if (regs->msr & MSR_VEC)
 		put_vr(rn, &u.v);
@@ -719,7 +719,7 @@ static nokprobe_inline int do_vec_store(int rn, unsigned long ea,
 	u.v = current->thread.vr_state.vr[rn];
 	preempt_enable();
 	if (unlikely(cross_endian))
-		do_byte_reverse(&u.b[ea & 0xf], size);
+		do_byte_reverse(&u.b[ea & 0xf], min_t(size_t, size, sizeof(u)));
 	return copy_mem_out(&u.b[ea & 0xf], ea, size, regs);
 }
 #endif /* CONFIG_ALTIVEC */
-- 
2.41.0



Re: [RFC - is this a bug?] powerpc/lib/sstep: Asking for some light on this, please. :)

2023-11-20 Thread Michael Ellerman
"Gustavo A. R. Silva"  writes:
> On 11/20/23 09:21, Naveen N Rao wrote:
>> On Mon, Nov 20, 2023 at 08:33:45AM -0600, Gustavo A. R. Silva wrote:
>>> On 11/20/23 08:25, Naveen N Rao wrote:
 On Fri, Nov 17, 2023 at 12:36:01PM -0600, Gustavo A. R. Silva wrote:
> Hi all,
>
> I'm trying to fix the following -Wstringop-overflow warnings, and I'd like
> to get your feedback on this, please:
>
> In function 'do_byte_reverse',
>   inlined from 'do_vec_store' at 
> /home/gus/linux/arch/powerpc/lib/sstep.c:722:3,
>   inlined from 'emulate_loadstore' at 
> /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9:
> arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of 
> size 0 [-Werror=stringop-overflow=]
> 287 | up[3] = tmp;
> | ~~^
> arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
> arch/powerpc/lib/sstep.c:708:11: note: at offset [24, 39] into 
> destination object 'u' of size 16
> 708 | } u;
> |   ^
> In function 'do_byte_reverse',
>   inlined from 'do_vec_store' at 
> /home/gus/linux/arch/powerpc/lib/sstep.c:722:3,
>   inlined from 'emulate_loadstore' at 
> /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9:
> arch/powerpc/lib/sstep.c:289:23: error: writing 8 bytes into a region of 
> size 0 [-Werror=stringop-overflow=]
> 289 | up[2] = byterev_8(up[1]);
> | ~~^~
> arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
> arch/powerpc/lib/sstep.c:708:11: note: at offset 16 into destination 
> object 'u' of size 16
> 708 | } u;
> |   ^
> In function 'do_byte_reverse',
>   inlined from 'do_vec_load' at 
> /home/gus/linux/arch/powerpc/lib/sstep.c:691:3,
>   inlined from 'emulate_loadstore' at 
> /home/gus/linux/arch/powerpc/lib/sstep.c:3439:9:
> arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of 
> size 0 [-Werror=stringop-overflow=]
> 287 | up[3] = tmp;
> | ~~^
> arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
> arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into 
> destination object 'u' of size 16
> 681 | } u = {};
> |   ^
> arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into 
> destination object 'u' of size 16
> arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into 
> destination object 'u' of size 16
>
> The origing of the issue seems to be the following calls to function 
> `do_vec_store()`, when
> `size > 16`, or more precisely when `size == 32`
>
> arch/powerpc/lib/sstep.c:
> 3436 case LOAD_VMX:
> 3437 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
> 3438 return 0;
> 3439 err = do_vec_load(op->reg, ea, size, regs, 
> cross_endian);
> 3440 break;
> ...
> 3507 case STORE_VMX:
> 3508 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
> 3509 return 0;
> 3510 err = do_vec_store(op->reg, ea, size, regs, 
> cross_endian);
> 3511 break;

 LOAD_VMX and STORE_VMX are set in analyse_instr() and size does not
 exceed 16. So, the warning looks incorrect to me.
>>>
>>> Does that mean that this piece of code is never actually executed:
>>>
>>>   281 case 32: {
>>>   282 unsigned long *up = (unsigned long *)ptr;
>>>   283 unsigned long tmp;
>>>   284
>>>   285 tmp = byterev_8(up[0]);
>>>   286 up[0] = byterev_8(up[3]);
>>>   287 up[3] = tmp;
>>>   288 tmp = byterev_8(up[2]);
>>>   289 up[2] = byterev_8(up[1]);
>>>   290 up[1] = tmp;
>>>   291 break;
>>>   292 }
>>>
>>> or rather, under which conditions the above code is executed, if any?
>> 
>> Please see git history to understand the commit that introduced that
>> code. You can do that by starting with a 'git blame' on the file. You'll
>> find that the commit that introduced the above hunk was af99da74333b
>> ("powerpc/sstep: Support VSX vector paired storage access
>> instructions").
>> 
>> The above code path is hit via
>> do_vsx_load()->emulate_vsx_load()->do_byte_reverse()
>
> It seems there is another path (at least the one that triggers the
> -Wstringop-overflow warnings):
>
> emulate_step()->emulate_loadstore()->do_vec_load()->do_byte_reverse()
>
> emulate_step() {
> ...
> if (OP_IS_LOAD_STORE(type)) {
 
The type comes from the op, which is determined in analyse_instr()

>  err = 

Potential config regression after 89cde455 ("kexec: consolidate kexec and crash options into kernel/Kconfig.kexec")

2023-11-20 Thread Ignat Korchagin
Good day!

We have recently started to evaluate Linux 6.6 and noticed that we
cannot disable CONFIG_KEXEC anymore, but keep CONFIG_CRASH_DUMP
enabled. It seems to be related to commit 89cde455 ("kexec:
consolidate kexec and crash options into kernel/Kconfig.kexec"), where
a CONFIG_KEXEC dependency was added to CONFIG_CRASH_DUMP.

In our current kernel (Linux 6.1) we only enable CONFIG_KEXEC_FILE
with enforced signature check to support the kernel crash dumping
functionality and would like to keep CONFIG_KEXEC disabled for
security reasons [1].

I was reading the long commit message, but the reason for adding
CONFIG_KEXEC as a dependency for CONFIG_CRASH_DUMP evaded me. And I
believe from the implementation perspective CONFIG_KEXEC_FILE should
suffice here (as we successfully used it for crashdumps on Linux 6.1).

Is there a reason for adding this dependency or is it just an
oversight? Would some solution of requiring either CONFIG_KEXEC or
CONFIG_KEXEC_FILE work here?

Ignat

[1]: https://mjg59.dreamwidth.org/28746.html


Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

2023-11-20 Thread Michael Ellerman
Timothy Pearson  writes:
> - Original Message -
>> From: "Michael Ellerman" 
...
>> 
>> But we now have a new path, because io-uring can call copy_process() via
>> create_io_thread() from the signal handling path. That's OK if the signal is
>> handled as we return from a syscall, but it's not OK if the signal is handled
>> due to some other interrupt.
>> 
>> Which is:
>> 
>> interrupt_return_srr_user()
>>  interrupt_exit_user_prepare()
>>interrupt_exit_user_prepare_main()
>>  do_notify_resume()
>>get_signal()
>>  task_work_run()
>>create_worker_cb()
>>  create_io_worker()
>>copy_process()
>>  dup_task_struct()
>>arch_dup_task_struct()
>>  flush_all_to_thread()
>>save_all()
>>  if (tsk->thread.regs->msr & MSR_FP)
>>save_fpu()
>># fr0 is clobbered and potentially live in userspace
>> 
>> 
>> So tldr I think the corruption is only an issue since io-uring started doing
>> the clone via signal, which I think matches the observed timeline of this bug
>> appearing.
>
> I agree the corruption really only started showing up in earnest on
> io_uring clone-via-signal, as this was confirmed several times in the
> course of debugging.

Thanks.

> Note as well that I may very well have a wrong call order in the
> commit message, since I was relying on a couple of WARN_ON() macros I
> inserted to check for a similar (but not identical) condition and
> didn't spend much time getting new traces after identifying the root
> cause.

Yep no worries. I'll reword it to incorporate the full path from my mail.

> I went back and grabbed some real world system-wide stack traces, since I now 
> know what to trigger on.  A typical example is:
>
> interrupt_return_srr_user()
>  interrupt_exit_user_prepare()
>   interrupt_exit_user_prepare_main()
>schedule()
> __schedule()
>  __switch_to()
>   giveup_all()
># tsk->thread.regs->msr MSR_FP is still set here
>__giveup_fpu()
> save_fpu()
> # fr0 is clobbered and potentially live in userspace

fr0 is not live there.

__giveup_fpu() does roughly:

msr = tsk->thread.regs->msr;
msr &= ~(MSR_FP|MSR_FE0|MSR_FE1);
msr &= ~MSR_VSX;
tsk->thread.regs->msr = msr;

ie. it clears the FP etc. bits from the task's MSR. That means the FP
state will be reloaded from the thread struct before the task is run again.

Also on that path we're switching to another task, so we'll be reloading
the other task's FP state before returning to userspace.

So I don't see any bug there.

There's only two places that call save_fpu() and skip the giveup logic,
which is save_all() and kvmppc_save_user_regs().

save_all() is only called via clone() so I think that's unable to
actually cause visible register corruption as I described in my previous
mail.

I thought the KVM case was similar, as it's called via an ioctl, but
I'll have to talk to Nick as his mail indicates otherwise.

cheers


[PATCH] powerpc: add crtsavres.o to always-y instead of extra-y

2023-11-20 Thread Masahiro Yamada
crtsavres.o is linked to modules. However, as explained in commit
d0e628cd817f ("kbuild: doc: clarify the difference between extra-y
and always-y"), 'make modules' does not build extra-y.

For example, the following command fails:

  $ make ARCH=powerpc LLVM=1 KBUILD_MODPOST_WARN=1 mrproper ps3_defconfig modules
[snip]
LD [M]  arch/powerpc/platforms/cell/spufs/spufs.ko
  ld.lld: error: cannot open arch/powerpc/lib/crtsavres.o: No such file or directory
  make[3]: *** [scripts/Makefile.modfinal:56: arch/powerpc/platforms/cell/spufs/spufs.ko] Error 1
  make[2]: *** [Makefile:1844: modules] Error 2
  make[1]: *** [/home/masahiro/workspace/linux-kbuild/Makefile:350: __build_one_by_one] Error 2
  make: *** [Makefile:234: __sub-make] Error 2

Signed-off-by: Masahiro Yamada 
---

 arch/powerpc/lib/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index 51ad0397c17a..6eac63e79a89 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -45,7 +45,7 @@ obj-$(CONFIG_FUNCTION_ERROR_INJECTION)	+= error-inject.o
 # so it is only needed for modules, and only for older linkers which
 # do not support --save-restore-funcs
 ifndef CONFIG_LD_IS_BFD
-extra-$(CONFIG_PPC64)  += crtsavres.o
+always-$(CONFIG_PPC64) += crtsavres.o
 endif
 
 obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \
-- 
2.40.1



Re: [PATCH v4 2/2] ASoC: imx-rpmsg: Force codec power on in low power audio mode

2023-11-20 Thread Mark Brown
On Mon, Oct 23, 2023 at 10:07:18AM +0800, Chancel Liu wrote:
> Low power audio mode requires the bound codec to stay powered on while
> the A-core enters suspend so the M-core can continue playing music.
> 
> The ASoC machine driver reads the DAPM endpoints from the
> "ignore-suspend-widgets" DT property and then forces the path between
> these endpoints to ignore suspend.

This breaks an x86 allmodconfig build:

/build/stage/linux/sound/soc/fsl/imx-rpmsg.c: In function 'imx_rpmsg_late_probe':
/build/stage/linux/sound/soc/fsl/imx-rpmsg.c:60:46: error: implicit declaration of function 'of_find_device_by_node'; did you mean 'of_find_i2c_device_by_node'? [-Werror=implicit-function-declaration]
   60 |                 codec_pdev = of_find_device_by_node(codec_np);
      |                              ^~~~~~~~~~~~~~~~~~~~~~
      |                              of_find_i2c_device_by_node
/build/stage/linux/sound/soc/fsl/imx-rpmsg.c:60:44: error: assignment to 'struct platform_device *' from 'int' makes pointer from integer without a cast [-Werror=int-conversion]
   60 |                 codec_pdev = of_find_device_by_node(codec_np);
      |                            ^
cc1: all warnings being treated as errors
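
For what it's worth, of_find_device_by_node() is declared in
of_platform.h, so presumably the fix is just the missing include; a
sketch:

	#include <linux/of_platform.h>	/* for of_find_device_by_node() */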




[PATCH 0/5] usb: gadget: udc: Convert to platform remove callback returning void

2023-11-20 Thread Uwe Kleine-König
Hello,

this patch set converts the platform drivers below drivers/usb/gadget to
use .remove_new. These drivers all have an error path if the driver is
still in use. Returning there early leaks resources, but fixing this
isn't trivial, so I just added an error message. The patches don't make
a difference to the drivers apart from the improved error message.

See commit 5c5a7680e67b ("platform: Provide a remove callback that
returns no value") for an extended explanation and the eventual goal of
.remove_new().
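
The shape of each conversion is the same (a minimal sketch with a made-up
driver name; patch 2/5 below shows a real instance):

	#include <linux/platform_device.h>

	static int foo_probe(struct platform_device *pdev)
	{
		return 0;	/* setup elided */
	}

	static void foo_remove(struct platform_device *pdev)
	{
		/* same teardown as before; nothing to return */
	}

	static struct platform_driver foo_driver = {
		.driver		= { .name = "foo" },
		.probe		= foo_probe,
		.remove_new	= foo_remove,	/* was: .remove, returning int */
	};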

Best regards
Uwe

Uwe Kleine-König (5):
  usb: gadget: at91_udc: Convert to platform remove callback returning void
  usb: gadget: fsl_udc: Convert to platform remove callback returning void
  usb: gadget: gr_udc: Convert to platform remove callback returning void
  usb: gadget: lpc32xx_udc: Convert to platform remove callback returning void
  usb: gadget: pxa25x_udc: Convert to platform remove callback returning void

 drivers/usb/gadget/udc/at91_udc.c | 13 +++--
 drivers/usb/gadget/udc/fsl_udc_core.c | 13 +++--
 drivers/usb/gadget/udc/gr_udc.c   | 13 +++--
 drivers/usb/gadget/udc/lpc32xx_udc.c  | 13 +++--
 drivers/usb/gadget/udc/pxa25x_udc.c   | 12 +++-
 5 files changed, 35 insertions(+), 29 deletions(-)

base-commit: 5a82d69d48c82e89aef44483d2a129f869f3506a
-- 
2.42.0



[PATCH 2/5] usb: gadget: fsl_udc: Convert to platform remove callback returning void

2023-11-20 Thread Uwe Kleine-König
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

In the error path, emit an error message replacing the (less useful)
message emitted by the core. Apart from the improved error message there is no
change in behaviour.

Signed-off-by: Uwe Kleine-König 
---
 drivers/usb/gadget/udc/fsl_udc_core.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/gadget/udc/fsl_udc_core.c 
b/drivers/usb/gadget/udc/fsl_udc_core.c
index 2693a10eb0c7..535b6e79a198 100644
--- a/drivers/usb/gadget/udc/fsl_udc_core.c
+++ b/drivers/usb/gadget/udc/fsl_udc_core.c
@@ -2532,15 +2532,18 @@ static int fsl_udc_probe(struct platform_device *pdev)
 /* Driver removal function
  * Free resources and finish pending transactions
  */
-static int fsl_udc_remove(struct platform_device *pdev)
+static void fsl_udc_remove(struct platform_device *pdev)
 {
struct resource *res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
	struct fsl_usb2_platform_data *pdata = dev_get_platdata(&pdev->dev);
 
DECLARE_COMPLETION_ONSTACK(done);
 
-   if (!udc_controller)
-   return -ENODEV;
+   if (!udc_controller) {
+   dev_err(&pdev->dev,
+   "Driver still in use but removing anyhow\n");
+   return;
+   }
 
	udc_controller->done = &done;
	usb_del_gadget_udc(&udc_controller->gadget);
@@ -2568,8 +2571,6 @@ static int fsl_udc_remove(struct platform_device *pdev)
 */
if (pdata->exit)
pdata->exit(pdev);
-
-   return 0;
 }
 
 /*-
@@ -2667,7 +2668,7 @@ static const struct platform_device_id fsl_udc_devtype[] = {
 MODULE_DEVICE_TABLE(platform, fsl_udc_devtype);
 static struct platform_driver udc_driver = {
.probe  = fsl_udc_probe,
-   .remove = fsl_udc_remove,
+   .remove_new = fsl_udc_remove,
.id_table   = fsl_udc_devtype,
/* these suspend and resume are not usb suspend and resume */
.suspend= fsl_udc_suspend,
-- 
2.42.0



Re: [PATCH v4 07/13] powerpc/rtas: Warn if per-function lock isn't held

2023-11-20 Thread IBM
Nathan Lynch via B4 Relay 
writes:

> From: Nathan Lynch 
>
> If the function descriptor has a populated lock member, then callers
> are required to hold it across calls. Now that the firmware activation
> sequence is appropriately guarded, we can warn when the requirement
> isn't satisfied.
>
> __do_enter_rtas_trace() gets reorganized a bit as a result of
> performing the function descriptor lookup unconditionally now.
>

Reviewed-by: Aneesh Kumar K.V (IBM) 

> Signed-off-by: Nathan Lynch 
> ---
>  arch/powerpc/kernel/rtas.c | 21 +
>  1 file changed, 9 insertions(+), 12 deletions(-)
>
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index e38ba05ad613..deb6289fcf9c 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -685,28 +685,25 @@ static void __do_enter_rtas(struct rtas_args *args)
>  
>  static void __do_enter_rtas_trace(struct rtas_args *args)
>  {
> - const char *name = NULL;
> + struct rtas_function *func = rtas_token_to_function(be32_to_cpu(args->token));
>  
> - if (args == &rtas_args)
> - lockdep_assert_held(&rtas_lock);
>   /*
> -  * If the tracepoints that consume the function name aren't
> -  * active, avoid the lookup.
> +  * If there is a per-function lock, it must be held by the
> +  * caller.
>*/
> - if ((trace_rtas_input_enabled() || trace_rtas_output_enabled())) {
> - const s32 token = be32_to_cpu(args->token);
> - const struct rtas_function *func = rtas_token_to_function(token);
> + if (func->lock)
> + WARN_ON(!mutex_is_locked(func->lock));
>  
> - name = func->name;
> - }
> + if (args == &rtas_args)
> + lockdep_assert_held(&rtas_lock);
>  
> - trace_rtas_input(args, name);
> + trace_rtas_input(args, func->name);
>   trace_rtas_ll_entry(args);
>  
>   __do_enter_rtas(args);
>  
>   trace_rtas_ll_exit(args);
> - trace_rtas_output(args, name);
> + trace_rtas_output(args, func->name);
>  }
>  
>  static void do_enter_rtas(struct rtas_args *args)
>
> -- 
> 2.41.0


Re: [PATCH v4 06/13] powerpc/rtas: Serialize firmware activation sequences

2023-11-20 Thread IBM
Nathan Lynch via B4 Relay 
writes:

> From: Nathan Lynch 
>
> Use the function lock API to prevent interleaving call sequences of
> the ibm,activate-firmware RTAS function, which typically requires
> multiple calls to complete the update. While the spec does not
> specifically prohibit interleaved sequences, there's almost certainly
> no advantage to allowing them.
>

Can we document the equivalent thing that userspace does?

Reviewed-by: Aneesh Kumar K.V (IBM) 

> Signed-off-by: Nathan Lynch 
> ---
>  arch/powerpc/kernel/rtas.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 52f2242d0c28..e38ba05ad613 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -1753,10 +1753,14 @@ void rtas_activate_firmware(void)
>   return;
>   }
>  
> + rtas_function_lock(RTAS_FN_IBM_ACTIVATE_FIRMWARE);
> +
>   do {
>   fwrc = rtas_call(token, 0, 1, NULL);
>   } while (rtas_busy_delay(fwrc));
>  
> + rtas_function_unlock(RTAS_FN_IBM_ACTIVATE_FIRMWARE);
> +
>   if (fwrc)
>   pr_err("ibm,activate-firmware failed (%i)\n", fwrc);
>  }
>
> -- 
> 2.41.0


Re: [PATCH v4 05/13] powerpc/rtas: Facilitate high-level call sequences

2023-11-20 Thread IBM
Nathan Lynch via B4 Relay 
writes:

> From: Nathan Lynch 
>
> On RTAS platforms there is a general restriction that the OS must not
> enter RTAS on more than one CPU at a time. This low-level
> serialization requirement is satisfied by holding a spin
> lock (rtas_lock) across most RTAS function invocations.
>
> However, some pseries RTAS functions require multiple successive calls
> to complete a logical operation. Beginning a new call sequence for such a
> function may disrupt any other sequences of that function already in
> progress. Safe and reliable use of these functions effectively
> requires higher-level serialization beyond what is already done at the
> level of RTAS entry and exit.
>
> Where a sequence-based RTAS function is invoked only through
> sys_rtas(), with no in-kernel users, there is no issue as far as the
> kernel is concerned. User space is responsible for appropriately
> serializing its call sequences. (Whether user space code actually
> takes measures to prevent sequence interleaving is another matter.)
> Examples of such functions currently include ibm,platform-dump and
> ibm,get-vpd.
>
> But where a sequence-based RTAS function has both user space and
> in-kernel users, there is a hazard. Even if the in-kernel call sites
> of such a function serialize their sequences correctly, a user of
> sys_rtas() can invoke the same function at any time, potentially
> disrupting a sequence in progress.
>
> So in order to prevent disruption of kernel-based RTAS call sequences,
> they must serialize not only with themselves but also with sys_rtas()
> users, somehow. Preferably without adding global locks or adding more
> function-specific hacks to sys_rtas(). This is a prerequisite for
> adding an in-kernel call sequence of ibm,get-vpd, which is in a change
> to follow.
>
> Note that it has never been feasible for the kernel to prevent
> sys_rtas()-based sequences from being disrupted because control
> returns to user space on every call. sys_rtas()-based users of these
> functions have always been, and continue to be, responsible for
> coordinating their call sequences with other users, even those which
> may invoke the RTAS functions through less direct means than
> sys_rtas(). This is an unavoidable consequence of exposing
> sequence-based RTAS functions through sys_rtas().
>
> * Add new rtas_function_lock() and rtas_function_unlock() APIs for use
>   with sequence-based RTAS functions.
>
> * Add an optional per-function mutex to struct rtas_function. When this
>   member is set, kernel-internal callers of the RTAS function are
>   required to guard their call sequences with rtas_function_lock() and
>   rtas_function_unlock(). This requirement will be enforced in a later
>   change, after all affected call sites are updated.
>
> * Populate the lock members of function table entries where
>   serialization of call sequences is known to be necessary, along with
>   justifying commentary.
>
> * In sys_rtas(), acquire the per-function mutex when it is present.
>
> There should be no perceivable change introduced here except that
> concurrent callers of the same RTAS function via sys_rtas() may block
> on a mutex instead of spinning on rtas_lock. Changes to follow will
> add rtas_function_lock()/unlock() pairs to kernel-based call
> sequences.
>

Can you add an example of the last part? I had to look at patch 06 to
find the details:

rtas_function_lock(RTAS_FN_IBM_ACTIVATE_FIRMWARE);

do {
fwrc = rtas_call(token, 0, 1, NULL);
} while (rtas_busy_delay(fwrc));

rtas_function_unlock(RTAS_FN_IBM_ACTIVATE_FIRMWARE);

Reviewed-by: Aneesh Kumar K.V (IBM) 

>
> Signed-off-by: Nathan Lynch 
> ---
>  arch/powerpc/include/asm/rtas.h |   2 +
>  arch/powerpc/kernel/rtas.c  | 101 +++-
>  2 files changed, 101 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> index b73010583a8d..9a20caba6858 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -421,6 +421,8 @@ static inline bool rtas_function_implemented(const 
> rtas_fn_handle_t handle)
>  {
>   return rtas_function_token(handle) != RTAS_UNKNOWN_SERVICE;
>  }
> +void rtas_function_lock(rtas_fn_handle_t handle);
> +void rtas_function_unlock(rtas_fn_handle_t handle);
>  extern int rtas_token(const char *service);
>  extern int rtas_service_present(const char *service);
>  extern int rtas_call(int token, int, int, int *, ...);
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 1fc0b3fffdd1..52f2242d0c28 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -18,6 +18,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -70,14 +71,34 @@ struct rtas_filter {
>   *ppc64le, and we want to keep it that way. It 
> does
>   * 

Re: [PATCH v4 04/13] powerpc/rtas: Factor out function descriptor lookup

2023-11-20 Thread IBM
Nathan Lynch via B4 Relay 
writes:

> From: Nathan Lynch 
>
> Move the function descriptor table lookup out of rtas_function_token()
> into a separate routine for use in new code to follow. No functional
> change.
>

Reviewed-by: Aneesh Kumar K.V (IBM) 

> Signed-off-by: Nathan Lynch 
> ---
>  arch/powerpc/kernel/rtas.c | 31 +++
>  1 file changed, 19 insertions(+), 12 deletions(-)
>
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index f0051881348a..1fc0b3fffdd1 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -469,29 +469,36 @@ static struct rtas_function rtas_function_table[] 
> __ro_after_init = {
>  static DEFINE_RAW_SPINLOCK(rtas_lock);
>  static struct rtas_args rtas_args;
>  
> -/**
> - * rtas_function_token() - RTAS function token lookup.
> - * @handle: Function handle, e.g. RTAS_FN_EVENT_SCAN.
> - *
> - * Context: Any context.
> - * Return: the token value for the function if implemented by this platform,
> - * otherwise RTAS_UNKNOWN_SERVICE.
> - */
> -s32 rtas_function_token(const rtas_fn_handle_t handle)
> +static struct rtas_function *rtas_function_lookup(const rtas_fn_handle_t handle)
>  {
>   const size_t index = handle.index;
>   const bool out_of_bounds = index >= ARRAY_SIZE(rtas_function_table);
>  
>   if (WARN_ONCE(out_of_bounds, "invalid function index %zu", index))
> - return RTAS_UNKNOWN_SERVICE;
> + return NULL;
>   /*
>* Various drivers attempt token lookups on non-RTAS
>* platforms.
>*/
>   if (!rtas.dev)
> - return RTAS_UNKNOWN_SERVICE;
> + return NULL;
> +
> + return &rtas_function_table[index];
> +}
> +
> +/**
> + * rtas_function_token() - RTAS function token lookup.
> + * @handle: Function handle, e.g. RTAS_FN_EVENT_SCAN.
> + *
> + * Context: Any context.
> + * Return: the token value for the function if implemented by this platform,
> + * otherwise RTAS_UNKNOWN_SERVICE.
> + */
> +s32 rtas_function_token(const rtas_fn_handle_t handle)
> +{
> + const struct rtas_function *func = rtas_function_lookup(handle);
>  
> - return rtas_function_table[index].token;
> + return func ? func->token : RTAS_UNKNOWN_SERVICE;
>  }
>  EXPORT_SYMBOL_GPL(rtas_function_token);
>  
>
> -- 
> 2.41.0


Re: [PATCH v4 03/13] powerpc/rtas: Add function return status constants

2023-11-20 Thread IBM
Nathan Lynch via B4 Relay 
writes:

> From: Nathan Lynch 
>
> Not all of the generic RTAS function statuses specified in PAPR have
> symbolic constants and descriptions in rtas.h. Fix this, providing a
> little more background, slightly updating the existing wording, and
> improving the formatting.
>
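
As context for the new comments: these codes mostly feed the standard
retry idiom, which looks roughly like this (a sketch mirroring the
existing rtas_busy_delay() usage in rtas.c):

	/* sketch: how callers typically consume these status codes */
	static int example_rtas_call(int token)
	{
		int rc;

		do {
			rc = rtas_call(token, 0, 1, NULL);
		} while (rtas_busy_delay(rc)); /* absorbs RTAS_BUSY and the 990x delays */

		return rc == RTAS_SUCCESS ? 0 : -EIO;
	}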

Reviewed-by: Aneesh Kumar K.V (IBM) 

> Signed-off-by: Nathan Lynch 
> ---
>  arch/powerpc/include/asm/rtas.h | 25 +++--
>  1 file changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> index c697c3c74694..b73010583a8d 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -201,12 +201,25 @@ typedef struct {
>  /* Memory set aside for sys_rtas to use with calls that need a work area. */
>  #define RTAS_USER_REGION_SIZE (64 * 1024)
>  
> -/* RTAS return status codes */
> -#define RTAS_HARDWARE_ERROR		-1 /* Hardware Error */
> -#define RTAS_BUSY			-2 /* RTAS Busy */
> -#define RTAS_INVALID_PARAMETER		-3 /* Invalid indicator/domain/sensor etc. */
> -#define RTAS_EXTENDED_DELAY_MIN	9900
> -#define RTAS_EXTENDED_DELAY_MAX	9905
> +/*
> + * Common RTAS function return values, derived from the table "RTAS
> + * Status Word Values" in PAPR+ 7.2.8: "Return Codes". If a function
> + * can return a value in this table then generally it has the meaning
> + * listed here. More extended commentary in the documentation for
> + * rtas_call().
> + *
> + * RTAS functions may use negative and positive numbers not in this
> + * set for function-specific error and success conditions,
> + * respectively.
> + */
> +#define RTAS_SUCCESS			 0 /* Success. */
> +#define RTAS_HARDWARE_ERROR		-1 /* Hardware or other unspecified error. */
> +#define RTAS_BUSY			-2 /* Retry immediately. */
> +#define RTAS_INVALID_PARAMETER		-3 /* Invalid indicator/domain/sensor etc. */
> +#define RTAS_UNEXPECTED_STATE_CHANGE	-7 /* Seems limited to EEH and slot reset. */
> +#define RTAS_EXTENDED_DELAY_MIN	      9900 /* Retry after delaying for ~1ms. */
> +#define RTAS_EXTENDED_DELAY_MAX	      9905 /* Retry after delaying for ~100s. */
> +#define RTAS_ML_ISOLATION_ERROR	     -9000 /* Multi-level isolation error. */
>  
>  /* statuses specific to ibm,suspend-me */
>  #define RTAS_SUSPEND_ABORTED 9000 /* Suspension aborted */
>
> -- 
> 2.41.0


Re: [PATCH v4 02/13] powerpc/rtas: Fall back to linear search on failed token->function lookup

2023-11-20 Thread IBM
Nathan Lynch via B4 Relay 
writes:

> From: Nathan Lynch 
>
> Enabling any of the powerpc:rtas_* tracepoints at boot is likely to
> result in an oops on RTAS platforms. For example, booting a QEMU
> pseries model with 'trace_event=powerpc:rtas_input' in the command
> line leads to:
>
>   BUG: Kernel NULL pointer dereference on read at 0x0008
>   Oops: Kernel access of bad area, sig: 7 [#1]
>   NIP [c004231c] do_enter_rtas+0x1bc/0x460
>   LR [c004231c] do_enter_rtas+0x1bc/0x460
>   Call Trace:
> do_enter_rtas+0x1bc/0x460 (unreliable)
> rtas_call+0x22c/0x4a0
> rtas_get_boot_time+0x80/0x14c
> read_persistent_clock64+0x124/0x150
> read_persistent_wall_and_boot_offset+0x28/0x58
> timekeeping_init+0x70/0x348
> start_kernel+0xa0c/0xc1c
> start_here_common+0x1c/0x20
>
> (This is preceded by a warning for the failed lookup in
> rtas_token_to_function().)
>
> This happens when __do_enter_rtas_trace() attempts a token to function
> descriptor lookup before the xarray containing the mappings has been
> set up.
>
> Fall back to linear scan of the table if rtas_token_to_function_xarray
> is empty.
>

Reviewed-by: Aneesh Kumar K.V (IBM) 

> Signed-off-by: Nathan Lynch 
> Fixes: 24098f580e2b ("powerpc/rtas: add tracepoints around RTAS entry")
> ---
>  arch/powerpc/kernel/rtas.c | 18 ++
>  1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 1ad1869e2e96..f0051881348a 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -557,11 +557,21 @@ static const struct rtas_function *rtas_token_to_function(s32 token)
>   return NULL;
>  
>   func = xa_load(&rtas_token_to_function_xarray, token);
> + if (func)
> + return func;
> + /*
> +  * Fall back to linear scan in case the reverse mapping hasn't
> +  * been initialized yet.
> +  */
> + if (xa_empty(&rtas_token_to_function_xarray)) {
> + for_each_rtas_function(func) {
> + if (func->token == token)
> + return func;
> + }
> + }
>  
> - if (WARN_ONCE(!func, "unexpected failed lookup for token %d", token))
> - return NULL;
> -
> - return func;
> + WARN_ONCE(true, "unexpected failed lookup for token %d", token);
> + return NULL;
>  }
>  
>  /* This is here deliberately so it's only used in this file */
>
> -- 
> 2.41.0


Re: [PATCH v4 01/13] powerpc/rtas: Add for_each_rtas_function() iterator

2023-11-20 Thread IBM
Nathan Lynch via B4 Relay 
writes:

> From: Nathan Lynch 
>
> Add a convenience macro for iterating over every element of the
> internal function table and convert the one site that can use it. An
> additional user of the macro is anticipated in changes to follow.
>

Reviewed-by: Aneesh Kumar K.V (IBM) 

> Signed-off-by: Nathan Lynch 
> ---
>  arch/powerpc/kernel/rtas.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index eddc031c4b95..1ad1869e2e96 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -454,6 +454,11 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>   },
>  };
>  
> +#define for_each_rtas_function(funcp)   \
> + for (funcp = &rtas_function_table[0];   \
> +  funcp < &rtas_function_table[ARRAY_SIZE(rtas_function_table)]; \
> +  ++funcp)
> +
>  /*
>   * Nearly all RTAS calls need to be serialized. All uses of the
>   * default rtas_args block must hold rtas_lock.
> @@ -525,10 +530,10 @@ static DEFINE_XARRAY(rtas_token_to_function_xarray);
>  
>  static int __init rtas_token_to_function_xarray_init(void)
>  {
> + const struct rtas_function *func;
>   int err = 0;
>  
> - for (size_t i = 0; i < ARRAY_SIZE(rtas_function_table); ++i) {
> - const struct rtas_function *func = &rtas_function_table[i];
> + for_each_rtas_function(func) {
>   const s32 token = func->token;
>  
>   if (token == RTAS_UNKNOWN_SERVICE)
>
> -- 
> 2.41.0


Re: [RFC - is this a bug?] powerpc/lib/sstep: Asking for some light on this, please. :)

2023-11-20 Thread Gustavo A. R. Silva




On 11/20/23 09:21, Naveen N Rao wrote:

On Mon, Nov 20, 2023 at 08:33:45AM -0600, Gustavo A. R. Silva wrote:



On 11/20/23 08:25, Naveen N Rao wrote:

On Fri, Nov 17, 2023 at 12:36:01PM -0600, Gustavo A. R. Silva wrote:

Hi all,

I'm trying to fix the following -Wstringop-overflow warnings, and I'd like
to get your feedback on this, please:

In function 'do_byte_reverse',
  inlined from 'do_vec_store' at 
/home/gus/linux/arch/powerpc/lib/sstep.c:722:3,
  inlined from 'emulate_loadstore' at 
/home/gus/linux/arch/powerpc/lib/sstep.c:3510:9:
arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of size 0 
[-Werror=stringop-overflow=]
287 | up[3] = tmp;
| ~~^
arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
arch/powerpc/lib/sstep.c:708:11: note: at offset [24, 39] into destination 
object 'u' of size 16
708 | } u;
|   ^
In function 'do_byte_reverse',
  inlined from 'do_vec_store' at 
/home/gus/linux/arch/powerpc/lib/sstep.c:722:3,
  inlined from 'emulate_loadstore' at 
/home/gus/linux/arch/powerpc/lib/sstep.c:3510:9:
arch/powerpc/lib/sstep.c:289:23: error: writing 8 bytes into a region of size 0 
[-Werror=stringop-overflow=]
289 | up[2] = byterev_8(up[1]);
| ~~^~
arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
arch/powerpc/lib/sstep.c:708:11: note: at offset 16 into destination object 'u' 
of size 16
708 | } u;
|   ^
In function 'do_byte_reverse',
  inlined from 'do_vec_load' at 
/home/gus/linux/arch/powerpc/lib/sstep.c:691:3,
  inlined from 'emulate_loadstore' at 
/home/gus/linux/arch/powerpc/lib/sstep.c:3439:9:
arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of size 0 
[-Werror=stringop-overflow=]
287 | up[3] = tmp;
| ~~^
arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination 
object 'u' of size 16
681 | } u = {};
|   ^
arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination 
object 'u' of size 16
arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination 
object 'u' of size 16

The origin of the issue seems to be the following calls to function 
`do_vec_store()`, when
`size > 16`, or more precisely when `size == 32`

arch/powerpc/lib/sstep.c:
3436 case LOAD_VMX:
3437 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
3438 return 0;
3439 err = do_vec_load(op->reg, ea, size, regs, cross_endian);
3440 break;
...
3507 case STORE_VMX:
3508 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
3509 return 0;
3510 err = do_vec_store(op->reg, ea, size, regs, cross_endian);
3511 break;


LOAD_VMX and STORE_VMX are set in analyse_instr() and size does not
exceed 16. So, the warning looks incorrect to me.


Does that mean that this piece of code is never actually executed:

  281 case 32: {
  282 unsigned long *up = (unsigned long *)ptr;
  283 unsigned long tmp;
  284
  285 tmp = byterev_8(up[0]);
  286 up[0] = byterev_8(up[3]);
  287 up[3] = tmp;
  288 tmp = byterev_8(up[2]);
  289 up[2] = byterev_8(up[1]);
  290 up[1] = tmp;
  291 break;
  292 }

or rather, under which conditions the above code is executed, if any?


Please see git history to understand the commit that introduced that
code. You can do that by starting with a 'git blame' on the file. You'll
find that the commit that introduced the above hunk was af99da74333b
("powerpc/sstep: Support VSX vector paired storage access
instructions").

The above code path is hit via
do_vsx_load()->emulate_vsx_load()->do_byte_reverse()


It seems there is another path (at least the one that triggers the
-Wstringop-overflow warnings):

emulate_step()->emulate_loadstore()->do_vec_load()->do_byte_reverse()

emulate_step() {
...
if (OP_IS_LOAD_STORE(type)) {
err = emulate_loadstore(regs, &op);
if (err)
return 0;
goto instr_done;
}
...
}

emulate_loadstore() {
...
#ifdef CONFIG_ALTIVEC
case LOAD_VMX:
if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
return 0;
err = do_vec_load(op->reg, ea, size, regs, cross_endian); // with size == 32
break;
#endif
...
#ifdef CONFIG_ALTIVEC
case STORE_VMX:
if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
return 0;
err = do_vec_store(op->reg, ea, size, 

Re: [PATCH] ASoC: fsl_mqs: Remove duplicate linux/of.h header

2023-11-20 Thread Mark Brown
On Sun, 19 Nov 2023 10:45:14 +, Lucas Tanure wrote:
> Remove linux/of.h as it is included more than once.
> Reported by make includecheck.
> 
> 

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/1] ASoC: fsl_mqs: Remove duplicate linux/of.h header
  commit: cac15dc25f416972f8dd0b6a8d74daa79dc3a998

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark



Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

2023-11-20 Thread Timothy Pearson



- Original Message -
> From: "Michael Ellerman" 
> To: "Timothy Pearson" , "Jens Axboe" 
> , "regressions"
> , "npiggin" , "christophe 
> leroy" ,
> "linuxppc-dev" 
> Sent: Monday, November 20, 2023 1:10:06 AM
> Subject: Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec 
> register  save

> Hi Timothy,
> 
> Great work debugging this. I think your fix is good, but I want to understand 
> it
> a bit more to make sure I can explain why we haven't seen it outside of
> io-uring.
> If this can be triggered outside of io-uring then I have even more backporting
> in my future :}
> 
> Typically save_fpu() is called from __giveup_fpu() which saves the FP regs and
> also *turns off FP* in the tasks MSR, meaning the kernel will reload the FP 
> regs
> from the thread struct before letting the task use FP again. So in that case
> save_fpu() is free to clobber fr0 because the FP regs no longer hold live 
> values
> for the task.
> 
> There is another case though, which is the path via:
>  copy_process()
>dup_task_struct()
>  arch_dup_task_struct()
>flush_all_to_thread()
>  save_all()
> 
> That path saves the FP regs but leaves them live. That's meant as an
> optimisation for a process that's using FP/VSX and then calls fork(), leaving
> the regs live means the parent process doesn't have to take a fault after the
> fork to get its FP regs back.
> 
> That path does clobber fr0, but fr0 is volatile across a syscall, and the only
> way to reach copy_process() from userspace is via a syscall. So in normal 
> usage
> fr0 being clobbered across a syscall shouldn't cause data corruption.
> 
> Even if we handle a signal on the return from the fork() syscall, the worst 
> that
> happens is that the task's thread struct holds the clobbered fr0, but the task
> doesn't care (because fr0 is volatile across the syscall anyway).
> 
> That path is something like:
> 
> system_call_vectored_common()
>  system_call_exception()
>sys_fork()
>  kernel_clone()
>copy_process()
>  dup_task_struct()
>arch_dup_task_struct()
>  flush_all_to_thread()
>save_all()
>  if (tsk->thread.regs->msr & MSR_FP)
>save_fpu()
># does not clear MSR_FP from regs->msr
>  syscall_exit_prepare()
>interrupt_exit_user_prepare_main()
>  do_notify_resume()
>get_signal()
>handle_rt_signal64()
>  prepare_setup_sigcontext()
>flush_fp_to_thread()
>  if (tsk->thread.regs->msr & MSR_FP)
>giveup_fpu()
>  __giveup_fpu
>save_fpu()
># clobbered fr0 is saved, but task considers it volatile
># across syscall anyway
> 
> 
> But we now have a new path, because io-uring can call copy_process() via
> create_io_thread() from the signal handling path. That's OK if the signal is
> handled as we return from a syscall, but it's not OK if the signal is handled
> due to some other interrupt.
> 
> Which is:
> 
> interrupt_return_srr_user()
>  interrupt_exit_user_prepare()
>interrupt_exit_user_prepare_main()
>  do_notify_resume()
>get_signal()
>  task_work_run()
>create_worker_cb()
>  create_io_worker()
>copy_process()
>  dup_task_struct()
>arch_dup_task_struct()
>  flush_all_to_thread()
>save_all()
>  if (tsk->thread.regs->msr & MSR_FP)
>save_fpu()
># fr0 is clobbered and potentially live in 
> userspace
> 
> 
> So tldr I think the corruption is only an issue since io-uring started doing
> the clone via signal, which I think matches the observed timeline of this bug
> appearing.

I agree the corruption really only started showing up in earnest on io_uring 
clone-via-signal, as this was confirmed several times in the course of 
debugging.  Bear in mind however that we have seen very, very rare crashes over 
several years in other tasks, and I am starting to think this bug might be the 
root cause (see below).

Note as well that I may very well have a wrong call order in the commit 
message, since I was relying on a couple of WARN_ON() macros I inserted to 
check for a similar (but not identical) condition and didn't spend much time 
getting new traces after identifying the root cause.

I went back and grabbed some real world system-wide stack traces, since I now 
know what to trigger on.  A typical example is:

interrupt_return_srr_user()
 interrupt_exit_user_prepare()
  interrupt_exit_user_prepare_main()
   schedule()
__schedule()
 __switch_to()
  giveup_all()
   # tsk->thread.regs->msr MSR_FP is still set here
   __giveup_fpu()
save_fpu()
# fr0 is clobbered and potentially live in userspace

This indicates that the pending 

Re: [PATCH] ASoC: fsl_sai: Fix no frame sync clock issue on i.MX8MP

2023-11-20 Thread Mark Brown
On Mon, 20 Nov 2023 18:05:35 +0800, Shengjiu Wang wrote:
> On i.MX8MP, if TERE and FSD_MSTR are enabled before the word width is
> configured, a "no frame sync clock" issue appears, because the old word
> width affects the generation of frame sync.
> 
> TERE is enabled earlier only in the i.MX8MP case due to a hardware
> limitation, so FSD_MSTR needs to be disabled before configuring the word
> width and then re-enabled for this specific case.
> 
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/1] ASoC: fsl_sai: Fix no frame sync clock issue on i.MX8MP
  commit: 14e8442e0789598514f3c9de014950de9feda7a4

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark



Re: [RFC - is this a bug?] powerpc/lib/sstep: Asking for some light on this, please. :)

2023-11-20 Thread Naveen N Rao
On Mon, Nov 20, 2023 at 08:33:45AM -0600, Gustavo A. R. Silva wrote:
> 
> 
> On 11/20/23 08:25, Naveen N Rao wrote:
> > On Fri, Nov 17, 2023 at 12:36:01PM -0600, Gustavo A. R. Silva wrote:
> > > Hi all,
> > > 
> > > I'm trying to fix the following -Wstringop-overflow warnings, and I'd like
> > > to get your feedback on this, please:
> > > 
> > > In function 'do_byte_reverse',
> > >  inlined from 'do_vec_store' at 
> > > /home/gus/linux/arch/powerpc/lib/sstep.c:722:3,
> > >  inlined from 'emulate_loadstore' at 
> > > /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9:
> > > arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of 
> > > size 0 [-Werror=stringop-overflow=]
> > >287 | up[3] = tmp;
> > >| ~~^
> > > arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
> > > arch/powerpc/lib/sstep.c:708:11: note: at offset [24, 39] into 
> > > destination object 'u' of size 16
> > >708 | } u;
> > >|   ^
> > > In function 'do_byte_reverse',
> > >  inlined from 'do_vec_store' at 
> > > /home/gus/linux/arch/powerpc/lib/sstep.c:722:3,
> > >  inlined from 'emulate_loadstore' at 
> > > /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9:
> > > arch/powerpc/lib/sstep.c:289:23: error: writing 8 bytes into a region of 
> > > size 0 [-Werror=stringop-overflow=]
> > >289 | up[2] = byterev_8(up[1]);
> > >| ~~^~
> > > arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
> > > arch/powerpc/lib/sstep.c:708:11: note: at offset 16 into destination 
> > > object 'u' of size 16
> > >708 | } u;
> > >|   ^
> > > In function 'do_byte_reverse',
> > >  inlined from 'do_vec_load' at 
> > > /home/gus/linux/arch/powerpc/lib/sstep.c:691:3,
> > >  inlined from 'emulate_loadstore' at 
> > > /home/gus/linux/arch/powerpc/lib/sstep.c:3439:9:
> > > arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of 
> > > size 0 [-Werror=stringop-overflow=]
> > >287 | up[3] = tmp;
> > >| ~~^
> > > arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
> > > arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into 
> > > destination object 'u' of size 16
> > >681 | } u = {};
> > >|   ^
> > > arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into 
> > > destination object 'u' of size 16
> > > arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into 
> > > destination object 'u' of size 16
> > > 
> > > The origin of the issue seems to be the following calls to function 
> > > `do_vec_store()`, when
> > > `size > 16`, or more precisely when `size == 32`
> > > 
> > > arch/powerpc/lib/sstep.c:
> > > 3436 case LOAD_VMX:
> > > 3437 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
> > > 3438 return 0;
> > > 3439 err = do_vec_load(op->reg, ea, size, regs, 
> > > cross_endian);
> > > 3440 break;
> > > ...
> > > 3507 case STORE_VMX:
> > > 3508 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
> > > 3509 return 0;
> > > 3510 err = do_vec_store(op->reg, ea, size, regs, 
> > > cross_endian);
> > > 3511 break;
> > 
> > LOAD_VMX and STORE_VMX are set in analyse_instr() and size does not
> > exceed 16. So, the warning looks incorrect to me.
> 
> Does that mean that this piece of code is never actually executed:
> 
>  281 case 32: {
>  282 unsigned long *up = (unsigned long *)ptr;
>  283 unsigned long tmp;
>  284
>  285 tmp = byterev_8(up[0]);
>  286 up[0] = byterev_8(up[3]);
>  287 up[3] = tmp;
>  288 tmp = byterev_8(up[2]);
>  289 up[2] = byterev_8(up[1]);
>  290 up[1] = tmp;
>  291 break;
>  292 }
> 
> or rather, under which conditions the above code is executed, if any?

Please see git history to understand the commit that introduced that 
code. You can do that by starting with a 'git blame' on the file. You'll 
find that the commit that introduced the above hunk was af99da74333b 
("powerpc/sstep: Support VSX vector paired storage access 
instructions").

The above code path is hit via 
do_vsx_load()->emulate_vsx_load()->do_byte_reverse()
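
To make the warning concrete, GCC is effectively reasoning about a shape
like this (a sketch; the union is modeled on the one in
do_vec_load()/do_vec_store()):

	union {
		__vector128 v;
		u8 b[sizeof(__vector128)];	/* 16 bytes */
	} u = {};

	/*
	 * case 32 in do_byte_reverse() treats the buffer as
	 * unsigned long up[4], so up[3] sits at byte offset 24,
	 * past the end of the 16-byte object.
	 */
	do_byte_reverse(&u, 32);

The compiler just can't see that size never exceeds 16 on the do_vec_*
paths.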


- Naveen



Re: [RFC - is this a bug?] powerpc/lib/sstep: Asking for some light on this, please. :)

2023-11-20 Thread Gustavo A. R. Silva




On 11/20/23 08:25, Naveen N Rao wrote:

On Fri, Nov 17, 2023 at 12:36:01PM -0600, Gustavo A. R. Silva wrote:

Hi all,

I'm trying to fix the following -Wstringop-overflow warnings, and I'd like
to get your feedback on this, please:

In function 'do_byte_reverse',
 inlined from 'do_vec_store' at 
/home/gus/linux/arch/powerpc/lib/sstep.c:722:3,
 inlined from 'emulate_loadstore' at 
/home/gus/linux/arch/powerpc/lib/sstep.c:3510:9:
arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of size 0 
[-Werror=stringop-overflow=]
   287 | up[3] = tmp;
   | ~~^
arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
arch/powerpc/lib/sstep.c:708:11: note: at offset [24, 39] into destination 
object 'u' of size 16
   708 | } u;
   |   ^
In function 'do_byte_reverse',
 inlined from 'do_vec_store' at 
/home/gus/linux/arch/powerpc/lib/sstep.c:722:3,
 inlined from 'emulate_loadstore' at 
/home/gus/linux/arch/powerpc/lib/sstep.c:3510:9:
arch/powerpc/lib/sstep.c:289:23: error: writing 8 bytes into a region of size 0 
[-Werror=stringop-overflow=]
   289 | up[2] = byterev_8(up[1]);
   | ~~^~
arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
arch/powerpc/lib/sstep.c:708:11: note: at offset 16 into destination object 'u' 
of size 16
   708 | } u;
   |   ^
In function 'do_byte_reverse',
 inlined from 'do_vec_load' at 
/home/gus/linux/arch/powerpc/lib/sstep.c:691:3,
 inlined from 'emulate_loadstore' at 
/home/gus/linux/arch/powerpc/lib/sstep.c:3439:9:
arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of size 0 
[-Werror=stringop-overflow=]
   287 | up[3] = tmp;
   | ~~^
arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination 
object 'u' of size 16
   681 | } u = {};
   |   ^
arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination 
object 'u' of size 16
arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination 
object 'u' of size 16

The origin of the issue seems to be the following calls to function 
`do_vec_store()`, when
`size > 16`, or more precisely when `size == 32`

arch/powerpc/lib/sstep.c:
3436 case LOAD_VMX:
3437 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
3438 return 0;
3439 err = do_vec_load(op->reg, ea, size, regs, cross_endian);
3440 break;
...
3507 case STORE_VMX:
3508 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
3509 return 0;
3510 err = do_vec_store(op->reg, ea, size, regs, cross_endian);
3511 break;


LOAD_VMX and STORE_VMX are set in analyse_instr() and size does not
exceed 16. So, the warning looks incorrect to me.


Does that mean that this piece of code is never actually executed:

 281 case 32: {
 282 unsigned long *up = (unsigned long *)ptr;
 283 unsigned long tmp;
 284
 285 tmp = byterev_8(up[0]);
 286 up[0] = byterev_8(up[3]);
 287 up[3] = tmp;
 288 tmp = byterev_8(up[2]);
 289 up[2] = byterev_8(up[1]);
 290 up[1] = tmp;
 291 break;
 292 }

or rather, under which conditions the above code is executed, if any?

Thanks!
--
Gustavo


Re: [PATCH v2] powerpc: Don't clobber fr0/vs0 during fp|altivec register save

2023-11-20 Thread Nicholas Piggin
Yeah, awesome.

On Mon Nov 20, 2023 at 5:10 PM AEST, Michael Ellerman wrote:
> Hi Timothy,
>
> Great work debugging this. I think your fix is good, but I want to understand 
> it
> a bit more to make sure I can explain why we haven't seen it outside of 
> io-uring.

Analysis seems right to me.

Probably the best minimal fix. But I wonder if we should just use the
one path for saving/flushing/giving up, just use giveup instead of
save?

KVM looks odd too, and actually gets this wrong, in a way that's not
fixed by Timothy's patch, because it's just not restoring userspace
registers at all. Fortunately QEMU isn't in the habit of using
non-volatile FP/VEC registers over a VCPU ioctl, but there's no reason
it couldn't, since GCC/LLVM can easily use them. KVM really wants to
be using giveup.
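
For anyone following along, the distinction being drawn is roughly this
(a sketch, not the exact kernel code):

	/*
	 * save_fpu(tsk):     dump fr0-fr31/fpscr into the thread struct but
	 *                    leave MSR_FP set, so the registers stay live --
	 *                    even though fr0 was used as scratch for the
	 *                    FPSCR store.
	 *
	 * __giveup_fpu(tsk): save_fpu(tsk), then clear MSR_FP, so the state
	 *                    is reloaded from the thread struct before the
	 *                    task next uses FP -- clobbering fr0 is harmless.
	 */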

Thanks,
Nick

> If this can be triggered outside of io-uring then I have even more backporting
> in my future :}
>
> Typically save_fpu() is called from __giveup_fpu() which saves the FP regs and
> also *turns off FP* in the tasks MSR, meaning the kernel will reload the FP 
> regs
> from the thread struct before letting the task use FP again. So in that case
> save_fpu() is free to clobber fr0 because the FP regs no longer hold live 
> values
> for the task.
>
> There is another case though, which is the path via:
>   copy_process()
> dup_task_struct()
>   arch_dup_task_struct()
> flush_all_to_thread()
>   save_all()
>
> That path saves the FP regs but leaves them live. That's meant as an
> optimisation for a process that's using FP/VSX and then calls fork(), leaving
> the regs live means the parent process doesn't have to take a fault after the
> fork to get its FP regs back.
>
> That path does clobber fr0, but fr0 is volatile across a syscall, and the only
> way to reach copy_process() from userspace is via a syscall. So in normal 
> usage
> fr0 being clobbered across a syscall shouldn't cause data corruption.
>
> Even if we handle a signal on the return from the fork() syscall, the worst 
> that
> happens is that the task's thread struct holds the clobbered fr0, but the task
> doesn't care (because fr0 is volatile across the syscall anyway).
>
> That path is something like:
>
> system_call_vectored_common()
>   system_call_exception()
> sys_fork()
>   kernel_clone()
> copy_process()
>   dup_task_struct()
> arch_dup_task_struct()
>   flush_all_to_thread()
> save_all()
>   if (tsk->thread.regs->msr & MSR_FP)
> save_fpu()
> # does not clear MSR_FP from regs->msr
>   syscall_exit_prepare()
> interrupt_exit_user_prepare_main()
>   do_notify_resume()
> get_signal()
> handle_rt_signal64()
>   prepare_setup_sigcontext()
> flush_fp_to_thread()
>   if (tsk->thread.regs->msr & MSR_FP)
> giveup_fpu()
>   __giveup_fpu
> save_fpu()
> # clobbered fr0 is saved, but task considers it volatile
> # across syscall anyway
>
>
> But we now have a new path, because io-uring can call copy_process() via
> create_io_thread() from the signal handling path. That's OK if the signal is
> handled as we return from a syscall, but it's not OK if the signal is handled
> due to some other interrupt.
>
> Which is:
>
> interrupt_return_srr_user()
>   interrupt_exit_user_prepare()
> interrupt_exit_user_prepare_main()
>   do_notify_resume()
> get_signal()
>   task_work_run()
> create_worker_cb()
>   create_io_worker()
> copy_process()
>   dup_task_struct()
> arch_dup_task_struct()
>   flush_all_to_thread()
> save_all()
>   if (tsk->thread.regs->msr & MSR_FP)
> save_fpu()
> # fr0 is clobbered and potentially live in 
> userspace
>
>
> So tldr I think the corruption is only an issue since io-uring started doing
> the clone via signal, which I think matches the observed timeline of this bug
> appearing.
>
> Gotta run home, will have a closer look at the actual patch later on.
>
> cheers
>
>
> Timothy Pearson  writes:
> > During floating point and vector save to thread data fr0/vs0 are clobbered
> > by the FPSCR/VSCR store routine.  This leads to userspace register 
> > corruption
> > and application data corruption / crash under the following rare condition:
> >
> >  * A userspace thread is executing with VSX/FP mode enabled
> >  * The userspace thread is making active use of fr0 and/or vs0
> >  * An IPI is taken in kernel mode, forcing the userspace thread to 
> > reschedule
> >  * The userspace thread is interrupted by the IPI before accessing data it
> >previously stored in fr0/vs0
> >  * The thread being switched in by the IPI has a pending 

Re: [RFC - is this a bug?] powerpc/lib/sstep: Asking for some light on this, please. :)

2023-11-20 Thread Naveen N Rao
On Fri, Nov 17, 2023 at 12:36:01PM -0600, Gustavo A. R. Silva wrote:
> Hi all,
> 
> I'm trying to fix the following -Wstringop-overflow warnings, and I'd like
> to get your feedback on this, please:
> 
> In function 'do_byte_reverse',
> inlined from 'do_vec_store' at 
> /home/gus/linux/arch/powerpc/lib/sstep.c:722:3,
> inlined from 'emulate_loadstore' at 
> /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9:
> arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of size 
> 0 [-Werror=stringop-overflow=]
>   287 | up[3] = tmp;
>   | ~~^
> arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
> arch/powerpc/lib/sstep.c:708:11: note: at offset [24, 39] into destination 
> object 'u' of size 16
>   708 | } u;
>   |   ^
> In function 'do_byte_reverse',
> inlined from 'do_vec_store' at 
> /home/gus/linux/arch/powerpc/lib/sstep.c:722:3,
> inlined from 'emulate_loadstore' at 
> /home/gus/linux/arch/powerpc/lib/sstep.c:3510:9:
> arch/powerpc/lib/sstep.c:289:23: error: writing 8 bytes into a region of size 
> 0 [-Werror=stringop-overflow=]
>   289 | up[2] = byterev_8(up[1]);
>   | ~~^~
> arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
> arch/powerpc/lib/sstep.c:708:11: note: at offset 16 into destination object 
> 'u' of size 16
>   708 | } u;
>   |   ^
> In function 'do_byte_reverse',
> inlined from 'do_vec_load' at 
> /home/gus/linux/arch/powerpc/lib/sstep.c:691:3,
> inlined from 'emulate_loadstore' at 
> /home/gus/linux/arch/powerpc/lib/sstep.c:3439:9:
> arch/powerpc/lib/sstep.c:287:23: error: writing 8 bytes into a region of size 
> 0 [-Werror=stringop-overflow=]
>   287 | up[3] = tmp;
>   | ~~^
> arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
> arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination 
> object 'u' of size 16
>   681 | } u = {};
>   |   ^
> arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination 
> object 'u' of size 16
> arch/powerpc/lib/sstep.c:681:11: note: at offset [24, 39] into destination 
> object 'u' of size 16
> 
> The origin of the issue seems to be the following calls to function 
> `do_vec_store()`, when
> `size > 16`, or more precisely when `size == 32`
> 
> arch/powerpc/lib/sstep.c:
> 3436 case LOAD_VMX:
> 3437 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
> 3438 return 0;
> 3439 err = do_vec_load(op->reg, ea, size, regs, cross_endian);
> 3440 break;
> ...
> 3507 case STORE_VMX:
> 3508 if (!(regs->msr & MSR_PR) && !(regs->msr & MSR_VEC))
> 3509 return 0;
> 3510 err = do_vec_store(op->reg, ea, size, regs, 
> cross_endian);
> 3511 break;

LOAD_VMX and STORE_VMX are set in analyse_instr() and size does not 
exceed 16. So, the warning looks incorrect to me.
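
For illustration, the analyse_instr() entries that produce
LOAD_VMX/STORE_VMX pass a fixed 16-byte size; roughly, as an
excerpt-style sketch quoted from memory rather than verbatim:

	case 103:	/* lvx */
		op->type = MKOP(LOAD_VMX, 0, 16);	/* size fixed at 16 */
		break;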

- Naveen


Re: [PATCH 34/34] KVM: selftests: Add a memory region subtest to validate invalid flags

2023-11-20 Thread Mark Brown
On Wed, Nov 08, 2023 at 05:08:01PM -0800, Anish Moorthy wrote:
> Applying [1] and [2] reveals that this also breaks non-x86 builds- the
> MEM_REGION_GPA/SLOT definitions are guarded behind an #ifdef
> __x86_64__, while the usages introduced here aren't.
> 
> Should
> 
> On Sun, Nov 5, 2023 at 8:35 AM Paolo Bonzini  wrote:
> >
> > +   test_invalid_memory_region_flags();
> 
> be #ifdef'd, perhaps? I'm not quite sure what the intent is.

This has been broken in -next for a week now, do we have any progress
on a fix or should we just revert the patch?




Re: [PATCH v9 00/27] Add support for QMC HDLC, framer infrastructure and PEF2256 framer

2023-11-20 Thread Mark Brown
On Fri, Nov 17, 2023 at 04:47:46PM -0800, Jakub Kicinski wrote:
> On Wed, 15 Nov 2023 15:39:36 +0100 Herve Codina wrote:
> >- Removed Patches 6, 7 and 8 (patches applied)
> > 
> >- Patches 7, 20, 21, 23 (patches 10, 23, 24, 26 in v8)
> >  Add 'Acked-by: Jakub Kicinski '

> I thought someone (Mark?) asked for the networking stuff to be put 
> on a branch. If that's still the preference - is it possible to factor
> these out as a standalone series, too?  Will they build on their own?

Yes, can we *please* at least get the generic non-driver bits of this
series moving - they seem uncontroversial as far as I can see and are a
tiny portion of the overall 20 patches.  Patches 21-23 look like they
can go on a branch in the net tree?




[PATCH v4 4/4] KVM: PPC: selftests: powerpc enable kvm_create_max_vcpus test

2023-11-20 Thread Nicholas Piggin
powerpc's maximum permitted vCPU ID depends on the VM's SMT mode, and
the maximum reported by KVM_CAP_MAX_VCPU_ID exceeds a simple non-SMT
VM's limit.

The powerpc KVM selftest port uses non-SMT VMs, so add a workaround
to the kvm_create_max_vcpus test case to limit vCPU IDs to
KVM_CAP_MAX_VCPUS on powerpc. And enable the test case.

Signed-off-by: Nicholas Piggin 
---
 tools/testing/selftests/kvm/Makefile   | 1 +
 tools/testing/selftests/kvm/kvm_create_max_vcpus.c | 9 +
 2 files changed, 10 insertions(+)

diff --git a/tools/testing/selftests/kvm/Makefile 
b/tools/testing/selftests/kvm/Makefile
index a198fe6136c8..1e904d8871d7 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -211,6 +211,7 @@ TEST_GEN_PROGS_powerpc += dirty_log_test
 TEST_GEN_PROGS_powerpc += dirty_log_perf_test
 TEST_GEN_PROGS_powerpc += guest_print_test
 TEST_GEN_PROGS_powerpc += hardware_disable_test
+TEST_GEN_PROGS_powerpc += kvm_create_max_vcpus
 TEST_GEN_PROGS_powerpc += kvm_page_table_test
 TEST_GEN_PROGS_powerpc += max_guest_memory_test
 TEST_GEN_PROGS_powerpc += memslot_modification_stress_test
diff --git a/tools/testing/selftests/kvm/kvm_create_max_vcpus.c 
b/tools/testing/selftests/kvm/kvm_create_max_vcpus.c
index 31b3cb24b9a7..330ede73c147 100644
--- a/tools/testing/selftests/kvm/kvm_create_max_vcpus.c
+++ b/tools/testing/selftests/kvm/kvm_create_max_vcpus.c
@@ -51,6 +51,15 @@ int main(int argc, char *argv[])
pr_info("KVM_CAP_MAX_VCPU_ID: %d\n", kvm_max_vcpu_id);
pr_info("KVM_CAP_MAX_VCPUS: %d\n", kvm_max_vcpus);
 
+#ifdef __powerpc64__
+   /*
+* powerpc has a particular format for the vcpu ID that depends on
+* the guest SMT mode, and the max ID cap is too large for non-SMT
+* modes, where the maximum ID is the same as the maximum vCPUs.
+*/
+   kvm_max_vcpu_id = kvm_max_vcpus;
+#endif
+
/*
 * Check that we're allowed to open nr_fds_wanted file descriptors and
 * try raising the limits if needed.
-- 
2.42.0



[PATCH v4 3/4] KVM: PPC: selftests: add support for powerpc

2023-11-20 Thread Nicholas Piggin
Implement KVM selftests support for powerpc (Book3S-64).

ucalls are implemented with an unsupported PAPR hcall number which will
always cause KVM to exit to userspace.

Virtual memory is implemented for the radix MMU, and only a base page
size is supported (either 4K or 64K).

Guest interrupts are taken in real-mode, so require a page allocated at
gRA 0x0. Interrupt entry is complicated because gVA:gRA is not 1:1 mapped
(like the kernel is), so the MMU cannot simply be switched on and
off.
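
A ucall then has a very small footprint; something like this sketch,
where the hcall number and the hcall() helper name are illustrative
assumptions (the real ones live in lib/powerpc/ucall.c and hcall.c in
this patch):

	#define H_UCALL	0x6000	/* assumed: any hcall number KVM doesn't implement */

	void ucall_arch_do_ucall(vm_vaddr_t uc)
	{
		/* "sc 1" traps to the hypervisor; an unknown hcall exits to userspace */
		hcall(H_UCALL, uc);
	}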

Acked-by: Michael Ellerman  (powerpc)
Signed-off-by: Nicholas Piggin 
---
 MAINTAINERS   |   2 +
 tools/testing/selftests/kvm/Makefile  |  19 +
 .../selftests/kvm/include/kvm_util_base.h |  24 +
 .../selftests/kvm/include/powerpc/hcall.h |  17 +
 .../selftests/kvm/include/powerpc/ppc_asm.h   |  32 ++
 .../selftests/kvm/include/powerpc/processor.h |  39 ++
 .../selftests/kvm/include/powerpc/ucall.h |  21 +
 tools/testing/selftests/kvm/lib/guest_modes.c |  27 +-
 tools/testing/selftests/kvm/lib/kvm_util.c|  12 +
 .../selftests/kvm/lib/powerpc/handlers.S  |  93 
 .../testing/selftests/kvm/lib/powerpc/hcall.c |  45 ++
 .../selftests/kvm/lib/powerpc/processor.c | 468 ++
 .../testing/selftests/kvm/lib/powerpc/ucall.c |  21 +
 tools/testing/selftests/kvm/powerpc/helpers.h |  46 ++
 14 files changed, 862 insertions(+), 4 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/powerpc/hcall.h
 create mode 100644 tools/testing/selftests/kvm/include/powerpc/ppc_asm.h
 create mode 100644 tools/testing/selftests/kvm/include/powerpc/processor.h
 create mode 100644 tools/testing/selftests/kvm/include/powerpc/ucall.h
 create mode 100644 tools/testing/selftests/kvm/lib/powerpc/handlers.S
 create mode 100644 tools/testing/selftests/kvm/lib/powerpc/hcall.c
 create mode 100644 tools/testing/selftests/kvm/lib/powerpc/processor.c
 create mode 100644 tools/testing/selftests/kvm/lib/powerpc/ucall.c
 create mode 100644 tools/testing/selftests/kvm/powerpc/helpers.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 97f51d5ec1cf..3f430d517462 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11705,6 +11705,8 @@ F:  arch/powerpc/include/asm/kvm*
 F: arch/powerpc/include/uapi/asm/kvm*
 F: arch/powerpc/kernel/kvm*
 F: arch/powerpc/kvm/
+F: tools/testing/selftests/kvm/*/powerpc/
+F: tools/testing/selftests/kvm/powerpc/
 
 KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)
 M: Anup Patel 
diff --git a/tools/testing/selftests/kvm/Makefile 
b/tools/testing/selftests/kvm/Makefile
index a5963ab9215b..a198fe6136c8 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -65,6 +65,11 @@ LIBKVM_s390x += lib/s390x/ucall.c
 LIBKVM_riscv += lib/riscv/processor.c
 LIBKVM_riscv += lib/riscv/ucall.c
 
+LIBKVM_powerpc += lib/powerpc/handlers.S
+LIBKVM_powerpc += lib/powerpc/processor.c
+LIBKVM_powerpc += lib/powerpc/ucall.c
+LIBKVM_powerpc += lib/powerpc/hcall.c
+
 # Non-compiled test targets
 TEST_PROGS_x86_64 += x86_64/nx_huge_pages_test.sh
 
@@ -200,6 +205,20 @@ TEST_GEN_PROGS_riscv += kvm_page_table_test
 TEST_GEN_PROGS_riscv += set_memory_region_test
 TEST_GEN_PROGS_riscv += kvm_binary_stats_test
 
+TEST_GEN_PROGS_powerpc += access_tracking_perf_test
+TEST_GEN_PROGS_powerpc += demand_paging_test
+TEST_GEN_PROGS_powerpc += dirty_log_test
+TEST_GEN_PROGS_powerpc += dirty_log_perf_test
+TEST_GEN_PROGS_powerpc += guest_print_test
+TEST_GEN_PROGS_powerpc += hardware_disable_test
+TEST_GEN_PROGS_powerpc += kvm_page_table_test
+TEST_GEN_PROGS_powerpc += max_guest_memory_test
+TEST_GEN_PROGS_powerpc += memslot_modification_stress_test
+TEST_GEN_PROGS_powerpc += memslot_perf_test
+TEST_GEN_PROGS_powerpc += rseq_test
+TEST_GEN_PROGS_powerpc += set_memory_region_test
+TEST_GEN_PROGS_powerpc += kvm_binary_stats_test
+
 SPLIT_TESTS += get-reg-list
 
 TEST_PROGS += $(TEST_PROGS_$(ARCH_DIR))
diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 46edefabac85..3ce0bd0299fa 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -87,6 +87,12 @@ enum kvm_mem_region_type {
NR_MEM_REGIONS,
 };
 
+/* Page table fragment cache for guest page table < page size */
+struct vm_pt_frag_cache {
+   vm_paddr_t page;
+   size_t page_nr_used;
+};
+
 struct kvm_vm {
int mode;
unsigned long type;
@@ -106,6 +112,10 @@ struct kvm_vm {
bool pgd_created;
vm_paddr_t ucall_mmio_addr;
vm_paddr_t pgd;
+#if defined(__powerpc64__)
+   vm_paddr_t prtb; // process table
+   struct vm_pt_frag_cache pt_frag_cache[2]; // 256B and 4KB PT caches
+#endif
vm_vaddr_t gdt;
vm_vaddr_t tss;
vm_vaddr_t idt;
@@ -181,6 +191,8 @@ enum vm_guest_mode {
VM_MODE_PXXV48_4K,  /* For 48bits VA but ANY bits PA */
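
These caches pack sub-page page-table "fragments" into whole pages, as
the series changelog notes. A sketch of an allocator over the struct
above (the function and parameter names are hypothetical, not the
patch's actual processor.c code):

	/* Sketch: carve a sub-page PT fragment out of the cache page.
	 * Assumes frag_size divides vm->page_size and that gPA 0 is
	 * never handed out as a cache page.
	 */
	static vm_paddr_t pt_frag_alloc(struct kvm_vm *vm, int idx, size_t frag_size)
	{
		struct vm_pt_frag_cache *c = &vm->pt_frag_cache[idx];
		size_t frags_per_page = vm->page_size / frag_size;

		if (!c->page || c->page_nr_used == frags_per_page) {
			/* refill: take a fresh page from the PT memslot */
			c->page = vm_phy_page_alloc(vm,
					KVM_GUEST_PAGE_TABLE_MIN_PADDR,
					vm->memslots[MEM_REGION_PT]);
			c->page_nr_used = 0;
		}
		return c->page + c->page_nr_used++ * frag_size;
	}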

[PATCH v4 2/4] KVM: selftests: Add aligned guest physical page allocator

2023-11-20 Thread Nicholas Piggin
powerpc will require this to allocate MMU tables in guest memory that
are larger than the guest base page size.

Signed-off-by: Nicholas Piggin 
---
 .../selftests/kvm/include/kvm_util_base.h |  2 +
 tools/testing/selftests/kvm/lib/kvm_util.c    | 44 +++++++++++++++++++++++++++-----------------
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h 
b/tools/testing/selftests/kvm/include/kvm_util_base.h
index e592a75ec052..46edefabac85 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -703,6 +703,8 @@ const char *exit_reason_str(unsigned int exit_reason);
 
 vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
 uint32_t memslot);
+vm_paddr_t vm_phy_pages_alloc_align(struct kvm_vm *vm, size_t num, size_t align,
+			      vm_paddr_t paddr_min, uint32_t memslot);
 vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
  vm_paddr_t paddr_min, uint32_t memslot);
 vm_paddr_t vm_alloc_page_table(struct kvm_vm *vm);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 7a8af1821f5d..22cf6c0ca89f 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1924,6 +1924,7 @@ const char *exit_reason_str(unsigned int exit_reason)
  * Input Args:
  *   vm - Virtual Machine
  *   num - number of pages
+ *   align - pages alignment
  *   paddr_min - Physical address minimum
  *   memslot - Memory region to allocate page from
  *
@@ -1937,7 +1938,7 @@ const char *exit_reason_str(unsigned int exit_reason)
  * and their base address is returned. A TEST_ASSERT failure occurs if
  * not enough pages are available at or above paddr_min.
  */
-vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
+vm_paddr_t vm_phy_pages_alloc_align(struct kvm_vm *vm, size_t num, size_t align,
  vm_paddr_t paddr_min, uint32_t memslot)
 {
struct userspace_mem_region *region;
@@ -1951,24 +1952,27 @@ vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
paddr_min, vm->page_size);
 
region = memslot2region(vm, memslot);
-   base = pg = paddr_min >> vm->page_shift;
-
-   do {
-   for (; pg < base + num; ++pg) {
-   if (!sparsebit_is_set(region->unused_phy_pages, pg)) {
-   base = pg = sparsebit_next_set(region->unused_phy_pages, pg);
-   break;
+   base = paddr_min >> vm->page_shift;
+
+again:
+   base = (base + align - 1) & ~(align - 1);
+   for (pg = base; pg < base + num; ++pg) {
+   if (!sparsebit_is_set(region->unused_phy_pages, pg)) {
+   base = sparsebit_next_set(region->unused_phy_pages, pg);
+   if (!base) {
+   fprintf(stderr, "No guest physical pages "
+   "available, paddr_min: 0x%lx "
+   "page_size: 0x%x memslot: %u "
+   "num_pages: %lu align: %lu\n",
+   paddr_min, vm->page_size, memslot,
+   num, align);
+   fputs("---- vm dump ----\n", stderr);
+   vm_dump(stderr, vm, 2);
+   TEST_ASSERT(false, "false");
+   abort();
}
+   goto again;
}
-   } while (pg && pg != base + num);
-
-   if (pg == 0) {
-   fprintf(stderr, "No guest physical page available, "
-   "paddr_min: 0x%lx page_size: 0x%x memslot: %u\n",
-   paddr_min, vm->page_size, memslot);
-   fputs("---- vm dump ----\n", stderr);
-   vm_dump(stderr, vm, 2);
-   abort();
}
 
for (pg = base; pg < base + num; ++pg)
@@ -1977,6 +1981,12 @@ vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
return base * vm->page_size;
 }
 
+vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
+ vm_paddr_t paddr_min, uint32_t memslot)
+{
+   return vm_phy_pages_alloc_align(vm, num, 1, paddr_min, memslot);
+}
+
 vm_paddr_t vm_phy_page_alloc(struct kvm_vm *vm, vm_paddr_t paddr_min,
 uint32_t memslot)
 {
-- 
2.42.0
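
A hypothetical caller of the new helper (not part of the patch) might
allocate a 64KiB MMU table on a 64KiB boundary in a 4KiB-page guest as
below. Note the round-up "(base + align - 1) & ~(align - 1)" assumes
align is a power of two; for example base 5, align 4 rounds to
(5 + 3) & ~3 = 8.

	/* 16 pages, aligned to a 16-page boundary, from the PT memslot */
	vm_paddr_t table = vm_phy_pages_alloc_align(vm, 16, 16,
					KVM_GUEST_PAGE_TABLE_MIN_PADDR,
					vm->memslots[MEM_REGION_PT]);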



[PATCH v4 1/4] KVM: selftests: Move pgd_created check into virt_pgd_alloc

2023-11-20 Thread Nicholas Piggin
All implementations of virt_arch_pgd_alloc() do the same test-and-set of
pgd_created. Move this into common code.

Signed-off-by: Nicholas Piggin 
---
 tools/testing/selftests/kvm/include/kvm_util_base.h | 5 +
 tools/testing/selftests/kvm/lib/aarch64/processor.c | 4 
 tools/testing/selftests/kvm/lib/riscv/processor.c   | 4 
 tools/testing/selftests/kvm/lib/s390x/processor.c   | 4 
 tools/testing/selftests/kvm/lib/x86_64/processor.c  | 7 ++-
 5 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index a18db6a7b3cf..e592a75ec052 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -846,7 +846,12 @@ void virt_arch_pgd_alloc(struct kvm_vm *vm);
 
 static inline void virt_pgd_alloc(struct kvm_vm *vm)
 {
+   if (vm->pgd_created)
+   return;
+
virt_arch_pgd_alloc(vm);
+
+   vm->pgd_created = true;
 }
 
 /*
diff --git a/tools/testing/selftests/kvm/lib/aarch64/processor.c b/tools/testing/selftests/kvm/lib/aarch64/processor.c
index 6fe12e985ba5..69743d4c6748 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/processor.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/processor.c
@@ -96,13 +96,9 @@ void virt_arch_pgd_alloc(struct kvm_vm *vm)
 {
size_t nr_pages = page_align(vm, ptrs_per_pgd(vm) * 8) / vm->page_size;
 
-   if (vm->pgd_created)
-   return;
-
vm->pgd = vm_phy_pages_alloc(vm, nr_pages,
 KVM_GUEST_PAGE_TABLE_MIN_PADDR,
 vm->memslots[MEM_REGION_PT]);
-   vm->pgd_created = true;
 }
 
 static void _virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr,
diff --git a/tools/testing/selftests/kvm/lib/riscv/processor.c b/tools/testing/selftests/kvm/lib/riscv/processor.c
index d146ca71e0c0..7695ba2cd369 100644
--- a/tools/testing/selftests/kvm/lib/riscv/processor.c
+++ b/tools/testing/selftests/kvm/lib/riscv/processor.c
@@ -57,13 +57,9 @@ void virt_arch_pgd_alloc(struct kvm_vm *vm)
 {
size_t nr_pages = page_align(vm, ptrs_per_pte(vm) * 8) / vm->page_size;
 
-   if (vm->pgd_created)
-   return;
-
vm->pgd = vm_phy_pages_alloc(vm, nr_pages,
 KVM_GUEST_PAGE_TABLE_MIN_PADDR,
 vm->memslots[MEM_REGION_PT]);
-   vm->pgd_created = true;
 }
 
 void virt_arch_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr)
diff --git a/tools/testing/selftests/kvm/lib/s390x/processor.c b/tools/testing/selftests/kvm/lib/s390x/processor.c
index 15945121daf1..358e03f09c7a 100644
--- a/tools/testing/selftests/kvm/lib/s390x/processor.c
+++ b/tools/testing/selftests/kvm/lib/s390x/processor.c
@@ -17,16 +17,12 @@ void virt_arch_pgd_alloc(struct kvm_vm *vm)
TEST_ASSERT(vm->page_size == 4096, "Unsupported page size: 0x%x",
vm->page_size);
 
-   if (vm->pgd_created)
-   return;
-
paddr = vm_phy_pages_alloc(vm, PAGES_PER_REGION,
   KVM_GUEST_PAGE_TABLE_MIN_PADDR,
   vm->memslots[MEM_REGION_PT]);
memset(addr_gpa2hva(vm, paddr), 0xff, PAGES_PER_REGION * vm->page_size);
 
vm->pgd = paddr;
-   vm->pgd_created = true;
 }
 
 /*
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index d8288374078e..2ec64bb45db2 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -127,11 +127,8 @@ void virt_arch_pgd_alloc(struct kvm_vm *vm)
TEST_ASSERT(vm->mode == VM_MODE_PXXV48_4K, "Attempt to use "
"unknown or unsupported guest mode, mode: 0x%x", vm->mode);
 
-   /* If needed, create page map l4 table. */
-   if (!vm->pgd_created) {
-   vm->pgd = vm_alloc_page_table(vm);
-   vm->pgd_created = true;
-   }
+   /* Create page map l4 table. */
+   vm->pgd = vm_alloc_page_table(vm);
 }
 
 static void *virt_get_pte(struct kvm_vm *vm, uint64_t *parent_pte,
-- 
2.42.0
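
The net effect is that virt_pgd_alloc() is now idempotent for any
caller; a minimal sketch (setup_guest_mmu() is a made-up name):

	static void setup_guest_mmu(struct kvm_vm *vm)
	{
		virt_pgd_alloc(vm);	/* first call: virt_arch_pgd_alloc() runs */
		virt_pgd_alloc(vm);	/* later calls return early, pgd_created is set */
	}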



[PATCH v4 0/4] KVM: selftests: add powerpc support

2023-11-20 Thread Nicholas Piggin
This series adds initial KVM selftests support for powerpc
(64-bit, BookS, radix MMU).

Since v3:
https://lore.kernel.org/linuxppc-dev/20230608032425.59796-1-npig...@gmail.com/
- Rebased to upstream (on top of Sean's ucall and guest assert
  rework).
- Drop powerpc specific tests for now, concentrate on base enablement.
- Fix a bunch of bugs and issues including the ones noticed by Joel
  in v3:
- Work around powerpc's max VCPU ID complexity to fix a test case failure.
- Use TEST_REQUIRE for radix mode so hash hosts don't assert.
- Pack page table "fragments" into pages (like the Linux kernel does),
  which fixes the PTE memory consumption estimate and prevents the max
  memory test from failing with no memory for page tables.

Since v2:
- Added a couple of new tests (patch 5,6)
- Make default page size match host page size.
- Check for radix MMU capability.
- Build a few more of the generic tests.

Since v1:
- Update MAINTAINERS KVM PPC entry to include kvm selftests.
- Fixes and cleanups from Sean's review including new patch 1.
- Add 4K guest page support requiring new patch 2.

Thanks,
Nick

Nicholas Piggin (4):
  KVM: selftests: Move pgd_created check into virt_pgd_alloc
  KVM: selftests: Add aligned guest physical page allocator
  KVM: PPC: selftests: add support for powerpc
  KVM: PPC: selftests: powerpc enable kvm_create_max_vcpus test

 MAINTAINERS   |   2 +
 tools/testing/selftests/kvm/Makefile  |  20 +
 .../selftests/kvm/include/kvm_util_base.h |  31 ++
 .../selftests/kvm/include/powerpc/hcall.h |  17 +
 .../selftests/kvm/include/powerpc/ppc_asm.h   |  32 ++
 .../selftests/kvm/include/powerpc/processor.h |  39 ++
 .../selftests/kvm/include/powerpc/ucall.h |  21 +
 .../selftests/kvm/kvm_create_max_vcpus.c  |   9 +
 .../selftests/kvm/lib/aarch64/processor.c |   4 -
 tools/testing/selftests/kvm/lib/guest_modes.c |  27 +-
 tools/testing/selftests/kvm/lib/kvm_util.c|  56 ++-
 .../selftests/kvm/lib/powerpc/handlers.S  |  93 
 .../testing/selftests/kvm/lib/powerpc/hcall.c |  45 ++
 .../selftests/kvm/lib/powerpc/processor.c | 468 ++
 .../testing/selftests/kvm/lib/powerpc/ucall.c |  21 +
 .../selftests/kvm/lib/riscv/processor.c   |   4 -
 .../selftests/kvm/lib/s390x/processor.c   |   4 -
 .../selftests/kvm/lib/x86_64/processor.c  |   7 +-
 tools/testing/selftests/kvm/powerpc/helpers.h |  46 ++
 19 files changed, 908 insertions(+), 38 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/include/powerpc/hcall.h
 create mode 100644 tools/testing/selftests/kvm/include/powerpc/ppc_asm.h
 create mode 100644 tools/testing/selftests/kvm/include/powerpc/processor.h
 create mode 100644 tools/testing/selftests/kvm/include/powerpc/ucall.h
 create mode 100644 tools/testing/selftests/kvm/lib/powerpc/handlers.S
 create mode 100644 tools/testing/selftests/kvm/lib/powerpc/hcall.c
 create mode 100644 tools/testing/selftests/kvm/lib/powerpc/processor.c
 create mode 100644 tools/testing/selftests/kvm/lib/powerpc/ucall.c
 create mode 100644 tools/testing/selftests/kvm/powerpc/helpers.h

-- 
2.42.0



[PATCH] ASoC: fsl_sai: Fix no frame sync clock issue on i.MX8MP

2023-11-20 Thread Shengjiu Wang
On i.MX8MP, if TERE and FSD_MSTR are enabled before the word width is
configured, the "no frame sync clock" issue appears, because the stale
word width affects the generation of the frame sync.

TERE is enabled earlier only in the i.MX8MP case, due to a hardware
limitation, so disable FSD_MSTR before configuring the word width, then
enable the FSD_MSTR bit again for this specific case.

Fixes: 3e4a82612998 ("ASoC: fsl_sai: MCLK bind with TX/RX enable bit")
Signed-off-by: Shengjiu Wang 
---
 sound/soc/fsl/fsl_sai.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
index 79e7c6b98a75..32bbe5056a63 100644
--- a/sound/soc/fsl/fsl_sai.c
+++ b/sound/soc/fsl/fsl_sai.c
@@ -673,6 +673,20 @@ static int fsl_sai_hw_params(struct snd_pcm_substream *substream,
    FSL_SAI_CR3_TRCE_MASK,
    FSL_SAI_CR3_TRCE((dl_cfg[dl_cfg_idx].mask[tx] & trce_mask)));
 
+   /*
+    * If TERE and FSD_MSTR are enabled before the word width is
+    * configured, the "no frame sync clock" issue appears, because the
+    * stale word width affects the generation of the frame sync.
+    *
+    * TERE is enabled earlier only in the i.MX8MP case, due to a
+    * hardware limitation. Disable FSD_MSTR before configuring the word
+    * width, then enable the FSD_MSTR bit again for this specific case.
+    */
+   if (sai->soc_data->mclk_with_tere && sai->mclk_direction_output &&
+   !sai->is_consumer_mode)
+   regmap_update_bits(sai->regmap, FSL_SAI_xCR4(tx, ofs),
+  FSL_SAI_CR4_FSD_MSTR, 0);
+
regmap_update_bits(sai->regmap, FSL_SAI_xCR4(tx, ofs),
   FSL_SAI_CR4_SYWD_MASK | FSL_SAI_CR4_FRSZ_MASK |
   FSL_SAI_CR4_CHMOD_MASK,
@@ -680,6 +694,13 @@ static int fsl_sai_hw_params(struct snd_pcm_substream *substream,
regmap_update_bits(sai->regmap, FSL_SAI_xCR5(tx, ofs),
   FSL_SAI_CR5_WNW_MASK | FSL_SAI_CR5_W0W_MASK |
   FSL_SAI_CR5_FBT_MASK, val_cr5);
+
+   /* Enable FSD_MSTR after configuring word width */
+   if (sai->soc_data->mclk_with_tere && sai->mclk_direction_output &&
+   !sai->is_consumer_mode)
+   regmap_update_bits(sai->regmap, FSL_SAI_xCR4(tx, ofs),
+  FSL_SAI_CR4_FSD_MSTR, FSL_SAI_CR4_FSD_MSTR);
+
regmap_write(sai->regmap, FSL_SAI_xMR(tx),
 ~0UL - ((1 << min(channels, slots)) - 1));
 
-- 
2.34.1
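
Condensed, the ordering the patch arrives at for the i.MX8MP
master-clock-output case is as follows; a sketch of the hunks above,
with fsl_sai_hw_params() providing sai, tx and ofs, and imx8mp_quirk
being a made-up local:

	bool imx8mp_quirk = sai->soc_data->mclk_with_tere &&
			    sai->mclk_direction_output && !sai->is_consumer_mode;

	if (imx8mp_quirk)	/* 1. stop the frame sync generator */
		regmap_update_bits(sai->regmap, FSL_SAI_xCR4(tx, ofs),
				   FSL_SAI_CR4_FSD_MSTR, 0);

	/* 2. program the new word width (CR4/CR5, val_cr4/val_cr5) */

	if (imx8mp_quirk)	/* 3. restart frame sync with the new width */
		regmap_update_bits(sai->regmap, FSL_SAI_xCR4(tx, ofs),
				   FSL_SAI_CR4_FSD_MSTR, FSL_SAI_CR4_FSD_MSTR);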



Re: [PATCH] ASoC: fsl_mqs: Remove duplicate linux/of.h header

2023-11-20 Thread Shengjiu Wang
On Sun, Nov 19, 2023 at 6:45 PM Lucas Tanure  wrote:
>
> Remove linux/of.h as it is included more than once.
> Reported by make includecheck.
>
> Signed-off-by: Lucas Tanure 

Acked-by: Shengjiu Wang 

best regards
Wang shengjiu
> ---
>  sound/soc/fsl/fsl_mqs.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/sound/soc/fsl/fsl_mqs.c b/sound/soc/fsl/fsl_mqs.c
> index f2d74ec05cdf..86704ba5f6f0 100644
> --- a/sound/soc/fsl/fsl_mqs.c
> +++ b/sound/soc/fsl/fsl_mqs.c
> @@ -10,7 +10,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> --
> 2.42.1
>


Re: [PATCH v9 10/15] media: uapi: Add V4L2_CTRL_TYPE_FIXED_POINT

2023-11-20 Thread Shengjiu Wang
On Fri, Nov 17, 2023 at 8:07 PM Hans Verkuil  wrote:
>
> Here is an RFC patch adding support for 'fraction_bits'. It's lacking
> documentation, but it can be used for testing.
>
> It was rather a pain logging fixed-point numbers in a reasonable format,
> but I think it is OK.
>
> In userspace (where you can use floating point) it is a lot easier:
>
> printf("%.*g\n", fraction_bits, (double)v * (1.0 / (1ULL << fraction_bits)));
>
> I decided to only expose fraction_bits in struct v4l2_query_ext_ctrl.
> I could add it to struct v4l2_queryctrl, but I did not think that was
> necessary. Other opinions are welcome.
>
> In the meantime, let me know if this works for your patch series. If it
> does, then I can clean this up.

Thanks.  It works for me.  What I have done is:
1. dropped FIXED_POINT
2. used v4l2_ctrl_new_custom

Best regards
Wang shengjiu
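
For reference, a driver-side registration along the lines Shengjiu
describes might look like this. It is a sketch assuming the RFC's
fraction_bits field also lands in struct v4l2_ctrl_config; the control
ID, ops and values are made up:

	static const struct v4l2_ctrl_config my_gain_cfg = {
		.ops  = &my_ctrl_ops,		/* hypothetical ops */
		.id   = V4L2_CID_USER_MY_GAIN,	/* hypothetical control ID */
		.name = "Gain",
		.type = V4L2_CTRL_TYPE_INTEGER,
		.min  = 0,
		.max  = 10 << 12,
		.step = 1,
		.def  = 1 << 12,
		.fraction_bits = 12,	/* raw value is Q.12 fixed point */
	};

	v4l2_ctrl_new_custom(&handler, &my_gain_cfg, NULL);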

>
> Regards,
>
> Hans
>
> Signed-off-by: Hans Verkuil 
> ---
>  drivers/media/v4l2-core/v4l2-ctrls-api.c  |  1 +
>  drivers/media/v4l2-core/v4l2-ctrls-core.c | 72 +++
>  include/media/v4l2-ctrls.h|  7 ++-
>  include/uapi/linux/videodev2.h| 20 ++-
>  4 files changed, 85 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/media/v4l2-core/v4l2-ctrls-api.c b/drivers/media/v4l2-core/v4l2-ctrls-api.c
> index 002ea6588edf..3132df315b17 100644
> --- a/drivers/media/v4l2-core/v4l2-ctrls-api.c
> +++ b/drivers/media/v4l2-core/v4l2-ctrls-api.c
> @@ -1101,6 +1101,7 @@ int v4l2_query_ext_ctrl(struct v4l2_ctrl_handler *hdl, struct v4l2_query_ext_ctrl *qc)
> qc->elems = ctrl->elems;
> qc->nr_of_dims = ctrl->nr_of_dims;
> memcpy(qc->dims, ctrl->dims, qc->nr_of_dims * sizeof(qc->dims[0]));
> +   qc->fraction_bits = ctrl->fraction_bits;
> qc->minimum = ctrl->minimum;
> qc->maximum = ctrl->maximum;
> qc->default_value = ctrl->default_value;
> diff --git a/drivers/media/v4l2-core/v4l2-ctrls-core.c b/drivers/media/v4l2-core/v4l2-ctrls-core.c
> index a662fb60f73f..0e08a371af5c 100644
> --- a/drivers/media/v4l2-core/v4l2-ctrls-core.c
> +++ b/drivers/media/v4l2-core/v4l2-ctrls-core.c
> @@ -252,12 +252,42 @@ void v4l2_ctrl_type_op_init(const struct v4l2_ctrl *ctrl, u32 from_idx,
>  }
>  EXPORT_SYMBOL(v4l2_ctrl_type_op_init);
>
> +static void v4l2_ctrl_log_fp(s64 v, unsigned int fraction_bits)
> +{
> +   s64 i = v4l2_fp_integer(v, fraction_bits);
> +   s64 f = v4l2_fp_fraction(v, fraction_bits);
> +
> +   if (!f) {
> +   pr_cont("%lld", i);
> +   } else if (fraction_bits < 20) {
> +   u64 div = 1ULL << fraction_bits;
> +
> +   if (!i && f < 0)
> +   pr_cont("-%lld/%llu", -f, div);
> +   else if (!i)
> +   pr_cont("%lld/%llu", f, div);
> +   else if (i < 0 || f < 0)
> +   pr_cont("-%lld-%llu/%llu", -i, -f, div);
> +   else
> +   pr_cont("%lld+%llu/%llu", i, f, div);
> +   } else {
> +   if (!i && f < 0)
> +   pr_cont("-%lld/(2^%u)", -f, fraction_bits);
> +   else if (!i)
> +   pr_cont("%lld/(2^%u)", f, fraction_bits);
> +   else if (i < 0 || f < 0)
> +   pr_cont("-%lld-%llu/(2^%u)", -i, -f, fraction_bits);
> +   else
> +   pr_cont("%lld+%llu/(2^%u)", i, f, fraction_bits);
> +   }
> +}
> +
>  void v4l2_ctrl_type_op_log(const struct v4l2_ctrl *ctrl)
>  {
> union v4l2_ctrl_ptr ptr = ctrl->p_cur;
>
> if (ctrl->is_array) {
> -   unsigned i;
> +   unsigned int i;
>
> for (i = 0; i < ctrl->nr_of_dims; i++)
> pr_cont("[%u]", ctrl->dims[i]);
> @@ -266,7 +296,10 @@ void v4l2_ctrl_type_op_log(const struct v4l2_ctrl *ctrl)
>
> switch (ctrl->type) {
> case V4L2_CTRL_TYPE_INTEGER:
> -   pr_cont("%d", *ptr.p_s32);
> +   if (!ctrl->fraction_bits)
> +   pr_cont("%d", *ptr.p_s32);
> +   else
> +   v4l2_ctrl_log_fp(*ptr.p_s32, ctrl->fraction_bits);
> break;
> case V4L2_CTRL_TYPE_BOOLEAN:
> pr_cont("%s", *ptr.p_s32 ? "true" : "false");
> @@ -281,19 +314,31 @@ void v4l2_ctrl_type_op_log(const struct v4l2_ctrl *ctrl)
> pr_cont("0x%08x", *ptr.p_s32);
> break;
> case V4L2_CTRL_TYPE_INTEGER64:
> -   pr_cont("%lld", *ptr.p_s64);
> +   if (!ctrl->fraction_bits)
> +   pr_cont("%lld", *ptr.p_s64);
> +   else
> +   v4l2_ctrl_log_fp(*ptr.p_s64, ctrl->fraction_bits);
> break;
> case V4L2_CTRL_TYPE_STRING:
> pr_cont("%s", ptr.p_char);
> break;
> case V4L2_CTRL_TYPE_U8:
> -   pr_cont("%u", (unsigned)*ptr.p_u8);
> + 
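
To make Hans' userspace conversion concrete, a self-contained example
(the values are made up):

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		int64_t v = 0x2800;	/* raw fixed-point control value */
		int fraction_bits = 12;	/* Q.12: value = v / 2^12 */

		/* prints 2.5, since 0x2800 / 4096 = 2.5 */
		printf("%.*g\n", fraction_bits,
		       (double)v * (1.0 / (1ULL << fraction_bits)));
		return 0;
	}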

Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2023-11-20 Thread Christoph Hellwig
On Wed, Nov 15, 2023 at 08:29:02PM -0500, Peter Xu wrote:
> Hugepd format is only used in PowerPC with hugetlbfs.  In commit
> a6e79df92e4a ("mm/gup: disallow FOLL_LONGTERM GUP-fast writing to
> file-backed mappings"), we added a check to fail gup-fast if there's
> potential risk of violating GUP over writeback file systems.  That should
> never apply to hugepd.
>
> Drop that check, not only because it'll never be true for hugepd, but also
> because it paves the way for reusing the function outside fast-gup.

What prevents us from ever using hugepd with file mappings?  I think
it would naturally fit in with how large folios for the pagecache work.

So keeping this check and generalizing it seems like the better idea to
me.


Re: [PATCH v4 2/5] RISC-V: Add SBI debug console helper routines

2023-11-20 Thread Andrew Jones
On Sat, Nov 18, 2023 at 09:08:56AM +0530, Anup Patel wrote:
> Let us provide SBI debug console helper routines which can be
> shared by serial/earlycon-riscv-sbi.c and hvc/hvc_riscv_sbi.c.
> 
> Signed-off-by: Anup Patel 
> ---
>  arch/riscv/include/asm/sbi.h |  5 +++++
>  arch/riscv/kernel/sbi.c  | 43 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 48 insertions(+)
> 
> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> index 66f3933c14f6..ee7aef5f6233 100644
> --- a/arch/riscv/include/asm/sbi.h
> +++ b/arch/riscv/include/asm/sbi.h
> @@ -334,6 +334,11 @@ static inline unsigned long sbi_mk_version(unsigned long major,
>  }
>  
>  int sbi_err_map_linux_errno(int err);
> +
> +extern bool sbi_debug_console_available;
> +int sbi_debug_console_write(unsigned int num_bytes, phys_addr_t base_addr);
> +int sbi_debug_console_read(unsigned int num_bytes, phys_addr_t base_addr);
> +
>  #else /* CONFIG_RISCV_SBI */
>  static inline int sbi_remote_fence_i(const struct cpumask *cpu_mask) { return -1; }
>  static inline void sbi_init(void) {}
> diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
> index 5a62ed1da453..73a9c22c3945 100644
> --- a/arch/riscv/kernel/sbi.c
> +++ b/arch/riscv/kernel/sbi.c
> @@ -571,6 +571,44 @@ long sbi_get_mimpid(void)
>  }
>  EXPORT_SYMBOL_GPL(sbi_get_mimpid);
>  
> +bool sbi_debug_console_available;
> +
> +int sbi_debug_console_write(unsigned int num_bytes, phys_addr_t base_addr)
> +{
> + struct sbiret ret;
> +
> + if (!sbi_debug_console_available)
> + return -EOPNOTSUPP;
> +
> + if (IS_ENABLED(CONFIG_32BIT))
> + ret = sbi_ecall(SBI_EXT_DBCN, SBI_EXT_DBCN_CONSOLE_WRITE,
> + num_bytes, lower_32_bits(base_addr),
> + upper_32_bits(base_addr), 0, 0, 0);
> + else
> + ret = sbi_ecall(SBI_EXT_DBCN, SBI_EXT_DBCN_CONSOLE_WRITE,
> + num_bytes, base_addr, 0, 0, 0, 0);
> +
> + return ret.error ? sbi_err_map_linux_errno(ret.error) : ret.value;

We can't get perfect mappings, but I wonder if we can do better than
returning ENOTSUPP for "Failed to write the byte due to I/O errors."

How about

 if (ret.error == SBI_ERR_FAILURE)
 return -EIO;

 return ret.error ? sbi_err_map_linux_errno(ret.error) : ret.value;


> +}
> +
> +int sbi_debug_console_read(unsigned int num_bytes, phys_addr_t base_addr)
> +{
> + struct sbiret ret;
> +
> + if (!sbi_debug_console_available)
> + return -EOPNOTSUPP;
> +
> + if (IS_ENABLED(CONFIG_32BIT))
> + ret = sbi_ecall(SBI_EXT_DBCN, SBI_EXT_DBCN_CONSOLE_READ,
> + num_bytes, lower_32_bits(base_addr),
> + upper_32_bits(base_addr), 0, 0, 0);
> + else
> + ret = sbi_ecall(SBI_EXT_DBCN, SBI_EXT_DBCN_CONSOLE_READ,
> + num_bytes, base_addr, 0, 0, 0, 0);
> +
> + return ret.error ? sbi_err_map_linux_errno(ret.error) : ret.value;

Same comment as above.

> +}
> +
>  void __init sbi_init(void)
>  {
>   int ret;
> @@ -612,6 +650,11 @@ void __init sbi_init(void)
>   sbi_srst_reboot_nb.priority = 192;
>   register_restart_handler(&sbi_srst_reboot_nb);
>   }
> + if ((sbi_spec_version >= sbi_mk_version(2, 0)) &&
> + (sbi_probe_extension(SBI_EXT_DBCN) > 0)) {
> + pr_info("SBI DBCN extension detected\n");
> + sbi_debug_console_available = true;
> + }
>   } else {
>   __sbi_set_timer = __sbi_set_timer_v01;
>   __sbi_send_ipi  = __sbi_send_ipi_v01;
> -- 
> 2.34.1
>

Otherwise,

Reviewed-by: Andrew Jones 

Thanks,
drew
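
For context, a console write path built on these helpers might look
like this. It is a sketch only: dbcn_write() is a made-up name, and it
assumes the buffer is physically contiguous and in the linear map so
__pa() is valid.

	static void dbcn_write(const char *buf, unsigned int n)
	{
		phys_addr_t pa = __pa(buf);

		while (n) {
			/* returns bytes written, 0, or a negative errno */
			int ret = sbi_debug_console_write(n, pa);

			if (ret <= 0)
				break;
			n -= ret;	/* DBCN may write fewer bytes than asked */
			pa += ret;
		}
	}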