Re: [PATCH v4 0/7] introduce vm_flags modifier functions

2023-03-17 Thread Suren Baghdasaryan
On Fri, Mar 17, 2023 at 3:41 PM Alex Williamson
 wrote:
>
> On Fri, 17 Mar 2023 12:08:32 -0700
> Suren Baghdasaryan  wrote:
>
> > On Tue, Mar 14, 2023 at 1:11 PM Alex Williamson
> >  wrote:
> > >
> > > On Thu, 26 Jan 2023 11:37:45 -0800
> > > Suren Baghdasaryan  wrote:
> > >
> > > > This patchset was originally published as a part of per-VMA locking [1] 
> > > > and
> > > > was split after suggestion that it's viable on its own and to facilitate
> > > > the review process. It is now a preprequisite for the next version of 
> > > > per-VMA
> > > > lock patchset, which reuses vm_flags modifier functions to lock the VMA 
> > > > when
> > > > vm_flags are being updated.
> > > >
> > > > VMA vm_flags modifications are usually done under exclusive mmap_lock
> > > > protection because this attrubute affects other decisions like VMA 
> > > > merging
> > > > or splitting and races should be prevented. Introduce vm_flags modifier
> > > > functions to enforce correct locking.
> > > >
> > > > The patchset applies cleanly over mm-unstable branch of mm tree.
> > >
> > > With this series, vfio-pci developed a bunch of warnings around not
> > > holding the mmap_lock write semaphore while calling
> > > io_remap_pfn_range() from our fault handler, vfio_pci_mmap_fault().
> > >
> > > I suspect vdpa has the same issue for their use of remap_pfn_range()
> > > from their fault handler, JasonW, MST, FYI.
> > >
> > > It also looks like gru_fault() would have the same issue, Dimitri.
> > >
> > > In all cases, we're preemptively setting vm_flags to what
> > > remap_pfn_range_notrack() uses, so I thought we were safe here as I
> > > specifically remember trying to avoid changing vm_flags from the
> > > fault handler.  But apparently that doesn't take into account
> > > track_pfn_remap() where VM_PAT comes into play.
> > >
> > > The reason for using remap_pfn_range() on fault in vfio-pci is that
> > > we're mapping device MMIO to userspace, where that MMIO can be disabled
> > > and we'd rather zap the mapping when that occurs so that we can sigbus
> > > the user rather than allow the user to trigger potentially fatal bus
> > > errors on the host.
> > >
> > > Peter Xu has suggested offline that a non-lazy approach to reinsert the
> > > mappings might be more inline with mm expectations relative to touching
> > > vm_flags during fault.  What's the right solution here?  Can the fault
> > > handling be salvaged, is proactive remapping the right approach, or is
> > > there something better?  Thanks,
> >
> > Hi Alex,
> > If in your case it's safe to change vm_flags without holding exclusive
> > mmap_lock, maybe you can use __vm_flags_mod() the way I used it in
> > https://lore.kernel.org/all/20230126193752.297968-7-sur...@google.com,
> > while explaining why this should be safe?
>
> Hi Suren,
>
> Thanks for the reply, but I'm not sure I'm following.  Are you
> suggesting a bool arg added to io_remap_pfn_range(), or some new
> variant of that function to conditionally use __vm_flags_mod() in place
> of vm_flags_set() across the call chain?  Thanks,

I think either way could work but after taking a closer look, both
ways would be quite ugly. If we could somehow identify that we are
handling a page fault and use __vm_flags_mod() without additional
parameters it would be more palatable IMHO...
Peter's suggestion to avoid touching vm_flags during fault would be
much cleaner but I'm not sure how easily that can be done.

>
> Alex
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kernel-team+unsubscr...@android.com.
>


Re: [PATCH v4 0/7] introduce vm_flags modifier functions

2023-03-17 Thread Alex Williamson
On Fri, 17 Mar 2023 12:08:32 -0700
Suren Baghdasaryan  wrote:

> On Tue, Mar 14, 2023 at 1:11 PM Alex Williamson
>  wrote:
> >
> > On Thu, 26 Jan 2023 11:37:45 -0800
> > Suren Baghdasaryan  wrote:
> >  
> > > This patchset was originally published as a part of per-VMA locking [1] 
> > > and
> > > was split after suggestion that it's viable on its own and to facilitate
> > > the review process. It is now a preprequisite for the next version of 
> > > per-VMA
> > > lock patchset, which reuses vm_flags modifier functions to lock the VMA 
> > > when
> > > vm_flags are being updated.
> > >
> > > VMA vm_flags modifications are usually done under exclusive mmap_lock
> > > protection because this attrubute affects other decisions like VMA merging
> > > or splitting and races should be prevented. Introduce vm_flags modifier
> > > functions to enforce correct locking.
> > >
> > > The patchset applies cleanly over mm-unstable branch of mm tree.  
> >
> > With this series, vfio-pci developed a bunch of warnings around not
> > holding the mmap_lock write semaphore while calling
> > io_remap_pfn_range() from our fault handler, vfio_pci_mmap_fault().
> >
> > I suspect vdpa has the same issue for their use of remap_pfn_range()
> > from their fault handler, JasonW, MST, FYI.
> >
> > It also looks like gru_fault() would have the same issue, Dimitri.
> >
> > In all cases, we're preemptively setting vm_flags to what
> > remap_pfn_range_notrack() uses, so I thought we were safe here as I
> > specifically remember trying to avoid changing vm_flags from the
> > fault handler.  But apparently that doesn't take into account
> > track_pfn_remap() where VM_PAT comes into play.
> >
> > The reason for using remap_pfn_range() on fault in vfio-pci is that
> > we're mapping device MMIO to userspace, where that MMIO can be disabled
> > and we'd rather zap the mapping when that occurs so that we can sigbus
> > the user rather than allow the user to trigger potentially fatal bus
> > errors on the host.
> >
> > Peter Xu has suggested offline that a non-lazy approach to reinsert the
> > mappings might be more inline with mm expectations relative to touching
> > vm_flags during fault.  What's the right solution here?  Can the fault
> > handling be salvaged, is proactive remapping the right approach, or is
> > there something better?  Thanks,  
> 
> Hi Alex,
> If in your case it's safe to change vm_flags without holding exclusive
> mmap_lock, maybe you can use __vm_flags_mod() the way I used it in
> https://lore.kernel.org/all/20230126193752.297968-7-sur...@google.com,
> while explaining why this should be safe?

Hi Suren,

Thanks for the reply, but I'm not sure I'm following.  Are you
suggesting a bool arg added to io_remap_pfn_range(), or some new
variant of that function to conditionally use __vm_flags_mod() in place
of vm_flags_set() across the call chain?  Thanks,

Alex



Re: [PATCHv2 pci-next 1/2] PCI/AER: correctable error message as KERN_INFO

2023-03-17 Thread Bjorn Helgaas
On Fri, Mar 17, 2023 at 11:50:22AM -0700, Sathyanarayanan Kuppuswamy wrote:
> On 3/17/23 10:51 AM, Grant Grundler wrote:
> > Since correctable errors have been corrected (and counted), the dmesg output
> > should not be reported as a warning, but rather as "informational".
> > 
> > Otherwise, using a certain well known vendor's PCIe parts in a USB4 docking
> > station, the dmesg buffer can be spammed with correctable errors, 717 bytes
> > per instance, potentially many MB per day.
> 
> Why don't you investigate why you are getting so many correctable errors?
> Isn't solving the problem preferable to hiding the logs?

I hope there's some effort to find the cause of the errors, too.  But
I do think KERN_INFO is a reasonable level for errors that have
already been corrected.  KERN_ERR seems a little bit too severe to me.

Does changing to KERN_INFO keep the messages out of the dmesg log?  I
don't think it does, because *most* kernel messages are at KERN_INFO.
This may be just a commit log clarification.

I would like to know *which* devices are involved.  Is there some
reason for weasel-wording this?  Knowing which devices are involved
helps in triaging issue reports.  If there are any public reports on
mailing lists, etc, we could also cite those here to help users find
this solution.

> > Given the "WARN" priority, these messages have already confused the typical
> > user that stumbles across them, support staff (triaging feedback reports),
> > and more than a few linux kernel devs. Changing to INFO will hide these
> > messages from most audiences.
> > 
> > Signed-off-by: Grant Grundler 
> > ---
> >  drivers/pci/pcie/aer.c | 29 +++--
> >  1 file changed, 19 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index f6c24ded134c..cb6b96233967 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -687,23 +687,29 @@ static void __aer_print_error(struct pci_dev *dev,
> >  {
> > const char **strings;
> > unsigned long status = info->status & ~info->mask;
> > -   const char *level, *errmsg;
> > int i;
> >  
> > if (info->severity == AER_CORRECTABLE) {
> > strings = aer_correctable_error_string;
> > -   level = KERN_WARNING;
> > +   pci_info(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n",
> > +   info->status, info->mask);
> > } else {
> > strings = aer_uncorrectable_error_string;
> > -   level = KERN_ERR;
> > +   pci_err(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n",
> > +   info->status, info->mask);
> > }
> >  
> > for_each_set_bit(i, , 32) {
> > -   errmsg = strings[i];
> > +   const char *errmsg = strings[i];
> > +
> > if (!errmsg)
> > errmsg = "Unknown Error Bit";
> >  
> > -   pci_printk(level, dev, "   [%2d] %-22s%s\n", i, errmsg,
> > +   if (info->severity == AER_CORRECTABLE)
> > +   pci_info(dev, "   [%2d] %-22s%s\n", i, errmsg,
> > +   info->first_error == i ? " (First)" : "");
> > +   else
> > +   pci_err(dev, "   [%2d] %-22s%s\n", i, errmsg,
> > info->first_error == i ? " (First)" : "");

The - 5 lines, + 11 lines diff and repetition of the printk strings
doesn't seem like an improvement compared to the -1, +1 in the v1
patch:

  @@ -692,7 +692,7 @@ static void __aer_print_error(struct pci_dev *dev,

  if (info->severity == AER_CORRECTABLE) {
  strings = aer_correctable_error_string;
  -   level = KERN_WARNING;
  +   level = KERN_INFO;
  } else {

But maybe there's a reason?

> > }
> > pci_dev_aer_stats_incr(dev, info);
> > @@ -724,7 +730,7 @@ void aer_print_error(struct pci_dev *dev, struct 
> > aer_err_info *info)
> > layer = AER_GET_LAYER_ERROR(info->severity, info->status);
> > agent = AER_GET_AGENT(info->severity, info->status);
> >  
> > -   level = (info->severity == AER_CORRECTABLE) ? KERN_WARNING : KERN_ERR;
> > +   level = (info->severity == AER_CORRECTABLE) ? KERN_INFO : KERN_ERR;
> >  
> > pci_printk(level, dev, "PCIe Bus Error: severity=%s, type=%s, (%s)\n",
> >aer_error_severity_string[info->severity],
> > @@ -797,14 +803,17 @@ void cper_print_aer(struct pci_dev *dev, int 
> > aer_severity,
> > info.mask = mask;
> > info.first_error = PCI_ERR_CAP_FEP(aer->cap_control);
> >  
> > -   pci_err(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
> > __aer_print_error(dev, );
> > -   pci_err(dev, "aer_layer=%s, aer_agent=%s\n",
> > -   aer_error_layer[layer], aer_agent_string[agent]);
> >  
> > -   if (aer_severity != AER_CORRECTABLE)
> > +   if (aer_severity == AER_CORRECTABLE) {
> > +   pci_info(dev, "aer_layer=%s, aer_agent=%s\n",
> > +   aer_error_layer[layer], 

Re: [PATCH v11 09/13] arm64: dts: ls1046a: Add serdes nodes

2023-03-17 Thread Sean Anderson
On 3/14/23 03:08, Shawn Guo wrote:
> On Mon, Mar 13, 2023 at 12:11:33PM -0400, Sean Anderson wrote:
>> This adds nodes for the SerDes devices. They are disabled by default
>> to prevent any breakage on existing boards.
>> 
>> Signed-off-by: Sean Anderson 
> 
> The DTS patches look good to me.  Let me know if they are ready to be
> applied.

The main holdup is driver approval from Vinod/Kishon.

--Sean


Re: [PATCH v4 0/7] introduce vm_flags modifier functions

2023-03-17 Thread Suren Baghdasaryan
On Tue, Mar 14, 2023 at 1:11 PM Alex Williamson
 wrote:
>
> On Thu, 26 Jan 2023 11:37:45 -0800
> Suren Baghdasaryan  wrote:
>
> > This patchset was originally published as a part of per-VMA locking [1] and
> > was split after suggestion that it's viable on its own and to facilitate
> > the review process. It is now a preprequisite for the next version of 
> > per-VMA
> > lock patchset, which reuses vm_flags modifier functions to lock the VMA when
> > vm_flags are being updated.
> >
> > VMA vm_flags modifications are usually done under exclusive mmap_lock
> > protection because this attrubute affects other decisions like VMA merging
> > or splitting and races should be prevented. Introduce vm_flags modifier
> > functions to enforce correct locking.
> >
> > The patchset applies cleanly over mm-unstable branch of mm tree.
>
> With this series, vfio-pci developed a bunch of warnings around not
> holding the mmap_lock write semaphore while calling
> io_remap_pfn_range() from our fault handler, vfio_pci_mmap_fault().
>
> I suspect vdpa has the same issue for their use of remap_pfn_range()
> from their fault handler, JasonW, MST, FYI.
>
> It also looks like gru_fault() would have the same issue, Dimitri.
>
> In all cases, we're preemptively setting vm_flags to what
> remap_pfn_range_notrack() uses, so I thought we were safe here as I
> specifically remember trying to avoid changing vm_flags from the
> fault handler.  But apparently that doesn't take into account
> track_pfn_remap() where VM_PAT comes into play.
>
> The reason for using remap_pfn_range() on fault in vfio-pci is that
> we're mapping device MMIO to userspace, where that MMIO can be disabled
> and we'd rather zap the mapping when that occurs so that we can sigbus
> the user rather than allow the user to trigger potentially fatal bus
> errors on the host.
>
> Peter Xu has suggested offline that a non-lazy approach to reinsert the
> mappings might be more inline with mm expectations relative to touching
> vm_flags during fault.  What's the right solution here?  Can the fault
> handling be salvaged, is proactive remapping the right approach, or is
> there something better?  Thanks,

Hi Alex,
If in your case it's safe to change vm_flags without holding exclusive
mmap_lock, maybe you can use __vm_flags_mod() the way I used it in
https://lore.kernel.org/all/20230126193752.297968-7-sur...@google.com,
while explaining why this should be safe?

>
> Alex
>


Re: [PATCHv2 pci-next 1/2] PCI/AER: correctable error message as KERN_INFO

2023-03-17 Thread Sathyanarayanan Kuppuswamy



On 3/17/23 10:51 AM, Grant Grundler wrote:
> Since correctable errors have been corrected (and counted), the dmesg output
> should not be reported as a warning, but rather as "informational".
> 
> Otherwise, using a certain well known vendor's PCIe parts in a USB4 docking
> station, the dmesg buffer can be spammed with correctable errors, 717 bytes
> per instance, potentially many MB per day.

Why don't you investigate why you are getting so many correctable errors?
Isn't solving the problem preferable to hiding the logs?

> 
> Given the "WARN" priority, these messages have already confused the typical
> user that stumbles across them, support staff (triaging feedback reports),
> and more than a few linux kernel devs. Changing to INFO will hide these
> messages from most audiences.
> 
> Signed-off-by: Grant Grundler 
> ---
>  drivers/pci/pcie/aer.c | 29 +++--
>  1 file changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index f6c24ded134c..cb6b96233967 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -687,23 +687,29 @@ static void __aer_print_error(struct pci_dev *dev,
>  {
>   const char **strings;
>   unsigned long status = info->status & ~info->mask;
> - const char *level, *errmsg;
>   int i;
>  
>   if (info->severity == AER_CORRECTABLE) {
>   strings = aer_correctable_error_string;
> - level = KERN_WARNING;
> + pci_info(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n",
> + info->status, info->mask);
>   } else {
>   strings = aer_uncorrectable_error_string;
> - level = KERN_ERR;
> + pci_err(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n",
> + info->status, info->mask);
>   }
>  
>   for_each_set_bit(i, , 32) {
> - errmsg = strings[i];
> + const char *errmsg = strings[i];
> +
>   if (!errmsg)
>   errmsg = "Unknown Error Bit";
>  
> - pci_printk(level, dev, "   [%2d] %-22s%s\n", i, errmsg,
> + if (info->severity == AER_CORRECTABLE)
> + pci_info(dev, "   [%2d] %-22s%s\n", i, errmsg,
> + info->first_error == i ? " (First)" : "");
> + else
> + pci_err(dev, "   [%2d] %-22s%s\n", i, errmsg,
>   info->first_error == i ? " (First)" : "");
>   }
>   pci_dev_aer_stats_incr(dev, info);
> @@ -724,7 +730,7 @@ void aer_print_error(struct pci_dev *dev, struct 
> aer_err_info *info)
>   layer = AER_GET_LAYER_ERROR(info->severity, info->status);
>   agent = AER_GET_AGENT(info->severity, info->status);
>  
> - level = (info->severity == AER_CORRECTABLE) ? KERN_WARNING : KERN_ERR;
> + level = (info->severity == AER_CORRECTABLE) ? KERN_INFO : KERN_ERR;
>  
>   pci_printk(level, dev, "PCIe Bus Error: severity=%s, type=%s, (%s)\n",
>  aer_error_severity_string[info->severity],
> @@ -797,14 +803,17 @@ void cper_print_aer(struct pci_dev *dev, int 
> aer_severity,
>   info.mask = mask;
>   info.first_error = PCI_ERR_CAP_FEP(aer->cap_control);
>  
> - pci_err(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
>   __aer_print_error(dev, );
> - pci_err(dev, "aer_layer=%s, aer_agent=%s\n",
> - aer_error_layer[layer], aer_agent_string[agent]);
>  
> - if (aer_severity != AER_CORRECTABLE)
> + if (aer_severity == AER_CORRECTABLE) {
> + pci_info(dev, "aer_layer=%s, aer_agent=%s\n",
> + aer_error_layer[layer], aer_agent_string[agent]);
> + } else {
> + pci_err(dev, "aer_layer=%s, aer_agent=%s\n",
> + aer_error_layer[layer], aer_agent_string[agent]);
>   pci_err(dev, "aer_uncor_severity: 0x%08x\n",
>   aer->uncor_severity);
> + }
>  
>   if (tlp_header_valid)
>   __print_tlp_header(dev, >header_log);

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


Re: [PATCH] PCI/AER: correctable error message as KERN_INFO

2023-03-17 Thread Grant Grundler
On Tue, Mar 14, 2023 at 5:24 PM Grant Grundler  wrote:
>
> On Tue, Mar 14, 2023 at 12:38 PM Bjorn Helgaas  wrote:
> >
> > On Tue, Feb 28, 2023 at 10:04:53PM -0800, Grant Grundler wrote:
> > > Since correctable errors have been corrected (and counted), the dmesg 
> > > output
> > > should not be reported as a warning, but rather as "informational".
> > >
> > > Otherwise, using a certain well known vendor's PCIe parts in a USB4 
> > > docking
> > > station, the dmesg buffer can be spammed with correctable errors, 717 
> > > bytes
> > > per instance, potentially many MB per day.
> > >
> > > Given the "WARN" priority, these messages have already confused the 
> > > typical
> > > user that stumbles across them, support staff (triaging feedback reports),
> > > and more than a few linux kernel devs. Changing to INFO will hide these
> > > messages from most audiences.
> > >
> > > Signed-off-by: Grant Grundler 
> > > ---
> > > This patch will likely conflict with:
> > >   
> > > https://lore.kernel.org/all/20230103165548.570377-1-rajat.khandel...@linux.intel.com/
> > >
> > > which I'd also like to see upstream. Please let me know to resubmit
> > > mine if Rajat's patch lands first. Or feel free to fix up this one.
> >
> > Yes.  I think it makes sense to separate this into two patches:
> >
> >   1) Log correctable errors as KERN_INFO instead of KERN_WARNING, and
> >   2) Rate-limit correctable error logging.
>
> I'm going to look into your comment below. I'll port Rajat's patch on
> top of mine to follow the order you've listed above.
>
> > >  drivers/pci/pcie/aer.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > > index f6c24ded134c..e4cf3ec40d66 100644
> > > --- a/drivers/pci/pcie/aer.c
> > > +++ b/drivers/pci/pcie/aer.c
> > > @@ -692,7 +692,7 @@ static void __aer_print_error(struct pci_dev *dev,
> > >
> > >   if (info->severity == AER_CORRECTABLE) {
> > >   strings = aer_correctable_error_string;
> > > - level = KERN_WARNING;
> > > + level = KERN_INFO;
> > >   } else {
> > >   strings = aer_uncorrectable_error_string;
> > >   level = KERN_ERR;
> > > @@ -724,7 +724,7 @@ void aer_print_error(struct pci_dev *dev, struct 
> > > aer_err_info *info)
> > >   layer = AER_GET_LAYER_ERROR(info->severity, info->status);
> > >   agent = AER_GET_AGENT(info->severity, info->status);
> > >
> > > - level = (info->severity == AER_CORRECTABLE) ? KERN_WARNING : 
> > > KERN_ERR;
> > > + level = (info->severity == AER_CORRECTABLE) ? KERN_INFO : KERN_ERR;
> > >
> > >   pci_printk(level, dev, "PCIe Bus Error: severity=%s, type=%s, 
> > > (%s)\n",
> > >  aer_error_severity_string[info->severity],
> >
> > Shouldn't we do the same in the cper_print_aer() path?  That path
> > currently uses pci_err() and then calls __aer_print_error(), so the
> > initial message will always be KERN_ERR, and the decoding done by
> > __aer_print_error() will be KERN_INFO (for correctable) or KERN_ERR.
>
> I was completely unaware of this since it's not causing me any
> immediate problems. But I agree the message priority should be
> consistent for correctable errors.

I've just posted a V2 which I believe is against "pci-next":

grundler <1607>git remote -v show pci-next
* remote pci-next
  Fetch URL: git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git
  Push  URL: git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git
  HEAD branch: main
  Remote branches:
aer   tracked
controller/dt tracked
controller/kirin  tracked
controller/layerscape tracked
controller/rcar   tracked
for-linus tracked
main  tracked
next  tracked
  Local branch configured for 'git pull':
aer_correctable_info merges with remote next

Please let me know if this is the wrong git tree and branch to track.

> > Seems like a shame to do the same test in three places, but would
> > require a little more refactoring to avoid that.
>
> I don't mind doing the same test in multiple places. If refactoring
> this isn't straight forward, I'll leave the refactoring for someone
> more ambitious. :D

I've moved one of the pci_info lines from cper_print_aer()  to
__aer_print_info() since the status/mask are the same for both paths
that invoke __aer_print_info(). But that's as far as I understand what
each of the paths that calls __aer_print_info() do.  If this is not
OK, I can move it back.

cheers,
grant


[PATCHv2 pci-next 2/2] PCI/AER: Rate limit the reporting of the correctable errors

2023-03-17 Thread Grant Grundler
From: Rajat Khandelwal 

There are many instances where correctable errors tend to inundate
the message buffer. We observe such instances during thunderbolt PCIe
tunneling.

It's true that they are mitigated by the hardware and are non-fatal
but we shouldn't be spamming the logs with such correctable errors as it
confuses other kernel developers less familiar with PCI errors, support
staff, and users who happen to look at the logs, hence rate limit them.

A typical example log inside an HP TBT4 dock:
[54912.661142] pcieport :00:07.0: AER: Multiple Corrected error received: 
:2b:00.0
[54912.661194] igc :2b:00.0: PCIe Bus Error: severity=Corrected, type=Data 
Link Layer, (Transmitter ID)
[54912.661203] igc :2b:00.0:   device [8086:5502] error 
status/mask=1100/2000
[54912.661211] igc :2b:00.0:[ 8] Rollover
[54912.661219] igc :2b:00.0:[12] Timeout
[54982.838760] pcieport :00:07.0: AER: Corrected error received: 
:2b:00.0
[54982.838798] igc :2b:00.0: PCIe Bus Error: severity=Corrected, type=Data 
Link Layer, (Transmitter ID)
[54982.838808] igc :2b:00.0:   device [8086:5502] error 
status/mask=1000/2000
[54982.838817] igc :2b:00.0:[12] Timeout

This gets repeated continuously, thus inundating the buffer.

Signed-off-by: Rajat Khandelwal 
Signed-off-by: Grant Grundler 
---
 drivers/pci/pcie/aer.c | 42 --
 1 file changed, 28 insertions(+), 14 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index cb6b96233967..b592cea8bffe 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -706,8 +706,8 @@ static void __aer_print_error(struct pci_dev *dev,
errmsg = "Unknown Error Bit";
 
if (info->severity == AER_CORRECTABLE)
-   pci_info(dev, "   [%2d] %-22s%s\n", i, errmsg,
-   info->first_error == i ? " (First)" : "");
+   pci_info_ratelimited(dev, "   [%2d] %-22s%s\n", i, 
errmsg,
+info->first_error == i ? " 
(First)" : "");
else
pci_err(dev, "   [%2d] %-22s%s\n", i, errmsg,
info->first_error == i ? " (First)" : "");
@@ -719,7 +719,6 @@ void aer_print_error(struct pci_dev *dev, struct 
aer_err_info *info)
 {
int layer, agent;
int id = ((dev->bus->number << 8) | dev->devfn);
-   const char *level;
 
if (!info->status) {
pci_err(dev, "PCIe Bus Error: severity=%s, type=Inaccessible, 
(Unregistered Agent ID)\n",
@@ -730,14 +729,21 @@ void aer_print_error(struct pci_dev *dev, struct 
aer_err_info *info)
layer = AER_GET_LAYER_ERROR(info->severity, info->status);
agent = AER_GET_AGENT(info->severity, info->status);
 
-   level = (info->severity == AER_CORRECTABLE) ? KERN_INFO : KERN_ERR;
+   if (info->severity == AER_CORRECTABLE) {
+   pci_info_ratelimited(dev, "PCIe Bus Error: severity=%s, 
type=%s, (%s)\n",
+aer_error_severity_string[info->severity],
+aer_error_layer[layer], 
aer_agent_string[agent]);
 
-   pci_printk(level, dev, "PCIe Bus Error: severity=%s, type=%s, (%s)\n",
-  aer_error_severity_string[info->severity],
-  aer_error_layer[layer], aer_agent_string[agent]);
+   pci_info_ratelimited(dev, "  device [%04x:%04x] error 
status/mask=%08x/%08x\n",
+dev->vendor, dev->device, info->status, 
info->mask);
+   } else {
+   pci_err(dev, "PCIe Bus Error: severity=%s, type=%s, (%s)\n",
+   aer_error_severity_string[info->severity],
+   aer_error_layer[layer], aer_agent_string[agent]);
 
-   pci_printk(level, dev, "  device [%04x:%04x] error 
status/mask=%08x/%08x\n",
-  dev->vendor, dev->device, info->status, info->mask);
+   pci_err(dev, "  device [%04x:%04x] error 
status/mask=%08x/%08x\n",
+   dev->vendor, dev->device, info->status, info->mask);
+   }
 
__aer_print_error(dev, info);
 
@@ -757,11 +763,19 @@ static void aer_print_port_info(struct pci_dev *dev, 
struct aer_err_info *info)
u8 bus = info->id >> 8;
u8 devfn = info->id & 0xff;
 
-   pci_info(dev, "%s%s error received: %04x:%02x:%02x.%d\n",
-info->multi_error_valid ? "Multiple " : "",
-aer_error_severity_string[info->severity],
-pci_domain_nr(dev->bus), bus, PCI_SLOT(devfn),
-PCI_FUNC(devfn));
+   if (info->severity == AER_CORRECTABLE)
+   pci_info_ratelimited(dev, "%s%s error received: 
%04x:%02x:%02x.%d\n",
+info->multi_error_valid ? "Multiple " : "",
+

[PATCHv2 pci-next 1/2] PCI/AER: correctable error message as KERN_INFO

2023-03-17 Thread Grant Grundler
Since correctable errors have been corrected (and counted), the dmesg output
should not be reported as a warning, but rather as "informational".

Otherwise, using a certain well known vendor's PCIe parts in a USB4 docking
station, the dmesg buffer can be spammed with correctable errors, 717 bytes
per instance, potentially many MB per day.

Given the "WARN" priority, these messages have already confused the typical
user that stumbles across them, support staff (triaging feedback reports),
and more than a few linux kernel devs. Changing to INFO will hide these
messages from most audiences.

Signed-off-by: Grant Grundler 
---
 drivers/pci/pcie/aer.c | 29 +++--
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index f6c24ded134c..cb6b96233967 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -687,23 +687,29 @@ static void __aer_print_error(struct pci_dev *dev,
 {
const char **strings;
unsigned long status = info->status & ~info->mask;
-   const char *level, *errmsg;
int i;
 
if (info->severity == AER_CORRECTABLE) {
strings = aer_correctable_error_string;
-   level = KERN_WARNING;
+   pci_info(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n",
+   info->status, info->mask);
} else {
strings = aer_uncorrectable_error_string;
-   level = KERN_ERR;
+   pci_err(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n",
+   info->status, info->mask);
}
 
for_each_set_bit(i, , 32) {
-   errmsg = strings[i];
+   const char *errmsg = strings[i];
+
if (!errmsg)
errmsg = "Unknown Error Bit";
 
-   pci_printk(level, dev, "   [%2d] %-22s%s\n", i, errmsg,
+   if (info->severity == AER_CORRECTABLE)
+   pci_info(dev, "   [%2d] %-22s%s\n", i, errmsg,
+   info->first_error == i ? " (First)" : "");
+   else
+   pci_err(dev, "   [%2d] %-22s%s\n", i, errmsg,
info->first_error == i ? " (First)" : "");
}
pci_dev_aer_stats_incr(dev, info);
@@ -724,7 +730,7 @@ void aer_print_error(struct pci_dev *dev, struct 
aer_err_info *info)
layer = AER_GET_LAYER_ERROR(info->severity, info->status);
agent = AER_GET_AGENT(info->severity, info->status);
 
-   level = (info->severity == AER_CORRECTABLE) ? KERN_WARNING : KERN_ERR;
+   level = (info->severity == AER_CORRECTABLE) ? KERN_INFO : KERN_ERR;
 
pci_printk(level, dev, "PCIe Bus Error: severity=%s, type=%s, (%s)\n",
   aer_error_severity_string[info->severity],
@@ -797,14 +803,17 @@ void cper_print_aer(struct pci_dev *dev, int aer_severity,
info.mask = mask;
info.first_error = PCI_ERR_CAP_FEP(aer->cap_control);
 
-   pci_err(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask);
__aer_print_error(dev, );
-   pci_err(dev, "aer_layer=%s, aer_agent=%s\n",
-   aer_error_layer[layer], aer_agent_string[agent]);
 
-   if (aer_severity != AER_CORRECTABLE)
+   if (aer_severity == AER_CORRECTABLE) {
+   pci_info(dev, "aer_layer=%s, aer_agent=%s\n",
+   aer_error_layer[layer], aer_agent_string[agent]);
+   } else {
+   pci_err(dev, "aer_layer=%s, aer_agent=%s\n",
+   aer_error_layer[layer], aer_agent_string[agent]);
pci_err(dev, "aer_uncor_severity: 0x%08x\n",
aer->uncor_severity);
+   }
 
if (tlp_header_valid)
__print_tlp_header(dev, >header_log);
-- 
2.40.0.rc1.284.g88254d51c5-goog



Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.3-3 tag

2023-03-17 Thread pr-tracker-bot
The pull request you sent on Fri, 17 Mar 2023 14:24:51 +1100:

> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
> tags/powerpc-6.3-3

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/17adaeaed05b879964b1a72e4db87ebf3b2a38ba

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


RE: [EXT] Re: [PATCH 1/1] PCI: layerscape: Add the workaround for A-010305

2023-03-17 Thread Frank Li
> >   pci->ep.ops = _pcie_ep_ops;
> >
> > + pcie->big_endian = of_property_read_bool(dev->of_node, "big-
> endian");
> 
> Somewhat surprising that 6c389328c985 ("dt-bindings: pci:
> layerscape-pci: Add a optional property big-endian") added this
> property a year ago, but it has been unused until now?
> 

No, it also for pci host part. Zhiqiang send patch 
https://lore.kernel.org/lkml/20210407030948.3845-1-zhiqiang@nxp.com/

Not sure why bind-doc accepted, but driver code patch have not accepted. 

The same case happen at 
https://lore.kernel.org/imx/20230209151050.233973-1-frank...@nxp.com/T/#t

I tried repost the missed part. But no any response over months.  The above one 
is just one line change. 

> > --
> > 2.34.1
> >


Re: powerpc/pseries: Fix exception handling in pSeries_reconfig_add_node()

2023-03-17 Thread Nathan Lynch
Markus Elfring  writes:
>>> The label “out_err” was used to jump to another pointer check despite of
>>> the detail in the implementation of the function “pSeries_reconfig_add_node”
>>> that it was determined already that the corresponding variable contained
>>> a null pointer (because of a failed function call in two cases).
>>>
>>> 1. Thus return directly after a call of the function “kzalloc” failed.
>>>
>>> 2. Use more appropriate labels instead.
>>>
>>> 3. Delete a redundant check.
>>>
>>> 4. Omit an explicit initialisation for the local variable “err”.
>>>
>>> This issue was detected by using the Coccinelle software.
>> Is there a correctness or safety issue here?
>
> I got the impression that the application of only a single label like 
> “out_err”
> resulted in improvable implementation details.

I don't understand what you're trying to say here. It doesn't seem to
answer my question.

>> The subject uses the word "fix" but the commit message doesn't seem to 
>> identify one.
>
> Can you find the proposed adjustments reasonable?

In the absence of a bug fix or an improvement in readability, not
really, sorry. It adds to the function more goto labels and another
return, apparently to avoid checks that are sometimes redundant (but not
incorrect) at the C source code level. An optimizing compiler doesn't
necessarily arrange the generated code in the same way.

>> Can you share how Coccinelle is being invoked and its output?
>
> Please take another look at available information sources.
> https://lore.kernel.org/cocci/f9303bdc-b1a7-be5e-56c6-dfa8232b8...@web.de/

I wasn't cc'd on this and I'm not subscribed to any lists in the
recipients for that message, so not sure how I would take "another"
look. :-)


Re: powerpc/pseries: Fix exception handling in pSeries_reconfig_add_node()

2023-03-17 Thread Markus Elfring
>> The label “out_err” was used to jump to another pointer check despite of
>> the detail in the implementation of the function “pSeries_reconfig_add_node”
>> that it was determined already that the corresponding variable contained
>> a null pointer (because of a failed function call in two cases).
>>
>> 1. Thus return directly after a call of the function “kzalloc” failed.
>>
>> 2. Use more appropriate labels instead.
>>
>> 3. Delete a redundant check.
>>
>> 4. Omit an explicit initialisation for the local variable “err”.
>>
>> This issue was detected by using the Coccinelle software.
> Is there a correctness or safety issue here?

I got the impression that the application of only a single label like “out_err”
resulted in improvable implementation details.


> The subject uses the word "fix" but the commit message doesn't seem to 
> identify one.

Can you find the proposed adjustments reasonable?


> Can you share how Coccinelle is being invoked and its output?

Please take another look at available information sources.
https://lore.kernel.org/cocci/f9303bdc-b1a7-be5e-56c6-dfa8232b8...@web.de/
https://sympa.inria.fr/sympa/arc/cocci/2023-03/msg00017.html

Another command example:
Markus_Elfring@Sonne:…/Projekte/Linux/next-patched> spatch --timeout 23 -j4 
--chunksize 1 -dir . 
…/Projekte/Coccinelle/janitor/show_jumps_to_pointer_check.cocci > 
../show_jumps_to_pointer_check-next-20230315.diff 2> 
../show_jumps_to_pointer_check-errors-next-20230315.txt


Regards,
Markus


Re: [PATCH v2] hvc/xen: prevent concurrent accesses to the shared ring

2023-03-17 Thread Juergen Gross

On 30.11.22 16:09, Roger Pau Monne wrote:

The hvc machinery registers both a console and a tty device based on
the hv ops provided by the specific implementation.  Those two
interfaces however have different locks, and there's no single locks
that's shared between the tty and the console implementations, hence
the driver needs to protect itself against concurrent accesses.
Otherwise concurrent calls using the split interfaces are likely to
corrupt the ring indexes, leaving the console unusable.

Introduce a lock to xencons_info to serialize accesses to the shared
ring.  This is only required when using the shared memory console,
concurrent accesses to the hypercall based console implementation are
not an issue.

Note the conditional logic in domU_read_console() is slightly modified
so the notify_daemon() call can be done outside of the locked region:
it's an hypercall and there's no need for it to be done with the lock
held.

Fixes: b536b4b96230 ('xen: use the hvc console infrastructure for Xen console')
Signed-off-by: Roger Pau Monné 


Reviewed-by: Juergen Gross 


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH] powerpc/pseries: Fix exception handling in pSeries_reconfig_add_node()

2023-03-17 Thread Nathan Lynch
Markus Elfring  writes:
>
> The label “out_err” was used to jump to another pointer check despite of
> the detail in the implementation of the function “pSeries_reconfig_add_node”
> that it was determined already that the corresponding variable contained
> a null pointer (because of a failed function call in two cases).
>
> 1. Thus return directly after a call of the function “kzalloc” failed.
>
> 2. Use more appropriate labels instead.
>
> 3. Delete a redundant check.
>
> 4. Omit an explicit initialisation for the local variable “err”.
>
> This issue was detected by using the Coccinelle software.

Is there a correctness or safety issue here? The subject uses the word
"fix" but the commit message doesn't seem to identify one.

Can you share how Coccinelle is being invoked and its output?


Re: [kvm-unit-tests PATCH 3/7] powerpc: abstract H_CEDE calls into a sleep() function

2023-03-17 Thread Thomas Huth

On 17/03/2023 13.36, Nicholas Piggin wrote:

This consolidates several implementations, and it no longer leaves
MSR[EE] enabled after the decrementer interrupt is handled, but
rather disables it on return.

The handler no longer allows a continuous ticking, but rather dec
has to be re-armed and EE re-enabled (e.g., via H_CEDE hcall) each
time.

Signed-off-by: Nicholas Piggin 
---

...

diff --git a/lib/powerpc/processor.c b/lib/powerpc/processor.c
index ec85b9d..f1fb50f 100644
--- a/lib/powerpc/processor.c
+++ b/lib/powerpc/processor.c
@@ -10,6 +10,8 @@
  #include 
  #include 
  #include 
+#include 
+#include 
  
  static struct {

void (*func)(struct pt_regs *, void *data);
@@ -54,3 +56,32 @@ void udelay(uint64_t us)
  {
delay((us * tb_hz) / 100);
  }
+
+void sleep(uint64_t cycles)


When I see a sleep() in a C program, I automatically assume that it's 
parameter is "seconds", so having a sleep() function here that is taking 
cycles as a parameter is confusing. Could you please name the function 
differently?



+{
+   uint64_t start, end, now;
+
+   start = now = get_tb();
+   end = start + cycles;
+
+   while (end > now) {
+   uint64_t left = end - now;
+
+   /* Could support large decrementer */
+   if (left > 0x7fff)
+   left = 0x7fff;
+
+   asm volatile ("mtdec %0" : : "r" (left));
+   handle_exception(0x900, _handler_oneshot, NULL);
+   if (hcall(H_CEDE) != H_SUCCESS) {
+   printf("H_CEDE failed\n");
+   abort();
+   }
+   handle_exception(0x900, NULL, NULL);
+
+   if (left < 0x7fff)
+   break;


Shouldn't that be covered by the "end > now" in the while loop condition 
check already?



+   now = get_tb();
+   }
+}


 Thomas



Re: [kvm-unit-tests PATCH 2/7] powerpc: add local variant of SPR test

2023-03-17 Thread Thomas Huth

On 17/03/2023 13.36, Nicholas Piggin wrote:

This adds the non-migration variant of the SPR test to the matrix,
which can be simpler to run and debug.

Signed-off-by: Nicholas Piggin 
---
  powerpc/unittests.cfg | 4 
  1 file changed, 4 insertions(+)

diff --git a/powerpc/unittests.cfg b/powerpc/unittests.cfg
index 1e74948..3e41598 100644
--- a/powerpc/unittests.cfg
+++ b/powerpc/unittests.cfg
@@ -68,5 +68,9 @@ groups = h_cede_tm
  
  [sprs]

  file = sprs.elf
+groups = sprs
+
+[sprs-migration]
+file = sprs.elf
  extra_params = -append '-w'
  groups = migration


Reviewed-by: Thomas Huth 



[kvm-unit-tests PATCH 7/7] powerpc/spapr_vpa: Add basic VPA tests

2023-03-17 Thread Nicholas Piggin
The VPA is a(n optional) memory structure shared between the hypervisor
and operating system, defined by PAPR. This test defines the structure
and adds registration, deregistration, and a few simple sanity tests.

Signed-off-by: Nicholas Piggin 
---
 lib/linux/compiler.h|  2 +
 lib/powerpc/asm/hcall.h |  1 +
 lib/ppc64/asm/vpa.h | 62 
 powerpc/Makefile.ppc64  |  2 +-
 powerpc/spapr_vpa.c | 90 +
 5 files changed, 156 insertions(+), 1 deletion(-)
 create mode 100644 lib/ppc64/asm/vpa.h
 create mode 100644 powerpc/spapr_vpa.c

diff --git a/lib/linux/compiler.h b/lib/linux/compiler.h
index 6f565e4..c9d205e 100644
--- a/lib/linux/compiler.h
+++ b/lib/linux/compiler.h
@@ -45,7 +45,9 @@
 
 #define barrier()  asm volatile("" : : : "memory")
 
+#ifndef __always_inline
 #define __always_inlineinline __attribute__((always_inline))
+#endif
 #define noinline __attribute__((noinline))
 #define __unused __attribute__((__unused__))
 
diff --git a/lib/powerpc/asm/hcall.h b/lib/powerpc/asm/hcall.h
index 1173fea..e0f5009 100644
--- a/lib/powerpc/asm/hcall.h
+++ b/lib/powerpc/asm/hcall.h
@@ -18,6 +18,7 @@
 #define H_SET_SPRG00x24
 #define H_SET_DABR 0x28
 #define H_PAGE_INIT0x2c
+#define H_REGISTER_VPA 0xDC
 #define H_CEDE 0xE0
 #define H_GET_TERM_CHAR0x54
 #define H_PUT_TERM_CHAR0x58
diff --git a/lib/ppc64/asm/vpa.h b/lib/ppc64/asm/vpa.h
new file mode 100644
index 000..11dde01
--- /dev/null
+++ b/lib/ppc64/asm/vpa.h
@@ -0,0 +1,62 @@
+#ifndef _ASMPOWERPC_VPA_H_
+#define _ASMPOWERPC_VPA_H_
+/*
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+
+#ifndef __ASSEMBLY__
+
+struct vpa {
+   uint32_tdescriptor;
+   uint16_tsize;
+   uint8_t reserved1[3];
+   uint8_t status;
+   uint8_t reserved2[14];
+   uint32_tfru_node_id;
+   uint32_tfru_proc_id;
+   uint8_t reserved3[56];
+   uint8_t vhpn_change_counters[8];
+   uint8_t reserved4[80];
+   uint8_t cede_latency;
+   uint8_t maintain_ebb;
+   uint8_t reserved5[6];
+   uint8_t dtl_enable_mask;
+   uint8_t dedicated_cpu_donate;
+   uint8_t maintain_fpr;
+   uint8_t maintain_pmc;
+   uint8_t reserved6[28];
+   uint64_tidle_estimate_purr;
+   uint8_t reserved7[28];
+   uint16_tmaintain_nr_slb;
+   uint8_t idle;
+   uint8_t maintain_vmx;
+   uint32_tvp_dispatch_count;
+   uint32_tvp_dispatch_dispersion;
+   uint64_tvp_fault_count;
+   uint64_tvp_fault_tb;
+   uint64_tpurr_exprop_idle;
+   uint64_tspurr_exprop_idle;
+   uint64_tpurr_exprop_busy;
+   uint64_tspurr_exprop_busy;
+   uint64_tpurr_donate_idle;
+   uint64_tspurr_donate_idle;
+   uint64_tpurr_donate_busy;
+   uint64_tspurr_donate_busy;
+   uint64_tvp_wait3_tb;
+   uint64_tvp_wait2_tb;
+   uint64_tvp_wait1_tb;
+   uint64_tpurr_exprop_adjunct_busy;
+   uint64_tspurr_exprop_adjunct_busy;
+   uint32_tsupervisor_pagein_count;
+   uint8_t reserved8[4];
+   uint64_tpurr_exprop_adjunct_idle;
+   uint64_tspurr_exprop_adjunct_idle;
+   uint64_tadjunct_insns_executed;
+   uint8_t reserved9[120];
+   uint64_tdtl_index;
+   uint8_t reserved10[96];
+};
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASMPOWERPC_VPA_H_ */
diff --git a/powerpc/Makefile.ppc64 b/powerpc/Makefile.ppc64
index ea68447..b0ed2b1 100644
--- a/powerpc/Makefile.ppc64
+++ b/powerpc/Makefile.ppc64
@@ -19,7 +19,7 @@ reloc.o  = $(TEST_DIR)/reloc64.o
 OBJDIRS += lib/ppc64
 
 # ppc64 specific tests
-tests =
+tests = $(TEST_DIR)/spapr_vpa.elf
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
diff --git a/powerpc/spapr_vpa.c b/powerpc/spapr_vpa.c
new file mode 100644
index 000..a5047f1
--- /dev/null
+++ b/powerpc/spapr_vpa.c
@@ -0,0 +1,90 @@
+/*
+ * Test sPAPR hypervisor calls (aka. h-calls)
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include  /* for endian accessors */
+
+static void print_vpa(struct vpa *vpa)
+{
+   printf("VPA\n");
+   printf("descriptor: 0x%08x\n", 
be32_to_cpu(vpa->descriptor));
+   printf("size:   0x%04x\n", 
be16_to_cpu(vpa->size));
+   printf("status:   0x%02x\n", vpa->status);
+   printf("fru_node_id:0x%08x\n", 

[kvm-unit-tests PATCH 6/7] powerpc/sprs: Specify SPRs with data rather than code

2023-03-17 Thread Nicholas Piggin
A significant rework that builds an array of 'struct spr', where each
element describes an SPR. This makes various metadata about the SPR
like name and access type easier to carry and use.

Hypervisor privileged registers are described despite not being used
at the moment for completeness, but also the code might one day be
reused for a hypervisor-privileged test.

Signed-off-by: Nicholas Piggin 
---
This ended up a little over-engineered perhaps, but there are lots of
SPRs, lots of access types, lots of changes between processor and ISA
versions, and lots of places they are implemented and used, so lots of
room for mistakes. There is not a good system in place to easily
see that userspace, supervisor, etc., switches perform all the right
SPR context switching so this is a nice test case to have.

The sprs test quickly caught a few QEMU TCG SPR bugs which really
demonstrates its value and motivated me to improve the SPR coverage
a little.

Thanks,
Nick
---
 powerpc/sprs.c | 586 +
 1 file changed, 391 insertions(+), 195 deletions(-)

diff --git a/powerpc/sprs.c b/powerpc/sprs.c
index b633ea8..257903a 100644
--- a/powerpc/sprs.c
+++ b/powerpc/sprs.c
@@ -82,231 +82,404 @@ static void mtspr(unsigned spr, uint64_t val)
: "lr", "ctr", "xer");
 }
 
-uint64_t before[1024], after[1024];
+static uint64_t before[1024], after[1024];
 
-/* Common SPRs for all PowerPC CPUs */
-static void set_sprs_common(uint64_t val)
-{
-   // mtspr(9, val);   /* CTR */ /* Used by mfspr/mtspr */
-   // mtspr(273, val); /* SPRG1 */  /* Used by our exception handler */
-   mtspr(274, val);/* SPRG2 */
-   mtspr(275, val);/* SPRG3 */
-}
+#define SPR_PR_READ0x0001
+#define SPR_PR_WRITE   0x0002
+#define SPR_OS_READ0x0010
+#define SPR_OS_WRITE   0x0020
+#define SPR_HV_READ0x0100
+#define SPR_HV_WRITE   0x0200
+
+#define RW 0x333
+#define RO 0x111
+#define WO 0x222
+#define OS_RW  0x330
+#define OS_RO  0x110
+#define OS_WO  0x220
+#define HV_RW  0x300
+#define HV_RO  0x100
+#define HV_WO  0x200
+
+#define SPR_ASYNC  0x1000  /* May be updated asynchronously */
+#define SPR_INT0x2000  /* May be updated by synchronous 
interrupt */
+#define SPR_HARNESS0x4000  /* Test harness uses the register */
+
+struct spr {
+   const char  *name;
+   uint8_t width;
+   uint16_taccess;
+   uint16_ttype;
+};
+
+/* SPRs common denominator back to PowerPC Operating Environment Architecture 
*/
+static const struct spr sprs_common[1024] = {
+  [1] = {"XER",64, RW, SPR_HARNESS, }, /* 
Compiler */
+  [8] = {"LR", 64, RW, SPR_HARNESS, }, /* 
Compiler, mfspr/mtspr */
+  [9] = {"CTR",64, RW, SPR_HARNESS, }, /* 
Compiler, mfspr/mtspr */
+ [18] = {"DSISR",  32, OS_RW,  SPR_INT, },
+ [19] = {"DAR",64, OS_RW,  SPR_INT, },
+ [26] = {"SRR0",   64, OS_RW,  SPR_INT, },
+ [27] = {"SRR1",   64, OS_RW,  SPR_INT, },
+[268] = {"TB", 64, RO  ,   SPR_ASYNC, },
+[269] = {"TBU",32, RO, SPR_ASYNC, },
+[272] = {"SPRG0",  64, OS_RW,  SPR_HARNESS, }, /* Int stack */
+[273] = {"SPRG1",  64, OS_RW,  SPR_HARNESS, }, /* Scratch */
+[274] = {"SPRG2",  64, OS_RW, },
+[275] = {"SPRG3",  64, OS_RW, },
+[287] = {"PVR",32, OS_RO, },
+};
 
 /* SPRs from PowerPC Operating Environment Architecture, Book III, Vers. 2.01 
*/
-static void set_sprs_book3s_201(uint64_t val)
-{
-   mtspr(18, val); /* DSISR */
-   mtspr(19, val); /* DAR */
-   mtspr(152, val);/* CTRL */
-   mtspr(256, val);/* VRSAVE */
-   mtspr(786, val);/* MMCRA */
-   mtspr(795, val);/* MMCR0 */
-   mtspr(798, val);/* MMCR1 */
-}
+static const struct spr sprs_201[1024] = {
+ [22] = {"DEC",32, OS_RW,  SPR_ASYNC, },
+ [25] = {"SDR1",   64, HV_RW | OS_RO, },
+ [29] = {"ACCR",   64, OS_RW, },
+[136] = {"CTRL",   32, RO, },
+[152] = {"CTRL",   32, OS_WO, },
+[259] = {"SPRG3",  64, RO, },
+/* ASR, EAR omitted */
+[268] = {"TB", 64, RO, },
+[269] = {"TBU",32, RO, },
+[284] = {"TBL",32, HV_WO, },
+[285] = {"TBU",32, HV_WO, },
+[310] = {"HDEC",   32, HV_RW, },
+[1013]= {"DABR",   64, HV_RW | OS_RO, },
+[1023]= {"PIR",32, OS_RO, },
+};
+
+static const struct spr sprs_970_pmu[1024] = {
+/* POWER4+ PMU, should find PPC970 and confirm */
+[770] = {"MMCRA",  64, RO, },
+[771] = {"PMC1",   32, RO, },
+[772] = {"PMC2",   32, RO, },

[kvm-unit-tests PATCH 5/7] powerpc: Indirect SPR accessor functions

2023-03-17 Thread Nicholas Piggin
Make overly-clever SPR accessor functions that allow a non-constant
SPR number to be specified. This will be used to restructure test
in the next change.

Signed-off-by: Nicholas Piggin 
---
 powerpc/sprs.c | 63 ++
 1 file changed, 54 insertions(+), 9 deletions(-)

diff --git a/powerpc/sprs.c b/powerpc/sprs.c
index 45f77a5..b633ea8 100644
--- a/powerpc/sprs.c
+++ b/powerpc/sprs.c
@@ -28,21 +28,66 @@
 #include 
 #include 
 
-#define mfspr(nr) ({ \
-   uint64_t ret; \
-   asm volatile("mfspr %0,%1" : "=r"(ret) : "i"(nr)); \
-   ret; \
-})
+/* "Indirect" mfspr/mtspr which accept a non-constant spr number */
+static uint64_t mfspr(unsigned spr)
+{
+   uint64_t tmp;
+   uint64_t ret;
+
+   asm volatile(
+"  bcl 20, 31, 1f  \n"
+"1:mflr%0  \n"
+"  addi%0, %0, (2f-1b) \n"
+"  add %0, %0, %2  \n"
+"  mtctr   %0  \n"
+"  bctr\n"
+"2:\n"
+".LSPR=0   \n"
+".rept 1024\n"
+"  mfspr   %1, .LSPR   \n"
+"  b   3f  \n"
+"  .LSPR=.LSPR+1   \n"
+".endr \n"
+"3:\n"
+   : "="(tmp),
+ "=r"(ret)
+   : "r"(spr*8) /* 8 bytes per 'mfspr ; b' block */
+   : "lr", "ctr");
+
+   return ret;
+}
 
-#define mtspr(nr, val) \
-   asm volatile("mtspr %0,%1" : : "i"(nr), "r"(val))
+static void mtspr(unsigned spr, uint64_t val)
+{
+   uint64_t tmp;
+
+   asm volatile(
+"  bcl 20, 31, 1f  \n"
+"1:mflr%0  \n"
+"  addi%0, %0, (2f-1b) \n"
+"  add %0, %0, %2  \n"
+"  mtctr   %0  \n"
+"  bctr\n"
+"2:\n"
+".LSPR=0   \n"
+".rept 1024\n"
+"  mtspr   .LSPR, %1   \n"
+"  b   3f  \n"
+"  .LSPR=.LSPR+1   \n"
+".endr \n"
+"3:\n"
+   : "="(tmp)
+   : "r"(val),
+ "r"(spr*8) /* 8 bytes per 'mfspr ; b' block */
+   : "lr", "ctr", "xer");
+}
 
 uint64_t before[1024], after[1024];
 
 /* Common SPRs for all PowerPC CPUs */
 static void set_sprs_common(uint64_t val)
 {
-   mtspr(9, val);  /* CTR */
+   // mtspr(9, val);   /* CTR */ /* Used by mfspr/mtspr */
// mtspr(273, val); /* SPRG1 */  /* Used by our exception handler */
mtspr(274, val);/* SPRG2 */
mtspr(275, val);/* SPRG3 */
@@ -156,7 +201,7 @@ static void set_sprs(uint64_t val)
 
 static void get_sprs_common(uint64_t *v)
 {
-   v[9] = mfspr(9);/* CTR */
+   v[9] = mfspr(9);/* CTR */ /* Used by mfspr/mtspr */
// v[273] = mfspr(273); /* SPRG1 */ /* Used by our exception handler */
v[274] = mfspr(274);/* SPRG2 */
v[275] = mfspr(275);/* SPRG3 */
-- 
2.37.2



[kvm-unit-tests PATCH 4/7] powerpc: Add ISA v3.1 (POWER10) support to SPR test

2023-03-17 Thread Nicholas Piggin
This is a very basic detection that does not include all new SPRs.

Signed-off-by: Nicholas Piggin 
---
 powerpc/sprs.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/powerpc/sprs.c b/powerpc/sprs.c
index 161a6ba..45f77a5 100644
--- a/powerpc/sprs.c
+++ b/powerpc/sprs.c
@@ -117,6 +117,15 @@ static void set_sprs_book3s_300(uint64_t val)
mtspr(823, val);/* PSSCR */
 }
 
+/* SPRs from Power ISA Version 3.1B */
+static void set_sprs_book3s_31(uint64_t val)
+{
+   set_sprs_book3s_207(val);
+   mtspr(48, val); /* PIDR */
+   /* 3.1 removes TIDR */
+   mtspr(823, val);/* PSSCR */
+}
+
 static void set_sprs(uint64_t val)
 {
uint32_t pvr = mfspr(287);  /* Processor Version Register */
@@ -137,6 +146,9 @@ static void set_sprs(uint64_t val)
case 0x4e:  /* POWER9 */
set_sprs_book3s_300(val);
break;
+   case 0x80:  /* POWER10 */
+   set_sprs_book3s_31(val);
+   break;
default:
puts("Warning: Unknown processor version!\n");
}
@@ -220,6 +232,13 @@ static void get_sprs_book3s_300(uint64_t *v)
v[823] = mfspr(823);/* PSSCR */
 }
 
+static void get_sprs_book3s_31(uint64_t *v)
+{
+   get_sprs_book3s_207(v);
+   v[48] = mfspr(48);  /* PIDR */
+   v[823] = mfspr(823);/* PSSCR */
+}
+
 static void get_sprs(uint64_t *v)
 {
uint32_t pvr = mfspr(287);  /* Processor Version Register */
@@ -240,6 +259,9 @@ static void get_sprs(uint64_t *v)
case 0x4e:  /* POWER9 */
get_sprs_book3s_300(v);
break;
+   case 0x80:  /* POWER10 */
+   get_sprs_book3s_31(v);
+   break;
}
 }
 
-- 
2.37.2



[kvm-unit-tests PATCH 1/7] MAINTAINERS: Update powerpc list

2023-03-17 Thread Nicholas Piggin
KVM development on powerpc has moved to the Linux on Power mailing list,
as per linux.git commit 19b27f37ca97d ("MAINTAINERS: Update powerpc KVM
entry").

Signed-off-by: Nicholas Piggin 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 649de50..b545a45 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -79,7 +79,7 @@ M: Laurent Vivier 
 M: Thomas Huth 
 S: Maintained
 L: k...@vger.kernel.org
-L: kvm-...@vger.kernel.org
+L: linuxppc-dev@lists.ozlabs.org
 F: powerpc/
 F: lib/powerpc/
 F: lib/ppc64/
-- 
2.37.2



[kvm-unit-tests PATCH 3/7] powerpc: abstract H_CEDE calls into a sleep() function

2023-03-17 Thread Nicholas Piggin
This consolidates several implementations, and it no longer leaves
MSR[EE] enabled after the decrementer interrupt is handled, but
rather disables it on return.

The handler no longer allows a continuous ticking, but rather dec
has to be re-armed and EE re-enabled (e.g., via H_CEDE hcall) each
time.

Signed-off-by: Nicholas Piggin 
---
 lib/powerpc/asm/handlers.h  |  2 +-
 lib/powerpc/asm/ppc_asm.h   |  1 +
 lib/powerpc/asm/processor.h |  1 +
 lib/powerpc/handlers.c  | 10 --
 lib/powerpc/processor.c | 31 +++
 powerpc/sprs.c  |  6 +-
 powerpc/tm.c| 20 +---
 7 files changed, 40 insertions(+), 31 deletions(-)

diff --git a/lib/powerpc/asm/handlers.h b/lib/powerpc/asm/handlers.h
index 64ba727..e4a0cd4 100644
--- a/lib/powerpc/asm/handlers.h
+++ b/lib/powerpc/asm/handlers.h
@@ -3,6 +3,6 @@
 
 #include 
 
-void dec_except_handler(struct pt_regs *regs, void *data);
+void dec_handler_oneshot(struct pt_regs *regs, void *data);
 
 #endif /* _ASMPOWERPC_HANDLERS_H_ */
diff --git a/lib/powerpc/asm/ppc_asm.h b/lib/powerpc/asm/ppc_asm.h
index 1b85f6b..6299ff5 100644
--- a/lib/powerpc/asm/ppc_asm.h
+++ b/lib/powerpc/asm/ppc_asm.h
@@ -36,6 +36,7 @@
 #endif /* __BYTE_ORDER__ */
 
 /* Machine State Register definitions: */
+#define MSR_EE_BIT 15  /* External Interrupts Enable */
 #define MSR_SF_BIT 63  /* 64-bit mode */
 
 #endif /* _ASMPOWERPC_PPC_ASM_H */
diff --git a/lib/powerpc/asm/processor.h b/lib/powerpc/asm/processor.h
index ac001e1..1467f2c 100644
--- a/lib/powerpc/asm/processor.h
+++ b/lib/powerpc/asm/processor.h
@@ -20,6 +20,7 @@ static inline uint64_t get_tb(void)
 
 extern void delay(uint64_t cycles);
 extern void udelay(uint64_t us);
+extern void sleep(uint64_t cycles);
 
 static inline void mdelay(uint64_t ms)
 {
diff --git a/lib/powerpc/handlers.c b/lib/powerpc/handlers.c
index c8721e0..296f14f 100644
--- a/lib/powerpc/handlers.c
+++ b/lib/powerpc/handlers.c
@@ -9,15 +9,13 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Generic handler for decrementer exceptions (0x900)
- * Just reset the decrementer back to the value specified when registering the
- * handler
+ * Return with MSR[EE] disabled.
  */
-void dec_except_handler(struct pt_regs *regs __unused, void *data)
+void dec_handler_oneshot(struct pt_regs *regs, void *data)
 {
-   uint64_t dec = *((uint64_t *) data);
-
-   asm volatile ("mtdec %0" : : "r" (dec));
+   regs->msr &= ~(1UL << MSR_EE_BIT);
 }
diff --git a/lib/powerpc/processor.c b/lib/powerpc/processor.c
index ec85b9d..f1fb50f 100644
--- a/lib/powerpc/processor.c
+++ b/lib/powerpc/processor.c
@@ -10,6 +10,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 static struct {
void (*func)(struct pt_regs *, void *data);
@@ -54,3 +56,32 @@ void udelay(uint64_t us)
 {
delay((us * tb_hz) / 100);
 }
+
+void sleep(uint64_t cycles)
+{
+   uint64_t start, end, now;
+
+   start = now = get_tb();
+   end = start + cycles;
+
+   while (end > now) {
+   uint64_t left = end - now;
+
+   /* Could support large decrementer */
+   if (left > 0x7fff)
+   left = 0x7fff;
+
+   asm volatile ("mtdec %0" : : "r" (left));
+   handle_exception(0x900, _handler_oneshot, NULL);
+   if (hcall(H_CEDE) != H_SUCCESS) {
+   printf("H_CEDE failed\n");
+   abort();
+   }
+   handle_exception(0x900, NULL, NULL);
+
+   if (left < 0x7fff)
+   break;
+
+   now = get_tb();
+   }
+}
diff --git a/powerpc/sprs.c b/powerpc/sprs.c
index 5cc1cd1..161a6ba 100644
--- a/powerpc/sprs.c
+++ b/powerpc/sprs.c
@@ -254,7 +254,6 @@ int main(int argc, char **argv)
0x1234567890ABCDEFULL, 0xFEDCBA0987654321ULL,
-1ULL,
};
-   static uint64_t decr = 0x7FFF; /* Max value */
 
for (i = 1; i < argc; i++) {
if (!strcmp(argv[i], "-w")) {
@@ -288,10 +287,7 @@ int main(int argc, char **argv)
if (pause) {
migrate_once();
} else {
-   puts("Sleeping...\n");
-   handle_exception(0x900, _except_handler, );
-   asm volatile ("mtdec %0" : : "r" (0x3FFF));
-   hcall(H_CEDE);
+   sleep(0x1000);
}
 
get_sprs(after);
diff --git a/powerpc/tm.c b/powerpc/tm.c
index 65cacdf..61bacf4 100644
--- a/powerpc/tm.c
+++ b/powerpc/tm.c
@@ -48,17 +48,6 @@ static int count_cpus_with_tm(void)
return available;
 }
 
-static int h_cede(void)
-{
-   register uint64_t r3 asm("r3") = H_CEDE;
-
-   asm volatile ("sc 1" : "+r"(r3) :
-: "r0", "r4", "r5", "r6", "r7", "r8", "r9",
-  "r10", "r11", "r12", "xer", "ctr", "cc");
-
-  

[kvm-unit-tests PATCH 2/7] powerpc: add local variant of SPR test

2023-03-17 Thread Nicholas Piggin
This adds the non-migration variant of the SPR test to the matrix,
which can be simpler to run and debug.

Signed-off-by: Nicholas Piggin 
---
 powerpc/unittests.cfg | 4 
 1 file changed, 4 insertions(+)

diff --git a/powerpc/unittests.cfg b/powerpc/unittests.cfg
index 1e74948..3e41598 100644
--- a/powerpc/unittests.cfg
+++ b/powerpc/unittests.cfg
@@ -68,5 +68,9 @@ groups = h_cede_tm
 
 [sprs]
 file = sprs.elf
+groups = sprs
+
+[sprs-migration]
+file = sprs.elf
 extra_params = -append '-w'
 groups = migration
-- 
2.37.2



Re: [PATCH v2] hvc/xen: prevent concurrent accesses to the shared ring

2023-03-17 Thread Roger Pau Monné
On Fri, Mar 10, 2023 at 10:18:32AM +0100, Roger Pau Monné wrote:
> On Fri, Mar 10, 2023 at 10:01:39AM +1100, Michael Ellerman wrote:
> > Roger Pau Monné  writes:
> > > On Mon, Dec 12, 2022 at 01:36:48PM +0100, Roger Pau Monné wrote:
> > >> On Fri, Dec 02, 2022 at 12:40:05PM +0100, Roger Pau Monné wrote:
> > >> > On Wed, Nov 30, 2022 at 05:08:06PM -0800, Stefano Stabellini wrote:
> > >> > > On Wed, 30 Nov 2022, Roger Pau Monne wrote:
> > >> > > > The hvc machinery registers both a console and a tty device based 
> > >> > > > on
> > >> > > > the hv ops provided by the specific implementation.  Those two
> > >> > > > interfaces however have different locks, and there's no single 
> > >> > > > locks
> > >> > > > that's shared between the tty and the console implementations, 
> > >> > > > hence
> > >> > > > the driver needs to protect itself against concurrent accesses.
> > >> > > > Otherwise concurrent calls using the split interfaces are likely to
> > >> > > > corrupt the ring indexes, leaving the console unusable.
> > >> > > >
> > >> > > > Introduce a lock to xencons_info to serialize accesses to the 
> > >> > > > shared
> > >> > > > ring.  This is only required when using the shared memory console,
> > >> > > > concurrent accesses to the hypercall based console implementation 
> > >> > > > are
> > >> > > > not an issue.
> > >> > > >
> > >> > > > Note the conditional logic in domU_read_console() is slightly 
> > >> > > > modified
> > >> > > > so the notify_daemon() call can be done outside of the locked 
> > >> > > > region:
> > >> > > > it's an hypercall and there's no need for it to be done with the 
> > >> > > > lock
> > >> > > > held.
> > >> > >
> > >> > > For domU_read_console: I don't mean to block this patch but we need 
> > >> > > to
> > >> > > be sure about the semantics of hv_ops.get_chars. Either it is 
> > >> > > expected
> > >> > > to be already locked, then we definitely shouldn't add another lock 
> > >> > > to
> > >> > > domU_read_console. Or it is not expected to be already locked, then 
> > >> > > we
> > >> > > should add the lock.
> > >> > >
> > >> > > My impression is that it is expected to be already locked, but I 
> > >> > > think
> > >> > > we need Greg or Jiri to confirm one way or the other.
> > >> >
> > >> > Let me move both to the 'To:' field then.
> > >> >
> > >> > My main concern is the usage of hv_ops.get_chars hook in
> > >> > hvc_poll_get_char(), as it's not obvious to me that callers of
> > >> > tty->poll_get_char hook as returned by tty_find_polling_driver() will
> > >> > always do so with the tty lock held (in fact the only user right now
> > >> > doesn't seem to hold the tty lock).
> > >> >
> > >> > > Aside from that the rest looks fine.
> > >
> > > I guess I could reluctantly remove the lock in the get_chars hook,
> > > albeit I'm not convinced at all the lock is not needed there.
> > 
> > I don't know the xen driver, but other HVC backends have a lock around
> > their private state in their get_chars() implementations.
> > 
> > See eg. hvterm_raw_get_chars().
> 
> Yes, that was one of the motivation for adding the lock also here, and
> it has already been mentioned.  The other is the usage of the hooks by
> callers of hvc_poll_get_char().

Ping?

Thanks, Roger.


[PATCH] powerpc/pseries: Fix exception handling in pSeries_reconfig_add_node()

2023-03-17 Thread Markus Elfring
Date: Fri, 17 Mar 2023 09:26:13 +0100

The label “out_err” was used to jump to another pointer check despite of
the detail in the implementation of the function “pSeries_reconfig_add_node”
that it was determined already that the corresponding variable contained
a null pointer (because of a failed function call in two cases).

1. Thus return directly after a call of the function “kzalloc” failed.

2. Use more appropriate labels instead.

3. Delete a redundant check.

4. Omit an explicit initialisation for the local variable “err”.

This issue was detected by using the Coccinelle software.

Fixes: 1da177e4c3f41524e886b7f1b8a0c1fc7321cac ("Linux-2.6.12-rc2")
Signed-off-by: Markus Elfring 
---
 arch/powerpc/platforms/pseries/reconfig.c | 26 ---
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/reconfig.c 
b/arch/powerpc/platforms/pseries/reconfig.c
index 599bd2c78514..14154f48ef63 100644
--- a/arch/powerpc/platforms/pseries/reconfig.c
+++ b/arch/powerpc/platforms/pseries/reconfig.c
@@ -23,15 +23,17 @@
 static int pSeries_reconfig_add_node(const char *path, struct property 
*proplist)
 {
     struct device_node *np;
-    int err = -ENOMEM;
+    int err;
 
     np = kzalloc(sizeof(*np), GFP_KERNEL);
     if (!np)
-        goto out_err;
+        return -ENOMEM;
 
     np->full_name = kstrdup(kbasename(path), GFP_KERNEL);
-    if (!np->full_name)
-        goto out_err;
+    if (!np->full_name) {
+        err = -ENOMEM;
+        goto free_device_node;
+    }
 
     np->properties = proplist;
     of_node_set_flag(np, OF_DYNAMIC);
@@ -40,25 +42,25 @@ static int pSeries_reconfig_add_node(const char *path, 
struct property *proplist
     np->parent = pseries_of_derive_parent(path);
     if (IS_ERR(np->parent)) {
         err = PTR_ERR(np->parent);
-        goto out_err;
+        goto free_name;
     }
 
     err = of_attach_node(np);
     if (err) {
         printk(KERN_ERR "Failed to add device node %s\n", path);
-        goto out_err;
+        goto put_node;
     }
 
     of_node_put(np->parent);
 
     return 0;
 
-out_err:
-    if (np) {
-        of_node_put(np->parent);
-        kfree(np->full_name);
-        kfree(np);
-    }
+put_node:
+    of_node_put(np->parent);
+free_name:
+    kfree(np->full_name);
+free_device_node:
+    kfree(np);
     return err;
 }
 
--
2.40.0