Re: [PATCH 1/3] core/device: Add function to return child node using name at substring "@"

2023-09-24 Thread Athira Rajeev



> On 18-Sep-2023, at 7:42 PM, Reza Arbab  wrote:
> 
> On Thu, Sep 14, 2023 at 10:02:04PM +0530, Athira Rajeev wrote:
>> Add a function dt_find_by_name_before_addr() that returns the child node if
>> it matches till first occurrence at "@" of a given name, otherwise NULL.
>> This is helpful for cases with node name like: "name@addr". In
>> scenarios where nodes are added with "name@addr" format and if the
>> value of "addr" is not known, that node can't be matched with node
>> name or addr. Hence matching with substring as node name will return
>> the expected result. Patch adds dt_find_by_name_before_addr() function
>> and testcase for the same in core/test/run-device.c
> 
> Series applied to skiboot master with the fixup we discussed.
> 
> -- 
> Reza Arbab

Thanks Reza for picking up the patchset

Athira
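
For readers following along, the matching rule described in the quoted
commit message can be sketched in plain C. This is only an illustration
of "compare the part of the node name before the first '@'"; the helper
name name_matches_before_addr() is made up here and is not the skiboot
implementation of dt_find_by_name_before_addr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the skiboot code: returns true when the part
 * of node_name before the first '@' equals name exactly. A node name
 * without '@' is compared as a whole.
 */
static bool name_matches_before_addr(const char *node_name, const char *name)
{
	const char *at;
	size_t base_len;

	assert(node_name != NULL && name != NULL);

	at = strchr(node_name, '@');
	base_len = at ? (size_t)(at - node_name) : strlen(node_name);

	/* Lengths must agree so that "i2" does not match "i2c@a0000". */
	return strlen(name) == base_len &&
	       strncmp(node_name, name, base_len) == 0;
}
```

With this rule a lookup for "i2c" would accept a node called "i2c@a0000"
without the caller knowing the address part, which is the case the patch
description is concerned with.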



[PATCH v2 3/3] Documentation/powerpc: update fadump implementation details

2023-09-24 Thread Sourabh Jain
The patch titled "powerpc: make fadump resilient with memory add/remove
events" made significant changes to the fadump implementation,
particularly to elfcorehdr creation and the fadump crash info header
structure. Therefore, update the fadump implementation documentation
to reflect those changes.

The following updates are made to the firmware-assisted dump documentation:

1. The elfcorehdr is no longer stored after fadump HDR in the reserved
   dump area. Instead, the second kernel dynamically allocates memory
   for the elfcorehdr within the address range from 0 to the boot memory
   size. Therefore, update figures 1 and 2 of Memory Reservation during
   the first and second kernels to reflect this change.

2. A version field has been added to the fadump header to maintain
   backward compatibility without changing the fadump header magic
   number in the future. Therefore, remove the corresponding TODO from
   the document.

Signed-off-by: Sourabh Jain 
---
 .../powerpc/firmware-assisted-dump.rst| 91 +--
 1 file changed, 42 insertions(+), 49 deletions(-)

diff --git a/Documentation/powerpc/firmware-assisted-dump.rst b/Documentation/powerpc/firmware-assisted-dump.rst
index e363fc48529a..7e37aadd1f77 100644
--- a/Documentation/powerpc/firmware-assisted-dump.rst
+++ b/Documentation/powerpc/firmware-assisted-dump.rst
@@ -134,12 +134,12 @@ that are run. If there is dump data, then the
 memory is held.
 
 If there is no waiting dump data, then only the memory required to
-hold CPU state, HPTE region, boot memory dump, FADump header and
-elfcore header, is usually reserved at an offset greater than boot
-memory size (see Fig. 1). This area is *not* released: this region
-will be kept permanently reserved, so that it can act as a receptacle
-for a copy of the boot memory content in addition to CPU state and
-HPTE region, in the case a crash does occur.
+hold CPU state, HPTE region, boot memory dump, and FADump header is
+usually reserved at an offset greater than boot memory size (see Fig. 1).
+This area is *not* released: this region will be kept permanently
+reserved, so that it can act as a receptacle for a copy of the boot
+memory content in addition to CPU state and HPTE region, in the case
+a crash does occur.
 
 Since this reserved memory area is used only after the system crash,
 there is no point in blocking this significant chunk of memory from
@@ -153,22 +153,22 @@ that were present in CMA region::
 
   o Memory Reservation during first kernel
 
-  Low memory Top of memory
-  0boot memory size   |<--- Reserved dump area --->|   |
-  |   |   |Permanent Reservation   |   |
-  V   V   ||   V
-  +---+-/ /---+---++---+-+-++--+
-  |   |   |///||  DUMP | HDR | ELF ||  |
-  +---+-/ /---+---++---+-+-++--+
-|   ^^ ^  ^   ^
-|   || |  |   |
-\  CPU  HPTE   /  |   |
- --   |   |
-  Boot memory content gets transferred|   |
-  to reserved area by firmware at the |   |
-  time of crash.  |   |
-  FADump Header   |
-   (meta area)|
+  Low memory  Top of memory
+  0boot memory size   |<-- Reserved dump area ->| |
+  |   |   |  Permanent Reservation  | |
+  V   V   | | V
+  +---+-/ /---+---++---+---++-+
+  |   |   |///||DUMP   |  HDR  || |
+  +---+-/ /---+---++---+---++-+
+|   ^^   ^ ^  ^
+|   ||   | |  |
+\  CPU  HPTE / |  |
+   |  |
+  Boot memory content gets transferred |  |
+  to reserved area by firmware at the  |  |
+  time of crash.   |  |
+   FADump Header  |
+(meta area)   |
   |
   |
   Metadata: This area holds a metadata structure whose
@@ -186,13 +186,20 @@ that were present in CMA region::
   0  boot memory size  |
   |   |< Crash preserved area >|
   V   V   

[PATCH v2 2/3] powerpc/fadump: add hotplug_ready sysfs interface

2023-09-24 Thread Sourabh Jain
The elfcorehdr describes the CPUs and memory of the crashed kernel to
the kernel that captures the dump, known as the second or fadump kernel.
The elfcorehdr needs to be updated if the system's memory changes due to
memory hotplug or online/offline events.

Currently, memory hotplug events are monitored in userspace by udev
rules, and fadump is re-registered, which recreates the elfcorehdr with
the latest available memory in the system.

However, the previous patch ("powerpc: make fadump resilient with memory
add/remove events") moved the creation of elfcorehdr to the second or
fadump kernel. This eliminates the need to regenerate the elfcorehdr
during memory hotplug or online/offline events.

Create a sysfs entry at /sys/kernel/fadump/hotplug_ready to let
userspace know that fadump re-registration is not required for memory
add/remove events.

Signed-off-by: Sourabh Jain 
---
 Documentation/ABI/testing/sysfs-kernel-fadump | 12 
 arch/powerpc/kernel/fadump.c  | 14 ++
 2 files changed, 26 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-fadump b/Documentation/ABI/testing/sysfs-kernel-fadump
index 8f7a64a81783..f8ba21da0f71 100644
--- a/Documentation/ABI/testing/sysfs-kernel-fadump
+++ b/Documentation/ABI/testing/sysfs-kernel-fadump
@@ -38,3 +38,15 @@ Contact: linuxppc-dev@lists.ozlabs.org
 Description:   read only
Provide information about the amount of memory reserved by
FADump to save the crash dump in bytes.
+What:  /sys/kernel/fadump/hotplug_ready
+Date:  Sep 2023
+Contact:   linuxppc-dev@lists.ozlabs.org
+Description:   read only
+   The Kdump scripts utilize udev rules to monitor memory add/remove
+   events, ensuring that FADUMP is automatically re-registered when
+   system memory changes occur. This re-registration was necessary
+   to update the elfcorehdr, which describes the system memory to the
+   second kernel. Now, if this sysfs node holds a value of 1, it
+   indicates to userspace that FADUMP does not require re-registration
+   since the elfcorehdr is now generated in the second kernel.
+User:  kexec-tools
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 27168a333a13..b6055fd7 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1473,6 +1473,18 @@ static ssize_t enabled_show(struct kobject *kobj,
return sprintf(buf, "%d\n", fw_dump.fadump_enabled);
 }
 
+/*
+ * /sys/kernel/fadump/hotplug_ready sysfs node only returns 1,
+ * which indicates to userspace that fadump re-registration is not
+ * required on memory hotplug events.
+ */
+static ssize_t hotplug_ready_show(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ char *buf)
+{
+   return sprintf(buf, "%d\n", 1);
+}
+
 static ssize_t mem_reserved_show(struct kobject *kobj,
 struct kobj_attribute *attr,
 char *buf)
@@ -1545,11 +1557,13 @@ static struct kobj_attribute release_attr = __ATTR_WO(release_mem);
 static struct kobj_attribute enable_attr = __ATTR_RO(enabled);
 static struct kobj_attribute register_attr = __ATTR_RW(registered);
 static struct kobj_attribute mem_reserved_attr = __ATTR_RO(mem_reserved);
+static struct kobj_attribute hotplug_ready_attr = __ATTR_RO(hotplug_ready);
 
 static struct attribute *fadump_attrs[] = {
    &enable_attr.attr,
    &register_attr.attr,
    &mem_reserved_attr.attr,
+   &hotplug_ready_attr.attr,
NULL,
 };
 
-- 
2.41.0
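
As a rough sketch of how a userspace tool might consume the new node, the
helper below reads a file and reports whether it holds the value 1. The
function name fadump_hotplug_ready() and the choice to treat a missing
file as "not ready" are assumptions made for this illustration; they are
not taken from kexec-tools.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Illustrative userspace helper (an assumption, not part of the patch):
 * returns true only if the file at path exists and starts with the
 * integer 1. A missing or unreadable file is treated as "not hotplug
 * ready", which is how a kernel without this node would appear.
 */
static bool fadump_hotplug_ready(const char *path)
{
	FILE *fp;
	int value = 0;
	bool ready = false;

	assert(path != NULL);

	fp = fopen(path, "r");
	if (!fp)
		return false;

	if (fscanf(fp, "%d", &value) == 1 && value == 1)
		ready = true;

	fclose(fp);
	return ready;
}
```

A caller would pass "/sys/kernel/fadump/hotplug_ready" and skip fadump
re-registration on memory add/remove events when the result is true.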



[PATCH v2 1/3] powerpc: make fadump resilient with memory add/remove events

2023-09-24 Thread Sourabh Jain
Due to changes in memory resources caused by either memory hotplug or
online/offline events, the elfcorehdr, which describes the CPUs and
memory of the crashed kernel to the kernel that collects the dump (known
as the second/fadump kernel), becomes outdated. Consequently, attempting
dump collection with an outdated elfcorehdr can lead to failed or
inaccurate dump collection.

Memory hotplug or online/offline events are referred to as memory
add/remove events in the rest of the commit message.

The current solution to address the aforementioned issue is as follows:
Monitor memory add/remove events in userspace using udev rules, and
re-register fadump whenever there are changes in memory resources. This
leads to the creation of a new elfcorehdr with updated system memory
information.

There are several notable issues associated with re-registering fadump
for every memory add/remove event:

1. Bulk memory add/remove events with udev-based fadump re-registration
   can lead to race conditions and, more importantly, it creates a wide
   window during which fadump is inactive until all memory add/remove
   events are settled.
2. Re-registering fadump for every memory add/remove event is
   inefficient.
3. The memory for elfcorehdr is allocated based on the memblock regions
   available during early boot and remains fixed thereafter. However, if
   elfcorehdr is later recreated with additional memblock regions,
   its size will increase, potentially leading to memory corruption.
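
Issue 3 can be made concrete with a little arithmetic. The sketch below
estimates the elfcorehdr size under the assumption that it consists of
one ELF header, two PT_NOTE program headers (CPU notes and vmcoreinfo),
and one PT_LOAD program header per memory range. The exact layout used
by fadump may differ, and elfcorehdr_size_estimate() is a name invented
for this example. It relies on the Linux <elf.h> definitions.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Rough size estimate under the stated assumptions: one ELF header, two
 * note program headers, and one PT_LOAD program header per memory range.
 */
static size_t elfcorehdr_size_estimate(size_t nr_mem_ranges)
{
	size_t nr_phdrs;

	/* Guard against overflow for absurdly large range counts. */
	assert(nr_mem_ranges <=
	       (SIZE_MAX - sizeof(Elf64_Ehdr)) / sizeof(Elf64_Phdr) - 2);

	nr_phdrs = 2 + nr_mem_ranges;
	return sizeof(Elf64_Ehdr) + nr_phdrs * sizeof(Elf64_Phdr);
}
```

Every extra memory range adds one more program header, so a buffer sized
for the ranges seen at early boot is too small once more ranges appear.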

Address the aforementioned challenges by shifting the creation of
elfcorehdr from the first kernel (also referred to as the crashed kernel),
where it was created and frequently recreated for every memory
add/remove event, to the fadump kernel. As a result, the elfcorehdr only
needs to be created once, thus eliminating the necessity to re-register
fadump during memory add/remove events.

At present, the first kernel prepares the elfcorehdr and stores it in
the fadump reserved area. Subsequently, it includes the start address of
the elfcorehdr in the elfcorehdr_addr attribute of the fadump header.
The fadump header holds information like elfcorehdr address, crashing
CPU details, etc. The first kernel prepares the fadump header and stores
it in the fadump reserved area. In the event of a first kernel crash, the
second kernel boots, accesses the fadump header prepared by the first
kernel, and does the following in a platform-specific function
[rtas|opal]_fadump_process:

1. Sanity check for fadump header
2. Update CPU notes in elfcorehdr
3. Set the global variable elfcorehdr_addr to the address of the
   fadump header's elfcorehdr, so that the vmcore module can process it
   later on.

Along with the above, update the setup_fadump()/fadump.c to create
elfcorehdr in second/fadump kernel.

The section below outlines the information required to create the
elfcorehdr and the changes made to make it available to the fadump
kernel if it's not already.

To create elfcorehdr, the following crashed kernel information is
required: CPU notes, vmcoreinfo, and memory ranges.

At present, the CPU notes are already prepared in the fadump kernel, so
no changes are needed in that regard. The fadump kernel has access to
all crashed kernel memory regions, including boot memory regions that
are relocated by firmware to fadump reserved areas, so no changes for
that either. However, it is necessary to add new members to the fadump
header, i.e., the 'fadump_crash_info_header' structure, in order to pass
the crashed kernel's vmcoreinfo address and size to the fadump kernel.

Table 1 below illustrates the kernel's ability to collect dump if either the
first/crashed kernel or the second/fadump kernel does not have the
changes introduced here. Consider the 'old kernel' as the kernel without
this patch, and the 'new kernel' as the kernel with this patch included.

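+----------+------------------------+------------------------+-------+
| scenario |  first/crashed kernel  |  second/fadump kernel  |  Dump |
+----------+------------------------+------------------------+-------+
|    1     |       old kernel       |       new kernel       |  Yes  |
+----------+------------------------+------------------------+-------+
|    2     |       new kernel       |       old kernel       |  No   |
+----------+------------------------+------------------------+-------+

                              Table 1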

Scenario 1:
-----------
Since the magic number of the fadump header is updated, the second kernel
can tell whether the crashed kernel is of type 'new kernel' or
'old kernel' and act accordingly. In this scenario, since the crashed
kernel is of type 'old kernel', the fadump kernel skips elfcorehdr
creation and uses the one prepared by the first kernel to collect
the dump.

Scenario 2:
-----------
Since an 'old kernel' acting as the fadump kernel is NOT capable of
processing a fadump header with the updated magic number from a
'new kernel', it fails gracefully with the below error and dump
collection fails in this scenario.
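
The two scenarios boil down to a check on the header's magic number. The
sketch below models that decision with placeholder values; the magic
constants, the struct layout, and the function names are all hypothetical
and are not the ones used in fadump.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder magic values for illustration only, not those in the patch. */
#define EXAMPLE_MAGIC_OLD 0x4F4C444D41474943ULL
#define EXAMPLE_MAGIC_NEW 0x4E45574D41474943ULL

/* Minimal stand-in for the fadump crash info header (hypothetical layout). */
struct example_crash_info_header {
	uint64_t magic_number;
};

enum hdr_action {
	HDR_USE_EXISTING_ELFCOREHDR, /* header written by an old kernel */
	HDR_CREATE_ELFCOREHDR,       /* header written by a new kernel */
	HDR_REJECT,                  /* unknown magic: dump collection fails */
};

/* A new fadump kernel recognises both magic numbers (Scenario 1). */
static enum hdr_action new_kernel_process(const struct example_crash_info_header *hdr)
{
	assert(hdr != NULL);

	if (hdr->magic_number == EXAMPLE_MAGIC_NEW)
		return HDR_CREATE_ELFCOREHDR;
	if (hdr->magic_number == EXAMPLE_MAGIC_OLD)
		return HDR_USE_EXISTING_ELFCOREHDR;
	return HDR_REJECT;
}

/* An old fadump kernel only recognises the old magic (Scenario 2). */
static enum hdr_action old_kernel_process(const struct example_crash_info_header *hdr)
{
	assert(hdr != NULL);

	if (hdr->magic_number == EXAMPLE_MAGIC_OLD)
		return HDR_USE_EXISTING_ELFCOREHDR;
	return HDR_REJECT;
}
```

Under these assumptions only the new/old combination of Scenario 2 ends
in a rejected header, which matches the "No" row of Table 1.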

[

[PATCH v2 0/3] powerpc: make fadump resilient with memory add/remove events

2023-09-24 Thread Sourabh Jain
Problem:
========
Due to changes in memory resources caused by either memory hotplug or
online/offline events, the elfcorehdr, which describes the cpus and
memory of the crashed kernel to the kernel that collects the dump (known
as the second/fadump kernel), becomes outdated. Consequently, attempting
dump collection with an outdated elfcorehdr can lead to failed or
inaccurate dump collection.

Memory hotplug or online/offline events are referred to as memory
add/remove events in the rest of the patch series.

Existing solution:
==================
Monitor memory add/remove events in userspace using udev rules, and
re-register fadump whenever there are changes in memory resources. This
leads to the creation of a new elfcorehdr with updated system memory
information.

Challenges with existing solution:
==================================
1. Performing bulk memory add/remove with udev-based fadump
   re-registration can lead to race conditions and, more importantly,
   it creates a wide window during which fadump is inactive until all
   memory add/remove events are settled.
2. Re-registering fadump for every memory add/remove event is
   inefficient.
3. Memory for elfcorehdr is allocated based on the memblock regions
   available during early boot and remains fixed thereafter. However, if
   the elfcorehdr is later recreated with additional memblock regions,
   its size will increase, potentially leading to memory corruption.

Proposed solution:
==================
Address the aforementioned challenges by shifting the creation of
elfcorehdr from the first kernel (also referred to as the crashed kernel),
where it was created and frequently recreated for every memory
add/remove event, to the fadump kernel. As a result, the elfcorehdr only
needs to be created once, thus eliminating the necessity to re-register
fadump during memory add/remove events.

To know more about elfcorehdr creation in the fadump kernel, refer to
the first patch in this series.

The second patch includes a new sysfs interface that tells userspace
that fadump re-registration isn't needed for memory add/remove events.
Note that userspace changes do not need to be in sync with kernel
changes; they can roll out independently.

Since there are significant changes in the fadump implementation, the
third patch updates the fadump documentation to reflect the changes made
in this patch series.

Kernel tree rebased on 6.6-rc3 with patch series applied:
=========================================================
https://github.com/sourabhjains/linux/tree/fadump-mem-hotplug

Userspace changes:
==================
To realize this feature, one must update the kdump udev rules to prevent
fadump re-registration during memory add/remove events.

On RHEL, apply the following changes to the file
/usr/lib/udev/rules.d/98-kexec.rules:

-run+="/bin/sh -c '/usr/bin/systemctl is-active kdump.service || exit 0; /usr/bin/systemd-run --quiet --no-block /usr/lib/udev/kdump-udev-throttler'"
+# don't re-register fadump if the value of the node
+# /sys/kernel/fadump/hotplug_ready is 1.
+
+run+="/bin/sh -c '/usr/bin/systemctl is-active kdump.service || exit 0; ! test -f /sys/kernel/fadump_enabled || cat /sys/kernel/fadump_enabled | grep 0 || ! test -f /sys/kernel/fadump/hotplug_ready || cat /sys/kernel/fadump/hotplug_ready | grep 0 || exit 0; /usr/bin/systemd-run --quiet --no-block /usr/lib/udev/kdump-udev-throttler'"
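
The chain of "||" tests in the new rule is easier to follow when written
out as a predicate. The sketch below restates it in C; the struct and its
field names are illustrative stand-ins for the results of the individual
shell tests and do not exist in any kdump script.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Snapshot of what the rule inspects; the field names are illustrative. */
struct fadump_sysfs_state {
	bool kdump_active;        /* systemctl is-active kdump.service */
	bool enabled_node_exists; /* /sys/kernel/fadump_enabled is present */
	bool enabled_has_zero;    /* "grep 0" matched its content */
	bool hotplug_node_exists; /* /sys/kernel/fadump/hotplug_ready is present */
	bool hotplug_has_zero;    /* "grep 0" matched its content */
};

/*
 * Mirrors the rule: bail out if kdump is inactive, otherwise run the
 * throttler unless fadump is enabled and hotplug_ready exists without
 * a 0 in it.
 */
static bool should_run_throttler(const struct fadump_sysfs_state *s)
{
	assert(s != NULL);

	if (!s->kdump_active)
		return false;

	return !s->enabled_node_exists || s->enabled_has_zero ||
	       !s->hotplug_node_exists || s->hotplug_has_zero;
}
```

In other words, the throttler (and with it the re-registration) is
skipped only when fadump is enabled and the kernel advertises
hotplug_ready; kdump setups and older kernels keep the previous
behaviour.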

Changelog:
==========
v1 -> v2
- Fixed a few indentation issues reported by the checkpatch script.
- Rebased it to 6.6.0-rc3

Sourabh Jain (3):
  powerpc: make fadump resilient with memory add/remove events
  powerpc/fadump: add hotplug_ready sysfs interface
  Documentation/powerpc: update fadump implementation details

 Documentation/ABI/testing/sysfs-kernel-fadump |  12 +
 .../powerpc/firmware-assisted-dump.rst|  91 ++---
 arch/powerpc/include/asm/fadump-internal.h|  24 +-
 arch/powerpc/kernel/fadump.c  | 371 +++---
 arch/powerpc/platforms/powernv/opal-fadump.c  |  18 +-
 arch/powerpc/platforms/pseries/rtas-fadump.c  |  23 +-
 6 files changed, 307 insertions(+), 232 deletions(-)

-- 
2.41.0



Re: [RFC v3 0/2] CPU-Idle latency selftest framework

2023-09-24 Thread Aboorva Devarajan
On Mon, 2023-09-11 at 11:06 +0530, Aboorva Devarajan wrote:

CC'ing CPUidle lists and maintainers,

Patch Summary:

The patchset introduces a kernel module and userspace driver designed
for estimating the wakeup latency experienced when waking up from
various CPU idle states. It primarily measures latencies related to two
types of events: Inter-Processor Interrupts (IPIs) and Timers.

Background:

Initially, these patches were introduced as a generic self-test.
However, it was later discovered that Intel platforms incorporate
timer-based wakeup optimizations. These optimizations allow CPUs to
perform a pre-wakeup, which limits the effectiveness of latency
observation in certain scenarios because it only measures the optimized
wakeup latency [1]. 

Therefore, in this RFC, the self-test is specifically integrated into
PowerPC, as it has been tested and used in PowerPC so far.

Another proposal is to introduce these patches as a generic cpuidle IPI
and timer wake-up test. While this method may not give us an exact
measurement of latency variations at the hardware level, it can still
help us assess this metric from a software observability standpoint.

Looking forward to hearing what you think and any suggestions you may
have regarding this. Thanks.

[1] 
https://lore.kernel.org/linux-pm/20200914174625.gb25...@in.ibm.com/T/#m5c004b9b1a918f669e91b3d0f33e2e3500923234

> Changelog: v2 -> v3
> 
> * Minimal code refactoring
> * Rebased on v6.6-rc1
> 
> RFC v1: 
> https://lore.kernel.org/all/20210611124154.56427-1-psam...@linux.ibm.com/
> 
> RFC v2:
> https://lore.kernel.org/all/20230828061530.126588-2-aboor...@linux.vnet.ibm.com/
> 
> Other related RFC:
> https://lore.kernel.org/all/20210430082804.38018-1-psam...@linux.ibm.com/
> 
> Userspace selftest:
> https://lkml.org/lkml/2020/9/2/356
> 
> 
> 
> A kernel module + userspace driver to estimate the wakeup latency
> caused by going into stop states. The motivation behind this program
> is to find significant deviations from the advertised latency and
> residency values.
> 
> The patchset measures latencies for two kinds of events: IPIs and
> timers. As this is a software-only mechanism, there will be additional
> latencies of the kernel-firmware-hardware interactions. To account for
> that, the program also measures a baseline latency on a 100 percent
> loaded CPU, and the latencies achieved must be viewed relative to that.
> 
> To achieve this, we introduce a kernel module and expose its control
> knobs through the debugfs interface that the selftests can engage
> with.
> 
> The kernel module provides the following interfaces within
> /sys/kernel/debug/powerpc/latency_test/ for,
> 
> IPI test:
> ipi_cpu_dest = Destination CPU for the IPI
> ipi_cpu_src = Origin of the IPI
> ipi_latency_ns = Measured latency time in ns
> Timeout test:
> timeout_cpu_src = CPU on which the timer is to be queued
> timeout_expected_ns = Timer duration
> timeout_diff_ns = Difference of actual duration vs expected timer
> 
> Sample output is as follows:
> 
> # --IPI Latency Test---
> # Baseline Avg IPI latency(ns): 2720
> # Observed Avg IPI latency(ns) - State snooze: 2565
> # Observed Avg IPI latency(ns) - State stop0_lite: 3856
> # Observed Avg IPI latency(ns) - State stop0: 3670
> # Observed Avg IPI latency(ns) - State stop1: 3872
> # Observed Avg IPI latency(ns) - State stop2: 17421
> # Observed Avg IPI latency(ns) - State stop4: 1003922
> # Observed Avg IPI latency(ns) - State stop5: 1058870
> #
> # --Timeout Latency Test--
> # Baseline Avg timeout diff(ns): 1435
> # Observed Avg timeout diff(ns) - State snooze: 1709
> # Observed Avg timeout diff(ns) - State stop0_lite: 2028
> # Observed Avg timeout diff(ns) - State stop0: 1954
> # Observed Avg timeout diff(ns) - State stop1: 1895
> # Observed Avg timeout diff(ns) - State stop2: 14556
> # Observed Avg timeout diff(ns) - State stop4: 873988
> # Observed Avg timeout diff(ns) - State stop5: 959137
> 
> Aboorva Devarajan (2):
>   powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events
>   powerpc/selftest: Add support for cpuidle latency measurement
> 
>  arch/powerpc/Kconfig.debug|  10 +
>  arch/powerpc/kernel/Makefile  |   1 +
>  arch/powerpc/kernel/test_cpuidle_latency.c| 154 ++
>  tools/testing/selftests/powerpc/Makefile  |   1 +
>  .../powerpc/cpuidle_latency/.gitignore|   2 +
>  .../powerpc/cpuidle_latency/Makefile  |   6 +
>  .../cpuidle_latency/cpuidle_latency.sh| 443 ++
>  .../powerpc/cpuidle_latency/settings  |   1 +
>  8 files changed, 618 insertions(+)
>  create mode 100644 arch/powerpc/kernel/test_cpuidle_latency.c
>  create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/.gitignore
>  create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/Makefile
>  create mode 100755 tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh
>  create mode 

Re: Questions: Should kernel panic when PCIe fatal error occurs?

2023-09-24 Thread Oliver O'Halloran
On Fri, Sep 22, 2023 at 8:23 AM David Laight  wrote:
>
> > It would be nice if they worked the same, but I suspect that vendors
> > may rely on the fact that CPER_SEV_FATAL forces a restart/panic as
> > part of their system integrity story.
>
> The file system errors created by a panic (especially an NMI panic)
> could easily be more problematic than a failed PCIe data transfer.
> Even a read that returned ~0u - which can be checked for.
>
> Panicking a system that is converting TDM telephony to RTP for the
> 911 emergency service because a PCIe cable/riser connecting one of the
> TDM board has become loose doesn't seem ideal.

For kernel native AER the default reaction to errors is
reset-and-reinit which probably isn't much better for your case.
Sounds like you would want a knob to suppress everything except error
reporting so you can handle it in userspace?

> (Or because the TDM board's fpga has decided it isn't going to respond
> to any accesses until the BARs are setup again...)
>
> The system can carry on with some TDM connections disabled - but that
> is ok because they are all duplicated in case a cable gets cut.

Well that's a relief :)

Oliver


[powerpc:merge] BUILD SUCCESS 2048fdba5ebe7e010a32b98a2ba4a4e0547334d7

2023-09-24 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 2048fdba5ebe7e010a32b98a2ba4a4e0547334d7  Automatic merge of 'master' into merge (2023-09-22 18:04)

elapsed time: 3802m

configs tested: 223
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

tested configs:
alpha       allnoconfig                         gcc
alpha       allyesconfig                        gcc
alpha       defconfig                           gcc
arc         allmodconfig                        gcc
arc         allnoconfig                         gcc
arc         allyesconfig                        gcc
arc         defconfig                           gcc
arc         randconfig-001-20230922             gcc
arc         randconfig-001-20230923             gcc
arm         allmodconfig                        gcc
arm         allnoconfig                         gcc
arm         allyesconfig                        gcc
arm         assabet_defconfig                   gcc
arm         defconfig                           gcc
arm         keystone_defconfig                  gcc
arm         randconfig-001-20230922             gcc
arm         randconfig-001-20230923             gcc
arm         randconfig-001-20230924             gcc
arm         s3c6400_defconfig                   gcc
arm         sama7_defconfig                     clang
arm         stm32_defconfig                     gcc
arm64       allmodconfig                        gcc
arm64       allnoconfig                         gcc
arm64       allyesconfig                        gcc
arm64       defconfig                           gcc
csky        allmodconfig                        gcc
csky        allnoconfig                         gcc
csky        allyesconfig                        gcc
csky        defconfig                           gcc
i386        allmodconfig                        gcc
i386        allnoconfig                         gcc
i386        allyesconfig                        gcc
i386        buildonly-randconfig-001-20230922   gcc
i386        buildonly-randconfig-001-20230923   gcc
i386        buildonly-randconfig-001-20230924   gcc
i386        buildonly-randconfig-002-20230922   gcc
i386        buildonly-randconfig-002-20230923   gcc
i386        buildonly-randconfig-002-20230924   gcc
i386        buildonly-randconfig-003-20230922   gcc
i386        buildonly-randconfig-003-20230923   gcc
i386        buildonly-randconfig-003-20230924   gcc
i386        buildonly-randconfig-004-20230922   gcc
i386        buildonly-randconfig-004-20230924   gcc
i386        buildonly-randconfig-005-20230922   gcc
i386        buildonly-randconfig-005-20230923   gcc
i386        buildonly-randconfig-005-20230924   gcc
i386        buildonly-randconfig-006-20230922   gcc
i386        buildonly-randconfig-006-20230924   gcc
i386        debian-10.3                         gcc
i386        defconfig                           gcc
i386        randconfig-001-20230922             gcc
i386        randconfig-001-20230923             gcc
i386        randconfig-001-20230924             gcc
i386        randconfig-002-20230922             gcc
i386        randconfig-002-20230924             gcc
i386        randconfig-003-20230922             gcc
i386        randconfig-003-20230924             gcc
i386        randconfig-004-20230922             gcc
i386        randconfig-004-20230924             gcc
i386        randconfig-005-20230922             gcc
i386        randconfig-005-20230923             gcc
i386        randconfig-005-20230924             gcc
i386        randconfig-006-20230922             gcc
i386        randconfig-006-20230923             gcc
i386        randconfig-006-20230924             gcc
i386        randconfig-011-20230922             gcc
i386        randconfig-011-20230923             gcc
i386        randconfig-011-20230924             gcc
i386        randconfig-012-20230922             gcc
i386        randconfig-012-20230924             gcc
i386        randconfig-013-20230922             gcc
i386        randconfig-013-20230924             gcc
i386        randconfig-014-20230922             gcc
i386        randconfig-014-20230923             gcc
i386        randconfig-014-20230924             gcc
i386        randconfig-015-20230922             gcc
i386        randconfig-015-20230923             gcc
i386        randconfig-015-20230924             gcc
i386        randconfig-016-20230922             gcc
i386        randconfig-016-20230923             gcc
loongarch   allmodconfig                        gcc
loongarch   allnoconfig                         gcc
loongarch   allyesconfig                        gcc
loongarch   defconfig                           gcc
loongarch   randconfig-001-20230922             gcc
loongarch

[PATCH 2/2] ASoC: imx-rpmsg: Force codec power on in low power audio mode

2023-09-24 Thread Chancel Liu
Low power audio mode requires the bound codec to stay powered on while
the Acore enters suspend, so that the Mcore can continue playing music.

The ASoC machine driver acquires DAPM endpoints by reading the
"fsl,lpa-widgets" property from DT and then forces the paths between
these endpoints to ignore suspend.

If the rpmsg sound card is in low power audio mode, the suspend/resume
callbacks of the bound codec are overridden to disable suspend/resume.

Signed-off-by: Chancel Liu 
---
 sound/soc/fsl/imx-rpmsg.c | 58 +++
 1 file changed, 58 insertions(+)

diff --git a/sound/soc/fsl/imx-rpmsg.c b/sound/soc/fsl/imx-rpmsg.c
index b578f9a32d7f..0568a3420aae 100644
--- a/sound/soc/fsl/imx-rpmsg.c
+++ b/sound/soc/fsl/imx-rpmsg.c
@@ -20,8 +20,11 @@ struct imx_rpmsg {
struct snd_soc_dai_link dai;
struct snd_soc_card card;
unsigned long sysclk;
+   bool lpa;
 };
 
+static struct dev_pm_ops lpa_pm;
+
 static const struct snd_soc_dapm_widget imx_rpmsg_dapm_widgets[] = {
SND_SOC_DAPM_HP("Headphone Jack", NULL),
SND_SOC_DAPM_SPK("Ext Spk", NULL),
@@ -38,6 +41,58 @@ static int imx_rpmsg_late_probe(struct snd_soc_card *card)
struct device *dev = card->dev;
int ret;
 
+   if (data->lpa) {
+   struct snd_soc_component *codec_comp;
+   struct device_node *codec_np;
+   struct device_driver *codec_drv;
+   struct device *codec_dev = NULL;
+
+   codec_np = data->dai.codecs->of_node;
+   if (codec_np) {
+   struct platform_device *codec_pdev;
+   struct i2c_client *codec_i2c;
+
+   codec_i2c = of_find_i2c_device_by_node(codec_np);
+   if (codec_i2c)
+   codec_dev = &codec_i2c->dev;
+   if (!codec_dev) {
+   codec_pdev = of_find_device_by_node(codec_np);
+   if (codec_pdev)
+   codec_dev = &codec_pdev->dev;
+   }
+   }
+   if (codec_dev) {
+   codec_comp = snd_soc_lookup_component_nolocked(codec_dev, NULL);
+   if (codec_comp) {
+   int i, num_widgets;
+   const char *widgets;
+   struct snd_soc_dapm_context *dapm;
+
+   num_widgets = of_property_count_strings(data->card.dev->of_node,
+   "fsl,lpa-widgets");
+   for (i = 0; i < num_widgets; i++) {
+   of_property_read_string_index(data->card.dev->of_node,
+ "fsl,lpa-widgets",
+ i, &widgets);
+   dapm = snd_soc_component_get_dapm(codec_comp);
+   snd_soc_dapm_ignore_suspend(dapm, widgets);
+   }
+   }
+   codec_drv = codec_dev->driver;
+   if (codec_drv->pm) {
+   memcpy(&lpa_pm, codec_drv->pm, sizeof(lpa_pm));
+   lpa_pm.suspend = NULL;
+   lpa_pm.resume = NULL;
+   lpa_pm.freeze = NULL;
+   lpa_pm.thaw = NULL;
+   lpa_pm.poweroff = NULL;
+   lpa_pm.restore = NULL;
+   codec_drv->pm = &lpa_pm;
+   }
+   put_device(codec_dev);
+   }
+   }
+
if (!data->sysclk)
return 0;
 
@@ -137,6 +192,9 @@ static int imx_rpmsg_probe(struct platform_device *pdev)
goto fail;
}
 
+   if (of_property_read_bool(np, "fsl,enable-lpa"))
+   data->lpa = true;
+
data->card.dev = &pdev->dev;
data->card.owner = THIS_MODULE;
data->card.dapm_widgets = imx_rpmsg_dapm_widgets;
-- 
2.25.1
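
The override technique used in the patch (copy the codec driver's PM ops,
then clear the system sleep callbacks) can be shown in isolation. The
struct below is a much simplified, hypothetical stand-in for the kernel's
struct dev_pm_ops, and make_lpa_pm_ops() and example_noop() are names
invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct dev_pm_ops (assumption). */
struct example_pm_ops {
	int (*suspend)(void *dev);
	int (*resume)(void *dev);
	int (*runtime_suspend)(void *dev);
	int (*runtime_resume)(void *dev);
};

/* Dummy callback so the example has something to copy. */
static int example_noop(void *dev)
{
	(void)dev;
	return 0;
}

/*
 * Copy src into dst, then drop the system sleep callbacks so the device
 * is left powered across suspend. Runtime PM callbacks are kept as-is,
 * and src itself is not modified.
 */
static void make_lpa_pm_ops(struct example_pm_ops *dst,
			    const struct example_pm_ops *src)
{
	assert(dst != NULL && src != NULL);

	memcpy(dst, src, sizeof(*dst));
	dst->suspend = NULL;
	dst->resume = NULL;
}
```

Working on a copy keeps the original ops table untouched; the patch then
points the codec driver at the modified copy.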



[PATCH 1/2] ASoC: dt-bindings: fsl_rpmsg: List DAPM endpoints ignoring suspend

2023-09-24 Thread Chancel Liu
Add a property that lists DAPM endpoints; the paths between these
endpoints ignore suspend. These DAPM paths can remain powered on when
the system enters suspend.

Signed-off-by: Chancel Liu 
---
 Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml b/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
index 188f38baddec..ec6e09eab427 100644
--- a/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
+++ b/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
@@ -91,6 +91,12 @@ properties:
   - rpmsg-audio-channel
   - rpmsg-micfil-channel
 
+  fsl,lpa-widgets:
+    $ref: /schemas/types.yaml#/definitions/non-unique-string-array
+    description: |
+      A list of DAPM endpoints which mark paths between these endpoints
+      ignoring suspend.
+
 required:
   - compatible
 
-- 
2.25.1



Re: [PATCH 3/8] iommu/vt-d: Use ops->blocked_domain

2023-09-24 Thread Baolu Lu

On 9/23/23 1:07 AM, Jason Gunthorpe wrote:

Trivially migrate to the ops->blocked_domain for the existing global
static.

Signed-off-by: Jason Gunthorpe
---
  drivers/iommu/intel/iommu.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)


Reviewed-by: Lu Baolu 

P.S. We can further do the same thing to the identity domain. I will
clean it up after all patches are landed.

Best regards,
baolu


Re: [PATCH 2/8] iommu/vt-d: Update the definition of the blocking domain

2023-09-24 Thread Baolu Lu

On 9/23/23 1:07 AM, Jason Gunthorpe wrote:

The global static should pre-define the type and the NOP free function can
be now left as NULL.

Signed-off-by: Jason Gunthorpe
---
  drivers/iommu/intel/iommu.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Lu Baolu 

Best regards,
baolu


Re: [PATCH 1/8] iommu: Move IOMMU_DOMAIN_BLOCKED global statics to ops->blocked_domain

2023-09-24 Thread Baolu Lu

On 9/23/23 1:07 AM, Jason Gunthorpe wrote:

> Following the pattern of identity domains, just assign the BLOCKED domain
> global statics to a value in ops. Update the core code to use the global
> static directly.
>
> Update powerpc to use the new scheme and remove its empty domain_alloc
> callback.
>
> Signed-off-by: Jason Gunthorpe
> ---
>   arch/powerpc/kernel/iommu.c | 9 +--------
>   drivers/iommu/iommu.c       | 2 ++
>   include/linux/iommu.h       | 3 +++
>   3 files changed, 6 insertions(+), 8 deletions(-)


Reviewed-by: Lu Baolu 

Best regards,
baolu


[powerpc:fixes-test] BUILD SUCCESS 58b33e78a31782ffe25d404d5eba9a45fe636e27

2023-09-24 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes-test
branch HEAD: 58b33e78a31782ffe25d404d5eba9a45fe636e27  selftests/powerpc: Fix emit_tests to work with run_kselftest.sh

elapsed time: 3766m

configs tested: 338
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

tested configs:
alpha allnoconfig   gcc  
alpha allyesconfig   gcc  
alpha   defconfig   gcc  
arc  allmodconfig   gcc  
arc   allnoconfig   gcc  
arc  allyesconfig   gcc  
arc  axs101_defconfig   gcc  
arc  axs103_smp_defconfig   gcc  
arc defconfig   gcc  
arc hsdk_defconfig   gcc  
arc nsim_700_defconfig   gcc  
arc nsimosci_hs_defconfig   gcc  
arc   randconfig-001-20230922   gcc  
arc   randconfig-001-20230923   gcc  
arc   randconfig-001-20230924   gcc  
arc vdk_hs38_defconfig   gcc  
arc vdk_hs38_smp_defconfig   gcc  
arm  allmodconfig   gcc  
arm   allnoconfig   gcc  
arm  allyesconfig   gcc  
arm   aspeed_g4_defconfig   clang
arm defconfig   gcc  
arm keystone_defconfig   gcc  
arm   omap2plus_defconfig   gcc  
arm  pxa910_defconfig   gcc  
arm   randconfig-001-20230922   gcc  
arm   randconfig-001-20230923   gcc  
arm   randconfig-001-20230924   gcc  
arm   randconfig-001-20230925   gcc  
arm rpc_defconfig   gcc  
arm s3c6400_defconfig   gcc  
arm shmobile_defconfig   gcc  
arm   u8500_defconfig   gcc  
arm vf610m4_defconfig   gcc  
arm64 allmodconfig   gcc  
arm64 allnoconfig   gcc  
arm64 allyesconfig   gcc  
arm64   defconfig   gcc  
csky allmodconfig   gcc  
csky  allnoconfig   gcc  
csky allyesconfig   gcc  
csky defconfig   gcc  
hexagon   allnoconfig   clang
i386 allmodconfig   gcc  
i386  allnoconfig   gcc  
i386 allyesconfig   gcc  
i386 buildonly-randconfig-001-20230922   gcc  
i386 buildonly-randconfig-001-20230923   gcc  
i386 buildonly-randconfig-001-20230924   gcc  
i386 buildonly-randconfig-002-20230922   gcc  
i386 buildonly-randconfig-002-20230923   gcc  
i386 buildonly-randconfig-002-20230924   gcc  
i386 buildonly-randconfig-003-20230922   gcc  
i386 buildonly-randconfig-003-20230923   gcc  
i386 buildonly-randconfig-003-20230924   gcc  
i386 buildonly-randconfig-004-20230922   gcc  
i386 buildonly-randconfig-004-20230923   gcc  
i386 buildonly-randconfig-004-20230924   gcc  
i386 buildonly-randconfig-005-20230922   gcc  
i386 buildonly-randconfig-005-20230923   gcc  
i386 buildonly-randconfig-005-20230924   gcc  
i386 buildonly-randconfig-006-20230922   gcc  
i386 buildonly-randconfig-006-20230923   gcc  
i386 buildonly-randconfig-006-20230924   gcc  
i386  debian-10.3   gcc  
i386 defconfig   gcc  
i386  randconfig-001-20230922   gcc  
i386  randconfig-001-20230923   gcc  
i386  randconfig-001-20230924   gcc  
i386  randconfig-001-20230925   gcc  
i386  randconfig-002-20230922   gcc  
i386  randconfig-002-20230923   gcc  
i386  randconfig-002-20230924   gcc  
i386  randconfig-002-20230925   gcc  
i386  randconfig-003-20230922   gcc  
i386  randconfig-003-20230923   gcc  
i386  randconfig-003-20230924   gcc  
i386  randconfig-003-20230925   gcc  
i386  randconfig-004-20230922   gcc  
i386  randconfig-004-20230923   gcc  
i386  randconfig-004-20230924   gcc  
i386  randconfig-004-20230925   gcc  
i386  randconfig-005-20230922   gcc  
i386  randconfig-005-20230923   gcc  
i386  randconfig-005-20230924   gcc

Re: Questions: Should kernel panic when PCIe fatal error occurs?

2023-09-24 Thread Shuai Xue



On 2023/9/21 21:20, David Laight wrote:
> ...
> I've got a target to generate AER errors by generating read cycles
> that are inside the address range that the bridge forwards but
> outside of any BAR because there are 2 different sized BARs.
> (Pretty easy to setup.)
> On the system I was using they didn't get propagated all the way
> to the root bridge - but were visible in the lower bridge.

So how did you observe it? If the error message does not propagate
to the root bridge, I think no AER interrupt will be triggered.

> It would be nice for a driver to be able to detect/clear such
> a flag if it gets an unexpected ~0u read value.
> (I'm not sure an error callback helps.)

IMHO, a general model is that an error detected at an endpoint should be
routed to the upstream port (for example, an RCiEP routes its error
message to the RCEC) so that the AER port service can handle the error.
The device driver only has to implement the error handler callback.
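
To make that model concrete, here is a minimal userspace sketch, not
kernel code. Every demo_* name is made up for illustration. In the
kernel the corresponding pieces are struct pci_error_handlers and its
error_detected callback, whose real signatures differ from this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes; the kernel's real type is pci_ers_result_t. */
enum demo_ers_result {
	DEMO_ERS_RECOVERED,
	DEMO_ERS_NEED_RESET,
};

/* Hypothetical callback table that a driver would register. */
struct demo_err_handlers {
	enum demo_ers_result (*error_detected)(void *drvdata, bool fatal);
};

/* Hypothetical endpoint as seen by the port service. */
struct demo_endpoint {
	const struct demo_err_handlers *handlers;
	void *drvdata;
};

/*
 * Port-service side: runs once an error message has been routed to the
 * upstream port. Falls back to a reset when the driver has no callback.
 */
static enum demo_ers_result demo_port_service_report(struct demo_endpoint *ep,
						     bool fatal)
{
	assert(ep != NULL);
	if (!ep->handlers || !ep->handlers->error_detected)
		return DEMO_ERS_NEED_RESET;
	return ep->handlers->error_detected(ep->drvdata, fatal);
}

/* Driver side: private state plus the single callback it implements. */
struct demo_drv {
	bool io_stopped;
};

static enum demo_ers_result demo_drv_error_detected(void *drvdata, bool fatal)
{
	struct demo_drv *drv = drvdata;

	drv->io_stopped = true;	/* stop issuing new I/O */
	return fatal ? DEMO_ERS_NEED_RESET : DEMO_ERS_RECOVERED;
}

static const struct demo_err_handlers demo_drv_handlers = {
	.error_detected = demo_drv_error_detected,
};
```

The point of the split is that the driver never polls for the error
itself. It only reacts when the port service calls into it.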

> 
> OTOH a 'nebs compliant' server routed any kind of PCIe link error
> through to some 'system management' logic that then raised an NMI.
> I'm not sure who thought an NMI was a good idea - they are pretty
> impossible to handle in the kernel and too late to be of use to
> the code performing the access.

I think it is the responsibility of the device to prevent the spread of
errors while reporting that errors have been detected. For example, drop
the current request (drain the submit queue) and report the error in the
completion record. Both NMI and MSI are asynchronous interrupts.

> 
> In any case we were getting one after 'echo 1 >xxx/remove' and
> then taking the PCIe link down by reprogramming the fpga.
> So the link going down was entirely expected, but there seemed
> to be nothing we could do to stop the kernel crashing.
> 
> I'm sure 'nebs compliant' ought to contain some requirements for
> resilience to hardware failures!

How did the kernel crash after the link went down? Did the system detect
a surprise down error?

Best Regards,
Shuai