[PATCH] powerpc/pseries: Fix use after free in remove_phb_dynamic()

2022-03-17 Thread Michael Ellerman
In remove_phb_dynamic() we use &phb->io_resource, after we've called
device_unregister(&host_bridge->dev). But the unregister may have freed
phb, because pcibios_free_controller_deferred() is the release function
for the host_bridge.

If there are no outstanding references when we call device_unregister()
then phb will be freed out from under us.

This has gone mainly unnoticed, but with slub_debug and page_poison
enabled it can lead to a crash:

  PID: 7574   TASK: c000d492cb80  CPU: 13  COMMAND: "drmgr"
   #0 [c000e4f075a0] crash_kexec at c027d7dc
   #1 [c000e4f075d0] oops_end at c0029608
   #2 [c000e4f07650] __bad_page_fault at c00904b4
   #3 [c000e4f076c0] do_bad_slb_fault at c009a5a8
   #4 [c000e4f076f0] data_access_slb_common_virt at c0008b30
   Data SLB Access [380] exception frame:
   R0:  c0167250  R1:  c000e4f07a00  R2:  c2a46100
   R3:  c2b39ce8  R4:  00c0  R5:  00a9
   R6:  3894674d00c0  R7:    R8:  00ff
   R9:  0100  R10: 6b6b6b6b6b6b6b6b  R11: 8000
   R12: c023da80  R13: c009ffd38b00  R14:
   R15: 00011c87f0f0  R16: 0006  R17: 0003
   R18: 0002  R19: 0004  R20: 0005
   R21: 00011c87ede8  R22: 00011c87c5a8  R23: 00011c87d3a0
   R24:   R25: 0001  R26: c000e4f07cc8
   R27: c0004d1cc400  R28: c008031d00e8  R29: c0004d23d800
   R30: c0004d1d2400  R31: c0004d1d2540
   NIP: c0167258  MSR: 80009033  OR3: c0e9f474
   CTR:   LR:  c0167250  XER: 20040003
   CCR: 24088420  MQ:    DAR: 6b6b6b6b6b6b6ba3
   DSISR: c000e4f07920 Syscall Result: fff2
   [NIP  : release_resource+56]
   [LR   : release_resource+48]
   #5 [c000e4f07a00] release_resource at c0167258  (unreliable)
   #6 [c000e4f07a30] remove_phb_dynamic at c0105648
   #7 [c000e4f07ab0] dlpar_remove_slot at c008031a09e8 [rpadlpar_io]
   #8 [c000e4f07b50] remove_slot_store at c008031a0b9c [rpadlpar_io]
   #9 [c000e4f07be0] kobj_attr_store at c0817d8c
  #10 [c000e4f07c00] sysfs_kf_write at c063e504
  #11 [c000e4f07c20] kernfs_fop_write_iter at c063d868
  #12 [c000e4f07c70] new_sync_write at c054339c
  #13 [c000e4f07d10] vfs_write at c0546624
  #14 [c000e4f07d60] ksys_write at c05469f4
  #15 [c000e4f07db0] system_call_exception at c0030840
  #16 [c000e4f07e10] system_call_vectored_common at c000c168

To avoid it, we can take a reference to the host_bridge->dev until we're
done using phb. Then when we drop the reference the phb will be freed.

Fixes: 2dd9c11b9d4d ("powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)")
Reported-by: David Dai 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/pseries/pci_dlpar.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/pci_dlpar.c b/arch/powerpc/platforms/pseries/pci_dlpar.c
index 90c9d3531694..4ba824568119 100644
--- a/arch/powerpc/platforms/pseries/pci_dlpar.c
+++ b/arch/powerpc/platforms/pseries/pci_dlpar.c
@@ -78,6 +78,9 @@ int remove_phb_dynamic(struct pci_controller *phb)
 
 	pseries_msi_free_domains(phb);
 
+	/* Keep a reference so phb isn't freed yet */
+	get_device(&host_bridge->dev);
+
 	/* Remove the PCI bus and unregister the bridge device from sysfs */
 	phb->bus = NULL;
 	pci_remove_bus(b);
@@ -101,6 +104,7 @@ int remove_phb_dynamic(struct pci_controller *phb)
 	 * the pcibios_free_controller_deferred() callback;
 	 * see pseries_root_bridge_prepare().
 	 */
+	put_device(&host_bridge->dev);
 
 	return 0;
 }
-- 
2.34.1
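
The get_device()/put_device() pairing in the patch above can be illustrated
outside the kernel with a toy reference count. This is only a userspace
analogy under stated assumptions, not kernel code: struct toy_dev and the
toy_*() helpers are hypothetical names invented for this sketch, and
io_resource merely stands in for phb->io_resource.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted device that owns driver data. */
struct toy_dev {
	int refcount;
	int io_resource;	/* stands in for phb->io_resource */
	bool *freed;		/* set by the release function, for the demo */
};

/* Release function: runs when the last reference is dropped. */
static void toy_release(struct toy_dev *d)
{
	*d->freed = true;
	free(d);
}

static void toy_get(struct toy_dev *d)
{
	d->refcount++;
}

static void toy_put(struct toy_dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		toy_release(d);
}

/* Registration holds one reference; unregistering drops it. */
static struct toy_dev *toy_register(bool *freed)
{
	struct toy_dev *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	d->refcount = 1;
	d->io_resource = 42;
	d->freed = freed;
	*freed = false;
	return d;
}

static void toy_unregister(struct toy_dev *d)
{
	toy_put(d);
}

/* Mirrors the fixed ordering: pin, unregister, use, then unpin. */
static int toy_remove(struct toy_dev *d)
{
	int val;

	toy_get(d);		/* keep d alive across the unregister */
	toy_unregister(d);	/* refcount 2 -> 1, release not called */
	val = d->io_resource;	/* still valid memory */
	toy_put(d);		/* refcount 1 -> 0, release runs here */
	return val;
}
```

Without the toy_get() call, toy_unregister() would drop the only reference
and the following read of d->io_resource would touch freed memory, which is
the shape of the bug described in the commit message.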



[Bug 215652] kernel 5.17-rc fail to load radeon DRM "modprobe: ERROR: could not insert 'radeon': Unknown symbol in module, or unknown parameter (see dmesg)"

2022-03-17 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=215652

Erhard F. (erhar...@mailbox.org) changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |OBSOLETE

--- Comment #10 from Erhard F. (erhar...@mailbox.org) ---
I did not get a meaningful result out of my reverse bisect... But
v5.17.0-rc7 and v5.17.0-rc8 do not show this issue.

So closing here.


Re: [PATCH v1] PCI/AER: Handle Multi UnCorrectable/Correctable errors properly

2022-03-17 Thread Bjorn Helgaas
On Mon, Mar 14, 2022 at 09:21:46AM -0700, Eric Badger wrote:
> On Sun, Mar 13, 2022 at 02:43:14PM -0700, Raj, Ashok wrote:
> > On Sun, Mar 13, 2022 at 02:52:20PM -0500, Bjorn Helgaas wrote:
> > > On Fri, Mar 11, 2022 at 02:58:07AM +, Kuppuswamy Sathyanarayanan wrote:
> > > > Currently the aer_irq() handler returns IRQ_NONE when neither the
> > > > PCI_ERR_ROOT_UNCOR_RCV nor the PCI_ERR_ROOT_COR_RCV bit is set. But this
> > > > assumption is incorrect.
> > > > 
> > > > Consider a scenario where aer_irq() is triggered for a correctable
> > > > error, and while we process the error and before we clear the error
> > > > status in "Root Error Status" register, if the same kind of error
> > > > is triggered again, since aer_irq() only clears events it saw, the
> > > > multi-bit error is left intact. This will cause the interrupt to fire
> > > > again, resulting in entering aer_irq() with just the multi-bit error
> > > > logged in the "Root Error Status" register.
> > > > 
> > > > Repeated AER recovery testing has revealed that this condition does happen
> > > > and that it prevents any new interrupt from being triggered. Allow the
> > > > interrupt to be processed even if only the multi-correctable (BIT 1) or
> > > > multi-uncorrectable (BIT 3) bit is set.
> > > > 
> > > > Reported-by: Eric Badger 
> > > 
> > > Is there a bug report with any concrete details (dmesg, lspci, etc)
> > > that we can include here?
> > 
> > Eric might have more details to add when he collected numerous logs to get
> > to the timeline of the problem. The test was to stress the links with an
> > automated power off, this will result in some eDPC UC error followed by
> > link down. The recovery worked fine for several cycles and suddenly there
> > were no more interrupts. A manual rescan on pci would probe and device is
> > operational again.
> 
> The problem was originally discovered while performing a looping hot plug
> test. At hot remove time, one or more corrected errors usually appeared:
> 
> [256236.078151] pcieport :89:02.0: AER: Corrected error received: :89:02.0
> [256236.078154] pcieport :89:02.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
> [256236.088606] pcieport :89:02.0: AER:   device [8086:347a] error status/mask=0001/
> [256236.097857] pcieport :89:02.0: AER:[ 0] RxErr
> [256236.152622] pcieport :89:02.0: pciehp: Slot(400): Link Down
> [256236.152623] pcieport :89:02.0: pciehp: Slot(400): Card not present
> [256236.152631] pcieport :89:02.0: DPC: containment event, status:0x1f01 source:0x
> [256236.152632] pcieport :89:02.0: DPC: unmasked uncorrectable error detected reason 0 ext_reason 0
> [256236.152634] pcieport :89:02.0: AER: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Receiver ID)
> [256236.164207] pcieport :89:02.0: AER:   device [8086:347a] error status/mask=0020/0010
> [256236.173464] pcieport :89:02.0: AER:[ 5] SDES   (First)
> [256236.278407] pci :8a:00.0: Removing from iommu group 32
> [256237.500837] pcieport :89:02.0: Data Link Layer Link Active not set in 1000 msec
> [256237.500842] pcieport :89:02.0: link reset at upstream device :89:02.0 failed
> [256237.500865] pcieport :89:02.0: AER: Device recovery failed
> 
> The problematic case arose when 2 corrected errors arrived in a sequence like this:
> 
> 1. Correctable error triggered, bit 0 (ERR_COR) set in Root Error Status,
>    which now has value 0x1.
> 2. aer_irq() triggered, reads Root Error Status, finds value 0x1.
> 3. Second correctable error triggered, bit 1 (multiple ERR_COR) set in Root
>    Error Status, which now has value 0x3.
> 4. aer_irq() writes back 0x1 to Root Error Status, which now has value 0x2.
> 5. aer_irq() triggered again due to the second error, but, finding value 0x2
>    in Root Error Status, takes no action. Future interrupts are now inhibited.

Thanks for the additional details!

After this patch, I guess aer_irq() still reads 0x2
(PCI_ERR_ROOT_MULTI_COR_RCV), but now it writes 0x2 back which clears
PCI_ERR_ROOT_MULTI_COR_RCV.

In addition, aer_irq() will continue on to read PCI_ERR_ROOT_ERR_SRC,
which probably contains either 0 or junk left over from being captured
when PCI_ERR_ROOT_COR_RCV was set.

And aer_irq() will queue an e_src record with status ==
PCI_ERR_ROOT_MULTI_COR_RCV.  But since PCI_ERR_ROOT_COR_RCV is not set
in status, aer_isr_one_error() will do nothing, right?

That might not be *terrible* and is definitely better than not being
able to handle future interrupts.  But we basically threw away the
information that multiple errors occurred, and we queued an e_src
record that occupies space without being used for anything.

Bjorn
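
The five-step sequence quoted above can be replayed with a toy model of a
write-1-to-clear status register. This is a sketch under stated assumptions,
not the real aer_irq(): the bit values 0x1 and 0x2 are the ones named in the
thread, while struct toy_root and the toy_*() helpers are hypothetical names
made up for the illustration, and only correctable errors are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_ROOT_COR_RCV	0x1u	/* bit 0: ERR_COR received */
#define TOY_ROOT_MULTI_COR_RCV	0x2u	/* bit 1: multiple ERR_COR received */

/* Hypothetical model of the Root Error Status register. */
struct toy_root {
	uint32_t status;
};

/* Hardware side: the first error sets bit 0, later ones set bit 1. */
static void toy_hw_cor_error(struct toy_root *r)
{
	if (r->status & TOY_ROOT_COR_RCV)
		r->status |= TOY_ROOT_MULTI_COR_RCV;
	else
		r->status |= TOY_ROOT_COR_RCV;
}

/* Write-1-to-clear: every 1 bit written clears that status bit. */
static void toy_w1c(struct toy_root *r, uint32_t val)
{
	r->status &= ~val;
}

/* Handler: acts only if one of the accepted bits is set. */
static bool toy_irq(struct toy_root *r, uint32_t accept_mask)
{
	uint32_t seen = r->status;

	if (!(seen & accept_mask))
		return false;	/* like IRQ_NONE: nothing is cleared */
	toy_w1c(r, seen);
	return true;
}

/* Replays steps 1-5 and returns the final register value. */
static uint32_t toy_replay(uint32_t accept_mask)
{
	struct toy_root r = { 0 };
	uint32_t seen;

	toy_hw_cor_error(&r);		/* step 1: status = 0x1 */
	seen = r.status;		/* step 2: handler reads 0x1 */
	assert(seen == TOY_ROOT_COR_RCV);
	toy_hw_cor_error(&r);		/* step 3: status = 0x3 */
	toy_w1c(&r, seen);		/* step 4: status = 0x2 */
	toy_irq(&r, accept_mask);	/* step 5: second interrupt */
	return r.status;
}
```

With an accept mask of only TOY_ROOT_COR_RCV the register is left at 0x2
after step 5. Adding TOY_ROOT_MULTI_COR_RCV to the mask lets the second
invocation clear it, which matches the "writes 0x2 back" behaviour discussed
in the reply.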


[PATCH v5 16/20] Kbuild: add Rust support

2022-03-17 Thread Miguel Ojeda
Having all the new files in place, we now enable Rust support
in the build system, including `Kconfig` entries related to Rust,
the Rust configuration printer, the target specification
generation script, the version detection script and a few
other bits.

Co-developed-by: Alex Gaynor 
Signed-off-by: Alex Gaynor 
Co-developed-by: Finn Behrens 
Signed-off-by: Finn Behrens 
Co-developed-by: Adam Bratschi-Kaye 
Signed-off-by: Adam Bratschi-Kaye 
Co-developed-by: Wedson Almeida Filho 
Signed-off-by: Wedson Almeida Filho 
Co-developed-by: Michael Ellerman 
Signed-off-by: Michael Ellerman 
Co-developed-by: Sven Van Asbroeck 
Signed-off-by: Sven Van Asbroeck 
Co-developed-by: Gary Guo 
Signed-off-by: Gary Guo 
Co-developed-by: Boris-Chengbiao Zhou 
Signed-off-by: Boris-Chengbiao Zhou 
Co-developed-by: Boqun Feng 
Signed-off-by: Boqun Feng 
Co-developed-by: Douglas Su 
Signed-off-by: Douglas Su 
Co-developed-by: Dariusz Sosnowski 
Signed-off-by: Dariusz Sosnowski 
Co-developed-by: Antonio Terceiro 
Signed-off-by: Antonio Terceiro 
Co-developed-by: Daniel Xu 
Signed-off-by: Daniel Xu 
Co-developed-by: Miguel Cano 
Signed-off-by: Miguel Cano 
Signed-off-by: Miguel Ojeda 
---
 .gitignore   |   5 +
 .rustfmt.toml|  12 +
 Makefile | 173 -
 arch/Kconfig |   6 +
 arch/arm/Kconfig |   1 +
 arch/arm64/Kconfig   |   1 +
 arch/powerpc/Kconfig |   1 +
 arch/riscv/Kconfig   |   1 +
 arch/riscv/Makefile  |   5 +
 arch/x86/Kconfig |   1 +
 arch/x86/Makefile|  14 +
 init/Kconfig |  44 ++-
 lib/Kconfig.debug| 143 +++
 rust/.gitignore  |   8 +
 rust/Makefile| 376 +++
 rust/bindgen_parameters  |  13 +
 scripts/.gitignore   |   1 +
 scripts/Kconfig.include  |   6 +-
 scripts/Makefile |   3 +
 scripts/Makefile.build   |  60 +++
 scripts/Makefile.debug   |  10 +
 scripts/Makefile.host|  34 +-
 scripts/Makefile.lib |  12 +
 scripts/Makefile.modfinal|   8 +-
 scripts/cc-version.sh|  12 +-
 scripts/generate_rust_target.rs  | 227 +++
 scripts/is_rust_module.sh|  13 +
 scripts/kconfig/confdata.c   |  75 
 scripts/min-tool-version.sh  |   6 +
 scripts/rust-is-available-bindgen-libclang.h |   2 +
 scripts/rust-is-available.sh | 158 
 31 files changed, 1408 insertions(+), 23 deletions(-)
 create mode 100644 .rustfmt.toml
 create mode 100644 rust/.gitignore
 create mode 100644 rust/Makefile
 create mode 100644 rust/bindgen_parameters
 create mode 100644 scripts/generate_rust_target.rs
 create mode 100755 scripts/is_rust_module.sh
 create mode 100644 scripts/rust-is-available-bindgen-libclang.h
 create mode 100755 scripts/rust-is-available.sh

diff --git a/.gitignore b/.gitignore
index 7afd412dadd2..48c68948f476 100644
--- a/.gitignore
+++ b/.gitignore
@@ -37,6 +37,7 @@
 *.o
 *.o.*
 *.patch
+*.rmeta
 *.s
 *.so
 *.so.dbg
@@ -96,6 +97,7 @@ modules.order
 !.gitattributes
 !.gitignore
 !.mailmap
+!.rustfmt.toml
 
 #
 # Generated include files
@@ -161,3 +163,6 @@ x509.genkey
 
 # Documentation toolchain
 sphinx_*/
+
+# Rust analyzer configuration
+/rust-project.json
diff --git a/.rustfmt.toml b/.rustfmt.toml
new file mode 100644
index ..3de5cc497465
--- /dev/null
+++ b/.rustfmt.toml
@@ -0,0 +1,12 @@
+edition = "2021"
+newline_style = "Unix"
+
+# Unstable options that help catching some mistakes in formatting and that we may want to enable
+# when they become stable.
+#
+# They are kept here since they are useful to run from time to time.
+#format_code_in_doc_comments = true
+#reorder_impl_items = true
+#comment_width = 100
+#wrap_comments = true
+#normalize_comments = true
diff --git a/Makefile b/Makefile
index 55a30ca69350..67008a2d964c 100644
--- a/Makefile
+++ b/Makefile
@@ -120,6 +120,13 @@ endif
 
 export KBUILD_CHECKSRC
 
+# Enable "clippy" (a linter) as part of the Rust compilation.
+#
+# Use 'make CLIPPY=1' to enable it.
+ifeq ("$(origin CLIPPY)", "command line")
+  KBUILD_CLIPPY := $(CLIPPY)
+endif
+
 # Use make M=dir or set the environment variable KBUILD_EXTMOD to specify the
 # directory of external module to build. Setting M= takes precedence.
 ifeq ("$(origin M)", "command line")
@@ -267,7 +274,7 @@ no-dot-config-targets := $(clean-targets) \
 cscope gtags TAGS tags help% %docs check% coccicheck \
 $(version_

[PATCH v5 00/20] Rust support

2022-03-17 Thread Miguel Ojeda
Rust support

This is the patch series (v5) to add support for Rust as a second
language to the Linux kernel.

If you are interested in following this effort, please join us in
the mailing list at:

rust-for-li...@vger.kernel.org

and take a look at the project itself at:

https://github.com/Rust-for-Linux

As usual, special thanks go to ISRG (Internet Security Research
Group) and Google for their financial support on this endeavor.

Cheers,
Miguel

--

# Rust support

This cover letter explains the major changes and updates done since
the previous ones. For those, please see:

RFC: https://lore.kernel.org/lkml/20210414184604.23473-1-oj...@kernel.org/
v1:  https://lore.kernel.org/lkml/20210704202756.29107-1-oj...@kernel.org/
v2:  https://lore.kernel.org/lkml/20211206140313.5653-1-oj...@kernel.org/
v3:  https://lore.kernel.org/lkml/20220117053349.6804-1-oj...@kernel.org/
v4:  https://lore.kernel.org/lkml/20220212130410.6901-1-oj...@kernel.org/


## Infrastructure updates

There have been several improvements to the overall Rust support:

  - The toolchain and `alloc` have been upgraded to Rust 1.59.0.
    This version stabilized `feature(global_asm)` as well as
    the `-Csymbol-mangling-version=v0` flag.

  - Added support for host programs written in Rust. This should
    only be used in scenarios where Rust is required to be available.

  - Target specification files are now generated on the fly based
    on the kernel configuration, via a Rust script, instead of
    having a few predefined files.

    The content of the generated file has been simplified and,
    for x86, all the options that can be specified through the
    command-line have been moved to the architecture `Makefile`.

    The goal is to reduce the content of the file as much as possible
    for all architectures, and eventually, stop needing such a file.

  - Added `HAVE_RUST` kernel option. This symbol should be selected
    by an architecture if it supports Rust.

  - Added documentation on `RUSTFLAGS*` and `KBUILD_RUST*` variables.

  - Simplified tags and cross-references in the documentation.

  - Other cleanups, fixes and improvements on the build system.


## Abstractions and driver updates

Some of the improvements to the abstractions and example drivers are:

  - Added abstraction for the Hardware Random Number Generator.

  - `%pA` rework in `vsprintf` following the review.

  - The `sync` sample now shows how to use static mutexes and
    conditional variables.

  - Error codes can now be used without prefixing them with
    `Error::`, which makes using them closer to C. For instance:

        fn f(...) -> Result {
            if ... {
                return Err(EINVAL);
            }
            ...
            Ok(())
        }

  - Added `CString` type for owned C strings.

  - `miscdev` registration now holds an owned C string, which enables
    scenarios when the device name is constructed at runtime.

  - Added `Bool` trait meant to be used in type states to allow
    boolean constraints in implementation blocks.

  - Added `LockInfo` trait that lock "type states" must implement.
    This allows the definition of additional writable type states.

  - Simplification of the spin lock implementation by splitting
    acquisition types. Type states are used to implement two versions
    of the `Lock` trait: one which never modifies the interrupt state
    and one that disables them (if they are enabled, then re-enables
    on unlock).

  - `Result::unwrap` can now be used in examples that are compiled,
    linked and run.

  - Merged `Formatter` and `Buffer` types.

  - Added `IoMem::offset_ok` for runtime sizes.

  - Other cleanups, fixes and improvements.


## Patch series status

The Rust support is still to be considered experimental. However,
support is good enough that kernel developers can start working on the
Rust abstractions for subsystems and write drivers and other modules.

The current series has just arrived in `linux-next`, as usual.
Similarly, the preview docs for this series can be seen at:

https://rust-for-linux.github.io/docs/kernel/

As usual, please see the following link for the
live list of unstable Rust features we are using:

https://github.com/Rust-for-Linux/linux/issues/2

Note that this time the series depends on a patch queued in
powerpc-next:


https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=next&id=d4be60fe66b7380530868ceebe549f8eebccacc5


## Acknowledgements

The signatures in the main commits correspond to the people that
wrote code that has ended up in them at the present time. For details
on contributions to code and discussions, please see our repository:

https://github.com/Rust-for-Linux/linux

However, we would like to give credit to everyone that has contributed
in one way or another to the Rust for Linux project. Since the
previous cover letter:

  - Akira Yokosawa for a detailed review of the documentation and

Re: [PATCH v1 4/7] arm64/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-03-17 Thread Catalin Marinas
On Thu, Mar 17, 2022 at 11:04:18AM +0100, David Hildenbrand wrote:
> On 16.03.22 19:27, Catalin Marinas wrote:
> > On Tue, Mar 15, 2022 at 03:18:34PM +0100, David Hildenbrand wrote:
> >> @@ -909,12 +925,13 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
> >>  /*
> >>   * Encode and decode a swap entry:
> >>   *	bits 0-1:	present (must be zero)
> >> - *	bits 2-7:	swap type
> >> + *	bits 2:		remember PG_anon_exclusive
> >> + *	bits 3-7:	swap type
> >>   *	bits 8-57:	swap offset
> >>   *	bit  58:	PTE_PROT_NONE (must be zero)
> > 
> > I don't remember exactly why we reserved bits 0 and 1 when, from the
> > hardware perspective, it's sufficient for bit 0 to be 0 and the whole
> > pte becomes invalid. We use bit 1 as the 'table' bit (when 0 at pmd
> > level, it's a huge page) but we shouldn't check for this on a swap
> > entry.
> 
> You mean
> 
> arch/arm64/include/asm/pgtable-hwdef.h:#define PTE_TABLE_BIT (_AT(pteval_t, 1) << 1)
> 
> right?

Yes.

> I wonder why it even exists, for arm64 I only spot:
> 
> arch/arm64/include/asm/pgtable.h:#define pte_mkhuge(pte) (__pte(pte_val(pte) & ~PTE_TABLE_BIT))
> 
> I don't really see code that sets PTE_TABLE_BIT.
> 
> Similarly, I don't see code that sets
> PMD_TABLE_BIT/PUD_TABLE_BIT/P4D_TABLE_BIT. Most probably the setting code
> is not using the defines, which is why I'm not finding it.

It gets set as part of P*D_TYPE_TABLE via p*d_populate(). We use the
P*D_TABLE_BIT mostly for checking whether it's a huge page or not (the
arm64 hugetlbpage.c code).

-- 
Catalin
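
The swap entry layout quoted earlier in this message (bits 0-1 zero, bit 2
for the exclusive marker, bits 3-7 for the swap type, bits 8-57 for the swap
offset) can be written out as plain bit arithmetic. The sketch below is an
illustration only: the TOY_* macros and toy_swp_*() helpers are hypothetical
names, not the arm64 kernel macros, and the field widths are taken solely
from the quoted comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field positions taken from the quoted comment; names are hypothetical. */
#define TOY_SWP_EXCLUSIVE	(UINT64_C(1) << 2)
#define TOY_SWP_TYPE_SHIFT	3
#define TOY_SWP_TYPE_BITS	5	/* bits 3-7 */
#define TOY_SWP_TYPE_MASK	((UINT64_C(1) << TOY_SWP_TYPE_BITS) - 1)
#define TOY_SWP_OFFSET_SHIFT	8
#define TOY_SWP_OFFSET_BITS	50	/* bits 8-57 */
#define TOY_SWP_OFFSET_MASK	((UINT64_C(1) << TOY_SWP_OFFSET_BITS) - 1)

/* Pack type, offset and the exclusive marker into one 64-bit value. */
static uint64_t toy_swp_encode(unsigned int type, uint64_t offset,
			       bool exclusive)
{
	uint64_t pte;

	pte = ((uint64_t)type & TOY_SWP_TYPE_MASK) << TOY_SWP_TYPE_SHIFT;
	pte |= (offset & TOY_SWP_OFFSET_MASK) << TOY_SWP_OFFSET_SHIFT;
	if (exclusive)
		pte |= TOY_SWP_EXCLUSIVE;
	return pte;
}

static unsigned int toy_swp_type(uint64_t pte)
{
	return (unsigned int)((pte >> TOY_SWP_TYPE_SHIFT) & TOY_SWP_TYPE_MASK);
}

static uint64_t toy_swp_offset(uint64_t pte)
{
	return (pte >> TOY_SWP_OFFSET_SHIFT) & TOY_SWP_OFFSET_MASK;
}

static bool toy_swp_exclusive(uint64_t pte)
{
	return (pte & TOY_SWP_EXCLUSIVE) != 0;
}
```

Taking bit 2 for the exclusive marker shrinks the type field from six bits
to five, so this layout can distinguish at most 32 swap types.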


[RFC][PATCH] net: fs_enet: fix tx error handling

2022-03-17 Thread Mans Rullgard
In some cases, the TXE flag is apparently set without any error
indication in the buffer descriptor status. When this happens, tx
stalls until the tx_restart() function is called via the device
watchdog which can take a long time.

To fix this, check for TXE in the napi poll function and trigger a
tx_restart() call as for errors reported in the buffer descriptor.

This change makes the FCC based Ethernet controller on MPC82xx devices
usable. It probably breaks the other modes (FEC, SCC) which I have no
way of testing.

Signed-off-by: Mans Rullgard 
---
 .../ethernet/freescale/fs_enet/fs_enet-main.c | 47 +++
 .../net/ethernet/freescale/fs_enet/mac-fcc.c  |  2 +-
 2 files changed, 19 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
index 78e008b81374..4276becd07cf 100644
--- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
+++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
@@ -94,14 +94,22 @@ static int fs_enet_napi(struct napi_struct *napi, int budget)
int curidx;
int dirtyidx, do_wake, do_restart;
int tx_left = TX_RING_SIZE;
+   u32 int_events;
 
spin_lock(&fep->tx_lock);
bdp = fep->dirty_tx;
+   do_wake = do_restart = 0;
+
+   int_events = (*fep->ops->get_int_events)(dev);
+
+   if (int_events & fep->ev_err) {
+   (*fep->ops->ev_error)(dev, int_events);
+   do_restart = 1;
+   }
 
/* clear status bits for napi*/
(*fep->ops->napi_clear_event)(dev);
 
-   do_wake = do_restart = 0;
while (((sc = CBDR_SC(bdp)) & BD_ENET_TX_READY) == 0 && tx_left) {
dirtyidx = bdp - fep->tx_bd_base;
 
@@ -318,43 +326,24 @@ fs_enet_interrupt(int irq, void *dev_id)
 {
struct net_device *dev = dev_id;
struct fs_enet_private *fep;
-   const struct fs_platform_info *fpi;
u32 int_events;
-   u32 int_clr_events;
-   int nr, napi_ok;
-   int handled;
 
fep = netdev_priv(dev);
-   fpi = fep->fpi;
 
-   nr = 0;
-   while ((int_events = (*fep->ops->get_int_events)(dev)) != 0) {
-   nr++;
+   int_events = (*fep->ops->get_int_events)(dev);
+   if (!int_events)
+   return IRQ_NONE;
 
-   int_clr_events = int_events;
-   int_clr_events &= ~fep->ev_napi;
+   int_events &= ~fep->ev_napi;
 
-   (*fep->ops->clear_int_events)(dev, int_clr_events);
-
-   if (int_events & fep->ev_err)
-   (*fep->ops->ev_error)(dev, int_events);
-
-   if (int_events & fep->ev) {
-   napi_ok = napi_schedule_prep(&fep->napi);
-
-   (*fep->ops->napi_disable)(dev);
-   (*fep->ops->clear_int_events)(dev, fep->ev_napi);
-
-   /* NOTE: it is possible for FCCs in NAPI mode*/
-   /* to submit a spurious interrupt while in poll  */
-   if (napi_ok)
-   __napi_schedule(&fep->napi);
-   }
+   (*fep->ops->clear_int_events)(dev, int_events);
 
+   if (napi_schedule_prep(&fep->napi)) {
+   (*fep->ops->napi_disable)(dev);
+   __napi_schedule(&fep->napi);
}
 
-   handled = nr > 0;
-   return IRQ_RETVAL(handled);
+   return IRQ_HANDLED;
 }
 
 void fs_init_bds(struct net_device *dev)
diff --git a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c
index b47490be872c..66c8f82a8333 100644
--- a/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c
+++ b/drivers/net/ethernet/freescale/fs_enet/mac-fcc.c
@@ -124,7 +124,7 @@ static int do_pd_setup(struct fs_enet_private *fep)
return ret;
 }
 
-#define FCC_NAPI_EVENT_MSK (FCC_ENET_RXF | FCC_ENET_RXB | FCC_ENET_TXB)
+#define FCC_NAPI_EVENT_MSK (FCC_ENET_RXF | FCC_ENET_RXB | FCC_ENET_TXB | FCC_ENET_TXE)
 #define FCC_EVENT  (FCC_ENET_RXF | FCC_ENET_TXB)
 #define FCC_ERR_EVENT_MSK  (FCC_ENET_TXE)
 
-- 
2.35.1
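
The split of work proposed in this RFC (the hard interrupt only acknowledges
non-NAPI events and schedules the poll, and the poll itself checks for TXE
and requests a restart) can be modelled with a few lines of C. This is a
simplified sketch, not the driver: the TOY_EV_* bit values, struct toy_nic
and the toy_*() functions are hypothetical, and the real FCC_ENET_* values
and NAPI calls are deliberately not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event bits; not the real FCC_ENET_* values. */
#define TOY_EV_RXF	0x1u
#define TOY_EV_TXB	0x2u
#define TOY_EV_TXE	0x4u
#define TOY_EV_OTHER	0x8u

/* After the patch, TXE is part of the NAPI mask as well as the error mask. */
#define TOY_EV_NAPI	(TOY_EV_RXF | TOY_EV_TXB | TOY_EV_TXE)
#define TOY_EV_ERR	(TOY_EV_TXE)

struct toy_nic {
	uint32_t events;	/* pending event register */
	bool napi_scheduled;
};

/* Hard interrupt: clear only non-NAPI events, leave the rest for poll. */
static bool toy_interrupt(struct toy_nic *n)
{
	uint32_t ev = n->events;

	if (!ev)
		return false;		/* like IRQ_NONE */
	n->events &= ~(ev & ~TOY_EV_NAPI);
	n->napi_scheduled = true;
	return true;			/* like IRQ_HANDLED */
}

/* Poll: returns true when a TX restart should be triggered. */
static bool toy_poll(struct toy_nic *n)
{
	uint32_t ev = n->events;
	bool do_restart = (ev & TOY_EV_ERR) != 0;

	n->events &= ~TOY_EV_NAPI;	/* clear status bits for napi */
	n->napi_scheduled = false;
	return do_restart;
}
```

Because TXE stays pending until the poll runs, a TXE event with no error
recorded in a buffer descriptor still results in a restart request in this
model, instead of waiting for a watchdog.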



[PATCH] selftests/powerpc: Add a test of 4PB SLB handling

2022-03-17 Thread Michael Ellerman
Add a test for a bug we had in the 4PB address space SLB handling. It
was fixed in commit 4c2de74cc869 ("powerpc/64: Interrupts save PPR on
stack rather than thread_struct").

Signed-off-by: Michael Ellerman 
---
 tools/testing/selftests/powerpc/mm/.gitignore |   1 +
 tools/testing/selftests/powerpc/mm/Makefile   |   4 +-
 .../powerpc/mm/large_vm_gpr_corruption.c  | 160 ++
 3 files changed, 164 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/mm/large_vm_gpr_corruption.c

diff --git a/tools/testing/selftests/powerpc/mm/.gitignore b/tools/testing/selftests/powerpc/mm/.gitignore
index aac4a59f9e28..4e1a294eec35 100644
--- a/tools/testing/selftests/powerpc/mm/.gitignore
+++ b/tools/testing/selftests/powerpc/mm/.gitignore
@@ -12,3 +12,4 @@ pkey_exec_prot
 pkey_siginfo
 stack_expansion_ldst
 stack_expansion_signal
+large_vm_gpr_corruption
diff --git a/tools/testing/selftests/powerpc/mm/Makefile b/tools/testing/selftests/powerpc/mm/Makefile
index 40253abc6208..27dc09d0bfee 100644
--- a/tools/testing/selftests/powerpc/mm/Makefile
+++ b/tools/testing/selftests/powerpc/mm/Makefile
@@ -4,7 +4,8 @@
 
 TEST_GEN_PROGS := hugetlb_vs_thp_test subpage_prot prot_sao segv_errors wild_bctr \
  large_vm_fork_separation bad_accesses pkey_exec_prot \
- pkey_siginfo stack_expansion_signal stack_expansion_ldst
+ pkey_siginfo stack_expansion_signal stack_expansion_ldst \
+ large_vm_gpr_corruption
 TEST_PROGS := stress_code_patching.sh
 
 TEST_GEN_PROGS_EXTENDED := tlbie_test
@@ -19,6 +20,7 @@ $(OUTPUT)/prot_sao: ../utils.c
 
 $(OUTPUT)/wild_bctr: CFLAGS += -m64
 $(OUTPUT)/large_vm_fork_separation: CFLAGS += -m64
+$(OUTPUT)/large_vm_gpr_corruption: CFLAGS += -m64
 $(OUTPUT)/bad_accesses: CFLAGS += -m64
 $(OUTPUT)/pkey_exec_prot: CFLAGS += -m64
 $(OUTPUT)/pkey_siginfo: CFLAGS += -m64
diff --git a/tools/testing/selftests/powerpc/mm/large_vm_gpr_corruption.c b/tools/testing/selftests/powerpc/mm/large_vm_gpr_corruption.c
new file mode 100644
index ..8b04c6796908
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mm/large_vm_gpr_corruption.c
@@ -0,0 +1,160 @@
+// SPDX-License-Identifier: GPL-2.0+
+//
+// Copyright 2022, Michael Ellerman, IBM Corp.
+//
+// Test that the 4PB address space SLB handling doesn't corrupt userspace registers
+// (r9-r13) due to a SLB fault while saving the PPR.
+//
+// The bug was introduced in f384796c4 ("powerpc/mm: Add support for handling > 512TB
+// address in SLB miss") and fixed in 4c2de74cc869 ("powerpc/64: Interrupts save PPR on
+// stack rather than thread_struct").
+//
+// To hit the bug requires the task struct and kernel stack to be in different segments.
+// Usually that requires more than 1TB of RAM, or if that's not practical, boot the kernel
+// with "disable_1tb_segments".
+//
+// The test works by creating mappings above 512TB, to trigger the large address space
+// support. It creates 64 mappings, double the size of the SLB, to cause SLB faults on
+// each access (assuming naive replacement). It then loops over those mappings touching
+// each, and checks that r9-r13 aren't corrupted.
+//
+// It then forks another child and tries again, because a new child process will get a new
+// kernel stack and thread struct allocated, which may be more optimally placed to trigger
+// the bug. It would probably be better to leave the previous child processes hanging
+// around, so that kernel stack & thread struct allocations are not reused, but that would
+// amount to a 30 second fork bomb. The current design reliably triggers the bug on
+// unpatched kernels.
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "utils.h"
+
+
+#ifndef MAP_FIXED_NOREPLACE
+#define MAP_FIXED_NOREPLACE	MAP_FIXED	// "Should be safe" above 512TB
+#endif
+
+#define BASE_ADDRESS   (1ul << 50) // 1PB
+#define STRIDE (2ul << 40) // 2TB
+
+#define SLB_SIZE   32
+#define NR_MAPPINGS	(SLB_SIZE * 2)
+
+static volatile sig_atomic_t signaled;
+
+static void signal_handler(int sig)
+{
+   signaled = 1;
+}
+
+#define CHECK_REG(_reg)                                                 \
+   if (_reg != _reg##_orig) {  \
+   printf(str(_reg) " corrupted! Expected 0x%lx != 0x%lx\n", \
+  _reg##_orig, _reg);  \
+   _exit(1);   \
+   }
+
+static int touch_mappings(void)
+{
+   unsigned long r9_orig, r10_orig, r11_orig, r12_orig, r13_orig;
+   unsigned long r9, r10, r11, r12, r13;
+   unsigned long addr, *p;
+   int i;
+
+   for (i = 0; i < NR_MAPPINGS; i ++) {
+   addr = BASE_ADDRESS + (i * STRIDE);
+   p = (unsigned long *)addr;
+
+   asm volatile (
+   "mr   %0, %%r9  ;" // Read origin

Re: [RFC PATCH] powerpc/64/interrupt: Temporarily save PPR on stack to fix register corruption due to SLB miss

2022-03-17 Thread Michael Ellerman
Nicholas Piggin  writes:
> This is a minimal stable kernel fix for the problem solved by
> 4c2de74cc869 ("powerpc/64: Interrupts save PPR on stack rather than
> thread_struct"). Instead of changing the interrupt stack frame (which
> causes a lot of churn), it moves the PPR value from the PACA save area
> to an unused slot in the stack frame temporarily, and defers saving it
> to thread_struct to later on when it is safe to take SLB misses.

The change log for 4c2de74cc869 doesn't really describe the problem that
well, because it was written as a pre-emptive fix for the SLB-in-C
rewrite.

Here's an attempt:

In commit f384796c4 ("powerpc/mm: Add support for handling > 512TB
address in SLB miss") we added support for using multiple context ids
per process. Previously accessing past the first context id was a fatal
error for the process. With the new support it became non-fatal, and so
the previous "bad_addr_slb" handler was changed to be the
"large_addr_slb" handler.

That handler uses the EXCEPTION_PROLOG_COMMON() macro, which in-turn
calls the SAVE_PPR() macro. At the point where SAVE_PPR() is used, the
r9-13 register values from the original user fault are saved in
paca->exslb. It's not until later in EXCEPTION_PROLOG_COMMON_2() that
they are saved from paca->exslb onto the kernel stack.

The PPR is saved into current->thread.ppr, which is notably not on the
kernel stack the way pt_regs are. This means we can take an SLB miss on
current->thread.ppr. If that happens in the "large_addr_slb" case we
will clobber the saved user r9-r13 in paca->exslb with kernel values.
Later we will save those clobbered values into the pt_regs on the stack,
and when we return to userspace those kernel values will be restored.

Typically this appears as some sort of segfault in userspace, with an
address that looks like a kernel address. In dmesg it can appear as:

  [19117.440331] some_program[1869625]: unhandled signal 11 at cf6bda10 nip 7fff780d559c lr 7fff781ae56c code 30001

The upstream fix for this issue was to move PPR into pt_regs, on the
kernel stack, avoiding the possibility of an SLB fault when saving it.

However changing the size of pt_regs is an intrusive change, and has
side effects in other parts of the kernel. A minimal fix is to
temporarily save the PPR in an unused part of pt_regs, then save the
user register values from paca->exslb into pt_regs, and then move the
saved PPR into thread.ppr.

cheers
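
The ordering problem described above can be shown with a small model in
which a nested SLB miss overwrites a shared save area. Everything in this
sketch is a hypothetical simplification made for illustration: struct
toy_cpu, its fields and the toy_*() functions are invented names, and the
"kernel values" are arbitrary constants rather than anything read from a
real system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NR_SAVED	5	/* r9..r13 */
#define TOY_KERNEL_VAL	UINT64_C(0xc000000000000000)

struct toy_cpu {
	uint64_t exslb[TOY_NR_SAVED];		/* stands in for paca->exslb */
	uint64_t stack_regs[TOY_NR_SAVED];	/* pt_regs on the kernel stack */
	uint64_t stack_scratch;			/* unused slot in pt_regs */
	uint64_t thread_ppr;			/* stands in for thread.ppr */
};

/* Storing to the thread struct may take a nested SLB miss, whose handler
 * reuses exslb for its own (kernel) r9-r13. */
static void toy_store_thread_ppr(struct toy_cpu *c, uint64_t ppr, bool slb_miss)
{
	int i;

	if (slb_miss)
		for (i = 0; i < TOY_NR_SAVED; i++)
			c->exslb[i] = TOY_KERNEL_VAL + i;
	c->thread_ppr = ppr;
}

/* Buggy order: touch the thread struct while user regs are still in exslb. */
static void toy_entry_buggy(struct toy_cpu *c, const uint64_t *user,
			    uint64_t ppr, bool slb_miss)
{
	memcpy(c->exslb, user, sizeof(c->exslb));
	toy_store_thread_ppr(c, ppr, slb_miss);
	memcpy(c->stack_regs, c->exslb, sizeof(c->stack_regs));
}

/* Fixed order: park PPR on the stack, copy user regs out of exslb, and
 * only then touch the thread struct. */
static void toy_entry_fixed(struct toy_cpu *c, const uint64_t *user,
			    uint64_t ppr, bool slb_miss)
{
	memcpy(c->exslb, user, sizeof(c->exslb));
	c->stack_scratch = ppr;
	memcpy(c->stack_regs, c->exslb, sizeof(c->stack_regs));
	toy_store_thread_ppr(c, c->stack_scratch, slb_miss);
}
```

In the buggy ordering the values copied to stack_regs are the kernel ones
left behind by the nested miss, which corresponds to the userspace segfault
with a kernel-looking address mentioned above. In the fixed ordering the
nested miss still overwrites exslb, but nothing reads it afterwards.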

> Upstream kernels between 4.17-4.20 have this bug, so I propose this
> patch for 4.19 stable.
>
> Fixes: f384796c4 ("powerpc/mm: Add support for handling > 512TB address in SLB miss")
> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/include/asm/exception-64s.h | 22 ++
>  1 file changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 35fb5b11955a..f0424c6fdeca 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -243,10 +243,22 @@
>   * PPR save/restore macros used in exceptions_64s.S
>   * Used for P7 or later processors
>   */
> -#define SAVE_PPR(area, ra, rb)						\
> +#define SAVE_PPR(area, ra)						\
> +BEGIN_FTR_SECTION_NESTED(940)						\
> + ld  ra,area+EX_PPR(r13);/* Read PPR from paca */\
> + std ra,RESULT(r1);  /* Store PPR in RESULT for now */ \
> +END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,940)
> +
> +/*
> + * This is called after we are finished accessing 'area', so we can now take
> + * SLB faults accessing the thread struct, which will use PACA_EXSLB area.
> + * This is required because the large_addr_slb handler uses EXSLB and it also
> + * uses the common exception macros including this PPR saving.
> + */
> +#define MOVE_PPR_TO_THREAD(ra, rb)   \
>  BEGIN_FTR_SECTION_NESTED(940)						\
>   ld  ra,PACACURRENT(r13);\
> - ld  rb,area+EX_PPR(r13);/* Read PPR from paca */\
> + ld  rb,RESULT(r1);  /* Read PPR from stack */   \
>   std rb,TASKTHREADPPR(ra);   \
>  END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,940)
>
> @@ -515,9 +527,11 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
>  3:   EXCEPTION_PROLOG_COMMON_1();   \
>   beq 4f; /* if from kernel mode  */ \
>   ACCOUNT_CPU_USER_ENTRY(r13, r9, r10);  \
> - SAVE_PPR(area, r9, r10);   \
> + SAVE_PPR(area, r9);\
>  4:   EXCEPTION_PROLOG_COMMON_2(area)\

Re: [PATCH v4 2/6] Partially revert "KVM: Pass kvm_init()'s opaque param to additional arch funcs"

2022-03-17 Thread Suzuki Kuruppassery Poulose

On 16/02/2022 03:15, Chao Gao wrote:

This partially reverts commit b99040853738 ("KVM: Pass kvm_init()'s opaque
param to additional arch funcs"), removing the opaque parameter from
kvm_arch_check_processor_compat() because nothing uses it any more.
Address conflicts for ARM (due to file movement) and manually handle RISC-V,
which was added after that commit.

The changes to kvm_arch_hardware_setup() in the original commit are still
needed, so they are not reverted.

Signed-off-by: Chao Gao 
Reviewed-by: Sean Christopherson 
---
  arch/arm64/kvm/arm.c   |  2 +-
  arch/mips/kvm/mips.c   |  2 +-
  arch/powerpc/kvm/powerpc.c |  2 +-
  arch/riscv/kvm/main.c  |  2 +-
  arch/s390/kvm/kvm-s390.c   |  2 +-
  arch/x86/kvm/x86.c |  2 +-
  include/linux/kvm_host.h   |  2 +-
  virt/kvm/kvm_main.c| 16 +++-
  8 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index ecc5958e27fe..0165cf3aac3a 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -73,7 +73,7 @@ int kvm_arch_hardware_setup(void *opaque)
return 0;
  }
  
-int kvm_arch_check_processor_compat(void *opaque)

+int kvm_arch_check_processor_compat(void)
  {
return 0;
  }


For arm64:

Reviewed-by: Suzuki K Poulose 


Re: [PATCH 00/14] powerpc/rtas: various cleanups and improvements

2022-03-17 Thread Laurent Dufour
Hi Nick,

As this series needs additional work, I just sent a single patch [1] fixing
the MSR[RI] issue addressed in the patch 9/14 of this series.

I did that because the fix addresses a panic that is currently being seen,
and a standalone patch will ease backporting to stable and distro kernels.

I suggest rebasing this series on top of this new patch.

Cheers,
Laurent.

1:
https://lore.kernel.org/linuxppc-dev/20220317110601.86917-1-lduf...@linux.ibm.com/

On 08/03/2022, 14:50:33, Nicholas Piggin wrote:
> I had a bunch of random little fixes and cleanups around and
> was prompted to put them together and make a change to call
> RTAS with MSR[RI] enabled because of a report of the hard
> lockup watchdog NMI IPI hitting in an rtas call which then
> crashed because it's unrecoverable.
> 
> Could possibly move patch 9 earlier if it would help with
> backporting.
> 
> Thanks,
> Nick
> 
> Nicholas Piggin (14):
>   powerpc/rtas: Move rtas entry assembly into its own file
>   powerpc/rtas: Make enter_rtas a nokprobe symbol on 64-bit
>   powerpc/rtas: Fix whitespace in rtas_entry.S
>   powerpc/rtas: Call enter_rtas with MSR[EE] disabled
>   powerpc/rtas: Modernise RI clearing on 64-bit
>   powerpc/rtas: Load rtas entry MSR explicitly
>   powerpc/rtas: PACA can be restored directly from SPRG
>   powerpc/rtas: call enter_rtas in real-mode on 64-bit
>   powerpc/rtas: Leave MSR[RI] enabled over RTAS call
>   powerpc/rtas: replace rtas_call_unlocked with raw_rtas_call
>   powerpc/rtas: tidy __fetch_rtas_last_error
>   powerpc/rtas: Close theoretical memory leak
>   powerpc/rtas: ensure rtas_call is called with MMU enabled
>   powerpc/rtas: Consolidate and improve checking for rtas callers
> 
>  arch/powerpc/include/asm/rtas.h  |   4 +-
>  arch/powerpc/kernel/Makefile |   2 +-
>  arch/powerpc/kernel/entry_32.S   |  49 --
>  arch/powerpc/kernel/entry_64.S   | 150 ---
>  arch/powerpc/kernel/rtas.c   | 132 +---
>  arch/powerpc/kernel/rtas_entry.S | 144 ++
>  arch/powerpc/platforms/pseries/hotplug-cpu.c |   2 +-
>  arch/powerpc/platforms/pseries/ras.c |   7 +-
>  arch/powerpc/xmon/xmon.c |   2 +-
>  9 files changed, 227 insertions(+), 265 deletions(-)
>  create mode 100644 arch/powerpc/kernel/rtas_entry.S
> 



[PATCH] powerpc/rtas: Keep MSR RI set when calling RTAS

2022-03-17 Thread Laurent Dufour
RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32-bit
mode (MSR[SF] unset).

The change in MSR is done in enter_rtas() in a relatively complex way,
even though the MSR value could simply be hardcoded.
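
As a rough illustration of what a hardcoded value could look like, the
sketch below builds an MSR value for the RTAS call from bit masks. The bit
positions follow the Power ISA MSR layout as defined in
arch/powerpc/include/asm/reg.h. The helper name rtas_msr() is hypothetical,
and the choice of MSR_ME | MSR_RI is an assumption based on the stated goal
of keeping RI set, not a quote from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* MSR bit masks (Power ISA layout, bit 0 = least significant). */
#define MSR_SF (1ULL << 63) /* 64-bit mode */
#define MSR_EE (1ULL << 15) /* external interrupts enabled */
#define MSR_ME (1ULL << 12) /* machine check enabled */
#define MSR_IR (1ULL << 5)  /* instruction relocation */
#define MSR_DR (1ULL << 4)  /* data relocation */
#define MSR_RI (1ULL << 1)  /* recoverable interrupt */

/* Hypothetical helper: MSR value to load into SRR1 before entering RTAS. */
uint64_t rtas_msr(void)
{
    uint64_t msr = MSR_ME | MSR_RI;

    /* RTAS needs real mode, 32-bit mode, and external interrupts off. */
    assert((msr & (MSR_SF | MSR_IR | MSR_DR | MSR_EE)) == 0);
    return msr;
}
```

With RI left set, an interrupt taken inside RTAS (such as the watchdog
system reset) is reported as recoverable instead of forcing a panic.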

Furthermore, a panic has been reported when the watchdog interrupt hits
while running in RTAS, leading to the following stack trace:

[69244.027433][   C24] watchdog: CPU 24 Hard LOCKUP
[69244.027442][   C24] watchdog: CPU 24 TB:997512652051031, last heartbeat 
TB:997504470175378 (15980ms ago)
[69244.027451][   C24] Modules linked in: chacha_generic(E) libchacha(E) 
xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) 
libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) 
algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) 
des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) 
camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) 
algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) 
auth_rpcgss(E)
nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) 
udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) 
netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) 
fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) 
crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) 
fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) 
sd_mod(E) t10_pi(E)
[69244.027555][   C24]  ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) 
gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) 
raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) 
dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E)
[69244.027587][   C24] Supported: No, Unreleased kernel
[69244.027600][   C24] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G  
  E  X5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 
(unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
[69244.027609][   C24] NIP:  1fb41050 LR: 1fb4104c CTR: 

[69244.027612][   C24] REGS: cfc33d60 TRAP: 0100   Tainted: G   
 E  X (5.14.21-150400.71.1.bz196362_2-default)
[69244.027615][   C24] MSR:  82981000   CR: 4882  
XER: 20040020
[69244.027625][   C24] CFAR: 011c IRQMASK: 1
[69244.027625][   C24] GPR00: 0003  
0001 50dc
[69244.027625][   C24] GPR04: 1ffb6100 0020 
0001 1fb09010
[69244.027625][   C24] GPR08: 2000  
 
[69244.027625][   C24] GPR12: 8004072a40a8 cff8b680 
0007 0034
[69244.027625][   C24] GPR16: 1fbf6e94 1fbf6d84 
1fbd1db0 1fb3f008
[69244.027625][   C24] GPR20: 1fb41018  
017f f68f
[69244.027625][   C24] GPR24: 1fb18fe8 1fb3e000 
1fb1adc0 1fb1cf40
[69244.027625][   C24] GPR28: 1fb26000 1fb460f0 
1fb17f18 1fb17000
[69244.027663][   C24] NIP [1fb41050] 0x1fb41050
[69244.027696][   C24] LR [1fb4104c] 0x1fb4104c
[69244.027699][   C24] Call Trace:
[69244.027701][   C24] Instruction dump:
[69244.027723][   C24]       
 
[69244.027728][   C24]       
 
[69244.027762][T87504] Oops: Unrecoverable System Reset, sig: 6 [#1]
[69244.028044][T87504] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
[69244.028089][T87504] Modules linked in: chacha_generic(E) libchacha(E) 
xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) 
libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) 
algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) 
des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) 
camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) 
algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) 
auth_rpcgss(E)
nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) 
udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) 
netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) 
fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) 
crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) 
fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) 
sd_mod(E) t10_pi(E)
[69244.028171][T87504]  ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) 
gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) 
raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) 
dm_mod(E) scsi_dh_rdac(E) scsi_dh_

Re: [PATCH v4 0/6] Improve KVM's interaction with CPU hotplug

2022-03-17 Thread Chao Gao
Ping. Can anyone help review this series (particularly patches 3-5)?

FYI, Sean gave his Reviewed-by to patches 1, 2, 5 and 6.


Re: [PATCH v1 4/7] arm64/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE

2022-03-17 Thread David Hildenbrand
On 16.03.22 19:27, Catalin Marinas wrote:
> On Tue, Mar 15, 2022 at 03:18:34PM +0100, David Hildenbrand wrote:
>> diff --git a/arch/arm64/include/asm/pgtable-prot.h 
>> b/arch/arm64/include/asm/pgtable-prot.h
>> index b1e1b74d993c..62e0ebeed720 100644
>> --- a/arch/arm64/include/asm/pgtable-prot.h
>> +++ b/arch/arm64/include/asm/pgtable-prot.h
>> @@ -14,6 +14,7 @@
>>   * Software defined PTE bits definition.
>>   */
>>  #define PTE_WRITE   (PTE_DBM)/* same as DBM (51) */
>> +#define PTE_SWP_EXCLUSIVE   (_AT(pteval_t, 1) << 2)  /* only for swp ptes */
> 
> I think we can use bit 1 here.
> 
>> @@ -909,12 +925,13 @@ static inline pmd_t pmdp_establish(struct 
>> vm_area_struct *vma,
>>  /*
>>   * Encode and decode a swap entry:
>>   *  bits 0-1:   present (must be zero)
>> - *  bits 2-7:   swap type
>> + *  bits 2: remember PG_anon_exclusive
>> + *  bits 3-7:   swap type
>>   *  bits 8-57:  swap offset
>>   *  bit  58:PTE_PROT_NONE (must be zero)
> 
> I don't remember exactly why we reserved bits 0 and 1 when, from the
> hardware perspective, it's sufficient for bit 0 to be 0 and the whole
> pte becomes invalid. We use bit 1 as the 'table' bit (when 0 at pmd
> level, it's a huge page) but we shouldn't check for this on a swap
> entry.

You mean

arch/arm64/include/asm/pgtable-hwdef.h:#define PTE_TABLE_BIT (_AT(pteval_t, 1) << 1)

right?

I wonder why it even exists, for arm64 I only spot:

arch/arm64/include/asm/pgtable.h:#define pte_mkhuge(pte) (__pte(pte_val(pte) & ~PTE_TABLE_BIT))

I don't really see code that sets PTE_TABLE_BIT.

Similarly, I don't see code that sets PMD_TABLE_BIT/PUD_TABLE_BIT/P4D_TABLE_BIT.
Most probably the code that sets these bits does not use the defines, which
is why I'm not finding it.
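
For reference, the swap-entry layout quoted above (bits 0-1 zero, bit 2
exclusive marker, bits 3-7 type, bits 8-57 offset) can be sketched in
standalone C as follows. The names swp_encode(), swp_type(), swp_offset()
and swp_exclusive() are hypothetical and are not the arm64 helpers; the
sketch uses bit 2 as in the posted patch, not bit 1 as suggested in the
review.

```c
#include <assert.h>
#include <stdint.h>

#define SWP_EXCLUSIVE    (1ULL << 2)                     /* bit 2 */
#define SWP_TYPE_SHIFT   3
#define SWP_TYPE_BITS    5                               /* bits 3-7 */
#define SWP_TYPE_MASK    ((1ULL << SWP_TYPE_BITS) - 1)
#define SWP_OFFSET_SHIFT 8
#define SWP_OFFSET_BITS  50                              /* bits 8-57 */
#define SWP_OFFSET_MASK  ((1ULL << SWP_OFFSET_BITS) - 1)

/* Pack type, offset and the exclusive marker into a 64-bit swap pte. */
uint64_t swp_encode(unsigned int type, uint64_t offset, int exclusive)
{
    assert(type <= SWP_TYPE_MASK && offset <= SWP_OFFSET_MASK);
    return ((uint64_t)type << SWP_TYPE_SHIFT) |
           (offset << SWP_OFFSET_SHIFT) |
           (exclusive ? SWP_EXCLUSIVE : 0);
}

unsigned int swp_type(uint64_t pte)
{
    return (unsigned int)((pte >> SWP_TYPE_SHIFT) & SWP_TYPE_MASK);
}

uint64_t swp_offset(uint64_t pte)
{
    return (pte >> SWP_OFFSET_SHIFT) & SWP_OFFSET_MASK;
}

int swp_exclusive(uint64_t pte)
{
    return (pte & SWP_EXCLUSIVE) != 0;
}
```

Taking one bit for the exclusive marker shrinks the type field from six
bits to five, so the sketch allows 32 swap types instead of 64.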

-- 
Thanks,

David / dhildenb



Re: [PATCH 08/14] powerpc/rtas: call enter_rtas in real-mode on 64-bit

2022-03-17 Thread Laurent Dufour
On 08/03/2022, 14:50:41, Nicholas Piggin wrote:
> This moves MSR save/restore and some real-mode juggling out of asm and
> into C code, simplifying things.
> 
> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/kernel/rtas.c   | 15 ---
>  arch/powerpc/kernel/rtas_entry.S | 32 +---
>  2 files changed, 17 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 6b5892d6a56b..87ede1877816 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -47,13 +47,22 @@
>  /* This is here deliberately so it's only used in this file */
>  void enter_rtas(unsigned long);
>  
> -static inline void do_enter_rtas(unsigned long args)
> +static noinline void do_enter_rtas(unsigned long args)
>  {
>   BUG_ON(!irqs_disabled());
>  
> - hard_irq_disable(); /* Ensure MSR[EE] is disabled on PPC64 */
> + if (IS_ENABLED(CONFIG_PPC64)) {
> + unsigned long msr;
>  
> - enter_rtas(args);
> + hard_irq_disable();
> +
> + msr = mfmsr();
> + mtmsr(msr & ~(MSR_IR|MSR_DR));

Further testing on top of this series shows that switching MSR_DR off before
calling enter_rtas() generates a DSI when enter_rtas() accesses the stack.
This can happen if the task stack is mapped beyond the VRMA.

Furthermore, there is no real need to run enter_rtas() in real mode (IR and
DR unset) because the MSR will be set to real mode when doing rfid, see below.

> + enter_rtas(args);
> + mtmsr(msr);
> + } else {
> + enter_rtas(args);
> + }
>  
>   srr_regs_clobbered(); /* rtas uses SRRs, invalidate */
>  }
> diff --git a/arch/powerpc/kernel/rtas_entry.S 
> b/arch/powerpc/kernel/rtas_entry.S
> index 5f65ea4436c6..292551684bbd 100644
> --- a/arch/powerpc/kernel/rtas_entry.S
> +++ b/arch/powerpc/kernel/rtas_entry.S
> @@ -84,14 +84,11 @@ _GLOBAL(enter_rtas)
>   li  r0,0
>   mtcrr0
>  
> - mfmsr   r6
> -
> - /* Unfortunately, the stack pointer and the MSR are also clobbered,
> -  * so they are saved in the PACA which allows us to restore
> -  * our original state after RTAS returns.
> + /*
> +  * The stack pointer is clobbered, so it is saved in the PACA which
> +  * allows us to restore our original state after RTAS returns.
>*/
>   std r1,PACAR1(r13)
> - std r6,PACASAVEDMSR(r13)
>  
>   /* Setup our real return addr */
>   LOAD_REG_ADDR(r4,rtas_return_loc)
> @@ -100,7 +97,6 @@ _GLOBAL(enter_rtas)
>  
>   LOAD_REG_IMMEDIATE(r6, MSR_ME)
>  
> -__enter_rtas:
>   LOAD_REG_ADDR(r4, rtas)
>   ld  r5,RTASENTRY(r4)/* get the rtas->entry value */
>   ld  r4,RTASBASE(r4) /* get the rtas->base value */
> @@ -112,6 +108,7 @@ __enter_rtas:
>   mtspr   SPRN_SRR1,r6
>   RFI_TO_KERNEL

rfid will load the MSR with the value stored in SRR1 (taken from r6) and so
switch to real mode. This is why there is no need to switch to real mode
any earlier.

>   b   .   /* prevent speculative execution */
> +_ASM_NOKPROBE_SYMBOL(enter_rtas)
>  
>  rtas_return_loc:
>   FIXUP_ENDIAN
> @@ -127,29 +124,10 @@ rtas_return_loc:
>   sync
>   mtmsrd  r6
>  
> - /* relocation is off at this point */
>   GET_PACA(r13)
>  
> - bcl 20,31,$+4
> -0:   mflrr3
> - ld  r3,(1f-0b)(r3)  /* get &rtas_restore_regs */
> -
>   ld  r1,PACAR1(r13)  /* Restore our SP */
> - ld  r4,PACASAVEDMSR(r13)/* Restore our MSR */
>  
> - mtspr   SPRN_SRR0,r3
> - mtspr   SPRN_SRR1,r4
> - RFI_TO_KERNEL
This was turning on MSR_DR and MSR_IR so that rtas_restore_regs could access
the stack even if it is beyond the VRMA.

This patch breaks that and generates a panic when the task's stack is
beyond the VRMA.

> - b   .   /* prevent speculative execution */
> -_ASM_NOKPROBE_SYMBOL(enter_rtas)
> -_ASM_NOKPROBE_SYMBOL(__enter_rtas)
> -_ASM_NOKPROBE_SYMBOL(rtas_return_loc)
> -
> - .align  3
> -1:   .8byte  rtas_restore_regs
> -
> -rtas_restore_regs:
> - /* relocation is on at this point */
>   REST_GPR(2, r1) /* Restore the TOC */
>   REST_NVGPRS(r1) /* Restore the non-volatiles */
>  
> @@ -169,5 +147,5 @@ rtas_restore_regs:
>  
>   mtlrr0
>   blr /* return to caller */
> -
> +_ASM_NOKPROBE_SYMBOL(rtas_return_loc)
>  #endif /* CONFIG_PPC32 */



Re: [PATCH] macintosh: macio-adb: Fix warning comparing pointer to 0

2022-03-17 Thread 白浩文
Sorry, that's my fault. I've sent it again.

From: Christophe Leroy 
Sent: 17 March 2022 16:54:16
To: 白浩文; b...@kernel.crashing.org; masahi...@kernel.org; adobri...@gmail.com
Cc: linuxppc-dev@lists.ozlabs.org; linux-ker...@vger.kernel.org
Subject: Re: [PATCH] macintosh: macio-adb: Fix warning comparing pointer to 0

Le 17/03/2022 à 03:30, Haowen Bai a écrit :
> Avoid comparing pointers with 0, to make the code clearer.
>
> Signed-off-by: Haowen Bai 

This patch doesn't apply:

Applying patch #1606366 using "git am -s -3 -m"
Description: macintosh: macio-adb: Fix warning comparing pointer to 0
Applying: macintosh: macio-adb: Fix warning comparing pointer to 0
error: corrupt patch at line 37
error: could not build fake ancestor
Patch failed at 0001 macintosh: macio-adb: Fix warning comparing pointer
to 0
hint: Use 'git am --show-current-patch=diff' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
'git am' failed with exit status 128



> ---
>   drivers/macintosh/macio-adb.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/macintosh/macio-adb.c b/drivers/macintosh/macio-adb.c
> index dc634c2..51afa46 100644
> --- a/drivers/macintosh/macio-adb.c
> +++ b/drivers/macintosh/macio-adb.c
> @@ -97,7 +97,7 @@ int macio_init(void)
>   unsigned int irq;
>
>   adbs = of_find_compatible_node(NULL, "adb", "chrp,adb0");
> - if (adbs == 0)
> + if (!adbs)
>   return -ENXIO;
>
>   if (of_address_to_resource(adbs, 0, &r)) {
> @@ -180,7 +180,7 @@ static int macio_send_request(struct adb_request *req, 
> int sync)
>   req->reply_len = 0;
>
>   spin_lock_irqsave(&macio_lock, flags);
> - if (current_req != 0) {
> + if (current_req) {
>   last_req->next = req;
>   last_req = req;
>   } else {
> @@ -210,7 +210,7 @@ static irqreturn_t macio_adb_interrupt(int irq, void *arg)
>   spin_lock(&macio_lock);
>   if (in_8(&adb->intr.r) & TAG) {
>   handled = 1;
> - if ((req = current_req) != 0) {
> + req = current_req;
> + if (req) {
>   /* put the current request in */
>   for (i = 0; i < req->nbytes; ++i)
>   out_8(&adb->data[i].r, req->data[i]);


[PATCH] macintosh: macio-adb: Fix warning comparing pointer to 0

2022-03-17 Thread Haowen Bai
Avoid comparing pointers with 0, to make the code clearer.

Signed-off-by: Haowen Bai 
---
 drivers/macintosh/macio-adb.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/macintosh/macio-adb.c b/drivers/macintosh/macio-adb.c
index dc634c2..b7d287a 100644
--- a/drivers/macintosh/macio-adb.c
+++ b/drivers/macintosh/macio-adb.c
@@ -97,7 +97,7 @@ int macio_init(void)
unsigned int irq;
 
adbs = of_find_compatible_node(NULL, "adb", "chrp,adb0");
-   if (adbs == 0)
+   if (!adbs)
return -ENXIO;
 
if (of_address_to_resource(adbs, 0, &r)) {
@@ -180,7 +180,7 @@ static int macio_send_request(struct adb_request *req, int sync)
req->reply_len = 0;
 
spin_lock_irqsave(&macio_lock, flags);
-   if (current_req != 0) {
+   if (current_req) {
last_req->next = req;
last_req = req;
} else {
@@ -210,7 +210,8 @@ static irqreturn_t macio_adb_interrupt(int irq, void *arg)
spin_lock(&macio_lock);
if (in_8(&adb->intr.r) & TAG) {
handled = 1;
-   if ((req = current_req) != 0) {
+   req = current_req;
+   if (req) {
/* put the current request in */
for (i = 0; i < req->nbytes; ++i)
out_8(&adb->data[i].r, req->data[i]);
-- 
2.7.4
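
The idiom applied by the hunks above can be shown in isolation. The sketch
below is standalone C, not driver code: struct request and the function
names are hypothetical. It compares the old form (assignment inside the
condition, compared against 0) with the form the patch switches to, and
checks that both behave the same.

```c
#include <assert.h>
#include <stddef.h>

struct request {
    struct request *next;
    int nbytes;
};

/* Old style: assignment buried in the condition, compared against 0. */
int pending_bytes_old(struct request *current_req)
{
    struct request *req;

    if ((req = current_req) != 0)
        return req->nbytes;
    return -1;
}

/* New style: separate assignment, then a plain truth test on the pointer. */
int pending_bytes_new(struct request *current_req)
{
    struct request *req;

    req = current_req;
    if (req)
        return req->nbytes;
    return -1;
}

/* Both forms must agree for NULL and non-NULL input. */
void check_equivalent(struct request *r)
{
    assert(pending_bytes_old(r) == pending_bytes_new(r));
}
```

The generated code is expected to be the same either way; the change only
affects readability and silences the static-checker warning.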



Re: [PATCH] macintosh: macio-adb: Fix warning comparing pointer to 0

2022-03-17 Thread Christophe Leroy


Le 17/03/2022 à 03:30, Haowen Bai a écrit :
> Avoid comparing pointers with 0, to make the code clearer.
> 
> Signed-off-by: Haowen Bai 

This patch doesn't apply:

Applying patch #1606366 using "git am -s -3 -m"
Description: macintosh: macio-adb: Fix warning comparing pointer to 0
Applying: macintosh: macio-adb: Fix warning comparing pointer to 0
error: corrupt patch at line 37
error: could not build fake ancestor
Patch failed at 0001 macintosh: macio-adb: Fix warning comparing pointer 
to 0
hint: Use 'git am --show-current-patch=diff' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
'git am' failed with exit status 128



> ---
>   drivers/macintosh/macio-adb.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/macintosh/macio-adb.c b/drivers/macintosh/macio-adb.c
> index dc634c2..51afa46 100644
> --- a/drivers/macintosh/macio-adb.c
> +++ b/drivers/macintosh/macio-adb.c
> @@ -97,7 +97,7 @@ int macio_init(void)
>   unsigned int irq;
>   
>   adbs = of_find_compatible_node(NULL, "adb", "chrp,adb0");
> - if (adbs == 0)
> + if (!adbs)
>   return -ENXIO;
>   
>   if (of_address_to_resource(adbs, 0, &r)) {
> @@ -180,7 +180,7 @@ static int macio_send_request(struct adb_request *req, 
> int sync)
>   req->reply_len = 0;
>   
>   spin_lock_irqsave(&macio_lock, flags);
> - if (current_req != 0) {
> + if (current_req) {
>   last_req->next = req;
>   last_req = req;
>   } else {
> @@ -210,7 +210,7 @@ static irqreturn_t macio_adb_interrupt(int irq, void *arg)
>   spin_lock(&macio_lock);
>   if (in_8(&adb->intr.r) & TAG) {
>   handled = 1;
> - if ((req = current_req) != 0) {
> + req = current_req;
> + if (req) {
>   /* put the current request in */
>   for (i = 0; i < req->nbytes; ++i)
>   out_8(&adb->data[i].r, req->data[i]);

Re: [PATCH] macintosh: smu: Fix warning comparing pointer to 0

2022-03-17 Thread Christophe Leroy


Le 17/03/2022 à 03:44, Haowen Bai a écrit :
> Avoid comparing pointers with 0, to make the code clearer.
> 
> Signed-off-by: Haowen Bai 

This change is already pending at 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210825061838.69746-1-deng.changch...@zte.com.cn/

Thanks
Christophe


> ---
>   drivers/macintosh/smu.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/macintosh/smu.c b/drivers/macintosh/smu.c
> index a4fbc3f..d72d073 100644
> --- a/drivers/macintosh/smu.c
> +++ b/drivers/macintosh/smu.c
> @@ -1087,7 +1087,7 @@ static int smu_open(struct inode *inode, struct file 
> *file)
>   unsigned long flags;
>   
>   pp = kzalloc(sizeof(struct smu_private), GFP_KERNEL);
> - if (pp == 0)
> + if (!pp)
>   return -ENOMEM;
>   spin_lock_init(&pp->lock);
>   pp->mode = smu_file_commands;
> @@ -1254,7 +1254,7 @@ static __poll_t smu_fpoll(struct file *file, poll_table 
> *wait)
>   __poll_t mask = 0;
>   unsigned long flags;
>   
> - if (pp == 0)
> + if (!pp)
>   return 0;
>   
>   if (pp->mode == smu_file_commands) {
> @@ -1277,7 +1277,7 @@ static int smu_release(struct inode *inode, struct file 
> *file)
>   unsigned long flags;
>   unsigned int busy;
>   
> - if (pp == 0)
> + if (!pp)
>   return 0;
>   
>   file->private_data = NULL;

Re: [PATCH] macintosh: via-cuda: Fix warning comparing pointer to 0

2022-03-17 Thread Christophe Leroy
Hi,

Le 17/03/2022 à 03:40, Haowen Bai a écrit :
> Avoid comparing pointers with 0, to make the code clearer.

We already have this change waiting in the queue, see 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20220214010558.130201-1-yang@linux.alibaba.com/

Thanks
Christophe

> 
> Signed-off-by: Haowen Bai 
> ---
>   drivers/macintosh/via-cuda.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/macintosh/via-cuda.c b/drivers/macintosh/via-cuda.c
> index cd267392..05a3cd9 100644
> --- a/drivers/macintosh/via-cuda.c
> +++ b/drivers/macintosh/via-cuda.c
> @@ -236,10 +236,10 @@ int __init find_via_cuda(void)
>   const u32 *reg;
>   int err;
>   
> -if (vias != 0)
> +if (vias)
>   return 1;
>   vias = of_find_node_by_name(NULL, "via-cuda");
> -if (vias == 0)
> +if (!vias)
>   return 0;
>   
>   reg = of_get_property(vias, "reg", NULL);
> @@ -517,7 +517,7 @@ cuda_write(struct adb_request *req)
>   req->reply_len = 0;
>   
>   spin_lock_irqsave(&cuda_lock, flags);
> -if (current_req != 0) {
> +if (current_req) {
>   last_req->next = req;
>   last_req = req;
>   } else {