Re: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

2023-10-31 Thread David Matlack
On Tue, Oct 31, 2023 at 2:36 PM Sean Christopherson  wrote:
> On Tue, Oct 31, 2023, David Matlack wrote:
> > On 2023-10-27 11:21 AM, Sean Christopherson wrote:
> > > Introduce an ioctl(), KVM_CREATE_GUEST_MEMFD, to allow creating file-based
> > > memory that is tied to a specific KVM virtual machine and whose primary
> > > purpose is to serve guest memory.
>
> > Maybe can you sketch out how you see this proposal being extensible to
> > using guest_memfd for shared mappings?
>
> For in-place conversions, e.g. pKVM, no additional guest_memfd is needed.  
> What's
> missing there is the ability to (safely) mmap() guest_memfd, e.g. KVM needs to
> ensure there are no outstanding references when converting back to private.
>
> For TDX/SNP, assuming we don't find a performant and robust way to do in-place
> conversions, a second fd+offset pair would be needed.

Is there a way to support non-in-place conversions within a single guest_memfd?


Re: [ppc64le] WARN at crypto/testmgr.c:5804

2023-10-31 Thread Dimitri John Ledkov
On Tue, 31 Oct 2023 at 14:09, Sachin Sant  wrote:
>
> Following warning is observed during boot of latest -next
> kernel (6.6.0-rc7-next-20231030 and todays -next) on IBM Power server.
>
> [ 0.085775] workingset: timestamp_bits=38 max_order=20 bucket_order=0
> [ 0.085801] zbud: loaded
> [ 0.086473] [ cut here ]
> [ 0.086477] WARNING: CPU: 23 PID: 211 at crypto/testmgr.c:5804 
> alg_test.part.33+0x308/0x740
> [ 0.086486] Modules linked in:
> [ 0.086489] CPU: 23 PID: 211 Comm: cryptomgr_test Not tainted 
> 6.6.0-rc7-next-20231030 #1
> [ 0.086493] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf06 
> of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
> [ 0.086497] NIP: c0765068 LR: c0764ff4 CTR: c075da00
> [ 0.086500] REGS: ced7bb50 TRAP: 0700 Not tainted 
> (6.6.0-rc7-next-20231030)
> [ 0.086503] MSR: 80029033  CR: 8284 XER: 
> 20040002
> [ 0.086511] CFAR: c0765318 IRQMASK: 1
> GPR00: c0764ff4 ced7bdf0 c1482000 0002
> GPR04: ced7be60 000e 002f fffe
> GPR08: ff00 0001 0008 
> GPR12: c075da00 c00aa7cec700 c019da88 c6df9cc0
> GPR16:   c1309f98 
> GPR20: c1308d50 c1308d98 c0ffeaf0 c1309fb0
> GPR24: c0ffb330 c0ffdf30 0400 c00019bfd480
> GPR28: 0002 c00019bfd400 c2bcf633 000e
> [ 0.086547] NIP [c0765068] alg_test.part.33+0x308/0x740
> [ 0.086552] LR [c0764ff4] alg_test.part.33+0x294/0x740
> [ 0.086556] Call Trace:
> [ 0.086557] [ced7bdf0] [c0764ff4] 
> alg_test.part.33+0x294/0x740 (unreliable)
> [ 0.086563] [ced7bf60] [c075da34] cryptomgr_test+0x34/0x70
> [ 0.086568] [ced7bf90] [c019dbb8] kthread+0x138/0x140
> [ 0.086573] [ced7bfe0] [c000df98] 
> start_kernel_thread+0x14/0x18
> [ 0.086578] Code: fb210138 fb810150 3af76d20 3b80 3a526d38 3a946d50 
> 3ab56d98 3b380040 3ad837c0 2f9c 7d301026 5529f7fe <0b09> 7f890034 
> 5529d97e 419d03a4
> [ 0.086589] ---[ end trace  ]---
> [ 0.086592] testmgr: alg_test_descs entries in wrong order: 
> 'pkcs1pad(rsa,sha512)' before 'pkcs1pad(rsa,sha3-256)’
>
> Git bisect points to following patch:
> commit ee62afb9d02dd279a7b73245614f13f8fe777a6d
> crypto: rsa-pkcs1pad - Add FIPS 202 SHA-3 support
>
> - Sachin

Patch to address this was submitted at
https://lore.kernel.org/all/20231027195206.46643-1-ebigg...@kernel.org/
and should address this issue.

--
okurrr,

Dimitri


Re: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

2023-10-31 Thread Paolo Bonzini
On Tue, Oct 31, 2023 at 11:13 PM Sean Christopherson  wrote:
> On Tue, Oct 31, 2023, Fuad Tabba wrote:
> > On Fri, Oct 27, 2023 at 7:23 PM Sean Christopherson  
> > wrote:
> Since we now know that at least pKVM will use guest_memfd for shared memory, 
> and
> odds are quite good that "regular" VMs will also do the same, i.e. will want
> guest_memfd with the concept of private memory, I agree that we should avoid
> PRIVATE.
>
> Though I vote for KVM_MEM_GUEST_MEMFD (or KVM_MEM_GUEST_MEMFD_VALID or
> KVM_MEM_USE_GUEST_MEMFD).  I.e. do our best to avoid ambiguity between 
> referring
> to "guest memory" at-large and guest_memfd.

I was going to propose KVM_MEM_HAS_GUESTMEMFD.  Any option
is okay for me so, if no one complains, I'll go for KVM_MEM_GUESTMEMFD
(no underscore because I found the repeated "_MEM" distracting).

Paolo



Re: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

2023-10-31 Thread Sean Christopherson
On Tue, Oct 31, 2023, Fuad Tabba wrote:
> Hi,
> 
> On Fri, Oct 27, 2023 at 7:23 PM Sean Christopherson  wrote:
> 
> ...
> 
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index e2252c748fd6..e82c69d5e755 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -6079,6 +6079,15 @@ applied.
> >  :Parameters: struct kvm_userspace_memory_region2 (in)
> >  :Returns: 0 on success, -1 on error
> >
> > +KVM_SET_USER_MEMORY_REGION2 is an extension to KVM_SET_USER_MEMORY_REGION 
> > that
> > +allows mapping guest_memfd memory into a guest.  All fields shared with
> > +KVM_SET_USER_MEMORY_REGION identically.  Userspace can set KVM_MEM_PRIVATE 
> > in
> > +flags to have KVM bind the memory region to a given guest_memfd range of
> > +[guest_memfd_offset, guest_memfd_offset + memory_size].  The target 
> > guest_memfd
> > +must point at a file created via KVM_CREATE_GUEST_MEMFD on the current VM, 
> > and
> > +the target range must not be bound to any other memory region.  All 
> > standard
> > +bounds checks apply (use common sense).
> > +
> 
> Bikeshedding here: Not sure if KVM_MEM_PRIVATE is the best name for
> this. It gets confusing with KVM_MEMORY_ATTRIBUTE_PRIVATE, i.e., that
> a region marked as KVM_MEM_PRIVATE is only potentially private. It did
> confuse the rest of the team when I walked them through a previous
> version of this code once. Would something like KVM_MEM_GUESTMEM make
> more sense?

Heh, deja vu.  We discussed this back in v7[*], and I came to the conclusion 
that
choosing a name that wasn't explicitly tied to private memory wasn't justified.
But that was before a KVM-owned guest_memfd was even an idea, and thus before we
had anything close to a real use case.

Since we now know that at least pKVM will use guest_memfd for shared memory, and
odds are quite good that "regular" VMs will also do the same, i.e. will want
guest_memfd with the concept of private memory, I agree that we should avoid
PRIVATE.

Though I vote for KVM_MEM_GUEST_MEMFD (or KVM_MEM_GUEST_MEMFD_VALID or
KVM_MEM_USE_GUEST_MEMFD).  I.e. do our best to avoid ambiguity between referring
to "guest memory" at-large and guest_memfd.

Copying a few relevant points from v7 to save a click or three.

 : I don't have a concrete use case (this is a recent idea on my end), but 
since we're
 : already adding fd-based memory, I can't think of a good reason not make it 
more generic
 : for not much extra cost.  And there are definitely classes of VMs for which 
fd-based
 : memory would Just Work, e.g. large VMs that are never oversubscribed on 
memory don't
 : need to support reclaim, so the fact that fd-based memslots won't support 
page aging
 : (among other things) right away is a non-issue.

...

 : Hrm, but basing private memory on top of a generic FD_VALID would 
effectively require
 : shared memory to use hva-based memslots for confidential VMs.  That'd yield 
a very
 : weird API, e.g. non-confidential VMs could be backed entirely by fd-based 
memslots,
 : but confidential VMs would be forced to use hva-based memslots.
 : 
 : Ignore this idea for now.  If there's an actual use case for generic 
fd-based memory
 : then we'll want a separate flag, fd, and offset, i.e. that support could be 
added
 : independent of KVM_MEM_PRIVATE.

...

 : One alternative would be to call it KVM_MEM_PROTECTED.  That shouldn't cause
 : problems for the known use of "private" (TDX and SNP), and it gives us a 
little
 : wiggle room, e.g. if we ever get a use case where VMs can share memory that 
is
 : otherwise protected.
 : 
 : That's a pretty big "if" though, and odds are good we'd need more memslot 
flags and
 : fd+offset pairs to allow differentiating "private" vs. "protected-shared" 
without
 : forcing userspace to punch holes in memslots, so I don't know that hedging 
now will
 : buy us anything.
 : 
 : So I'd say that if people think KVM_MEM_PRIVATE brings additional and 
meaningful
 : clarity over KVM_MEM_PROTECTECD, then lets go with PRIVATE.  But if 
PROTECTED is
 : just as good, go with PROTECTED as it gives us a wee bit of wiggle room for 
the
 : future.

[*] https://lore.kernel.org/all/yuh0ikhoh+tck...@google.com
 
> > -See KVM_SET_USER_MEMORY_REGION.
> > +A KVM_MEM_PRIVATE region _must_ have a valid guest_memfd (private memory) 
> > and
> > +userspace_addr (shared memory).  However, "valid" for userspace_addr simply
> > +means that the address itself must be a legal userspace address.  The 
> > backing
> > +mapping for userspace_addr is not required to be valid/populated at the 
> > time of
> > +KVM_SET_USER_MEMORY_REGION2, e.g. shared memory can be lazily 
> > mapped/allocated
> > +on-demand.
> 
> Regarding requiring that a private region have both a valid
> guest_memfd and a userspace_addr, should this be
> implementation-specific? In pKVM at least, all regions for protected
> VMs are private, and KVM doesn't care about the host userspace address
> for those regions 

Re: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

2023-10-31 Thread Sean Christopherson
On Tue, Oct 31, 2023, David Matlack wrote:
> On 2023-10-27 11:21 AM, Sean Christopherson wrote:
> > Introduce an ioctl(), KVM_CREATE_GUEST_MEMFD, to allow creating file-based
> > memory that is tied to a specific KVM virtual machine and whose primary
> > purpose is to serve guest memory.
> > 
> > A guest-first memory subsystem allows for optimizations and enhancements
> > that are kludgy or outright infeasible to implement/support in a generic
> > memory subsystem.  With guest_memfd, guest protections and mapping sizes
> > are fully decoupled from host userspace mappings.   E.g. KVM currently
> > doesn't support mapping memory as writable in the guest without it also
> > being writable in host userspace, as KVM's ABI uses VMA protections to
> > define the allow guest protection.  Userspace can fudge this by
> > establishing two mappings, a writable mapping for the guest and readable
> > one for itself, but that’s suboptimal on multiple fronts.
> > 
> > Similarly, KVM currently requires the guest mapping size to be a strict
> > subset of the host userspace mapping size, e.g. KVM doesn’t support
> > creating a 1GiB guest mapping unless userspace also has a 1GiB guest
> > mapping.  Decoupling the mappings sizes would allow userspace to precisely
> > map only what is needed without impacting guest performance, e.g. to
> > harden against unintentional accesses to guest memory.
> > 
> > Decoupling guest and userspace mappings may also allow for a cleaner
> > alternative to high-granularity mappings for HugeTLB, which has reached a
> > bit of an impasse and is unlikely to ever be merged.
> > 
> > A guest-first memory subsystem also provides clearer line of sight to
> > things like a dedicated memory pool (for slice-of-hardware VMs) and
> > elimination of "struct page" (for offload setups where userspace _never_
> > needs to mmap() guest memory).
> 
> All of these use-cases involve using guest_memfd for shared pages, but
> this entire series sets up KVM to only use guest_memfd for private
> pages.
> 
> For example, the per-page attributes are a property of a KVM VM, not the
> underlying guest_memfd. So that implies we will need separate
> guest_memfds for private and shared pages. But a given memslot can have
> a mix of private and shared pages. So that implies a memslot will need
> to support 2 guest_memfds?

Yes, someday this may be true.  Allowing guest_memfd (it was probably called
something else at that point) for "regular" memory was discussed in I think v10?
We made a concious decision to defer supporting 2 guest_memfds because it isn't 
strictly
necessary to support the TDX/SNP use cases for which all of this was initially
designed, and adding a second guest_memfd and the infrastructure needed to let
userspace map a guest_memfd can be done on top with minimal overhead.

> But the UAPI only allows 1 and uses the HVA for shared mappings.
> 
> My initial reaction after reading through this series is that the
> per-page private/shared should be a property of the guest_memfd, not the
> VM. Maybe it would even be cleaner in the long-run to make all memory
> attributes a property of the guest_memfd. That way we can scope the
> support to only guest_memfds and not have to worry about making per-page
> attributes work with "legacy" HVA-based memslots.

Making the private vs. shared state a property of the guest_memfd doesn't work
for TDX and SNP.  We (upstream x86 and KVM maintainers) have taken a hard stance
that in-place conversion will not be allowed for TDX/SNP due to the ease with
which a misbehaving userspace and/or guest can crash the host.

We'd also be betting that there would *never* be a use case for per-gfn 
attributes
for non-standard memory, e.g. virtio-gpu buffers, any kind of device memory, 
etc.

We'd also effectively be signing up to either support swap and page migration in
guest_memfd, or make those mutually exclusive with per-gfn attributes too.

guest_memfd is only intended for guest DRAM, and if I get my way, will never 
support
swap (page migration is less scary).  I.e. guest_memfd isn't intended to be a
one-size-fits-all solution, nor is it intended to wholesale replace memslots,
which is effectively what we'd be doing by deprecating hva-based guest memory.

And ignoring all that, the ABI would end up being rather bizarre due to way 
guest_memfd
interacts with memslots.  guest_memfd itself has no real notion of gfns, i.e. 
the
shared vs. private state would be tied to a file offset, not a gfn.  That's a 
solvable
problem, e.g. we could make a gfn:offset binding "sticky", but that would edd 
extra
complexity to the ABI, and AFAICT wouldn't buy us that much, if anything.

> Maybe can you sketch out how you see this proposal being extensible to
> using guest_memfd for shared mappings?

For in-place conversions, e.g. pKVM, no additional guest_memfd is needed.  
What's
missing there is the ability to (safely) mmap() guest_memfd, e.g. KVM needs to
ensure there are no outstanding references when 

Re: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

2023-10-31 Thread David Matlack
On 2023-10-27 11:21 AM, Sean Christopherson wrote:
> Introduce an ioctl(), KVM_CREATE_GUEST_MEMFD, to allow creating file-based
> memory that is tied to a specific KVM virtual machine and whose primary
> purpose is to serve guest memory.
> 
> A guest-first memory subsystem allows for optimizations and enhancements
> that are kludgy or outright infeasible to implement/support in a generic
> memory subsystem.  With guest_memfd, guest protections and mapping sizes
> are fully decoupled from host userspace mappings.   E.g. KVM currently
> doesn't support mapping memory as writable in the guest without it also
> being writable in host userspace, as KVM's ABI uses VMA protections to
> define the allow guest protection.  Userspace can fudge this by
> establishing two mappings, a writable mapping for the guest and readable
> one for itself, but that’s suboptimal on multiple fronts.
> 
> Similarly, KVM currently requires the guest mapping size to be a strict
> subset of the host userspace mapping size, e.g. KVM doesn’t support
> creating a 1GiB guest mapping unless userspace also has a 1GiB guest
> mapping.  Decoupling the mappings sizes would allow userspace to precisely
> map only what is needed without impacting guest performance, e.g. to
> harden against unintentional accesses to guest memory.
> 
> Decoupling guest and userspace mappings may also allow for a cleaner
> alternative to high-granularity mappings for HugeTLB, which has reached a
> bit of an impasse and is unlikely to ever be merged.
> 
> A guest-first memory subsystem also provides clearer line of sight to
> things like a dedicated memory pool (for slice-of-hardware VMs) and
> elimination of "struct page" (for offload setups where userspace _never_
> needs to mmap() guest memory).

All of these use-cases involve using guest_memfd for shared pages, but
this entire series sets up KVM to only use guest_memfd for private
pages.

For example, the per-page attributes are a property of a KVM VM, not the
underlying guest_memfd. So that implies we will need separate
guest_memfds for private and shared pages. But a given memslot can have
a mix of private and shared pages. So that implies a memslot will need
to support 2 guest_memfds? But the UAPI only allows 1 and uses the HVA
for shared mappings.

My initial reaction after reading through this series is that the
per-page private/shared should be a property of the guest_memfd, not the
VM. Maybe it would even be cleaner in the long-run to make all memory
attributes a property of the guest_memfd. That way we can scope the
support to only guest_memfds and not have to worry about making per-page
attributes work with "legacy" HVA-based memslots.

Maybe can you sketch out how you see this proposal being extensible to
using guest_memfd for shared mappings?


Re: [linux-next:master] BUILD REGRESSION c503e3eec382ac708ee7adf874add37b77c5d312

2023-10-31 Thread Dan Williams
Dan Williams wrote:
> Bjorn Helgaas wrote:
> > On Tue, Oct 31, 2023 at 04:35:23AM +0800, kernel test robot wrote:
> > > tree/branch: 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> > > branch HEAD: c503e3eec382ac708ee7adf874add37b77c5d312  Add linux-next 
> > > specific files for 20231030
> > > 
> > > Error/Warning reports:
> > > ... 
> > > https://lore.kernel.org/oe-kbuild-all/202310302206.pkr5ebdi-...@intel.com
> > 
> > > Error/Warning: (recently discovered and may have been fixed)
> > > 
> > > Warning: MAINTAINERS references a file that doesn't exist: 
> > > Documentation/devicetree/bindings/iio/imu/bosch,bma400.yaml
> > > aarch64-linux-ld: drivers/cxl/core/pci.c:921:(.text+0xbbc): undefined 
> > > reference to `pci_print_aer'
> > > ...
> > > arch/riscv/include/asm/mmio.h:67:(.text+0xd66): undefined reference to 
> > > `pci_print_aer'
> > > csky-linux-ld: pci.c:(.text+0x6e8): undefined reference to `pci_print_aer'
> > > drivers/cxl/core/pci.c:921: undefined reference to `pci_print_aer'
> > > drivers/cxl/core/pci.c:921:(.text+0xbc0): undefined reference to 
> > > `pci_print_aer'
> > > ...
> > > ld: drivers/cxl/core/pci.c:921: undefined reference to `pci_print_aer'
> > > loongarch64-linux-ld: drivers/cxl/core/pci.c:921:(.text+0xa38): undefined 
> > > reference to `pci_print_aer'
> > > pci.c:(.text+0x662): undefined reference to `pci_print_aer'
> > > powerpc-linux-ld: pci.c:(.text+0xf10): undefined reference to 
> > > `pci_print_aer'
> > > riscv64-linux-ld: pci.c:(.text+0x11ec): undefined reference to 
> > > `pci_print_aer'
> > 
> > I have no idea about the above (and all the similar ones below); I
> > assume they all have to do with
> > https://lore.kernel.org/r/20231018171713.1883517-13-rrich...@amd.com
> 
> Yes, I will get this fix folded into cxl/next:
> 
> diff --git a/include/linux/aer.h b/include/linux/aer.h
> index f6ea2f57d808..3db310c19ab7 100644
> --- a/include/linux/aer.h
> +++ b/include/linux/aer.h
> @@ -43,16 +43,20 @@ struct aer_capability_regs {
>  #if defined(CONFIG_PCIEAER)
>  int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>  int pcie_aer_is_native(struct pci_dev *dev);
> +void pci_print_aer(struct pci_dev *dev, int aer_severity,
> +   struct aer_capability_regs *aer);
>  #else
>  static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>  {
> return -EINVAL;
>  }
>  static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
> +static inline void pci_print_aer(struct pci_dev *dev, int aer_severity,
> +struct aer_capability_regs *aer)
> +{
> +}
>  #endif
>  
> -void pci_print_aer(struct pci_dev *dev, int aer_severity,
> -   struct aer_capability_regs *aer);
>  int cper_severity_to_aer(int cper_severity);
>  void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
>int severity, struct aer_capability_regs *aer_regs);

Actually, no this was my fault for inadvertantly including this patch in
the branch:

http://lore.kernel.org/r/653c7eb29265c_244c8f29...@dwillia2-xfh.jf.intel.com.notmuch

...which had the desired effect, but at the cost of thrashing
linux-next. Apologies for that.



Re: [linux-next:master] BUILD REGRESSION c503e3eec382ac708ee7adf874add37b77c5d312

2023-10-31 Thread Dan Williams
Bjorn Helgaas wrote:
> On Tue, Oct 31, 2023 at 04:35:23AM +0800, kernel test robot wrote:
> > tree/branch: 
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> > branch HEAD: c503e3eec382ac708ee7adf874add37b77c5d312  Add linux-next 
> > specific files for 20231030
> > 
> > Error/Warning reports:
> > ... 
> > https://lore.kernel.org/oe-kbuild-all/202310302206.pkr5ebdi-...@intel.com
> 
> > Error/Warning: (recently discovered and may have been fixed)
> > 
> > Warning: MAINTAINERS references a file that doesn't exist: 
> > Documentation/devicetree/bindings/iio/imu/bosch,bma400.yaml
> > aarch64-linux-ld: drivers/cxl/core/pci.c:921:(.text+0xbbc): undefined 
> > reference to `pci_print_aer'
> > ...
> > arch/riscv/include/asm/mmio.h:67:(.text+0xd66): undefined reference to 
> > `pci_print_aer'
> > csky-linux-ld: pci.c:(.text+0x6e8): undefined reference to `pci_print_aer'
> > drivers/cxl/core/pci.c:921: undefined reference to `pci_print_aer'
> > drivers/cxl/core/pci.c:921:(.text+0xbc0): undefined reference to 
> > `pci_print_aer'
> > ...
> > ld: drivers/cxl/core/pci.c:921: undefined reference to `pci_print_aer'
> > loongarch64-linux-ld: drivers/cxl/core/pci.c:921:(.text+0xa38): undefined 
> > reference to `pci_print_aer'
> > pci.c:(.text+0x662): undefined reference to `pci_print_aer'
> > powerpc-linux-ld: pci.c:(.text+0xf10): undefined reference to 
> > `pci_print_aer'
> > riscv64-linux-ld: pci.c:(.text+0x11ec): undefined reference to 
> > `pci_print_aer'
> 
> I have no idea about the above (and all the similar ones below); I
> assume they all have to do with
> https://lore.kernel.org/r/20231018171713.1883517-13-rrich...@amd.com

Yes, I will get this fix folded into cxl/next:

diff --git a/include/linux/aer.h b/include/linux/aer.h
index f6ea2f57d808..3db310c19ab7 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -43,16 +43,20 @@ struct aer_capability_regs {
 #if defined(CONFIG_PCIEAER)
 int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
 int pcie_aer_is_native(struct pci_dev *dev);
+void pci_print_aer(struct pci_dev *dev, int aer_severity,
+   struct aer_capability_regs *aer);
 #else
 static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
 {
return -EINVAL;
 }
 static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
+static inline void pci_print_aer(struct pci_dev *dev, int aer_severity,
+struct aer_capability_regs *aer)
+{
+}
 #endif
 
-void pci_print_aer(struct pci_dev *dev, int aer_severity,
-   struct aer_capability_regs *aer);
 int cper_severity_to_aer(int cper_severity);
 void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
   int severity, struct aer_capability_regs *aer_regs);


Re: [pci:controller/xilinx-xdma] BUILD REGRESSION 8d786149d78c7784144c7179e25134b6530b714b

2023-10-31 Thread Bjorn Helgaas
On Tue, Oct 31, 2023 at 09:59:29AM -0700, Nick Desaulniers wrote:
> On Tue, Oct 31, 2023 at 7:56 AM Bjorn Helgaas  wrote:
> > On Sat, Oct 28, 2023 at 08:22:54PM +0800, kernel test robot wrote:
> > > tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git 
> > > controller/xilinx-xdma
> > > branch HEAD: 8d786149d78c7784144c7179e25134b6530b714b  PCI: xilinx-xdma: 
> > > Add Xilinx XDMA Root Port driver
> > >
> > > Error/Warning ids grouped by kconfigs:
> > >
> > > clang_recent_errors
> > > `-- powerpc-pmac32_defconfig
> > > |-- 
> > > arch-powerpc-sysdev-grackle.c:error:unused-function-grackle_set_stg-Werror-Wunused-function
> > > |-- 
> > > arch-powerpc-xmon-xmon.c:error:unused-function-get_output_lock-Werror-Wunused-function
> > > `-- 
> > > arch-powerpc-xmon-xmon.c:error:unused-function-release_output_lock-Werror-Wunused-function
> >
> > This report is close to useless.  It doesn't show the complete error
> > message, it doesn't show how to reproduce the issue, and the pci -next
> > branch (including controller/xilinx-xdma) doesn't reference any of
> > these functions:
> >
> >   $ git grep -E "grackle_set_stg|get_output_lock|release_output_lock" | cat
> >   arch/powerpc/sysdev/grackle.c:static inline void grackle_set_stg(struct 
> > pci_controller* bp, int enable)
> >   arch/powerpc/sysdev/grackle.c:grackle_set_stg(hose, 1);
> >   arch/powerpc/xmon/xmon.c:static void get_output_lock(void)
> >   arch/powerpc/xmon/xmon.c:static void release_output_lock(void)
> >   arch/powerpc/xmon/xmon.c:static inline void get_output_lock(void) {}
> >   arch/powerpc/xmon/xmon.c:static inline void release_output_lock(void) {}
> >   arch/powerpc/xmon/xmon.c: get_output_lock();
> >   arch/powerpc/xmon/xmon.c: release_output_lock();
> >   arch/powerpc/xmon/xmon.c: get_output_lock();
> >   arch/powerpc/xmon/xmon.c: release_output_lock();
> >   arch/powerpc/xmon/xmon.c: get_output_lock();
> >   arch/powerpc/xmon/xmon.c: release_output_lock();
> >   arch/powerpc/xmon/xmon.c: get_output_lock();
> >   arch/powerpc/xmon/xmon.c: release_output_lock();
> >
> > That said, the unused functions do look legit:
> >
> > grackle_set_stg() is a static function and the only call is under
> > "#if 0".
> 
> Time to remove it then? Or is it a bug that it's not called?
> Otherwise the definition should be behind the same preprocessor guards
> as the caller.  Same for the below.

I don't really care whether we keep the warning or not.

My real complaint is that the 0-day report fingered
pci/controller/xilinx-xdma, which is completely unrelated, which is a
waste of time.

> > Same with get_output_lock() and release_output_lock(): they're static
> > and always defined in xmon.c, but only called if either CONFIG_SMP or
> > CONFIG_DEBUG_FS.
> >
> > But they're certainly not related to controller/xilinx-xdma, so I'm
> > going to ignore them.
> >
> > Bjorn
> >
> > P.S. Nathan & Nick, I cc'd you because of this earlier report that
> > also mentioned grackle_set_stg():
> > https://lore.kernel.org/lkml/202308121120.u2d3ypvt-...@intel.com/
> 
> 
> 
> -- 
> Thanks,
> ~Nick Desaulniers


Re: [pci:controller/xilinx-xdma] BUILD REGRESSION 8d786149d78c7784144c7179e25134b6530b714b

2023-10-31 Thread Nick Desaulniers
On Tue, Oct 31, 2023 at 7:56 AM Bjorn Helgaas  wrote:
>
> [+cc powerpc, clang folks]
>
> On Sat, Oct 28, 2023 at 08:22:54PM +0800, kernel test robot wrote:
> > tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git 
> > controller/xilinx-xdma
> > branch HEAD: 8d786149d78c7784144c7179e25134b6530b714b  PCI: xilinx-xdma: 
> > Add Xilinx XDMA Root Port driver
> >
> > Error/Warning ids grouped by kconfigs:
> >
> > clang_recent_errors
> > `-- powerpc-pmac32_defconfig
> > |-- 
> > arch-powerpc-sysdev-grackle.c:error:unused-function-grackle_set_stg-Werror-Wunused-function
> > |-- 
> > arch-powerpc-xmon-xmon.c:error:unused-function-get_output_lock-Werror-Wunused-function
> > `-- 
> > arch-powerpc-xmon-xmon.c:error:unused-function-release_output_lock-Werror-Wunused-function
>
> This report is close to useless.  It doesn't show the complete error
> message, it doesn't show how to reproduce the issue, and the pci -next
> branch (including controller/xilinx-xdma) doesn't reference any of
> these functions:
>
>   $ git grep -E "grackle_set_stg|get_output_lock|release_output_lock" | cat
>   arch/powerpc/sysdev/grackle.c:static inline void grackle_set_stg(struct 
> pci_controller* bp, int enable)
>   arch/powerpc/sysdev/grackle.c:grackle_set_stg(hose, 1);
>   arch/powerpc/xmon/xmon.c:static void get_output_lock(void)
>   arch/powerpc/xmon/xmon.c:static void release_output_lock(void)
>   arch/powerpc/xmon/xmon.c:static inline void get_output_lock(void) {}
>   arch/powerpc/xmon/xmon.c:static inline void release_output_lock(void) {}
>   arch/powerpc/xmon/xmon.c: get_output_lock();
>   arch/powerpc/xmon/xmon.c: release_output_lock();
>   arch/powerpc/xmon/xmon.c: get_output_lock();
>   arch/powerpc/xmon/xmon.c: release_output_lock();
>   arch/powerpc/xmon/xmon.c: get_output_lock();
>   arch/powerpc/xmon/xmon.c: release_output_lock();
>   arch/powerpc/xmon/xmon.c: get_output_lock();
>   arch/powerpc/xmon/xmon.c: release_output_lock();
>
> That said, the unused functions do look legit:
>
> grackle_set_stg() is a static function and the only call is under
> "#if 0".

Time to remove it then? Or is it a bug that it's not called?
Otherwise the definition should be behind the same preprocessor guards
as the caller.  Same for the below.

>
> Same with get_output_lock() and release_output_lock(): they're static
> and always defined in xmon.c, but only called if either CONFIG_SMP or
> CONFIG_DEBUG_FS.
>
> But they're certainly not related to controller/xilinx-xdma, so I'm
> going to ignore them.
>
> Bjorn
>
> P.S. Nathan & Nick, I cc'd you because of this earlier report that
> also mentioned grackle_set_stg():
> https://lore.kernel.org/lkml/202308121120.u2d3ypvt-...@intel.com/



-- 
Thanks,
~Nick Desaulniers


Re: Several kmemleak reports + "refcount_t: underflow; use-after-free" at boot when OF_UNITTEST + OF_OVERLAY is set (Kernel v6.6-rc6, PowerMac G5 11,2)

2023-10-31 Thread Erhard Furtner
On Mon, 30 Oct 2023 11:26:48 -0500
Rob Herring  wrote:

> The test tells you to expect a use-after-free...
> 
> > ---[ end trace <> ]--- ### dt-test ### pass 
> > of_unittest_lifecycle():3209
> > [ cut here ]
> > refcount_t: underflow; use-after-free.  
> 
> Then you get a use-after-free. Looks like it is working as designed.
> 
> I believe it's the same with kmemleak.
> 
> Note that running DT unittests also taints the kernel. That's because
> they are not meant to be run on a production system.
> 
> Rob

My bad, did not realize this is actually intended behaviour... Sorry for the 
noise!

Regards,
Erhard


[Bug 206203] kmemleak reports various leaks in drivers/of/unittest.c

2023-10-31 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=206203

Erhard F. (erhar...@mailbox.org) changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution|--- |OBSOLETE

--- Comment #26 from Erhard F. (erhar...@mailbox.org) ---
Closed in favour of:
https://lore.kernel.org/all/20231018233815.34a0417f@yea/#r

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v13 13/35] KVM: Introduce per-page memory attributes

2023-10-31 Thread David Matlack
On 2023-10-27 11:21 AM, Sean Christopherson wrote:
> From: Chao Peng 
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 89c1a991a3b8..df573229651b 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -808,6 +809,9 @@ struct kvm {
>  
>  #ifdef CONFIG_HAVE_KVM_PM_NOTIFIER
>   struct notifier_block pm_notifier;
> +#endif
> +#ifdef CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES
> + struct xarray mem_attr_array;

Please document how access to mem_attr_array is synchronized. If I'm
reading the code correctly I think it's...

  /* Protected by slots_locks (for writes) and RCU (for reads) */


Re: [pci:controller/xilinx-xdma] BUILD REGRESSION 8d786149d78c7784144c7179e25134b6530b714b

2023-10-31 Thread Segher Boessenkool
On Tue, Oct 31, 2023 at 09:56:00AM -0500, Bjorn Helgaas wrote:
> That said, the unused functions do look legit:
> 
> grackle_set_stg() is a static function and the only call is under
> "#if 0".

It is a static inline.  It is *normal* that that is uncalled.  It is
very similar to a macro that has no invocation.  The warning is
overeager.


Segher


Re: [linux-next:master] BUILD REGRESSION c503e3eec382ac708ee7adf874add37b77c5d312

2023-10-31 Thread Bjorn Helgaas
On Tue, Oct 31, 2023 at 04:35:23AM +0800, kernel test robot wrote:
> tree/branch: 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> branch HEAD: c503e3eec382ac708ee7adf874add37b77c5d312  Add linux-next 
> specific files for 20231030
> 
> Error/Warning reports:
> ... 
> https://lore.kernel.org/oe-kbuild-all/202310302206.pkr5ebdi-...@intel.com

> Error/Warning: (recently discovered and may have been fixed)
> 
> Warning: MAINTAINERS references a file that doesn't exist: 
> Documentation/devicetree/bindings/iio/imu/bosch,bma400.yaml
> aarch64-linux-ld: drivers/cxl/core/pci.c:921:(.text+0xbbc): undefined 
> reference to `pci_print_aer'
> ...
> arch/riscv/include/asm/mmio.h:67:(.text+0xd66): undefined reference to 
> `pci_print_aer'
> csky-linux-ld: pci.c:(.text+0x6e8): undefined reference to `pci_print_aer'
> drivers/cxl/core/pci.c:921: undefined reference to `pci_print_aer'
> drivers/cxl/core/pci.c:921:(.text+0xbc0): undefined reference to 
> `pci_print_aer'
> ...
> ld: drivers/cxl/core/pci.c:921: undefined reference to `pci_print_aer'
> loongarch64-linux-ld: drivers/cxl/core/pci.c:921:(.text+0xa38): undefined 
> reference to `pci_print_aer'
> pci.c:(.text+0x662): undefined reference to `pci_print_aer'
> powerpc-linux-ld: pci.c:(.text+0xf10): undefined reference to `pci_print_aer'
> riscv64-linux-ld: pci.c:(.text+0x11ec): undefined reference to `pci_print_aer'

I have no idea about the above (and all the similar ones below); I
assume they all have to do with
https://lore.kernel.org/r/20231018171713.1883517-13-rrich...@amd.com

> Unverified Error/Warning (likely false positive, please contact us if 
> interested):
> 
> drivers/pci/controller/dwc/pcie-rcar-gen4.c:439:15: warning: cast to smaller 
> integer type 'enum dw_pcie_device_mode' from 'const void *' 
> [-Wvoid-pointer-to-enum-cast]

Safe but annoying.  Yoshihiro, can you fix this by adding structs for
the of_device_id.data member instead of casting DW_PCIE_RC_TYPE and
DW_PCIE_EP_TYPE?  Examples here:

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/dwc/pci-dra7xx.c?id=v6.6#n557
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/dwc/pci-keystone.c?id=v6.6#n1069
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/dwc/pcie-artpec6.c?id=v6.6#n452
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/dwc/pcie-designware-plat.c?id=v6.6#n159
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/dwc/pcie-keembay.c?id=v6.6#n437
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/dwc/pcie-tegra194.c?id=v6.6#n2431

Siddharth, since you're looking at keystone v3.65, it looks to me like
DW_PCIE_VER_365A is currently broken because ks_pcie_rc_of_data
doesn't set .mode, so it defaults to zero, and it looks like we should
end up at the INVALID device type case here:

  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/dwc/pci-keystone.c?id=v6.6#n1285

> |-- arm64-buildonly-randconfig-r003-20220511
> |   `-- 
> aarch64-linux-ld:drivers-cxl-core-pci.c:(.text):undefined-reference-to-pci_print_aer

> |-- csky-randconfig-001-20231030
> |   |-- csky-linux-ld:pci.c:(.text):undefined-reference-to-pci_print_aer
> |   `-- pci.c:(.text):undefined-reference-to-pci_print_aer

> |-- i386-randconfig-141-20231030
> |   |-- ld:drivers-cxl-core-pci.c:undefined-reference-to-pci_print_aer

> |-- loongarch-randconfig-r014-20230225
> |   `-- drivers-cxl-core-pci.c:(.text):undefined-reference-to-pci_print_aer
> |-- loongarch-randconfig-r032-20220926
> |   `-- 
> loongarch64-linux-ld:drivers-cxl-core-pci.c:(.text):undefined-reference-to-pci_print_aer

> |-- powerpc-randconfig-003-20231016
> |   `-- powerpc-linux-ld:pci.c:(.text):undefined-reference-to-pci_print_aer

> |-- riscv-randconfig-r002-20220124
> |   `-- 
> arch-riscv-include-asm-mmio.h:(.text):undefined-reference-to-pci_print_aer
> |-- riscv-randconfig-r011-20220606
> |   `-- riscv64-linux-ld:pci.c:(.text):undefined-reference-to-pci_print_aer

> |-- x86_64-randconfig-x052-20230810
> |   `-- drivers-cxl-core-pci.c:undefined-reference-to-pci_print_aer


Re: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

2023-10-31 Thread Fuad Tabba
Hi,

On Fri, Oct 27, 2023 at 7:23 PM Sean Christopherson  wrote:

...

> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index e2252c748fd6..e82c69d5e755 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6079,6 +6079,15 @@ applied.
>  :Parameters: struct kvm_userspace_memory_region2 (in)
>  :Returns: 0 on success, -1 on error
>
> +KVM_SET_USER_MEMORY_REGION2 is an extension to KVM_SET_USER_MEMORY_REGION 
> that
> +allows mapping guest_memfd memory into a guest.  All fields shared with
> +KVM_SET_USER_MEMORY_REGION identically.  Userspace can set KVM_MEM_PRIVATE in
> +flags to have KVM bind the memory region to a given guest_memfd range of
> +[guest_memfd_offset, guest_memfd_offset + memory_size].  The target 
> guest_memfd
> +must point at a file created via KVM_CREATE_GUEST_MEMFD on the current VM, 
> and
> +the target range must not be bound to any other memory region.  All standard
> +bounds checks apply (use common sense).
> +

Bikeshedding here: Not sure if KVM_MEM_PRIVATE is the best name for
this. It gets confusing with KVM_MEMORY_ATTRIBUTE_PRIVATE, i.e., that
a region marked as KVM_MEM_PRIVATE is only potentially private. It did
confuse the rest of the team when I walked them through a previous
version of this code once. Would something like KVM_MEM_GUESTMEM make
more sense?

>  ::
>
>struct kvm_userspace_memory_region2 {
> @@ -6087,9 +6096,24 @@ applied.
> __u64 guest_phys_addr;
> __u64 memory_size; /* bytes */
> __u64 userspace_addr; /* start of the userspace allocated memory */
> +  __u64 guest_memfd_offset;
> +   __u32 guest_memfd;
> +   __u32 pad1;
> +   __u64 pad2[14];
>};
>
> -See KVM_SET_USER_MEMORY_REGION.
> +A KVM_MEM_PRIVATE region _must_ have a valid guest_memfd (private memory) and
> +userspace_addr (shared memory).  However, "valid" for userspace_addr simply
> +means that the address itself must be a legal userspace address.  The backing
> +mapping for userspace_addr is not required to be valid/populated at the time 
> of
> +KVM_SET_USER_MEMORY_REGION2, e.g. shared memory can be lazily 
> mapped/allocated
> +on-demand.

Regarding requiring that a private region have both a valid
guest_memfd and a userspace_addr, should this be
implementation-specific? In pKVM at least, all regions for protected
VMs are private, and KVM doesn't care about the host userspace address
for those regions even when part of the memory is shared.

> +When mapping a gfn into the guest, KVM selects shared vs. private, i.e 
> consumes
> +userspace_addr vs. guest_memfd, based on the gfn's 
> KVM_MEMORY_ATTRIBUTE_PRIVATE
> +state.  At VM creation time, all memory is shared, i.e. the PRIVATE attribute
> +is '0' for all gfns.  Userspace can control whether memory is shared/private 
> by
> +toggling KVM_MEMORY_ATTRIBUTE_PRIVATE via KVM_SET_MEMORY_ATTRIBUTES as 
> needed.

In pKVM, guest memory is private by default, and most of it will
remain so for the lifetime of the VM. Userspace could explicitly mark
all the guest's memory as private at initialization, but it would save
a slight amount of work. That said, I understand that it might be
better to be consistent across implementations.

...

> --- /dev/null
> +++ b/virt/kvm/guest_memfd.c
> @@ -0,0 +1,548 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

nit: should this include be first (to maintain alphabetical ordering
of the includes)?

> +
> +#include "kvm_mm.h"
> +
> +struct kvm_gmem {
> +   struct kvm *kvm;
> +   struct xarray bindings;
> +   struct list_head entry;
> +};
> +
> +static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index)
> +{
> +   struct folio *folio;
> +
> +   /* TODO: Support huge pages. */
> +   folio = filemap_grab_folio(inode->i_mapping, index);
> +   if (IS_ERR_OR_NULL(folio))
> +   return NULL;
> +
> +   /*
> +* Use the up-to-date flag to track whether or not the memory has been
> +* zeroed before being handed off to the guest.  There is no backing
> +* storage for the memory, so the folio will remain up-to-date until
> +* it's removed.
> +*
> +* TODO: Skip clearing pages when trusted firmware will do it when
> +* assigning memory to the guest.
> +*/
> +   if (!folio_test_uptodate(folio)) {
> +   unsigned long nr_pages = folio_nr_pages(folio);
> +   unsigned long i;
> +
> +   for (i = 0; i < nr_pages; i++)
> +   clear_highpage(folio_page(folio, i));
> +
> +   folio_mark_uptodate(folio);
> +   }
> +
> +   /*
> +* Ignore accessed, referenced, and dirty flags.  The memory is
> +* unevictable and there is no storage to write back to.
> +*/
> +   return folio;
> +}
> +
> +static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t 

Re: [pci:controller/xilinx-xdma] BUILD REGRESSION 8d786149d78c7784144c7179e25134b6530b714b

2023-10-31 Thread Bjorn Helgaas
[+cc powerpc, clang folks]

On Sat, Oct 28, 2023 at 08:22:54PM +0800, kernel test robot wrote:
> tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git 
> controller/xilinx-xdma
> branch HEAD: 8d786149d78c7784144c7179e25134b6530b714b  PCI: xilinx-xdma: Add 
> Xilinx XDMA Root Port driver
> 
> Error/Warning ids grouped by kconfigs:
> 
> clang_recent_errors
> `-- powerpc-pmac32_defconfig
> |-- 
> arch-powerpc-sysdev-grackle.c:error:unused-function-grackle_set_stg-Werror-Wunused-function
> |-- 
> arch-powerpc-xmon-xmon.c:error:unused-function-get_output_lock-Werror-Wunused-function
> `-- 
> arch-powerpc-xmon-xmon.c:error:unused-function-release_output_lock-Werror-Wunused-function

This report is close to useless.  It doesn't show the complete error
message, it doesn't show how to reproduce the issue, and the pci -next
branch (including controller/xilinx-xdma) doesn't reference any of
these functions:

  $ git grep -E "grackle_set_stg|get_output_lock|release_output_lock" | cat
  arch/powerpc/sysdev/grackle.c:static inline void grackle_set_stg(struct 
pci_controller* bp, int enable)
  arch/powerpc/sysdev/grackle.c:grackle_set_stg(hose, 1);
  arch/powerpc/xmon/xmon.c:static void get_output_lock(void)
  arch/powerpc/xmon/xmon.c:static void release_output_lock(void)
  arch/powerpc/xmon/xmon.c:static inline void get_output_lock(void) {}
  arch/powerpc/xmon/xmon.c:static inline void release_output_lock(void) {}
  arch/powerpc/xmon/xmon.c: get_output_lock();
  arch/powerpc/xmon/xmon.c: release_output_lock();
  arch/powerpc/xmon/xmon.c: get_output_lock();
  arch/powerpc/xmon/xmon.c: release_output_lock();
  arch/powerpc/xmon/xmon.c: get_output_lock();
  arch/powerpc/xmon/xmon.c: release_output_lock();
  arch/powerpc/xmon/xmon.c: get_output_lock();
  arch/powerpc/xmon/xmon.c: release_output_lock();

That said, the unused functions do look legit:

grackle_set_stg() is a static function and the only call is under
"#if 0".

Same with get_output_lock() and release_output_lock(): they're static
and always defined in xmon.c, but only called if either CONFIG_SMP or
CONFIG_DEBUG_FS.

But they're certainly not related to controller/xilinx-xdma, so I'm
going to ignore them.

Bjorn

P.S. Nathan & Nick, I cc'd you because of this earlier report that
also mentioned grackle_set_stg():
https://lore.kernel.org/lkml/202308121120.u2d3ypvt-...@intel.com/


Re: [PATCH v13 17/35] KVM: Add transparent hugepage support for dedicated guest memory

2023-10-31 Thread Sean Christopherson
On Tue, Oct 31, 2023, Xiaoyao Li wrote:
> On 10/28/2023 2:21 AM, Sean Christopherson wrote:
> > Extended guest_memfd to allow backing guest memory with transparent
> > hugepages. Require userspace to opt-in via a flag even though there's no
> > known/anticipated use case for forcing small pages as THP is optional,
> > i.e. to avoid ending up in a situation where userspace is unaware that
> > KVM can't provide hugepages.
> 
> Personally, it seems not so "transparent" if requiring userspace to opt-in.
> 
> People need to 1) check if the kernel built with TRANSPARENT_HUGEPAGE
> support, or check is the sysfs of transparent hugepage exists; 2)get the
> maximum support hugepage size 3) ensure the size satisfies the alignment;
> before opt-in it.
> 
> Even simpler, userspace can blindly try to create guest memfd with
> transparent hugapage flag. If getting error, fallback to create without the
> transparent hugepage flag.
> 
> However, it doesn't look transparent to me.

The "transparent" part is referring to the underlying kernel mechanism, it's not
saying anything about the API.  The "transparent" part of THP is that the kernel
doesn't guarantee hugepages, i.e. whether or not hugepages are actually used is
(mostly) transparent to userspace.

Paolo also isn't the biggest fan[*], but there are also downsides to always
allowing hugepages, e.g. silent failure due to lack of THP or unaligned size,
and there's precedent in the form of MADV_HUGEPAGE.

[*] https://lore.kernel.org/all/84a908ae-04c7-51c7-c9a8-119e1933a...@redhat.com


Re: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

2023-10-31 Thread Sean Christopherson
On Tue, Oct 31, 2023, Chao Gao wrote:
> >+int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
> >+{
> >+loff_t size = args->size;
> >+u64 flags = args->flags;
> >+u64 valid_flags = 0;
> >+
> >+if (flags & ~valid_flags)
> >+return -EINVAL;
> >+
> >+if (size < 0 || !PAGE_ALIGNED(size))
> >+return -EINVAL;
> 
> is size == 0 a valid case?

Nope, this is a bug.

> >+if (!xa_empty(>bindings) &&
> >+xa_find(>bindings, , end - 1, XA_PRESENT)) {
> >+filemap_invalidate_unlock(inode->i_mapping);
> >+goto err;
> >+}
> >+
> >+/*
> >+ * No synchronize_rcu() needed, any in-flight readers are guaranteed to
> >+ * be see either a NULL file or this new file, no need for them to go
> >+ * away.
> >+ */
> >+rcu_assign_pointer(slot->gmem.file, file);
> >+slot->gmem.pgoff = start;
> >+
> >+xa_store_range(>bindings, start, end - 1, slot, GFP_KERNEL);
> >+filemap_invalidate_unlock(inode->i_mapping);
> >+
> >+/*
> >+ * Drop the reference to the file, even on success.  The file pins KVM,
> >+ * not the other way 'round.  Active bindings are invalidated if the
> >+ * file is closed before memslots are destroyed.
> >+ */
> >+fput(file);
> >+return 0;
> >+
> >+err:
> >+fput(file);
> >+return -EINVAL;
> 
> The cleanup, i.e., filemap_invalidate_unlock() and fput(), is common. So, I 
> think it
> may be slightly better to consolidate the common part e.g.,

I would prefer to keep this as-is.  Only goto needs the unlock, and I find it 
easier
to understand the success vs. error paths with explicit returns.  But it's not a
super strong preference.


Re: [PATCH v13 08/35] KVM: Introduce KVM_SET_USER_MEMORY_REGION2

2023-10-31 Thread Sean Christopherson
On Tue, Oct 31, 2023, Xiaoyao Li wrote:
> On 10/28/2023 2:21 AM, Sean Christopherson wrote:
> > Introduce a "version 2" of KVM_SET_USER_MEMORY_REGION so that additional
> > information can be supplied without setting userspace up to fail.  The
> > padding in the new kvm_userspace_memory_region2 structure will be used to
> > pass a file descriptor in addition to the userspace_addr, i.e. allow
> > userspace to point at a file descriptor and map memory into a guest that
> > is NOT mapped into host userspace.
> > 
> > Alternatively, KVM could simply add "struct kvm_userspace_memory_region2"
> > without a new ioctl(), but as Paolo pointed out, adding a new ioctl()
> > makes detection of bad flags a bit more robust, e.g. if the new fd field
> > is guarded only by a flag and not a new ioctl(), then a userspace bug
> > (setting a "bad" flag) would generate out-of-bounds access instead of an
> > -EINVAL error.
> > 
> > Cc: Jarkko Sakkinen 
> > Reviewed-by: Paolo Bonzini 
> > Reviewed-by: Xiaoyao Li 
> > Signed-off-by: Sean Christopherson 
> > ---
> >   Documentation/virt/kvm/api.rst | 21 +++
> >   arch/x86/kvm/x86.c |  2 +-
> >   include/linux/kvm_host.h   |  4 ++--
> >   include/uapi/linux/kvm.h   | 13 
> >   virt/kvm/kvm_main.c| 38 +++---
> >   5 files changed, 67 insertions(+), 11 deletions(-)
> > 
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index 21a7578142a1..ace984acc125 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -6070,6 +6070,27 @@ writes to the CNTVCT_EL0 and CNTPCT_EL0 registers 
> > using the SET_ONE_REG
> >   interface. No error will be returned, but the resulting offset will not be
> >   applied.
> > +4.139 KVM_SET_USER_MEMORY_REGION2
> > +-
> > +
> > +:Capability: KVM_CAP_USER_MEMORY2
> > +:Architectures: all
> > +:Type: vm ioctl
> > +:Parameters: struct kvm_userspace_memory_region2 (in)
> > +:Returns: 0 on success, -1 on error
> > +
> > +::
> > +
> > +  struct kvm_userspace_memory_region2 {
> > +   __u32 slot;
> > +   __u32 flags;
> > +   __u64 guest_phys_addr;
> > +   __u64 memory_size; /* bytes */
> > +   __u64 userspace_addr; /* start of the userspace allocated memory */
> 
> missing
> 
>   __u64 pad[16];

I can't even copy+paste correctly :-(


[ppc64le] WARN at crypto/testmgr.c:5804

2023-10-31 Thread Sachin Sant
Following warning is observed during boot of latest -next
kernel (6.6.0-rc7-next-20231030 and todays -next) on IBM Power server.

[ 0.085775] workingset: timestamp_bits=38 max_order=20 bucket_order=0
[ 0.085801] zbud: loaded
[ 0.086473] [ cut here ]
[ 0.086477] WARNING: CPU: 23 PID: 211 at crypto/testmgr.c:5804 
alg_test.part.33+0x308/0x740
[ 0.086486] Modules linked in:
[ 0.086489] CPU: 23 PID: 211 Comm: cryptomgr_test Not tainted 
6.6.0-rc7-next-20231030 #1
[ 0.086493] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf06 
of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
[ 0.086497] NIP: c0765068 LR: c0764ff4 CTR: c075da00
[ 0.086500] REGS: ced7bb50 TRAP: 0700 Not tainted 
(6.6.0-rc7-next-20231030)
[ 0.086503] MSR: 80029033  CR: 8284 XER: 
20040002
[ 0.086511] CFAR: c0765318 IRQMASK: 1 
GPR00: c0764ff4 ced7bdf0 c1482000 0002 
GPR04: ced7be60 000e 002f fffe 
GPR08: ff00 0001 0008  
GPR12: c075da00 c00aa7cec700 c019da88 c6df9cc0 
GPR16:   c1309f98  
GPR20: c1308d50 c1308d98 c0ffeaf0 c1309fb0 
GPR24: c0ffb330 c0ffdf30 0400 c00019bfd480 
GPR28: 0002 c00019bfd400 c2bcf633 000e 
[ 0.086547] NIP [c0765068] alg_test.part.33+0x308/0x740
[ 0.086552] LR [c0764ff4] alg_test.part.33+0x294/0x740
[ 0.086556] Call Trace:
[ 0.086557] [ced7bdf0] [c0764ff4] alg_test.part.33+0x294/0x740 
(unreliable)
[ 0.086563] [ced7bf60] [c075da34] cryptomgr_test+0x34/0x70
[ 0.086568] [ced7bf90] [c019dbb8] kthread+0x138/0x140
[ 0.086573] [ced7bfe0] [c000df98] start_kernel_thread+0x14/0x18
[ 0.086578] Code: fb210138 fb810150 3af76d20 3b80 3a526d38 3a946d50 
3ab56d98 3b380040 3ad837c0 2f9c 7d301026 5529f7fe <0b09> 7f890034 
5529d97e 419d03a4 
[ 0.086589] ---[ end trace  ]---
[ 0.086592] testmgr: alg_test_descs entries in wrong order: 
'pkcs1pad(rsa,sha512)' before 'pkcs1pad(rsa,sha3-256)’

Git bisect points to following patch:
commit ee62afb9d02dd279a7b73245614f13f8fe777a6d
crypto: rsa-pkcs1pad - Add FIPS 202 SHA-3 support

- Sachin

Re: [PATCH v2 29/37] powerpc/nohash: Replace pte_user() by pte_read()

2023-10-31 Thread Aneesh Kumar K.V
Christophe Leroy  writes:

> pte_user() is now only used in pte_access_permitted() to check
> access on vmas. User flag is cleared to make a page unreadable.
>
> So rename it pte_read() and remove pte_user() which isn't used
> anymore.
>
> For the time being it checks _PAGE_USER but in the near futur
> all plateforms will be converted to _PAGE_READ so lets support
> both for now.
>
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/include/asm/nohash/32/pte-8xx.h |  7 ---
>  arch/powerpc/include/asm/nohash/pgtable.h| 13 +++--
>  arch/powerpc/mm/ioremap.c|  4 
>  3 files changed, 7 insertions(+), 17 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h 
> b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
> index 62c965a4511a..1ee38befd29a 100644
> --- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
> +++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
> @@ -112,13 +112,6 @@ static inline pte_t pte_mkwrite_novma(pte_t pte)
>  
>  #define pte_mkwrite_novma pte_mkwrite_novma
>  
> -static inline bool pte_user(pte_t pte)
> -{
> - return !(pte_val(pte) & _PAGE_SH);
> -}
> -
> -#define pte_user pte_user
> -
>  static inline pte_t pte_mkhuge(pte_t pte)
>  {
>   return __pte(pte_val(pte) | _PAGE_SPS | _PAGE_HUGE);
> diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
> b/arch/powerpc/include/asm/nohash/pgtable.h
> index ee677162f9e6..aba56fe3b1c6 100644
> --- a/arch/powerpc/include/asm/nohash/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/pgtable.h
> @@ -160,9 +160,6 @@ static inline int pte_write(pte_t pte)
>   return pte_val(pte) & _PAGE_WRITE;
>  }
>  #endif
> -#ifndef pte_read
> -static inline int pte_read(pte_t pte){ return 1; }
> -#endif
>  static inline int pte_dirty(pte_t pte)   { return pte_val(pte) & 
> _PAGE_DIRTY; }
>  static inline int pte_special(pte_t pte) { return pte_val(pte) & 
> _PAGE_SPECIAL; }
>  static inline int pte_none(pte_t pte){ return (pte_val(pte) 
> & ~_PTE_NONE_MASK) == 0; }
> @@ -190,10 +187,14 @@ static inline int pte_young(pte_t pte)
>   * and PTE_64BIT, PAGE_KERNEL_X contains _PAGE_BAP_SR which is also in
>   * _PAGE_USER.  Need to explicitly match _PAGE_BAP_UR bit in that case too.
>   */
> -#ifndef pte_user
> -static inline bool pte_user(pte_t pte)
> +#ifndef pte_read
> +static inline bool pte_read(pte_t pte)
>  {
> +#ifdef _PAGE_READ
> + return (pte_val(pte) & _PAGE_READ) == _PAGE_READ;
> +#else
>   return (pte_val(pte) & _PAGE_USER) == _PAGE_USER;
> +#endif
>  }
>  #endif
>  
> @@ -208,7 +209,7 @@ static inline bool pte_access_permitted(pte_t pte, bool 
> write)
>* A read-only access is controlled by _PAGE_USER bit.
>* We have _PAGE_READ set for WRITE and EXECUTE
>*/
> - if (!pte_present(pte) || !pte_user(pte) || !pte_read(pte))
> + if (!pte_present(pte) || !pte_read(pte))
>   return false;
>  
>   if (write && !pte_write(pte))
> diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
> index 7823c38f09de..7b0afcabd89f 100644
> --- a/arch/powerpc/mm/ioremap.c
> +++ b/arch/powerpc/mm/ioremap.c
> @@ -50,10 +50,6 @@ void __iomem *ioremap_prot(phys_addr_t addr, size_t size, 
> unsigned long flags)
>   if (pte_write(pte))
>   pte = pte_mkdirty(pte);
>  
> - /* we don't want to let _PAGE_USER leak out */
> - if (WARN_ON(pte_user(pte)))
> - return NULL;
>

This check is still valid right? I understand that we want to remove
_PAGE_USER. But then loosing this check is ok? 

> -
>   if (iowa_is_active())
>   return iowa_ioremap(addr, size, pte_pgprot(pte), caller);
>   return __ioremap_caller(addr, size, pte_pgprot(pte), caller);
> -- 
> 2.41.0


Re: [PATCH 00/10] Remove obsolete and orphaned wifi drivers

2023-10-31 Thread Kalle Valo
Arnd Bergmann  writes:

> From: Arnd Bergmann 
>
> As discussed previously, a lot of the older wifi drivers are likely
> entirely unused, Though we can't know for sure.
>
> As suggested by both Greg and Jakub, let's remove the ones that look
> are most likely to have no users left and also get in the way of the
> wext cleanup. If anyone is still using any of these, we can revert the
> driver removal individually.
>
> I would suggest merging these for net-next after 6.7-rc1 is out, to give
> them the maximum amount of time for users to speak up before a release
> comes out.
>
> This kills off all pcmcia wifi drivers, and all wext users in
> drivers/net/wireless, but not the ps3-gelic-wireless driver in
> drivers/net/ethernet, or the staging drivers.
>
> In staging, rtl8192u was already removed in the meantime, while rtl8712
> and rtl8192e are apparently still used.  I have not been able to find
> out whether ks7010 is still in use.
>
>   Arnd
>
> Link: https://lore.kernel.org/lkml/20231011080955.1beeb...@kernel.org/
>
>
> Arnd Bergmann (10):
>   wifi: libertas: drop 16-bit PCMCIA support
>   wifi: atmel: remove wext style at76c50x drivers
>   wifi: remove orphaned cisco/aironet driver
>   wifi: remove obsolete hostap driver
>   wifi: remove orphaned zd1201 driver
>   wifi: remove orphaned orinoco driver
>   wifi: remove orphaned ray_cs driver
>   wifi: remove orphaned wl3501 driver
>   wifi: remove orphaned rndis_wlan driver
>   [RFC] wifi: remove ipw2100/ipw2200 drivers

I manually applied these 9 to wireless-next:

4b478bf6bdd8 wifi: libertas: drop 16-bit PCMCIA support
77e49bec6414 wifi: atmel: remove wext style at76c50x drivers
6853c70ba5ed wifi: remove orphaned cisco/aironet driver
d0172d5f7576 wifi: remove obsolete hostap driver
757a46c2a7a9 wifi: remove orphaned zd1201 driver
1535d5962d79 wifi: remove orphaned orinoco driver
6b9dbaff83d6 wifi: remove orphaned ray_cs driver
238349207cd3 wifi: remove orphaned wl3501 driver
bec95598b24a wifi: remove orphaned rndis_wlan driver

I dropped this patch as we got several reports about people using the
driver:

[RFC] wifi: remove ipw2100/ipw2200 drivers

The patches are queued for v6.8. Arnd, thanks a lot for cleaning this
up!

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


Re: [PATCH v13 17/35] KVM: Add transparent hugepage support for dedicated guest memory

2023-10-31 Thread Xiaoyao Li

On 10/28/2023 2:21 AM, Sean Christopherson wrote:
Extended guest_memfd to allow backing guest memory with transparent 
hugepages. Require userspace to opt-in via a flag even though there's no 
known/anticipated use case for forcing small pages as THP is optional, 
i.e. to avoid ending up in a situation where userspace is unaware that 
KVM can't provide hugepages.


Personally, it seems not so "transparent" if requiring userspace to opt-in.

People need to 1) check if the kernel built with TRANSPARENT_HUGEPAGE 
support, or check is the sysfs of transparent hugepage exists; 2)get the 
maximum support hugepage size 3) ensure the size satisfies the 
alignment; before opt-in it.


Even simpler, userspace can blindly try to create guest memfd with 
transparent hugapage flag. If getting error, fallback to create without 
the transparent hugepage flag.


However, it doesn't look transparent to me.


Re: [PATCH v13 16/35] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

2023-10-31 Thread Chao Gao
>+int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
>+{
>+  loff_t size = args->size;
>+  u64 flags = args->flags;
>+  u64 valid_flags = 0;
>+
>+  if (flags & ~valid_flags)
>+  return -EINVAL;
>+
>+  if (size < 0 || !PAGE_ALIGNED(size))
>+  return -EINVAL;

is size == 0 a valid case?

>+int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
>+unsigned int fd, loff_t offset)
>+{
>+  loff_t size = slot->npages << PAGE_SHIFT;
>+  unsigned long start, end;
>+  struct kvm_gmem *gmem;
>+  struct inode *inode;
>+  struct file *file;
>+
>+  BUILD_BUG_ON(sizeof(gfn_t) != sizeof(slot->gmem.pgoff));
>+
>+  file = fget(fd);
>+  if (!file)
>+  return -EBADF;
>+
>+  if (file->f_op != _gmem_fops)
>+  goto err;
>+
>+  gmem = file->private_data;
>+  if (gmem->kvm != kvm)
>+  goto err;
>+
>+  inode = file_inode(file);
>+
>+  if (offset < 0 || !PAGE_ALIGNED(offset))
>+  return -EINVAL;

goto err;

or hoist the check above fget().

>+
>+  if (offset + size > i_size_read(inode))
>+  goto err;
>+
>+  filemap_invalidate_lock(inode->i_mapping);
>+
>+  start = offset >> PAGE_SHIFT;
>+  end = start + slot->npages;
>+
>+  if (!xa_empty(>bindings) &&
>+  xa_find(>bindings, , end - 1, XA_PRESENT)) {
>+  filemap_invalidate_unlock(inode->i_mapping);
>+  goto err;
>+  }
>+
>+  /*
>+   * No synchronize_rcu() needed, any in-flight readers are guaranteed to
>+   * be see either a NULL file or this new file, no need for them to go
>+   * away.
>+   */
>+  rcu_assign_pointer(slot->gmem.file, file);
>+  slot->gmem.pgoff = start;
>+
>+  xa_store_range(>bindings, start, end - 1, slot, GFP_KERNEL);
>+  filemap_invalidate_unlock(inode->i_mapping);
>+
>+  /*
>+   * Drop the reference to the file, even on success.  The file pins KVM,
>+   * not the other way 'round.  Active bindings are invalidated if the
>+   * file is closed before memslots are destroyed.
>+   */
>+  fput(file);
>+  return 0;
>+
>+err:
>+  fput(file);
>+  return -EINVAL;

The cleanup, i.e., filemap_invalidate_unlock() and fput(), is common. So, I 
think it
may be slightly better to consolidate the common part e.g.,

int ret = -EINVAL;

...

xa_store_range(>bindings, start, end - 1, slot, GFP_KERNEL);
ret = 0;
unlock:
filemap_invalidate_unlock(inode->i_mapping);
err:
/*
 * Drop the reference to the file, even on success.  The file pins KVM,
 * not the other way 'round.  Active bindings are invalidated if the
 * file is closed before memslots are destroyed.
 */
fput(file);
return ret;