Re: [RFC PATCH v11 06/29] KVM: Introduce KVM_SET_USER_MEMORY_REGION2

2023-07-31 Thread Paolo Bonzini

On 7/29/23 02:03, Sean Christopherson wrote:

KVM would need to do multiple uaccess reads, but that's not a big
deal.  Am I missing something, or did past us just get too clever and
miss the obvious solution?


You would have to introduce struct kvm_userspace_memory_region2 anyway, 
though not a new ioctl, for two reasons:


1) the current size of the struct is part of the userspace API via the 
KVM_SET_USER_MEMORY_REGION #define, so introducing a new struct is the 
easiest way to preserve this


2) the struct can (at least theoretically) enter the ABI of a shared 
library, and such mismatches are really hard to detect and resolve.  So 
it's better to add the padding to a new struct, and keep struct 
kvm_userspace_memory_region backwards-compatible.



As to whether we should introduce a new ioctl: doing so makes 
KVM_SET_USER_MEMORY_REGION's detection of bad flags a bit more robust; 
it's not like we cannot introduce new flags at all, of course, but 
having out-of-bounds reads as a side effect of new flags is a bit nasty. 
 Protecting programs from their own bugs gets into diminishing returns 
very quickly, but introducing a new ioctl can make exploits a bit harder 
when struct kvm_userspace_memory_region is on the stack and adjacent to 
an attacker-controlled location.


Paolo



Re: [RFC PATCH v11 06/29] KVM: Introduce KVM_SET_USER_MEMORY_REGION2

2023-07-31 Thread Quentin Perret
On Friday 28 Jul 2023 at 17:03:33 (-0700), Sean Christopherson wrote:
> On Fri, Jul 28, 2023, Quentin Perret wrote:
> > On Tuesday 18 Jul 2023 at 16:44:49 (-0700), Sean Christopherson wrote:
> > > --- a/include/uapi/linux/kvm.h
> > > +++ b/include/uapi/linux/kvm.h
> > > @@ -95,6 +95,16 @@ struct kvm_userspace_memory_region {
> > >   __u64 userspace_addr; /* start of the userspace allocated memory */
> > >  };
> > >  
> > > +/* for KVM_SET_USER_MEMORY_REGION2 */
> > > +struct kvm_userspace_memory_region2 {
> > > + __u32 slot;
> > > + __u32 flags;
> > > + __u64 guest_phys_addr;
> > > + __u64 memory_size;
> > > + __u64 userspace_addr;
> > > + __u64 pad[16];
> > 
> > Should we replace that pad[16] with:
> > 
> > __u64 size;
> > 
> > where 'size' is the size of the structure as seen by userspace? This is
> > used in other UAPIs (see struct sched_attr for example) and is a bit
> > more robust for future extensions (e.g. an 'old' kernel can correctly
> > reject a newer version of the struct with additional fields it doesn't
> > know about if that makes sense, etc).
> 
> "flags" serves that purpose, i.e. allows userspace to opt-in to having KVM 
> actually
> consume what is currently just padding.

Sure, I've just grown to dislike static padding of that type -- it ends
up being either a waste a space, or is too small, while the 'superior'
alternative (having a 'size' member) doesn't cost much and avoids those
problems.

But no strong opinion really, this struct really shouldn't grow much,
so I'm sure that'll be fine in practice.

> The padding is there mainly to simplify kernel/KVM code, e.g. the number of 
> bytes
> that KVM needs to copy in is static.
> 
> But now that I think more on this, I don't know why we didn't just 
> unconditionally
> bump the size of kvm_userspace_memory_region.  We tried to play games with 
> unions
> and overlays, but that was a mess[*].
> 
> KVM would need to do multiple uaccess reads, but that's not a big deal.  Am I
> missing something, or did past us just get too clever and miss the obvious 
> solution?
> 
> [*] https://lkml.kernel.org/r/Y7xrtf9FCuYRYm1q%40google.com

Right, so the first uaccess would get_user() the flags, based on that
we'd figure out the size of the struct, copy_from_user() what we need,
and then sanity check the flags are the same from both reads, or
something along those lines?

That doesn't sound too complicated to me, and as long as every extension
to the struct does come with a new flag I can't immediately see what
would go wrong.


Re: [RFC PATCH v11 06/29] KVM: Introduce KVM_SET_USER_MEMORY_REGION2

2023-07-28 Thread Sean Christopherson
On Fri, Jul 28, 2023, Quentin Perret wrote:
> On Tuesday 18 Jul 2023 at 16:44:49 (-0700), Sean Christopherson wrote:
> > --- a/include/uapi/linux/kvm.h
> > +++ b/include/uapi/linux/kvm.h
> > @@ -95,6 +95,16 @@ struct kvm_userspace_memory_region {
> > __u64 userspace_addr; /* start of the userspace allocated memory */
> >  };
> >  
> > +/* for KVM_SET_USER_MEMORY_REGION2 */
> > +struct kvm_userspace_memory_region2 {
> > +   __u32 slot;
> > +   __u32 flags;
> > +   __u64 guest_phys_addr;
> > +   __u64 memory_size;
> > +   __u64 userspace_addr;
> > +   __u64 pad[16];
> 
> Should we replace that pad[16] with:
> 
>   __u64 size;
> 
> where 'size' is the size of the structure as seen by userspace? This is
> used in other UAPIs (see struct sched_attr for example) and is a bit
> more robust for future extensions (e.g. an 'old' kernel can correctly
> reject a newer version of the struct with additional fields it doesn't
> know about if that makes sense, etc).

"flags" serves that purpose, i.e. allows userspace to opt-in to having KVM 
actually
consume what is currently just padding.

The padding is there mainly to simplify kernel/KVM code, e.g. the number of 
bytes
that KVM needs to copy in is static.

But now that I think more on this, I don't know why we didn't just 
unconditionally
bump the size of kvm_userspace_memory_region.  We tried to play games with 
unions
and overlays, but that was a mess[*].

KVM would need to do multiple uaccess reads, but that's not a big deal.  Am I
missing something, or did past us just get too clever and miss the obvious 
solution?

[*] https://lkml.kernel.org/r/Y7xrtf9FCuYRYm1q%40google.com


Re: [RFC PATCH v11 06/29] KVM: Introduce KVM_SET_USER_MEMORY_REGION2

2023-07-28 Thread Quentin Perret
On Tuesday 18 Jul 2023 at 16:44:49 (-0700), Sean Christopherson wrote:
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -95,6 +95,16 @@ struct kvm_userspace_memory_region {
>   __u64 userspace_addr; /* start of the userspace allocated memory */
>  };
>  
> +/* for KVM_SET_USER_MEMORY_REGION2 */
> +struct kvm_userspace_memory_region2 {
> + __u32 slot;
> + __u32 flags;
> + __u64 guest_phys_addr;
> + __u64 memory_size;
> + __u64 userspace_addr;
> + __u64 pad[16];

Should we replace that pad[16] with:

__u64 size;

where 'size' is the size of the structure as seen by userspace? This is
used in other UAPIs (see struct sched_attr for example) and is a bit
more robust for future extensions (e.g. an 'old' kernel can correctly
reject a newer version of the struct with additional fields it doesn't
know about if that makes sense, etc).


Re: [RFC PATCH v11 06/29] KVM: Introduce KVM_SET_USER_MEMORY_REGION2

2023-07-21 Thread Paolo Bonzini

On 7/19/23 01:44, Sean Christopherson wrote:

Cc: Jarkko Sakkinen 
Signed-off-by: Sean Christopherson 
---
  arch/x86/kvm/x86.c   |  2 +-
  include/linux/kvm_host.h |  4 ++--
  include/uapi/linux/kvm.h | 13 +
  virt/kvm/kvm_main.c  | 38 ++
  4 files changed, 46 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a6b9bea62fb8..92e77afd3ffd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12420,7 +12420,7 @@ void __user * __x86_set_memory_region(struct kvm *kvm, 
int id, gpa_t gpa,
}
  
  	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {

-   struct kvm_userspace_memory_region m;
+   struct kvm_userspace_memory_region2 m;
  
  		m.slot = id | (i << 16);

m.flags = 0;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d2d3e083ec7f..e9ca49d451f3 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1130,9 +1130,9 @@ enum kvm_mr_change {
  };
  
  int kvm_set_memory_region(struct kvm *kvm,

- const struct kvm_userspace_memory_region *mem);
+ const struct kvm_userspace_memory_region2 *mem);
  int __kvm_set_memory_region(struct kvm *kvm,
-   const struct kvm_userspace_memory_region *mem);
+   const struct kvm_userspace_memory_region2 *mem);
  void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *slot);
  void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen);
  int kvm_arch_prepare_memory_region(struct kvm *kvm,
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f089ab290978..4d4b3de8ac55 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -95,6 +95,16 @@ struct kvm_userspace_memory_region {
__u64 userspace_addr; /* start of the userspace allocated memory */
  };
  
+/* for KVM_SET_USER_MEMORY_REGION2 */

+struct kvm_userspace_memory_region2 {
+   __u32 slot;
+   __u32 flags;
+   __u64 guest_phys_addr;
+   __u64 memory_size;
+   __u64 userspace_addr;
+   __u64 pad[16];
+};
+
  /*
   * The bit 0 ~ bit 15 of kvm_userspace_memory_region::flags are visible for
   * userspace, other bits are reserved for kvm internal use which are defined
@@ -1192,6 +1202,7 @@ struct kvm_ppc_resize_hpt {
  #define KVM_CAP_COUNTER_OFFSET 227
  #define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 228
  #define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229
+#define KVM_CAP_USER_MEMORY2 230
  
  #ifdef KVM_CAP_IRQ_ROUTING
  
@@ -1466,6 +1477,8 @@ struct kvm_vfio_spapr_tce {

struct kvm_userspace_memory_region)
  #define KVM_SET_TSS_ADDR  _IO(KVMIO,   0x47)
  #define KVM_SET_IDENTITY_MAP_ADDR _IOW(KVMIO,  0x48, __u64)
+#define KVM_SET_USER_MEMORY_REGION2 _IOW(KVMIO, 0x49, \
+struct kvm_userspace_memory_region2)
  
  /* enable ucontrol for s390 */

  struct kvm_s390_ucas_mapping {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 53346bc2902a..c14adf93daec 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1549,7 +1549,7 @@ static void kvm_replace_memslot(struct kvm *kvm,
}
  }
  
-static int check_memory_region_flags(const struct kvm_userspace_memory_region *mem)

+static int check_memory_region_flags(const struct kvm_userspace_memory_region2 
*mem)
  {
u32 valid_flags = KVM_MEM_LOG_DIRTY_PAGES;
  
@@ -1951,7 +1951,7 @@ static bool kvm_check_memslot_overlap(struct kvm_memslots *slots, int id,

   * Must be called holding kvm->slots_lock for write.
   */
  int __kvm_set_memory_region(struct kvm *kvm,
-   const struct kvm_userspace_memory_region *mem)
+   const struct kvm_userspace_memory_region2 *mem)
  {
struct kvm_memory_slot *old, *new;
struct kvm_memslots *slots;
@@ -2055,7 +2055,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
  EXPORT_SYMBOL_GPL(__kvm_set_memory_region);
  
  int kvm_set_memory_region(struct kvm *kvm,

- const struct kvm_userspace_memory_region *mem)
+ const struct kvm_userspace_memory_region2 *mem)
  {
int r;
  
@@ -2067,7 +2067,7 @@ int kvm_set_memory_region(struct kvm *kvm,

  EXPORT_SYMBOL_GPL(kvm_set_memory_region);
  
  static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,

- struct kvm_userspace_memory_region 
*mem)
+ struct kvm_userspace_memory_region2 
*mem)
  {
if ((u16)mem->slot >= KVM_USER_MEM_SLOTS)
return -EINVAL;
@@ -4514,6 +4514,7 @@ static int kvm_vm_ioctl_check_extension_generic(struct 
kvm *kvm, long arg)
  {
switch (arg) {
case KVM_CAP_USER_MEMORY:
+   case KVM_CAP_USER_MEMORY2:
case KVM_CAP_DESTROY_MEMORY_REGION_WORKS:
case KVM_CAP_JOIN_MEMORY_REGIONS_WORKS:
  

[RFC PATCH v11 06/29] KVM: Introduce KVM_SET_USER_MEMORY_REGION2

2023-07-18 Thread Sean Christopherson
Cc: Jarkko Sakkinen 
Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/x86.c   |  2 +-
 include/linux/kvm_host.h |  4 ++--
 include/uapi/linux/kvm.h | 13 +
 virt/kvm/kvm_main.c  | 38 ++
 4 files changed, 46 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a6b9bea62fb8..92e77afd3ffd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12420,7 +12420,7 @@ void __user * __x86_set_memory_region(struct kvm *kvm, 
int id, gpa_t gpa,
}
 
for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
-   struct kvm_userspace_memory_region m;
+   struct kvm_userspace_memory_region2 m;
 
m.slot = id | (i << 16);
m.flags = 0;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d2d3e083ec7f..e9ca49d451f3 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1130,9 +1130,9 @@ enum kvm_mr_change {
 };
 
 int kvm_set_memory_region(struct kvm *kvm,
- const struct kvm_userspace_memory_region *mem);
+ const struct kvm_userspace_memory_region2 *mem);
 int __kvm_set_memory_region(struct kvm *kvm,
-   const struct kvm_userspace_memory_region *mem);
+   const struct kvm_userspace_memory_region2 *mem);
 void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *slot);
 void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen);
 int kvm_arch_prepare_memory_region(struct kvm *kvm,
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f089ab290978..4d4b3de8ac55 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -95,6 +95,16 @@ struct kvm_userspace_memory_region {
__u64 userspace_addr; /* start of the userspace allocated memory */
 };
 
+/* for KVM_SET_USER_MEMORY_REGION2 */
+struct kvm_userspace_memory_region2 {
+   __u32 slot;
+   __u32 flags;
+   __u64 guest_phys_addr;
+   __u64 memory_size;
+   __u64 userspace_addr;
+   __u64 pad[16];
+};
+
 /*
  * The bit 0 ~ bit 15 of kvm_userspace_memory_region::flags are visible for
  * userspace, other bits are reserved for kvm internal use which are defined
@@ -1192,6 +1202,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_COUNTER_OFFSET 227
 #define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 228
 #define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229
+#define KVM_CAP_USER_MEMORY2 230
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1466,6 +1477,8 @@ struct kvm_vfio_spapr_tce {
struct kvm_userspace_memory_region)
 #define KVM_SET_TSS_ADDR  _IO(KVMIO,   0x47)
 #define KVM_SET_IDENTITY_MAP_ADDR _IOW(KVMIO,  0x48, __u64)
+#define KVM_SET_USER_MEMORY_REGION2 _IOW(KVMIO, 0x49, \
+struct kvm_userspace_memory_region2)
 
 /* enable ucontrol for s390 */
 struct kvm_s390_ucas_mapping {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 53346bc2902a..c14adf93daec 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1549,7 +1549,7 @@ static void kvm_replace_memslot(struct kvm *kvm,
}
 }
 
-static int check_memory_region_flags(const struct kvm_userspace_memory_region 
*mem)
+static int check_memory_region_flags(const struct kvm_userspace_memory_region2 
*mem)
 {
u32 valid_flags = KVM_MEM_LOG_DIRTY_PAGES;
 
@@ -1951,7 +1951,7 @@ static bool kvm_check_memslot_overlap(struct kvm_memslots 
*slots, int id,
  * Must be called holding kvm->slots_lock for write.
  */
 int __kvm_set_memory_region(struct kvm *kvm,
-   const struct kvm_userspace_memory_region *mem)
+   const struct kvm_userspace_memory_region2 *mem)
 {
struct kvm_memory_slot *old, *new;
struct kvm_memslots *slots;
@@ -2055,7 +2055,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
 EXPORT_SYMBOL_GPL(__kvm_set_memory_region);
 
 int kvm_set_memory_region(struct kvm *kvm,
- const struct kvm_userspace_memory_region *mem)
+ const struct kvm_userspace_memory_region2 *mem)
 {
int r;
 
@@ -2067,7 +2067,7 @@ int kvm_set_memory_region(struct kvm *kvm,
 EXPORT_SYMBOL_GPL(kvm_set_memory_region);
 
 static int kvm_vm_ioctl_set_memory_region(struct kvm *kvm,
- struct kvm_userspace_memory_region 
*mem)
+ struct kvm_userspace_memory_region2 
*mem)
 {
if ((u16)mem->slot >= KVM_USER_MEM_SLOTS)
return -EINVAL;
@@ -4514,6 +4514,7 @@ static int kvm_vm_ioctl_check_extension_generic(struct 
kvm *kvm, long arg)
 {
switch (arg) {
case KVM_CAP_USER_MEMORY:
+   case KVM_CAP_USER_MEMORY2:
case KVM_CAP_DESTROY_MEMORY_REGION_WORKS:
case KVM_CAP_JOIN_MEMORY_REGIONS_WORKS:
case KVM_CAP_INTERNAL_ERROR_DATA:
@@ -4757,6 +4758,14 @@ static int