Hi Andre,

On 2/17/21 5:43 PM, Andre Przywara wrote:
> On Thu, 11 Feb 2021 16:10:16 +0000
> Alexandru Elisei <[email protected]> wrote:
>
> Hi,
>
>> On 12/10/20 2:28 PM, Andre Przywara wrote:
>>> In their core functionality MMIO and I/O port traps are not really
>>> different, yet we still have two totally separate code paths for
>>> handling them. Devices need to decide on one conduit or need to provide
>>> different handler functions for each of them.
>>>
>>> Extend the existing MMIO emulation to also cover ioport handlers.
>>> This just adds another RB tree root for holding the I/O port handlers,
>>> but otherwise uses the same tree population and lookup code.  
>> Maybe I'm missing something, but why two trees? Is it valid to have an 
>> overlap
>> between IO port and MMIO emulation? Or was it done to make the removal of 
>> ioport
>> emulation easier?
> So I thought about it as well, but figured it's easier this way:
> - x86 allows overlap, PIO is a totally separate address space from
>   memory/MMIO. Early x86 CPUs had pins to indicate a PIO bus cycle, but
>   using the same address and data pins otherwise. In practise there
>   might be no overlap when it comes to *MMIO* traps vs PIO on x86
>   (there is DRAM only at the lowest 64K of the IBM PC memory map),
>   but not sure we should rely on this.
> - For non-x86 this would indeed be non-overlapping, but this would need
>   to be translated at init time then? And then we can't move those
>   anymore, I guess? So I found it cleaner to keep this separate, and do
>   the translation at trap time.
> - As a consequence we would need to have a bit indicating the address
>   space. I haven't actually tried this, but my understanding is that
>   this would spoil the whole rb_tree functions, since they rely on a
>   linear addressing scheme, and adding another bit there would be at
>   least cumbersome?
>
> At the end I decided to go for separate trees, as also this was less
> change.
>
> I agree that it would be nice to have one tree, from a design point of
> view, but I am afraid that would require more changes.
> If need be, I think we can always unify them later on, on top of this
> series?

Definitely later, I forgot that x86 uses special instructions to access IO 
ports,
which means that port addresses can overlap with memory addresses. Let's keep 2
trees for now and we can decide later if we should unify them for the other
architectures.

>
>> If it's not valid to have that overlap, then I think having one tree for both
>> would better. Struct mmio_mapping would have to be augmented with a flags 
>> field
>> that holds the same flags given to kvm__register_iotrap to differentiate 
>> between
>> the two slightly different emulations. Saving the IOTRAP_COALESCE flag would 
>> also
>> make it trivial to call KVM_UNREGISTER_COALESCED_MMIO in 
>> kvm__deregister_iotrap,
>> which we currently don't do.
>>
>>> "ioport" or "mmio" just become a flag in the registration function.
>>> Provide wrappers to not break existing users, and allow an easy
>>> transition for the existing ioport handlers.
>>>
>>> This also means that ioport handlers now can use the same emulation
>>> callback prototype as MMIO handlers, which means we have to migrate them
>>> over. To allow a smooth transition, we hook up the new I/O emulate
>>> function to the end of the existing ioport emulation code.  
>> I'm sorry, but I don't understand that last sentence. Do you mean that the 
>> ioport
>> emulation code has been modified to use kvm__emulate_pio() as a fallback for 
>> when
>> the port is not found in the ioport_tree?
> I meant that for the transition period we have all of traditional MMIO,
> traditional PIO, *and* just transformed PIO.
>
> That means there are still PIO devices registered through ioport.c's
> ioport__register(), *and* PIO devices registered through mmio.c's
> kvm__register_pio(). Which means they end up in two separate PIO trees.
> And only the traditional kvm__emulate_io() from ioport.c is called upon
> a trap, so it needs to check both trees, which it does by calling into
> kvm__emulate_pio(), shall a search in the local tree fail.

Thank you for the explanation, now it makes sense.

>
> Or did you mean something else?
>
>>> Signed-off-by: Andre Przywara <[email protected]>
>>> ---
>>>  include/kvm/kvm.h | 42 +++++++++++++++++++++++++++++----
>>>  ioport.c          |  4 ++--
>>>  mmio.c            | 59 +++++++++++++++++++++++++++++++++++++++--------
>>>  3 files changed, 89 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h
>>> index ee99c28e..14f9d58b 100644
>>> --- a/include/kvm/kvm.h
>>> +++ b/include/kvm/kvm.h
>>> @@ -27,10 +27,16 @@
>>>  #define PAGE_SIZE (sysconf(_SC_PAGE_SIZE))
>>>  #endif
>>>  
>>> +#define IOTRAP_BUS_MASK            0xf  
>> It's not immediately obvious what this mask does. It turns out it's used to 
>> mask
>> the enum flags defined in the header devices.h, header which is not included 
>> in
>> this file.
>>
>> The flag names we pass to kvm__register_iotrap() are slightly inconsistent
>> (DEVICE_BUS_PCI, DEVICE_BUS_MMIO and IOTRAP_COALESCE), where DEVICE_BUS_{PCI,
>> MMIO} come from devices.h as an enum. I was wondering if I'm missing 
>> something and
>> there is a particular reason why we don't define our own flags for that here
>> (something like IOTRAP_PIO and IOTRAP_MMIO).
> I am not sure why this would be needed?
> We already define and use DEVICE_BUS_x elsewhere, so why not re-use it?

To check if we should coalesce the MMIO regions we check the IOTRAP_COALESCE 
bit,
but to check if we should use the mmio or io tree we use a mask over the first 4
bits and compare that to DEVICE_BUS_xxx. I find that confusing: the
IOTRAP_BUS_MASK and IOTRAP_COALESCE are defined here, but there is no evidence
where the other flags are coming from, and the header file devices.h isn't even
included.

>
>> If we do decide to keep the flags from devices.h, I think it would be worth 
>> it to
>> have a compile time check (with BUILD_BUG_ON) that IOTRAP_BUS_MASK is >=
>> DEVICES_BUS_MAX, which would also be a good indication of where those flags 
>> are
>> coming from.
> Well, if that makes you happy, I am not sure we gain another 13 bus
> types anytime soon, though ;-)

I was actually suggesting that more for documenting the code. If we compare
IOTRAP_BUS_MASK with DEVICE_BUS_MAX, then that will give us a hint where we're
expecting the flags to be defined.

But I had another look at the code and it seems that DEVICE_BUS_MMIO = 1 and
DEVICE_BUS_IOPORT = 2, which means that we don't need the mask and we can check
the bits instead (and IOTRAP_COALESCE can be redefined to be 1U << 2). I think
dropping the mask and replacing it with testing individual bits would make the
code easier to follow, what do you think?

>  
>>> +#define IOTRAP_COALESCE            (1U << 4)
>>> +
>>>  #define DEFINE_KVM_EXT(ext)                \
>>>     .name = #ext,                   \
>>>     .code = ext
>>>  
>>> +struct kvm_cpu;
>>> +typedef void (*mmio_handler_fn)(struct kvm_cpu *vcpu, u64 addr, u8 *data,
>>> +                           u32 len, u8 is_write, void *ptr);
>>>  typedef void (*fdt_irq_fn)(void *fdt, u8 irq, enum irq_type);
>>>  
>>>  enum {
>>> @@ -113,6 +119,8 @@ void kvm__irq_line(struct kvm *kvm, int irq, int level);
>>>  void kvm__irq_trigger(struct kvm *kvm, int irq);
>>>  bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, void *data, int 
>>> direction, int size, u32 count);
>>>  bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data, u32 
>>> len, u8 is_write);
>>> +bool kvm__emulate_pio(struct kvm_cpu *vcpu, u16 port, void *data,
>>> +                 int direction, int size, u32 count);
>>>  int kvm__destroy_mem(struct kvm *kvm, u64 guest_phys, u64 size, void 
>>> *userspace_addr);
>>>  int kvm__register_mem(struct kvm *kvm, u64 guest_phys, u64 size, void 
>>> *userspace_addr,
>>>                   enum kvm_mem_type type);
>>> @@ -136,10 +144,36 @@ static inline int kvm__reserve_mem(struct kvm *kvm, 
>>> u64 guest_phys, u64 size)
>>>                              KVM_MEM_TYPE_RESERVED);
>>>  }
>>>  
>>> -int __must_check kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 
>>> phys_addr_len, bool coalesce,
>>> -                               void (*mmio_fn)(struct kvm_cpu *vcpu, u64 
>>> addr, u8 *data, u32 len, u8 is_write, void *ptr),
>>> -                               void *ptr);
>>> -bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr);
>>> +int __must_check kvm__register_iotrap(struct kvm *kvm, u64 phys_addr, u64 
>>> len,
>>> +                                 mmio_handler_fn mmio_fn, void *ptr,
>>> +                                 unsigned int flags);
>>> +
>>> +static inline
>>> +int __must_check kvm__register_mmio(struct kvm *kvm, u64 phys_addr,
>>> +                               u64 phys_addr_len, bool coalesce,
>>> +                               mmio_handler_fn mmio_fn, void *ptr)
>>> +{
>>> +   return kvm__register_iotrap(kvm, phys_addr, phys_addr_len, mmio_fn, ptr,
>>> +                   DEVICE_BUS_MMIO | (coalesce ? IOTRAP_COALESCE : 0));
>>> +}
>>> +static inline
>>> +int __must_check kvm__register_pio(struct kvm *kvm, u16 port, u16 len,
>>> +                              mmio_handler_fn mmio_fn, void *ptr)
>>> +{
>>> +   return kvm__register_iotrap(kvm, port, len, mmio_fn, ptr,
>>> +                               DEVICE_BUS_IOPORT);
>>> +}
>>> +
>>> +bool kvm__deregister_iotrap(struct kvm *kvm, u64 phys_addr, unsigned int 
>>> flags);
>>> +static inline bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr)
>>> +{
>>> +   return kvm__deregister_iotrap(kvm, phys_addr, DEVICE_BUS_MMIO);
>>> +}
>>> +static inline bool kvm__deregister_pio(struct kvm *kvm, u16 port)
>>> +{
>>> +   return kvm__deregister_iotrap(kvm, port, DEVICE_BUS_IOPORT);
>>> +}
>>> +
>>>  void kvm__reboot(struct kvm *kvm);
>>>  void kvm__pause(struct kvm *kvm);
>>>  void kvm__continue(struct kvm *kvm);
>>> diff --git a/ioport.c b/ioport.c
>>> index b98836d3..204d8103 100644
>>> --- a/ioport.c
>>> +++ b/ioport.c
>>> @@ -147,7 +147,8 @@ bool kvm__emulate_io(struct kvm_cpu *vcpu, u16 port, 
>>> void *data, int direction,
>>>  
>>>     entry = ioport_get(&ioport_tree, port);
>>>     if (!entry)
>>> -           goto out;
>>> +           return kvm__emulate_pio(vcpu, port, data, direction,
>>> +                                   size, count);  
>> I have to admit this gave me pause because this patch doesn't add any users 
>> for
>> kvm__register_pio() (although with this change the behaviour of 
>> kvm__emulate_io()
>> remains exactly the same). Do you think this change would fit better in 
>> patch #7,
>> where the first user for kvm__register_pio() is added, or do you prefer it 
>> here?
> I think it logically belongs here, as we introduce the
> kvm__emulate_pio() function here as well. Otherwise this function would
> have no caller. As it is now, it's just a "coincidence" that no one
> actually called kvm__register_pio() so far. This also makes the other
> patches movable and replaceable: this patch prepares the stage, the
> follow-up patches just fill it.

Sure, makes sense.

>
>>>  
>>>     ops     = entry->ops;
>>>  
>>> @@ -162,7 +163,6 @@ bool kvm__emulate_io(struct kvm_cpu *vcpu, u16
>>> port, void *data, int direction, 
>>>     ioport_put(&ioport_tree, entry);
>>>  
>>> -out:
>>>     if (ret)
>>>             return true;
>>>  
>>> diff --git a/mmio.c b/mmio.c
>>> index cd141cd3..4cce1901 100644
>>> --- a/mmio.c
>>> +++ b/mmio.c
>>> @@ -19,13 +19,14 @@ static DEFINE_MUTEX(mmio_lock);
>>>  
>>>  struct mmio_mapping {
>>>     struct rb_int_node      node;
>>> -   void                    (*mmio_fn)(struct kvm_cpu
>>> *vcpu, u64 addr, u8 *data, u32 len, u8 is_write, void *ptr);
>>> +   mmio_handler_fn         mmio_fn;
>>>     void                    *ptr;
>>>     u32                     refcount;
>>>     bool                    remove;
>>>  };
>>>  
>>>  static struct rb_root mmio_tree = RB_ROOT;
>>> +static struct rb_root pio_tree = RB_ROOT;
>>>  
>>>  static struct mmio_mapping *mmio_search(struct rb_root *root, u64
>>> addr, u64 len) {
>>> @@ -103,9 +104,9 @@ static void mmio_put(struct kvm *kvm, struct
>>> rb_root *root, struct mmio_mapping mutex_unlock(&mmio_lock);
>>>  }
>>>  
>>> -int kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64
>>> phys_addr_len, bool coalesce,
>>> -                  void (*mmio_fn)(struct kvm_cpu *vcpu, u64
>>> addr, u8 *data, u32 len, u8 is_write, void *ptr),
>>> -                   void *ptr)
>>> +int kvm__register_iotrap(struct kvm *kvm, u64 phys_addr, u64
>>> phys_addr_len,
>>> +                    mmio_handler_fn mmio_fn, void *ptr,
>>> +                    unsigned int flags)
>>>  {
>>>     struct mmio_mapping *mmio;
>>>     struct kvm_coalesced_mmio_zone zone;
>>> @@ -127,7 +128,7 @@ int kvm__register_mmio(struct kvm *kvm, u64
>>> phys_addr, u64 phys_addr_len, bool c .remove                = false,
>>>     };
>>>  
>>> -   if (coalesce) {
>>> +   if (flags & IOTRAP_COALESCE) {  
>> There is no such flag being used in ioport.c, is it valid to have the
>> flags DEVICE_BUS_IOPORT and IOTRAP_COALESCE set at the same time?
> Well, yes and no: Yes, as this maps to MMIO on non-x86, so
> theoretically could use the flag. No, as no one registering a trap
> handler through kvm__register_pio() would ever have the chance to set
> this flag.
> I can check for the registration being for the MMIO bus before entering
> the "if" branch, if that is what you mean?

Yes, that's what I mean, please return an error if a device tries to register an
I/O port and sets the COALESCE flag, since that's forbidden for architectures 
with
true I/O ports (like x86). I realize it might look peculiar since we don't have
any devices that do that, but I think it can be useful for catching errors when
writing new devices.

Thanks,

Alex

>
>>>             zone = (struct kvm_coalesced_mmio_zone) {
>>>                     .addr   = phys_addr,
>>>                     .size   = phys_addr_len,
>>> @@ -139,18 +140,27 @@ int kvm__register_mmio(struct kvm *kvm, u64
>>> phys_addr, u64 phys_addr_len, bool c }
>>>     }
>>>     mutex_lock(&mmio_lock);
>>> -   ret = mmio_insert(&mmio_tree, mmio);
>>> +   if ((flags & IOTRAP_BUS_MASK) == DEVICE_BUS_IOPORT)
>>> +           ret = mmio_insert(&pio_tree, mmio);
>>> +   else
>>> +           ret = mmio_insert(&mmio_tree, mmio);
>>>     mutex_unlock(&mmio_lock);
>>>  
>>>     return ret;
>>>  }
>>>  
>>> -bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr)
>>> +bool kvm__deregister_iotrap(struct kvm *kvm, u64 phys_addr,
>>> unsigned int flags) {
>>>     struct mmio_mapping *mmio;
>>> +   struct rb_root *tree;
>>> +
>>> +   if ((flags & IOTRAP_BUS_MASK) == DEVICE_BUS_IOPORT)
>>> +           tree = &pio_tree;
>>> +   else
>>> +           tree = &mmio_tree;
>>>  
>>>     mutex_lock(&mmio_lock);
>>> -   mmio = mmio_search_single(&mmio_tree, phys_addr);
>>> +   mmio = mmio_search_single(tree, phys_addr);
>>>     if (mmio == NULL) {
>>>             mutex_unlock(&mmio_lock);
>>>             return false;
>>> @@ -167,7 +177,7 @@ bool kvm__deregister_mmio(struct kvm *kvm, u64
>>> phys_addr)
>>>      * called mmio_put(). This will trigger use-after-free
>>> errors on VCPU0. */
>>>     if (mmio->refcount == 0)
>>> -           mmio_deregister(kvm, &mmio_tree, mmio);
>>> +           mmio_deregister(kvm, tree, mmio);
>>>     else
>>>             mmio->remove = true;
>>>     mutex_unlock(&mmio_lock);
>>> @@ -175,7 +185,8 @@ bool kvm__deregister_mmio(struct kvm *kvm, u64
>>> phys_addr) return true;
>>>  }
>>>  
>>> -bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8
>>> *data, u32 len, u8 is_write) +bool kvm__emulate_mmio(struct kvm_cpu
>>> *vcpu, u64 phys_addr, u8 *data,
>>> +                  u32 len, u8 is_write)  
>> I don't think style changes should be part of this patch, the patch
>> is large enough as it is.
> I see, I just figured it's not worth a separate patch either.
>
> Cheers,
> Andre
>
>>>  {
>>>     struct mmio_mapping *mmio;
>>>  
>>> @@ -194,3 +205,31 @@ bool kvm__emulate_mmio(struct kvm_cpu *vcpu,
>>> u64 phys_addr, u8 *data, u32 len, u out:
>>>     return true;
>>>  }
>>> +
>>> +bool kvm__emulate_pio(struct kvm_cpu *vcpu, u16 port, void *data,
>>> +                int direction, int size, u32 count)
>>> +{
>>> +   struct mmio_mapping *mmio;
>>> +   bool is_write = direction == KVM_EXIT_IO_OUT;
>>> +
>>> +   mmio = mmio_get(&pio_tree, port, size);
>>> +   if (!mmio) {
>>> +           if (vcpu->kvm->cfg.ioport_debug) {
>>> +                   fprintf(stderr, "IO error: %s port=%x,
>>> size=%d, count=%u\n",
>>> +                           to_direction(direction), port,
>>> size, count); +
>>> +                   return false;
>>> +           }
>>> +           return true;
>>> +   }
>>> +
>>> +   while (count--) {
>>> +           mmio->mmio_fn(vcpu, port, data, size, is_write,
>>> mmio->ptr); +
>>> +           data += size;
>>> +   }
>>> +
>>> +   mmio_put(vcpu->kvm, &pio_tree, mmio);
>>> +
>>> +   return true;
>>> +}  
_______________________________________________
kvmarm mailing list
[email protected]
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

Reply via email to