Hi Andre,

On 22/03/2017 15:39, Andre Przywara wrote:
> Hi,
> 
> On 06/03/17 11:34, Eric Auger wrote:
>> This patch flushes the device table entries into guest RAM.
>> Both flat table and 2 stage tables are supported.  DeviceId
>> indexing is used.
>>
>> For each device listed in the device table, we also flush
>> the translation table using the vgic_its_flush/restore_itt
>> routines.
>>
>> On restore, devices are re-allocated and their itte are
>> re-built.
> 
> Some minor things below.
> In general I had quite some trouble to understand what's going on here,
> though I convinced myself that this is correct. So could you add a bit
> more comments here? For instance to explain that we have to explicitly
> handle the L1 table on restore, but not on flush.

On flush vgic_its_check_id does the 2 stage handling and computes the
entry address from the devid:

                if (!vgic_its_check_id(its, baser,
                                       dev->device_id, &eaddr))

Then you simply flush the entry at that address

On restore, you need to scan the level1 and level2 tables for valid entries.
> 
>> Signed-off-by: Eric Auger <[email protected]>
>>
>> ---
>> v2 -> v3:
>> - fix itt_addr bitmask in vgic_its_restore_dte
>> - addition of return 0 in vgic_its_restore_ite moved to
>>   the ITE related patch
>>
>> v1 -> v2:
>> - use 8 byte format for DTE and ITE
>> - support 2 stage format
>> - remove kvm parameter
>> - ITT flush/restore moved in a separate patch
>> - use deviceid indexing
>> ---
>>  virt/kvm/arm/vgic/vgic-its.c | 144 
>> ++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 142 insertions(+), 2 deletions(-)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>> index a216849..27ebabd 100644
>> --- a/virt/kvm/arm/vgic/vgic-its.c
>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>> @@ -1849,12 +1849,137 @@ static int vgic_its_restore_itt(struct vgic_its 
>> *its,
>>  }
>>  
>>  /**
>> + * vgic_its_flush_dte - Flush a device table entry at a given GPA
>> + *
>> + * @its: ITS handle
>> + * @dev: ITS device
>> + * @ptr: GPA
>> + */
>> +static int vgic_its_flush_dte(struct vgic_its *its,
>> +                          struct its_device *dev, gpa_t ptr)
>> +{
>> +    struct kvm *kvm = its->dev->kvm;
>> +    u64 val, itt_addr_field;
>> +    int ret;
>> +    u32 next_offset;
>> +
>> +    itt_addr_field = dev->itt_addr >> 8;
>> +    next_offset = compute_next_devid_offset(&its->device_list, dev);
>> +    val = (((u64)next_offset << 45) | (itt_addr_field << 5) |
> 
> So this gives you 19 bits for next_offset, but the value of
> VITS_DTE_MAX_DEVID_OFFSET suggests 20 bits. It should become more
> obvious what's happening here if use "BITS(x) - 1" at the definition as
> suggested before.
Yes this should be 19 bits
> 
> Also you limit itt_addr here to 40 bits, where the actual limit seems to
> be 44 bits (52 - 8). Is that limited by KVM somewhere else?

Those 40 bits match [47:8] of the itt_addr. I limited to 48 since I
found a comment saying
/*
 * We only implement 48 bits of PA at the moment, although the ITS
 * supports more. Let's be restrictive here.
 */


On the other hand there is
#define its_cmd_get_ittaddr(cmd) its_cmd_mask_field(cmd, 2,  8, 44)
To me this is wrong. I would have expected its_cmd_mask_field(cmd, 2,
8, 47) instead as the other *_ADDRESS()

If confirmed I will send a fix.



> Even if it is, I think we should make sure that itt_addr_field doesn't
> spill over into next_offset.
> 
>> +            (dev->nb_eventid_bits - 1));
> 
> Mmmh, here nb_eventid_bits seems to be the real bit number again. Puzzled.
corrected
> 
>> +    val = cpu_to_le64(val);
>> +    ret = kvm_write_guest(kvm, ptr, &val, 8);
>> +    return ret;
>> +}
>> +
>> +/**
>> + * vgic_its_restore_dte - restore a device table entry
>> + *
>> + * @its: its handle
>> + * @id: device id the DTE corresponds to
>> + * @ptr: kernel VA where the 8 byte DTE is located
>> + * @opaque: unused
>> + * @next: offset to the next valid device id
>> + *
>> + * Return: < 0 on error, 0 otherwise
>> + */
>> +static int vgic_its_restore_dte(struct vgic_its *its, u32 id,
>> +                            void *ptr, void *opaque, u32 *next)
>> +{
>> +    struct its_device *dev;
>> +    gpa_t itt_addr;
>> +    size_t size;
>> +    u64 val, *p = (u64 *)ptr;
>> +    int ret;
>> +
>> +    val = *p;
>> +    val = le64_to_cpu(val);
>> +
>> +    size = val & GENMASK_ULL(4, 0);
>> +    itt_addr = (val & GENMASK_ULL(44, 5)) >> 5;
>> +    *next = 1;
>> +
>> +    if (!itt_addr)
>> +            return 0;
>> +
>> +    /* dte entry is valid */
>> +    *next = (val & GENMASK_ULL(63, 45)) >> 45;
> 
> No need for GENMASK, just shift by 45.
yes
> 
>> +
>> +    ret = vgic_its_alloc_device(its, &dev, id,
>> +                                itt_addr, size);
>> +    if (ret)
>> +            return ret;
>> +    ret = vgic_its_restore_itt(its, dev);
>> +
>> +    return ret;
>> +}
>> +
>> +/**
>>   * vgic_its_flush_device_tables - flush the device table and all ITT
>>   * into guest RAM
>>   */
>>  static int vgic_its_flush_device_tables(struct vgic_its *its)
>>  {
>> -    return -ENXIO;
>> +    struct its_device *dev;
>> +    u64 baser;
>> +
>> +    baser = its->baser_device_table;
>> +
>> +    list_for_each_entry(dev, &its->device_list, dev_list) {
>> +            int ret;
>> +            gpa_t eaddr;
>> +
>> +            if (!vgic_its_check_id(its, baser,
>> +                                   dev->device_id, &eaddr))
>> +                    return -EINVAL;
>> +
>> +            ret = vgic_its_flush_itt(its, dev);
>> +            if (ret)
>> +                    return ret;
>> +
>> +            ret = vgic_its_flush_dte(its, dev, eaddr);
>> +            if (ret)
>> +                    return ret;
>> +            }
> 
> whitespace ?
OK
> 
>> +    return 0;
>> +}
>> +
>> +/**
>> + * handle_l1_entry - callback used for L1 entries (2 stage case)
>> + *
>> + * @its: its handle
>> + * @id: id
>> + * @addr: kernel VA
>> + * @opaque: unused
>> + * @next_offset: offset to the next L1 entry: 0 if the last element
>> + * was found, 1 otherwise
>> + */
>> +static int handle_l1_entry(struct vgic_its *its, u32 id, void *addr,
>> +                       void *opaque, u32 *next_offset)
>> +{
>> +    u64 *pe = addr;
>> +    gpa_t gpa;
>> +    int l2_start_id = id * (SZ_64K / 8);
> 
> I think we can use GITS_LVL1_ENTRY_SIZE here, which I suppose is what
> the 8 stands for.
yes
> 
>> +    int ret;
>> +
>> +    *pe = le64_to_cpu(*pe);
> 
> Is it correct to _update_ the entry here? I think that breaks BE, right?
> Beside I believe the ITS is not supposed to tinker with the L1 table
> entries, isn't it?
> 
> So should it be instead:
>       u64 pe = *(u64 *)addr;
>       
>       pe = le64_to_cpu(pe);
> 
> instead?
> And what "pe" stand for anyway? Maybe "entry" instead?
correct. I fixed that.
> 
>> +    *next_offset = 1;
>> +
>> +    if (!(*pe & BIT_ULL(63)))
>> +            return 0;
>> +
>> +    gpa = *pe & GENMASK_ULL(51, 16);
>> +
>> +    ret = lookup_table(its, gpa, SZ_64K, 8,
>> +                        l2_start_id, vgic_its_restore_dte, NULL);
>> +
>> +    if (ret == 1) {
>> +            /* last entry was found in this L2 table */
>> +            *next_offset = 0;
>> +            ret = 0;
>> +    }
>> +
>> +    return ret;
>>  }
>>  
>>  /**
>> @@ -1863,7 +1988,22 @@ static int vgic_its_flush_device_tables(struct 
>> vgic_its *its)
>>   */
>>  static int vgic_its_restore_device_tables(struct vgic_its *its)
>>  {
>> -    return -ENXIO;
>> +    u64 baser = its->baser_device_table;
>> +    int l1_tbl_size = GITS_BASER_NR_PAGES(baser) * SZ_64K;
>> +    int l1_esz = GITS_BASER_ENTRY_SIZE(baser);
>> +    gpa_t l1_gpa;
>> +
>> +    l1_gpa = BASER_ADDRESS(baser);
>> +    if (!l1_gpa)
>> +            return 0;
>> +
>> +    if (!(baser & GITS_BASER_INDIRECT))
>> +            return lookup_table(its, l1_gpa, l1_tbl_size, l1_esz,
>> +                                0, vgic_its_restore_dte, NULL);
>> +
>> +    /* two stage table */
>> +    return lookup_table(its, l1_gpa, l1_tbl_size, 8, 0,
>> +                        handle_l1_entry, NULL);
> 
> That usage of lookup_table with the callback is pretty neat!
thanks a lot for your time and the numerous bugs you found!

Cheers

Eric
> 
> Cheers,
> Andre.
> 
>>  }
>>  
>>  static int vgic_its_flush_cte(struct vgic_its *its,
>>
> 
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 
_______________________________________________
kvmarm mailing list
[email protected]
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

Reply via email to