Re: [PATCH 1/5] vfio/iommu_type1: Fixes vfio_dma_populate_bitmap to avoid dirty lose

2021-01-12 Thread Alex Williamson
On Thu, 7 Jan 2021 17:28:57 +0800
Keqian Zhu  wrote:

> Deferring the check of whether a vfio_dma is fully dirty until
> update_user_bitmap() makes it easy to lose dirty log. For example, after
> the pinned_scope of vfio_iommu is promoted, vfio_dma is no longer
> considered fully dirty, so we may lose dirty log that occurred before
> vfio_iommu was promoted.
>
> The key point is that pinned-dirty is not a real dirty tracking
> mechanism; it cannot continuously track dirty pages, it only restricts
> the dirty scope. It is essentially the same as fully-dirty: fully-dirty
> covers the full scope and pinned-dirty covers the pinned scope.
>
> So we must mark pinned-dirty or fully-dirty right after we start dirty
> tracking or clear the dirty bitmap, to ensure that the dirty log is
> recorded right away.

I was initially convinced by these first three patches, but upon
further review, I think the premise is wrong.  AIUI, the concern across
these patches is that our dirty bitmap is only populated with pages
dirtied by pinning and we only take into account the pinned page dirty
scope at the time the bitmap is retrieved by the user.  You suppose
this presents a gap where, if a vendor driver has not yet identified
itself as having a pinned-page dirty scope, the entire bitmap should be
considered dirty, regardless of whether that driver later pins pages
prior to the user retrieving the dirty bitmap.

I don't think this is how we intended the cooperation between the iommu
driver and vendor driver to work.  By pinning pages a vendor driver is
not declaring that its future dirty page scope is limited only to
pinned pages; instead it is declaring itself a participant in
dirty page tracking and taking responsibility for pinning any necessary
pages.  For example we might extend VFIO_IOMMU_DIRTY_PAGES_FLAG_START
to trigger a blocking notification to groups to not only begin dirty
tracking, but also to synchronously register their current device DMA
footprint.  This patch would require a vendor driver to possibly perform
a gratuitous page pinning in order to set the scope prior to dirty
logging being enabled, or else the initial bitmap will be fully dirty.
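
To make that concrete, the idea is roughly the following (purely an
illustration of the flow -- the notifier action and the vendor-driver
helper below are invented names, nothing like them exists in the VFIO
code today):

    /* container side, when the user sets VFIO_IOMMU_DIRTY_PAGES_FLAG_START */
    iommu->dirty_page_tracking = true;
    /*
     * Block until every participating vendor driver has registered its
     * current DMA footprint, typically by pinning it.
     */
    blocking_notifier_call_chain(&iommu->notifier,
                                 VFIO_IOMMU_NOTIFY_DIRTY_START, NULL);

    /* vendor driver side, hypothetical notifier callback */
    static int mdev_dirty_start(struct notifier_block *nb,
                                unsigned long action, void *data)
    {
        if (action != VFIO_IOMMU_NOTIFY_DIRTY_START)
            return NOTIFY_DONE;
        /* pin (and thereby report as dirty) everything we may DMA to */
        return mdev_pin_current_footprint(nb) ? NOTIFY_BAD : NOTIFY_OK;
    }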

Therefore, I don't see that this series is necessary or correct.  Kirti,
does this match your thinking?

Thinking about these semantics, it seems there might still be an issue
if a group with non-pinned-page dirty scope is detached with dirty
logging enabled.  It seems this should in fact fully populate the dirty
bitmaps at the time it's removed since we don't know the extent of its
previous DMA, nor will the group be present to trigger the full bitmap
when the user retrieves the dirty bitmap.  Creating fully populated
bitmaps at the time tracking is enabled negates our ability to take
advantage of later enlightenment though.  Thanks,

Alex

> Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages tracking")
> Signed-off-by: Keqian Zhu 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 33 ++---
>  1 file changed, 22 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index bceda5e8baaa..b0a26e8e0adf 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -224,7 +224,7 @@ static void vfio_dma_bitmap_free(struct vfio_dma *dma)
>   dma->bitmap = NULL;
>  }
>  
> -static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
> +static void vfio_dma_populate_bitmap_pinned(struct vfio_dma *dma, size_t pgsize)
>  {
>   struct rb_node *p;
>   unsigned long pgshift = __ffs(pgsize);
> @@ -236,6 +236,25 @@ static void vfio_dma_populate_bitmap(struct vfio_dma *dma, size_t pgsize)
>   }
>  }
>  
> +static void vfio_dma_populate_bitmap_full(struct vfio_dma *dma, size_t pgsize)
> +{
> + unsigned long pgshift = __ffs(pgsize);
> + unsigned long nbits = dma->size >> pgshift;
> +
> + bitmap_set(dma->bitmap, 0, nbits);
> +}
> +
> +static void vfio_dma_populate_bitmap(struct vfio_iommu *iommu,
> +  struct vfio_dma *dma)
> +{
> + size_t pgsize = (size_t)1 << __ffs(iommu->pgsize_bitmap);
> +
> + if (iommu->pinned_page_dirty_scope)
> + vfio_dma_populate_bitmap_pinned(dma, pgsize);
> + else if (dma->iommu_mapped)
> + vfio_dma_populate_bitmap_full(dma, pgsize);
> +}
> +
>  static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
>  {
>   struct rb_node *n;
> @@ -257,7 +276,7 @@ static int vfio_dma_bitmap_alloc_all(struct vfio_iommu *iommu)
>   }
>   return ret;
>   }
> - vfio_dma_populate_bitmap(dma, pgsize);
> + vfio_dma_populate_bitmap(iommu, dma);
>   }
>   return 0;
>  }
> @@ -987,13 +1006,6 @@ static int update_user_bitmap(u64 __user *bitmap, struct vfio_iommu *iommu,
>   unsigned long shift = bit_offset % BITS_PER_LONG;
>   unsigned long leftover;
>  
> - /*
> -  * mark all pages dirty if any I

Re: [RFC PATCH v2 15/26] of/fdt: Introduce early_init_dt_add_memory_hyp()

2021-01-12 Thread Quentin Perret
On Tuesday 12 Jan 2021 at 10:45:56 (-0600), Rob Herring wrote:
> Umm, yes you are right. But both are dealing with nomap. So someone
> needs to sort out what the right thing to do here is. No one cared
> enough to follow up in a year and a half.

Fair enough, happy to do that. I'll send a small series with these two
patches, independently from this series (which may take a while to land).

Thanks,
Quentin


Re: [RFC PATCH v2 15/26] of/fdt: Introduce early_init_dt_add_memory_hyp()

2021-01-12 Thread Quentin Perret
On Tuesday 12 Jan 2021 at 09:53:36 (-0600), Rob Herring wrote:
> On Tue, Jan 12, 2021 at 8:26 AM Quentin Perret  wrote:
> >
> > On Tuesday 12 Jan 2021 at 08:10:47 (-0600), Rob Herring wrote:
> > > On Tue, Jan 12, 2021 at 3:51 AM Quentin Perret  wrote:
> > > >
> > > > On Monday 11 Jan 2021 at 08:45:10 (-0600), Rob Herring wrote:
> > > > > On Fri, Jan 8, 2021 at 6:16 AM Quentin Perret  
> > > > > wrote:
> > > > > >
> > > > > > Introduce early_init_dt_add_memory_hyp() to allow KVM to conserve a 
> > > > > > copy
> > > > > > of the memory regions parsed from DT. This will be needed in the 
> > > > > > context
> > > > > > of the protected nVHE feature of KVM/arm64 where the code running 
> > > > > > at EL2
> > > > > > will be cleanly separated from the host kernel during boot, and will
> > > > > > need its own representation of memory.
> > > > >
> > > > > What happened to doing this with memblock?
> > > >
> > > > I gave it a go, but as mentioned in v1, I ran into issues for nomap
> > > > regions. I want the hypervisor to know about these memory regions (it's
> > > > possible some of those will be given to protected guests for instance)
> > > > but these seem to be entirely removed from the memblocks when using DT:
> > > >
> > > > https://elixir.bootlin.com/linux/latest/source/drivers/of/fdt.c#L1153
> > > >
> > > > EFI appears to do things differently, though, as it 'just' uses
> > > > memblock_mark_nomap() instead of actively removing the memblock. And 
> > > > that
> > > > means I could actually use the memblock API for EFI, but I'd rather
> > > > have a common solution. I tried to understand why things are done
> > > > differently but couldn't find an answer and kept things simple and
> > > > working for now.
> > > >
> > > > Is there a good reason for not using memblock_mark_nomap() with DT? If
> > > > not, I'm happy to try that.
> > >
> > > There were 2 patches to do that, but it never got resolved. See here[1].
> >
> > Thanks. So the DT stuff predates the introduction of memblock_mark_nomap,
> > that's why...
> >
> > From reading the discussions, [1] still looks like a sensible patch on its
> > own, independently of the issue Nicolas tried to solve. Any reason for not
> > applying it?
> 
> As I mentioned in the thread, same patch with 2 different reasons. So
> I just wanted a better commit message covering both.

Sorry if I'm being thick, but I'm not seeing it. How are they the same?
IIUC, as per Nicolas' last reply, using memblock_mark_nomap() does not
solve his issue with a broken DT. These 2 patches address two completely
separate issues no?

Thanks,
Quentin


Re: [PATCH 4/5] vfio/iommu_type1: Carefully use unmap_unpin_all during dirty tracking

2021-01-12 Thread Alex Williamson
On Tue, 12 Jan 2021 20:04:38 +0800
Keqian Zhu  wrote:

> On 2021/1/12 5:49, Alex Williamson wrote:
> > On Thu, 7 Jan 2021 17:29:00 +0800
> > Keqian Zhu  wrote:
> >   
> >> If we detach a group during dirty page tracking, we shouldn't remove
> >> vfio_dma, because the dirty log would be lost.
> >>
> >> But we don't prevent unmap_unpin_all in vfio_iommu_release, because
> >> under normal procedure, dirty tracking has been stopped.  
> > 
> > This looks like it's creating a larger problem than it's fixing, it's
> > not our job to maintain the dirty bitmap regardless of what the user
> > does.  If the user detaches the last group in a container causing the
> > mappings within that container to be deconstructed before the user has
> > collected dirty pages, that sounds like a user error.  A container with
> > no groups is de-privileged and therefore loses all state.  Thanks,
> > 
> > Alex  
> 
> Hi Alex,
> 
> This looks good to me ;-). That's a reasonable constraint for user behavior.
> 
> What about replacing this patch with an addition to the uapi document of
> VFIO_GROUP_UNSET_CONTAINER? Users should pay attention to this when calling
> this ioctl during dirty tracking.

Here's the current uapi comment:

/**
 * VFIO_GROUP_UNSET_CONTAINER - _IO(VFIO_TYPE, VFIO_BASE + 5)
 *
 * Remove the group from the attached container.  This is the
 * opposite of the SET_CONTAINER call and returns the group to
 * an initial state.  All device file descriptors must be released
 * prior to calling this interface.  When removing the last group
 * from a container, the IOMMU will be disabled and all state lost,
 * effectively also returning the VFIO file descriptor to an initial
 * state.
 * Return: 0 on success, -errno on failure.
 * Availability: When attached to container
 */

So we already indicate that "all state" of the container is lost when
removing the last group; I don't see that it's necessary to
explicitly include dirty bitmap state beyond that statement.  Without
mappings there can be no dirty bitmap to track.

> And any comments on the other patches? Thanks.

I had a difficult time mapping the commit log to the actual code
change; I'll likely have some wording suggestions.  Is patch 5/5 still
necessary if this patch is dropped?  Thanks,

Alex

> >> Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages 
> >> tracking")
> >> Signed-off-by: Keqian Zhu 
> >> ---
> >>  drivers/vfio/vfio_iommu_type1.c | 14 --
> >>  1 file changed, 12 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c 
> >> b/drivers/vfio/vfio_iommu_type1.c
> >> index 26b7eb2a5cfc..9776a059904d 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -2373,7 +2373,12 @@ static void vfio_iommu_type1_detach_group(void 
> >> *iommu_data,
> >>if (list_empty(&iommu->external_domain->group_list)) {
> >>vfio_sanity_check_pfn_list(iommu);
> >>  
> >> -  if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
> >> +  /*
> >> +   * During dirty page tracking, we can't remove
> >> +   * vfio_dma because the dirty log would be lost.
> >> +   */
> >> +  if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu) &&
> >> +  !iommu->dirty_page_tracking)
> >>vfio_iommu_unmap_unpin_all(iommu);
> >>  
> >>kfree(iommu->external_domain);
> >> @@ -2406,10 +2411,15 @@ static void vfio_iommu_type1_detach_group(void 
> >> *iommu_data,
> >> * iommu and external domain doesn't exist, then all the
> >> * mappings go away too. If it's the last domain with iommu and
> >> * external domain exist, update accounting
> >> +   *
> >> +   * Note: During dirty page tracking, we can't remove vfio_dma
> >> +   * because the dirty log would be lost. Just updating accounting is
> >> +   * a good choice.
> >> */
> >>if (list_empty(&domain->group_list)) {
> >>if (list_is_singular(&iommu->domain_list)) {
> >> -  if (!iommu->external_domain)
> >> +  if (!iommu->external_domain &&
> >> +  !iommu->dirty_page_tracking)
> >>vfio_iommu_unmap_unpin_all(iommu);
> >>else
> >>
> >> vfio_iommu_unmap_unpin_reaccount(iommu);  
> > 
> > .
> >   
> 



Re: [RFC PATCH v2 15/26] of/fdt: Introduce early_init_dt_add_memory_hyp()

2021-01-12 Thread Rob Herring
On Tue, Jan 12, 2021 at 10:15 AM Quentin Perret  wrote:
>
> On Tuesday 12 Jan 2021 at 09:53:36 (-0600), Rob Herring wrote:
> > On Tue, Jan 12, 2021 at 8:26 AM Quentin Perret  wrote:
> > >
> > > On Tuesday 12 Jan 2021 at 08:10:47 (-0600), Rob Herring wrote:
> > > > On Tue, Jan 12, 2021 at 3:51 AM Quentin Perret  
> > > > wrote:
> > > > >
> > > > > On Monday 11 Jan 2021 at 08:45:10 (-0600), Rob Herring wrote:
> > > > > > On Fri, Jan 8, 2021 at 6:16 AM Quentin Perret  
> > > > > > wrote:
> > > > > > >
> > > > > > > Introduce early_init_dt_add_memory_hyp() to allow KVM to conserve 
> > > > > > > a copy
> > > > > > > of the memory regions parsed from DT. This will be needed in the 
> > > > > > > context
> > > > > > > of the protected nVHE feature of KVM/arm64 where the code running 
> > > > > > > at EL2
> > > > > > > will be cleanly separated from the host kernel during boot, and 
> > > > > > > will
> > > > > > > need its own representation of memory.
> > > > > >
> > > > > > What happened to doing this with memblock?
> > > > >
> > > > > I gave it a go, but as mentioned in v1, I ran into issues for nomap
> > > > > regions. I want the hypervisor to know about these memory regions 
> > > > > (it's
> > > > > possible some of those will be given to protected guests for instance)
> > > > > but these seem to be entirely removed from the memblocks when using 
> > > > > DT:
> > > > >
> > > > > https://elixir.bootlin.com/linux/latest/source/drivers/of/fdt.c#L1153
> > > > >
> > > > > EFI appears to do things differently, though, as it 'just' uses
> > > > > memblock_mark_nomap() instead of actively removing the memblock. And 
> > > > > that
> > > > > means I could actually use the memblock API for EFI, but I'd rather
> > > > > have a common solution. I tried to understand why things are done
> > > > > differently but couldn't find an answer and kept things simple and
> > > > > working for now.
> > > > >
> > > > > Is there a good reason for not using memblock_mark_nomap() with DT? If
> > > > > not, I'm happy to try that.
> > > >
> > > > There were 2 patches to do that, but it never got resolved. See here[1].
> > >
> > > Thanks. So the DT stuff predates the introduction of memblock_mark_nomap,
> > > that's why...
> > >
> > > From reading the discussions, [1] still looks like a sensible patch on its
> > > own, independently of the issue Nicolas tried to solve. Any reason for not
> > > applying it?
> >
> > As I mentioned in the thread, same patch with 2 different reasons. So
> > I just wanted a better commit message covering both.
>
> Sorry if I'm being thick, but I'm not seeing it. How are they the same?
> IIUC, as per Nicolas' last reply, using memblock_mark_nomap() does not
> solve his issue with a broken DT. These 2 patches address two completely
> separate issues no?

Umm, yes you are right. But both are dealing with nomap. So someone
needs to sort out what the right thing to do here is. No one cared
enough to follow up in a year and a half.

Rob


Re: [RFC PATCH v2 15/26] of/fdt: Introduce early_init_dt_add_memory_hyp()

2021-01-12 Thread Rob Herring
On Tue, Jan 12, 2021 at 8:26 AM Quentin Perret  wrote:
>
> On Tuesday 12 Jan 2021 at 08:10:47 (-0600), Rob Herring wrote:
> > On Tue, Jan 12, 2021 at 3:51 AM Quentin Perret  wrote:
> > >
> > > On Monday 11 Jan 2021 at 08:45:10 (-0600), Rob Herring wrote:
> > > > On Fri, Jan 8, 2021 at 6:16 AM Quentin Perret  
> > > > wrote:
> > > > >
> > > > > Introduce early_init_dt_add_memory_hyp() to allow KVM to conserve a 
> > > > > copy
> > > > > of the memory regions parsed from DT. This will be needed in the 
> > > > > context
> > > > > of the protected nVHE feature of KVM/arm64 where the code running at 
> > > > > EL2
> > > > > will be cleanly separated from the host kernel during boot, and will
> > > > > need its own representation of memory.
> > > >
> > > > What happened to doing this with memblock?
> > >
> > > I gave it a go, but as mentioned in v1, I ran into issues for nomap
> > > regions. I want the hypervisor to know about these memory regions (it's
> > > possible some of those will be given to protected guests for instance)
> > > but these seem to be entirely removed from the memblocks when using DT:
> > >
> > > https://elixir.bootlin.com/linux/latest/source/drivers/of/fdt.c#L1153
> > >
> > > EFI appears to do things differently, though, as it 'just' uses
> > > memblock_mark_nomap() instead of actively removing the memblock. And that
> > > means I could actually use the memblock API for EFI, but I'd rather
> > > have a common solution. I tried to understand why things are done
> > > differently but couldn't find an answer and kept things simple and
> > > working for now.
> > >
> > > Is there a good reason for not using memblock_mark_nomap() with DT? If
> > > not, I'm happy to try that.
> >
> > There were 2 patches to do that, but it never got resolved. See here[1].
>
> Thanks. So the DT stuff predates the introduction of memblock_mark_nomap,
> that's why...
>
> From reading the discussions, [1] still looks like a sensible patch on its
> own, independently of the issue Nicolas tried to solve. Any reason for not
> applying it?

As I mentioned in the thread, same patch with 2 different reasons. So
I just wanted a better commit message covering both.

Rob


Re: [RFC PATCH v2 15/26] of/fdt: Introduce early_init_dt_add_memory_hyp()

2021-01-12 Thread Rob Herring
On Tue, Jan 12, 2021 at 3:51 AM Quentin Perret  wrote:
>
> On Monday 11 Jan 2021 at 08:45:10 (-0600), Rob Herring wrote:
> > On Fri, Jan 8, 2021 at 6:16 AM Quentin Perret  wrote:
> > >
> > > Introduce early_init_dt_add_memory_hyp() to allow KVM to conserve a copy
> > > of the memory regions parsed from DT. This will be needed in the context
> > > of the protected nVHE feature of KVM/arm64 where the code running at EL2
> > > will be cleanly separated from the host kernel during boot, and will
> > > need its own representation of memory.
> >
> > What happened to doing this with memblock?
>
> I gave it a go, but as mentioned in v1, I ran into issues for nomap
> regions. I want the hypervisor to know about these memory regions (it's
> possible some of those will be given to protected guests for instance)
> but these seem to be entirely removed from the memblocks when using DT:
>
> https://elixir.bootlin.com/linux/latest/source/drivers/of/fdt.c#L1153
>
> EFI appears to do things differently, though, as it 'just' uses
> memblock_mark_nomap() instead of actively removing the memblock. And that
> means I could actually use the memblock API for EFI, but I'd rather
> have a common solution. I tried to understand why things are done
> differently but couldn't find an answer and kept things simple and
> working for now.
>
> Is there a good reason for not using memblock_mark_nomap() with DT? If
> not, I'm happy to try that.

There were 2 patches to do that, but it never got resolved. See here[1].

Rob

[1] https://lore.kernel.org/linux-devicetree/?q=s%3Ano-map


Re: [RFC PATCH v2 15/26] of/fdt: Introduce early_init_dt_add_memory_hyp()

2021-01-12 Thread Quentin Perret
On Tuesday 12 Jan 2021 at 08:10:47 (-0600), Rob Herring wrote:
> On Tue, Jan 12, 2021 at 3:51 AM Quentin Perret  wrote:
> >
> > On Monday 11 Jan 2021 at 08:45:10 (-0600), Rob Herring wrote:
> > > On Fri, Jan 8, 2021 at 6:16 AM Quentin Perret  wrote:
> > > >
> > > > Introduce early_init_dt_add_memory_hyp() to allow KVM to conserve a copy
> > > > of the memory regions parsed from DT. This will be needed in the context
> > > > of the protected nVHE feature of KVM/arm64 where the code running at EL2
> > > > will be cleanly separated from the host kernel during boot, and will
> > > > need its own representation of memory.
> > >
> > > What happened to doing this with memblock?
> >
> > I gave it a go, but as mentioned in v1, I ran into issues for nomap
> > regions. I want the hypervisor to know about these memory regions (it's
> > possible some of those will be given to protected guests for instance)
> > but these seem to be entirely removed from the memblocks when using DT:
> >
> > https://elixir.bootlin.com/linux/latest/source/drivers/of/fdt.c#L1153
> >
> > EFI appears to do things differently, though, as it 'just' uses
> > memblock_mark_nomap() instead of actively removing the memblock. And that
> > means I could actually use the memblock API for EFI, but I'd rather
> > have a common solution. I tried to understand why things are done
> > differently but couldn't find an answer and kept things simple and
> > working for now.
> >
> > Is there a good reason for not using memblock_mark_nomap() with DT? If
> > not, I'm happy to try that.
> 
> There were 2 patches to do that, but it never got resolved. See here[1].

Thanks. So the DT stuff predates the introduction of memblock_mark_nomap,
that's why...

From reading the discussions, [1] still looks like a sensible patch on its
own, independently of the issue Nicolas tried to solve. Any reason for not
applying it?

I'll try to rework my series on top and see how that goes.

Thanks,
Quentin

[1] 
https://lore.kernel.org/linux-devicetree/1562920284-18638-1-git-send-email-karah...@amazon.de/


Re: [PATCH 8/9] KVM: arm64: vgic-v3: Expose GICR_TYPER.Last for userspace

2021-01-12 Thread Marc Zyngier

On 2021-01-12 17:28, Alexandru Elisei wrote:

Hi Eric,

On 12/12/20 6:50 PM, Eric Auger wrote:

Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
reporting of GICR_TYPER.Last for userspace") temporarily fixed
a bug identified when attempting to access the GICR_TYPER
register before the redistributor region setting but dropped
the support of the LAST bit. This patch restores its
support (if the redistributor region was set) while keeping the
code safe.

If I understand your patch correctly, it is possible for the
GICR_TYPER.Last bit to be transiently 1 if the register is accessed
before all the redistributor regions have been configured.

Arm IHI 0069F states that accesses to the GICR_TYPER register are RO. I
haven't found exactly what RO means (please point me to the definition
if you find it in the architecture!), but I assume it means read-only
and I'm not sure how correct (from an architectural point of view) it
is for two subsequent reads of this register to return different
values. Maybe Marc can shed some light on this.

RO = Read-Only indeed. Not sure that's documented anywhere in the
architecture, but this is enough of a well-known acronym that even the
ARM ARM doesn't feel the need to invent a new definition for it.

As for your concern, I don't think it is a problem to return different
values if the HW has changed in between. This may need to be documented
though.

Thanks,

   M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH 8/9] KVM: arm64: vgic-v3: Expose GICR_TYPER.Last for userspace

2021-01-12 Thread Alexandru Elisei
Hi Eric,

On 12/12/20 6:50 PM, Eric Auger wrote:
> Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
> reporting of GICR_TYPER.Last for userspace") temporarily fixed
> a bug identified when attempting to access the GICR_TYPER
> register before the redistributor region setting but dropped
> the support of the LAST bit. This patch restores its
> support (if the redistributor region was set) while keeping the
> code safe.

If I understand your patch correctly, it is possible for the GICR_TYPER.Last bit
to be transiently 1 if the register is accessed before all the redistributor
regions have been configured.

Arm IHI 0069F states that accesses to the GICR_TYPER register are RO. I haven't
found exactly what RO means (please point me to the definition if you find it in
the architecture!), but I assume it means read-only and I'm not sure how correct
(from an architectural point of view) it is for two subsequent reads of this
register to return different values. Maybe Marc can shed some light on this.

Thanks,
Alex
>
> Signed-off-by: Eric Auger 
> ---
>  arch/arm64/kvm/vgic/vgic-mmio-v3.c | 7 ++-
>  include/kvm/arm_vgic.h | 1 +
>  2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c 
> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 581f0f49..2f9ef6058f6e 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -277,6 +277,8 @@ static unsigned long vgic_uaccess_read_v3r_typer(struct 
> kvm_vcpu *vcpu,
>gpa_t addr, unsigned int len)
>  {
>   unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu);
> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> + struct vgic_redist_region *rdreg = vgic_cpu->rdreg;
>   int target_vcpu_id = vcpu->vcpu_id;
>   u64 value;
>  
> @@ -286,7 +288,9 @@ static unsigned long vgic_uaccess_read_v3r_typer(struct 
> kvm_vcpu *vcpu,
>   if (vgic_has_its(vcpu->kvm))
>   value |= GICR_TYPER_PLPIS;
>  
> - /* reporting of the Last bit is not supported for userspace */
> + if (rdreg && (vgic_cpu->rdreg_index == (rdreg->free_index - 1)))
> + value |= GICR_TYPER_LAST;
> +
>   return extract_bytes(value, addr & 7, len);
>  }
>  
> @@ -714,6 +718,7 @@ int vgic_register_redist_iodev(struct kvm_vcpu *vcpu)
>   return -EINVAL;
>  
>   vgic_cpu->rdreg = rdreg;
> + vgic_cpu->rdreg_index = rdreg->free_index;
>  
>   rd_base = rdreg->base + rdreg->free_index * KVM_VGIC_V3_REDIST_SIZE;
>  
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index a8d8fdcd3723..596c069263a7 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -322,6 +322,7 @@ struct vgic_cpu {
>*/
>   struct vgic_io_device   rd_iodev;
>   struct vgic_redist_region *rdreg;
> + u32 rdreg_index;
>  
>   /* Contains the attributes and gpa of the LPI pending tables. */
>   u64 pendbaser;


Re: [PATCH 8/9] KVM: arm64: vgic-v3: Expose GICR_TYPER.Last for userspace

2021-01-12 Thread Alexandru Elisei
Hi Eric,

On 12/12/20 6:50 PM, Eric Auger wrote:
> Commit 23bde34771f1 ("KVM: arm64: vgic-v3: Drop the
> reporting of GICR_TYPER.Last for userspace") temporarily fixed
> a bug identified when attempting to access the GICR_TYPER
> register before the redistributor region setting but dropped
> the support of the LAST bit. This patch restores its
> support (if the redistributor region was set) while keeping the
> code safe.

I suppose the reason for emulating GICR_TYPER.Last is for architecture 
compliance,
right? I think that should be in the commit message.

>
> Signed-off-by: Eric Auger 
> ---
>  arch/arm64/kvm/vgic/vgic-mmio-v3.c | 7 ++-
>  include/kvm/arm_vgic.h | 1 +
>  2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c 
> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 581f0f49..2f9ef6058f6e 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -277,6 +277,8 @@ static unsigned long vgic_uaccess_read_v3r_typer(struct 
> kvm_vcpu *vcpu,
>gpa_t addr, unsigned int len)
>  {
>   unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu);
> + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> + struct vgic_redist_region *rdreg = vgic_cpu->rdreg;
>   int target_vcpu_id = vcpu->vcpu_id;
>   u64 value;
>  
> @@ -286,7 +288,9 @@ static unsigned long vgic_uaccess_read_v3r_typer(struct 
> kvm_vcpu *vcpu,
>   if (vgic_has_its(vcpu->kvm))
>   value |= GICR_TYPER_PLPIS;
>  
> - /* reporting of the Last bit is not supported for userspace */
> + if (rdreg && (vgic_cpu->rdreg_index == (rdreg->free_index - 1)))
> + value |= GICR_TYPER_LAST;
> +
>   return extract_bytes(value, addr & 7, len);
>  }
>  
> @@ -714,6 +718,7 @@ int vgic_register_redist_iodev(struct kvm_vcpu *vcpu)
>   return -EINVAL;
>  
>   vgic_cpu->rdreg = rdreg;
> + vgic_cpu->rdreg_index = rdreg->free_index;

What happens if the next redistributor region we register has the base address
adjacent to this one?

I'm really not familiar with the code, but is it not possible to create two
Redistributor regions (via
KVM_DEV_ARM_VGIC_GRP_ADDR(KVM_VGIC_V3_ADDR_TYPE_REDIST)) where the second
Redistributor region start address is immediately after the last Redistributor 
in
the preceding region?

Thanks,
Alex
>  
>   rd_base = rdreg->base + rdreg->free_index * KVM_VGIC_V3_REDIST_SIZE;
>  
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index a8d8fdcd3723..596c069263a7 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -322,6 +322,7 @@ struct vgic_cpu {
>*/
>   struct vgic_io_device   rd_iodev;
>   struct vgic_redist_region *rdreg;
> + u32 rdreg_index;
>  
>   /* Contains the attributes and gpa of the LPI pending tables. */
>   u64 pendbaser;


Re: [PATCH 7/9] KVM: arm64: Simplify argument passing to vgic_uaccess_[read|write]

2021-01-12 Thread Alexandru Elisei
Hi Eric,

On 1/12/21 4:04 PM, Alexandru Elisei wrote:
> Hi Eric,
>
> On 12/12/20 6:50 PM, Eric Auger wrote:
>> Instead of converting the vgic_io_device handle to a kvm_io_device
>> handled and then do the oppositive, pass a vgic_io_device pointer all
>> along the call chain.
> To me, it looks like the commit message describes what the patch does instead 
> of
> why it does it.
>
> What are "vgic_io_device handle" and "kvm_io_device handled"?

Sorry, I think I got it now. You were referring to the argument types struct
vgic_io_device and struct kvm_io_device. The patch looks like a very good 
cleanup.

How about changing the commit message to something like this (feel free to
ignore/change it if you think of something else):

vgic_uaccess() takes a struct vgic_io_device argument, converts it to a struct
kvm_io_device and passes it to the read/write accessor functions, which convert
it back to a struct vgic_io_device. Avoid the indirection by passing the struct
vgic_io_device argument directly to vgic_uaccess_{read,write}.

Thanks,
Alex
>
> Thanks,
> Alex
>> Signed-off-by: Eric Auger 
>> ---
>>  arch/arm64/kvm/vgic/vgic-mmio.c | 10 --
>>  1 file changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/vgic/vgic-mmio.c 
>> b/arch/arm64/kvm/vgic/vgic-mmio.c
>> index b2d73fc0d1ef..48c6067fc5ec 100644
>> --- a/arch/arm64/kvm/vgic/vgic-mmio.c
>> +++ b/arch/arm64/kvm/vgic/vgic-mmio.c
>> @@ -938,10 +938,9 @@ vgic_get_mmio_region(struct kvm_vcpu *vcpu, struct 
>> vgic_io_device *iodev,
>>  return region;
>>  }
>>  
>> -static int vgic_uaccess_read(struct kvm_vcpu *vcpu, struct kvm_io_device 
>> *dev,
>> +static int vgic_uaccess_read(struct kvm_vcpu *vcpu, struct vgic_io_device 
>> *iodev,
>>   gpa_t addr, u32 *val)
>>  {
>> -struct vgic_io_device *iodev = kvm_to_vgic_iodev(dev);
>>  const struct vgic_register_region *region;
>>  struct kvm_vcpu *r_vcpu;
>>  
>> @@ -960,10 +959,9 @@ static int vgic_uaccess_read(struct kvm_vcpu *vcpu, 
>> struct kvm_io_device *dev,
>>  return 0;
>>  }
>>  
>> -static int vgic_uaccess_write(struct kvm_vcpu *vcpu, struct kvm_io_device 
>> *dev,
>> +static int vgic_uaccess_write(struct kvm_vcpu *vcpu, struct vgic_io_device 
>> *iodev,
>>gpa_t addr, const u32 *val)
>>  {
>> -struct vgic_io_device *iodev = kvm_to_vgic_iodev(dev);
>>  const struct vgic_register_region *region;
>>  struct kvm_vcpu *r_vcpu;
>>  
>> @@ -986,9 +984,9 @@ int vgic_uaccess(struct kvm_vcpu *vcpu, struct 
>> vgic_io_device *dev,
>>   bool is_write, int offset, u32 *val)
>>  {
>>  if (is_write)
>> -return vgic_uaccess_write(vcpu, &dev->dev, offset, val);
>> +return vgic_uaccess_write(vcpu, dev, offset, val);
>>  else
>> -return vgic_uaccess_read(vcpu, &dev->dev, offset, val);
>> +return vgic_uaccess_read(vcpu, dev, offset, val);
>>  }
>>  
>>  static int dispatch_mmio_read(struct kvm_vcpu *vcpu, struct kvm_io_device 
>> *dev,


Re: [PATCH 7/9] KVM: arm64: Simplify argument passing to vgic_uaccess_[read|write]

2021-01-12 Thread Alexandru Elisei
Hi Eric,

On 12/12/20 6:50 PM, Eric Auger wrote:
> Instead of converting the vgic_io_device handle to a kvm_io_device
> handled and then do the oppositive, pass a vgic_io_device pointer all
> along the call chain.

To me, it looks like the commit message describes what the patch does instead of
why it does it.

What are "vgic_io_device handle" and "kvm_io_device handled"?

Thanks,
Alex
>
> Signed-off-by: Eric Auger 
> ---
>  arch/arm64/kvm/vgic/vgic-mmio.c | 10 --
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio.c b/arch/arm64/kvm/vgic/vgic-mmio.c
> index b2d73fc0d1ef..48c6067fc5ec 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio.c
> @@ -938,10 +938,9 @@ vgic_get_mmio_region(struct kvm_vcpu *vcpu, struct 
> vgic_io_device *iodev,
>   return region;
>  }
>  
> -static int vgic_uaccess_read(struct kvm_vcpu *vcpu, struct kvm_io_device 
> *dev,
> +static int vgic_uaccess_read(struct kvm_vcpu *vcpu, struct vgic_io_device 
> *iodev,
>gpa_t addr, u32 *val)
>  {
> - struct vgic_io_device *iodev = kvm_to_vgic_iodev(dev);
>   const struct vgic_register_region *region;
>   struct kvm_vcpu *r_vcpu;
>  
> @@ -960,10 +959,9 @@ static int vgic_uaccess_read(struct kvm_vcpu *vcpu, 
> struct kvm_io_device *dev,
>   return 0;
>  }
>  
> -static int vgic_uaccess_write(struct kvm_vcpu *vcpu, struct kvm_io_device 
> *dev,
> +static int vgic_uaccess_write(struct kvm_vcpu *vcpu, struct vgic_io_device 
> *iodev,
> gpa_t addr, const u32 *val)
>  {
> - struct vgic_io_device *iodev = kvm_to_vgic_iodev(dev);
>   const struct vgic_register_region *region;
>   struct kvm_vcpu *r_vcpu;
>  
> @@ -986,9 +984,9 @@ int vgic_uaccess(struct kvm_vcpu *vcpu, struct 
> vgic_io_device *dev,
>bool is_write, int offset, u32 *val)
>  {
>   if (is_write)
> - return vgic_uaccess_write(vcpu, &dev->dev, offset, val);
> + return vgic_uaccess_write(vcpu, dev, offset, val);
>   else
> - return vgic_uaccess_read(vcpu, &dev->dev, offset, val);
> + return vgic_uaccess_read(vcpu, dev, offset, val);
>  }
>  
>  static int dispatch_mmio_read(struct kvm_vcpu *vcpu, struct kvm_io_device 
> *dev,


Re: [PATCH 6/9] docs: kvm: devices/arm-vgic-v3: enhance KVM_DEV_ARM_VGIC_CTRL_INIT doc

2021-01-12 Thread Alexandru Elisei
Hi Eric,

On 12/12/20 6:50 PM, Eric Auger wrote:
> kvm_arch_vcpu_precreate() returns -EBUSY if the vgic is
> already initialized. So let's document that KVM_DEV_ARM_VGIC_CTRL_INIT
> must be called after all vcpu creations.

Checked and this is indeed the case,
kvm_vm_ioctl_create_vcpu()->kvm_arch_vcpu_precreate() returns -EBUSY if
vgic_initialized() is true.

>
> Signed-off-by: Eric Auger 
> ---
>  Documentation/virt/kvm/devices/arm-vgic-v3.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/virt/kvm/devices/arm-vgic-v3.rst 
> b/Documentation/virt/kvm/devices/arm-vgic-v3.rst
> index 5dd3bff51978..322de6aebdec 100644
> --- a/Documentation/virt/kvm/devices/arm-vgic-v3.rst
> +++ b/Documentation/virt/kvm/devices/arm-vgic-v3.rst
> @@ -228,7 +228,7 @@ Groups:
>  
>  KVM_DEV_ARM_VGIC_CTRL_INIT
>request the initialization of the VGIC, no additional parameter in
> -  kvm_device_attr.addr.
> +  kvm_device_attr.addr. Must be called after all vcpu creations.

Nitpick here: the document writes VCPU with all caps. This also sounds a bit
weird, I think something like "Must be called after all VCPUs have been created"
is clearer.

Thanks,
Alex
>  KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES
>save all LPI pending bits into guest RAM pending tables.
>  


Re: [PATCH 5/9] KVM: arm: move has_run_once after the map_resources

2021-01-12 Thread Alexandru Elisei
Hi Eric,

On 12/12/20 6:50 PM, Eric Auger wrote:
> has_run_once is set to true at the beginning of
> kvm_vcpu_first_run_init(). This generally is not an issue
> except when exercising the code with KVM selftests. Indeed,
> if kvm_vgic_map_resources() fails due to erroneous user settings,
> has_run_once is set and this prevents the test from continuing. This
> patch moves the assignment after kvm_vgic_map_resources().
>
> Signed-off-by: Eric Auger 
> ---
>  arch/arm64/kvm/arm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index c0ffb019ca8b..331fae6bff31 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -540,8 +540,6 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
>   if (!kvm_arm_vcpu_is_finalized(vcpu))
>   return -EPERM;
>  
> - vcpu->arch.has_run_once = true;
> -
>   if (likely(irqchip_in_kernel(kvm))) {
>   /*
>* Map the VGIC hardware resources before running a vcpu the
> @@ -560,6 +558,8 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
>   static_branch_inc(&userspace_irqchip_in_use);
>   }
>  
> + vcpu->arch.has_run_once = true;

I have a few concerns regarding this:

1. Moving has_run_once = true here seems very arbitrary to me - 
kvm_timer_enable()
and kvm_arm_pmu_v3_enable(), below it, can both fail because of erroneous user
values. If there's a reason why the assignment cannot be moved at the end of the
function, I think it should be clearly stated in a comment for the people who
might be tempted to write similar tests for the timer or pmu.

2. There are many ways that kvm_vgic_map_resources() can fail, other than
incorrect user settings. I started digging into how
kvm_vgic_map_resources()->vgic_v2_map_resources() can fail for a VGIC V2 and 
this
is what I managed to find before I gave up:

* vgic_init() can fail in:
    - kvm_vgic_dist_init()
    - vgic_v3_init()
    - kvm_vgic_setup_default_irq_routing()
* vgic_register_dist_iodev() can fail in:
    - vgic_v3_init_dist_iodev()
    - kvm_io_bus_register_dev()(*)
* kvm_phys_addr_ioremap() can fail in:
    - kvm_mmu_topup_memory_cache()
    - kvm_pgtable_stage2_map()

So if any of the functions below fail, are we 100% sure it is safe to allow the
user to execute kvm_vgic_map_resources() again?

(*) It looks to me like kvm_io_bus_register_dev() doesn't take into account a
caller that tries to register the same device address range and it will create
another identical range. Is this intentional? Is it a bug that should be fixed? 
Or
am I misunderstanding the function?

Thanks,
Alex
> +
>   ret = kvm_timer_enable(vcpu);
>   if (ret)
>   return ret;


Re: [PATCH v3 09/21] arm64: cpufeature: Add global feature override facility

2021-01-12 Thread Marc Zyngier

On 2021-01-12 11:59, Suzuki K Poulose wrote:

On 1/12/21 11:50 AM, Marc Zyngier wrote:

Hi Suzuki,

On 2021-01-12 09:17, Suzuki K Poulose wrote:

Hi Marc,

On 1/11/21 7:48 PM, Marc Zyngier wrote:


[...]

diff --git a/arch/arm64/kernel/cpufeature.c 
b/arch/arm64/kernel/cpufeature.c

index 894af60b9669..00d99e593b65 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -774,6 +774,7 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, 
u64 new)

  u64 strict_mask = ~0x0ULL;
  u64 user_mask = 0;
  u64 valid_mask = 0;
+    u64 override_val = 0, override_mask = 0;

  const struct arm64_ftr_bits *ftrp;
  struct arm64_ftr_reg *reg = get_arm64_ftr_reg(sys_reg);
@@ -781,9 +782,35 @@ static void __init init_cpu_ftr_reg(u32 
sys_reg, u64 new)

  if (!reg)
  return;

+    if (reg->override_mask && reg->override_val) {
+    override_mask = *reg->override_mask;
+    override_val = *reg->override_val;
+    }
+
  for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) {
  u64 ftr_mask = arm64_ftr_mask(ftrp);
  s64 ftr_new = arm64_ftr_value(ftrp, new);
+    s64 ftr_ovr = arm64_ftr_value(ftrp, override_val);
+
+    if ((ftr_mask & override_mask) == ftr_mask) {
+    if (ftr_ovr < ftr_new) {


Here we assume that all the features are FTR_LOWER_SAFE. We could
probably use arm64_ftr_safe_value(ftrp, ftr_new, ftr_ovr) here ?
That would cover us for both HIGHER_SAFE and LOWER_SAFE features.
However that may be restrictive for FTR_EXACT, as the safe
value would be set to "ftr->safe_val". I guess that may be better
than forcing the use of an unsafe value for the boot CPU, which could
anyway conflict with the other CPUs and eventually force the
ftr value to safe_val.


I like the idea of using the helper, as it cleans up the code a bit.
However, not being able to set a feature to a certain value could be
restrictive, as in general it means that we can only disable a feature
and not adjust its level of support.

Take PMUVER for example: with the helper, I can't override it from v8.4
to v8.1. I can only go to v8.0.


My point is, we set this only for the "init" of cpu features. So, even
if we init to a custom, non-default-safe value, the secondary CPUs could
scream, and the system-wide safe value could fall back to the "safe"
value for EXACT features, no matter what you did to init it.


Right. So let's go with the safe value for EXACT features for now,
and let the override fail if that's not what the user asked for.

After all, there are only so many things we want to support as
an override, and in all the cases at hand, using the safe value
actually matches what we want to do.

We can always revisit this if and when we need a different behaviour.
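
Concretely, the loop in init_cpu_ftr_reg() would then look something
like this (just a sketch on top of the hunk quoted above, using the
existing arm64_ftr_safe_value() helper as you suggested, not the final
patch):

    for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) {
        u64 ftr_mask = arm64_ftr_mask(ftrp);
        s64 ftr_new = arm64_ftr_value(ftrp, new);
        s64 ftr_ovr = arm64_ftr_value(ftrp, override_val);

        if ((ftr_mask & override_mask) == ftr_mask) {
            /*
             * LOWER/HIGHER_SAFE fields can only be tightened by the
             * override; an FTR_EXACT mismatch resolves to ftrp->safe_val.
             */
            ftr_new = arm64_ftr_safe_value(ftrp, ftr_new, ftr_ovr);
        }

        /* ... rest of the loop unchanged ... */
    }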

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [RFC PATCH v2 15/26] of/fdt: Introduce early_init_dt_add_memory_hyp()

2021-01-12 Thread Quentin Perret
On Monday 11 Jan 2021 at 08:45:10 (-0600), Rob Herring wrote:
> On Fri, Jan 8, 2021 at 6:16 AM Quentin Perret  wrote:
> >
> > Introduce early_init_dt_add_memory_hyp() to allow KVM to conserve a copy
> > of the memory regions parsed from DT. This will be needed in the context
> > of the protected nVHE feature of KVM/arm64 where the code running at EL2
> > will be cleanly separated from the host kernel during boot, and will
> > need its own representation of memory.
> 
> What happened to doing this with memblock?

I gave it a go, but as mentioned in v1, I ran into issues for nomap
regions. I want the hypervisor to know about these memory regions (it's
possible some of those will be given to protected guests for instance)
but these seem to be entirely removed from the memblocks when using DT:

https://elixir.bootlin.com/linux/latest/source/drivers/of/fdt.c#L1153

EFI appears to do things differently, though, as it 'just' uses
memblock_mark_nomap() instead of actively removing the memblock. And that
means I could actually use the memblock API for EFI, but I'd rather
have a common solution. I tried to understand why things are done
differently but couldn't find an answer and kept things simple and
working for now.
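
For reference, the difference boils down to something like this
(paraphrasing the two code paths from memory, so take the exact shape
with a grain of salt):

    /* DT reserved-memory path (drivers/of/fdt.c), roughly: */
    int __init __weak early_init_dt_reserve_memory_arch(phys_addr_t base,
                                                        phys_addr_t size,
                                                        bool nomap)
    {
        if (nomap)
            return memblock_remove(base, size); /* gone from memblock */
        return memblock_reserve(base, size);
    }

    /* EFI path: the region stays in memblock and is only flagged */
    memblock_mark_nomap(base, size);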

Is there a good reason for not using memblock_mark_nomap() with DT? If
not, I'm happy to try that.

Thanks,
Quentin


Re: [PATCH v3 09/21] arm64: cpufeature: Add global feature override facility

2021-01-12 Thread Suzuki K Poulose

On 1/12/21 11:51 AM, Marc Zyngier wrote:

On 2021-01-12 11:50, Marc Zyngier wrote:

Hi Suzuki,

On 2021-01-12 09:17, Suzuki K Poulose wrote:

Hi Marc,

On 1/11/21 7:48 PM, Marc Zyngier wrote:


[...]


diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 894af60b9669..00d99e593b65 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -774,6 +774,7 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new)
  u64 strict_mask = ~0x0ULL;
  u64 user_mask = 0;
  u64 valid_mask = 0;
+    u64 override_val = 0, override_mask = 0;

  const struct arm64_ftr_bits *ftrp;
  struct arm64_ftr_reg *reg = get_arm64_ftr_reg(sys_reg);
@@ -781,9 +782,35 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new)
  if (!reg)
  return;

+    if (reg->override_mask && reg->override_val) {
+    override_mask = *reg->override_mask;
+    override_val = *reg->override_val;
+    }
+
  for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) {
  u64 ftr_mask = arm64_ftr_mask(ftrp);
  s64 ftr_new = arm64_ftr_value(ftrp, new);
+    s64 ftr_ovr = arm64_ftr_value(ftrp, override_val);
+
+    if ((ftr_mask & override_mask) == ftr_mask) {
+    if (ftr_ovr < ftr_new) {


Here we assume that all the features are FTR_LOWER_SAFE. We could
probably use arm64_ftr_safe_value(ftrp, ftr_new, ftr_ovr) here ?
That would cover us for both HIGHER_SAFE and LOWER_SAFE features.
However that may be restrictive for FTR_EXACT, as the safe
value would be set to "ftr->safe_val". I guess that may be better
than forcing the use of an unsafe value for the boot CPU, which could
anyway conflict with the other CPUs and eventually force the
ftr value to safe_val.


I like the idea of using the helper, as it cleans up the code a bit.
However, not being able to set a feature to a certain value could be restrictive,
as in general it means that we can only disable a feature and not adjust
its level of support.

Take PMUVER for example: with the helper, I can't override it from v8.4 to
v8.1. I can only go to v8.0.


Actually, we can only *disable* the PMU altogether. Same question though...


It depends on two things:

1) What is the safe value for an EXACT typed feature?
   Usually, that means either disabled, or the lowest possible value.

2) How is this value consumed?
   a) The per-CPU value? Then none of these changes have any effect.
   b) The system-wide value? Then we get the safe value as "influenced"
      by the infrastructure.

The safe value we use for EXACT features is exclusively for making sure
that the system uses the safe assumption and thus should be the best
option.

To answer your question, for PMU it is 0, which implies v8.0. Or we could
use -1 (0xf) as the safe value instead, which is a bit safer, as it kind of
implies that the PMU is not a standard one.
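
For illustration (quoting the shape from memory rather than the exact
cpufeature.c line), the PMUVer field description is something along
these lines:

    /* ID_AA64DFR0_EL1.PMUVer: signed 4-bit field, FTR_EXACT, safe value 0 */
    S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_EXACT,
                     ID_AA64DFR0_PMUVER_SHIFT, 4, 0),

and the alternative above would simply turn that trailing 0 into -1 (0xf).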


Cheers
Suzuki





     M.



Is it something we care about?

Thanks,

    M.






Re: [PATCH 4/5] vfio/iommu_type1: Carefully use unmap_unpin_all during dirty tracking

2021-01-12 Thread Keqian Zhu



On 2021/1/12 5:49, Alex Williamson wrote:
> On Thu, 7 Jan 2021 17:29:00 +0800
> Keqian Zhu  wrote:
> 
>> If we detach a group during dirty page tracking, we shouldn't remove
>> vfio_dma, because the dirty log would be lost.
>>
>> But we don't prevent unmap_unpin_all in vfio_iommu_release, because
>> under normal procedure, dirty tracking has been stopped.
> 
> This looks like it's creating a larger problem than it's fixing, it's
> not our job to maintain the dirty bitmap regardless of what the user
> does.  If the user detaches the last group in a container causing the
> mappings within that container to be deconstructed before the user has
> collected dirty pages, that sounds like a user error.  A container with
> no groups is de-privileged and therefore loses all state.  Thanks,
> 
> Alex

Hi Alex,

This looks good to me ;-). That's a reasonable constraint for user behavior.

What about replacing this patch with an addition to the uapi document of
VFIO_GROUP_UNSET_CONTAINER? Users should pay attention to this when calling
this ioctl during dirty tracking.

And any comments on the other patches? Thanks.

Thanks.
Keqian

> 
>> Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages 
>> tracking")
>> Signed-off-by: Keqian Zhu 
>> ---
>>  drivers/vfio/vfio_iommu_type1.c | 14 --
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c 
>> b/drivers/vfio/vfio_iommu_type1.c
>> index 26b7eb2a5cfc..9776a059904d 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -2373,7 +2373,12 @@ static void vfio_iommu_type1_detach_group(void 
>> *iommu_data,
>>  if (list_empty(&iommu->external_domain->group_list)) {
>>  vfio_sanity_check_pfn_list(iommu);
>>  
>> -if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
>> +/*
>> + * During dirty page tracking, we can't remove
>> + * vfio_dma because the dirty log would be lost.
>> + */
>> +if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu) &&
>> +!iommu->dirty_page_tracking)
>>  vfio_iommu_unmap_unpin_all(iommu);
>>  
>>  kfree(iommu->external_domain);
>> @@ -2406,10 +2411,15 @@ static void vfio_iommu_type1_detach_group(void 
>> *iommu_data,
>>   * iommu and external domain doesn't exist, then all the
>>   * mappings go away too. If it's the last domain with iommu and
>>   * external domain exist, update accounting
>> + *
>> + * Note: During dirty page tracking, we can't remove vfio_dma
>> + * because the dirty log would be lost. Just updating accounting is
>> + * a good choice.
>>   */
>>  if (list_empty(&domain->group_list)) {
>>  if (list_is_singular(&iommu->domain_list)) {
>> -if (!iommu->external_domain)
>> +if (!iommu->external_domain &&
>> +!iommu->dirty_page_tracking)
>>  vfio_iommu_unmap_unpin_all(iommu);
>>  else
>>  vfio_iommu_unmap_unpin_reaccount(iommu);
> 
> .
> 


Re: [PATCH v3 09/21] arm64: cpufeature: Add global feature override facility

2021-01-12 Thread Suzuki K Poulose

On 1/12/21 11:50 AM, Marc Zyngier wrote:

Hi Suzuki,

On 2021-01-12 09:17, Suzuki K Poulose wrote:

Hi Marc,

On 1/11/21 7:48 PM, Marc Zyngier wrote:


[...]


diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 894af60b9669..00d99e593b65 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -774,6 +774,7 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new)
  u64 strict_mask = ~0x0ULL;
  u64 user_mask = 0;
  u64 valid_mask = 0;
+    u64 override_val = 0, override_mask = 0;

  const struct arm64_ftr_bits *ftrp;
  struct arm64_ftr_reg *reg = get_arm64_ftr_reg(sys_reg);
@@ -781,9 +782,35 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new)
  if (!reg)
  return;

+    if (reg->override_mask && reg->override_val) {
+    override_mask = *reg->override_mask;
+    override_val = *reg->override_val;
+    }
+
  for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) {
  u64 ftr_mask = arm64_ftr_mask(ftrp);
  s64 ftr_new = arm64_ftr_value(ftrp, new);
+    s64 ftr_ovr = arm64_ftr_value(ftrp, override_val);
+
+    if ((ftr_mask & override_mask) == ftr_mask) {
+    if (ftr_ovr < ftr_new) {


Here we assume that all the features are FTR_LOWER_SAFE. We could
probably use arm64_ftr_safe_value(ftrp, ftr_new, ftr_ovr) here ?
That would cover us for both HIGHER_SAFE and LOWER_SAFE features.
However that may be restrictive for FTR_EXACT, as the safe
value would be set to "ftr->safe_val". I guess that may be better
than forcing the use of an unsafe value for the boot CPU, which could
anyway conflict with the other CPUs and eventually force the
ftr value to safe_val.


I like the idea of using the helper, as it cleans up the code a bit.
However, not being able to set a feature to a certain value could be restrictive,
as in general it means that we can only disable a feature and not adjust
its level of support.

Take PMUVER for example: with the helper, I can't override it from v8.4 to
v8.1. I can only go to v8.0.


My point is, we set this only for the "init" of cpu features. So, even if we
init to a custom, non-default-safe value, the secondary CPUs could scream,
and the system-wide safe value could fall back to the "safe" value for
EXACT features, no matter what you did to init it.

Also, it will be dangerous for things like PAC, as the lower level value may
not be compatible with the "actual" cpu feature supported.

So simply setting to a lower value for EXACT features is generally not safe,
though I understand there are some exceptions.

Suzuki





Is it something we care about?

Thanks,

     M.




Re: [PATCH v3 09/21] arm64: cpufeature: Add global feature override facility

2021-01-12 Thread Marc Zyngier

On 2021-01-12 11:50, Marc Zyngier wrote:

Hi Suzuki,

On 2021-01-12 09:17, Suzuki K Poulose wrote:

Hi Marc,

On 1/11/21 7:48 PM, Marc Zyngier wrote:


[...]

diff --git a/arch/arm64/kernel/cpufeature.c 
b/arch/arm64/kernel/cpufeature.c

index 894af60b9669..00d99e593b65 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -774,6 +774,7 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, 
u64 new)

  u64 strict_mask = ~0x0ULL;
  u64 user_mask = 0;
  u64 valid_mask = 0;
+    u64 override_val = 0, override_mask = 0;

  const struct arm64_ftr_bits *ftrp;
  struct arm64_ftr_reg *reg = get_arm64_ftr_reg(sys_reg);
@@ -781,9 +782,35 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, 
u64 new)

  if (!reg)
  return;

+    if (reg->override_mask && reg->override_val) {
+    override_mask = *reg->override_mask;
+    override_val = *reg->override_val;
+    }
+
  for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) {
  u64 ftr_mask = arm64_ftr_mask(ftrp);
  s64 ftr_new = arm64_ftr_value(ftrp, new);
+    s64 ftr_ovr = arm64_ftr_value(ftrp, override_val);
+
+    if ((ftr_mask & override_mask) == ftr_mask) {
+    if (ftr_ovr < ftr_new) {


Here we assume that all the features are FTR_LOWER_SAFE. We could
probably use arm64_ftr_safe_value(ftrp, ftr_new, ftr_ovr) here ?
That would cover us for both HIGHER_SAFE and LOWER_SAFE features.
However that may be restrictive for FTR_EXACT, as the safe
value would be set to "ftr->safe_val". I guess that may be better
than forcing the use of an unsafe value for the boot CPU, which could
anyway conflict with the other CPUs and eventually force the
ftr value to safe_val.


I like the idea of using the helper, as it cleans up the code a bit.
However, not being able to set a feature to a certain value could be
restrictive, as in general it means that we can only disable a feature
and not adjust its level of support.

Take PMUVER for example: with the helper, I can't override it from v8.4
to v8.1. I can only go to v8.0.


Actually, we can only *disable* the PMU altogether. Same question 
though...


M.



Is it something we care about?

Thanks,

M.


--
Jazz is not dead. It just smells funny...


Re: [PATCH v3 09/21] arm64: cpufeature: Add global feature override facility

2021-01-12 Thread Marc Zyngier

Hi Suzuki,

On 2021-01-12 09:17, Suzuki K Poulose wrote:

Hi Marc,

On 1/11/21 7:48 PM, Marc Zyngier wrote:


[...]

diff --git a/arch/arm64/kernel/cpufeature.c 
b/arch/arm64/kernel/cpufeature.c

index 894af60b9669..00d99e593b65 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -774,6 +774,7 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, 
u64 new)

  u64 strict_mask = ~0x0ULL;
  u64 user_mask = 0;
  u64 valid_mask = 0;
+    u64 override_val = 0, override_mask = 0;

  const struct arm64_ftr_bits *ftrp;
  struct arm64_ftr_reg *reg = get_arm64_ftr_reg(sys_reg);
@@ -781,9 +782,35 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, 
u64 new)

  if (!reg)
  return;

+    if (reg->override_mask && reg->override_val) {
+    override_mask = *reg->override_mask;
+    override_val = *reg->override_val;
+    }
+
  for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) {
  u64 ftr_mask = arm64_ftr_mask(ftrp);
  s64 ftr_new = arm64_ftr_value(ftrp, new);
+    s64 ftr_ovr = arm64_ftr_value(ftrp, override_val);
+
+    if ((ftr_mask & override_mask) == ftr_mask) {
+    if (ftr_ovr < ftr_new) {


Here we assume that all the features are FTR_LOWER_SAFE. We could
probably use arm64_ftr_safe_value(ftrp, ftr_new, ftr_ovr) here?
That would cover us for both HIGHER_SAFE and LOWER_SAFE features.
However, that may be restrictive for FTR_EXACT, as the safe
value would be set to "ftr->safe_val". I guess that may be better
than forcing the use of an unsafe value for the boot CPU, which could
anyway conflict with the other CPUs and eventually cause the
ftr value to be set to safe_val.


I like the idea of using the helper, as it cleans up the code a bit.
However, not being able to set a feature to a certain value could be
restrictive, as in general it means that we can only disable a feature
and not adjust its level of support.

Take PMUVER for example: with the helper, I can't override it from v8.4
to v8.1. I can only go to v8.0.

Is it something we care about?

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH v3 09/21] arm64: cpufeature: Add global feature override facility

2021-01-12 Thread Suzuki K Poulose

Hi Marc,

On 1/11/21 7:48 PM, Marc Zyngier wrote:

Hi Catalin,

On 2021-01-11 18:41, Catalin Marinas wrote:

Hi Marc,

On Mon, Jan 11, 2021 at 01:27:59PM +, Marc Zyngier wrote:

Add a facility to globally override a feature, no matter what
the HW says. Yes, this is dangerous.


Yeah, it's dangerous. We can make it less so if we only allow safe
values (e.g. lower if FTR_UNSIGNED).


My plan was also to allow non-safe values in order to trigger features
that are not advertised by the HW. But I can understand if you are
reluctant to allow such a thing! :D


diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 9a555809b89c..465d2cb63bfc 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -75,6 +75,8 @@ struct arm64_ftr_reg {
 u64    sys_val;
 u64    user_val;
 const struct arm64_ftr_bits    *ftr_bits;
+    u64    *override_val;
+    u64    *override_mask;
 };


At the arm64_ftr_reg level, we don't have any information about the safe
values for a feature. Could we instead move this to arm64_ftr_bits? We
probably only need a single field. When populating the feature values,
we can make sure it doesn't go above the hardware one.

I attempted a feature modification for MTE here, though I dropped the
entire series in the meantime as we clarified the ARM ARM:

https://lore.kernel.org/linux-arm-kernel/20200515171612.1020-24-catalin.mari...@arm.com/

Srinivas copied it in his patch (but forgot to give credit ;)):

https://lore.kernel.org/linux-arm-msm/1610152163-16554-3-git-send-email-sram...@codeaurora.org/

The above adds a filter function but, instead, just use your mechanism in
this series for idreg.feature setting via cmdline. The arm64_ftr_value()
function extracts the hardware value and lowers it if a cmdline argument
was passed.
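
A rough sketch of the per-field shape described above, assuming a hypothetical
"filter" member on arm64_ftr_bits that is applied when a field is extracted;
the member name and its wiring are illustrative, not the code from either
series:

/*
 * Illustrative only: a per-field filter hook, applied when a field is
 * extracted, so a cmdline override can only lower what the HW reports.
 * The "filter" member and its placement are hypothetical.
 */
struct arm64_ftr_bits {
	bool		sign;
	bool		visible;
	bool		strict;
	enum ftr_type	type;
	u8		shift;
	u8		width;
	s64		safe_val;
	s64		(*filter)(const struct arm64_ftr_bits *ftrp, s64 val);
};

static inline s64 arm64_ftr_value(const struct arm64_ftr_bits *ftrp, u64 reg)
{
	s64 val = cpuid_feature_extract_field_width(reg, ftrp->shift,
						    ftrp->width, ftrp->sign);

	/* Lower the HW value if a cmdline override asked for less. */
	if (ftrp->filter)
		val = ftrp->filter(ftrp, val);

	return val;
}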


One thing is that it is not always possible to sanitise the value passed
if it is required very early on, as I do with VHE. But in that case
I actually check that we are VHE capable before starting to poke at
VHE-specific state.

I came up with the following patch on top, which preserves the current
global approach (no per arm64_ftr_bits state), but checks (and alters)
the override as it iterates through the various fields.

For example, if I pass "arm64.nopauth kvm-arm.mode=nvhe id_aa64pfr1.bt=5"
to the FVP, I get the following output:

[    0.00] CPU features: SYS_ID_AA64ISAR1_EL1[31:28]: forced from 1 to 0
[    0.00] CPU features: SYS_ID_AA64ISAR1_EL1[11:8]: forced from 1 to 0
[    0.00] CPU features: SYS_ID_AA64MMFR1_EL1[11:8]: forced from 1 to 0
[    0.00] CPU features: SYS_ID_AA64PFR1_EL1[3:0]: not forcing 1 to 5
[    0.00] CPU features: detected: GIC system register CPU interface
[    0.00] CPU features: detected: Hardware dirty bit management
[    0.00] CPU features: detected: Spectre-v4
[    0.00] CPU features: detected: Branch Target Identification

showing that the PAC features have been downgraded, together with VHE,
but that BTI is still detected as value 5 was obviously bogus.

Thoughts?

     M.

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 894af60b9669..00d99e593b65 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -774,6 +774,7 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new)
  u64 strict_mask = ~0x0ULL;
  u64 user_mask = 0;
  u64 valid_mask = 0;
+    u64 override_val = 0, override_mask = 0;

  const struct arm64_ftr_bits *ftrp;
  struct arm64_ftr_reg *reg = get_arm64_ftr_reg(sys_reg);
@@ -781,9 +782,35 @@ static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new)
  if (!reg)
  return;

+    if (reg->override_mask && reg->override_val) {
+    override_mask = *reg->override_mask;
+    override_val = *reg->override_val;
+    }
+
  for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) {
  u64 ftr_mask = arm64_ftr_mask(ftrp);
  s64 ftr_new = arm64_ftr_value(ftrp, new);
+    s64 ftr_ovr = arm64_ftr_value(ftrp, override_val);
+
+    if ((ftr_mask & override_mask) == ftr_mask) {
+    if (ftr_ovr < ftr_new) {


Here we assume that all the features are FTR_LOWER_SAFE. We could
probably use arm64_ftr_safe_value(ftrp, ftr_new, ftr_ovr) here?
That would cover us for both HIGHER_SAFE and LOWER_SAFE features.
However, that may be restrictive for FTR_EXACT, as the safe
value would be set to "ftr->safe_val". I guess that may be better
than forcing the use of an unsafe value for the boot CPU, which could
anyway conflict with the other CPUs and eventually cause the
ftr value to be set to safe_val.

i.e.,
ftr_val = arm64_ftr_safe_value(ftrp, ftr_ovr, ftr_new);
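
Concretely, the inner part of the loop in init_cpu_ftr_reg() could then look
something like this minimal sketch (variable names follow the quoted diff;
illustrative only, not the eventual patch):

	for (ftrp = reg->ftr_bits; ftrp->width; ftrp++) {
		u64 ftr_mask = arm64_ftr_mask(ftrp);
		s64 ftr_new = arm64_ftr_value(ftrp, new);
		s64 ftr_ovr = arm64_ftr_value(ftrp, override_val);

		if ((ftr_mask & override_mask) == ftr_mask) {
			/*
			 * Let the existing helper pick the conservative
			 * value: the lower of the two for FTR_LOWER_SAFE,
			 * the higher for FTR_HIGHER_SAFE, and
			 * ftrp->safe_val for FTR_EXACT.
			 */
			s64 ftr_safe = arm64_ftr_safe_value(ftrp, ftr_ovr,
							    ftr_new);

			if (ftr_safe != ftr_new)
				pr_info("%s[%d:%d]: forced from %llx to %llx\n",
					reg->name,
					ftrp->shift + ftrp->width - 1,
					ftrp->shift,
					(u64)ftr_new, (u64)ftr_safe);

			ftr_new = ftr_safe;
		}

		/* ... the rest of init_cpu_ftr_reg() continues as before ... */
	}

With FTR_EXACT fields this lands on ftrp->safe_val rather than the requested
override value, which is exactly the trade-off discussed above.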


Suzuki


Re: [PATCH v3 00/66] KVM: arm64: ARMv8.3/8.4 Nested Virtualization support

2021-01-12 Thread Haibo Xu
On Mon, 11 Jan 2021 at 16:59, Marc Zyngier  wrote:
>
> Hi Haibo,
>
> On 2021-01-11 07:20, Haibo Xu wrote:
> > On Fri, 11 Dec 2020 at 00:00, Marc Zyngier  wrote:
> >>
> >> This is a rework of the NV series that I posted 10 months ago[1], as a
> >> lot of the KVM code has changed since, and the series doesn't apply
> >> anymore (not that anybody really cares as the HW is, as usual, made of
> >> unobtainium...).
> >>
> >> From the previous version:
> >>
> >> - Integration with the new page-table code
> >> - New exception injection code
> >> - No more messing with the nVHE code
> >> - No AArch32
> >> - Rebased on v5.10-rc4 + kvmarm/next for 5.11
> >>
> >> From a functionality perspective, you can expect an L2 guest to work,
> >> but don't even think of L3, as we only partially emulate the
> >> ARMv8.{3,4}-NV extensions themselves. Same thing for vgic, debug, PMU,
> >> as well as anything that would require a Stage-1 PTW. What we want to
> >> achieve is that with NV disabled, there is no performance overhead and
> >> no regression.
> >>
> >> The series is roughly divided in 5 parts: exception handling, memory
> >> virtualization, interrupts and timers for ARMv8.3, followed by the
> >> ARMv8.4 support. There are of course some dependencies, but you'll
> >> hopefully get the gist of it.
> >>
> >> For the most courageous of you, I've put out a branch[2]. Of course,
> >> you'll need some userspace. Andre maintains a hacked version of
> >> kvmtool[3] that takes a --nested option, allowing the guest to be
> >> started at EL2. You can run the whole stack in the Foundation
> >> model. Don't be in a hurry ;-).
> >>
> >
> > Hi Marc,
> >
> > I got a kernel BUG message when booting the L2 guest kernel with
> > kvmtool on an FVP setup.
> > Could you help take a look at the BUG message as well as my
> > environment configuration?
> > I think it is probably caused by some local configurations of the FVP
> > setup.
>
> No, this is likely a bug in your L1 guest, which was fixed in -rc3:
>
> 2a5f1b67ec57 ("KVM: arm64: Don't access PMCR_EL0 when no PMU is
> available")
>
> and was found in the exact same circumstances. Alternatively, and if
> you don't want to change your L1 guest, you can just pass the --pmu
> option to kvmtool when starting the L1 guest.

After passing --pmu when starting an L1 guest, I can successfully run an
L2 guest now!
Thanks so much for the help!

Haibo

>
> Hope this helps,
>
>  M.
> --
> Jazz is not dead. It just smells funny...