Re: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking

2021-03-04 Thread Marc Zyngier
On Wed, 3 Mar 2021 10:42:25 +0800, Jia He wrote:
> If the start addr is not aligned with the granule size of that level,
> the loop step size should be adjusted to the boundary instead of a
> simple kvm_granule_size(level) increment. Otherwise, some mmu entries
> might never be visited by the walk.
> E.g. Assume the unmap range [data->addr, data->end] is
> [0xff00ab2000,0xff00cb2000] in level 2 walking and NOT block mapping.
> And the 1st part of that pmd entry is [0xff00ab2000,0xff00c00000]. The
> pmd value is 0x83fbd2c1002 (not a valid entry). In this case, data->addr
> should be adjusted to 0xff00c00000 instead of 0xff00cb2000.
> 
> [...]

Applied to fixes, thanks!

[1/1] KVM: arm64: Fix unaligned addr case in mmu walking
  commit: e85583b3f1fe62c9b371a3100c1c91af94005ca9

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.
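
[Worked example] To make the failure mode concrete, the stepping behaviour
described in the commit message can be reproduced in a few lines of
user-space C. This is a sketch only: the 2MiB level-2 granule (4K pages
assumed) and ALIGN_DOWN() are re-implemented locally rather than taken from
the kernel headers.

#include <inttypes.h>
#include <stdio.h>

#define GRANULE (1ULL << 21)	/* 2MiB: level-2 granule size with 4K pages */
#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1))

int main(void)
{
	uint64_t start = 0xff00ab2000, end = 0xff00cb2000;

	/* Buggy stepping: adding the granule size preserves the initial
	 * 0xb2000 misalignment, so the walk jumps from 0xff00ab2000
	 * straight to 0xff00cb2000 and never visits the entry at the
	 * 0xff00c00000 boundary. */
	for (uint64_t a = start; a < end; a += GRANULE)
		printf("buggy: %#" PRIx64 " (entry %#" PRIx64 ")\n",
		       a, (uint64_t)ALIGN_DOWN(a, GRANULE));

	/* Fixed stepping, as in the applied patch: round down to the
	 * current entry, then add one granule, landing exactly on the
	 * next block boundary. */
	for (uint64_t a = start; a < end; a = ALIGN_DOWN(a, GRANULE) + GRANULE)
		printf("fixed: %#" PRIx64 " (entry %#" PRIx64 ")\n",
		       a, (uint64_t)ALIGN_DOWN(a, GRANULE));
	return 0;
}

The buggy loop prints a single line (only the 0xff00a00000 entry is
visited); the fixed loop also visits 0xff00c00000, which is exactly the
entry the unpatched walker left mapped.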




Re: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking

2021-03-04 Thread Will Deacon
On Thu, Mar 04, 2021 at 09:16:25AM +0000, Marc Zyngier wrote:
> On 2021-03-04 00:46, Justin He wrote:
> > > On Wed, Mar 03, 2021 at 07:07:37PM +0000, Marc Zyngier wrote:
> > > > From e0524b41a71e0f17d6dc8f197e421e677d584e72 Mon Sep 17 00:00:00 2001
> > > > From: Jia He 
> > > > Date: Wed, 3 Mar 2021 10:42:25 +0800
> > > > Subject: [PATCH] KVM: arm64: Fix range alignment when walking page 
> > > > tables
> > > >
> > > > When walking the page tables at a given level, and if the start
> > > > address for the range isn't aligned for that level, we propagate
> > > > the misalignment on each iteration at that level.
> > > >
> > > > This results in the walker ignoring a number of entries (depending
> > > > on the original misalignment) on each subsequent iteration.
> > > >
> > > > Properly aligning the address at the before the next iteration
> > > 
> > > "at the before the next" ???
> > > 
> > > > addresses the issue.
> > > >
> > > > Cc: sta...@vger.kernel.org
> > > > Reported-by: Howard Zhang 
> > > > Signed-off-by: Jia He 
> > > > Fixes: b1e57de62cfb ("KVM: arm64: Add stand-alone page-table walker infrastructure")
> > > > [maz: rewrite commit message]
> > > > Signed-off-by: Marc Zyngier 
> > > > Link: https://lore.kernel.org/r/20210303024225.2591-1-justin...@arm.com
> > > > ---
> > > >  arch/arm64/kvm/hyp/pgtable.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > > > index 4d177ce1d536..124cd2f93020 100644
> > > > --- a/arch/arm64/kvm/hyp/pgtable.c
> > > > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > > > @@ -223,7 +223,7 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
> > > > goto out;
> > > >
> > > > if (!table) {
> > > > -   data->addr += kvm_granule_size(level);
> > > > +   data->addr = ALIGN(data->addr, kvm_granule_size(level));
> > 
> > What if the previous data->addr is already aligned with
> > kvm_granule_size(level)?
> > Hence an infinite loop? Am I missing anything else?
> 
> Indeed, well spotted. I'll revert to your original suggestion
> if everybody agrees...

Heh, yeah, at least one of us is awake.
For the original patch, with the updated (including typo fix) commit
message:

Acked-by: Will Deacon 

If that still counts for anything!

Will
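
[Worked example] Justin's objection can be demonstrated directly: ALIGN()
rounds up to the next multiple, which is a no-op on an already-aligned
address, so with the one-line ALIGN() form data->addr would stop advancing.
A minimal sketch, with both macros re-implemented locally (not the kernel's
definitions):

#include <inttypes.h>
#include <stdio.h>

#define GRANULE (1ULL << 21)	/* 2MiB level-2 granule, 4K pages assumed */
#define ALIGN(x, a)      (((x) + (a) - 1) & ~((a) - 1))
#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1))

int main(void)
{
	uint64_t addr = 0xff00c00000;	/* already on a 2MiB boundary */

	/* ALIGN() leaves an aligned address unchanged, so the walker
	 * would revisit the same entry forever. */
	printf("ALIGN:          %#" PRIx64 "\n",
	       (uint64_t)ALIGN(addr, GRANULE));

	/* ALIGN_DOWN() + granule always advances past the current
	 * entry, aligned input or not, so the walk terminates. */
	printf("ALIGN_DOWN + g: %#" PRIx64 "\n",
	       (uint64_t)(ALIGN_DOWN(addr, GRANULE) + GRANULE));
	return 0;
}

This prints 0xff00c00000 for the first line (no progress made) and
0xff00e00000 for the second, which is why the two-line ALIGN_DOWN() form
from the original patch is the safe one.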


Re: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking

2021-03-04 Thread Marc Zyngier

On 2021-03-04 00:46, Justin He wrote:

> Hi Marc
>
> > -Original Message-
> > From: Will Deacon 
> > Sent: Thursday, March 4, 2021 5:13 AM
> > To: Marc Zyngier 
> > Cc: Justin He ; kvm...@lists.cs.columbia.edu; James
> > Morse ; Julien Thierry ;
> > Suzuki Poulose ; Catalin Marinas
> > ; Gavin Shan ; Yanan Wang
> > ; Quentin Perret ; linux-arm-
> > ker...@lists.infradead.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking
> >
> > On Wed, Mar 03, 2021 at 07:07:37PM +0000, Marc Zyngier wrote:
> > > From e0524b41a71e0f17d6dc8f197e421e677d584e72 Mon Sep 17 00:00:00 2001
> > > From: Jia He 
> > > Date: Wed, 3 Mar 2021 10:42:25 +0800
> > > Subject: [PATCH] KVM: arm64: Fix range alignment when walking page tables
> > >
> > > When walking the page tables at a given level, and if the start
> > > address for the range isn't aligned for that level, we propagate
> > > the misalignment on each iteration at that level.
> > >
> > > This results in the walker ignoring a number of entries (depending
> > > on the original misalignment) on each subsequent iteration.
> > >
> > > Properly aligning the address at the before the next iteration
> >
> > "at the before the next" ???
> >
> > > addresses the issue.
> > >
> > > Cc: sta...@vger.kernel.org
> > > Reported-by: Howard Zhang 
> > > Signed-off-by: Jia He 
> > > Fixes: b1e57de62cfb ("KVM: arm64: Add stand-alone page-table walker infrastructure")
> > > [maz: rewrite commit message]
> > > Signed-off-by: Marc Zyngier 
> > > Link: https://lore.kernel.org/r/20210303024225.2591-1-justin...@arm.com
> > > ---
> > >  arch/arm64/kvm/hyp/pgtable.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > > index 4d177ce1d536..124cd2f93020 100644
> > > --- a/arch/arm64/kvm/hyp/pgtable.c
> > > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > > @@ -223,7 +223,7 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
> > >		goto out;
> > >
> > >	if (!table) {
> > > -		data->addr += kvm_granule_size(level);
> > > +		data->addr = ALIGN(data->addr, kvm_granule_size(level));
>
> What if the previous data->addr is already aligned with
> kvm_granule_size(level)?
> Hence an infinite loop? Am I missing anything else?

Indeed, well spotted. I'll revert to your original suggestion
if everybody agrees...

Thanks,

M.
--
Jazz is not dead. It just smells funny...


RE: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking

2021-03-03 Thread Justin He
Hi Marc

> -Original Message-
> From: Will Deacon 
> Sent: Thursday, March 4, 2021 5:13 AM
> To: Marc Zyngier 
> Cc: Justin He ; kvm...@lists.cs.columbia.edu; James
> Morse ; Julien Thierry ;
> Suzuki Poulose ; Catalin Marinas
> ; Gavin Shan ; Yanan Wang
> ; Quentin Perret ; linux-arm-
> ker...@lists.infradead.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking
> 
> On Wed, Mar 03, 2021 at 07:07:37PM +0000, Marc Zyngier wrote:
> > From e0524b41a71e0f17d6dc8f197e421e677d584e72 Mon Sep 17 00:00:00 2001
> > From: Jia He 
> > Date: Wed, 3 Mar 2021 10:42:25 +0800
> > Subject: [PATCH] KVM: arm64: Fix range alignment when walking page tables
> >
> > When walking the page tables at a given level, and if the start
> > address for the range isn't aligned for that level, we propagate
> > the misalignment on each iteration at that level.
> >
> > This results in the walker ignoring a number of entries (depending
> > on the original misalignment) on each subsequent iteration.
> >
> > Properly aligning the address at the before the next iteration
> 
> "at the before the next" ???
> 
> > addresses the issue.
> >
> > Cc: sta...@vger.kernel.org
> > Reported-by: Howard Zhang 
> > Signed-off-by: Jia He 
> > Fixes: b1e57de62cfb ("KVM: arm64: Add stand-alone page-table walker infrastructure")
> > [maz: rewrite commit message]
> > Signed-off-by: Marc Zyngier 
> > Link: https://lore.kernel.org/r/20210303024225.2591-1-justin...@arm.com
> > ---
> >  arch/arm64/kvm/hyp/pgtable.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > index 4d177ce1d536..124cd2f93020 100644
> > --- a/arch/arm64/kvm/hyp/pgtable.c
> > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > @@ -223,7 +223,7 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
> > goto out;
> >
> > if (!table) {
> > -   data->addr += kvm_granule_size(level);
> > +   data->addr = ALIGN(data->addr, kvm_granule_size(level));

What if the previous data->addr is already aligned with kvm_granule_size(level)?
Hence an infinite loop? Am I missing anything else?

--
Cheers,
Justin (Jia He)

> > goto out;
> > }
> 
> If Jia is happy with it, please feel free to add:
> 
> Acked-by: Will Deacon 
> 
> Will


RE: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking

2021-03-03 Thread Justin He
Hi Quentin and Marc
I noticed Marc has sent out a new version on my behalf; thanks for the help.
I hate the time difference; sorry for the late reply.

I'll just answer the comments below to make things clear.

> -Original Message-
> From: Quentin Perret 
> Sent: Wednesday, March 3, 2021 7:09 PM
> To: Marc Zyngier 
> Cc: Justin He ; kvm...@lists.cs.columbia.edu; James
> Morse ; Julien Thierry ;
> Suzuki Poulose ; Catalin Marinas
> ; Will Deacon ; Gavin Shan
> ; Yanan Wang ; linux-arm-
> ker...@lists.infradead.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking
> 
> On Wednesday 03 Mar 2021 at 09:54:25 (+), Marc Zyngier wrote:
> > Hi Jia,
> >
> > On Wed, 03 Mar 2021 02:42:25 +0000,
> > Jia He  wrote:
> > >
> > > If the start addr is not aligned with the granule size of that level,
> > > the loop step size should be adjusted to the boundary instead of a
> > > simple kvm_granule_size(level) increment. Otherwise, some mmu entries
> > > might never be visited by the walk.
> > > E.g. Assume the unmap range [data->addr, data->end] is
> > > [0xff00ab2000,0xff00cb2000] in level 2 walking and NOT block mapping.
> >
> > When does this occur? Upgrade from page mappings to block? Swap out?
> >
> > > And the 1st part of that pmd entry is [0xff00ab2000,0xff00c00000]. The
> > > pmd value is 0x83fbd2c1002 (not a valid entry). In this case, data->addr
> > > should be adjusted to 0xff00c00000 instead of 0xff00cb2000.
> >
> > Let me see if I understand this. Assuming 4k pages, the region
> > described above spans *two* 2M entries:
> >
> > (a) ff00ab2000-ff00c00000, part of ff00a00000-ff00c00000
> > (b) ff00c00000-ff00cb2000, part of ff00c00000-ff00e00000
> >
> > (a) has no valid mapping, but (b) does. Because we fail to correctly
> > align on a block boundary when skipping (a), we also skip (b), which
> > is then left mapped.
> >
> > Did I get it right? If so, yes, this is... annoying.
> >

Yes, that's exactly the case.

> > Understanding the circumstances this triggers in would be most
> > interesting. This current code seems to assume that we get ranges
> > aligned to mapping boundaries, but I seem to remember that the old
> > code did use the stage2_*_addr_end() helpers to deal with this case.
> >
> > Will: I don't think things have changed in that respect, right?
> 
> Indeed we should still use stage2_*_addr_end(), especially in the unmap
> path that is mentioned here, so it would be helpful to have a little bit
> more context.

Yes, stage2_pgd_addr_end() is still there, but stage2_pmd_addr_end() was
removed.
> 
> > > Without this fix, a userspace "segmentation fault" error can be easily
> > > triggered by running simple gVisor runsc cases on an Ampere Altra
> > > server:
> > > docker run --runtime=runsc -it --rm  ubuntu /bin/bash
> > >
> > > In container:
> > > for i in `seq 1 100`;do ls;done
> >
> > The workload on its own isn't that interesting. What I'd like to
> > understand is what happens on the host during that time.

Okay

> >
> > >
> > > Reported-by: Howard Zhang 
> > > Signed-off-by: Jia He 
> > > ---
> > >  arch/arm64/kvm/hyp/pgtable.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > > index bdf8e55ed308..4d99d07c610c 100644
> > > --- a/arch/arm64/kvm/hyp/pgtable.c
> > > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > > @@ -225,6 +225,7 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
> > >   goto out;
> > >
> > >   if (!table) {
> > > + data->addr = ALIGN_DOWN(data->addr, kvm_granule_size(level));
> > >   data->addr += kvm_granule_size(level);
> > >   goto out;
> > >   }
> >
> > It otherwise looks good to me. Quentin, Will: unless you object to
> > this, I plan to take it in the next round of fixes with
> 
> Though I'm still unsure how we hit that today, the change makes sense on
> its own I think, so no objection from me.
> 
> Thanks,
> Quentin


Re: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking

2021-03-03 Thread Will Deacon
On Wed, Mar 03, 2021 at 07:07:37PM +0000, Marc Zyngier wrote:
> From e0524b41a71e0f17d6dc8f197e421e677d584e72 Mon Sep 17 00:00:00 2001
> From: Jia He 
> Date: Wed, 3 Mar 2021 10:42:25 +0800
> Subject: [PATCH] KVM: arm64: Fix range alignment when walking page tables
> 
> When walking the page tables at a given level, and if the start
> address for the range isn't aligned for that level, we propagate
> the misalignment on each iteration at that level.
> 
> This results in the walker ignoring a number of entries (depending
> on the original misalignment) on each subsequent iteration.
> 
> Properly aligning the address at the before the next iteration

"at the before the next" ???

> addresses the issue.
> 
> Cc: sta...@vger.kernel.org
> Reported-by: Howard Zhang 
> Signed-off-by: Jia He 
> Fixes: b1e57de62cfb ("KVM: arm64: Add stand-alone page-table walker infrastructure")
> [maz: rewrite commit message]
> Signed-off-by: Marc Zyngier 
> Link: https://lore.kernel.org/r/20210303024225.2591-1-justin...@arm.com
> ---
>  arch/arm64/kvm/hyp/pgtable.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 4d177ce1d536..124cd2f93020 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -223,7 +223,7 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
>   goto out;
>  
>   if (!table) {
> - data->addr += kvm_granule_size(level);
> + data->addr = ALIGN(data->addr, kvm_granule_size(level));
>   goto out;
>   }

If Jia is happy with it, please feel free to add:

Acked-by: Will Deacon 

Will


Re: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking

2021-03-03 Thread Marc Zyngier
On Wed, 03 Mar 2021 11:29:34 +0000,
Will Deacon  wrote:
> 
> On Wed, Mar 03, 2021 at 10:42:25AM +0800, Jia He wrote:
> > If the start addr is not aligned with the granule size of that level,
> > the loop step size should be adjusted to the boundary instead of a
> > simple kvm_granule_size(level) increment. Otherwise, some mmu entries
> > might never be visited by the walk.
> > E.g. Assume the unmap range [data->addr, data->end] is
> > [0xff00ab2000,0xff00cb2000] in level 2 walking and NOT block mapping.
> > And the 1st part of that pmd entry is [0xff00ab2000,0xff00c00000]. The
> > pmd value is 0x83fbd2c1002 (not a valid entry). In this case, data->addr
> > should be adjusted to 0xff00c00000 instead of 0xff00cb2000.
> > 
> > Without this fix, a userspace "segmentation fault" error can be easily
> > triggered by running simple gVisor runsc cases on an Ampere Altra
> > server:
> > docker run --runtime=runsc -it --rm  ubuntu /bin/bash
> > 
> > In container:
> > for i in `seq 1 100`;do ls;done
> > 
> > Reported-by: Howard Zhang 
> > Signed-off-by: Jia He 
> > ---
> >  arch/arm64/kvm/hyp/pgtable.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > index bdf8e55ed308..4d99d07c610c 100644
> > --- a/arch/arm64/kvm/hyp/pgtable.c
> > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > @@ -225,6 +225,7 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
> > goto out;
> >  
> > if (!table) {
> > +   data->addr = ALIGN_DOWN(data->addr, kvm_granule_size(level));
> > data->addr += kvm_granule_size(level);
> 
> Can you replace both of these lines with:
> 
>   data->addr = ALIGN(data->addr, kvm_granule_size(level));
> 
> instead?

Seems like a good option. I also took the liberty of rewriting the
commit message in an effort to make it a bit clearer.

Jia, please let me know if you are OK with these cosmetic changes.


Thanks,

M.

From e0524b41a71e0f17d6dc8f197e421e677d584e72 Mon Sep 17 00:00:00 2001
From: Jia He 
Date: Wed, 3 Mar 2021 10:42:25 +0800
Subject: [PATCH] KVM: arm64: Fix range alignment when walking page tables

When walking the page tables at a given level, and if the start
address for the range isn't aligned for that level, we propagate
the misalignment on each iteration at that level.

This results in the walker ignoring a number of entries (depending
on the original misalignment) on each subsequent iteration.

Properly aligning the address at the before the next iteration
addresses the issue.

Cc: sta...@vger.kernel.org
Reported-by: Howard Zhang 
Signed-off-by: Jia He 
Fixes: b1e57de62cfb ("KVM: arm64: Add stand-alone page-table walker infrastructure")
[maz: rewrite commit message]
Signed-off-by: Marc Zyngier 
Link: https://lore.kernel.org/r/20210303024225.2591-1-justin...@arm.com
---
 arch/arm64/kvm/hyp/pgtable.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 4d177ce1d536..124cd2f93020 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -223,7 +223,7 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
goto out;
 
if (!table) {
-   data->addr += kvm_granule_size(level);
+   data->addr = ALIGN(data->addr, kvm_granule_size(level));
goto out;
}
 
-- 
2.30.0


-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking

2021-03-03 Thread Will Deacon
On Wed, Mar 03, 2021 at 09:54:25AM +0000, Marc Zyngier wrote:
> Hi Jia,
> 
> On Wed, 03 Mar 2021 02:42:25 +0000,
> Jia He  wrote:
> > 
> > If the start addr is not aligned with the granule size of that level,
> > the loop step size should be adjusted to the boundary instead of a
> > simple kvm_granule_size(level) increment. Otherwise, some mmu entries
> > might never be visited by the walk.
> > E.g. Assume the unmap range [data->addr, data->end] is
> > [0xff00ab2000,0xff00cb2000] in level 2 walking and NOT block mapping.
> 
> When does this occur? Upgrade from page mappings to block? Swap out?
> 
> > And the 1st part of that pmd entry is [0xff00ab2000,0xff00c00000]. The
> > pmd value is 0x83fbd2c1002 (not a valid entry). In this case, data->addr
> > should be adjusted to 0xff00c00000 instead of 0xff00cb2000.
> 
> Let me see if I understand this. Assuming 4k pages, the region
> described above spans *two* 2M entries:
> 
> (a) ff00ab2000-ff00c00000, part of ff00a00000-ff00c00000
> (b) ff00c00000-ff00cb2000, part of ff00c00000-ff00e00000
> 
> (a) has no valid mapping, but (b) does. Because we fail to correctly
> align on a block boundary when skipping (a), we also skip (b), which
> is then left mapped.
> 
> Did I get it right? If so, yes, this is... annoying.
> 
> Understanding the circumstances this triggers in would be most
> interesting. This current code seems to assume that we get ranges
> aligned to mapping boundaries, but I seem to remember that the old
> code did use the stage2_*_addr_end() helpers to deal with this case.
> 
> Will: I don't think things have changed in that respect, right?

We've maintained stage2_pgd_addr_end() for the top-level iterator, but
it looks like we're failing to do the rounding in __kvm_pgtable_visit()
when hitting a leaf entry, so the caller (__kvm_pgtable_walk()) will
terminate early.

I agree that it's annoying, as you'd have expected us to run into this
earlier on.

Will
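
[Worked example] For reference, the *_addr_end() pattern Will refers to
steps to the next boundary at a given table size, clamped to the end of the
range. Below is a simplified sketch of that pattern; the 1GiB shift is an
arbitrary example, and the real stage2 helper derives its geometry from the
stage-2 page-table configuration rather than a fixed constant.

#include <stdint.h>

#define PGDIR_SHIFT	30			/* example only: 1GiB top-level entries */
#define PGDIR_SIZE	(1ULL << PGDIR_SHIFT)
#define PGDIR_MASK	(~(PGDIR_SIZE - 1))

/* Advance to the next PGDIR boundary, or to 'end' if that comes first.
 * The '- 1' in the comparison keeps the result correct even if
 * 'boundary' wraps to 0 at the very top of the address space. */
static inline uint64_t pgd_addr_end(uint64_t addr, uint64_t end)
{
	uint64_t boundary = (addr + PGDIR_SIZE) & PGDIR_MASK;

	return (boundary - 1 < end - 1) ? boundary : end;
}

An iterator built on this helper never steps over a boundary, which is what
the top-level loop gets and what the per-level stepping in
__kvm_pgtable_visit() was missing for leaf entries.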


Re: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking

2021-03-03 Thread Will Deacon
On Wed, Mar 03, 2021 at 10:42:25AM +0800, Jia He wrote:
> If the start addr is not aligned with the granule size of that level,
> the loop step size should be adjusted to the boundary instead of a
> simple kvm_granule_size(level) increment. Otherwise, some mmu entries
> might never be visited by the walk.
> E.g. Assume the unmap range [data->addr, data->end] is
> [0xff00ab2000,0xff00cb2000] in level 2 walking and NOT block mapping.
> And the 1st part of that pmd entry is [0xff00ab2000,0xff00c00000]. The
> pmd value is 0x83fbd2c1002 (not a valid entry). In this case, data->addr
> should be adjusted to 0xff00c00000 instead of 0xff00cb2000.
> 
> Without this fix, a userspace "segmentation fault" error can be easily
> triggered by running simple gVisor runsc cases on an Ampere Altra
> server:
> docker run --runtime=runsc -it --rm  ubuntu /bin/bash
> 
> In container:
> for i in `seq 1 100`;do ls;done
> 
> Reported-by: Howard Zhang 
> Signed-off-by: Jia He 
> ---
>  arch/arm64/kvm/hyp/pgtable.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index bdf8e55ed308..4d99d07c610c 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -225,6 +225,7 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
>   goto out;
>  
>   if (!table) {
> + data->addr = ALIGN_DOWN(data->addr, kvm_granule_size(level));
>   data->addr += kvm_granule_size(level);

Can you replace both of these lines with:

data->addr = ALIGN(data->addr, kvm_granule_size(level));

instead?

Will
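
[Worked example] For the misaligned case from the commit message, this
one-liner does the right thing: ALIGN() rounds a misaligned address up to
the next block boundary in a single step. A quick check with a local
re-implementation of the macro (its behaviour on an already-aligned input
is a separate problem, raised elsewhere in this thread):

#include <inttypes.h>
#include <stdio.h>

#define GRANULE (1ULL << 21)	/* 2MiB level-2 granule, 4K pages assumed */
#define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
	/* The misaligned start from the example rounds up to the next
	 * 2MiB boundary: 0xff00ab2000 -> 0xff00c00000. */
	printf("%#" PRIx64 "\n", (uint64_t)ALIGN(0xff00ab2000ULL, GRANULE));
	return 0;
}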


Re: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking

2021-03-03 Thread Quentin Perret
On Wednesday 03 Mar 2021 at 09:54:25 (+), Marc Zyngier wrote:
> Hi Jia,
> 
> On Wed, 03 Mar 2021 02:42:25 +0000,
> Jia He  wrote:
> > 
> > If the start addr is not aligned with the granule size of that level,
> > the loop step size should be adjusted to the boundary instead of a
> > simple kvm_granule_size(level) increment. Otherwise, some mmu entries
> > might never be visited by the walk.
> > E.g. Assume the unmap range [data->addr, data->end] is
> > [0xff00ab2000,0xff00cb2000] in level 2 walking and NOT block mapping.
> 
> When does this occur? Upgrade from page mappings to block? Swap out?
> 
> > And the 1st part of that pmd entry is [0xff00ab2000,0xff00c00000]. The
> > pmd value is 0x83fbd2c1002 (not a valid entry). In this case, data->addr
> > should be adjusted to 0xff00c00000 instead of 0xff00cb2000.
> 
> Let me see if I understand this. Assuming 4k pages, the region
> described above spans *two* 2M entries:
> 
> (a) ff00ab2000-ff00c00000, part of ff00a00000-ff00c00000
> (b) ff00c00000-ff00cb2000, part of ff00c00000-ff00e00000
> 
> (a) has no valid mapping, but (b) does. Because we fail to correctly
> align on a block boundary when skipping (a), we also skip (b), which
> is then left mapped.
> 
> Did I get it right? If so, yes, this is... annoying.
> 
> Understanding the circumstances this triggers in would be most
> interesting. This current code seems to assume that we get ranges
> aligned to mapping boundaries, but I seem to remember that the old
> code did use the stage2_*_addr_end() helpers to deal with this case.
> 
> Will: I don't think things have changed in that respect, right?

Indeed we should still use stage2_*_addr_end(), especially in the unmap
path that is mentioned here, so it would be helpful to have a little bit
more context.

> > Without this fix, a userspace "segmentation fault" error can be easily
> > triggered by running simple gVisor runsc cases on an Ampere Altra
> > server:
> > docker run --runtime=runsc -it --rm  ubuntu /bin/bash
> > 
> > In container:
> > for i in `seq 1 100`;do ls;done
> 
> The workload on its own isn't that interesting. What I'd like to
> understand is what happens on the host during that time.
> 
> > 
> > Reported-by: Howard Zhang 
> > Signed-off-by: Jia He 
> > ---
> >  arch/arm64/kvm/hyp/pgtable.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > index bdf8e55ed308..4d99d07c610c 100644
> > --- a/arch/arm64/kvm/hyp/pgtable.c
> > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > @@ -225,6 +225,7 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
> > goto out;
> >  
> > if (!table) {
> > +   data->addr = ALIGN_DOWN(data->addr, kvm_granule_size(level));
> > data->addr += kvm_granule_size(level);
> > goto out;
> > }
> 
> It otherwise looks good to me. Quentin, Will: unless you object to
> this, I plan to take it in the next round of fixes with

Though I'm still unsure how we hit that today, the change makes sense on
its own I think, so no objection from me.

Thanks,
Quentin


Re: [PATCH] KVM: arm64: Fix unaligned addr case in mmu walking

2021-03-03 Thread Marc Zyngier
Hi Jia,

On Wed, 03 Mar 2021 02:42:25 +0000,
Jia He  wrote:
> 
> If the start addr is not aligned with the granule size of that level,
> the loop step size should be adjusted to the boundary instead of a
> simple kvm_granule_size(level) increment. Otherwise, some mmu entries
> might never be visited by the walk.
> E.g. Assume the unmap range [data->addr, data->end] is
> [0xff00ab2000,0xff00cb2000] in level 2 walking and NOT block mapping.

When does this occur? Upgrade from page mappings to block? Swap out?

> And the 1st part of that pmd entry is [0xff00ab2000,0xff00c00000]. The
> pmd value is 0x83fbd2c1002 (not a valid entry). In this case, data->addr
> should be adjusted to 0xff00c00000 instead of 0xff00cb2000.

Let me see if I understand this. Assuming 4k pages, the region
described above spans *two* 2M entries:

(a) ff00ab2000-ff00c00000, part of ff00a00000-ff00c00000
(b) ff00c00000-ff00cb2000, part of ff00c00000-ff00e00000

(a) has no valid mapping, but (b) does. Because we fail to correctly
align on a block boundary when skipping (a), we also skip (b), which
is then left mapped.

Did I get it right? If so, yes, this is... annoying.

Understanding the circumstances this triggers in would be most
interesting. This current code seems to assume that we get ranges
aligned to mapping boundaries, but I seem to remember that the old
code did use the stage2_*_addr_end() helpers to deal with this case.

Will: I don't think things have changed in that respect, right?

> 
> Without this fix, a userspace "segmentation fault" error can be easily
> triggered by running simple gVisor runsc cases on an Ampere Altra
> server:
> docker run --runtime=runsc -it --rm  ubuntu /bin/bash
> 
> In container:
> for i in `seq 1 100`;do ls;done

The workload on its own isn't that interesting. What I'd like to
understand is what happens on the host during that time.

> 
> Reported-by: Howard Zhang 
> Signed-off-by: Jia He 
> ---
>  arch/arm64/kvm/hyp/pgtable.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index bdf8e55ed308..4d99d07c610c 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -225,6 +225,7 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
>   goto out;
>  
>   if (!table) {
> + data->addr = ALIGN_DOWN(data->addr, kvm_granule_size(level));
>   data->addr += kvm_granule_size(level);
>   goto out;
>   }

It otherwise looks good to me. Quentin, Will: unless you object to
this, I plan to take it in the next round of fixes with

Fixes: b1e57de62cfb ("KVM: arm64: Add stand-alone page-table walker infrastructure")
Cc: sta...@vger.kernel.org

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.
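
[Worked example] Marc's (a)/(b) decomposition can be checked mechanically.
The sketch below (local macros, 4K pages assumed, so level-2 blocks are
2MiB) prints each 2MiB entry covered by the unmap range together with the
sub-range it contributes:

#include <inttypes.h>
#include <stdio.h>

#define BLOCK (1ULL << 21)	/* 2MiB level-2 block, 4K pages assumed */
#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1))

int main(void)
{
	uint64_t start = 0xff00ab2000, end = 0xff00cb2000;

	/* Entry (a) is ff00a00000 (no valid mapping in the example);
	 * entry (b) is ff00c00000 (mapped, and skipped by the unpatched
	 * walker). */
	for (uint64_t e = ALIGN_DOWN(start, BLOCK); e < end; e += BLOCK) {
		uint64_t lo = e > start ? e : start;
		uint64_t hi = e + BLOCK < end ? e + BLOCK : end;

		printf("entry %#" PRIx64 ": [%#" PRIx64 ", %#" PRIx64 ")\n",
		       e, lo, hi);
	}
	return 0;
}

The output reproduces (a) and (b) above, confirming that a walk which
steps over the ff00c00000 boundary leaves entry (b) untouched.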


[PATCH] KVM: arm64: Fix unaligned addr case in mmu walking

2021-03-03 Thread Jia He
If the start addr is not aligned with the granule size of that level,
the loop step size should be adjusted to the boundary instead of a
simple kvm_granule_size(level) increment. Otherwise, some mmu entries
might never be visited by the walk.
E.g. Assume the unmap range [data->addr, data->end] is
[0xff00ab2000,0xff00cb2000] in level 2 walking and NOT block mapping.
And the 1st part of that pmd entry is [0xff00ab2000,0xff00c00000]. The
pmd value is 0x83fbd2c1002 (not a valid entry). In this case, data->addr
should be adjusted to 0xff00c00000 instead of 0xff00cb2000.

Without this fix, a userspace "segmentation fault" error can be easily
triggered by running simple gVisor runsc cases on an Ampere Altra
server:
docker run --runtime=runsc -it --rm  ubuntu /bin/bash

In container:
for i in `seq 1 100`;do ls;done

Reported-by: Howard Zhang 
Signed-off-by: Jia He 
---
 arch/arm64/kvm/hyp/pgtable.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index bdf8e55ed308..4d99d07c610c 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -225,6 +225,7 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
goto out;
 
if (!table) {
+   data->addr = ALIGN_DOWN(data->addr, kvm_granule_size(level));
data->addr += kvm_granule_size(level);
goto out;
}
-- 
2.17.1