Re: [PATCH v2 0/3] Fix several bugs in KVM stage 2 translation

2020-12-02 Thread wangyanan (Y)



On 2020/12/2 20:23, Marc Zyngier wrote:

Hi Yanan,

[...]


BTW: there are two more things below that I want to talk about.

1.  Recently, I have been focusing on the ARMv8.4-TTRem feature which
is aimed at changing block size in stage 2 mapping.

I have a plan to implement this feature for stage 2 translation when
splitting a block into tables or merging tables into a block.

This feature supports changing block size without performing
*break-before-make*, which might have some improvement on performance.

What do you think about this?


It would be interesting if you can demonstrate some significant
performance improvements compared to the same workload with BBM.

I'm not completely convinced this would change much, given that
it is only when moving from a table to a block mapping that you
can elide BBM when the support level is 1 or 2. As far as I can
tell, this only happens in the "stop logging" case.

Is that something that happens often enough to justify the added
complexity? Having to handle TLB Conflict Abort is annoying, for
example.


I will take more consideration about the necessity  and maybe some tests

on the performance will be made later.


Thanks,


Yanan




2. Given that the issues we discussed before were found in practice
when guest state changes from dirty logging to dirty logging canceled.

I could add a test file testing on this case to selftests/ or kvm unit
tests/, if it's necessary.


That would be awesome, and I'd be very grateful if you did. It is the
second time we break this exact case, and having a reliable way to
verify it would definitely help.

Thanks,

    M.


Re: [PATCH v2 0/3] Fix several bugs in KVM stage 2 translation

2020-12-02 Thread Marc Zyngier
On Wed, 2 Dec 2020 04:10:31 +0800, Yanan Wang wrote:
> When installing a new pte entry or updating an old valid entry in stage 2
> translation, we use get_page()/put_page() to record page_count of the 
> page-table
> pages. PATCH 1/3 aims to fix incorrect use of get_page()/put_page() in stage 
> 2,
> which might make page-table pages unable to be freed when unmapping a range.
> 
> When dirty logging of a guest with hugepages is finished, we should merge 
> tables
> back into a block entry if adjustment of huge mapping is found necessary.
> In addition to installing the block entry, we should not only free the 
> non-huge
> page-table pages but also invalidate all the TLB entries of non-huge mappings 
> for
> the block. PATCH 2/3 adds enough TLBI when merging tables into a block entry.
> 
> [...]

Applied to kvm-arm64/fixes-5.10, thanks!

[1/3] KVM: arm64: Fix memory leak on stage2 update of a valid PTE
  commit: 5c646b7e1d8bcb12317426287c516dfa4c5171c2
[2/3] KVM: arm64: Fix handling of merging tables into a block entry
  commit: 3a0b870e3448302ca2ba703bea1b79b61c3f33c6
[3/3] KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort()
  commit: 7d894834a305568a0168c55d4729216f5f8cb4e6

Cheers,

M.
-- 
Without deviation from the norm, progress is not possible.




Re: [PATCH v2 0/3] Fix several bugs in KVM stage 2 translation

2020-12-02 Thread Marc Zyngier

Hi Yanan,

[...]


BTW: there are two more things below that I want to talk about.

1.  Recently, I have been focusing on the ARMv8.4-TTRem feature which
is aimed at changing block size in stage 2 mapping.

I have a plan to implement this feature for stage 2 translation when
splitting a block into tables or merging tables into a block.

This feature supports changing block size without performing
*break-before-make*, which might have some improvement on performance.

What do you think about this?


It would be interesting if you can demonstrate some significant
performance improvements compared to the same workload with BBM.

I'm not completely convinced this would change much, given that
it is only when moving from a table to a block mapping that you
can elide BBM when the support level is 1 or 2. As far as I can
tell, this only happens in the "stop logging" case.

Is that something that happens often enough to justify the added
complexity? Having to handle TLB Conflict Abort is annoying, for
example.


2. Given that the issues we discussed before were found in practice
when guest state changes from dirty logging to dirty logging canceled.

I could add a test file testing on this case to selftests/ or kvm unit
tests/, if it's necessary.


That would be awesome, and I'd be very grateful if you did. It is the
second time we break this exact case, and having a reliable way to
verify it would definitely help.

Thanks,

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH v2 0/3] Fix several bugs in KVM stage 2 translation

2020-12-02 Thread wangyanan (Y)

Hi Will, Marc,
On 2020/12/2 4:59, Will Deacon wrote:

On Wed, Dec 02, 2020 at 04:10:31AM +0800, Yanan Wang wrote:

When installing a new pte entry or updating an old valid entry in stage 2
translation, we use get_page()/put_page() to record page_count of the page-table
pages. PATCH 1/3 aims to fix incorrect use of get_page()/put_page() in stage 2,
which might make page-table pages unable to be freed when unmapping a range.

When dirty logging of a guest with hugepages is finished, we should merge tables
back into a block entry if adjustment of huge mapping is found necessary.
In addition to installing the block entry, we should not only free the non-huge
page-table pages but also invalidate all the TLB entries of non-huge mappings 
for
the block. PATCH 2/3 adds enough TLBI when merging tables into a block entry.

The rewrite of page-table code and fault handling add two different handlers
for "just relaxing permissions" and "map by stage2 page-table walk", that's
good improvement. Yet, in function user_mem_abort(), conditions where we choose
the above two fault handlers are not strictly distinguished. This will causes
guest errors such as infinite-loop (soft lockup will occur in result), because 
of
calling the inappropriate fault handler. So, a solution that can strictly
distinguish conditions is introduced in PATCH 3/3.

For the series:

Acked-by: Will Deacon 

Thanks for reporting these, helping me to understand the issues and then
spinning a v2 so promptly.

Will
.


Thanks for the help and suggestions.

BTW: there are two more things below that I want to talk about.

1.  Recently, I have been focusing on the ARMv8.4-TTRem feature which is 
aimed at changing block size in stage 2 mapping.


I have a plan to implement this feature for stage 2 translation when 
splitting a block into tables or merging tables into a block.


This feature supports changing block size without performing 
*break-before-make*, which might have some improvement on performance.


What do you think about this?


2. Given that the issues we discussed before were found in practice when 
guest state changes from dirty logging to dirty logging canceled.


I could add a test file testing on this case to selftests/ or kvm unit 
tests/, if it's necessary.



Yanan



Re: [PATCH v2 0/3] Fix several bugs in KVM stage 2 translation

2020-12-01 Thread Will Deacon
On Wed, Dec 02, 2020 at 04:10:31AM +0800, Yanan Wang wrote:
> When installing a new pte entry or updating an old valid entry in stage 2
> translation, we use get_page()/put_page() to record page_count of the 
> page-table
> pages. PATCH 1/3 aims to fix incorrect use of get_page()/put_page() in stage 
> 2,
> which might make page-table pages unable to be freed when unmapping a range.
> 
> When dirty logging of a guest with hugepages is finished, we should merge 
> tables
> back into a block entry if adjustment of huge mapping is found necessary.
> In addition to installing the block entry, we should not only free the 
> non-huge
> page-table pages but also invalidate all the TLB entries of non-huge mappings 
> for
> the block. PATCH 2/3 adds enough TLBI when merging tables into a block entry.
> 
> The rewrite of page-table code and fault handling add two different handlers
> for "just relaxing permissions" and "map by stage2 page-table walk", that's
> good improvement. Yet, in function user_mem_abort(), conditions where we 
> choose
> the above two fault handlers are not strictly distinguished. This will causes
> guest errors such as infinite-loop (soft lockup will occur in result), 
> because of
> calling the inappropriate fault handler. So, a solution that can strictly
> distinguish conditions is introduced in PATCH 3/3.

For the series:

Acked-by: Will Deacon 

Thanks for reporting these, helping me to understand the issues and then
spinning a v2 so promptly.

Will


[PATCH v2 0/3] Fix several bugs in KVM stage 2 translation

2020-12-01 Thread Yanan Wang
When installing a new pte entry or updating an old valid entry in stage 2
translation, we use get_page()/put_page() to record page_count of the page-table
pages. PATCH 1/3 aims to fix incorrect use of get_page()/put_page() in stage 2,
which might make page-table pages unable to be freed when unmapping a range.

When dirty logging of a guest with hugepages is finished, we should merge tables
back into a block entry if adjustment of huge mapping is found necessary.
In addition to installing the block entry, we should not only free the non-huge
page-table pages but also invalidate all the TLB entries of non-huge mappings 
for
the block. PATCH 2/3 adds enough TLBI when merging tables into a block entry.

The rewrite of page-table code and fault handling add two different handlers
for "just relaxing permissions" and "map by stage2 page-table walk", that's
good improvement. Yet, in function user_mem_abort(), conditions where we choose
the above two fault handlers are not strictly distinguished. This will causes
guest errors such as infinite-loop (soft lockup will occur in result), because 
of
calling the inappropriate fault handler. So, a solution that can strictly
distinguish conditions is introduced in PATCH 3/3.

Changes from v1:
 * In PATCH 1/3, introduce a more concise fix.
 * In PATCH 2/3, using full S2 TLB invalidation when merging tables into
   a block entry.

Yanan Wang (3):
  KVM: arm64: Fix possible memory leak in kvm stage2
  KVM: arm64: Fix handling of merging tables into a block entry
  KVM: arm64: Add usage of stage 2 fault lookup level in
user_mem_abort()

 arch/arm64/include/asm/esr.h |  1 +
 arch/arm64/include/asm/kvm_emulate.h |  5 +
 arch/arm64/kvm/hyp/pgtable.c | 11 ++-
 arch/arm64/kvm/mmu.c | 11 +--
 4 files changed, 25 insertions(+), 3 deletions(-)


-- 
2.19.1