Hi,
On 2020/8/6 2:19, Sami Tolvanen wrote:
> Commit 7c78f67e9bd9 ("arm64: enable tlbi range instructions") breaks
> LLVM's integrated assembler, because -Wa,-march is only passed to
> external assemblers and therefore, the new instructions are not enabled
> when IAS is used.
>
I have looked
On 2020/5/22 23:50, Catalin Marinas wrote:
> On Thu, Apr 23, 2020 at 09:56:52PM +0800, Zhenyu Ye wrote:
>> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
>> index bc3949064725..5f9f189bc6d2 100644
>> --- a/arch/arm64/include/asm/tlbflush.h
On 2020/5/22 23:49, Catalin Marinas wrote:
> On Thu, Apr 23, 2020 at 09:56:53PM +0800, Zhenyu Ye wrote:
>> @@ -190,8 +196,8 @@ static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
>> unsigned long addr = __TLBI_VADDR(uaddr, ASID(vma->vm_mm))
On 2020/5/22 23:42, Catalin Marinas wrote:
> On Thu, Apr 23, 2020 at 09:56:55PM +0800, Zhenyu Ye wrote:
>> diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
>> index 3d7c01e76efc..3eff199d3507 100644
>> --- a/mm/pgtable-generic.c
>> +++ b/mm/pgtable-generic.c
This patch uses the cleared_* in struct mmu_gather to set the
TTL field in flush_tlb_range().
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/tlb.h | 29 -
arch/arm64/include/asm/tlbflush.h | 14 --
2 files changed, 36 insertions(+), 7 deletions(-)
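For context, the level can be recovered from those bits with a small
helper; a minimal sketch, modelled on the tlb_get_level() helper this
patch adds (illustration only, not the posted code):
---8<---
/* Depends on struct mmu_gather from include/asm-generic/tlb.h. */
static inline int tlb_get_level(struct mmu_gather *tlb)
{
        /* Only PTEs were cleared: the entries live at level 3. */
        if (tlb->cleared_ptes && !(tlb->cleared_pmds ||
                                   tlb->cleared_puds ||
                                   tlb->cleared_p4ds))
                return 3;

        /* Only PMDs were cleared: level 2. */
        if (tlb->cleared_pmds && !(tlb->cleared_ptes ||
                                   tlb->cleared_puds ||
                                   tlb->cleared_p4ds))
                return 2;

        /* Only PUDs were cleared: level 1. */
        if (tlb->cleared_puds && !(tlb->cleared_ptes ||
                                   tlb->cleared_pmds ||
                                   tlb->cleared_p4ds))
                return 1;

        /* Mixed levels or nothing recorded: 0 means "no hint". */
        return 0;
}
---8<---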
level value to 0.
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/tlbflush.h | 14 ++
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 8adbd6fd8489..969dcf88e2a9 100644
--- a/arch/arm64/include/asm/tlbflush.h
the ARMv8.4 TTL feature
arm64: Add level-hinted TLB invalidation helper
Peter Zijlstra (Intel) (1):
tlb: mmu_gather: add tlb_flush_*_range APIs
Zhenyu Ye (3):
arm64: Add tlbi_user_level TLB invalidation helper
mm: tlb: Provide flush_*_tlb_range wrappers
arm64: tlb: Set the TTL field
-by: Zhenyu Ye
Reviewed-by: Catalin Marinas
---
arch/arm64/include/asm/cpucaps.h | 3 ++-
arch/arm64/include/asm/sysreg.h | 1 +
arch/arm64/kernel/cpufeature.c | 11 +++
3 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
From: Marc Zyngier
Add a level-hinted TLB invalidation helper that only gets used if
ARMv8.4-TTL gets detected.
Signed-off-by: Marc Zyngier
Signed-off-by: Zhenyu Ye
Reviewed-by: Catalin Marinas
---
arch/arm64/include/asm/tlbflush.h | 29 +
1 file changed, 29 insertions(+)
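The helper encodes the level into bits [47:44] of the TLBI argument
when the CPU implements ARMv8.4-TTL, and degrades to a plain TLBI
otherwise; a sketch, assuming a 4K granule for the TG field (the real
code derives it from PAGE_SIZE):
---8<---
#define TLBI_TTL_MASK   GENMASK_ULL(47, 44)

#define __tlbi_level(op, addr, level) do {                              \
        u64 arg = addr;                                                 \
                                                                        \
        /* Only encode the hint if ARMv8.4-TTL is present. */          \
        if (cpus_have_const_cap(ARM64_HAS_ARMV8_4_TTL) && level) {      \
                u64 ttl = level & 3;                                    \
                ttl |= 1 << 2;  /* TG = 0b01: 4K granule (assumed) */   \
                arg &= ~TLBI_TTL_MASK;                                  \
                arg |= FIELD_PREP(TLBI_TTL_MASK, ttl);                  \
        }                                                               \
                                                                        \
        __tlbi(op, arg);                                                \
} while (0)
---8<---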
From: "Peter Zijlstra (Intel)"
tlb_flush_{pte|pmd|pud|p4d}_range() adjust tlb->start and
tlb->end, then set the corresponding cleared_* flags.
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Zhenyu Ye
Acked-by: Catalin Marinas
---
include/asm-generic/tlb.h
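Two of the helpers as a sketch (the pte and pmd cases; pud/p4d follow
the same pattern in include/asm-generic/tlb.h):
---8<---
static inline void tlb_flush_pte_range(struct mmu_gather *tlb,
                                       unsigned long address,
                                       unsigned long size)
{
        /* Widen tlb->start/tlb->end to cover [address, address + size). */
        __tlb_adjust_range(tlb, address, size);
        tlb->cleared_ptes = 1;
}

static inline void tlb_flush_pmd_range(struct mmu_gather *tlb,
                                       unsigned long address,
                                       unsigned long size)
{
        __tlb_adjust_range(tlb, address, size);
        tlb->cleared_pmds = 1;
}
---8<---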
This patch provides flush_{pte|pmd|pud|p4d}_tlb_range() in generic
code, expressed through the mmu_gather APIs. These
interfaces set tlb->cleared_* and finally call tlb_flush(), so we
can do the tlb invalidation according to the information in
struct mmu_gather.
Signed-off-by: Zhenyu Ye
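Each wrapper roughly takes this shape, a sketch reconstructed from the
description above using the mmu_gather API of that time (the posted
series generates these with a macro, and details changed across
revisions):
---8<---
static inline void flush_pmd_tlb_range(struct vm_area_struct *vma,
                                       unsigned long addr, unsigned long end)
{
        struct mmu_gather tlb;

        tlb_gather_mmu(&tlb, vma->vm_mm, addr, end);
        tlb_start_vma(&tlb, vma);
        /* Record the range and the cleared level in the gather... */
        tlb_flush_pmd_range(&tlb, addr, end - addr);
        tlb_end_vma(&tlb, vma);
        /* ...so the arch tlb_flush() can derive a TTL hint from it. */
        tlb_finish_mmu(&tlb, addr, end);
}
---8<---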
On 2020/6/1 19:56, Catalin Marinas wrote:
> Hi Zhenyu,
>
> On Sat, May 30, 2020 at 06:24:21PM +0800, Zhenyu Ye wrote:
>> On 2020/5/26 22:52, Catalin Marinas wrote:
>>> On Mon, May 25, 2020 at 03:19:42PM +0800, Zhenyu Ye wrote:
>>>> tlb_flush_##_pxx##_
ARMv8.4-TLBI provides TLBI invalidation instructions that apply to a
range of input addresses. This patch detects this feature.
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/cpucaps.h | 3 ++-
arch/arm64/include/asm/sysreg.h | 4
arch/arm64/kernel/cpufeature.c | 11 +++
3
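Detection amounts to reading ID_AA64ISAR0_EL1.TLB; a sketch of the
cpufeature table entry (the capability name varied across revisions;
this uses the ARM64_HAS_TLBI_RANGE spelling seen elsewhere in this
thread):
---8<---
{
        .desc = "TLB range maintenance instructions",
        .capability = ARM64_HAS_TLBI_RANGE,
        .type = ARM64_CPUCAP_SYSTEM_FEATURE,
        .matches = has_cpuid_feature,
        .sys_reg = SYS_ID_AA64ISAR0_EL1,
        .field_pos = ID_AA64ISAR0_TLB_SHIFT,
        .sign = FTR_UNSIGNED,
        .min_field_value = 2,   /* 0b0010: range TLBI supported */
},
---8<---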
-rc1.
v2:
Link: https://lkml.org/lkml/2019/11/11/348
Zhenyu Ye (2):
arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature
arm64: tlb: Use the TLBI RANGE feature in arm64
arch/arm64/include/asm/cpucaps.h | 3 +-
arch/arm64/include/asm/sysreg.h | 4 ++
arch/arm64/include/asm/tlbflush.h | 108
use 'end - start < threshold number' to decide which way
to go; however, different hardware may have different thresholds, so
I'm not sure whether this is feasible.
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/tlbflush.h | 98 +++
1 file changed, 86 insertions(+), 12 deletions(-)
Hi Catalin,
I have sent v4 of this series [1] and combined the two functions into
a single loop. See the code for details.
[1]
https://lore.kernel.org/linux-arm-kernel/20200601144713.-1-yezhen...@huawei.com/
On 2020/5/21 1:08, Catalin Marinas wrote:
>> This optimization is only effective
Hi all,
Some optimizations to the code:
On 2020/6/1 22:47, Zhenyu Ye wrote:
> - start = __TLBI_VADDR(start, asid);
> - end = __TLBI_VADDR(end, asid);
> + /*
> + * The minimum size of TLB RANGE is 2 pages;
> + * Use normal TLB instruction to h
invalidation helper
Peter Zijlstra (Intel) (1):
tlb: mmu_gather: add tlb_flush_*_range APIs
Zhenyu Ye (3):
arm64: Add tlbi_user_level TLB invalidation helper
arm64: tlb: Set the TTL field in flush_tlb_range
arm64: tlb: Set the TTL field in flush_*_tlb_range
arch/arm64/include/asm/cpucaps.h
This patch implements flush_{pmd|pud}_tlb_range() in arm64 by
calling __flush_tlb_range() with the corresponding stride and
tlb_level values.
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/pgtable.h | 10 ++
1 file changed, 10 insertions(+)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
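In its merged form this is just two defines in
arch/arm64/include/asm/pgtable.h (sketch): PMD entries sit at level 2,
PUD entries at level 1:
---8<---
#define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE

#define flush_pmd_tlb_range(vma, addr, end)     \
        __flush_tlb_range(vma, addr, end, PMD_SIZE, false, 2)
#define flush_pud_tlb_range(vma, addr, end)     \
        __flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
---8<---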
From: "Peter Zijlstra (Intel)"
tlb_flush_{pte|pmd|pud|p4d}_range() adjust the tlb->start and
tlb->end, then set corresponding cleared_*.
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Zhenyu Ye
Acked-by: Catalin Marinas
---
include/asm-ge
level value of flush_tlb_range() to 0,
which will be updated in future patches. And set the ttl value of
flush_tlb_page_nosync() to 3 because it is only called to flush a
single pte page.
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/tlbflush.h | 19 +--
1 file changed, 13 insertions(+), 6 deletions(-)
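The flush_tlb_page_nosync() part is a two-line change; a sketch (note
the follow-up fix later in this archive drops the hard-coded 3 again,
because some callers pass pmd-level addresses):
---8<---
static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
                                         unsigned long uaddr)
{
        unsigned long addr = __TLBI_VADDR(uaddr, ASID(vma->vm_mm));

        dsb(ishst);
        /* A single last-level (pte) entry is invalidated: level 3. */
        __tlbi_level(vale1is, addr, 3);
        __tlbi_user_level(vale1is, addr, 3);
}
---8<---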
This patch uses the cleared_* in struct mmu_gather to set the
TTL field in flush_tlb_range().
Signed-off-by: Zhenyu Ye
Reviewed-by: Catalin Marinas
---
arch/arm64/include/asm/tlb.h | 29 -
arch/arm64/include/asm/tlbflush.h | 14 --
2 files changed, 36 insertions(+), 7 deletions(-)
Hi Anshuman,
On 2020/5/26 10:39, Anshuman Khandual wrote:
> This patch (https://patchwork.kernel.org/patch/11557359/) is adding some
> more ID_AA64MMFR2 features including the TTL. I am going to respin parts
> of the V4 series patches along with the above mentioned patch. So please
> rebase this
Hi Catalin,
Sorry for taking so long to reply to you.
On 2020/5/26 22:52, Catalin Marinas wrote:
> On Mon, May 25, 2020 at 03:19:42PM +0800, Zhenyu Ye wrote:
>>
>> tlb_flush_##_pxx##_range() is used to set tlb->cleared_*,
>> flush_##_pxx##_tlb_range() will actu
Hi all,
There are checks of ioeventfd collision in both kvm_assign_ioeventfd_idx()
and kvm_deassign_ioeventfd_idx(), however, with different logic.
In kvm_assign_ioeventfd_idx(), this is done by ioeventfd_check_collision():
---8<---
if (_p->bus_idx == p->bus_idx &&
_p->addr
...
}
---8<---
This may be easier to understand (it keeps the same logic in assign/deassign).
I will send a formal patch soon.
Thanks,
Zhenyu
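For reference, the assign-side check being discussed reads in full
(virt/kvm/eventfd.c of this era):
---8<---
static bool
ioeventfd_check_collision(struct kvm *kvm, struct _ioeventfd *p)
{
        struct _ioeventfd *_p;

        list_for_each_entry(_p, &kvm->ioeventfds, list)
                if (_p->bus_idx == p->bus_idx && _p->addr == p->addr &&
                    (_p->length == 0 || p->length == 0 ||
                     (_p->length == p->length &&
                      (_p->datamatch == p->datamatch ||
                       !_p->wildcard || !p->wildcard))))
                        return true;

        return false;
}
---8<---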
> On Thu, 30 Jul 2020 at 16:36, Zhenyu Ye <yezhen...@huawei.com> wrote:
>
> Hi all,
>
>
On 2020/7/31 14:44, Paolo Bonzini wrote:
> On 31/07/20 08:39, Zhenyu Ye wrote:
>> On 2020/7/31 2:03, Paolo Bonzini wrote:
>>> Yes, I think it's not needed. Probably the deassign check can be turned
>>> into an assertion?
>>>
>>> Paolo
>>
Get the corresponding ioeventfd from kvm->ioeventfds. If no match
is found, return NULL. This is used in kvm_assign_ioeventfd_idx()
and kvm_deassign_ioeventfd_idx().
Signed-off-by: Zhenyu Ye
---
virt/kvm/eventfd.c | 53 --
1 file changed, 28 insertions(+), 25 deletions(-)
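A sketch of what such a lookup helper could look like; the name,
signature, and match rules here are illustrative reconstructions from
the description, not the posted patch:
---8<---
static struct _ioeventfd *
kvm_find_ioeventfd(struct kvm *kvm, struct _ioeventfd *key)   /* hypothetical */
{
        struct _ioeventfd *p;

        list_for_each_entry(p, &kvm->ioeventfds, list) {
                if (p->bus_idx == key->bus_idx && p->addr == key->addr &&
                    p->length == key->length &&
                    (p->wildcard || p->datamatch == key->datamatch))
                        return p;       /* same registration */
        }

        return NULL;    /* no match found */
}
---8<---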
Hi Marc,
On 2020/7/26 1:40, Marc Zyngier wrote:
> On 2020-07-24 14:43, Zhenyu Ye wrote:
>> Now in unmap_stage2_range(), we flush tlbs one by one just after the
>> corresponding pages cleared. However, this may cause some performance
>> problems when the unmap range is ver
On 2020/7/14 0:59, Catalin Marinas wrote:
>> +config ARM64_TLBI_RANGE
>> +	bool "Enable support for tlbi range feature"
>> +	default y
>> +	depends on AS_HAS_TLBI_RANGE
>> +	help
>> +	  ARMv8.4-TLBI provides TLBI invalidation instructions that apply to a
>> +	  range of input
The TLBI RANGE feature introduces new assembly instructions that are
only supported by binutils >= 2.30. Add the necessary Kconfig logic to
allow this to be enabled, and pass '-march=armv8.4-a' to KBUILD_CFLAGS.
Signed-off-by: Zhenyu Ye
---
arch/arm64/Kconfig | 14 ++
arch/arm64/Makefile |
ARMv8.4-TLBI provides TLBI invalidation instructions that apply to a
range of input addresses. This patch detects this feature.
Signed-off-by: Zhenyu Ye
Link: https://lore.kernel.org/r/20200710094420.517-2-yezhen...@huawei.com
[catalin.mari...@arm.com: some renaming for consistency]
Signed-off
)), will incur extra overhead.
So increase 'scale' from 0 to the maximum; the flush order is then
exactly the opposite of the example.
Signed-off-by: Zhenyu Ye
Link: https://lore.kernel.org/r/20200710094420.517-3-yezhen...@huawei.com
[catalin.mari...@arm.com: removed unnecessary masks in __TLBI_VADDR_RANGE]
[catalin.mari
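The arithmetic behind 'scale': one range TLBI covers
(num + 1) << (5 * scale + 1) pages, with num in [0, 31] and scale in
[0, 3]. Sketch of the merged helper macros plus a worked example:
---8<---
#define __TLBI_RANGE_PAGES(num, scale)  (((num) + 1) << (5 * (scale) + 1))
#define MAX_TLBI_RANGE_PAGES            __TLBI_RANGE_PAGES(31, 3)

/*
 * Example: scale 0 covers 2..64 pages in steps of 2; scale 3 covers
 * up to (31 + 1) << 16 = 2M pages (8GB of 4K pages) with a single
 * instruction.
 */
---8<---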
https://lore.kernel.org/linux-arm-kernel/20200708124031.1414-1-yezhen...@huawei.com/
Zhenyu Ye (3):
arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature
arm64: enable tlbi range instructions
arm64: tlb: Use the TLBI RANGE feature in arm64
arch/arm64/Kconfig | 14 +++
arch/arm64
the kvm_tlb_flush_vmid_ipa() out of the loop, and
flush TLBs by range after the other operations have completed. Because
we do not create new mappings for the pages here, this does not violate
the Break-Before-Make rules.
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/kvm_asm.h | 2 ++
arch/arm64/kvm/hyp/tlb.c
On 2020/5/5 18:14, Mark Rutland wrote:
> On Tue, Apr 14, 2020 at 07:28:34PM +0800, Zhenyu Ye wrote:
>> ARMv8.4-TLBI provides TLBI invalidation instruction that apply to a
>> range of input addresses. This patch detect this feature.
>>
>> Signed-off-by: Zhenyu Ye
Hi all,
How are things going with this patch series? Does anyone have any
suggestions?
Thanks,
Zhenyu
On 2020/4/23 21:56, Zhenyu Ye wrote:
> In order to reduce the cost of TLB invalidation, ARMv8.4 provides
> the TTL field in TLBI instruction. The TTL field indicates the
> level of p
Hi Catalin,
Thanks for your review.
On 2020/5/14 23:28, Catalin Marinas wrote:
> Hi Zhenyu,
>
> On Tue, Apr 14, 2020 at 07:28:35PM +0800, Zhenyu Ye wrote:
>> diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
>> index b76df828e6b7..3a1816770bd1 100644
Hi Anshuman,
On 2020/5/18 12:22, Anshuman Khandual wrote:
static const struct arm64_ftr_bits ftr_id_aa64isar0[] = {
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_RNDR_SHIFT, 4, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
Hi Catalin,
On 2020/5/18 20:21, Zhenyu Ye wrote:
> I will test the performance of your suggestion and then reply to you again
> here.
>
I have sent the v4 of this series [1], and compared the performance of
these two different implementations. The test code is in the attachment (dire
Hi Catalin,
On 2020/7/13 20:21, Catalin Marinas wrote:
> On Fri, Jul 10, 2020 at 08:11:19PM +0100, Catalin Marinas wrote:
>> On Fri, 10 Jul 2020 17:44:18 +0800, Zhenyu Ye wrote:
>>> NOTICE: this series are based on the arm64 for-next/tlbi branch:
>>> git://git.kernel.o
Hi Jon,
On 2020/7/13 22:27, Jon Hunter wrote:
> After this change I am seeing the following build errors ...
>
> /tmp/cckzq3FT.s: Assembler messages:
> /tmp/cckzq3FT.s:854: Error: unknown or missing operation name at operand 1 --
> `tlbi rvae1is,x7'
> /tmp/cckzq3FT.s:870: Error: unknown or
On 2020/7/14 18:36, Catalin Marinas wrote:
> On Fri, Jul 10, 2020 at 05:44:20PM +0800, Zhenyu Ye wrote:
>> +#define __TLBI_RANGE_PAGES(num, scale)	(((num) + 1) << (5 * (scale) + 1))
>> +#define MAX_TLBI_RANGE_PAGES	__TLBI_RANGE_PAGES(31, 3)
Hi Catalin,
On 2020/7/8 1:36, Catalin Marinas wrote:
> On Mon, Jun 01, 2020 at 10:47:13PM +0800, Zhenyu Ye wrote:
>> @@ -59,6 +69,47 @@
>> __ta; \
>> })
>>
>> +/*
>> + * __TG defines translation
ARMv8.4-TLBI provides TLBI invalidation instructions that apply to a
range of input addresses. This patch detects this feature.
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/cpucaps.h | 3 ++-
arch/arm64/include/asm/sysreg.h | 3 +++
arch/arm64/kernel/cpufeature.c | 10 ++
3
use 'end - start < threshold number' to decide which way
to go; however, different hardware may have different thresholds, so
I'm not sure whether this is feasible.
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/tlbflush.h | 104 ++
1 file changed, 90 insertions(+), 14 deletions(-)
- remove the __TG macro.
- move the odd range_pages check into loop.
v4:
combine the __flush_tlb_range() and the __directly into the same function
with a single loop for both.
v3:
rebase this series on Linux 5.7-rc1.
v2:
Link: https://lkml.org/lkml/2019/11/11/348
Zhenyu Ye (2):
arm64: tlb: Detect
On 2020/7/9 2:24, Catalin Marinas wrote:
> On Wed, Jul 08, 2020 at 08:40:31PM +0800, Zhenyu Ye wrote:
>> Add __TLBI_VADDR_RANGE macro and rewrite __flush_tlb_range().
>>
>> In this patch, we only use the TLBI RANGE feature if the stride == PAGE_SIZE,
>> because when
Add __TLBI_VADDR_RANGE macro and rewrite __flush_tlb_range().
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/tlbflush.h | 156 --
1 file changed, 126 insertions(+), 30 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
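Sketch of the operand encoding, per the merged version (TG hard-coded
to the 4K granule here for brevity):
---8<---
#define __TLBI_VADDR_RANGE(addr, asid, scale, num, ttl)                 \
        ({                                                              \
                unsigned long __ta = (addr) >> PAGE_SHIFT;              \
                __ta &= GENMASK_ULL(36, 0);             /* BaseADDR */  \
                __ta |= (unsigned long)(ttl) << 37;     /* TTL */       \
                __ta |= (unsigned long)(num) << 39;     /* NUM */       \
                __ta |= (unsigned long)(scale) << 44;   /* SCALE */     \
                __ta |= (unsigned long)1 << 46;         /* TG: 4K */    \
                __ta |= (unsigned long)(asid) << 48;    /* ASID */      \
                __ta;                                                   \
        })
---8<---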
https://lore.kernel.org/linux-arm-kernel/504c7588-97e5-e014-fca0-c5511ae0d...@huawei.com/
--
RFC patches:
- Link:
https://lore.kernel.org/linux-arm-kernel/20200708124031.1414-1-yezhen...@huawei.com/
Zhenyu Ye (2):
arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature
arm64: tlb: Use the TLBI RANGE
On 2020/7/9 17:10, Zhenyu Ye wrote:
> + /*
> + * When the CPU does not support the TLBI RANGE feature, we flush the TLB
> + * entries one by one at the granularity of 'stride'.
> + * When the CPU supports the TLBI RANGE feature, then:
> + * 1. If pages is odd, flush
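The quoted loop ended up with roughly this shape (sketch, close to the
merged code; start/pages/stride/asid/tlb_level come from the enclosing
__flush_tlb_range(), and the user/last-level TLBI variants are
omitted): peel off an odd leading page with a classic TLBI, otherwise
walk 'scale' upwards so each range op consumes the low bits of 'pages':
---8<---
        int num, scale = 0;

        while (pages > 0) {
                if (!cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) ||
                    pages % 2 == 1) {
                        /* No range support, or an odd leading page:
                         * classic per-page invalidation, one stride. */
                        addr = __TLBI_VADDR(start, asid);
                        __tlbi_level(vale1is, addr, tlb_level);
                        start += stride;
                        pages -= stride >> PAGE_SHIFT;
                        continue;
                }

                /* __TLBI_RANGE_NUM() is negative while 'pages' has no
                 * bits at this scale; otherwise one range op flushes
                 * __TLBI_RANGE_PAGES(num, scale) pages at once. */
                num = __TLBI_RANGE_NUM(pages, scale);
                if (num >= 0) {
                        addr = __TLBI_VADDR_RANGE(start, asid, scale,
                                                  num, tlb_level);
                        __tlbi(rvale1is, addr);
                        start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT;
                        pages -= __TLBI_RANGE_PAGES(num, scale);
                }
                scale++;
        }
---8<---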
, the cost of unmap_stage2_range() can be reduced to
16ms, and VM downtime can be less than 1s.
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/kvm_asm.h | 2 ++
arch/arm64/kvm/hyp/tlb.c | 36
arch/arm64/kvm/mmu.c | 11 +++---
3 files
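Schematically the change looks like this; clear_stage2_pte() and
kvm_tlb_flush_vmid_range() below are hypothetical stand-ins for the
real stage-2 walker and the hyp entry point the patch adds:
---8<---
static void unmap_stage2_range_sketch(struct kvm *kvm,
                                      phys_addr_t start, u64 size)
{
        phys_addr_t addr;

        /* Clear stage-2 entries without a TLBI per page... */
        for (addr = start; addr < start + size; addr += PAGE_SIZE)
                clear_stage2_pte(kvm, addr);            /* hypothetical */

        /* ...then one ranged invalidation for the whole window. This is
         * Break-Before-Make-safe because no new mappings were created. */
        kvm_tlb_flush_vmid_range(kvm, start, size);     /* hypothetical */
}
---8<---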
Hi Catalin,
On 2020/7/10 0:48, Catalin Marinas wrote:
> On Thu, Jun 25, 2020 at 04:03:11PM +0800, Zhenyu Ye wrote:
>> @@ -189,8 +195,9 @@ static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
>> unsigned long addr = __TLBI_VADDR(uaddr, ASID(vma->v
Hi Catalin,
On 2020/7/10 1:36, Catalin Marinas wrote:
> On Thu, Jul 09, 2020 at 05:10:54PM +0800, Zhenyu Ye wrote:
>> #define __tlbi_level(op, addr, level) do { \
>>
flush_tlb_page_nosync
Reported-by: Catalin Marinas
Fixes: e735b98a5fe0 ("arm64: Add tlbi_user_level TLB invalidation helper")
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/tlbflush.h | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
)), will incur extra overhead.
So increase 'scale' from 0 to the maximum; the flush order is then
exactly the opposite of the example.
Signed-off-by: Zhenyu Ye
---
arch/arm64/include/asm/tlbflush.h | 138 +++---
1 file changed, 109 insertions(+), 29 deletions(-)
diff --git a/arch/arm64/include
v2:
- remove the __tlbi_last_level() macro.
- add check for parameters in __TLBI_VADDR_RANGE macro.
RFC patches:
- Link:
https://lore.kernel.org/linux-arm-kernel/20200708124031.1414-1-yezhen...@huawei.com/
Zhenyu Ye (2):
arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature
arm64: tlb: Use
Hi Catalin,
On 2020/7/11 2:31, Catalin Marinas wrote:
> On Fri, Jul 10, 2020 at 05:44:20PM +0800, Zhenyu Ye wrote:
>> -if ((end - start) >= (MAX_TLBI_OPS * stride)) {
>> +if ((!cpus_have_const_cap(ARM64_HAS_TLBI_RANGE) &&
>> +(end -