RE: [RFC PATCH 10/11] vfio/iommu_type1: Optimize dirty bitmap population based on iommu HWDBM
Hi,

Cheers for the discussion. :) I am on vacation, and will come back at 2.18.

Thanks,
Keqian

On 21-02-07 18:40:36, Keqian Zhu wrote:
> Hi Yi,
>
> On 2021/2/7 17:56, Yi Sun wrote:
> > Hi,
> >
> > On 21-01-28 23:17:41, Keqian Zhu wrote:
> >
> > [...]
> >
> >> +static void vfio_dma_dirty_log_start(struct vfio_iommu *iommu,
> >> +				     struct vfio_dma *dma)
> >> +{
> >> +	struct vfio_domain *d;
> >> +
> >> +	list_for_each_entry(d, &iommu->domain_list, next) {
> >> +		/* Go through all domain anyway even if we fail */
> >> +		iommu_split_block(d->domain, dma->iova, dma->size);
> >> +	}
> >> +}
> >
> > This should be a switch to prepare for dirty log start. Per Intel
> > Vtd spec, there is SLADE defined in Scalable-Mode PASID Table Entry.
> > It enables Accessed/Dirty Flags in second-level paging entries.
> > So, a generic iommu interface here is better. For Intel iommu, it
> > enables SLADE. For ARM, it splits block.
> Indeed, a generic interface name is better.
>
> The vendor iommu driver plays vendor's specific actions to start dirty log,
> and Intel iommu and ARM smmu may differ. Besides, we may add more actions in
> ARM smmu driver in future.
>
> One question: Though I am not familiar with Intel iommu, I think it also
> should split block mapping besides enable SLADE. Right?
>
I am not familiar with ARM smmu. :) So I want to clarify if the block in
smmu is a big page, e.g. 2M page? Intel Vtd manages the memory per page,
4KB/2MB/1GB.

There are two ways to manage dirty pages.
1. Keep default granularity. Just set SLADE to enable the dirty track.
2. Split big page to 4KB to get finer granularity.

But the question about the second solution is if it can benefit the user
space, e.g. live migration. If my understanding about smmu block (i.e.
the big page) is correct, have you collected some performance data to
prove that the split can improve performance?

Thanks!

> Thanks,
> Keqian
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC PATCH v2 16/26] KVM: arm64: Prepare Hyp memory protection
On Thursday 04 Feb 2021 at 10:47:08 (+), Quentin Perret wrote:
> On Wednesday 03 Feb 2021 at 14:37:10 (+), Will Deacon wrote:
> > > +static void handle___pkvm_init(struct kvm_cpu_context *host_ctxt)
> > > +{
> > > +	DECLARE_REG(phys_addr_t, phys, host_ctxt, 1);
> > > +	DECLARE_REG(unsigned long, size, host_ctxt, 2);
> > > +	DECLARE_REG(unsigned long, nr_cpus, host_ctxt, 3);
> > > +	DECLARE_REG(unsigned long *, per_cpu_base, host_ctxt, 4);
> > > +
> > > +	cpu_reg(host_ctxt, 1) = __pkvm_init(phys, size, nr_cpus, per_cpu_base);
> >
> > __pkvm_init() doesn't return, so I think this assignment back into host_ctxt
> > is confusing.
>
> Very good point, I'll get rid of this.

Actually not, I think I'll leave it like that. __pkvm_init can return an
error, which is why I did this in the first place. And it is useful for
debugging to have it propagated back to the host.

Thanks,
Quentin
Re: [RFC PATCH 10/11] vfio/iommu_type1: Optimize dirty bitmap population based on iommu HWDBM
On 21-02-09 11:16:08, Robin Murphy wrote:
> On 2021-02-07 09:56, Yi Sun wrote:
> > Hi,
> >
> > On 21-01-28 23:17:41, Keqian Zhu wrote:
> >
> > [...]
> >
> >> +static void vfio_dma_dirty_log_start(struct vfio_iommu *iommu,
> >> +				     struct vfio_dma *dma)
> >> +{
> >> +	struct vfio_domain *d;
> >> +
> >> +	list_for_each_entry(d, &iommu->domain_list, next) {
> >> +		/* Go through all domain anyway even if we fail */
> >> +		iommu_split_block(d->domain, dma->iova, dma->size);
> >> +	}
> >> +}
> >
> > This should be a switch to prepare for dirty log start. Per Intel
> > Vtd spec, there is SLADE defined in Scalable-Mode PASID Table Entry.
> > It enables Accessed/Dirty Flags in second-level paging entries.
> > So, a generic iommu interface here is better. For Intel iommu, it
> > enables SLADE. For ARM, it splits block.
>
> From a quick look, VT-D's SLADE and SMMU's HTTU appear to be the
> exact same thing. This step isn't about enabling or disabling that
> feature itself (the proposal for SMMU is to simply leave HTTU
> enabled all the time), it's about controlling the granularity at
> which the dirty status can be detected/reported at all, since that's
> tied to the pagetable structure.
>
> However, if an IOMMU were to come along with some other way of
> reporting dirty status that didn't depend on the granularity of
> individual mappings, then indeed it wouldn't need this operation.
>
Per my thought, we can use these two start/stop interfaces to make user
space decide when to start/stop the dirty tracking. For Intel SLADE, I
think we can enable this bit when this start interface is called by user
space. I don't think leaving SLADE enabled all the time is necessary for
Intel Vt-d. So I suggest a generic interface here. Thanks!

> Robin.

> >> +
> >> +static void vfio_dma_dirty_log_stop(struct vfio_iommu *iommu,
> >> +				    struct vfio_dma *dma)
> >> +{
> >> +	struct vfio_domain *d;
> >> +
> >> +	list_for_each_entry(d, &iommu->domain_list, next) {
> >> +		/* Go through all domain anyway even if we fail */
> >> +		iommu_merge_page(d->domain, dma->iova, dma->size,
> >> +			 d->prot | dma->prot);
> >> +	}
> >> +}
> >
> > Same as above comment, a generic interface is required here.
> >
> >> +
> >> +static void vfio_iommu_dirty_log_switch(struct vfio_iommu *iommu, bool start)
> >> +{
> >> +	struct rb_node *n;
> >> +
> >> +	/* Split and merge even if all iommu don't support HWDBM now */
> >> +	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
> >> +		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
> >> +
> >> +		if (!dma->iommu_mapped)
> >> +			continue;
> >> +
> >> +		/* Go through all dma range anyway even if we fail */
> >> +		if (start)
> >> +			vfio_dma_dirty_log_start(iommu, dma);
> >> +		else
> >> +			vfio_dma_dirty_log_stop(iommu, dma);
> >> +	}
> >> +}
> >> +
> >>  static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
> >>  					unsigned long arg)
> >>  {
> >> @@ -2812,8 +2900,10 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
> >>  		pgsize = 1 << __ffs(iommu->pgsize_bitmap);
> >>  		if (!iommu->dirty_page_tracking) {
> >>  			ret = vfio_dma_bitmap_alloc_all(iommu, pgsize);
> >> -			if (!ret)
> >> +			if (!ret) {
> >>  				iommu->dirty_page_tracking = true;
> >> +				vfio_iommu_dirty_log_switch(iommu, true);
> >> +			}
> >>  		}
> >>  		mutex_unlock(&iommu->lock);
> >>  		return ret;
> >> @@ -2822,6 +2912,7 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
> >>  		if (iommu->dirty_page_tracking) {
> >>  			iommu->dirty_page_tracking = false;
> >>  			vfio_dma_bitmap_free_all(iommu);
> >> +			vfio_iommu_dirty_log_switch(iommu, false);
> >>  		}
> >>  		mutex_unlock(&iommu->lock);
> >>  		return 0;
> >> --
> >> 2.19.1
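The vendor-neutral start/stop interface being discussed in this thread can be sketched in plain userspace C. Everything below is hypothetical (the `toy_*` names are stand-ins, not the kernel's IOMMU API); the point is only the shape: VFIO calls one generic entry point, and each vendor driver supplies its own callbacks — Intel flipping an SLADE-style enable bit, SMMU splitting blocks on start and merging them back on stop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-domain state; field names are illustrative only. */
struct toy_domain {
	bool slade_enabled;	/* Intel-style A/D-flag enable bit */
	size_t map_granule;	/* SMMU-style mapping granularity (bytes) */
};

/* Per-vendor callbacks hidden behind the generic switch. */
struct toy_dirty_ops {
	void (*start)(struct toy_domain *d);
	void (*stop)(struct toy_domain *d);
};

static void intel_start(struct toy_domain *d) { d->slade_enabled = true; }
static void intel_stop(struct toy_domain *d)  { d->slade_enabled = false; }

static void smmu_start(struct toy_domain *d) { d->map_granule = 4096; }     /* split blocks */
static void smmu_stop(struct toy_domain *d)  { d->map_granule = 2 << 20; }  /* merge back */

static const struct toy_dirty_ops intel_dirty_ops = { intel_start, intel_stop };
static const struct toy_dirty_ops smmu_dirty_ops  = { smmu_start, smmu_stop };

/* The single vendor-neutral entry point VFIO would call. */
static void toy_dirty_log_switch(const struct toy_dirty_ops *ops,
				 struct toy_domain *d, bool start)
{
	if (start)
		ops->start(d);
	else
		ops->stop(d);
}
```

With this shape, `vfio_iommu_dirty_log_switch()` never needs to know which vendor action (bit flip vs. split/merge) sits behind `start`/`stop`.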
Re: [RFC PATCH 10/11] vfio/iommu_type1: Optimize dirty bitmap population based on iommu HWDBM
On 21-02-07 18:40:36, Keqian Zhu wrote:
> Hi Yi,
>
> On 2021/2/7 17:56, Yi Sun wrote:
> > Hi,
> >
> > On 21-01-28 23:17:41, Keqian Zhu wrote:
> >
> > [...]
> >
> >> +static void vfio_dma_dirty_log_start(struct vfio_iommu *iommu,
> >> +				     struct vfio_dma *dma)
> >> +{
> >> +	struct vfio_domain *d;
> >> +
> >> +	list_for_each_entry(d, &iommu->domain_list, next) {
> >> +		/* Go through all domain anyway even if we fail */
> >> +		iommu_split_block(d->domain, dma->iova, dma->size);
> >> +	}
> >> +}
> >
> > This should be a switch to prepare for dirty log start. Per Intel
> > Vtd spec, there is SLADE defined in Scalable-Mode PASID Table Entry.
> > It enables Accessed/Dirty Flags in second-level paging entries.
> > So, a generic iommu interface here is better. For Intel iommu, it
> > enables SLADE. For ARM, it splits block.
> Indeed, a generic interface name is better.
>
> The vendor iommu driver plays vendor's specific actions to start dirty log,
> and Intel iommu and ARM smmu may differ. Besides, we may add more actions in
> ARM smmu driver in future.
>
> One question: Though I am not familiar with Intel iommu, I think it also
> should split block mapping besides enable SLADE. Right?
>
I am not familiar with ARM smmu. :) So I want to clarify if the block in
smmu is a big page, e.g. 2M page? Intel Vtd manages the memory per page,
4KB/2MB/1GB.

There are two ways to manage dirty pages.
1. Keep default granularity. Just set SLADE to enable the dirty track.
2. Split big page to 4KB to get finer granularity.

But the question about the second solution is if it can benefit the user
space, e.g. live migration. If my understanding about smmu block (i.e.
the big page) is correct, have you collected some performance data to
prove that the split can improve performance?

Thanks!

> Thanks,
> Keqian
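Yi Sun's question about whether splitting benefits live migration comes down to dirty amplification: with 2MB block mappings, one DMA write marks the whole 2MB block dirty and forces it to be re-sent, while 4KB granularity re-sends only the touched page. A back-of-envelope sketch (the numbers are hypothetical and assume each write lands in a distinct granule):

```c
#include <assert.h>
#include <stdint.h>

#define KB(x) ((uint64_t)(x) << 10)
#define MB(x) ((uint64_t)(x) << 20)

/*
 * Bytes that must be re-sent in one migration round when `writes`
 * DMA writes each dirty a distinct mapping granule.
 * e.g. 128 writes: 2MB granule -> 256 MiB re-sent,
 *                  4KB granule -> 512 KiB re-sent.
 */
static uint64_t resend_bytes(uint64_t writes, uint64_t granule)
{
	return writes * granule;
}

/*
 * Bits needed to track `size` bytes of IOVA space at a given
 * reporting granule -- the cost side of the trade-off.
 * e.g. 4 GiB: 4KB granule -> 1,048,576 bits, 2MB granule -> 2,048 bits.
 */
static uint64_t bitmap_bits(uint64_t size, uint64_t granule)
{
	return (size + granule - 1) / granule;
}
```

So splitting trades a larger tracking bitmap for far less data re-sent per round, which is the performance question Yi Sun raises; the actual win depends on how scattered the device's writes are.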
Re: [RFC PATCH v2 16/26] KVM: arm64: Prepare Hyp memory protection
On Tue, Feb 09, 2021 at 10:00:29AM +, Quentin Perret wrote:
> On Thursday 04 Feb 2021 at 10:47:08 (+), Quentin Perret wrote:
> > On Wednesday 03 Feb 2021 at 14:37:10 (+), Will Deacon wrote:
> > > > +static void handle___pkvm_init(struct kvm_cpu_context *host_ctxt)
> > > > +{
> > > > +	DECLARE_REG(phys_addr_t, phys, host_ctxt, 1);
> > > > +	DECLARE_REG(unsigned long, size, host_ctxt, 2);
> > > > +	DECLARE_REG(unsigned long, nr_cpus, host_ctxt, 3);
> > > > +	DECLARE_REG(unsigned long *, per_cpu_base, host_ctxt, 4);
> > > > +
> > > > +	cpu_reg(host_ctxt, 1) = __pkvm_init(phys, size, nr_cpus, per_cpu_base);
> > >
> > > __pkvm_init() doesn't return, so I think this assignment back into host_ctxt
> > > is confusing.
> >
> > Very good point, I'll get rid of this.
>
> Actually not, I think I'll leave it like that. __pkvm_init can return an
> error, which is why I did this in the first place. And it is useful for
> debugging to have it propagated back to the host.

Good point, but please add a comment!

Will
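The convention being debated above — writing a return value for a call that only comes back on failure — can be sketched in userspace. All `toy_*` names below are illustrative stand-ins for the hyp call path: arguments arrive in the caller's register context, and the handler writes the error code back into reg 1, which is exactly the value the host gets to see when init fails.

```c
#include <assert.h>

/*
 * Toy stand-in for the hyp call convention: the context holds the
 * caller's GPRs, argument N arrives in regs[N], and the handler
 * writes its return value back into regs[1] for the host to read.
 */
struct toy_cpu_context {
	unsigned long regs[8];
};

/* DECLARE_REG-style argument extraction (hypothetical macro). */
#define TOY_DECLARE_REG(type, name, ctxt, n) \
	type name = (type)(ctxt)->regs[n]

static int toy_pkvm_init(unsigned long phys, unsigned long size)
{
	(void)phys;
	if (size == 0)
		return -22;	/* -EINVAL: the error path returns to the host */
	return 0;		/* the real success path would not return at all */
}

static void toy_handle_pkvm_init(struct toy_cpu_context *host_ctxt)
{
	TOY_DECLARE_REG(unsigned long, phys, host_ctxt, 1);
	TOY_DECLARE_REG(unsigned long, size, host_ctxt, 2);

	/* Propagate the error code back so the host can see it. */
	host_ctxt->regs[1] = (unsigned long)toy_pkvm_init(phys, size);
}
```

This is why the assignment into `host_ctxt` is worth keeping: on the failure path it is the only channel through which the host learns what went wrong.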
Re: [RFC PATCH 10/11] vfio/iommu_type1: Optimize dirty bitmap population based on iommu HWDBM
On 2021-02-09 11:57, Yi Sun wrote:
> On 21-02-07 18:40:36, Keqian Zhu wrote:
> > Hi Yi,
> >
> > On 2021/2/7 17:56, Yi Sun wrote:
> > > Hi,
> > >
> > > On 21-01-28 23:17:41, Keqian Zhu wrote:
> > >
> > > [...]
> > >
> > > > +static void vfio_dma_dirty_log_start(struct vfio_iommu *iommu,
> > > > +				     struct vfio_dma *dma)
> > > > +{
> > > > +	struct vfio_domain *d;
> > > > +
> > > > +	list_for_each_entry(d, &iommu->domain_list, next) {
> > > > +		/* Go through all domain anyway even if we fail */
> > > > +		iommu_split_block(d->domain, dma->iova, dma->size);
> > > > +	}
> > > > +}
> > >
> > > This should be a switch to prepare for dirty log start. Per Intel
> > > Vtd spec, there is SLADE defined in Scalable-Mode PASID Table Entry.
> > > It enables Accessed/Dirty Flags in second-level paging entries.
> > > So, a generic iommu interface here is better. For Intel iommu, it
> > > enables SLADE. For ARM, it splits block.
> > Indeed, a generic interface name is better.
> >
> > The vendor iommu driver plays vendor's specific actions to start dirty log,
> > and Intel iommu and ARM smmu may differ. Besides, we may add more actions in
> > ARM smmu driver in future.
> >
> > One question: Though I am not familiar with Intel iommu, I think it also
> > should split block mapping besides enable SLADE. Right?
> >
> I am not familiar with ARM smmu. :) So I want to clarify if the block in
> smmu is big page, e.g. 2M page? Intel Vtd manages the memory per page,
> 4KB/2MB/1GB.

Indeed, what you call large pages, we call blocks :)

Robin.

> There are two ways to manage dirty pages.
> 1. Keep default granularity. Just set SLADE to enable the dirty track.
> 2. Split big page to 4KB to get finer granularity.
>
> But question about the second solution is if it can benefit the user
> space, e.g. live migration. If my understanding about smmu block (i.e.
> the big page) is correct, have you collected some performance data to
> prove that the split can improve performance?
>
> Thanks!
>
> > Thanks,
> > Keqian
[PATCH v2 2/2] KVM: arm64: Don't access PMSELR_EL0/PMUSERENR_EL0 when no PMU is available
When running under a nesting hypervisor, it isn't guaranteed that the
virtual HW will include a PMU. In which case, let's not try to access
the PMU registers in the world switch, as that'd be deadly.

Reported-by: Andre Przywara
Signed-off-by: Marc Zyngier
---
 arch/arm64/kernel/image-vars.h          | 3 +++
 arch/arm64/kvm/hyp/include/hyp/switch.h | 9 ++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index f676243abac6..32af3c865700 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -102,6 +102,9 @@ KVM_NVHE_ALIAS(__stop___kvm_ex_table);
 /* Array containing bases of nVHE per-CPU memory regions. */
 KVM_NVHE_ALIAS(kvm_arm_hyp_percpu_base);
 
+/* PMU available static key */
+KVM_NVHE_ALIAS(kvm_arm_pmu_available);
+
 #endif /* CONFIG_KVM */
 
 #endif /* __ARM64_KERNEL_IMAGE_VARS_H */
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 84473574c2e7..75c0faa3b791 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -90,15 +90,18 @@ static inline void __activate_traps_common(struct kvm_vcpu *vcpu)
 	 * counter, which could make a PMXEVCNTR_EL0 access UNDEF at
 	 * EL1 instead of being trapped to EL2.
 	 */
-	write_sysreg(0, pmselr_el0);
-	write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
+	if (kvm_arm_support_pmu_v3()) {
+		write_sysreg(0, pmselr_el0);
+		write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0);
+	}
 	write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2);
 }
 
 static inline void __deactivate_traps_common(void)
 {
 	write_sysreg(0, hstr_el2);
-	write_sysreg(0, pmuserenr_el0);
+	if (kvm_arm_support_pmu_v3())
+		write_sysreg(0, pmuserenr_el0);
 }
 
 static inline void ___activate_traps(struct kvm_vcpu *vcpu)
-- 
2.29.2
[PATCH v2 0/2] KVM: arm64: Prevent spurious PMU accesses when no PMU is available
Yet another PMU bug that is only likely to hit under Nested Virt: we
unconditionally access PMU registers without checking whether it
actually is present.

Given that we already have a predicate for this, promote it to a
static key, and use that in the world switch.

Thanks to Andre for the heads up!

* From v1:
  - Fix compilation when CONFIG_ARM_PMU isn't selected

Marc Zyngier (2):
  KVM: arm64: Turn kvm_arm_support_pmu_v3() into a static key
  KVM: arm64: Don't access PMSELR_EL0/PMUSERENR_EL0 when no PMU is
    available

 arch/arm64/kernel/image-vars.h          |  3 +++
 arch/arm64/kvm/hyp/include/hyp/switch.h |  9 ++---
 arch/arm64/kvm/perf.c                   | 10 ++
 arch/arm64/kvm/pmu-emul.c               | 10 --
 include/kvm/arm_pmu.h                   |  9 +++--
 5 files changed, 26 insertions(+), 15 deletions(-)

-- 
2.29.2
[PATCH v2 1/2] KVM: arm64: Turn kvm_arm_support_pmu_v3() into a static key
We currently find out about the presence of a HW PMU (or the handling
of that PMU by perf, which amounts to the same thing) in a fairly
roundabout way, by checking the number of counters available to perf.

That's good enough for now, but we will soon need to find out about
that on paths where perf is out of reach (in the world switch).

Instead, let's turn kvm_arm_support_pmu_v3() into a static key.

Signed-off-by: Marc Zyngier
---
 arch/arm64/kvm/perf.c     | 10 ++
 arch/arm64/kvm/pmu-emul.c | 10 --
 include/kvm/arm_pmu.h     |  9 +++--
 3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kvm/perf.c b/arch/arm64/kvm/perf.c
index d45b8b9a4415..739164324afe 100644
--- a/arch/arm64/kvm/perf.c
+++ b/arch/arm64/kvm/perf.c
@@ -11,6 +11,8 @@
 #include
 
+DEFINE_STATIC_KEY_FALSE(kvm_arm_pmu_available);
+
 static int kvm_is_in_guest(void)
 {
 	return kvm_get_running_vcpu() != NULL;
@@ -48,6 +50,14 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = {
 
 int kvm_perf_init(void)
 {
+	/*
+	 * Check if HW_PERF_EVENTS are supported by checking the number of
+	 * hardware performance counters. This could ensure the presence of
+	 * a physical PMU and CONFIG_PERF_EVENT is selected.
+	 */
+	if (IS_ENABLED(CONFIG_ARM_PMU) && perf_num_counters() > 0)
+		static_branch_enable(&kvm_arm_pmu_available);
+
 	return perf_register_guest_info_callbacks(&kvm_guest_cbs);
 }
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 4ad66a532e38..44d500706ab9 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -813,16 +813,6 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1)
 	return val & mask;
 }
 
-bool kvm_arm_support_pmu_v3(void)
-{
-	/*
-	 * Check if HW_PERF_EVENTS are supported by checking the number of
-	 * hardware performance counters. This could ensure the presence of
-	 * a physical PMU and CONFIG_PERF_EVENT is selected.
-	 */
-	return (perf_num_counters() > 0);
-}
-
 int kvm_arm_pmu_v3_enable(struct kvm_vcpu *vcpu)
 {
 	if (!kvm_vcpu_has_pmu(vcpu))
diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
index 8dcb3e1477bc..6fd3cda608e4 100644
--- a/include/kvm/arm_pmu.h
+++ b/include/kvm/arm_pmu.h
@@ -13,6 +13,13 @@
 #define ARMV8_PMU_CYCLE_IDX		(ARMV8_PMU_MAX_COUNTERS - 1)
 #define ARMV8_PMU_MAX_COUNTER_PAIRS	((ARMV8_PMU_MAX_COUNTERS + 1) >> 1)
 
+DECLARE_STATIC_KEY_FALSE(kvm_arm_pmu_available);
+
+static __always_inline bool kvm_arm_support_pmu_v3(void)
+{
+	return static_branch_likely(&kvm_arm_pmu_available);
+}
+
 #ifdef CONFIG_HW_PERF_EVENTS
 
 struct kvm_pmc {
@@ -47,7 +54,6 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val);
 void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val);
 void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u64 data,
 				    u64 select_idx);
-bool kvm_arm_support_pmu_v3(void);
 int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu,
 			    struct kvm_device_attr *attr);
 int kvm_arm_pmu_v3_get_attr(struct kvm_vcpu *vcpu,
@@ -87,7 +93,6 @@ static inline void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u64 val) {}
 static inline void kvm_pmu_handle_pmcr(struct kvm_vcpu *vcpu, u64 val) {}
 static inline void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu,
 						  u64 data, u64 select_idx) {}
-static inline bool kvm_arm_support_pmu_v3(void) { return false; }
 static inline int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu,
 					  struct kvm_device_attr *attr)
 {
-- 
2.29.2
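The static key in this patch replaces a per-call `perf_num_counters()` check with a branch whose direction is fixed once at init time. A userspace sketch of just the semantics (the real `DEFINE_STATIC_KEY_FALSE`/`static_branch_likely` machinery patches the instruction stream at runtime; the `toy_*` names below are stand-ins that only model the default-false, flipped-once behaviour):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace stand-in for DEFINE_STATIC_KEY_FALSE(): the kernel
 * primitive patches the branch site; this sketch only models the
 * semantics (default false, enabled at most once during init).
 */
static bool toy_pmu_available;

static void toy_static_branch_enable(bool *key)
{
	*key = true;
}

/* Mirrors kvm_arm_support_pmu_v3() from the patch. */
static bool toy_support_pmu_v3(void)
{
	return toy_pmu_available;
}

/*
 * Mirrors the kvm_perf_init() logic: enable the key only when the
 * perf layer reports at least one hardware counter.
 */
static void toy_perf_init(int perf_num_counters)
{
	if (perf_num_counters > 0)
		toy_static_branch_enable(&toy_pmu_available);
}
```

The payoff in the kernel is that hot paths such as `__activate_traps_common()` pay no load or compare at all once the key is patched — which is why the predicate had to stop calling into perf.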
Re: [RFC PATCH 10/11] vfio/iommu_type1: Optimize dirty bitmap population based on iommu HWDBM
On 2021-02-07 09:56, Yi Sun wrote:
> Hi,
>
> On 21-01-28 23:17:41, Keqian Zhu wrote:
>
> [...]
>
>> +static void vfio_dma_dirty_log_start(struct vfio_iommu *iommu,
>> +				     struct vfio_dma *dma)
>> +{
>> +	struct vfio_domain *d;
>> +
>> +	list_for_each_entry(d, &iommu->domain_list, next) {
>> +		/* Go through all domain anyway even if we fail */
>> +		iommu_split_block(d->domain, dma->iova, dma->size);
>> +	}
>> +}
>
> This should be a switch to prepare for dirty log start. Per Intel
> Vtd spec, there is SLADE defined in Scalable-Mode PASID Table Entry.
> It enables Accessed/Dirty Flags in second-level paging entries.
> So, a generic iommu interface here is better. For Intel iommu, it
> enables SLADE. For ARM, it splits block.

From a quick look, VT-D's SLADE and SMMU's HTTU appear to be the
exact same thing. This step isn't about enabling or disabling that
feature itself (the proposal for SMMU is to simply leave HTTU
enabled all the time), it's about controlling the granularity at
which the dirty status can be detected/reported at all, since that's
tied to the pagetable structure.

However, if an IOMMU were to come along with some other way of
reporting dirty status that didn't depend on the granularity of
individual mappings, then indeed it wouldn't need this operation.

Robin.

>> +
>> +static void vfio_dma_dirty_log_stop(struct vfio_iommu *iommu,
>> +				    struct vfio_dma *dma)
>> +{
>> +	struct vfio_domain *d;
>> +
>> +	list_for_each_entry(d, &iommu->domain_list, next) {
>> +		/* Go through all domain anyway even if we fail */
>> +		iommu_merge_page(d->domain, dma->iova, dma->size,
>> +			 d->prot | dma->prot);
>> +	}
>> +}
>
> Same as above comment, a generic interface is required here.
>
>> +
>> +static void vfio_iommu_dirty_log_switch(struct vfio_iommu *iommu, bool start)
>> +{
>> +	struct rb_node *n;
>> +
>> +	/* Split and merge even if all iommu don't support HWDBM now */
>> +	for (n = rb_first(&iommu->dma_list); n; n = rb_next(n)) {
>> +		struct vfio_dma *dma = rb_entry(n, struct vfio_dma, node);
>> +
>> +		if (!dma->iommu_mapped)
>> +			continue;
>> +
>> +		/* Go through all dma range anyway even if we fail */
>> +		if (start)
>> +			vfio_dma_dirty_log_start(iommu, dma);
>> +		else
>> +			vfio_dma_dirty_log_stop(iommu, dma);
>> +	}
>> +}
>> +
>>  static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
>>  					unsigned long arg)
>>  {
>> @@ -2812,8 +2900,10 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
>>  		pgsize = 1 << __ffs(iommu->pgsize_bitmap);
>>  		if (!iommu->dirty_page_tracking) {
>>  			ret = vfio_dma_bitmap_alloc_all(iommu, pgsize);
>> -			if (!ret)
>> +			if (!ret) {
>>  				iommu->dirty_page_tracking = true;
>> +				vfio_iommu_dirty_log_switch(iommu, true);
>> +			}
>>  		}
>>  		mutex_unlock(&iommu->lock);
>>  		return ret;
>> @@ -2822,6 +2912,7 @@ static int vfio_iommu_type1_dirty_pages(struct vfio_iommu *iommu,
>>  		if (iommu->dirty_page_tracking) {
>>  			iommu->dirty_page_tracking = false;
>>  			vfio_dma_bitmap_free_all(iommu);
>> +			vfio_iommu_dirty_log_switch(iommu, false);
>>  		}
>>  		mutex_unlock(&iommu->lock);
>>  		return 0;
>> --
>> 2.19.1