>-----Original Message----- >From: Liu, Yi L <yi.l....@intel.com> >Subject: Re: [PATCH v1 02/15] intel_iommu: Optimize context entry cache >utilization > >On 2025/6/11 18:06, Duan, Zhenzhong wrote: >> >> >>> -----Original Message----- >>> From: Liu, Yi L <yi.l....@intel.com> >>> Subject: Re: [PATCH v1 02/15] intel_iommu: Optimize context entry cache >>> utilization >>> >>> On 2025/6/6 18:04, Zhenzhong Duan wrote: >>>> There are many call sites referencing context entry by calling >>>> vtd_dev_to_context_entry() which will traverse the DMAR table. >>>> >>>> In most cases we can use cached context entry in vtd_as- >>context_cache_entry >>>> except when its entry is stale. Currently only global and domain context >>>> invalidation stale it. >>>> >>>> So introduce a helper function vtd_as_to_context_entry() to fetch from >cache >>>> before trying with vtd_dev_to_context_entry(). >>> >>> The cached context entry is now protected by vtd_iommu_lock(). While not >>> all caller of vtd_dev_to_context_entry() are under this lock. >>> >>> Also, the cached context entry is created in the translate path. IMHO, >>> this path is not supposed to be triggered for passthrough devices. >>> While this may need double check and may change in the future. But let's >>> see if any locking issue with the current code. >> >> Good finding, yes. >> Previously I thought translation path updates cc_entry->context_entry after >cc_entry->context_cache_gen. >> In vtd_as_to_context_entry() cc_entry->context_cache_gen is checked first, so >there was no real race. >> But I still missed a memory barrier like below: > >yeah, testing context_cache_gen is necessary. But without lock, this >cannot guarantee the cc_entry is valid after the test. > >> @@ -2277,6 +2286,7 @@ static bool >vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus, >> cc_entry->context_cache_gen, >> s->context_cache_gen); >> cc_entry->context_entry = ce; >> + smp_wmb(); >> cc_entry->context_cache_gen = s->context_cache_gen; >> } >> >> Another option I can think of is adding lock to cache reading like below: > >this is in-enough as well since the cc_entry->context_entry can be modified >after lock is released. > >> @@ -1659,11 +1659,15 @@ static int >vtd_as_to_context_entry(VTDAddressSpace *vtd_as, VTDContextEntry *ce) >> uint8_t devfn = vtd_as->devfn; >> VTDContextCacheEntry *cc_entry = &vtd_as->context_cache_entry; >> >> + vtd_iommu_lock(s); >> + >> /* Try to fetch context-entry from cache first */ >> if (cc_entry->context_cache_gen == s->context_cache_gen) { >> *ce = cc_entry->context_entry; >> + vtd_iommu_unlock(s); >> return 0; >> } else { >> + vtd_iommu_unlock(s); >> return vtd_dev_to_context_entry(s, bus_num, devfn, ce); >> } >> } >> >> Which one do you prefer? > >If it's just optimization, perhaps just drop it. :)
Fine for me, will do. Thanks Zhenzhong