>-----Original Message-----
>From: Liu, Yi L <yi.l....@intel.com>
>Subject: Re: [PATCH v1 02/15] intel_iommu: Optimize context entry cache
>utilization
>
>On 2025/6/11 18:06, Duan, Zhenzhong wrote:
>>
>>
>>> -----Original Message-----
>>> From: Liu, Yi L <yi.l....@intel.com>
>>> Subject: Re: [PATCH v1 02/15] intel_iommu: Optimize context entry cache
>>> utilization
>>>
>>> On 2025/6/6 18:04, Zhenzhong Duan wrote:
>>>> There are many call sites referencing context entry by calling
>>>> vtd_dev_to_context_entry() which will traverse the DMAR table.
>>>>
>>>> In most cases we can use the cached context entry in
>>>> vtd_as->context_cache_entry, except when the entry is stale. Currently
>>>> only global and domain context invalidation make it stale.
>>>>
>>>> So introduce a helper function vtd_as_to_context_entry() to fetch from
>>>> the cache before trying vtd_dev_to_context_entry().
>>>
>>> The cached context entry is now protected by vtd_iommu_lock(), while not
>>> all callers of vtd_dev_to_context_entry() hold this lock.
>>>
>>> Also, the cached context entry is created in the translate path. IMHO,
>>> this path is not supposed to be triggered for passthrough devices.
>>> While this may need a double check and may change in the future, let's
>>> see if there is any locking issue with the current code.
>>
>> Good finding, yes.
>> Previously I thought the translation path updates cc_entry->context_entry
>> before cc_entry->context_cache_gen.
>> In vtd_as_to_context_entry() cc_entry->context_cache_gen is checked first,
>> so there was no real race.
>> But I still missed a memory barrier like below:
>
>Yeah, testing context_cache_gen is necessary. But without the lock, this
>cannot guarantee that cc_entry is still valid after the test.
>
>> @@ -2277,6 +2286,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
>>                                     cc_entry->context_cache_gen,
>>                                     s->context_cache_gen);
>>           cc_entry->context_entry = ce;
>> +        smp_wmb();
>>           cc_entry->context_cache_gen = s->context_cache_gen;
>>       }
>>
>> Another option I can think of is adding a lock to the cache read path,
>> like below:
>
>This is insufficient as well, since cc_entry->context_entry can be modified
>after the lock is released.
>
>> @@ -1659,11 +1659,15 @@ static int vtd_as_to_context_entry(VTDAddressSpace *vtd_as, VTDContextEntry *ce)
>>       uint8_t devfn = vtd_as->devfn;
>>       VTDContextCacheEntry *cc_entry = &vtd_as->context_cache_entry;
>>
>> +    vtd_iommu_lock(s);
>> +
>>       /* Try to fetch context-entry from cache first */
>>       if (cc_entry->context_cache_gen == s->context_cache_gen) {
>>           *ce = cc_entry->context_entry;
>> +        vtd_iommu_unlock(s);
>>           return 0;
>>       } else {
>> +        vtd_iommu_unlock(s);
>>           return vtd_dev_to_context_entry(s, bus_num, devfn, ce);
>>       }
>>   }
>>
>> Which one do you prefer?
>
>If it's just an optimization, perhaps just drop it. :)

Fine for me, will do.

Thanks
Zhenzhong
