>> +    for (i = 0; i < to_nsmmu(smmu)->num_inst; i++)

>It might make more sense to make this the innermost loop, i.e.:
        for (i = 0; i < nsmmu->num_inst; i++)
                reg &= readl_relaxed(nsmmu_page(smmu, i, page)...
>since polling the instances in parallel rather than in series seems like it 
>might be a bit more efficient.

The sync register is programmed at the same time for both instances, but the
status check is serialized.
I can update it to check the status of both instances at the same time.
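
A rough userspace sketch of what checking all instances in one pass could look like. This is not the driver code: fake_inst, read_status(), poll_all_instances(), GSACTIVE, MAX_INST and SPIN_COUNT are hypothetical stand-ins for the MMIO status read and the spin loop. The sketch ORs the status words on the assumption that the "sync active" bit must be clear on every instance before the sync counts as complete.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_INST    2           /* hypothetical upper bound on instances */
#define GSACTIVE    (1u << 0)   /* assumed "sync still in progress" bit */
#define SPIN_COUNT  10          /* hypothetical spin budget */

/* Stand-in for one SMMU instance's status register. */
struct fake_inst {
	unsigned int busy_reads;    /* reads left before GSACTIVE clears */
};

/* Simulated status read: reports GSACTIVE until busy_reads runs out. */
static uint32_t read_status(struct fake_inst *inst)
{
	if (inst->busy_reads > 0) {
		inst->busy_reads--;
		return GSACTIVE;
	}
	return 0;
}

/*
 * Poll every instance on each spin (instance loop innermost).
 * Returns the number of spins used, or -1 if the budget ran out.
 */
int poll_all_instances(struct fake_inst *inst, unsigned int num_inst)
{
	assert(num_inst <= MAX_INST);

	for (int spin = 1; spin <= SPIN_COUNT; spin++) {
		uint32_t val = 0;

		/* Accumulate status across instances in a single pass. */
		for (unsigned int i = 0; i < num_inst; i++)
			val |= read_status(&inst[i]);

		/* Done only when no instance reports the sync as active. */
		if (!(val & GSACTIVE))
			return spin;
	}
	return -1;
}
```

With the instance loop innermost, the total wait is bounded by the slowest instance rather than by the sum of the per-instance waits.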

>> +    if (smmu->impl->tlb_sync) {
>> +            smmu->impl->tlb_sync(smmu, page, sync, status);

>What I'd hoped is that rather than needing a hook for this, you could just 
>override smmu_domain->tlb_ops from .init_context to wire up the alternate 
>.sync method directly. That would save this extra level of indirection.

With arm_smmu_domain now available in arm-smmu.h, arm-smmu-nvidia.c can
directly update tlb_ops->tlb_sync and avoid the indirection.
Will update in the next version.
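
A minimal standalone sketch of the idea, to show the shape of the change. Every name here (demo_domain, demo_tlb_ops, demo_impl, demo_domain_init and the two sync functions) is a hypothetical stand-in, not the real arm-smmu structures or signatures. The point is only that .init_context swaps the ops table once, so the sync path calls the implementation-specific function directly instead of going through an extra hook.

```c
#include <assert.h>
#include <stddef.h>

struct demo_domain;

/* Stand-in for the per-domain TLB ops table. */
struct demo_tlb_ops {
	void (*tlb_sync)(struct demo_domain *domain);
};

enum { SYNC_NONE, SYNC_GENERIC, SYNC_MULTI_INSTANCE };

/* Stand-in for the domain; last_sync records which sync ran. */
struct demo_domain {
	const struct demo_tlb_ops *tlb_ops;
	int last_sync;
};

/* Stand-in for the implementation hooks. */
struct demo_impl {
	int (*init_context)(struct demo_domain *domain);
};

static void generic_tlb_sync(struct demo_domain *d)
{
	d->last_sync = SYNC_GENERIC;
}

static void multi_inst_tlb_sync(struct demo_domain *d)
{
	d->last_sync = SYNC_MULTI_INSTANCE;
}

static const struct demo_tlb_ops generic_ops = {
	.tlb_sync = generic_tlb_sync,
};

static const struct demo_tlb_ops multi_inst_ops = {
	.tlb_sync = multi_inst_tlb_sync,
};

/* The implementation replaces the ops table at context init time. */
static int multi_inst_init_context(struct demo_domain *d)
{
	d->tlb_ops = &multi_inst_ops;
	return 0;
}

const struct demo_impl multi_inst_impl = {
	.init_context = multi_inst_init_context,
};

/* Core setup: install the defaults, then let the implementation override. */
void demo_domain_init(struct demo_domain *d, const struct demo_impl *impl)
{
	assert(d != NULL);

	d->tlb_ops = &generic_ops;
	d->last_sync = SYNC_NONE;
	if (impl && impl->init_context)
		impl->init_context(d);
}
```

Callers then invoke d->tlb_ops->tlb_sync(d) with no implementation check on the sync path.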

-KR 
