Hi Catalin,

I have sent v4 of this series [1], which combines the two functions into
a single loop.  See the code for details.

[1] https://lore.kernel.org/linux-arm-kernel/[email protected]/

On 2020/5/21 1:08, Catalin Marinas wrote:
>> This optimization is only effective when the range is a multiple of 256KB
>> (when the page size is 4KB), and I'm worried about the performance
>> of ilog2().  I traced the __flush_tlb_range() last year and found that in
>> most cases the range is less than 256K (see details in [1]).
> 
> THP or hugetlbfs would exercise bigger strides but I guess it depends on
> the use-case. ilog2() should be reduced to a few instructions on arm64
> AFAICT (haven't tried but it should use the CLZ instruction).
> 

The point is not that the range must be bigger than 256KB, but that it
must be an integer multiple of 256KB, so I still start from scale 0.
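
To illustrate why starting from scale 0 matters, here is a rough userspace
sketch of a single loop that walks the scales upwards. It is not the code
from the patch: plan_flush(), struct flush_stats and the macro names are
made up for this example, and it only counts operations instead of issuing
TLBI instructions. The one thing taken as given is the architectural
encoding, where one range operation covers (NUM + 1) * 2^(5 * SCALE + 1)
pages, so scale 1 corresponds to multiples of 64 pages (256KB with 4KB
pages).

```c
#include <assert.h>

/* Pages covered by one range operation: (NUM + 1) * 2^(5 * SCALE + 1). */
#define RANGE_PAGES(num, scale) ((unsigned long)((num) + 1) << (5 * (scale) + 1))
/* NUM field for this scale, or -1 if this scale contributes nothing. */
#define RANGE_NUM(pages, scale) ((int)(((pages) >> (5 * (scale) + 1)) & 0x1f) - 1)
/* Largest size a single range operation can describe (NUM = 31, SCALE = 3). */
#define MAX_RANGE_PAGES RANGE_PAGES(31, 3)

/* Hypothetical bookkeeping type, used only by this sketch. */
struct flush_stats {
	unsigned long single_ops;	/* per-page invalidations */
	unsigned long range_ops;	/* range invalidations */
	int flush_all;			/* range too large for this loop */
};

static struct flush_stats plan_flush(unsigned long pages)
{
	struct flush_stats st = { 0, 0, 0 };
	unsigned long total = pages, covered = 0;
	int scale = 0, num;

	/* Assumption: oversized ranges fall back to a full flush. */
	if (pages >= MAX_RANGE_PAGES) {
		st.flush_all = 1;
		return st;
	}

	while (pages > 0) {
		/* An odd page count cannot be encoded, peel off one page. */
		if (pages % 2 == 1) {
			st.single_ops++;
			covered += 1;
			pages -= 1;
			continue;
		}

		/* Consume the 5-bit field that belongs to this scale. */
		num = RANGE_NUM(pages, scale);
		if (num >= 0) {
			st.range_ops++;
			covered += RANGE_PAGES(num, scale);
			pages -= RANGE_PAGES(num, scale);
		}
		scale++;
	}

	/* Every requested page is covered exactly once. */
	assert(covered == total);
	return st;
}
```

With this sketch, plan_flush(64) needs one range operation (at scale 1),
while plan_flush(66) needs two (2 pages at scale 0, then 64 pages at
scale 1), and plan_flush(7) needs one single-page operation plus one range
operation at scale 0.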

Thanks,
Zhenyu
