Hi,

Our team discussed this issue again and consulted our GIC hardware 
design team. They think the RD can afford busy-waiting, so we still think 
a delay of 0 may be better, at least for our hardware.

In addition, if the delay is not 0: as I said before, in our measurements 
parsing the VPT takes only a few hundred nanoseconds, or 1~2 microseconds, 
in most cases. So a delay of 1 microsecond, or smaller, seems more 
appropriate. In any case, 10 microseconds is too much.

That said, this does depend on the hardware implementation.
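
To illustrate the arithmetic behind this, here is a rough sketch. It is a 
simplified model and not the kernel code: poll_wait_ns() and its parameters 
are hypothetical names, and it assumes the register read itself costs 
nothing and that the Dirty bit clears exactly parse_ns after polling starts.

```c
#include <stdint.h>

/*
 * Hypothetical model, NOT the kernel implementation.
 *
 * The Dirty bit is assumed to clear parse_ns nanoseconds after polling
 * starts. The poller checks the bit and, if it is still set, waits
 * delay_ns before checking again. delay_ns == 0 models busy-waiting
 * with an idealized zero-cost register read.
 *
 * Returns the time (in ns) at which the poller observes the bit clear.
 */
uint64_t poll_wait_ns(uint64_t parse_ns, uint64_t delay_ns)
{
    uint64_t now = 0;

    if (delay_ns == 0)
        return parse_ns;    /* busy-wait: seen as soon as it clears */

    while (now < parse_ns)  /* bit still set at this check */
        now += delay_ns;    /* wait one delay period, then re-check */

    return now;
}
```

Under these assumptions, a 2 microsecond parse is observed after 10 
microseconds with a 10 microsecond delay (poll_wait_ns(2000, 10000)), but 
after 2 microseconds with a 1 microsecond delay (poll_wait_ns(2000, 1000)).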

Besides, I'm not sure where the start and end points are for the total 
scheduling latency of a vcpu that you mentioned, since it includes many 
events. Isn't the parse time of the VPT itself a clear enough metric?

-----Original Message-----
From: Marc Zyngier [mailto:m...@kernel.org] 
Sent: 2020-09-15 22:48
To: lushenming <lushenm...@huawei.com>
Cc: Thomas Gleixner <t...@linutronix.de>; Jason Cooper <ja...@lakedaemon.net>; 
linux-kernel@vger.kernel.org; Wanghaibin (D) <wanghaibin.w...@huawei.com>; 
yuzenghui <yuzeng...@huawei.com>
Subject: Re: [PATCH] irqchip/gic-v4.1: Optimize the delay time of the poll on 
the GICR_VPENDBASER.Dirty bit

On 2020-09-15 15:04, lushenming wrote:
> Thanks for your quick response.
> 
> Okay, I agree that busy-waiting may add more overhead at the RD level.
> But I think that the delay time can be adjusted. In our latest 
> hardware implementation, we optimized the search of the VPT, so now 
> even a VPT full of interrupts (56k) can be parsed within 2 microseconds.

It's not so much when the VPT is full that it is bad. It is when the pending 
interrupts are not cached, and you don't know *where* to look for them in 
the VPT.

> It is true that parse speeds differ between hardware implementations, 
> but doesn't waiting a fixed 10 microseconds completely mask the 
> optimization on fast hardware? Maybe we can set the delay time 
> smaller, like 1 microsecond?

That certainly would be more acceptable. But I still question the impact of 
such a change compared to the cost of a vcpu entry. I suggest you come up with 
measurements that actually show that polling this register more often 
significantly reduces the entry latency. Only then can we make an educated 
decision.

Thanks,

         M.
--
Jazz is not dead. It just smells funny...
