On Thu, Oct 19, 2017 at 11:00:45AM +0800, Leizhen (ThunderTown) wrote:
>
>
> On 2017/10/18 20:58, Will Deacon wrote:
> > Hi Thunder,
> >
> > On Tue, Sep 12, 2017 at 09:00:36PM +0800, Zhen Lei wrote:
> >> Because all TLBI commands should be followed by a SYNC command, to make
> >> sure that it has been completely finished. So we can just add the TLBI
> >> commands into the queue, and put off the execution until meet SYNC or
> >> other commands. To prevent the followed SYNC command waiting for a long
> >> time because of too many commands have been delayed, restrict the max
> >> delayed number.
> >>
> >> According to my test, I got the same performance data as I replaced writel
> >> with writel_relaxed in queue_inc_prod.
> >>
> >> Signed-off-by: Zhen Lei <thunder.leiz...@huawei.com>
> >> ---
> >>  drivers/iommu/arm-smmu-v3.c | 42 +++++++++++++++++++++++++++++++++++++-----
> >>  1 file changed, 37 insertions(+), 5 deletions(-)
> >
> > If we want to go down the route of explicit command batching, I'd much
> > rather do it by implementing the iotlb_range_add callback in the driver,
> > and have a fixed-length array of batched ranges on the domain. We could
> I think even if iotlb_range_add callback is implemented, this patch is still
> valuable. The main purpose of this patch is to reduce dsb operation. So in
> the scenario with iotlb_range_add implemented:
>
> .iotlb_range_add:
>     spin_lock_irqsave(&smmu->cmdq.lock, flags);
>     ...
>     add tlbi range-1 to cmq-queue
>     ...
>     add tlbi range-n to cmq-queue     //n
>     dsb
>     ...
>     spin_unlock_irqrestore(&smmu->cmdq.lock, flags);
>
> .iotlb_sync
>     spin_lock_irqsave(&smmu->cmdq.lock, flags);
>     ...
>     add cmd_sync to cmq-queue
>     dsb
>     ...
>     spin_unlock_irqrestore(&smmu->cmdq.lock, flags);
>
> Although iotlb_range_add can reduce n-1 dsb operations, but there are
> still 1 left. If n is not large enough, this patch is helpful.

Then pick an n that is large enough, based on the compatible string.

Will
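
For illustration, the barrier counts being argued over can be tallied with a toy model. This is only a sketch under stated assumptions: every name below (toy_cmdq, toy_cmdq_publish, toy_deferred, and so on) is hypothetical and does not come from arm-smmu-v3.c or from the patch, and "publishing" the producer index is modelled as costing exactly one dsb. It compares three schemes for n TLBI commands followed by one SYNC: publish after every command, publish once in iotlb_range_add and once in iotlb_sync, and defer the TLBIs until the SYNC with a cap on how many may be delayed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the real driver. */
enum toy_opcode { TOY_CMD_TLBI, TOY_CMD_SYNC };

#define TOY_QUEUE_DEPTH 64

struct toy_cmdq {
	enum toy_opcode slots[TOY_QUEUE_DEPTH];
	size_t prod;        /* commands written into the queue so far */
	size_t published;   /* commands made visible to the imaginary hardware */
	unsigned barriers;  /* stand-in for the number of dsb operations */
};

/* Write one command into the queue without publishing it. */
static void toy_cmdq_add(struct toy_cmdq *q, enum toy_opcode op)
{
	assert(q->prod < TOY_QUEUE_DEPTH);
	q->slots[q->prod++] = op;
}

/* Publish everything queued so far; assumed to cost one barrier. */
static void toy_cmdq_publish(struct toy_cmdq *q)
{
	q->barriers++;
	q->published = q->prod;
}

/* Scheme 1: publish after every TLBI and after the SYNC (n + 1 barriers). */
unsigned toy_unbatched(size_t n)
{
	struct toy_cmdq q = { .prod = 0 };

	for (size_t i = 0; i < n; i++) {
		toy_cmdq_add(&q, TOY_CMD_TLBI);
		toy_cmdq_publish(&q);
	}
	toy_cmdq_add(&q, TOY_CMD_SYNC);
	toy_cmdq_publish(&q);
	return q.barriers;
}

/* Scheme 2: one publish for the batched ranges, one for the SYNC. */
unsigned toy_range_add_then_sync(size_t n)
{
	struct toy_cmdq q = { .prod = 0 };

	for (size_t i = 0; i < n; i++)
		toy_cmdq_add(&q, TOY_CMD_TLBI);
	toy_cmdq_publish(&q);           /* end of the range_add step */

	toy_cmdq_add(&q, TOY_CMD_SYNC);
	toy_cmdq_publish(&q);           /* end of the sync step */
	return q.barriers;
}

/* Scheme 3: defer TLBIs until the SYNC, flushing early at max_delayed. */
unsigned toy_deferred(size_t n, size_t max_delayed)
{
	struct toy_cmdq q = { .prod = 0 };
	size_t pending = 0;

	assert(max_delayed > 0);
	for (size_t i = 0; i < n; i++) {
		toy_cmdq_add(&q, TOY_CMD_TLBI);
		if (++pending >= max_delayed) {
			toy_cmdq_publish(&q);   /* cap reached: flush early */
			pending = 0;
		}
	}
	toy_cmdq_add(&q, TOY_CMD_SYNC);
	toy_cmdq_publish(&q);           /* SYNC flushes whatever is left */
	return q.barriers;
}
```

With n = 4 and a cap of 8, the model gives 5 barriers unbatched, 2 with range_add plus sync, and 1 with deferral, which is the "still 1 left" difference described in the quoted text. The model says nothing about how expensive a dsb actually is; it only shows that the gap between the last two schemes stays fixed at one barrier per sync, so it is amortised as n grows.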