On Sat, 14 Apr 2018 21:29:26 +0200
David Woodhouse <dw...@infradead.org> wrote:

> On Fri, 2018-04-13 at 19:26 +0200, Christoph Hellwig wrote:
> > On Fri, Apr 13, 2018 at 10:12:41AM -0700, Tushar Dave wrote:  
> > > I guess there is nothing we need to do!
> > >
> > > On x86, when there is no Intel IOMMU or the IOMMU is disabled, you
> > > end up in swiotlb for DMA API calls when the system has more than
> > > 4G of memory.
> > > However, AFAICT, for 64-bit DMA-capable devices the swiotlb DMA APIs
> > > do not use bounce buffers unless you have swiotlb=force specified on
> > > the kernel command line.
> > 
> > Sure.  But that means every sync_*_to_device and sync_*_to_cpu now
> > involves an indirect call that does exactly nothing, which in the
> > workload Jesper is looking at causes a huge performance degradation
> > due to retpolines.

Yes, exactly.
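
To make that concrete: for a buffer that was never bounced, the swiotlb
sync path is a pure no-op on x86.  Roughly like this (a simplified
paraphrase of lib/swiotlb.c around v4.16; details may differ):

  static void
  swiotlb_sync_single(struct device *dev, dma_addr_t dev_addr,
                      size_t size, enum dma_data_direction dir,
                      enum dma_sync_target target)
  {
          phys_addr_t paddr = dma_to_phys(dev, dev_addr);

          if (is_swiotlb_buffer(paddr)) {
                  /* Only bounced buffers need real work. */
                  swiotlb_tbl_sync_single(dev, paddr, size, dir, target);
                  return;
          }

          /* dma_mark_clean() is a no-op on x86, so a 64-bit capable
           * device goes through a retpoline just to fall straight
           * back out here. */
          if (dir == DMA_FROM_DEVICE)
                  dma_mark_clean(phys_to_virt(paddr), size);
  }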

> 
> We should look at using the
> 
>  if (dma_ops == swiotlb_dma_ops)
>     swiotlb_map_page()
>  else
>     dma_ops->map_page()
> 
> trick for this. Perhaps with alternatives so that when an Intel or AMD
> IOMMU is detected, it's *that* which is checked for as the special
> case.

Yes, this trick is basically what I'm asking for :-)

It did sound like Hellwig first wanted to fix x86 so it doesn't end up
defaulting to swiotlb.  Then we just have to do the same trick with the
new default fall-through dma_ops; something like the sketch below.
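
As a rough sketch of the direct-call bypass (the wrapper name
dma_map_page_direct() is made up here; get_dma_ops(), swiotlb_map_page()
and the swiotlb_dma_ops instance are meant to match the v4.16-era API,
where swiotlb_dma_ops is a struct instance, hence the '&'):

  /* Sketch only: skip the retpolined indirect call when the ops are
   * known to be swiotlb; real IOMMU implementations still pay for
   * the indirection. */
  static inline dma_addr_t dma_map_page_direct(struct device *dev,
                  struct page *page, unsigned long offset, size_t size,
                  enum dma_data_direction dir, unsigned long attrs)
  {
          const struct dma_map_ops *ops = get_dma_ops(dev);

          if (ops == &swiotlb_dma_ops)
                  return swiotlb_map_page(dev, page, offset, size,
                                          dir, attrs);

          return ops->map_page(dev, page, offset, size, dir, attrs);
  }

The same pattern would apply to the sync_* entry points, which is where
the no-op indirect calls hurt the most.  The alternatives idea would go
one step further and patch which ops get the direct call at boot,
depending on whether an Intel or AMD IOMMU was detected.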

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
