Re: [PATCH 3/4] dma-mapping: add a dma_ops_bypass flag to struct device

2020-04-19 Thread Alexey Kardashevskiy



On 19/04/2020 22:25, Joerg Roedel wrote:
> On Sun, Apr 19, 2020 at 10:00:58AM +0200, Christoph Hellwig wrote:
>> The difference is that NULL ops mean the direct mapping is always
>> used, dma_ops_bypass means a direct mapping is used if no bounce buffering
>> using swiotlb is needed, which should also answer your first question.
>> The idea is to consolidate code in the core to use an opportunistic
>> direct mapping instead of the dynamic iommu mapping.  I thought the cover
>> letter and commit log explained this well enough, but maybe I need to
>> do a better job.
> 
> Ah right, now I see it: when dma_ops_bypass is set, it will only use the
> direct mapping when the available memory fits into the device's
> dma_masks, and calls into dma_ops otherwise.
> 
> I wonder how that will interact with an IOMMU driver, which has to make
> sure that the direct mapping is accessible for the device at all.  It
> can either put the device into a passthrough domain for direct mapping
> or into a re-mapped domain, but then all DMA-API calls need to use dma-ops.
> When the dma_mask covers available memory but coherent_mask doesn't,
> the streaming calls will use dma-direct and alloc_coherent() calls into
> dma-ops. There is no way for the IOMMU driver to ensure both work.
> 
> So what are the conditions under which an IOMMU driver would set
> dma_ops_bypass to 1 and get a different result than when setting
> dev->dma_ops to NULL?


One example is powerpc64/pseries (arch/powerpc/kernel/dma-iommu.c), where
dma_iommu_ops::dma_iommu_dma_supported() decides whether ops are needed
and whether to set dma_ops_bypass to 1: it tries to create a DMA window
with a 1:1 mapping covering the maximum possible RAM address, and if that
works, the ops are not needed.
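The pseries-style decision can be sketched as a small userspace model in plain C. Everything below (the cut-down struct device, create_direct_window(), the window limit) is an illustrative stand-in, not the real arch/powerpc code; only the shape of the decision mirrors what is described above:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct device; only the fields relevant here. */
struct device {
	uint64_t dma_mask;
	bool dma_ops_bypass;
};

/*
 * Pretend to create a 1:1 ("direct") DMA window.  In this model it
 * succeeds only when the window can reach max_ram_addr;
 * hw_window_limit stands in for the hypervisor/hardware constraint.
 */
static bool create_direct_window(uint64_t max_ram_addr,
				 uint64_t hw_window_limit)
{
	return max_ram_addr <= hw_window_limit;
}

/*
 * Sketch of a dma_supported()-style hook: if the device's mask reaches
 * the highest RAM address and a 1:1 window covering it can be created,
 * the ops can be bypassed and dma-direct used instead.
 */
static int iommu_dma_supported(struct device *dev, uint64_t mask,
			       uint64_t max_ram_addr,
			       uint64_t hw_window_limit)
{
	dev->dma_ops_bypass = mask >= max_ram_addr &&
		create_direct_window(max_ram_addr, hw_window_limit);
	return 1;
}
```

With, say, RAM topping out below 2^40 and a 64-bit capable device the window fits and dma_ops_bypass ends up true; a 32-bit mask leaves it false and all calls keep going through the ops.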


-- 
Alexey


Re: [PATCH 3/4] dma-mapping: add a dma_ops_bypass flag to struct device

2020-04-19 Thread Joerg Roedel
On Sun, Apr 19, 2020 at 10:00:58AM +0200, Christoph Hellwig wrote:
> The difference is that NULL ops mean the direct mapping is always
> used, dma_ops_bypass means a direct mapping is used if no bounce buffering
> using swiotlb is needed, which should also answer your first question.
> The idea is to consolidate code in the core to use an opportunistic
> direct mapping instead of the dynamic iommu mapping.  I thought the cover
> letter and commit log explained this well enough, but maybe I need to
> do a better job.

Ah right, now I see it: when dma_ops_bypass is set, it will only use the
direct mapping when the available memory fits into the device's
dma_masks, and calls into dma_ops otherwise.

I wonder how that will interact with an IOMMU driver, which has to make
sure that the direct mapping is accessible for the device at all.  It
can either put the device into a passthrough domain for direct mapping
or into a re-mapped domain, but then all DMA-API calls need to use dma-ops.
When the dma_mask covers available memory but coherent_mask doesn't,
the streaming calls will use dma-direct and alloc_coherent() calls into
dma-ops. There is no way for the IOMMU driver to ensure both work.
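The mask divergence described here can be reproduced with a small userspace model of the two checks in the patch. The struct and helper names below are illustrative (this is not kernel code); only min_not_zero() and the two comparisons mirror the actual patch logic:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model of the struct device fields involved. */
struct dev_model {
	uint64_t dma_mask;		/* streaming mask */
	uint64_t coherent_dma_mask;	/* coherent allocation mask */
	uint64_t bus_dma_limit;		/* 0 means no extra bus limit */
	bool dma_ops_bypass;
	bool has_ops;			/* models ops != NULL */
};

static uint64_t min_not_zero(uint64_t a, uint64_t b)
{
	if (!a)
		return b;
	if (!b)
		return a;
	return a < b ? a : b;
}

/* Mirrors dma_map_direct(): decision for the streaming path. */
static bool map_direct(const struct dev_model *d, uint64_t required_mask)
{
	if (!d->has_ops)
		return true;
	if (!d->dma_ops_bypass)
		return false;
	return min_not_zero(d->dma_mask, d->bus_dma_limit) >= required_mask;
}

/* Mirrors dma_alloc_direct(): decision for coherent allocations. */
static bool alloc_direct(const struct dev_model *d, uint64_t required_mask)
{
	if (!d->has_ops)
		return true;
	if (!d->dma_ops_bypass)
		return false;
	return min_not_zero(d->coherent_dma_mask, d->bus_dma_limit) >=
		required_mask;
}
```

A device with a 64-bit dma_mask but a 32-bit coherent_dma_mask on a machine whose RAM needs a 40-bit mask gets map_direct() true and alloc_direct() false: exactly the streaming-via-dma-direct, coherent-via-dma-ops split described above.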

So what are the conditions under which an IOMMU driver would set
dma_ops_bypass to 1 and get a different result than when setting
dev->dma_ops to NULL?

Regards,

Joerg


Re: [PATCH 3/4] dma-mapping: add a dma_ops_bypass flag to struct device

2020-04-19 Thread Christoph Hellwig
On Sat, Apr 18, 2020 at 02:42:05PM +0200, Joerg Roedel wrote:
> Hi Christoph,
> 
> On Tue, Apr 14, 2020 at 02:25:05PM +0200, Christoph Hellwig wrote:
> > +static inline bool dma_map_direct(struct device *dev,
> > +   const struct dma_map_ops *ops)
> > +{
> > +   if (likely(!ops))
> > +   return true;
> > +   if (!dev->dma_ops_bypass)
> > +   return false;
> > +
> > +   return min_not_zero(*dev->dma_mask, dev->bus_dma_limit) >=
> > +   dma_direct_get_required_mask(dev);
> 
> Why is the dma-mask check done here? The dma-direct code handles memory
> outside of the device's dma-mask with swiotlb, no?
> 
> I also don't quite get what the difference between setting the
> dma_ops_bypass flag non-zero and setting ops to NULL is.

The difference is that NULL ops mean the direct mapping is always
used, dma_ops_bypass means a direct mapping is used if no bounce buffering
using swiotlb is needed, which should also answer your first question.
The idea is to consolidate code in the core to use an opportunistic
direct mapping instead of the dynamic iommu mapping.  I thought the cover
letter and commit log explained this well enough, but maybe I need to
do a better job.


Re: [PATCH 3/4] dma-mapping: add a dma_ops_bypass flag to struct device

2020-04-18 Thread Joerg Roedel
Hi Christoph,

On Tue, Apr 14, 2020 at 02:25:05PM +0200, Christoph Hellwig wrote:
> +static inline bool dma_map_direct(struct device *dev,
> + const struct dma_map_ops *ops)
> +{
> + if (likely(!ops))
> + return true;
> + if (!dev->dma_ops_bypass)
> + return false;
> +
> + return min_not_zero(*dev->dma_mask, dev->bus_dma_limit) >=
> + dma_direct_get_required_mask(dev);

Why is the dma-mask check done here? The dma-direct code handles memory
outside of the device's dma-mask with swiotlb, no?

I also don't quite get what the difference between setting the
dma_ops_bypass flag non-zero and setting ops to NULL is.


Joerg




Re: [PATCH 3/4] dma-mapping: add a dma_ops_bypass flag to struct device

2020-04-14 Thread Greg Kroah-Hartman
On Tue, Apr 14, 2020 at 02:25:05PM +0200, Christoph Hellwig wrote:
> Several IOMMU drivers have a bypass mode where they can use a direct
> mapping if the device's DMA mask is large enough.  Add generic support
> for this to the core dma-mapping code so that those drivers can switch
> to a common solution.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  include/linux/device.h |  6 
>  kernel/dma/mapping.c   | 70 +-
>  2 files changed, 55 insertions(+), 21 deletions(-)

Reviewed-by: Greg Kroah-Hartman 


[PATCH 3/4] dma-mapping: add a dma_ops_bypass flag to struct device

2020-04-14 Thread Christoph Hellwig
Several IOMMU drivers have a bypass mode where they can use a direct
mapping if the device's DMA mask is large enough.  Add generic support
for this to the core dma-mapping code so that those drivers can switch
to a common solution.

Signed-off-by: Christoph Hellwig 
---
 include/linux/device.h |  6 
 kernel/dma/mapping.c   | 70 +-
 2 files changed, 55 insertions(+), 21 deletions(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index ac8e37cd716a..143c2e0edb19 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -523,6 +523,11 @@ struct dev_links_info {
  *   sync_state() callback.
  * @dma_coherent: this particular device is dma coherent, even if the
  * architecture supports non-coherent devices.
+ * @dma_ops_bypass: If set to %true then the dma_ops are bypassed for the
+ * streaming DMA operations (->map_* / ->unmap_* / ->sync_*),
+ * and optionally (if the coherent mask is large enough) also
+ * for dma allocations.  This flag is managed by the dma ops
+ * instance from ->dma_supported.
  *
  * At the lowest level, every device in a Linux system is represented by an
  * instance of struct device. The device structure contains the information
@@ -622,6 +627,7 @@ struct device {
 defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
booldma_coherent:1;
 #endif
+   booldma_ops_bypass : 1;
 };
 
 static inline struct device *kobj_to_dev(struct kobject *kobj)
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 8e4155f9ee69..d776fb9f5adb 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -105,9 +105,33 @@ void *dmam_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
 }
 EXPORT_SYMBOL(dmam_alloc_attrs);
 
-static inline bool dma_is_direct(const struct dma_map_ops *ops)
+/*
+ * Check if the device uses a direct mapping for streaming DMA operations.
+ * This allows IOMMU drivers to set a bypass mode if the DMA mask is large
+ * enough.
+ */
+static inline bool dma_alloc_direct(struct device *dev,
+   const struct dma_map_ops *ops)
 {
-   return likely(!ops);
+   if (likely(!ops))
+   return true;
+   if (!dev->dma_ops_bypass)
+   return false;
+
+   return min_not_zero(dev->coherent_dma_mask, dev->bus_dma_limit) >=
+   dma_direct_get_required_mask(dev);
+}
+
+static inline bool dma_map_direct(struct device *dev,
+   const struct dma_map_ops *ops)
+{
+   if (likely(!ops))
+   return true;
+   if (!dev->dma_ops_bypass)
+   return false;
+
+   return min_not_zero(*dev->dma_mask, dev->bus_dma_limit) >=
+   dma_direct_get_required_mask(dev);
 }
 
 dma_addr_t dma_map_page_attrs(struct device *dev, struct page *page,
@@ -118,7 +142,7 @@ dma_addr_t dma_map_page_attrs(struct device *dev, struct page *page,
dma_addr_t addr;
 
BUG_ON(!valid_dma_direction(dir));
-   if (dma_is_direct(ops))
+   if (dma_map_direct(dev, ops))
addr = dma_direct_map_page(dev, page, offset, size, dir, attrs);
else
addr = ops->map_page(dev, page, offset, size, dir, attrs);
@@ -134,7 +158,7 @@ void dma_unmap_page_attrs(struct device *dev, dma_addr_t addr, size_t size,
const struct dma_map_ops *ops = get_dma_ops(dev);
 
BUG_ON(!valid_dma_direction(dir));
-   if (dma_is_direct(ops))
+   if (dma_map_direct(dev, ops))
dma_direct_unmap_page(dev, addr, size, dir, attrs);
else if (ops->unmap_page)
ops->unmap_page(dev, addr, size, dir, attrs);
@@ -153,7 +177,7 @@ int dma_map_sg_attrs(struct device *dev, struct scatterlist *sg, int nents,
int ents;
 
BUG_ON(!valid_dma_direction(dir));
-   if (dma_is_direct(ops))
+   if (dma_map_direct(dev, ops))
ents = dma_direct_map_sg(dev, sg, nents, dir, attrs);
else
ents = ops->map_sg(dev, sg, nents, dir, attrs);
@@ -172,7 +196,7 @@ void dma_unmap_sg_attrs(struct device *dev, struct scatterlist *sg,
 
BUG_ON(!valid_dma_direction(dir));
debug_dma_unmap_sg(dev, sg, nents, dir);
-   if (dma_is_direct(ops))
+   if (dma_map_direct(dev, ops))
dma_direct_unmap_sg(dev, sg, nents, dir, attrs);
else if (ops->unmap_sg)
ops->unmap_sg(dev, sg, nents, dir, attrs);
@@ -191,7 +215,7 @@ dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr,
if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr
return DMA_MAPPING_ERROR;
 
-   if (dma_is_direct(ops))
+   if (dma_map_direct(dev, ops))
	addr = dma_direct_map_resource(dev, phys_addr, size, dir, attrs);
else if (ops->map_resource)
addr = ops->map_resource(dev, phys_addr, size, dir,