Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-05 Thread Jim Quinlan via iommu
On Fri, Jun 5, 2020 at 1:27 PM Nicolas Saenz Julienne
 wrote:
>
> Hi Christoph,
> a question arose, is there a real value to dealing with PFNs (as opposed to
> real addresses) in the core DMA code/structures? I see that in some cases it
> eases interacting with mm, but the overwhelming usage of say,
> dev->dma_pfn_offset, involves shifting it.
>
> Hi Jim,
> On Thu, 2020-06-04 at 14:01 -0400, Jim Quinlan wrote:
> > Hi Nicolas,
>
> [...]
>
> > > I understand the need for dev to be around, devm_*() is key. But also it's
> > > important to keep the functions on purpose. And if of_dma_get_range() starts
> > > setting ranges it calls, for the very least, for a function rename. Although
> > > I'd rather split the parsing and setting of ranges as mentioned earlier. That
> > > said, I get that's a more drastic move.
> >
> > I agree with you.  I could do this from device.c:
> >
> > of_dma_get_num_ranges(..., &num_ranges); /* new function */
> > r = devm_kcalloc(dev, num_ranges + 1, sizeof(*r), GFP_KERNEL);
> > of_dma_get_range(np, &dma_addr, &paddr, &size, r, num_ranges);
> >
> > The problem here is that there could be four ranges, all with
> > offset=0.  My current code would optimize this case out but the above
> > would have us holding useless memory and looping through the four
> > ranges on every dma <=> phys conversion only to add 0.
>
> Point taken. Ultimately it's setting the device's dma ranges in
> of_dma_get_range() that was really bothering me, so if we have to pass the
> device pointer for allocations, be it.
>
> > > Talking about drastic moves. How about getting rid of the concept of
> > > dma_pfn_offset for drivers altogether. Let them provide dma_pfn_offset_regions
> > > (even when there is only one). I feel it's conceptually nicer, as you'd be
> > > dealing only in one currency, so to speak, and you'd centralize the bus DMA
> > > ranges setter function which is always easier to maintain.
> > Do you agree that we have to somehow hang this info on the struct
> > device structure?  Because in the dma2phys() and phys2dma() all you
> > have is the dev parameter.  I don't see how this  can be done w/o
> > involving dev.
>
> Sorry I didn't make myself clear here. What bothers me is having two functions
> setting the same device parameter through different means, I'd be happy to get
> rid of attach_uniform_dma_pfn_offset(), and always use the same function to set
> a device's bus dma regions. Something the likes of this comes to mind:
>
> dma_attach_pfn_offset_region(struct device *dev, struct dma_pfn_offset_regions *r)
>
> We could maybe use some helper macros for the linear case. But that's the gist
> of it.
>
> Also, it goes hand in hand with the comment below. Why having a special case
> for non sparse DMA offsets in struct dma_pfn_offset_regions? The way I see it,
> in this case, code simplicity is more interesting than a small optimization.
I've removed the special case and also the need for 'dev' in
of_dma_get_range().  v4 is coming...
>
> > > I'd go as far as not creating a special case for uniform offsets. Let just set
> > > cpu_end and dma_end to -1 so we always get a match. It's slightly more compute
> > > heavy, but I don't think it's worth the optimization.
> > Well, there are two subcases here.  One where we do know the bounds
> > and one where we do not.  I suppose for the latter I could have the
> > drivers calling it with begin=0 and end=~(dma_addr_t)0.  Let me give
> > this some thought...
> >
> > > Just my two cents :)
> >
> > Worth much more than $0.02 IMO :-)
>
> BTW, would you consider renaming the DMA offset struct to something simpler
> like, struct bus_dma_region? It complements 'dev->bus_dma_limit' better IMO.
Will do

Thanks,
Jim
>
> > BTW, I tried putting the "if (dev->dma_pfn_offset_map)" clause inside
> > the inline functions but the problem is that it slows the fastpath;
> > consider the following code from dma-direct.h
> >
> > if (dev->dma_pfn_offset_map) {
> >         unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr(dev, paddr);
> >
> >         dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
> > }
> > return dev_addr;
> >
> > becomes
> >
> > unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr(dev, paddr);
> >
> > dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
> > return dev_addr;
> >
> > So those configurations that  have no dma_pfn_offsets are doing an
> > unnecessary shift and add.
>
> Fair enough. Still not a huge difference, but I see the value being the most
> common case.
>
> Regards,
> Nicolas
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-05 Thread Nicolas Saenz Julienne
Hi Christoph,
a question arose, is there a real value to dealing with PFNs (as opposed to
real addresses) in the core DMA code/structures? I see that in some cases it
eases interacting with mm, but the overwhelming usage of say,
dev->dma_pfn_offset, involves shifting it.

Hi Jim,
On Thu, 2020-06-04 at 14:01 -0400, Jim Quinlan wrote:
> Hi Nicolas,

[...]

> > I understand the need for dev to be around, devm_*() is key. But also it's
> > important to keep the functions on purpose. And if of_dma_get_range() starts
> > setting ranges it calls, for the very least, for a function rename. Although
> > I'd rather split the parsing and setting of ranges as mentioned earlier.
> > That
> > said, I get that's a more drastic move.
> 
> I agree with you.  I could do this from device.c:
> 
> of_dma_get_num_ranges(..., &num_ranges); /* new function */
> r = devm_kcalloc(dev, num_ranges + 1, sizeof(*r), GFP_KERNEL);
> of_dma_get_range(np, &dma_addr, &paddr, &size, r, num_ranges);
> 
> The problem here is that there could be four ranges, all with
> offset=0.  My current code would optimize this case out but the above
> would have us holding useless memory and looping through the four
> ranges on every dma <=> phys conversion only to add 0.

Point taken. Ultimately it's setting the device's dma ranges in
of_dma_get_range() that was really bothering me, so if we have to pass the
device pointer for allocations, be it.

> > Talking about drastic moves. How about getting rid of the concept of
> > dma_pfn_offset for drivers altogether. Let them provide
> > dma_pfn_offset_regions
> > (even when there is only one). I feel it's conceptually nicer, as you'd be
> > dealing only in one currency, so to speak, and you'd centralize the bus DMA
> > ranges setter function which is always easier to maintain.
> Do you agree that we have to somehow hang this info on the struct
> device structure?  Because in the dma2phys() and phys2dma() all you
> have is the dev parameter.  I don't see how this  can be done w/o
> involving dev.

Sorry I didn't make myself clear here. What bothers me is having two functions
setting the same device parameter through different means, I'd be happy to get
rid of attach_uniform_dma_pfn_offset(), and always use the same function to set
a device's bus dma regions. Something the likes of this comes to mind:

dma_attach_pfn_offset_region(struct device *dev, struct dma_pfn_offset_regions *r)

We could maybe use some helper macros for the linear case. But that's the gist
of it.

Also, it goes hand in hand with the comment below. Why having a special case
for non sparse DMA offsets in struct dma_pfn_offset_regions? The way I see it,
in this case, code simplicity is more interesting than a small optimization.

> > I'd go as far as not creating a special case for uniform offsets. Let just
> > set
> > cpu_end and dma_end to -1 so we always get a match. It's slightly more
> > compute
> > heavy, but I don't think it's worth the optimization.
> Well, there are two subcases here.  One where we do know the bounds
> and one where we do not.  I suppose for the latter I could have the
> drivers calling it with begin=0 and end=~(dma_addr_t)0.  Let me give
> this some thought...
> 
> > Just my two cents :)
> 
> Worth much more than $0.02 IMO :-)

BTW, would you consider renaming the DMA offset struct to something simpler
like, struct bus_dma_region? It complements 'dev->bus_dma_limit' better IMO.

> BTW, I tried putting the "if (dev->dma_pfn_offset_map)" clause inside
> the inline functions but the problem is that it slows the fastpath;
> consider the following code from dma-direct.h
> 
> if (dev->dma_pfn_offset_map) {
>         unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr(dev, paddr);
>
>         dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
> }
> return dev_addr;
> 
> becomes
> 
> unsigned long dma_pfn_offset = dma_pfn_offset_from_phys_addr(dev, paddr);
> 
> dev_addr -= ((dma_addr_t)dma_pfn_offset << PAGE_SHIFT);
> return dev_addr;
> 
> So those configurations that  have no dma_pfn_offsets are doing an
> unnecessary shift and add.

Fair enough. Still not a huge difference, but I see the value being the most
common case.

Regards,
Nicolas
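
The region-based lookup discussed in this message can be sketched in plain
userspace C. This is only an illustration under stated assumptions, not the
code from the patch: every `_sketch` name, the inclusive cpu_end/dma_end
fields, the 4 KiB page size and the two example tables are hypothetical. A
uniform offset is written as a single catch-all region whose end fields are
all ones, and a device without a map returns an offset of 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12 /* assumption: 4 KiB pages */

/* Hypothetical region descriptor; all bounds are inclusive. */
struct region_sketch {
	uint64_t cpu_start, cpu_end;
	uint64_t dma_start, dma_end;
	uint64_t pfn_offset; /* (cpu_start - dma_start) >> PAGE_SHIFT_SKETCH */
};

/* Hypothetical stand-in for the part of struct device that holds the map. */
struct dev_sketch {
	const struct region_sketch *map; /* NULL when the device has no offsets */
	size_t nr;
};

/* PFN offset of the region containing the CPU address, or 0 if none. */
uint64_t pfn_offset_from_phys_sketch(const struct dev_sketch *dev, uint64_t paddr)
{
	size_t i;

	if (!dev || !dev->map)
		return 0; /* no map attached: the offset is simply 0 */

	for (i = 0; i < dev->nr; i++) {
		const struct region_sketch *r = &dev->map[i];

		assert(r->cpu_start <= r->cpu_end); /* reject malformed tables */
		if (paddr >= r->cpu_start && paddr <= r->cpu_end)
			return r->pfn_offset;
	}
	return 0;
}

/* PFN offset of the region containing the DMA address, or 0 if none. */
uint64_t pfn_offset_from_dma_sketch(const struct dev_sketch *dev, uint64_t daddr)
{
	size_t i;

	if (!dev || !dev->map)
		return 0;

	for (i = 0; i < dev->nr; i++) {
		const struct region_sketch *r = &dev->map[i];

		if (daddr >= r->dma_start && daddr <= r->dma_end)
			return r->pfn_offset;
	}
	return 0;
}

uint64_t phys_to_dma_sketch(const struct dev_sketch *dev, uint64_t paddr)
{
	return paddr - (pfn_offset_from_phys_sketch(dev, paddr) << PAGE_SHIFT_SKETCH);
}

uint64_t dma_to_phys_sketch(const struct dev_sketch *dev, uint64_t daddr)
{
	return daddr + (pfn_offset_from_dma_sketch(dev, daddr) << PAGE_SHIFT_SKETCH);
}

/* Example: two sparse regions with different offsets. */
const struct region_sketch sparse_map_sketch[] = {
	{ 0x080000000ULL, 0x0BFFFFFFFULL, 0x00000000ULL, 0x3FFFFFFFULL, 0x80000ULL },
	{ 0x100000000ULL, 0x13FFFFFFFULL, 0x40000000ULL, 0x7FFFFFFFULL, 0xC0000ULL },
};
const struct dev_sketch sparse_dev_sketch = { sparse_map_sketch, 2 };

/* Example: a uniform offset expressed as one catch-all region. */
const struct region_sketch uniform_map_sketch[] = {
	{ 0, UINT64_MAX, 0, UINT64_MAX, 0x80000ULL },
};
const struct dev_sketch uniform_dev_sketch = { uniform_map_sketch, 1 };
```

With the catch-all entry the uniform case goes through the same loop as the
sparse one, which is the simplicity-versus-speed trade-off debated above: one
code path, at the price of a range comparison on every conversion.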




Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-04 Thread Jim Quinlan via iommu
Hi Nicolas,

On Thu, Jun 4, 2020 at 12:52 PM Nicolas Saenz Julienne
 wrote:
>
> Hi Jim,
>
> On Thu, 2020-06-04 at 10:35 -0400, Jim Quinlan wrote:
>
> [...]
>
> > > > --- a/arch/sh/kernel/dma-coherent.c
> > > > +++ b/arch/sh/kernel/dma-coherent.c
> > > > @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > > dma_addr_t *dma_handle,
> > > >  {
> > > >   void *ret, *ret_nocache;
> > > >   int order = get_order(size);
> > > > + unsigned long pfn;
> > > > + phys_addr_t phys;
> > > >
> > > >   gfp |= __GFP_ZERO;
> > > >
> > > > @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t 
> > > > size,
> > > > dma_addr_t *dma_handle,
> > > >   return NULL;
> > > >   }
> > > >
> > > > - split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> > > > + phys = virt_to_phys(ret);
> > > > + pfn =  phys >> PAGE_SHIFT;
> > >
> > > nit: not sure it really pays off to have a pfn variable here.
> > Did it for readability; the compiler's optimization should take care
> > of any extra variables.  But I can switch if you insist.
>
> No need.
>
> [...]
>
> > > > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > index 055eb0b8e396..2d66d415b6c3 100644
> > > > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device
> > > > *pdev)
> > > >
> > > > >   sdev->dev = &pdev->dev;
> > > >   /* The DMA bus has the memory mapped at 0 */
> > > > - sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > > > + ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > > > + PHYS_OFFSET >> PAGE_SHIFT);
> > > > + if (ret)
> > > > + return ret;
> > > >
> > > >   ret = sun6i_csi_resource_request(sdev, pdev);
> > > >   if (ret)
> > > > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > > > index 96d8cfb14a60..c89333b0a5fb 100644
> > > > --- a/drivers/of/address.c
> > > > +++ b/drivers/of/address.c
> > > > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct
> > > > device_node
> > > > *np, int index,
> > > >  }
> > > >  EXPORT_SYMBOL(of_io_request_and_map);
> > > >
> > > > +static int attach_dma_pfn_offset_map(struct device *dev,
> > > > +  struct device_node *node, int
> > > > num_ranges)
> > >
> > > > As with the previous review, please take this comment with a grain of salt.
> > >
> > > I think there should be a clear split between what belongs to OF and what
> > > belongs to the core device infrastructure.
> > >
> > > OF's job should be to parse DT and provide a list/array of ranges, whereas
> > > the
> > > core device infrastructure should provide an API to assign a list of
> > > ranges/offset to a device.
> > >
> > > > As a concrete example, you're forcing devices like the sta2x11 to build with OF
> > > > support, which, being an Intel device, it's pretty odd. But I'm also thinking
> > > > of how will all this fit once an ACPI device wants to use it.
> > To fix this I only have to move attach_uniform_dma_pfn_offset() from
> > of/address.c to say include/linux/dma-mapping.h.  It has no
> > dependencies on OF.  Do you agree?
>
> Yes that seems nicer. In case you didn't have it in mind already, I'd change the
> function name to match the naming scheme they use there.
>
> On the other hand, I'd also move the non OF parts of the non uniform dma_offset
> version of the function there.
>
> > > Expanding on this idea, once you have a split between the OF's and device core
> > > roles, it transpires that of_dma_get_range()'s job should only be to provide
> > > the ranges in a device understandable structure and of_dma_configure()'s to
> > > actually assign the device's parameters. This would obsolete patch #7.
> >
> > I think you mean patch #8.
>
> Yes, my bad.
>
> > I agree with you.  The reason I needed a "struct device *"  in the call is
> > because I wanted to make sure the memory that is alloc'd belongs to the
> > device that needs it.  If I do a regular kzalloc(), this memory  will become
> > a leak once someone starts unbinding/binding their device.  Also, in  all
> > uses of of_dma_get_range() -- there is only one --  a dev is required as one
> > can't attach an offset map to NULL.
> >
> > I do see that there are a number of functions in drivers/of/*.c that
> > take 'struct device *dev' as an argument so there is precedent for
> > something like this.  Regardless, I need an owner to the memory I
> > alloc().
>
> I understand the need for dev to be around, devm_*() is key. But also it's
> important to keep the functions on purpose. And if of_dma_get_range() starts
> setting ranges it calls, for the very least, for a function rename. Although
> I'd rather split the parsing and setting 

Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-04 Thread Nicolas Saenz Julienne
Hi Jim,

On Thu, 2020-06-04 at 10:35 -0400, Jim Quinlan wrote:

[...]

> > > --- a/arch/sh/kernel/dma-coherent.c
> > > +++ b/arch/sh/kernel/dma-coherent.c
> > > @@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > dma_addr_t *dma_handle,
> > >  {
> > >   void *ret, *ret_nocache;
> > >   int order = get_order(size);
> > > + unsigned long pfn;
> > > + phys_addr_t phys;
> > > 
> > >   gfp |= __GFP_ZERO;
> > > 
> > > @@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size,
> > > dma_addr_t *dma_handle,
> > >   return NULL;
> > >   }
> > > 
> > > - split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
> > > + phys = virt_to_phys(ret);
> > > + pfn =  phys >> PAGE_SHIFT;
> > 
> > nit: not sure it really pays off to have a pfn variable here.
> Did it for readability; the compiler's optimization should take care
> of any extra variables.  But I can switch if you insist.

No need.

[...]

> > > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > index 055eb0b8e396..2d66d415b6c3 100644
> > > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device
> > > *pdev)
> > > 
> > >   sdev->dev = &pdev->dev;
> > >   /* The DMA bus has the memory mapped at 0 */
> > > - sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > > + ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > > + PHYS_OFFSET >> PAGE_SHIFT);
> > > + if (ret)
> > > + return ret;
> > > 
> > >   ret = sun6i_csi_resource_request(sdev, pdev);
> > >   if (ret)
> > > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > > index 96d8cfb14a60..c89333b0a5fb 100644
> > > --- a/drivers/of/address.c
> > > +++ b/drivers/of/address.c
> > > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct
> > > device_node
> > > *np, int index,
> > >  }
> > >  EXPORT_SYMBOL(of_io_request_and_map);
> > > 
> > > +static int attach_dma_pfn_offset_map(struct device *dev,
> > > +  struct device_node *node, int
> > > num_ranges)
> > 
> > As with the previous review, please take this comment with a grain of salt.
> > 
> > I think there should be a clear split between what belongs to OF and what
> > belongs to the core device infrastructure.
> > 
> > OF's job should be to parse DT and provide a list/array of ranges, whereas
> > the
> > core device infrastructure should provide an API to assign a list of
> > ranges/offset to a device.
> > 
> > As a concrete example, you're forcing devices like the sta2x11 to build with
> > OF
> > support, which, being an Intel device, it's pretty odd. But I'm also
> > thinking
> > of how will all this fit once an ACPI device wants to use it.
> To fix this I only have to move attach_uniform_dma_pfn_offset() from
> of/address.c to say include/linux/dma-mapping.h.  It has no
> dependencies on OF.  Do you agree?

Yes that seems nicer. In case you didn't have it in mind already, I'd change the
function name to match the naming scheme they use there.

On the other hand, I'd also move the non OF parts of the non uniform dma_offset
version of the function there.

> > Expanding on this idea, once you have a split between the OF's and device
> > core
> > roles, it transpires that of_dma_get_range()'s job should only be to provide
> > the ranges in a device understandable structure and of_dma_configure()'s to
> > actually assign the device's parameters. This would obsolete patch #7.
> 
> I think you mean patch #8.

Yes, my bad.

> I agree with you.  The reason I needed a "struct device *"  in the call is
> because I wanted to make sure the memory that is alloc'd belongs to the
> device that needs it.  If I do a regular kzalloc(), this memory  will become
> a leak once someone starts unbinding/binding their device.  Also, in  all
> uses of of_dma_get_range() -- there is only one --  a dev is required as one
> can't attach an offset map to NULL.
> 
> I do see that there are a number of functions in drivers/of/*.c that
> take 'struct device *dev' as an argument so there is precedent for
> something like this.  Regardless, I need an owner to the memory I
> alloc().

I understand the need for dev to be around, devm_*() is key. But also it's
important to keep the functions on purpose. And if of_dma_get_range() starts
setting ranges it calls, for the very least, for a function rename. Although
I'd rather split the parsing and setting of ranges as mentioned earlier. That
said, I get that's a more drastic move.

Talking about drastic moves. How about getting rid of the concept of
dma_pfn_offset for drivers altogether. Let them provide dma_pfn_offset_regions
(even when there is only one). I feel it's conceptually nicer, as you'd be
dealing only in one 

Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-04 Thread Jim Quinlan via iommu
Hi Andy,

On Thu, Jun 4, 2020 at 11:05 AM Andy Shevchenko
 wrote:
>
> On Thu, Jun 04, 2020 at 10:35:12AM -0400, Jim Quinlan wrote:
> > On Thu, Jun 4, 2020 at 9:53 AM Nicolas Saenz Julienne
> >  wrote:
> > > On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:
>
> ...
>
> > > > + phys = virt_to_phys(ret);
> > > > + pfn =  phys >> PAGE_SHIFT;
> > >
> > > nit: not sure it really pays off to have a pfn variable here.
> > Did it for readability; the compiler's optimization should take care
> > of any extra variables.  But I can switch if you insist.
>
> One side note: please, try to get familiar with existing helpers in the kernel.
> For example, above line is like
>
> pfn = PFN_DOWN(phys);
I just used the term in the original code; will change to PFN_DOWN().

>
> ...
>
> > > > + if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)
>
> > > > + *dma_handle -= PFN_PHYS(
> > > > + dma_pfn_offset_from_phys_addr(dev, phys));
>
> Don't do such indentation, especially now that we have a 100-column limit! :-)

Got it.  Thanks,
Jim Quinlan
>
> --
> With Best Regards,
> Andy Shevchenko
>
>


Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-04 Thread Andy Shevchenko
On Thu, Jun 04, 2020 at 10:35:12AM -0400, Jim Quinlan wrote:
> On Thu, Jun 4, 2020 at 9:53 AM Nicolas Saenz Julienne
>  wrote:
> > On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:

...

> > > + phys = virt_to_phys(ret);
> > > + pfn =  phys >> PAGE_SHIFT;
> >
> > nit: not sure it really pays off to have a pfn variable here.
> Did it for readability; the compiler's optimization should take care
> of any extra variables.  But I can switch if you insist.

One side note: please, try to get familiar with existing helpers in the kernel.
For example, above line is like

pfn = PFN_DOWN(phys);

...

> > > + if (!WARN_ON(!dev) && dev->dma_pfn_offset_map)

> > > + *dma_handle -= PFN_PHYS(
> > > + dma_pfn_offset_from_phys_addr(dev, phys));

Don't do such indentation, especially now that we have a 100-column limit! :-)

-- 
With Best Regards,
Andy Shevchenko
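
For readers unfamiliar with the helpers mentioned in this message, the
following userspace sketch mirrors what PFN_DOWN() and PFN_PHYS() do. It is
an illustration only: the `_SKETCH`/`_sketch` names are hypothetical
stand-ins, a 4 KiB page size is assumed, and dma_handle_sketch() is a
simplified version of the dma_handle adjustment quoted above that takes an
already known PFN offset instead of looking it up from the device.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12 /* assumption: 4 KiB pages */

/* Stand-ins for the kernel helpers: address -> PFN and PFN -> address. */
#define PFN_DOWN_SKETCH(x) ((x) >> PAGE_SHIFT_SKETCH)
#define PFN_PHYS_SKETCH(x) ((uint64_t)(x) << PAGE_SHIFT_SKETCH)

/* Same result as the open-coded "phys >> PAGE_SHIFT". */
uint64_t pfn_of_sketch(uint64_t phys)
{
	return PFN_DOWN_SKETCH(phys);
}

/* Convert a PFN offset to bytes and subtract it from a physical address. */
uint64_t dma_handle_sketch(uint64_t phys, unsigned long pfn_offset)
{
	uint64_t byte_offset = PFN_PHYS_SKETCH(pfn_offset);

	assert(phys >= byte_offset); /* the sketch does not model wrap-around */
	return phys - byte_offset;
}
```

The named helper states the intent (address to page frame number) and widens
the operand before shifting in the PFN-to-address direction, which an
open-coded shift on an unsigned long can get wrong on 32-bit builds.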




Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-04 Thread Jim Quinlan via iommu
On Thu, Jun 4, 2020 at 10:20 AM Dan Carpenter  wrote:
>
> On Thu, Jun 04, 2020 at 09:48:49AM -0400, Jim Quinlan wrote:
> > > > + r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > > +  GFP_KERNEL);
> > >
> > > > Use:  r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
> > Will fix.
> >
> > >
> > >
> > > > + if (!r)
> > > > + return -ENOMEM;
> > > > +
> > > > + r->uniform_offset = true;
> > > > + r->pfn_offset = pfn_offset;
> > > > +
> > > > + return 0;
> > > > +}
> > >
> > > This function doesn't seem to do anything useful.  Is part of it
> > > missing?
> > No, the uniform pfn offset is a special case.
>
> Sorry, I wasn't clear.  We're talking about different things.  The code
> does:
>
> r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
> if (!r)
> return -ENOMEM;
>
> r->uniform_offset = true;
> r->pfn_offset = pfn_offset;
>
> return 0;
>
> The code allocates "r" and then doesn't save it anywhere so there is
> no point.
You are absolutely right, sorry I missed your point.  Will fix.

Thanks,
Jim Quinlan

>
> regards,
> dan carpenter
>


Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-04 Thread Jim Quinlan via iommu
On Thu, Jun 4, 2020 at 9:53 AM Nicolas Saenz Julienne
 wrote:
>
> Hi Jim,
>
> On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:
> > The new field in struct device 'dma_pfn_offset_map' is used to facilitate
> > the use of multiple pfn offsets between cpu addrs and dma addrs.  It
> > subsumes the role of dev->dma_pfn_offset -- a uniform offset -- and
> > designates the single offset a special case.
> >
> > of_dma_configure() is the typical manner to set pfn offsets but there
> > are a number of ad hoc assignments to dev->dma_pfn_offset in the
> > kernel code.  These cases now invoke the function
> > attach_uniform_dma_pfn_offset(dev, pfn_offset).
> >
> > Signed-off-by: Jim Quinlan 
> > ---
> >  arch/arm/include/asm/dma-mapping.h|  9 +-
> >  arch/arm/mach-keystone/keystone.c |  9 +-
> >  arch/sh/drivers/pci/pcie-sh7786.c |  3 +-
> >  arch/sh/kernel/dma-coherent.c | 17 ++--
> >  arch/x86/pci/sta2x11-fixup.c  |  7 +-
> >  drivers/acpi/arm64/iort.c |  5 +-
> >  drivers/gpu/drm/sun4i/sun4i_backend.c |  7 +-
> >  drivers/iommu/io-pgtable-arm.c|  2 +-
> >  .../platform/sunxi/sun4i-csi/sun4i_csi.c  |  5 +-
> >  .../platform/sunxi/sun6i-csi/sun6i_csi.c  |  5 +-
> >  drivers/of/address.c  | 93 +--
> >  drivers/of/device.c   |  8 +-
> >  drivers/remoteproc/remoteproc_core.c  |  2 +-
> >  .../staging/media/sunxi/cedrus/cedrus_hw.c|  7 +-
> >  drivers/usb/core/message.c|  4 +-
> >  drivers/usb/core/usb.c|  2 +-
> >  include/linux/device.h|  4 +-
> >  include/linux/dma-direct.h| 16 +++-
> >  include/linux/dma-mapping.h   | 45 +
> >  kernel/dma/coherent.c | 11 ++-
> >  20 files changed, 210 insertions(+), 51 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-
> > mapping.h
> > index bdd80ddbca34..f1e72f99468b 100644
> > --- a/arch/arm/include/asm/dma-mapping.h
> > +++ b/arch/arm/include/asm/dma-mapping.h
> > @@ -35,8 +35,9 @@ static inline const struct dma_map_ops
> > *get_arch_dma_ops(struct bus_type *bus)
> >  #ifndef __arch_pfn_to_dma
> >  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
> >  {
> > - if (dev)
> > - pfn -= dev->dma_pfn_offset;
> > + if (dev && dev->dma_pfn_offset_map)
>
> Would it make sense to move the dev->dma_pfn_offset_map check into
> dma_pfn_offset_from_phys_addr() and return 0 if not available? Same for the
> opposite variant of the function. I think it'd make the code a little simpler on
> some of the use cases, and overall less error prone if anyone starts using the
> function elsewhere.

Yes it makes sense and I was debating doing it but I just wanted to
make it explicit that there was not much cost for this change for the
fastpath -- no dma_pfn_offset whatsoever -- as the cost goes from a
"pfn += dev->dma_pfn_offset"  to a "if (dev->dma_pfn_offset_map)".  I
will do what you suggest.
>
> > + pfn -= dma_pfn_offset_from_phys_addr(dev, PFN_PHYS(pfn));
> > +
> >   return (dma_addr_t)__pfn_to_bus(pfn);
> >  }
> >
> > @@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev,
> > dma_addr_t addr)
> >  {
> >   unsigned long pfn = __bus_to_pfn(addr);
> >
> > - if (dev)
> > - pfn += dev->dma_pfn_offset;
> > + if (dev && dev->dma_pfn_offset_map)
> > + pfn += dma_pfn_offset_from_dma_addr(dev, addr);
> >
> >   return pfn;
> >  }
> > diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-
> > keystone/keystone.c
> > index 638808c4e122..e7d3ee6e9cb5 100644
> > --- a/arch/arm/mach-keystone/keystone.c
> > +++ b/arch/arm/mach-keystone/keystone.c
> > @@ -8,6 +8,7 @@
> >   */
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -38,9 +39,11 @@ static int keystone_platform_notifier(struct 
> > notifier_block
> > *nb,
> >   return NOTIFY_BAD;
> >
> >   if (!dev->of_node) {
> > - dev->dma_pfn_offset = keystone_dma_pfn_offset;
> > - dev_err(dev, "set dma_pfn_offset%08lx\n",
> > - dev->dma_pfn_offset);
> > + int ret = attach_uniform_dma_pfn_offset
> > + (dev, keystone_dma_pfn_offset);
> > +
> > + dev_err(dev, "set dma_pfn_offset%08lx%s\n",
> > + dev->dma_pfn_offset, ret ? " failed" : "");
> >   }
> >   return NOTIFY_OK;
> >  }
> > diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-
> > sh7786.c
> > index e0b568aaa701..2e832a5c58c1 100644
> > --- a/arch/sh/drivers/pci/pcie-sh7786.c
> > +++ b/arch/sh/drivers/pci/pcie-sh7786.c
> > @@ -12,6 +12,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  

Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-04 Thread Dan Carpenter
On Thu, Jun 04, 2020 at 09:48:49AM -0400, Jim Quinlan wrote:
> > > + r = devm_kcalloc(dev, 1, sizeof(struct dma_pfn_offset_region),
> > > +  GFP_KERNEL);
> >
> > Use:  r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
> Will fix.
> 
> >
> >
> > > + if (!r)
> > > + return -ENOMEM;
> > > +
> > > + r->uniform_offset = true;
> > > + r->pfn_offset = pfn_offset;
> > > +
> > > + return 0;
> > > +}
> >
> > This function doesn't seem to do anything useful.  Is part of it
> > missing?
> No, the uniform pfn offset is a special case.

Sorry, I wasn't clear.  We're talking about different things.  The code
does:

r = devm_kzalloc(dev, sizeof(*r), GFP_KERNEL);
if (!r)
return -ENOMEM;

r->uniform_offset = true;
r->pfn_offset = pfn_offset;

return 0;

The code allocates "r" and then doesn't save it anywhere so there is
no point.

regards,
dan carpenter
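
The problem pointed out above, and one possible shape of the fix, can be
shown with a small userspace sketch. It is an assumption-laden illustration,
not the v4 patch: the `_sketch` types and function are hypothetical, calloc()
stands in for devm_kzalloc(), and the global example_device_sketch exists
only so the function can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dma_pfn_offset_region. */
struct offset_region_sketch {
	bool uniform_offset;
	unsigned long pfn_offset;
};

/* Hypothetical stand-in for the relevant part of struct device. */
struct device_sketch {
	struct offset_region_sketch *dma_pfn_offset_map;
};

/* Zero-initialized example instance, for demonstration only. */
struct device_sketch example_device_sketch;

int attach_uniform_offset_sketch(struct device_sketch *dev,
				 unsigned long pfn_offset)
{
	struct offset_region_sketch *r;

	assert(dev != NULL); /* a NULL device is a caller bug in this sketch */

	r = calloc(1, sizeof(*r)); /* devm_kzalloc() in the kernel version */
	if (!r)
		return -ENOMEM;

	r->uniform_offset = true;
	r->pfn_offset = pfn_offset;

	/*
	 * The step missing from the quoted code: publish the region on the
	 * device. Without it the allocation is unreachable and the offset
	 * is never applied.
	 */
	dev->dma_pfn_offset_map = r;

	return 0;
}
```

In the kernel the devm_*() allocation would be released when the device is
unbound; the sketch never frees the region because it has no such lifetime
hook.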



Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-04 Thread Nicolas Saenz Julienne
Hi Jim,

On Wed, 2020-06-03 at 15:20 -0400, Jim Quinlan wrote:
> The new field in struct device 'dma_pfn_offset_map' is used to facilitate
> the use of multiple pfn offsets between cpu addrs and dma addrs.  It
> subsumes the role of dev->dma_pfn_offset -- a uniform offset -- and
> designates the single offset a special case.
> 
> of_dma_configure() is the typical manner to set pfn offsets but there
> are a number of ad hoc assignments to dev->dma_pfn_offset in the
> kernel code.  These cases now invoke the function
> attach_uniform_dma_pfn_offset(dev, pfn_offset).
> 
> Signed-off-by: Jim Quinlan 
> ---
>  arch/arm/include/asm/dma-mapping.h|  9 +-
>  arch/arm/mach-keystone/keystone.c |  9 +-
>  arch/sh/drivers/pci/pcie-sh7786.c |  3 +-
>  arch/sh/kernel/dma-coherent.c | 17 ++--
>  arch/x86/pci/sta2x11-fixup.c  |  7 +-
>  drivers/acpi/arm64/iort.c |  5 +-
>  drivers/gpu/drm/sun4i/sun4i_backend.c |  7 +-
>  drivers/iommu/io-pgtable-arm.c|  2 +-
>  .../platform/sunxi/sun4i-csi/sun4i_csi.c  |  5 +-
>  .../platform/sunxi/sun6i-csi/sun6i_csi.c  |  5 +-
>  drivers/of/address.c  | 93 +--
>  drivers/of/device.c   |  8 +-
>  drivers/remoteproc/remoteproc_core.c  |  2 +-
>  .../staging/media/sunxi/cedrus/cedrus_hw.c|  7 +-
>  drivers/usb/core/message.c|  4 +-
>  drivers/usb/core/usb.c|  2 +-
>  include/linux/device.h|  4 +-
>  include/linux/dma-direct.h| 16 +++-
>  include/linux/dma-mapping.h   | 45 +
>  kernel/dma/coherent.c | 11 ++-
>  20 files changed, 210 insertions(+), 51 deletions(-)
> 
> diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-
> mapping.h
> index bdd80ddbca34..f1e72f99468b 100644
> --- a/arch/arm/include/asm/dma-mapping.h
> +++ b/arch/arm/include/asm/dma-mapping.h
> @@ -35,8 +35,9 @@ static inline const struct dma_map_ops
> *get_arch_dma_ops(struct bus_type *bus)
>  #ifndef __arch_pfn_to_dma
>  static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
>  {
> - if (dev)
> - pfn -= dev->dma_pfn_offset;
> + if (dev && dev->dma_pfn_offset_map)

Would it make sense to move the dev->dma_pfn_offset_map check into
dma_pfn_offset_from_phys_addr() and return 0 if not available? Same for the
opposite variant of the function. I think it'd make the code a little simpler on
some of the use cases, and overall less error prone if anyone starts using the
function elsewhere.

> + pfn -= dma_pfn_offset_from_phys_addr(dev, PFN_PHYS(pfn));
> +
>   return (dma_addr_t)__pfn_to_bus(pfn);
>  }
>  
> @@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev,
> dma_addr_t addr)
>  {
>   unsigned long pfn = __bus_to_pfn(addr);
>  
> - if (dev)
> - pfn += dev->dma_pfn_offset;
> + if (dev && dev->dma_pfn_offset_map)
> + pfn += dma_pfn_offset_from_dma_addr(dev, addr);
>  
>   return pfn;
>  }
> diff --git a/arch/arm/mach-keystone/keystone.c b/arch/arm/mach-
> keystone/keystone.c
> index 638808c4e122..e7d3ee6e9cb5 100644
> --- a/arch/arm/mach-keystone/keystone.c
> +++ b/arch/arm/mach-keystone/keystone.c
> @@ -8,6 +8,7 @@
>   */
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -38,9 +39,11 @@ static int keystone_platform_notifier(struct notifier_block
> *nb,
>   return NOTIFY_BAD;
>  
>   if (!dev->of_node) {
> - dev->dma_pfn_offset = keystone_dma_pfn_offset;
> - dev_err(dev, "set dma_pfn_offset%08lx\n",
> - dev->dma_pfn_offset);
> + int ret = attach_uniform_dma_pfn_offset
> + (dev, keystone_dma_pfn_offset);
> +
> + dev_err(dev, "set dma_pfn_offset%08lx%s\n",
> + dev->dma_pfn_offset, ret ? " failed" : "");
>   }
>   return NOTIFY_OK;
>  }
> diff --git a/arch/sh/drivers/pci/pcie-sh7786.c b/arch/sh/drivers/pci/pcie-sh7786.c
> index e0b568aaa701..2e832a5c58c1 100644
> --- a/arch/sh/drivers/pci/pcie-sh7786.c
> +++ b/arch/sh/drivers/pci/pcie-sh7786.c
> @@ -12,6 +12,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -487,7 +488,7 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev,
> u8 slot, u8 pin)
>  
>  void pcibios_bus_add_device(struct pci_dev *pdev)
>  {
> - pdev->dev.dma_pfn_offset = dma_pfn_offset;
> + attach_uniform_dma_pfn_offset(&pdev->dev, dma_pfn_offset);
>  }
>  
>  static int __init sh7786_pcie_core_init(void)
> diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
> index d4811691b93c..5fc9e358b6c7 100644
> --- a/arch/sh/kernel/dma-coherent.c
> +++ b/arch/sh/kernel/dma-coherent.c
> @@ -14,6 +14,8 @@ void 

Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-04 Thread Jim Quinlan via iommu
Hi Dan,

On Thu, Jun 4, 2020 at 7:06 AM Dan Carpenter  wrote:
>
> On Wed, Jun 03, 2020 at 03:20:41PM -0400, Jim Quinlan wrote:
> > @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, 
> > struct device *master,
> >   const struct sun4i_backend_quirks *quirks;
> >   struct resource *res;
> >   void __iomem *regs;
> > - int i, ret;
> > + int i, ret = 0;
>
> No need for this.
Will fix.

>
> >
> >   backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
> >   if (!backend)
> > @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, 
> > struct device *master,
> >* on our device since the RAM mapping is at 0 for the DMA 
> > bus,
> >* unlike the CPU.
> >*/
> > - drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > + ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > + if (ret)
> > + return ret;
> >   }
> >
> >   backend->engine.node = dev->of_node;
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
> >   if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
> >   return NULL;
> >
> > - if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> > + if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
> >   dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for 
> > IOMMU page tables\n");
> >   return NULL;
> >   }
> > diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c 
> > b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > index eff34ded6305..7212da5e1076 100644
> > --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> > @@ -7,6 +7,7 @@
> >   */
> >
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
> >   return ret;
> >   } else {
> >  #ifdef PHYS_PFN_OFFSET
> > - csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> > + ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> > + if (ret)
> > + return ret;
> >  #endif
> >   }
> >
> > diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c 
> > b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > index 055eb0b8e396..2d66d415b6c3 100644
> > --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> > @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device 
> > *pdev)
> >
> > >   sdev->dev = &pdev->dev;
> >   /* The DMA bus has the memory mapped at 0 */
> > - sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> > + ret = attach_uniform_dma_pfn_offset(sdev->dev,
> > + PHYS_OFFSET >> PAGE_SHIFT);
> > + if (ret)
> > + return ret;
> >
> >   ret = sun6i_csi_resource_request(sdev, pdev);
> >   if (ret)
> > diff --git a/drivers/of/address.c b/drivers/of/address.c
> > index 96d8cfb14a60..c89333b0a5fb 100644
> > --- a/drivers/of/address.c
> > +++ b/drivers/of/address.c
> > @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node 
> > *np, int index,
> >  }
> >  EXPORT_SYMBOL(of_io_request_and_map);
> >
> > +static int attach_dma_pfn_offset_map(struct device *dev,
> > +  struct device_node *node, int num_ranges)
> > +{
> > + struct of_range_parser parser;
> > + struct of_range range;
> > + struct dma_pfn_offset_region *r;
> > +
> > + r = devm_kcalloc(dev, num_ranges + 1,
> > +  sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> > + if (!r)
> > + return -ENOMEM;
> > + dev->dma_pfn_offset_map = r;
> > > + of_dma_range_parser_init(&parser, node);
> > +
> > + /*
> > +  * Record all info for DMA ranges array.  We could
> > +  * just use the of_range struct, but if we did that it
> > +  * would require more calculations for phys_to_dma and
> > +  * dma_to_phys conversions.
> > +  */
> > > + for_each_of_range(&parser, &range) {
> > + r->cpu_start = range.cpu_addr;
> > + r->cpu_end = r->cpu_start + range.size - 1;
> > + r->dma_start = range.bus_addr;
> > + r->dma_end = r->dma_start + range.size - 1;
> > + r->pfn_offset = PFN_DOWN(range.cpu_addr)
> > + - PFN_DOWN(range.bus_addr);
> > + r++;
> > + }
> > + return 0;
> > +}
> > +
> > +
> > +
> > +/**
> > + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> > + * @dev: device pointer; only needed for a corner case.
> 

Re: [PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-04 Thread Dan Carpenter
On Wed, Jun 03, 2020 at 03:20:41PM -0400, Jim Quinlan wrote:
> @@ -786,7 +787,7 @@ static int sun4i_backend_bind(struct device *dev, struct 
> device *master,
>   const struct sun4i_backend_quirks *quirks;
>   struct resource *res;
>   void __iomem *regs;
> - int i, ret;
> + int i, ret = 0;

No need for this.

>  
>   backend = devm_kzalloc(dev, sizeof(*backend), GFP_KERNEL);
>   if (!backend)
> @@ -812,7 +813,9 @@ static int sun4i_backend_bind(struct device *dev, struct 
> device *master,
>* on our device since the RAM mapping is at 0 for the DMA bus,
>* unlike the CPU.
>*/
> - drm->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> + ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> + if (ret)
> + return ret;
>   }
>  
>   backend->engine.node = dev->of_node;
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 04fbd4bf0ff9..e9cc1c2d47cd 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -754,7 +754,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>   if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>   return NULL;
>  
> - if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {
> + if (!selftest_running && cfg->iommu_dev->dma_pfn_offset_map) {
>   dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for 
> IOMMU page tables\n");
>   return NULL;
>   }
> diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c 
> b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> index eff34ded6305..7212da5e1076 100644
> --- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> +++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
> @@ -7,6 +7,7 @@
>   */
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -183,7 +184,9 @@ static int sun4i_csi_probe(struct platform_device *pdev)
>   return ret;
>   } else {
>  #ifdef PHYS_PFN_OFFSET
> - csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
> + ret = attach_uniform_dma_pfn_offset(dev, PHYS_PFN_OFFSET);
> + if (ret)
> + return ret;
>  #endif
>   }
>  
> diff --git a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c 
> b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> index 055eb0b8e396..2d66d415b6c3 100644
> --- a/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> +++ b/drivers/media/platform/sunxi/sun6i-csi/sun6i_csi.c
> @@ -898,7 +898,10 @@ static int sun6i_csi_probe(struct platform_device *pdev)
>  
>   sdev->dev = &pdev->dev;
>   /* The DMA bus has the memory mapped at 0 */
> - sdev->dev->dma_pfn_offset = PHYS_OFFSET >> PAGE_SHIFT;
> + ret = attach_uniform_dma_pfn_offset(sdev->dev,
> + PHYS_OFFSET >> PAGE_SHIFT);
> + if (ret)
> + return ret;
>  
>   ret = sun6i_csi_resource_request(sdev, pdev);
>   if (ret)
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 96d8cfb14a60..c89333b0a5fb 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -918,6 +918,70 @@ void __iomem *of_io_request_and_map(struct device_node 
> *np, int index,
>  }
>  EXPORT_SYMBOL(of_io_request_and_map);
>  
> +static int attach_dma_pfn_offset_map(struct device *dev,
> +  struct device_node *node, int num_ranges)
> +{
> + struct of_range_parser parser;
> + struct of_range range;
> + struct dma_pfn_offset_region *r;
> +
> + r = devm_kcalloc(dev, num_ranges + 1,
> +  sizeof(struct dma_pfn_offset_region), GFP_KERNEL);
> + if (!r)
> + return -ENOMEM;
> + dev->dma_pfn_offset_map = r;
> + of_dma_range_parser_init(&parser, node);
> +
> + /*
> +  * Record all info for DMA ranges array.  We could
> +  * just use the of_range struct, but if we did that it
> +  * would require more calculations for phys_to_dma and
> +  * dma_to_phys conversions.
> +  */
> + for_each_of_range(&parser, &range) {
> + r->cpu_start = range.cpu_addr;
> + r->cpu_end = r->cpu_start + range.size - 1;
> + r->dma_start = range.bus_addr;
> + r->dma_end = r->dma_start + range.size - 1;
> + r->pfn_offset = PFN_DOWN(range.cpu_addr)
> + - PFN_DOWN(range.bus_addr);
> + r++;
> + }
> + return 0;
> +}
> +
> +
> +
> +/**
> + * attach_dma_pfn_offset - Assign scalar offset for all addresses.
> + * @dev: device pointer; only needed for a corner case.
> + * @dma_pfn_offset:  offset to apply when converting from phys addr
  ^^^
This parameter name does not match.

> + *   to dma addr and vice versa.
> + *
> + * It returns -ENOMEM if out of memory, otherwise 0.

It can also return -ENODEV.  Why are we passing NULL dev pointers to
all these 

[PATCH v3 09/13] device core: Introduce multiple dma pfn offsets

2020-06-03 Thread Jim Quinlan via iommu
The new field 'dma_pfn_offset_map' in struct device supports multiple
pfn offsets between cpu addrs and dma addrs.  It subsumes the role of
dev->dma_pfn_offset -- a uniform offset -- and treats the single
offset as a special case.

of_dma_configure() is the usual way to set pfn offsets, but there
are a number of ad hoc assignments to dev->dma_pfn_offset in the
kernel code.  These cases now invoke the function
attach_uniform_dma_pfn_offset(dev, pfn_offset).
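
To make the idea of per-region offsets concrete, here is a small userspace
sketch. It is not code from this patch: fill_region() copies the arithmetic of
the loop in attach_dma_pfn_offset_map(), while struct range (a stand-in for
struct of_range), the explicit region count, the 4 KiB page size, and
phys_to_dma() as written here are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define PFN_DOWN(x)	((x) >> PAGE_SHIFT)

/* Hypothetical stand-in for the fields of struct of_range used here. */
struct range {
	uint64_t cpu_addr, bus_addr, size;
};

struct dma_pfn_offset_region {
	uint64_t cpu_start, cpu_end;
	uint64_t dma_start, dma_end;
	uint64_t pfn_offset;
};

/* Same arithmetic as the loop body in attach_dma_pfn_offset_map(). */
static void fill_region(struct dma_pfn_offset_region *r,
			const struct range *in)
{
	r->cpu_start = in->cpu_addr;
	r->cpu_end = r->cpu_start + in->size - 1;
	r->dma_start = in->bus_addr;
	r->dma_end = r->dma_start + in->size - 1;
	r->pfn_offset = PFN_DOWN(in->cpu_addr) - PFN_DOWN(in->bus_addr);
}

/* phys -> dma via the first region containing paddr; identity if none. */
static uint64_t phys_to_dma(const struct dma_pfn_offset_region *map,
			    size_t n, uint64_t paddr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (paddr >= map[i].cpu_start && paddr <= map[i].cpu_end)
			return paddr - (map[i].pfn_offset << PAGE_SHIFT);

	return paddr;
}
```

For example, a range with cpu 0x40000000 mapped to bus 0x0 gets a pfn offset
of 0x40000, while a second range with cpu 0x100000000 mapped to bus 0x80000000
gets 0x80000; a single dev->dma_pfn_offset cannot describe both.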

Signed-off-by: Jim Quinlan 
---
 arch/arm/include/asm/dma-mapping.h|  9 +-
 arch/arm/mach-keystone/keystone.c |  9 +-
 arch/sh/drivers/pci/pcie-sh7786.c |  3 +-
 arch/sh/kernel/dma-coherent.c | 17 ++--
 arch/x86/pci/sta2x11-fixup.c  |  7 +-
 drivers/acpi/arm64/iort.c |  5 +-
 drivers/gpu/drm/sun4i/sun4i_backend.c |  7 +-
 drivers/iommu/io-pgtable-arm.c|  2 +-
 .../platform/sunxi/sun4i-csi/sun4i_csi.c  |  5 +-
 .../platform/sunxi/sun6i-csi/sun6i_csi.c  |  5 +-
 drivers/of/address.c  | 93 +--
 drivers/of/device.c   |  8 +-
 drivers/remoteproc/remoteproc_core.c  |  2 +-
 .../staging/media/sunxi/cedrus/cedrus_hw.c|  7 +-
 drivers/usb/core/message.c|  4 +-
 drivers/usb/core/usb.c|  2 +-
 include/linux/device.h|  4 +-
 include/linux/dma-direct.h| 16 +++-
 include/linux/dma-mapping.h   | 45 +
 kernel/dma/coherent.c | 11 ++-
 20 files changed, 210 insertions(+), 51 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index bdd80ddbca34..f1e72f99468b 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -35,8 +35,9 @@ static inline const struct dma_map_ops 
*get_arch_dma_ops(struct bus_type *bus)
 #ifndef __arch_pfn_to_dma
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-   if (dev)
-   pfn -= dev->dma_pfn_offset;
+   if (dev && dev->dma_pfn_offset_map)
+   pfn -= dma_pfn_offset_from_phys_addr(dev, PFN_PHYS(pfn));
+
return (dma_addr_t)__pfn_to_bus(pfn);
 }
 
@@ -44,8 +45,8 @@ static inline unsigned long dma_to_pfn(struct device *dev, 
dma_addr_t addr)
 {
unsigned long pfn = __bus_to_pfn(addr);
 
-   if (dev)
-   pfn += dev->dma_pfn_offset;
+   if (dev && dev->dma_pfn_offset_map)
+   pfn += dma_pfn_offset_from_dma_addr(dev, addr);
 
return pfn;
 }
diff --git a/arch/arm/mach-keystone/keystone.c 
b/arch/arm/mach-keystone/keystone.c
index 638808c4e122..e7d3ee6e9cb5 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -8,6 +8,7 @@
  */
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -38,9 +39,11 @@ static int keystone_platform_notifier(struct notifier_block 
*nb,
return NOTIFY_BAD;
 
if (!dev->of_node) {
-   dev->dma_pfn_offset = keystone_dma_pfn_offset;
-   dev_err(dev, "set dma_pfn_offset%08lx\n",
-   dev->dma_pfn_offset);
+   int ret = attach_uniform_dma_pfn_offset
+   (dev, keystone_dma_pfn_offset);
+
+   dev_err(dev, "set dma_pfn_offset%08lx%s\n",
+   dev->dma_pfn_offset, ret ? " failed" : "");
}
return NOTIFY_OK;
 }
diff --git a/arch/sh/drivers/pci/pcie-sh7786.c 
b/arch/sh/drivers/pci/pcie-sh7786.c
index e0b568aaa701..2e832a5c58c1 100644
--- a/arch/sh/drivers/pci/pcie-sh7786.c
+++ b/arch/sh/drivers/pci/pcie-sh7786.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -487,7 +488,7 @@ int pcibios_map_platform_irq(const struct pci_dev *pdev, u8 
slot, u8 pin)
 
 void pcibios_bus_add_device(struct pci_dev *pdev)
 {
-   pdev->dev.dma_pfn_offset = dma_pfn_offset;
+   attach_uniform_dma_pfn_offset(&pdev->dev, dma_pfn_offset);
 }
 
 static int __init sh7786_pcie_core_init(void)
diff --git a/arch/sh/kernel/dma-coherent.c b/arch/sh/kernel/dma-coherent.c
index d4811691b93c..5fc9e358b6c7 100644
--- a/arch/sh/kernel/dma-coherent.c
+++ b/arch/sh/kernel/dma-coherent.c
@@ -14,6 +14,8 @@ void *arch_dma_alloc(struct device *dev, size_t size, 
dma_addr_t *dma_handle,
 {
void *ret, *ret_nocache;
int order = get_order(size);
+   unsigned long pfn;
+   phys_addr_t phys;
 
gfp |= __GFP_ZERO;
 
@@ -34,11 +36,14 @@ void *arch_dma_alloc(struct device *dev, size_t size, 
dma_addr_t *dma_handle,
return NULL;
}
 
-   split_page(pfn_to_page(virt_to_phys(ret) >> PAGE_SHIFT), order);
+   phys = virt_to_phys(ret);
+   pfn =  phys >> PAGE_SHIFT;
+   split_page(pfn_to_page(pfn), order);
 
-   *dma_handle =