Re: [PATCH v4 3/7] iommu/mediatek: Set MISC_CTRL register

2020-06-19 Thread Yong Wu
Hi Chao,

On Thu, 2020-06-18 at 19:49 +0800, chao hao wrote:

> On Wed, 2020-06-17 at 11:34 +0200, Matthias Brugger wrote:


[snip]


> > >  
> > >  #define REG_MMU_MISC_CTRL0x048
> > > +#define F_MMU_IN_ORDER_WR_EN (BIT(1) | BIT(17))
> > > +#define F_MMU_STANDARD_AXI_MODE_BIT  (BIT(3) | BIT(19))
> > > +
> > >  #define REG_MMU_DCM_DIS  0x050
> > >  
> > >  #define REG_MMU_CTRL_REG 0x110
> > > @@ -578,6 +581,14 @@ static int mtk_iommu_hw_init(const struct 
> > > mtk_iommu_data *data)
> > >   writel_relaxed(0, data->base + REG_MMU_MISC_CTRL);
> > >   }
> > >  
> > > + if (data->plat_data->has_misc_ctrl) {
> > 
> > That's confusing. We renamed the register to misc_ctrl, but it's present in 
> > all
> > SoCs. We should find a better name for this flag to describe what the 
> > hardware
> > supports.
> > 
> 
> ok, thanks for you advice, I will rename it in next version.
> ex:has_perf_req(has performance requirement)
> 
> 
> > Regards,
> > Matthias
> > 
> > > + /* For mm_iommu, it can improve performance by the setting */
> > > + regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL);
> > > + regval &= ~F_MMU_STANDARD_AXI_MODE_BIT;
> > > + regval &= ~F_MMU_IN_ORDER_WR_EN;


Note: mt2712 also is MISC_CTRL register, but it don't use this in_order
setting.

As commented in v3. 0x48 is either STANDARD_AXI_MODE or MISC_CTRL
register. No need two flags(reset_axi/has_xx) for it.

something like:

regval = readl_relaxed(data->base + REG_MMU_MISC_CTRL);
if (reset_axi) {
regval = 0;
} else {   /* MISC_CTRL */
if (!apu[1])
regval &= ~F_MMU_STANDARD_AXI_MODE_BIT;
if (out_order_en)
regval &= ~F_MMU_IN_ORDER_WR_EN;
}
writel_relaxed(regval, data->base + REG_MMU_MISC_CTRL);


[1] Your current patch doesn't support apu-iommu, thus, add it when
necessary.


> > > + writel_relaxed(regval, data->base + REG_MMU_MISC_CTRL);
> > > + }
> > > +
> > >   if (devm_request_irq(data->dev, data->irq, mtk_iommu_isr, 0,
> > >dev_name(data->dev), (void *)data)) {
> > >   writel_relaxed(0, data->base + REG_MMU_PT_BASE_ADDR);
> > > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> > > index 1b6ea839b92c..d711ac630037 100644
> > > --- a/drivers/iommu/mtk_iommu.h
> > > +++ b/drivers/iommu/mtk_iommu.h
> > > @@ -40,6 +40,7 @@ struct mtk_iommu_plat_data {
> > >  
> > >   /* HW will use the EMI clock if there isn't the "bclk". */
> > >   boolhas_bclk;
> > > + boolhas_misc_ctrl;
> > >   boolhas_vld_pa_rng;
> > >   boolreset_axi;
> > >   unsigned char   larbid_remap[MTK_LARB_NR_MAX];
> > > 
> 
> 


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: kernel BUG at mm/huge_memory.c:2613!

2020-06-19 Thread Roman Gushchin via iommu
On Fri, Jun 19, 2020 at 01:56:28PM -0700, David Rientjes wrote:
> On Fri, 19 Jun 2020, Roman Gushchin wrote:
> 
> > [   40.287524] BUG: unable to handle page fault for address: 
> > a77b833df000
> > [   40.287529] #PF: supervisor write access in kernel mode
> > [   40.287531] #PF: error_code(0x000b) - reserved bit violation
> > [   40.287532] PGD 40d14e067 P4D 40d14e067 PUD 40d14f067 PMD 3ec54d067
> > PTE 80001688033d9163
> > [   40.287538] Oops: 000b [#2] SMP NOPTI
> > [   40.287542] CPU: 9 PID: 1986 Comm: pulseaudio Tainted: G  D
> >   5.8.0-rc1+ #697
> > [   40.287544] Hardware name: Gigabyte Technology Co., Ltd.
> > AB350-Gaming/AB350-Gaming-CF, BIOS F25 01/16/2019
> > [   40.287550] RIP: 0010:__memset+0x24/0x30
> > [   40.287553] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89
> > d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48
> > 0f af c6  48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89
> > d1 f3
> > [   40.287556] RSP: 0018:a77b827a7e08 EFLAGS: 00010216
> > [   40.287558] RAX:  RBX: 90f77dced800 RCX: 
> > 08a0
> > [   40.287560] RDX:  RSI:  RDI: 
> > a77b833df000
> > [   40.287561] RBP: 90f7898c7000 R08: 90f78c507768 R09: 
> > a77b833df000
> > [   40.287563] R10: a77b833df000 R11: 90f7839f2d40 R12: 
> > 
> > [   40.287564] R13: 90f76a802e00 R14: c0cb8880 R15: 
> > 90f770f4e700
> > [   40.287567] FS:  7f3d8e8df880() GS:90f78ee4()
> > knlGS:
> > [   40.287569] CS:  0010 DS:  ES:  CR0: 80050033
> > [   40.287570] CR2: a77b833df000 CR3: 0003fa556000 CR4: 
> > 003406e0
> > [   40.287572] Call Trace:
> > [   40.287584]  snd_pcm_hw_params+0x3fd/0x490 [snd_pcm]
> > [   40.287593]  snd_pcm_common_ioctl+0x1c5/0x1110 [snd_pcm]
> > [   40.287601]  ? snd_pcm_info_user+0x64/0x80 [snd_pcm]
> > [   40.287608]  snd_pcm_ioctl+0x23/0x30 [snd_pcm]
> > [   40.287613]  ksys_ioctl+0x82/0xc0
> > [   40.287617]  __x64_sys_ioctl+0x16/0x20
> > [   40.287622]  do_syscall_64+0x4d/0x90
> > [   40.287627]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Hi Roman,
> 
> If you have CONFIG_AMD_MEM_ENCRYPT set, this should be resolved by
> 
> commit dbed452a078d56bc7f1abecc3edd6a75e8e4484e
> Author: David Rientjes 
> Date:   Thu Jun 11 00:25:57 2020 -0700
> 
> dma-pool: decouple DMA_REMAP from DMA_COHERENT_POOL
> 
> Or you might want to wait for 5.8-rc2 instead which includes this fix.
> 

Hello, David!

Thank you for pointing at it! Unfortunately, there must be something wrong
with drivers, your patch didn't help much. I still see the same stacktrace.

I'll try again after 5.8-rc2 will be out.

Thanks!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: kernel BUG at mm/huge_memory.c:2613!

2020-06-19 Thread David Rientjes via iommu
On Fri, 19 Jun 2020, Roman Gushchin wrote:

> [   40.287524] BUG: unable to handle page fault for address: a77b833df000
> [   40.287529] #PF: supervisor write access in kernel mode
> [   40.287531] #PF: error_code(0x000b) - reserved bit violation
> [   40.287532] PGD 40d14e067 P4D 40d14e067 PUD 40d14f067 PMD 3ec54d067
> PTE 80001688033d9163
> [   40.287538] Oops: 000b [#2] SMP NOPTI
> [   40.287542] CPU: 9 PID: 1986 Comm: pulseaudio Tainted: G  D
>   5.8.0-rc1+ #697
> [   40.287544] Hardware name: Gigabyte Technology Co., Ltd.
> AB350-Gaming/AB350-Gaming-CF, BIOS F25 01/16/2019
> [   40.287550] RIP: 0010:__memset+0x24/0x30
> [   40.287553] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89
> d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48
> 0f af c6  48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89
> d1 f3
> [   40.287556] RSP: 0018:a77b827a7e08 EFLAGS: 00010216
> [   40.287558] RAX:  RBX: 90f77dced800 RCX: 
> 08a0
> [   40.287560] RDX:  RSI:  RDI: 
> a77b833df000
> [   40.287561] RBP: 90f7898c7000 R08: 90f78c507768 R09: 
> a77b833df000
> [   40.287563] R10: a77b833df000 R11: 90f7839f2d40 R12: 
> 
> [   40.287564] R13: 90f76a802e00 R14: c0cb8880 R15: 
> 90f770f4e700
> [   40.287567] FS:  7f3d8e8df880() GS:90f78ee4()
> knlGS:
> [   40.287569] CS:  0010 DS:  ES:  CR0: 80050033
> [   40.287570] CR2: a77b833df000 CR3: 0003fa556000 CR4: 
> 003406e0
> [   40.287572] Call Trace:
> [   40.287584]  snd_pcm_hw_params+0x3fd/0x490 [snd_pcm]
> [   40.287593]  snd_pcm_common_ioctl+0x1c5/0x1110 [snd_pcm]
> [   40.287601]  ? snd_pcm_info_user+0x64/0x80 [snd_pcm]
> [   40.287608]  snd_pcm_ioctl+0x23/0x30 [snd_pcm]
> [   40.287613]  ksys_ioctl+0x82/0xc0
> [   40.287617]  __x64_sys_ioctl+0x16/0x20
> [   40.287622]  do_syscall_64+0x4d/0x90
> [   40.287627]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Hi Roman,

If you have CONFIG_AMD_MEM_ENCRYPT set, this should be resolved by

commit dbed452a078d56bc7f1abecc3edd6a75e8e4484e
Author: David Rientjes 
Date:   Thu Jun 11 00:25:57 2020 -0700

dma-pool: decouple DMA_REMAP from DMA_COHERENT_POOL

Or you might want to wait for 5.8-rc2 instead which includes this fix.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: kernel BUG at mm/huge_memory.c:2613!

2020-06-19 Thread Roman Gushchin via iommu
On Thu, Jun 18, 2020 at 10:40:26PM -0400, Andrea Arcangeli wrote:
> Hello,
> 
> On Thu, Jun 18, 2020 at 06:14:49PM -0700, Roman Gushchin wrote:
> > I agree. The whole
> > 
> > page = alloc_pages_node(nid, alloc_flags, order);
> > if (!page)
> > continue;
> > if (!order)
> > break;
> > if (!PageCompound(page)) {
> > split_page(page, order);
> > break;
> > } else if (!split_huge_page(page)) {
> > break;
> > }
> > 
> > looks very suspicious to me.
> > My wild guess is that gfp flags changed somewhere above, so we hit
> > the branch which was never hit before.
> 
> Right to be suspicious about the above: split_huge_page on a regular
> page allocated by a driver was never meant to work.
> 
> The PageLocked BUG_ON is just a symptom of a bigger issue, basically
> split_huge_page it may survive, but it'll stay compound and in turn it
> must be freed as compound.
> 
> The respective free method doesn't even contemplate freeing compound
> pages, the only way the free method can survive, is by removing
> __GFP_COMP forcefully in the allocation that was perhaps set here
> (there are that many __GFP_COMP in that directory):
> 
> static void snd_malloc_dev_pages(struct snd_dma_buffer *dmab, size_t size)
> {
>   gfp_t gfp_flags;
> 
>   gfp_flags = GFP_KERNEL
>   | __GFP_COMP/* compound page lets parts be mapped */
> 
> And I'm not sure what the comment means here, compound or non compound
> doesn't make a difference when you map it, it's not a THP, the
> mappings must be handled manually so nothing should check PG_compound
> anyway in the mapping code.
> 
> Something like this may improve things, it's an untested quick hack,
> but this assumes it's always a bug to setup a compound page for these
> DMA allocations and given the API it's probably a correct
> assumption.. Compound is slower, unless you need it, you can avoid it
> and then split_page will give contiguous memory page granular. Ideally
> the code shouldn't call split_page at all and it should free it all at
> once by keeping track of the order and by returning the order to the
> caller, something the API can't do right now as it returns a plain
> array that can only represent individual small pages.
> 
> Once this is resolved, you may want to check your config, iommu passthrough
> sounds more optimal for a soundcard.

It's based on the default Fedora 32 config + all defaults for the 5.6..5.8-rc1
difference. But thanks for the advice!

> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index f68a62c3c32b..3dfbc010fa83 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -499,6 +499,10 @@ static struct page **__iommu_dma_alloc_pages(struct 
> device *dev,
>  
>   /* IOMMU can map any pages, so himem can also be used here */
>   gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
> + if (unlikely(gfp & __GFP_COMP)) {
> + WARN();
> + gfp &= ~__GFP_COMP;
> + }
>  
>   while (count) {
>   struct page *page = NULL;
> @@ -522,13 +526,8 @@ static struct page **__iommu_dma_alloc_pages(struct 
> device *dev,
>   continue;
>   if (!order)
>   break;
> - if (!PageCompound(page)) {
> - split_page(page, order);
> - break;
> - } else if (!split_huge_page(page)) {
> - break;
> - }
> - __free_pages(page, order);
> + split_page(page, order);
> + break;
>   }
>   if (!page) {
>   __iommu_dma_free_pages(pages, i);
> diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c
> index 6850d13aa98c..378f5a36ec5f 100644
> --- a/sound/core/memalloc.c
> +++ b/sound/core/memalloc.c
> @@ -28,7 +28,6 @@ static void snd_malloc_dev_pages(struct snd_dma_buffer 
> *dmab, size_t size)
>   gfp_t gfp_flags;
>  
>   gfp_flags = GFP_KERNEL
> - | __GFP_COMP/* compound page lets parts be mapped */
>   | __GFP_NORETRY /* don't trigger OOM-killer */
>   | __GFP_NOWARN; /* no stack trace print - this call is 
> non-critical */
>   dmab->area = dma_alloc_coherent(dmab->dev.dev, size, >addr,
> 
> 

The patch looks very good to me, but unfortunately it seems to reveal
the next layer of problems:

[   23.864671] page:cd0a8fc3a3c0 refcount:0 mapcount:0
mapping: index:0x0
[   23.864674] flags: 0x17c800(arch_1)
[   23.864678] raw: 0017c800  cd0a8fc3a388

[   23.864680] raw:   

[   23.864705] page dumped because: VM_BUG_ON_PAGE(((unsigned int)
page_ref_count(page) + 127u <= 127u))
[   23.864771] [ cut 

Re: [PATCH v2 1/3] docs: IOMMU user API

2020-06-19 Thread Alex Williamson
On Fri, 19 Jun 2020 03:30:24 +
"Liu, Yi L"  wrote:

> Hi Alex,
> 
> > From: Alex Williamson 
> > Sent: Friday, June 19, 2020 10:55 AM
> > 
> > On Fri, 19 Jun 2020 02:15:36 +
> > "Liu, Yi L"  wrote:
> >   
> > > Hi Alex,
> > >  
> > > > From: Alex Williamson 
> > > > Sent: Friday, June 19, 2020 5:48 AM
> > > >
> > > > On Wed, 17 Jun 2020 08:28:24 +
> > > > "Tian, Kevin"  wrote:
> > > >  
> > > > > > From: Liu, Yi L 
> > > > > > Sent: Wednesday, June 17, 2020 2:20 PM
> > > > > >  
> > > > > > > From: Jacob Pan 
> > > > > > > Sent: Tuesday, June 16, 2020 11:22 PM
> > > > > > >
> > > > > > > On Thu, 11 Jun 2020 17:27:27 -0700 Jacob Pan
> > > > > > >  wrote:
> > > > > > >  
> > > > > > > > >
> > > > > > > > > But then I thought it even better if VFIO leaves the
> > > > > > > > > entire
> > > > > > > > > copy_from_user() to the layer consuming it.
> > > > > > > > >  
> > > > > > > > OK. Sounds good, that was what Kevin suggested also. I just
> > > > > > > > wasn't sure how much VFIO wants to inspect, I thought VFIO
> > > > > > > > layer wanted to do a sanity check.
> > > > > > > >
> > > > > > > > Anyway, I will move copy_from_user to iommu uapi layer.  
> > > > > > >
> > > > > > > Just one more point brought up by Yi when we discuss this offline.
> > > > > > >
> > > > > > > If we move copy_from_user to iommu uapi layer, then there will
> > > > > > > be  
> > > > > > multiple  
> > > > > > > copy_from_user calls for the same data when a VFIO container
> > > > > > > has  
> > > > > > multiple domains,  
> > > > > > > devices. For bind, it might be OK. But might be additional
> > > > > > > overhead for TLB  
> > > > > > flush  
> > > > > > > request from the guest.  
> > > > > >
> > > > > > I think it is the same with bind and TLB flush path. will be
> > > > > > multiple copy_from_user.  
> > > > >
> > > > > multiple copies is possibly fine. In reality we allow only one
> > > > > group per nesting container (as described in patch [03/15]), and
> > > > > usually there is just one SVA-capable device per group.
> > > > >  
> > > > > >
> > > > > > BTW. for moving data copy to iommy layer, there is another point
> > > > > > which need to consider. VFIO needs to do unbind in bind path if
> > > > > > bind failed, so it will assemble unbind_data and pass to iommu
> > > > > > layer. If iommu layer do the copy_from_user, I think it will be 
> > > > > > failed. any  
> > idea?  
> > > >
> > > > If a call into a UAPI fails, there should be nothing to undo.
> > > > Creating a partial setup for a failed call that needs to be undone
> > > > by the caller is not good practice.  
> > >
> > > is it still a problem if it's the VFIO to undo the partial setup
> > > before returning to user space?  
> > 
> > Yes.  If a UAPI function fails there should be no residual effect.  
> 
> ok. the iommu_sva_bind_gpasid() is per device call. There is no residual
> effect if it failed. so no partial setup will happen per device.
> 
> but VFIO needs to use iommu_group_for_each_dev() to do bind, so
> if iommu_group_for_each_dev() failed, I guess VFIO needs to undo
> the partial setup for the group. right?

Yes, each individual call should make no changes if it fails, but the
caller would need to unwind separate calls.  If this introduces too
much knowledge to the caller for the group case, maybe there should be
a group-level function in the iommu code to handle that.  Thanks,

Alex

> > > > > This might be mitigated if we go back to use the same bind_data
> > > > > for both bind/unbind. Then you can reuse the user object for 
> > > > > unwinding.
> > > > >
> > > > > However there is another case where VFIO may need to assemble the
> > > > > bind_data itself. When a VM is killed, VFIO needs to walk
> > > > > allocated PASIDs and unbind them one-by-one. In such case
> > > > > copy_from_user doesn't work since the data is created by kernel.
> > > > > Alex, do you have a suggestion how this usage can be supported?
> > > > > e.g. asking IOMMU driver to provide two sets of APIs to handle 
> > > > > user/kernel  
> > generated requests?  
> > > >
> > > > Yes, it seems like vfio would need to make use of a driver API to do
> > > > this, we shouldn't be faking a user buffer in the kernel in order to
> > > > call through to a UAPI.  Thanks,  
> > >
> > > ok, so if VFIO wants to issue unbind by itself, it should use an API
> > > which passes kernel buffer to IOMMU layer. If the unbind request is
> > > from user space, then VFIO should use another API which passes user
> > > buffer pointer to IOMMU layer. makes sense. will align with jacob.  
> > 
> > Sounds right to me.  Different approaches might be used for the driver API 
> > versus
> > the UAPI, perhaps there is no buffer.  Thanks,  
> 
> thanks for your coaching. It may require Jacob to add APIs in iommu layer
> for the two purposes.
> 
> Regards,
> Yi Liu
> 
> > Alex  
> 

___
iommu mailing list
iommu@lists.linux-foundation.org

Re: [PATCH 3/4] pci: acs: Enable PCI_ACS_TB for untrusted/external-facing devices

2020-06-19 Thread Raj, Ashok
Hi Rajat


On Mon, Jun 15, 2020 at 06:17:41PM -0700, Rajat Jain wrote:
> When enabling ACS, currently the bit "translation blocking" was
> not getting changed at all. Set it to disable translation blocking

Maybe you meant "enable translation blocking" ?

Keep the commit log simple:

When enabling ACS, enable translation blocking for external facing ports
and untrusted devices.

> too for all external facing or untrusted devices. This is OK
> because ATS is only allowed on internal devces.
> 
> Signed-off-by: Rajat Jain 
> ---
>  drivers/pci/pci.c|  4 
>  drivers/pci/quirks.c | 11 +++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index d2ff987585855..79853b52658a2 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3330,6 +3330,10 @@ static void pci_std_enable_acs(struct pci_dev *dev)
>   /* Upstream Forwarding */
>   ctrl |= (cap & PCI_ACS_UF);
>  
> + if (dev->external_facing || dev->untrusted)
> + /* Translation Blocking */
> + ctrl |= (cap & PCI_ACS_TB);
> +
>   pci_write_config_word(dev, pos + PCI_ACS_CTRL, ctrl);
>  }
>  
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index b341628e47527..6294adeac4049 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4934,6 +4934,13 @@ static void pci_quirk_enable_intel_rp_mpc_acs(struct 
> pci_dev *dev)
>   }
>  }
>  
> +/*
> + * Currently this quirk does the equivalent of
> + * PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF | PCI_ACS_SV
> + *
> + * Currently missing, it also needs to do equivalent of PCI_ACS_TB,
> + * if dev->external_facing || dev->untrusted
> + */
>  static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev)
>  {
>   if (!pci_quirk_intel_pch_acs_match(dev))
> @@ -4973,6 +4980,10 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct 
> pci_dev *dev)
>   ctrl |= (cap & PCI_ACS_CR);
>   ctrl |= (cap & PCI_ACS_UF);
>  
> + if (dev->external_facing || dev->untrusted)
> + /* Translation Blocking */
> + ctrl |= (cap & PCI_ACS_TB);
> +
>   pci_write_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, ctrl);
>  
>   pci_info(dev, "Intel SPT PCH root port ACS workaround enabled\n");
> -- 
> 2.27.0.290.gba653c62da-goog
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v7 13/36] drm: msm: fix common struct sg_table related issues

2020-06-19 Thread Rob Clark
On Fri, Jun 19, 2020 at 3:37 AM Marek Szyprowski
 wrote:
>
> The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
> returns the number of the created entries in the DMA address space.
> However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
> dma_unmap_sg must be called with the original number of the entries
> passed to the dma_map_sg().
>
> struct sg_table is a common structure used for describing a non-contiguous
> memory buffer, used commonly in the DRM and graphics subsystems. It
> consists of a scatterlist with memory pages and DMA addresses (sgl entry),
> as well as the number of scatterlist entries: CPU pages (orig_nents entry)
> and DMA mapped pages (nents entry).
>
> It turned out that it was a common mistake to misuse nents and orig_nents
> entries, calling DMA-mapping functions with a wrong number of entries or
> ignoring the number of mapped entries returned by the dma_map_sg()
> function.
>
> To avoid such issues, lets use a common dma-mapping wrappers operating
> directly on the struct sg_table objects and use scatterlist page
> iterators where possible. This, almost always, hides references to the
> nents and orig_nents entries, making the code robust, easier to follow
> and copy/paste safe.
>
> Signed-off-by: Marek Szyprowski 

Acked-by: Rob Clark 

(let me know if you want me to take this one in via msm-next or if the
plan is to take the series via drm-misc)


> ---
>  drivers/gpu/drm/msm/msm_gem.c| 13 +
>  drivers/gpu/drm/msm/msm_gpummu.c | 14 ++
>  drivers/gpu/drm/msm/msm_iommu.c  |  2 +-
>  3 files changed, 12 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> index 38b0c0e1f83e..e0d5fd36ea8f 100644
> --- a/drivers/gpu/drm/msm/msm_gem.c
> +++ b/drivers/gpu/drm/msm/msm_gem.c
> @@ -53,11 +53,10 @@ static void sync_for_device(struct msm_gem_object 
> *msm_obj)
> struct device *dev = msm_obj->base.dev->dev;
>
> if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
> -   dma_sync_sg_for_device(dev, msm_obj->sgt->sgl,
> -   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
> +   dma_sync_sgtable_for_device(dev, msm_obj->sgt,
> +   DMA_BIDIRECTIONAL);
> } else {
> -   dma_map_sg(dev, msm_obj->sgt->sgl,
> -   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
> +   dma_map_sgtable(dev, msm_obj->sgt, DMA_BIDIRECTIONAL, 0);
> }
>  }
>
> @@ -66,11 +65,9 @@ static void sync_for_cpu(struct msm_gem_object *msm_obj)
> struct device *dev = msm_obj->base.dev->dev;
>
> if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
> -   dma_sync_sg_for_cpu(dev, msm_obj->sgt->sgl,
> -   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
> +   dma_sync_sgtable_for_cpu(dev, msm_obj->sgt, 
> DMA_BIDIRECTIONAL);
> } else {
> -   dma_unmap_sg(dev, msm_obj->sgt->sgl,
> -   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
> +   dma_unmap_sgtable(dev, msm_obj->sgt, DMA_BIDIRECTIONAL, 0);
> }
>  }
>
> diff --git a/drivers/gpu/drm/msm/msm_gpummu.c 
> b/drivers/gpu/drm/msm/msm_gpummu.c
> index 310a31b05faa..319f06c28235 100644
> --- a/drivers/gpu/drm/msm/msm_gpummu.c
> +++ b/drivers/gpu/drm/msm/msm_gpummu.c
> @@ -30,21 +30,19 @@ static int msm_gpummu_map(struct msm_mmu *mmu, uint64_t 
> iova,
>  {
> struct msm_gpummu *gpummu = to_msm_gpummu(mmu);
> unsigned idx = (iova - GPUMMU_VA_START) / GPUMMU_PAGE_SIZE;
> -   struct scatterlist *sg;
> +   struct sg_dma_page_iter dma_iter;
> unsigned prot_bits = 0;
> -   unsigned i, j;
>
> if (prot & IOMMU_WRITE)
> prot_bits |= 1;
> if (prot & IOMMU_READ)
> prot_bits |= 2;
>
> -   for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> -   dma_addr_t addr = sg->dma_address;
> -   for (j = 0; j < sg->length / GPUMMU_PAGE_SIZE; j++, idx++) {
> -   gpummu->table[idx] = addr | prot_bits;
> -   addr += GPUMMU_PAGE_SIZE;
> -   }
> +   for_each_sgtable_dma_page(sgt, _iter, 0) {
> +   dma_addr_t addr = sg_page_iter_dma_address(_iter);
> +
> +   BUILD_BUG_ON(GPUMMU_PAGE_SIZE != PAGE_SIZE);
> +   gpummu->table[idx++] = addr | prot_bits;
> }
>
> /* we can improve by deferring flush for multiple map() */
> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> index 3a381a9674c9..6c31e65834c6 100644
> --- a/drivers/gpu/drm/msm/msm_iommu.c
> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> @@ -36,7 +36,7 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
> struct msm_iommu *iommu = to_msm_iommu(mmu);
> size_t ret;
>
> -   ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, 

Re: [PATCH v4 6/7] iommu/mediatek: Add REG_MMU_WR_LEN definition preparing for mt6779

2020-06-19 Thread chao hao
On Wed, 2020-06-17 at 11:22 +0200, Matthias Brugger wrote:
> 
> On 17/06/2020 05:00, Chao Hao wrote:
> > Some platforms(ex: mt6779) have a new register called by REG_MMU_WR_LEN
> > to improve performance.
> > This patch add this register definition.
> 
> Please be more specific what this register is about.
> 
OK. thanks.
We can use "has_wr_len" flag to control whether we need to set the
register. If the register uses default value, iommu will send command to
EMI without restriction, when the number of commands become more and
more, it will drop the EMI performance. So when more than
ten_commands(default value) don't be handled for EMI, IOMMU will stop
send command to EMI for keeping EMI's performace by enabling write
throttling mechanism(bit[5][21]=0) in MMU_WR_LEN_CTRL register.

I will write description above to commit message in next version

> > 
> > Signed-off-by: Chao Hao 
> > ---
> >  drivers/iommu/mtk_iommu.c | 10 ++
> >  drivers/iommu/mtk_iommu.h |  2 ++
> >  2 files changed, 12 insertions(+)
> > 
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index a687e8db0e51..c706bca6487e 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -46,6 +46,8 @@
> >  #define F_MMU_STANDARD_AXI_MODE_BIT(BIT(3) | BIT(19))
> >  
> >  #define REG_MMU_DCM_DIS0x050
> > +#define REG_MMU_WR_LEN 0x054
> > +#define F_MMU_WR_THROT_DIS_BIT (BIT(5) |  BIT(21))
> >  
> >  #define REG_MMU_CTRL_REG   0x110
> >  #define F_MMU_TF_PROT_TO_PROGRAM_ADDR  (2 << 4)
> > @@ -581,6 +583,12 @@ static int mtk_iommu_hw_init(const struct 
> > mtk_iommu_data *data)
> > writel_relaxed(regval, data->base + REG_MMU_VLD_PA_RNG);
> > }
> > writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
> > +   if (data->plat_data->has_wr_len) {
> > +   /* write command throttling mode */
> > +   regval = readl_relaxed(data->base + REG_MMU_WR_LEN);
> > +   regval &= ~F_MMU_WR_THROT_DIS_BIT;
> > +   writel_relaxed(regval, data->base + REG_MMU_WR_LEN);
> > +   }
> >  
> > if (data->plat_data->reset_axi) {
> > /* The register is called STANDARD_AXI_MODE in this case */
> > @@ -737,6 +745,7 @@ static int __maybe_unused mtk_iommu_suspend(struct 
> > device *dev)
> > struct mtk_iommu_suspend_reg *reg = >reg;
> > void __iomem *base = data->base;
> >  
> > +   reg->wr_len = readl_relaxed(base + REG_MMU_WR_LEN);
> 
> Can we read/write the register without any side effect although hardware has 
> not
> implemented it (!has_wr_len)?

It doesn't have side effect. Becasue all the MTK platform have the
register for iommu HW. If we need to have requirement for performance,
we can set it by has_wr_len.
But I'm Sorry, the name of flag(has_wr_len) is not exact, I will rename
it in next version, ex: "wr_throt_en"

> 
> 
> > reg->misc_ctrl = readl_relaxed(base + REG_MMU_MISC_CTRL);
> > reg->dcm_dis = readl_relaxed(base + REG_MMU_DCM_DIS);
> > reg->ctrl_reg = readl_relaxed(base + REG_MMU_CTRL_REG);
> > @@ -761,6 +770,7 @@ static int __maybe_unused mtk_iommu_resume(struct 
> > device *dev)
> > dev_err(data->dev, "Failed to enable clk(%d) in resume\n", ret);
> > return ret;
> > }
> > +   writel_relaxed(reg->wr_len, base + REG_MMU_WR_LEN);
> > writel_relaxed(reg->misc_ctrl, base + REG_MMU_MISC_CTRL);
> > writel_relaxed(reg->dcm_dis, base + REG_MMU_DCM_DIS);
> > writel_relaxed(reg->ctrl_reg, base + REG_MMU_CTRL_REG);
> > diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> > index d51ff99c2c71..9971cedd72ea 100644
> > --- a/drivers/iommu/mtk_iommu.h
> > +++ b/drivers/iommu/mtk_iommu.h
> > @@ -25,6 +25,7 @@ struct mtk_iommu_suspend_reg {
> > u32 int_main_control;
> > u32 ivrp_paddr;
> > u32 vld_pa_rng;
> > +   u32 wr_len;
> >  };
> >  
> >  enum mtk_iommu_plat {
> > @@ -43,6 +44,7 @@ struct mtk_iommu_plat_data {
> > boolhas_misc_ctrl;
> > boolhas_sub_comm;
> > boolhas_vld_pa_rng;
> > +   boolhas_wr_len;
> 
> Given the fact that we are adding more and more plat_data bool values, I think
> it would make sense to use a u32 flags register and add the appropriate macro
> definitions to set and check for a flag present.

Thanks for your advice.
do you mean like this:
struct plat_flag {

#define  HAS_4GB_MODE   BIT(0)
#define  HAS_BCLK   BIT(1)
#define  REST_AXI   BIT(2)
... ...

u32 flag;
};

struct mtk_iommu_plat_data {
..
struct plat_flag flag;
..
};


> Regards,
> Matthias
> 
> > boolreset_axi;
> > u32 inv_sel_reg;
> > unsigned char   

Re: [PATCH v4 7/7] iommu/mediatek: Add mt6779 basic support

2020-06-19 Thread chao hao
On Thu, 2020-06-18 at 18:00 +0200, Matthias Brugger wrote:
> 
> On 18/06/2020 13:54, chao hao wrote:
> > On Wed, 2020-06-17 at 11:33 +0200, Matthias Brugger wrote:
> >>
> >> On 17/06/2020 05:00, Chao Hao wrote:
> >>> 1. Start from mt6779, INVLDT_SEL move to offset=0x2c, so we add
> >>>REG_MMU_INV_SEL_GEN2 definition and mt6779 uses it.
> >>> 2. Change PROTECT_PA_ALIGN from 128 byte to 256 byte.
> >>> 3. For REG_MMU_CTRL_REG register, we only need to change bit[2:0],
> >>>others bits keep default value, ex: enable victim tlb.
> >>> 4. Add mt6779_data to support mm_iommu HW init.
> >>>
> >>> Change since v3:
> >>> 1. When setting MMU_CTRL_REG, we don't need to include mt8173.
> >>>
> >>> Cc: Yong Wu 
> >>> Signed-off-by: Chao Hao 
> >>> ---
> >>>  drivers/iommu/mtk_iommu.c | 20 ++--
> >>>  drivers/iommu/mtk_iommu.h |  1 +
> >>>  2 files changed, 19 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> >>> index c706bca6487e..def2e996683f 100644
> >>> --- a/drivers/iommu/mtk_iommu.c
> >>> +++ b/drivers/iommu/mtk_iommu.c
> >>> @@ -37,6 +37,11 @@
> >>>  #define REG_MMU_INVLD_START_A0x024
> >>>  #define REG_MMU_INVLD_END_A  0x028
> >>>  
> >>> +/* In latest Coda, MMU_INV_SEL's offset is changed to 0x02c.
> >>> + * So we named offset = 0x02c to "REG_MMU_INV_SEL_GEN2"
> >>> + * and offset = 0x038 to "REG_MMU_INV_SEL_GEN1".
> >>> + */
> >>
> >> Please delete the comment, this should be understandable from the git 
> >> history
> > 
> > ok, thanks
> > 
> >>
> >>> +#define REG_MMU_INV_SEL_GEN2 0x02c
> >>>  #define REG_MMU_INV_SEL_GEN1 0x038
> >>>  #define F_INVLD_EN0  BIT(0)
> >>>  #define F_INVLD_EN1  BIT(1)
> >>> @@ -98,7 +103,7 @@
> >>>  #define F_MMU_INT_ID_LARB_ID(a)  (((a) >> 7) & 0x7)
> >>>  #define F_MMU_INT_ID_PORT_ID(a)  (((a) >> 2) & 0x1f)
> >>>  
> >>> -#define MTK_PROTECT_PA_ALIGN 128
> >>> +#define MTK_PROTECT_PA_ALIGN 256
> >>
> >> Do we need 512 bytes for all gen2 IOMMUs?
> >> I'm not sure if we should add this in plat_data or if we should just bump 
> >> up the
> >> value for all SoCs.
> >> In both cases this should be a separate patch.
> >>
> > From mt6779, MTK_PROTECT_PA_ALIGN is extend to 256 bytes and don't be
> > changed for a long time from our HW designer comment. The legacy iommu
> > also can use it, mabye it doesn't set it by platform.
> > 
> 
> Ok then just bump it to 256 in a new patch. Thanks for clarification.

  Ok, thanks

> > 
> >>>  
> >>>  /*
> >>>   * Get the local arbiter ID and the portid within the larb arbiter
> >>> @@ -543,11 +548,12 @@ static int mtk_iommu_hw_init(const struct 
> >>> mtk_iommu_data *data)
> >>>   return ret;
> >>>   }
> >>>  
> >>> + regval = readl_relaxed(data->base + REG_MMU_CTRL_REG);
> >>>   if (data->plat_data->m4u_plat == M4U_MT8173)
> >>>   regval = F_MMU_PREFETCH_RT_REPLACE_MOD |
> >>>F_MMU_TF_PROT_TO_PROGRAM_ADDR_MT8173;
> >>>   else
> >>> - regval = F_MMU_TF_PROT_TO_PROGRAM_ADDR;
> >>> + regval |= F_MMU_TF_PROT_TO_PROGRAM_ADDR;
> >>
> >> Why do we change this, is it that the bootloader for mt6779 set some 
> >> values in
> >> the register we have to keep? In this case I think we should update the 
> >> regval
> >> accordingly.
> > 
> > For REG_MMU_CTRL_REG, bit[12] represents victim_tlb_en feature and
> > victim_tlb is enable defaultly(bit[12]=1),but if we use "regval =
> > F_MMU_TF_PROT_TO_PROGRAM_ADDR", victim_tlb will disable, it will drop
> > iommu performace for mt6779
> > 
> 
> Got it. Please put that in a separate patch then.
> 
  Ok, thanks

> Regards,
> Matthias
> 
> > 
> >>
> >>>   writel_relaxed(regval, data->base + REG_MMU_CTRL_REG);
> >>>  
> >>>   regval = F_L2_MULIT_HIT_EN |
> >>> @@ -797,6 +803,15 @@ static const struct mtk_iommu_plat_data mt2712_data 
> >>> = {
> >>>   .larbid_remap   = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}},
> >>>  };
> >>>  
> >>> +static const struct mtk_iommu_plat_data mt6779_data = {
> >>> + .m4u_plat  = M4U_MT6779,
> >>> + .has_sub_comm  = true,
> >>> + .has_wr_len= true,
> >>> + .has_misc_ctrl = true,
> >>> + .inv_sel_reg   = REG_MMU_INV_SEL_GEN2,
> >>> + .larbid_remap  = {{0}, {1}, {2}, {3}, {5}, {7, 8}, {10}, {9}},
> >>> +};
> >>> +
> >>>  static const struct mtk_iommu_plat_data mt8173_data = {
> >>>   .m4u_plat = M4U_MT8173,
> >>>   .has_4gb_mode = true,
> >>> @@ -815,6 +830,7 @@ static const struct mtk_iommu_plat_data mt8183_data = 
> >>> {
> >>>  
> >>>  static const struct of_device_id mtk_iommu_of_ids[] = {
> >>>   { .compatible = "mediatek,mt2712-m4u", .data = _data},
> >>> + { .compatible = "mediatek,mt6779-m4u", .data = _data},
> >>>   { .compatible = "mediatek,mt8173-m4u", .data = _data},
> >>>   { .compatible = "mediatek,mt8183-m4u", .data = 

[PATCH v7 27/36] drm: rcar-du: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

dma_map_sgtable() function returns zero or an error code, so adjust the
return value check for the vsp1_du_map_sg() function.

Signed-off-by: Marek Szyprowski 
Reviewed-by: Laurent Pinchart 
---
 drivers/gpu/drm/rcar-du/rcar_du_vsp.c  | 3 +--
 drivers/media/platform/vsp1/vsp1_drm.c | 8 
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/rcar-du/rcar_du_vsp.c 
b/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
index f1a81c9b184d..a27bff999649 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
@@ -197,9 +197,8 @@ int rcar_du_vsp_map_fb(struct rcar_du_vsp *vsp, struct 
drm_framebuffer *fb,
goto fail;
 
ret = vsp1_du_map_sg(vsp->vsp, sgt);
-   if (!ret) {
+   if (ret) {
sg_free_table(sgt);
-   ret = -ENOMEM;
goto fail;
}
}
diff --git a/drivers/media/platform/vsp1/vsp1_drm.c 
b/drivers/media/platform/vsp1/vsp1_drm.c
index a4a45d68a6ef..86d5e3f4b1ff 100644
--- a/drivers/media/platform/vsp1/vsp1_drm.c
+++ b/drivers/media/platform/vsp1/vsp1_drm.c
@@ -912,8 +912,8 @@ int vsp1_du_map_sg(struct device *dev, struct sg_table *sgt)
 * skip cache sync. This will need to be revisited when support for
 * non-coherent buffers will be added to the DU driver.
 */
-   return dma_map_sg_attrs(vsp1->bus_master, sgt->sgl, sgt->nents,
-   DMA_TO_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
+   return dma_map_sgtable(vsp1->bus_master, sgt, DMA_TO_DEVICE,
+  DMA_ATTR_SKIP_CPU_SYNC);
 }
 EXPORT_SYMBOL_GPL(vsp1_du_map_sg);
 
@@ -921,8 +921,8 @@ void vsp1_du_unmap_sg(struct device *dev, struct sg_table 
*sgt)
 {
struct vsp1_device *vsp1 = dev_get_drvdata(dev);
 
-   dma_unmap_sg_attrs(vsp1->bus_master, sgt->sgl, sgt->nents,
-  DMA_TO_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
+   dma_unmap_sgtable(vsp1->bus_master, sgt, DMA_TO_DEVICE,
+ DMA_ATTR_SKIP_CPU_SYNC);
 }
 EXPORT_SYMBOL_GPL(vsp1_du_unmap_sg);
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 26/36] drm: host1x: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/host1x/job.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index a10643aa89aa..4832b57f10c4 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -166,11 +166,9 @@ static unsigned int pin_job(struct host1x *host, struct 
host1x_job *job)
goto unpin;
}
 
-   err = dma_map_sg(dev, sgt->sgl, sgt->nents, dir);
-   if (!err) {
-   err = -ENOMEM;
+   err = dma_map_sgtable(dev, sgt, dir, 0);
+   if (err)
goto unpin;
-   }
 
job->unpins[job->num_unpins].dev = dev;
job->unpins[job->num_unpins].dir = dir;
@@ -217,7 +215,7 @@ static unsigned int pin_job(struct host1x *host, struct 
host1x_job *job)
}
 
if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && host->domain) {
-   for_each_sg(sgt->sgl, sg, sgt->nents, j)
+   for_each_sgtable_sg(sgt, sg, j)
gather_size += sg->length;
gather_size = iova_align(>iova, gather_size);
 
@@ -229,9 +227,9 @@ static unsigned int pin_job(struct host1x *host, struct 
host1x_job *job)
goto unpin;
}
 
-   err = iommu_map_sg(host->domain,
+   err = iommu_map_sgtable(host->domain,
iova_dma_addr(>iova, alloc),
-   sgt->sgl, sgt->nents, IOMMU_READ);
+   sgt, IOMMU_READ);
if (err == 0) {
__free_iova(>iova, alloc);
err = -EINVAL;
@@ -241,12 +239,9 @@ static unsigned int pin_job(struct host1x *host, struct 
host1x_job *job)
job->unpins[job->num_unpins].size = gather_size;
phys_addr = iova_dma_addr(>iova, alloc);
} else if (sgt) {
-   err = dma_map_sg(host->dev, sgt->sgl, sgt->nents,
-DMA_TO_DEVICE);
-   if (!err) {
-   err = -ENOMEM;
+   err = dma_map_sgtable(host->dev, sgt, DMA_TO_DEVICE, 0);
+   if (err)
goto unpin;
-   }
 
job->unpins[job->num_unpins].dir = DMA_TO_DEVICE;
job->unpins[job->num_unpins].dev = host->dev;
@@ -647,8 +642,7 @@ void host1x_job_unpin(struct host1x_job *job)
}
 
if (unpin->dev && sgt)
-   dma_unmap_sg(unpin->dev, sgt->sgl, sgt->nents,
-unpin->dir);
+   dma_unmap_sgtable(unpin->dev, sgt, unpin->dir, 0);
 
host1x_bo_unpin(dev, unpin->bo, sgt);
host1x_bo_put(unpin->bo);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 34/36] samples: vfio-mdev/mbochs: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

While touching this code, also add missing call to dma_unmap_sgtable.

Signed-off-by: Marek Szyprowski 
---
 samples/vfio-mdev/mbochs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index 3cc5e5921682..e03068917273 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -846,7 +846,7 @@ static struct sg_table *mbochs_map_dmabuf(struct 
dma_buf_attachment *at,
if (sg_alloc_table_from_pages(sg, dmabuf->pages, dmabuf->pagecount,
  0, dmabuf->mode.size, GFP_KERNEL) < 0)
goto err2;
-   if (!dma_map_sg(at->dev, sg->sgl, sg->nents, direction))
+   if (dma_map_sgtable(at->dev, sg, direction, 0))
goto err3;
 
return sg;
@@ -868,6 +868,7 @@ static void mbochs_unmap_dmabuf(struct dma_buf_attachment 
*at,
 
dev_dbg(dev, "%s: %d\n", __func__, dmabuf->id);
 
+   dma_unmap_sgtable(at->dev, sg, direction, 0);
sg_free_table(sg);
kfree(sg);
 }
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 24/36] drm: xen: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

Fix the code to refer to proper nents or orig_nents entries. This driver
reports the number of the pages in the imported scatterlist, so it should
refer to sg_table->orig_nents entry.

Signed-off-by: Marek Szyprowski 
Acked-by: Oleksandr Andrushchenko 
---
 drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c 
b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index f0b85e094111..ba4bdc5590ea 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -215,7 +215,7 @@ xen_drm_front_gem_import_sg_table(struct drm_device *dev,
return ERR_PTR(ret);
 
DRM_DEBUG("Imported buffer of size %zu with nents %u\n",
- size, sgt->nents);
+ size, sgt->orig_nents);
 
return _obj->base;
 }
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 12/36] drm: mediatek: use common helper for extracting pages array

2020-06-19 Thread Marek Szyprowski
Use common helper for converting a sg_table object into struct
page pointer array.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/mediatek/mtk_drm_gem.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c 
b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
index 3654ec732029..0583e557ad37 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
@@ -233,9 +233,7 @@ void *mtk_drm_gem_prime_vmap(struct drm_gem_object *obj)
 {
struct mtk_drm_gem_obj *mtk_gem = to_mtk_gem_obj(obj);
struct sg_table *sgt;
-   struct sg_page_iter iter;
unsigned int npages;
-   unsigned int i = 0;
 
if (mtk_gem->kvaddr)
return mtk_gem->kvaddr;
@@ -249,11 +247,8 @@ void *mtk_drm_gem_prime_vmap(struct drm_gem_object *obj)
if (!mtk_gem->pages)
goto out;
 
-   for_each_sg_page(sgt->sgl, , sgt->orig_nents, 0) {
-   mtk_gem->pages[i++] = sg_page_iter_page();
-   if (i > npages)
-   break;
-   }
+   drm_prime_sg_to_page_addr_arrays(sgt, mtk_gem->pages, NULL, npages);
+
mtk_gem->kvaddr = vmap(mtk_gem->pages, npages, VM_MAP,
   pgprot_writecombine(PAGE_KERNEL));
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 36/36] videobuf2: use sgtable-based scatterlist wrappers

2020-06-19 Thread Marek Szyprowski
Use recently introduced common wrappers operating directly on the struct
sg_table objects and scatterlist page iterators to make the code a bit
more compact, robust, easier to follow and copy/paste safe.

No functional change, because the code already properly did all the
scaterlist related calls.

Signed-off-by: Marek Szyprowski 
---
 .../common/videobuf2/videobuf2-dma-contig.c   | 34 ---
 .../media/common/videobuf2/videobuf2-dma-sg.c | 32 +++--
 .../common/videobuf2/videobuf2-vmalloc.c  | 12 +++
 3 files changed, 31 insertions(+), 47 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c 
b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index f4b4a7c135eb..0a16a85f0284 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -53,10 +53,10 @@ static unsigned long vb2_dc_get_contiguous_size(struct 
sg_table *sgt)
unsigned int i;
unsigned long size = 0;
 
-   for_each_sg(sgt->sgl, s, sgt->nents, i) {
+   for_each_sgtable_dma_sg(sgt, s, i) {
if (sg_dma_address(s) != expected)
break;
-   expected = sg_dma_address(s) + sg_dma_len(s);
+   expected += sg_dma_len(s);
size += sg_dma_len(s);
}
return size;
@@ -99,8 +99,7 @@ static void vb2_dc_prepare(void *buf_priv)
if (!sgt || buf->db_attach)
return;
 
-   dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
-  buf->dma_dir);
+   dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
 }
 
 static void vb2_dc_finish(void *buf_priv)
@@ -112,7 +111,7 @@ static void vb2_dc_finish(void *buf_priv)
if (!sgt || buf->db_attach)
return;
 
-   dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
+   dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
 }
 
 /*/
@@ -273,8 +272,8 @@ static void vb2_dc_dmabuf_ops_detach(struct dma_buf *dbuf,
 * memory locations do not require any explicit cache
 * maintenance prior or after being used by the device.
 */
-   dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
-  attach->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+   dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir,
+ DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sgt);
kfree(attach);
db_attach->priv = NULL;
@@ -299,8 +298,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map(
 
/* release any previous cache */
if (attach->dma_dir != DMA_NONE) {
-   dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
-  attach->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+   dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir,
+ DMA_ATTR_SKIP_CPU_SYNC);
attach->dma_dir = DMA_NONE;
}
 
@@ -308,9 +307,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map(
 * mapping to the client with new direction, no cache sync
 * required see comment in vb2_dc_dmabuf_ops_detach()
 */
-   sgt->nents = dma_map_sg_attrs(db_attach->dev, sgt->sgl, sgt->orig_nents,
- dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
-   if (!sgt->nents) {
+   if (dma_map_sgtable(db_attach->dev, sgt, dma_dir,
+   DMA_ATTR_SKIP_CPU_SYNC)) {
pr_err("failed to map scatterlist\n");
mutex_unlock(lock);
return ERR_PTR(-EIO);
@@ -423,8 +421,8 @@ static void vb2_dc_put_userptr(void *buf_priv)
 * No need to sync to CPU, it's already synced to the CPU
 * since the finish() memop will have been called before this.
 */
-   dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
-  buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+   dma_unmap_sgtable(buf->dev, sgt, buf->dma_dir,
+ DMA_ATTR_SKIP_CPU_SYNC);
pages = frame_vector_pages(buf->vec);
/* sgt should exist only if vector contains pages... */
BUG_ON(IS_ERR(pages));
@@ -521,9 +519,8 @@ static void *vb2_dc_get_userptr(struct device *dev, 
unsigned long vaddr,
 * No need to sync to the device, this will happen later when the
 * prepare() memop is called.
 */
-   sgt->nents = dma_map_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
- buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
-   if (sgt->nents <= 0) {
+   if (dma_map_sgtable(buf->dev, sgt, buf->dma_dir,
+   DMA_ATTR_SKIP_CPU_SYNC)) {

[PATCH v7 29/36] staging: ion: remove dead code

2020-06-19 Thread Marek Szyprowski
ion_heap_pages_zero() function is not used at all, so remove it to
simplify the ion_heap_sglist_zero() function later.

Signed-off-by: Marek Szyprowski 
---
 drivers/staging/android/ion/ion.h  | 1 -
 drivers/staging/android/ion/ion_heap.c | 9 -
 2 files changed, 10 deletions(-)

diff --git a/drivers/staging/android/ion/ion.h 
b/drivers/staging/android/ion/ion.h
index 74914a266e25..c199e88afc6c 100644
--- a/drivers/staging/android/ion/ion.h
+++ b/drivers/staging/android/ion/ion.h
@@ -177,7 +177,6 @@ void ion_heap_unmap_kernel(struct ion_heap *heap, struct 
ion_buffer *buffer);
 int ion_heap_map_user(struct ion_heap *heap, struct ion_buffer *buffer,
  struct vm_area_struct *vma);
 int ion_heap_buffer_zero(struct ion_buffer *buffer);
-int ion_heap_pages_zero(struct page *page, size_t size, pgprot_t pgprot);
 
 /**
  * ion_heap_init_shrinker
diff --git a/drivers/staging/android/ion/ion_heap.c 
b/drivers/staging/android/ion/ion_heap.c
index 0755b11348ed..9c23b2382a1e 100644
--- a/drivers/staging/android/ion/ion_heap.c
+++ b/drivers/staging/android/ion/ion_heap.c
@@ -145,15 +145,6 @@ int ion_heap_buffer_zero(struct ion_buffer *buffer)
return ion_heap_sglist_zero(table->sgl, table->nents, pgprot);
 }
 
-int ion_heap_pages_zero(struct page *page, size_t size, pgprot_t pgprot)
-{
-   struct scatterlist sg;
-
-   sg_init_table(, 1);
-   sg_set_page(, page, size, 0);
-   return ion_heap_sglist_zero(, 1, pgprot);
-}
-
 void ion_heap_freelist_add(struct ion_heap *heap, struct ion_buffer *buffer)
 {
spin_lock(>free_lock);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 21/36] drm: v3d: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
Reviewed-by: Eric Anholt 
---
 drivers/gpu/drm/v3d/v3d_mmu.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_mmu.c b/drivers/gpu/drm/v3d/v3d_mmu.c
index 3b81ea28c0bb..5a453532901f 100644
--- a/drivers/gpu/drm/v3d/v3d_mmu.c
+++ b/drivers/gpu/drm/v3d/v3d_mmu.c
@@ -90,18 +90,17 @@ void v3d_mmu_insert_ptes(struct v3d_bo *bo)
struct v3d_dev *v3d = to_v3d_dev(shmem_obj->base.dev);
u32 page = bo->node.start;
u32 page_prot = V3D_PTE_WRITEABLE | V3D_PTE_VALID;
-   unsigned int count;
-   struct scatterlist *sgl;
+   struct sg_dma_page_iter dma_iter;
 
-   for_each_sg(shmem_obj->sgt->sgl, sgl, shmem_obj->sgt->nents, count) {
-   u32 page_address = sg_dma_address(sgl) >> V3D_MMU_PAGE_SHIFT;
+   for_each_sgtable_dma_page(shmem_obj->sgt, _iter, 0) {
+   dma_addr_t dma_addr = sg_page_iter_dma_address(_iter);
+   u32 page_address = dma_addr >> V3D_MMU_PAGE_SHIFT;
u32 pte = page_prot | page_address;
u32 i;
 
-   BUG_ON(page_address + (sg_dma_len(sgl) >> V3D_MMU_PAGE_SHIFT) >=
+   BUG_ON(page_address + (PAGE_SIZE >> V3D_MMU_PAGE_SHIFT) >=
   BIT(24));
-
-   for (i = 0; i < sg_dma_len(sgl) >> V3D_MMU_PAGE_SHIFT; i++)
+   for (i = 0; i < PAGE_SIZE >> V3D_MMU_PAGE_SHIFT; i++)
v3d->pt[page++] = pte + i;
}
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 32/36] misc: fastrpc: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
 drivers/misc/fastrpc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c
index 7939c55daceb..9d6867749316 100644
--- a/drivers/misc/fastrpc.c
+++ b/drivers/misc/fastrpc.c
@@ -518,7 +518,7 @@ fastrpc_map_dma_buf(struct dma_buf_attachment *attachment,
 
table = >sgt;
 
-   if (!dma_map_sg(attachment->dev, table->sgl, table->nents, dir))
+   if (!dma_map_sgtable(attachment->dev, table, dir, 0))
return ERR_PTR(-ENOMEM);
 
return table;
@@ -528,7 +528,7 @@ static void fastrpc_unmap_dma_buf(struct dma_buf_attachment 
*attach,
  struct sg_table *table,
  enum dma_data_direction dir)
 {
-   dma_unmap_sg(attach->dev, table->sgl, table->nents, dir);
+   dma_unmap_sgtable(attach->dev, table, dir, 0);
 }
 
 static void fastrpc_release(struct dma_buf *dmabuf)
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 22/36] drm: virtio: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
Acked-by: Gerd Hoffmann 
---
 drivers/gpu/drm/virtio/virtgpu_object.c | 36 ++---
 drivers/gpu/drm/virtio/virtgpu_vq.c | 12 -
 2 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c 
b/drivers/gpu/drm/virtio/virtgpu_object.c
index 346cef5ce251..c3b9e3cf786e 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -72,9 +72,8 @@ void virtio_gpu_cleanup_object(struct virtio_gpu_object *bo)
 
if (shmem->pages) {
if (shmem->mapped) {
-   dma_unmap_sg(vgdev->vdev->dev.parent,
-shmem->pages->sgl, shmem->mapped,
-DMA_TO_DEVICE);
+   dma_unmap_sgtable(vgdev->vdev->dev.parent,
+shmem->pages, DMA_TO_DEVICE, 0);
shmem->mapped = 0;
}
 
@@ -157,13 +156,13 @@ static int virtio_gpu_object_shmem_init(struct 
virtio_gpu_device *vgdev,
}
 
if (use_dma_api) {
-   shmem->mapped = dma_map_sg(vgdev->vdev->dev.parent,
-  shmem->pages->sgl,
-  shmem->pages->nents,
-  DMA_TO_DEVICE);
-   *nents = shmem->mapped;
+   ret = dma_map_sgtable(vgdev->vdev->dev.parent,
+ shmem->pages, DMA_TO_DEVICE, 0);
+   if (ret)
+   return ret;
+   *nents = shmem->mapped = shmem->pages->nents;
} else {
-   *nents = shmem->pages->nents;
+   *nents = shmem->pages->orig_nents;
}
 
*ents = kmalloc_array(*nents, sizeof(struct virtio_gpu_mem_entry),
@@ -173,13 +172,20 @@ static int virtio_gpu_object_shmem_init(struct 
virtio_gpu_device *vgdev,
return -ENOMEM;
}
 
-   for_each_sg(shmem->pages->sgl, sg, *nents, si) {
-   (*ents)[si].addr = cpu_to_le64(use_dma_api
-  ? sg_dma_address(sg)
-  : sg_phys(sg));
-   (*ents)[si].length = cpu_to_le32(sg->length);
-   (*ents)[si].padding = 0;
+   if (use_dma_api) {
+   for_each_sgtable_dma_sg(shmem->pages, sg, si) {
+   (*ents)[si].addr = cpu_to_le64(sg_dma_address(sg));
+   (*ents)[si].length = cpu_to_le32(sg_dma_len(sg));
+   (*ents)[si].padding = 0;
+   }
+   } else {
+   for_each_sgtable_sg(shmem->pages, sg, si) {
+   (*ents)[si].addr = cpu_to_le64(sg_phys(sg));
+   (*ents)[si].length = cpu_to_le32(sg->length);
+   (*ents)[si].padding = 0;
+   }
}
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index 9e663a5d9952..e5765db85c20 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -302,7 +302,7 @@ static struct sg_table *vmalloc_to_sgt(char *data, uint32_t 
size, int *sg_ents)
return NULL;
}
 
-   for_each_sg(sgt->sgl, sg, *sg_ents, i) {
+   for_each_sgtable_sg(sgt, sg, i) {
pg = vmalloc_to_page(data);
if (!pg) {
sg_free_table(sgt);
@@ -603,9 +603,8 @@ void virtio_gpu_cmd_transfer_to_host_2d(struct 
virtio_gpu_device *vgdev,
struct virtio_gpu_object_shmem *shmem = to_virtio_gpu_shmem(bo);
 
if (use_dma_api)
-  

[PATCH v7 16/36] drm: panfrost: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
Reviewed-by: Steven Price 
Reviewed-by: Rob Herring 
---
 drivers/gpu/drm/panfrost/panfrost_gem.c | 4 ++--
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 7 +++
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.c 
b/drivers/gpu/drm/panfrost/panfrost_gem.c
index ac5d0aa80276..ba8450ea04d0 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -41,8 +41,8 @@ static void panfrost_gem_free_object(struct drm_gem_object 
*obj)
 
for (i = 0; i < n_sgt; i++) {
if (bo->sgts[i].sgl) {
-   dma_unmap_sg(pfdev->dev, bo->sgts[i].sgl,
-bo->sgts[i].nents, 
DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(pfdev->dev, >sgts[i],
+ DMA_BIDIRECTIONAL, 0);
sg_free_table(>sgts[i]);
}
}
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 0a339c6fbfaa..fd294f6a7d3b 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -253,7 +253,7 @@ static int mmu_map_sg(struct panfrost_device *pfdev, struct 
panfrost_mmu *mmu,
struct io_pgtable_ops *ops = mmu->pgtbl_ops;
u64 start_iova = iova;
 
-   for_each_sg(sgt->sgl, sgl, sgt->nents, count) {
+   for_each_sgtable_dma_sg(sgt, sgl, count) {
unsigned long paddr = sg_dma_address(sgl);
size_t len = sg_dma_len(sgl);
 
@@ -517,10 +517,9 @@ static int panfrost_mmu_map_fault_addr(struct 
panfrost_device *pfdev, int as,
if (ret)
goto err_pages;
 
-   if (!dma_map_sg(pfdev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL)) {
-   ret = -EINVAL;
+   ret = dma_map_sgtable(pfdev->dev, sgt, DMA_BIDIRECTIONAL, 0);
+   if (ret)
goto err_map;
-   }
 
mmu_map_sg(pfdev, bomapping->mmu, addr,
   IOMMU_WRITE | IOMMU_READ | IOMMU_NOEXEC, sgt);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 30/36] staging: ion: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
 drivers/staging/android/ion/ion.c | 25 +--
 drivers/staging/android/ion/ion_heap.c| 44 ++-
 drivers/staging/android/ion/ion_system_heap.c |  2 +-
 3 files changed, 25 insertions(+), 46 deletions(-)

diff --git a/drivers/staging/android/ion/ion.c 
b/drivers/staging/android/ion/ion.c
index 38b51eace4f9..3c9f09506ffa 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -147,14 +147,14 @@ static struct sg_table *dup_sg_table(struct sg_table 
*table)
if (!new_table)
return ERR_PTR(-ENOMEM);
 
-   ret = sg_alloc_table(new_table, table->nents, GFP_KERNEL);
+   ret = sg_alloc_table(new_table, table->orig_nents, GFP_KERNEL);
if (ret) {
kfree(new_table);
return ERR_PTR(-ENOMEM);
}
 
new_sg = new_table->sgl;
-   for_each_sg(table->sgl, sg, table->nents, i) {
+   for_each_sgtable_sg(table, sg, i) {
memcpy(new_sg, sg, sizeof(*sg));
new_sg->dma_address = 0;
new_sg = sg_next(new_sg);
@@ -224,12 +224,13 @@ static struct sg_table *ion_map_dma_buf(struct 
dma_buf_attachment *attachment,
 {
struct ion_dma_buf_attachment *a = attachment->priv;
struct sg_table *table;
+   int ret;
 
table = a->table;
 
-   if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
-   direction))
-   return ERR_PTR(-ENOMEM);
+   ret = dma_map_sgtable(attachment->dev, table, direction, 0);
+   if (ret)
+   return ERR_PTR(ret);
 
return table;
 }
@@ -238,7 +239,7 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment 
*attachment,
  struct sg_table *table,
  enum dma_data_direction direction)
 {
-   dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
+   dma_unmap_sgtable(attachment->dev, table, direction, 0);
 }
 
 static int ion_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
@@ -296,10 +297,8 @@ static int ion_dma_buf_begin_cpu_access(struct dma_buf 
*dmabuf,
}
 
mutex_lock(>lock);
-   list_for_each_entry(a, >attachments, list) {
-   dma_sync_sg_for_cpu(a->dev, a->table->sgl, a->table->nents,
-   direction);
-   }
+   list_for_each_entry(a, >attachments, list)
+   dma_sync_sgtable_for_cpu(a->dev, a->table, direction);
 
 unlock:
mutex_unlock(>lock);
@@ -319,10 +318,8 @@ static int ion_dma_buf_end_cpu_access(struct dma_buf 
*dmabuf,
}
 
mutex_lock(>lock);
-   list_for_each_entry(a, >attachments, list) {
-   dma_sync_sg_for_device(a->dev, a->table->sgl, a->table->nents,
-  direction);
-   }
+   list_for_each_entry(a, >attachments, list)
+   dma_sync_sgtable_for_device(a->dev, a->table, direction);
mutex_unlock(>lock);
 
return 0;
diff --git a/drivers/staging/android/ion/ion_heap.c 
b/drivers/staging/android/ion/ion_heap.c
index 9c23b2382a1e..79f27949edda 100644
--- a/drivers/staging/android/ion/ion_heap.c
+++ b/drivers/staging/android/ion/ion_heap.c
@@ -20,8 +20,7 @@
 void *ion_heap_map_kernel(struct ion_heap *heap,
  struct ion_buffer *buffer)
 {
-   struct scatterlist *sg;
-   int i, j;
+   struct sg_page_iter piter;
void *vaddr;
pgprot_t pgprot;
struct sg_table *table = buffer->sg_table;
@@ -38,14 +37,11 @@ void *ion_heap_map_kernel(struct ion_heap *heap,
else
pgprot = pgprot_writecombine(PAGE_KERNEL);
 
-   for_each_sg(table->sgl, sg, table->nents, i) {
-   int 

[PATCH v7 31/36] staging: tegra-vde: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
Reviewed-by: Dmitry Osipenko 
---
 drivers/staging/media/tegra-vde/iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/media/tegra-vde/iommu.c 
b/drivers/staging/media/tegra-vde/iommu.c
index 6af863d92123..adf8dc7ee25c 100644
--- a/drivers/staging/media/tegra-vde/iommu.c
+++ b/drivers/staging/media/tegra-vde/iommu.c
@@ -36,8 +36,8 @@ int tegra_vde_iommu_map(struct tegra_vde *vde,
 
addr = iova_dma_addr(>iova, iova);
 
-   size = iommu_map_sg(vde->domain, addr, sgt->sgl, sgt->nents,
-   IOMMU_READ | IOMMU_WRITE);
+   size = iommu_map_sgtable(vde->domain, addr, sgt,
+IOMMU_READ | IOMMU_WRITE);
if (!size) {
__free_iova(>iova, iova);
return -ENXIO;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 19/36] drm: rockchip: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 23 +
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c 
b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
index 2970e534e2bb..cb50f2ba2e46 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
@@ -36,8 +36,8 @@ static int rockchip_gem_iommu_map(struct rockchip_gem_object 
*rk_obj)
 
rk_obj->dma_addr = rk_obj->mm.start;
 
-   ret = iommu_map_sg(private->domain, rk_obj->dma_addr, rk_obj->sgt->sgl,
-  rk_obj->sgt->nents, prot);
+   ret = iommu_map_sgtable(private->domain, rk_obj->dma_addr, rk_obj->sgt,
+   prot);
if (ret < rk_obj->base.size) {
DRM_ERROR("failed to map buffer: size=%zd request_size=%zd\n",
  ret, rk_obj->base.size);
@@ -98,11 +98,10 @@ static int rockchip_gem_get_pages(struct 
rockchip_gem_object *rk_obj)
 * TODO: Replace this by drm_clflush_sg() once it can be implemented
 * without relying on symbols that are not exported.
 */
-   for_each_sg(rk_obj->sgt->sgl, s, rk_obj->sgt->nents, i)
+   for_each_sgtable_sg(rk_obj->sgt, s, i)
sg_dma_address(s) = sg_phys(s);
 
-   dma_sync_sg_for_device(drm->dev, rk_obj->sgt->sgl, rk_obj->sgt->nents,
-  DMA_TO_DEVICE);
+   dma_sync_sgtable_for_device(drm->dev, rk_obj->sgt, DMA_TO_DEVICE);
 
return 0;
 
@@ -350,8 +349,8 @@ void rockchip_gem_free_object(struct drm_gem_object *obj)
if (private->domain) {
rockchip_gem_iommu_unmap(rk_obj);
} else {
-   dma_unmap_sg(drm->dev, rk_obj->sgt->sgl,
-rk_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(drm->dev, rk_obj->sgt,
+ DMA_BIDIRECTIONAL, 0);
}
drm_prime_gem_destroy(obj, rk_obj->sgt);
} else {
@@ -476,15 +475,13 @@ rockchip_gem_dma_map_sg(struct drm_device *drm,
struct sg_table *sg,
struct rockchip_gem_object *rk_obj)
 {
-   int count = dma_map_sg(drm->dev, sg->sgl, sg->nents,
-  DMA_BIDIRECTIONAL);
-   if (!count)
-   return -EINVAL;
+   int err = dma_map_sgtable(drm->dev, sg, DMA_BIDIRECTIONAL, 0);
+   if (err)
+   return err;
 
if (drm_prime_get_contiguous_size(sg) < attach->dmabuf->size) {
DRM_ERROR("failed to map sg_table to contiguous linear 
address.\n");
-   dma_unmap_sg(drm->dev, sg->sgl, sg->nents,
-DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(drm->dev, sg, DMA_BIDIRECTIONAL, 0);
return -EINVAL;
}
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 33/36] rapidio: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
 drivers/rapidio/devices/rio_mport_cdev.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/rapidio/devices/rio_mport_cdev.c 
b/drivers/rapidio/devices/rio_mport_cdev.c
index 451608e960a1..98c572627c8c 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -573,8 +573,7 @@ static void dma_req_free(struct kref *ref)
refcount);
struct mport_cdev_priv *priv = req->priv;
 
-   dma_unmap_sg(req->dmach->device->dev,
-req->sgt.sgl, req->sgt.nents, req->dir);
+   dma_unmap_sgtable(req->dmach->device->dev, >sgt, req->dir, 0);
sg_free_table(>sgt);
if (req->page_list) {
unpin_user_pages(req->page_list, req->nr_pages);
@@ -930,9 +929,8 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
xfer->offset, xfer->length);
}
 
-   nents = dma_map_sg(chan->device->dev,
-  req->sgt.sgl, req->sgt.nents, dir);
-   if (nents == 0) {
+   ret = dma_map_sgtable(chan->device->dev, >sgt, dir, 0);
+   if (ret) {
rmcd_error("Failed to map SG list");
ret = -EFAULT;
goto err_pg;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 35/36] media: pci: fix common ALSA DMA-mapping related codes

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
numer of the created entries in the DMA address space. However the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski 
---
 drivers/media/pci/cx23885/cx23885-alsa.c | 2 +-
 drivers/media/pci/cx25821/cx25821-alsa.c | 2 +-
 drivers/media/pci/cx88/cx88-alsa.c   | 2 +-
 drivers/media/pci/saa7134/saa7134-alsa.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/pci/cx23885/cx23885-alsa.c 
b/drivers/media/pci/cx23885/cx23885-alsa.c
index df44ed7393a0..3f366e4e4685 100644
--- a/drivers/media/pci/cx23885/cx23885-alsa.c
+++ b/drivers/media/pci/cx23885/cx23885-alsa.c
@@ -129,7 +129,7 @@ static int cx23885_alsa_dma_unmap(struct cx23885_audio_dev 
*dev)
if (!buf->sglen)
return 0;
 
-   dma_unmap_sg(>pci->dev, buf->sglist, buf->sglen, 
PCI_DMA_FROMDEVICE);
+   dma_unmap_sg(>pci->dev, buf->sglist, buf->nr_pages, 
PCI_DMA_FROMDEVICE);
buf->sglen = 0;
return 0;
 }
diff --git a/drivers/media/pci/cx25821/cx25821-alsa.c 
b/drivers/media/pci/cx25821/cx25821-alsa.c
index 301616426d8a..c40304d33776 100644
--- a/drivers/media/pci/cx25821/cx25821-alsa.c
+++ b/drivers/media/pci/cx25821/cx25821-alsa.c
@@ -193,7 +193,7 @@ static int cx25821_alsa_dma_unmap(struct cx25821_audio_dev 
*dev)
if (!buf->sglen)
return 0;
 
-   dma_unmap_sg(>pci->dev, buf->sglist, buf->sglen, 
PCI_DMA_FROMDEVICE);
+   dma_unmap_sg(>pci->dev, buf->sglist, buf->nr_pages, 
PCI_DMA_FROMDEVICE);
buf->sglen = 0;
return 0;
 }
diff --git a/drivers/media/pci/cx88/cx88-alsa.c 
b/drivers/media/pci/cx88/cx88-alsa.c
index 7d7aceecc985..3c6fe6ceb0b7 100644
--- a/drivers/media/pci/cx88/cx88-alsa.c
+++ b/drivers/media/pci/cx88/cx88-alsa.c
@@ -332,7 +332,7 @@ static int cx88_alsa_dma_unmap(struct cx88_audio_dev *dev)
if (!buf->sglen)
return 0;
 
-   dma_unmap_sg(>pci->dev, buf->sglist, buf->sglen,
+   dma_unmap_sg(>pci->dev, buf->sglist, buf->nr_pages,
 PCI_DMA_FROMDEVICE);
buf->sglen = 0;
return 0;
diff --git a/drivers/media/pci/saa7134/saa7134-alsa.c 
b/drivers/media/pci/saa7134/saa7134-alsa.c
index 544ca57eee75..398c47ff473d 100644
--- a/drivers/media/pci/saa7134/saa7134-alsa.c
+++ b/drivers/media/pci/saa7134/saa7134-alsa.c
@@ -313,7 +313,7 @@ static int saa7134_alsa_dma_unmap(struct saa7134_dev *dev)
if (!dma->sglen)
return 0;
 
-   dma_unmap_sg(>pci->dev, dma->sglist, dma->sglen, 
PCI_DMA_FROMDEVICE);
+   dma_unmap_sg(>pci->dev, dma->sglist, dma->nr_pages, 
PCI_DMA_FROMDEVICE);
dma->sglen = 0;
return 0;
 }
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 28/36] dmabuf: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
Acked-by: Gerd Hoffmann 
---
 drivers/dma-buf/heaps/heap-helpers.c | 13 ++---
 drivers/dma-buf/udmabuf.c|  7 +++
 2 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/dma-buf/heaps/heap-helpers.c 
b/drivers/dma-buf/heaps/heap-helpers.c
index 9f964ca3f59c..d0696cf937af 100644
--- a/drivers/dma-buf/heaps/heap-helpers.c
+++ b/drivers/dma-buf/heaps/heap-helpers.c
@@ -140,13 +140,12 @@ struct sg_table *dma_heap_map_dma_buf(struct 
dma_buf_attachment *attachment,
  enum dma_data_direction direction)
 {
struct dma_heaps_attachment *a = attachment->priv;
-   struct sg_table *table;
-
-   table = >table;
+   struct sg_table *table = >table;
+   int ret;
 
-   if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
-   direction))
-   table = ERR_PTR(-ENOMEM);
+   ret = dma_map_sgtable(attachment->dev, table, direction, 0);
+   if (ret)
+   table = ERR_PTR(ret);
return table;
 }
 
@@ -154,7 +153,7 @@ static void dma_heap_unmap_dma_buf(struct 
dma_buf_attachment *attachment,
   struct sg_table *table,
   enum dma_data_direction direction)
 {
-   dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
+   dma_unmap_sgtable(attachment->dev, table, direction, 0);
 }
 
 static vm_fault_t dma_heap_vm_fault(struct vm_fault *vmf)
diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
index acb26c627d27..89e293bd9252 100644
--- a/drivers/dma-buf/udmabuf.c
+++ b/drivers/dma-buf/udmabuf.c
@@ -63,10 +63,9 @@ static struct sg_table *get_sg_table(struct device *dev, 
struct dma_buf *buf,
GFP_KERNEL);
if (ret < 0)
goto err;
-   if (!dma_map_sg(dev, sg->sgl, sg->nents, direction)) {
-   ret = -EINVAL;
+   ret = dma_map_sgtable(dev, sg, direction, 0);
+   if (ret < 0)
goto err;
-   }
return sg;
 
 err:
@@ -78,7 +77,7 @@ static struct sg_table *get_sg_table(struct device *dev, 
struct dma_buf *buf,
 static void put_sg_table(struct device *dev, struct sg_table *sg,
 enum dma_data_direction direction)
 {
-   dma_unmap_sg(dev, sg->sgl, sg->nents, direction);
+   dma_unmap_sgtable(dev, sg, direction, 0);
sg_free_table(sg);
kfree(sg);
 }
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 23/36] drm: vmwgfx: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
Acked-by: Roland Scheidegger 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 17 -
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
index bf0bc4697959..861c383c9780 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
@@ -362,8 +362,7 @@ static void vmw_ttm_unmap_from_dma(struct vmw_ttm_tt 
*vmw_tt)
 {
struct device *dev = vmw_tt->dev_priv->dev->dev;
 
-   dma_unmap_sg(dev, vmw_tt->sgt.sgl, vmw_tt->sgt.nents,
-   DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(dev, _tt->sgt, DMA_BIDIRECTIONAL, 0);
vmw_tt->sgt.nents = vmw_tt->sgt.orig_nents;
 }
 
@@ -383,16 +382,8 @@ static void vmw_ttm_unmap_from_dma(struct vmw_ttm_tt 
*vmw_tt)
 static int vmw_ttm_map_for_dma(struct vmw_ttm_tt *vmw_tt)
 {
struct device *dev = vmw_tt->dev_priv->dev->dev;
-   int ret;
-
-   ret = dma_map_sg(dev, vmw_tt->sgt.sgl, vmw_tt->sgt.orig_nents,
-DMA_BIDIRECTIONAL);
-   if (unlikely(ret == 0))
-   return -ENOMEM;
 
-   vmw_tt->sgt.nents = ret;
-
-   return 0;
+   return dma_map_sgtable(dev, _tt->sgt, DMA_BIDIRECTIONAL, 0);
 }
 
 /**
@@ -449,10 +440,10 @@ static int vmw_ttm_map_dma(struct vmw_ttm_tt *vmw_tt)
if (unlikely(ret != 0))
goto out_sg_alloc_fail;
 
-   if (vsgt->num_pages > vmw_tt->sgt.nents) {
+   if (vsgt->num_pages > vmw_tt->sgt.orig_nents) {
uint64_t over_alloc =
sgl_size * (vsgt->num_pages -
-   vmw_tt->sgt.nents);
+   vmw_tt->sgt.orig_nents);
 
ttm_mem_global_free(glob, over_alloc);
vmw_tt->sg_alloc_size -= over_alloc;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 25/36] xen: gntdev: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
Acked-by: Juergen Gross 
---
 drivers/xen/gntdev-dmabuf.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c
index 75d3bb948bf3..ba6cad871568 100644
--- a/drivers/xen/gntdev-dmabuf.c
+++ b/drivers/xen/gntdev-dmabuf.c
@@ -247,10 +247,9 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf,
 
if (sgt) {
if (gntdev_dmabuf_attach->dir != DMA_NONE)
-   dma_unmap_sg_attrs(attach->dev, sgt->sgl,
-  sgt->nents,
-  gntdev_dmabuf_attach->dir,
-  DMA_ATTR_SKIP_CPU_SYNC);
+   dma_unmap_sgtable(attach->dev, sgt,
+ gntdev_dmabuf_attach->dir,
+ DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sgt);
}
 
@@ -288,8 +287,8 @@ dmabuf_exp_ops_map_dma_buf(struct dma_buf_attachment 
*attach,
sgt = dmabuf_pages_to_sgt(gntdev_dmabuf->pages,
  gntdev_dmabuf->nr_pages);
if (!IS_ERR(sgt)) {
-   if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC)) {
+   if (dma_map_sgtable(attach->dev, sgt, dir,
+   DMA_ATTR_SKIP_CPU_SYNC)) {
sg_free_table(sgt);
kfree(sgt);
sgt = ERR_PTR(-ENOMEM);
@@ -625,7 +624,7 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct 
device *dev,
 
/* Now convert sgt to array of pages and check for page validity. */
i = 0;
-   for_each_sg_page(sgt->sgl, _iter, sgt->nents, 0) {
+   for_each_sgtable_page(sgt, _iter, 0) {
struct page *page = sg_page_iter_page(_iter);
/*
 * Check if page is valid: this can happen if we are given
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 17/36] drm: radeon: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/radeon/radeon_ttm.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
b/drivers/gpu/drm/radeon/radeon_ttm.c
index 5d50c9edbe80..0e3eb0d22831 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -481,7 +481,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 {
struct radeon_device *rdev = radeon_get_rdev(ttm->bdev);
struct radeon_ttm_tt *gtt = (void *)ttm;
-   unsigned pinned = 0, nents;
+   unsigned pinned = 0;
int r;
 
int write = !(gtt->userflags & RADEON_GEM_USERPTR_READONLY);
@@ -521,9 +521,8 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
if (r)
goto release_sg;
 
-   r = -ENOMEM;
-   nents = dma_map_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-   if (nents == 0)
+   r = dma_map_sgtable(rdev->dev, ttm->sg, direction, 0);
+   if (r)
goto release_sg;
 
drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
@@ -554,9 +553,9 @@ static void radeon_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
return;
 
/* free the sg table and pages again */
-   dma_unmap_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
+   dma_unmap_sgtable(rdev->dev, ttm->sg, direction, 0);
 
-   for_each_sg_page(ttm->sg->sgl, _iter, ttm->sg->nents, 0) {
+   for_each_sgtable_page(ttm->sg, _iter, 0) {
struct page *page = sg_page_iter_page(_iter);
if (!(gtt->userflags & RADEON_GEM_USERPTR_READONLY))
set_page_dirty(page);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v7 02/36] drm: prime: use sgtable iterators in drm_prime_sg_to_page_addr_arrays()

2020-06-19 Thread Marek Szyprowski
Replace the current hand-crafted code for extracting pages and DMA
addresses from the given scatterlist by the much more robust
code based on the generic scatterlist iterators and recently
introduced sg_table-based wrappers. The resulting code is simple and
easy to understand, so the comment describing the old code is no
longer needed.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/drm_prime.c | 49 -
 1 file changed, 15 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 226cd6ad3985..b717e52e909e 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -990,45 +990,26 @@ EXPORT_SYMBOL(drm_gem_prime_import);
 int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages,
 dma_addr_t *addrs, int max_entries)
 {
-   unsigned count;
-   struct scatterlist *sg;
-   struct page *page;
-   u32 page_len, page_index;
-   dma_addr_t addr;
-   u32 dma_len, dma_index;
-
-   /*
-* Scatterlist elements contains both pages and DMA addresses, but
-* one shoud not assume 1:1 relation between them. The sg->length is
-* the size of the physical memory chunk described by the sg->page,
-* while sg_dma_len(sg) is the size of the DMA (IO virtual) chunk
-* described by the sg_dma_address(sg).
-*/
-   page_index = 0;
-   dma_index = 0;
-   for_each_sg(sgt->sgl, sg, sgt->nents, count) {
-   page_len = sg->length;
-   page = sg_page(sg);
-   dma_len = sg_dma_len(sg);
-   addr = sg_dma_address(sg);
-
-   while (pages && page_len > 0) {
-   if (WARN_ON(page_index >= max_entries))
+   struct sg_dma_page_iter dma_iter;
+   struct sg_page_iter page_iter;
+   struct page **p = pages;
+   dma_addr_t *a = addrs;
+
+   if (pages) {
+   for_each_sgtable_page(sgt, _iter, 0) {
+   if (p - pages >= max_entries)
return -1;
-   pages[page_index] = page;
-   page++;
-   page_len -= PAGE_SIZE;
-   page_index++;
+   *p++ = sg_page_iter_page(_iter);
}
-   while (addrs && dma_len > 0) {
-   if (WARN_ON(dma_index >= max_entries))
+   }
+   if (addrs) {
+   for_each_sgtable_dma_page(sgt, _iter, 0) {
+   if (a - addrs >= max_entries)
return -1;
-   addrs[dma_index] = addr;
-   addr += PAGE_SIZE;
-   dma_len -= PAGE_SIZE;
-   dma_index++;
+   *a++ = sg_page_iter_dma_address(_iter);
}
}
+
return 0;
 }
 EXPORT_SYMBOL(drm_prime_sg_to_page_addr_arrays);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 05/36] drm: armada: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/armada/armada_gem.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/armada/armada_gem.c 
b/drivers/gpu/drm/armada/armada_gem.c
index 8005614d2e6b..bedd8937d8a1 100644
--- a/drivers/gpu/drm/armada/armada_gem.c
+++ b/drivers/gpu/drm/armada/armada_gem.c
@@ -395,7 +395,7 @@ armada_gem_prime_map_dma_buf(struct dma_buf_attachment 
*attach,
 
mapping = dobj->obj.filp->f_mapping;
 
-   for_each_sg(sgt->sgl, sg, count, i) {
+   for_each_sgtable_sg(sgt, sg, i) {
struct page *page;
 
page = shmem_read_mapping_page(mapping, i);
@@ -407,8 +407,8 @@ armada_gem_prime_map_dma_buf(struct dma_buf_attachment 
*attach,
sg_set_page(sg, page, PAGE_SIZE, 0);
}
 
-   if (dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir) == 0) {
-   num = sgt->nents;
+   if (dma_map_sgtable(attach->dev, sgt, dir, 0)) {
+   num = count;
goto release;
}
} else if (dobj->page) {
@@ -418,7 +418,7 @@ armada_gem_prime_map_dma_buf(struct dma_buf_attachment 
*attach,
 
sg_set_page(sgt->sgl, dobj->page, dobj->obj.size, 0);
 
-   if (dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir) == 0)
+   if (dma_map_sgtable(attach->dev, sgt, dir, 0))
goto free_table;
} else if (dobj->linear) {
/* Single contiguous physical region - no struct page */
@@ -449,11 +449,11 @@ static void armada_gem_prime_unmap_dma_buf(struct 
dma_buf_attachment *attach,
int i;
 
if (!dobj->linear)
-   dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
+   dma_unmap_sgtable(attach->dev, sgt, dir, 0);
 
if (dobj->obj.filp) {
struct scatterlist *sg;
-   for_each_sg(sgt->sgl, sg, sgt->nents, i)
+   for_each_sgtable_sg(sgt, sg, i)
put_page(sg_page(sg));
}
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 07/36] drm: exynos: use common helper for a scatterlist contiguity check

2020-06-19 Thread Marek Szyprowski
Use common helper for checking the contiguity of the imported dma-buf.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/exynos/exynos_drm_gem.c | 23 +++
 1 file changed, 3 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c 
b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index efa476858db5..1716a023bca0 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -431,27 +431,10 @@ exynos_drm_gem_prime_import_sg_table(struct drm_device 
*dev,
 {
struct exynos_drm_gem *exynos_gem;
 
-   if (sgt->nents < 1)
+   /* check if the entries in the sg_table are contiguous */
+   if (drm_prime_get_contiguous_size(sgt) < attach->dmabuf->size) {
+   DRM_ERROR("buffer chunks must be mapped contiguously");
return ERR_PTR(-EINVAL);
-
-   /*
-* Check if the provided buffer has been mapped as contiguous
-* into DMA address space.
-*/
-   if (sgt->nents > 1) {
-   dma_addr_t next_addr = sg_dma_address(sgt->sgl);
-   struct scatterlist *s;
-   unsigned int i;
-
-   for_each_sg(sgt->sgl, s, sgt->nents, i) {
-   if (!sg_dma_len(s))
-   break;
-   if (sg_dma_address(s) != next_addr) {
-   DRM_ERROR("buffer chunks must be mapped 
contiguously");
-   return ERR_PTR(-EINVAL);
-   }
-   next_addr = sg_dma_address(s) + sg_dma_len(s);
-   }
}
 
exynos_gem = exynos_drm_gem_init(dev, attach->dmabuf->size);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 10/36] drm: lima: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
Reviewed-by: Qiang Yu 
---
 drivers/gpu/drm/lima/lima_gem.c | 11 ---
 drivers/gpu/drm/lima/lima_vm.c  |  5 ++---
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index 155f2b4b4030..11223fe348df 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -69,8 +69,7 @@ int lima_heap_alloc(struct lima_bo *bo, struct lima_vm *vm)
return ret;
 
if (bo->base.sgt) {
-   dma_unmap_sg(dev, bo->base.sgt->sgl,
-bo->base.sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(dev, bo->base.sgt, DMA_BIDIRECTIONAL, 0);
sg_free_table(bo->base.sgt);
} else {
bo->base.sgt = kmalloc(sizeof(*bo->base.sgt), GFP_KERNEL);
@@ -80,7 +79,13 @@ int lima_heap_alloc(struct lima_bo *bo, struct lima_vm *vm)
}
}
 
-   dma_map_sg(dev, sgt.sgl, sgt.nents, DMA_BIDIRECTIONAL);
+   ret = dma_map_sgtable(dev, , DMA_BIDIRECTIONAL, 0);
+   if (ret) {
+   sg_free_table();
+   kfree(bo->base.sgt);
+   bo->base.sgt = NULL;
+   return ret;
+   }
 
*bo->base.sgt = sgt;
 
diff --git a/drivers/gpu/drm/lima/lima_vm.c b/drivers/gpu/drm/lima/lima_vm.c
index 5b92fb82674a..2b2739adc7f5 100644
--- a/drivers/gpu/drm/lima/lima_vm.c
+++ b/drivers/gpu/drm/lima/lima_vm.c
@@ -124,7 +124,7 @@ int lima_vm_bo_add(struct lima_vm *vm, struct lima_bo *bo, 
bool create)
if (err)
goto err_out1;
 
-   for_each_sg_dma_page(bo->base.sgt->sgl, _iter, bo->base.sgt->nents, 
0) {
+   for_each_sgtable_dma_page(bo->base.sgt, _iter, 0) {
err = lima_vm_map_page(vm, sg_page_iter_dma_address(_iter),
   bo_va->node.start + offset);
if (err)
@@ -298,8 +298,7 @@ int lima_vm_map_bo(struct lima_vm *vm, struct lima_bo *bo, 
int pageoff)
mutex_lock(>lock);
 
base = bo_va->node.start + (pageoff << PAGE_SHIFT);
-   for_each_sg_dma_page(bo->base.sgt->sgl, _iter,
-bo->base.sgt->nents, pageoff) {
+   for_each_sgtable_dma_page(bo->base.sgt, _iter, pageoff) {
err = lima_vm_map_page(vm, sg_page_iter_dma_address(_iter),
   base + offset);
if (err)
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 20/36] drm: tegra: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/tegra/gem.c   | 27 ++-
 drivers/gpu/drm/tegra/plane.c | 15 +--
 2 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 723df142a981..01d94befab11 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -98,8 +98,8 @@ static struct sg_table *tegra_bo_pin(struct device *dev, 
struct host1x_bo *bo,
 * the SG table needs to be copied to avoid overwriting any
 * other potential users of the original SG table.
 */
-   err = sg_alloc_table_from_sg(sgt, obj->sgt->sgl, 
obj->sgt->nents,
-GFP_KERNEL);
+   err = sg_alloc_table_from_sg(sgt, obj->sgt->sgl,
+obj->sgt->orig_nents, GFP_KERNEL);
if (err < 0)
goto free;
} else {
@@ -196,8 +196,7 @@ static int tegra_bo_iommu_map(struct tegra_drm *tegra, 
struct tegra_bo *bo)
 
bo->iova = bo->mm->start;
 
-   bo->size = iommu_map_sg(tegra->domain, bo->iova, bo->sgt->sgl,
-   bo->sgt->nents, prot);
+   bo->size = iommu_map_sgtable(tegra->domain, bo->iova, bo->sgt, prot);
if (!bo->size) {
dev_err(tegra->drm->dev, "failed to map buffer\n");
err = -ENOMEM;
@@ -264,8 +263,7 @@ static struct tegra_bo *tegra_bo_alloc_object(struct 
drm_device *drm,
 static void tegra_bo_free(struct drm_device *drm, struct tegra_bo *bo)
 {
if (bo->pages) {
-   dma_unmap_sg(drm->dev, bo->sgt->sgl, bo->sgt->nents,
-DMA_FROM_DEVICE);
+   dma_unmap_sgtable(drm->dev, bo->sgt, DMA_FROM_DEVICE, 0);
drm_gem_put_pages(>gem, bo->pages, true, true);
sg_free_table(bo->sgt);
kfree(bo->sgt);
@@ -290,12 +288,9 @@ static int tegra_bo_get_pages(struct drm_device *drm, 
struct tegra_bo *bo)
goto put_pages;
}
 
-   err = dma_map_sg(drm->dev, bo->sgt->sgl, bo->sgt->nents,
-DMA_FROM_DEVICE);
-   if (err == 0) {
-   err = -EFAULT;
+   err = dma_map_sgtable(drm->dev, bo->sgt, DMA_FROM_DEVICE, 0);
+   if (err)
goto free_sgt;
-   }
 
return 0;
 
@@ -571,7 +566,7 @@ tegra_gem_prime_map_dma_buf(struct dma_buf_attachment 
*attach,
goto free;
}
 
-   if (dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir) == 0)
+   if (dma_map_sgtable(attach->dev, sgt, dir, 0))
goto free;
 
return sgt;
@@ -590,7 +585,7 @@ static void tegra_gem_prime_unmap_dma_buf(struct 
dma_buf_attachment *attach,
struct tegra_bo *bo = to_tegra_bo(gem);
 
if (bo->pages)
-   dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
+   dma_unmap_sgtable(attach->dev, sgt, dir, 0);
 
sg_free_table(sgt);
kfree(sgt);
@@ -609,8 +604,7 @@ static int tegra_gem_prime_begin_cpu_access(struct dma_buf 
*buf,
struct drm_device *drm = gem->dev;
 
if (bo->pages)
-   dma_sync_sg_for_cpu(drm->dev, bo->sgt->sgl, bo->sgt->nents,
-   DMA_FROM_DEVICE);
+   dma_sync_sgtable_for_cpu(drm->dev, bo->sgt, DMA_FROM_DEVICE);
 
return 0;
 }
@@ -623,8 +617,7 @@ static int tegra_gem_prime_end_cpu_access(struct dma_buf 
*buf,
struct drm_device *drm = gem->dev;
 
if (bo->pages)
-   dma_sync_sg_for_device(drm->dev, bo->sgt->sgl, bo->sgt->nents,
-  DMA_TO_DEVICE);
+   dma_sync_sgtable_for_device(drm->dev, 

[PATCH v7 11/36] drm: mediatek: use common helper for a scatterlist contiguity check

2020-06-19 Thread Marek Szyprowski
Use common helper for checking the contiguity of the imported dma-buf and
do this check before allocating resources, so the error path is simpler.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/mediatek/mtk_drm_gem.c | 28 ++
 1 file changed, 6 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c 
b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
index 6190cc3b7b0d..3654ec732029 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
@@ -212,37 +212,21 @@ struct drm_gem_object 
*mtk_gem_prime_import_sg_table(struct drm_device *dev,
struct dma_buf_attachment *attach, struct sg_table *sg)
 {
struct mtk_drm_gem_obj *mtk_gem;
-   int ret;
-   struct scatterlist *s;
-   unsigned int i;
-   dma_addr_t expected;
 
-   mtk_gem = mtk_drm_gem_init(dev, attach->dmabuf->size);
+   /* check if the entries in the sg_table are contiguous */
+   if (drm_prime_get_contiguous_size(sg) < attach->dmabuf->size) {
+   DRM_ERROR("sg_table is not contiguous");
+   return ERR_PTR(-EINVAL);
+   }
 
+   mtk_gem = mtk_drm_gem_init(dev, attach->dmabuf->size);
if (IS_ERR(mtk_gem))
return ERR_CAST(mtk_gem);
 
-   expected = sg_dma_address(sg->sgl);
-   for_each_sg(sg->sgl, s, sg->nents, i) {
-   if (!sg_dma_len(s))
-   break;
-
-   if (sg_dma_address(s) != expected) {
-   DRM_ERROR("sg_table is not contiguous");
-   ret = -EINVAL;
-   goto err_gem_free;
-   }
-   expected = sg_dma_address(s) + sg_dma_len(s);
-   }
-
mtk_gem->dma_addr = sg_dma_address(sg->sgl);
mtk_gem->sg = sg;
 
return _gem->base;
-
-err_gem_free:
-   kfree(mtk_gem);
-   return ERR_PTR(ret);
 }
 
 void *mtk_drm_gem_prime_vmap(struct drm_gem_object *obj)
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 18/36] drm: rockchip: use common helper for a scatterlist contiguity check

2020-06-19 Thread Marek Szyprowski
Use common helper for checking the contiguity of the imported dma-buf.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 19 +--
 1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c 
b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
index b9275ba7c5a5..2970e534e2bb 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
@@ -460,23 +460,6 @@ struct sg_table *rockchip_gem_prime_get_sg_table(struct 
drm_gem_object *obj)
return sgt;
 }
 
-static unsigned long rockchip_sg_get_contiguous_size(struct sg_table *sgt,
-int count)
-{
-   struct scatterlist *s;
-   dma_addr_t expected = sg_dma_address(sgt->sgl);
-   unsigned int i;
-   unsigned long size = 0;
-
-   for_each_sg(sgt->sgl, s, count, i) {
-   if (sg_dma_address(s) != expected)
-   break;
-   expected = sg_dma_address(s) + sg_dma_len(s);
-   size += sg_dma_len(s);
-   }
-   return size;
-}
-
 static int
 rockchip_gem_iommu_map_sg(struct drm_device *drm,
  struct dma_buf_attachment *attach,
@@ -498,7 +481,7 @@ rockchip_gem_dma_map_sg(struct drm_device *drm,
if (!count)
return -EINVAL;
 
-   if (rockchip_sg_get_contiguous_size(sg, count) < attach->dmabuf->size) {
+   if (drm_prime_get_contiguous_size(sg) < attach->dmabuf->size) {
DRM_ERROR("failed to map sg_table to contiguous linear 
address.\n");
dma_unmap_sg(drm->dev, sg->sgl, sg->nents,
 DMA_BIDIRECTIONAL);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 13/36] drm: msm: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/msm/msm_gem.c| 13 +
 drivers/gpu/drm/msm/msm_gpummu.c | 14 ++
 drivers/gpu/drm/msm/msm_iommu.c  |  2 +-
 3 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 38b0c0e1f83e..e0d5fd36ea8f 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -53,11 +53,10 @@ static void sync_for_device(struct msm_gem_object *msm_obj)
struct device *dev = msm_obj->base.dev->dev;
 
if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
-   dma_sync_sg_for_device(dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_sync_sgtable_for_device(dev, msm_obj->sgt,
+   DMA_BIDIRECTIONAL);
} else {
-   dma_map_sg(dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_map_sgtable(dev, msm_obj->sgt, DMA_BIDIRECTIONAL, 0);
}
 }
 
@@ -66,11 +65,9 @@ static void sync_for_cpu(struct msm_gem_object *msm_obj)
struct device *dev = msm_obj->base.dev->dev;
 
if (get_dma_ops(dev) && IS_ENABLED(CONFIG_ARM64)) {
-   dma_sync_sg_for_cpu(dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_sync_sgtable_for_cpu(dev, msm_obj->sgt, DMA_BIDIRECTIONAL);
} else {
-   dma_unmap_sg(dev, msm_obj->sgt->sgl,
-   msm_obj->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(dev, msm_obj->sgt, DMA_BIDIRECTIONAL, 0);
}
 }
 
diff --git a/drivers/gpu/drm/msm/msm_gpummu.c b/drivers/gpu/drm/msm/msm_gpummu.c
index 310a31b05faa..319f06c28235 100644
--- a/drivers/gpu/drm/msm/msm_gpummu.c
+++ b/drivers/gpu/drm/msm/msm_gpummu.c
@@ -30,21 +30,19 @@ static int msm_gpummu_map(struct msm_mmu *mmu, uint64_t 
iova,
 {
struct msm_gpummu *gpummu = to_msm_gpummu(mmu);
unsigned idx = (iova - GPUMMU_VA_START) / GPUMMU_PAGE_SIZE;
-   struct scatterlist *sg;
+   struct sg_dma_page_iter dma_iter;
unsigned prot_bits = 0;
-   unsigned i, j;
 
if (prot & IOMMU_WRITE)
prot_bits |= 1;
if (prot & IOMMU_READ)
prot_bits |= 2;
 
-   for_each_sg(sgt->sgl, sg, sgt->nents, i) {
-   dma_addr_t addr = sg->dma_address;
-   for (j = 0; j < sg->length / GPUMMU_PAGE_SIZE; j++, idx++) {
-   gpummu->table[idx] = addr | prot_bits;
-   addr += GPUMMU_PAGE_SIZE;
-   }
+   for_each_sgtable_dma_page(sgt, _iter, 0) {
+   dma_addr_t addr = sg_page_iter_dma_address(_iter);
+
+   BUILD_BUG_ON(GPUMMU_PAGE_SIZE != PAGE_SIZE);
+   gpummu->table[idx++] = addr | prot_bits;
}
 
/* we can improve by deferring flush for multiple map() */
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 3a381a9674c9..6c31e65834c6 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -36,7 +36,7 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
struct msm_iommu *iommu = to_msm_iommu(mmu);
size_t ret;
 
-   ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
+   ret = iommu_map_sgtable(iommu->domain, iova, sgt, prot);
WARN_ON(!ret);
 
return (ret == len) ? 0 : -EINVAL;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 00/36] DRM: fix struct sg_table nents vs. orig_nents misuse

2020-06-19 Thread Marek Szyprowski
Dear All,

During the Exynos DRM GEM rework and fixing the issues in the.
drm_prime_sg_to_page_addr_arrays() function [1] I've noticed that most
drivers in DRM framework incorrectly use nents and orig_nents entries of
the struct sg_table.

In case of the most DMA-mapping implementations exchanging those two
entries or using nents for all loops on the scatterlist is harmless,
because they both have the same value. There exists however a DMA-mapping
implementations, for which such incorrect usage breaks things. The nents
returned by dma_map_sg() might be lower than the nents passed as its
parameter and this is perfectly fine. DMA framework or IOMMU is allowed
to join consecutive chunks while mapping if such operation is supported
by the underlying HW (bus, bridge, IOMMU, etc). Example of the case
where dma_map_sg() might return 1 'DMA' chunk for the 4 'physical' pages
is described here [2]

The DMA-mapping framework documentation [3] states that dma_map_sg()
returns the numer of the created entries in the DMA address space.
However the subsequent calls to dma_sync_sg_for_{device,cpu} and
dma_unmap_sg must be called with the original number of entries passed to
dma_map_sg. The common pattern in DRM drivers were to assign the
dma_map_sg() return value to sg_table->nents and use that value for
the subsequent calls to dma_sync_sg_* or dma_unmap_sg functions. Also
the code iterated over nents times to access the pages stored in the
processed scatterlist, while it should use orig_nents as the numer of
the page entries.

I've tried to identify all such incorrect usage of sg_table->nents and
this is a result of my research. It looks that the incorrect pattern has
been copied over the many drivers mainly in the DRM subsystem. Too bad in
most cases it even worked correctly if the system used a simple, linear
DMA-mapping implementation, for which swapping nents and orig_nents
doesn't make any difference. To avoid similar issues in the future, I've
introduced a common wrappers for DMA-mapping calls, which operate directly
on the sg_table objects. I've also added wrappers for iterating over the
scatterlists stored in the sg_table objects and applied them where
possible. This, together with some common DRM prime helpers, allowed me
to almost get rid of all nents/orig_nents usage in the drivers. I hope
that such change makes the code robust, easier to follow and copy/paste
safe.

The biggest TODO is DRM/i915 driver and I don't feel brave enough to fix
it fully. The driver creatively uses sg_table->orig_nents to store the
size of the allocate scatterlist and ignores the number of the entries
returned by dma_map_sg function. In this patchset I only fixed the
sg_table objects exported by dmabuf related functions. I hope that I
didn't break anything there.

Patches are based on top of Linux next-20200618. The required changes to
DMA-mapping framework has been already merged to v5.8-rc1.

If possible I would like ask for merging most of the patches via DRM
tree.

Best regards,
Marek Szyprowski


References:

[1] https://lkml.org/lkml/2020/3/27/555
[2] https://lkml.org/lkml/2020/3/29/65
[3] Documentation/DMA-API-HOWTO.txt
[4] 
https://lore.kernel.org/linux-iommu/20200512121931.gd20...@lst.de/T/#ma18c958a48c3b241d5409517fa7d192eef87459b

Changelog:

v7:
- changed DMA page interators to standard DMA SG iterators in drm/prim and
  videobuf2-dma-contig as suggested by Robin Murphy
- fixed build issues

v6: 
https://lore.kernel.org/linux-iommu/20200618153956.29558-1-m.szyprow...@samsung.com/T/
- rebased onto Linux next-20200618, which is based on v5.8-rc1; fixed conflicts

v5: 
https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprow...@samsung.com/T/
- fixed some minor style issues and typos
- fixed lack of the attrs argument in ion, dmabuf, rapidio, fastrpc and
  vfio patches

v4: https://lore.kernel.org/linux-iommu/20200512121931.gd20...@lst.de/T/
- added for_each_sgtable_* wrappers and applied where possible
- added drm_prime_get_contiguous_size() and applied where possible
- applied drm_prime_sg_to_page_addr_arrays() where possible to remove page
  extraction from sg_table objects
- added documentation for the introduced wrappers
- improved patches description a bit

v3: 
https://lore.kernel.org/dri-devel/20200505083926.28503-1-m.szyprow...@samsung.com/
- introduce dma_*_sgtable_* wrappers and use them in all patches

v2: 
https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376...@arm.com/T/
- dropped most of the changes to drm/i915
- added fixes for rcar-du, xen, media and ion
- fixed a few issues pointed by kbuild test robot
- added wide cc: list for each patch

v1: 
https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376...@arm.com/T/
- initial version


Patch summary:

Marek Szyprowski (36):
  drm: prime: add common helper to check scatterlist contiguity
  drm: prime: use sgtable iterators in
drm_prime_sg_to_page_addr_arrays()
  drm: core: fix common struct sg_table related issues
 

[PATCH v7 03/36] drm: core: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/drm_cache.c|  2 +-
 drivers/gpu/drm/drm_gem_shmem_helper.c | 14 +-
 drivers/gpu/drm/drm_prime.c| 11 ++-
 3 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 03e01b000f7a..0fe3c496002a 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -127,7 +127,7 @@ drm_clflush_sg(struct sg_table *st)
struct sg_page_iter sg_iter;
 
mb(); /*CLFLUSH is ordered only by using memory barriers*/
-   for_each_sg_page(st->sgl, _iter, st->nents, 0)
+   for_each_sgtable_page(st, _iter, 0)
drm_clflush_page(sg_page_iter_page(_iter));
mb(); /*Make sure that all cache line entry is flushed*/
 
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 4b7cfbac4daa..47d8211221f2 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -126,8 +126,8 @@ void drm_gem_shmem_free_object(struct drm_gem_object *obj)
drm_prime_gem_destroy(obj, shmem->sgt);
} else {
if (shmem->sgt) {
-   dma_unmap_sg(obj->dev->dev, shmem->sgt->sgl,
-shmem->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(obj->dev->dev, shmem->sgt,
+ DMA_BIDIRECTIONAL, 0);
sg_free_table(shmem->sgt);
kfree(shmem->sgt);
}
@@ -424,8 +424,7 @@ void drm_gem_shmem_purge_locked(struct drm_gem_object *obj)
 
WARN_ON(!drm_gem_shmem_is_purgeable(shmem));
 
-   dma_unmap_sg(obj->dev->dev, shmem->sgt->sgl,
-shmem->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(obj->dev->dev, shmem->sgt, DMA_BIDIRECTIONAL, 0);
sg_free_table(shmem->sgt);
kfree(shmem->sgt);
shmem->sgt = NULL;
@@ -697,12 +696,17 @@ struct sg_table *drm_gem_shmem_get_pages_sgt(struct 
drm_gem_object *obj)
goto err_put_pages;
}
/* Map the pages for use by the h/w. */
-   dma_map_sg(obj->dev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL);
+   ret = dma_map_sgtable(obj->dev->dev, sgt, DMA_BIDIRECTIONAL, 0);
+   if (ret)
+   goto err_free_sgt;
 
shmem->sgt = sgt;
 
return sgt;
 
+err_free_sgt:
+   sg_free_table(sgt);
+   kfree(sgt);
 err_put_pages:
drm_gem_shmem_put_pages(shmem);
return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index b717e52e909e..d583d6545666 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -617,6 +617,7 @@ struct sg_table *drm_gem_map_dma_buf(struct 
dma_buf_attachment *attach,
 {
struct drm_gem_object *obj = attach->dmabuf->priv;
struct sg_table *sgt;
+   int ret;
 
if (WARN_ON(dir == DMA_NONE))
return ERR_PTR(-EINVAL);
@@ -626,11 +627,12 @@ struct sg_table *drm_gem_map_dma_buf(struct 
dma_buf_attachment *attach,
else
sgt = obj->dev->driver->gem_prime_get_sg_table(obj);
 
-   if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC)) {
+   ret = dma_map_sgtable(attach->dev, sgt, dir,
+ DMA_ATTR_SKIP_CPU_SYNC);
+   if (ret) {
sg_free_table(sgt);
kfree(sgt);
-   sgt = ERR_PTR(-ENOMEM);
+   sgt = ERR_PTR(ret);
}
 
return sgt;
@@ -652,8 +654,7 @@ void drm_gem_unmap_dma_buf(struct dma_buf_attachment 
*attach,
   

[PATCH v7 01/36] drm: prime: add common helper to check scatterlist contiguity

2020-06-19 Thread Marek Szyprowski
It is a common operation done by DRM drivers to check the contiguity
of the DMA-mapped buffer described by a scatterlist in the
sg_table object. Let's add a common helper for this operation.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/drm_gem_cma_helper.c | 23 +++--
 drivers/gpu/drm/drm_prime.c  | 31 
 include/drm/drm_prime.h  |  2 ++
 3 files changed, 36 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_cma_helper.c 
b/drivers/gpu/drm/drm_gem_cma_helper.c
index 06a5b9ee1fe0..41566a15dabd 100644
--- a/drivers/gpu/drm/drm_gem_cma_helper.c
+++ b/drivers/gpu/drm/drm_gem_cma_helper.c
@@ -471,26 +471,9 @@ drm_gem_cma_prime_import_sg_table(struct drm_device *dev,
 {
struct drm_gem_cma_object *cma_obj;
 
-   if (sgt->nents != 1) {
-   /* check if the entries in the sg_table are contiguous */
-   dma_addr_t next_addr = sg_dma_address(sgt->sgl);
-   struct scatterlist *s;
-   unsigned int i;
-
-   for_each_sg(sgt->sgl, s, sgt->nents, i) {
-   /*
-* sg_dma_address(s) is only valid for entries
-* that have sg_dma_len(s) != 0
-*/
-   if (!sg_dma_len(s))
-   continue;
-
-   if (sg_dma_address(s) != next_addr)
-   return ERR_PTR(-EINVAL);
-
-   next_addr = sg_dma_address(s) + sg_dma_len(s);
-   }
-   }
+   /* check if the entries in the sg_table are contiguous */
+   if (drm_prime_get_contiguous_size(sgt) < attach->dmabuf->size)
+   return ERR_PTR(-EINVAL);
 
/* Create a CMA GEM buffer. */
cma_obj = __drm_gem_cma_create(dev, attach->dmabuf->size);
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index bbfc713bfdc3..226cd6ad3985 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -825,6 +825,37 @@ struct sg_table *drm_prime_pages_to_sg(struct page 
**pages, unsigned int nr_page
 }
 EXPORT_SYMBOL(drm_prime_pages_to_sg);
 
+/**
+ * drm_prime_get_contiguous_size - returns the contiguous size of the buffer
+ * @sgt: sg_table describing the buffer to check
+ *
+ * This helper calculates the contiguous size in the DMA address space
+ * of the the buffer described by the provided sg_table.
+ *
+ * This is useful for implementing
+ * _gem_object_funcs.gem_prime_import_sg_table.
+ */
+unsigned long drm_prime_get_contiguous_size(struct sg_table *sgt)
+{
+   dma_addr_t expected = sg_dma_address(sgt->sgl);
+   struct scatterlist *sg;
+   unsigned long size = 0;
+   int i;
+
+   for_each_sgtable_dma_sg(sgt, sg, i) {
+   unsigned int len = sg_dma_len(sg);
+
+   if (!len)
+   break;
+   if (sg_dma_address(sg) != expected)
+   break;
+   expected += len;
+   size += len;
+   }
+   return size;
+}
+EXPORT_SYMBOL(drm_prime_get_contiguous_size);
+
 /**
  * drm_gem_prime_export - helper library implementation of the export callback
  * @obj: GEM object to export
diff --git a/include/drm/drm_prime.h b/include/drm/drm_prime.h
index 9af7422b44cf..47ef11614627 100644
--- a/include/drm/drm_prime.h
+++ b/include/drm/drm_prime.h
@@ -92,6 +92,8 @@ struct sg_table *drm_prime_pages_to_sg(struct page **pages, 
unsigned int nr_page
 struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj,
 int flags);
 
+unsigned long drm_prime_get_contiguous_size(struct sg_table *sgt);
+
 /* helper functions for importing */
 struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev,
struct dma_buf *dma_buf,
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 08/36] drm: exynos: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/exynos/exynos_drm_g2d.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c 
b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
index fcee33a43aca..7014a8cd971a 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
@@ -395,8 +395,8 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,
return;
 
 out:
-   dma_unmap_sg(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt->sgl,
-   g2d_userptr->sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt,
+ DMA_BIDIRECTIONAL, 0);
 
pages = frame_vector_pages(g2d_userptr->vec);
if (!IS_ERR(pages)) {
@@ -511,10 +511,10 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct 
g2d_data *g2d,
 
g2d_userptr->sgt = sgt;
 
-   if (!dma_map_sg(to_dma_dev(g2d->drm_dev), sgt->sgl, sgt->nents,
-   DMA_BIDIRECTIONAL)) {
+   ret = dma_map_sgtable(to_dma_dev(g2d->drm_dev), sgt,
+ DMA_BIDIRECTIONAL, 0);
+   if (ret) {
DRM_DEV_ERROR(g2d->dev, "failed to map sgt with dma region.\n");
-   ret = -ENOMEM;
goto err_sg_free_table;
}
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 14/36] drm: omapdrm: use common helper for extracting pages array

2020-06-19 Thread Marek Szyprowski
Use common helper for converting a sg_table object into struct
page pointer array.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/omapdrm/omap_gem.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c 
b/drivers/gpu/drm/omapdrm/omap_gem.c
index d0d12d5dd76c..ff0c4b0c3fd0 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem.c
@@ -1297,10 +1297,9 @@ struct drm_gem_object *omap_gem_new_dmabuf(struct 
drm_device *dev, size_t size,
omap_obj->dma_addr = sg_dma_address(sgt->sgl);
} else {
/* Create pages list from sgt */
-   struct sg_page_iter iter;
struct page **pages;
unsigned int npages;
-   unsigned int i = 0;
+   unsigned int ret;
 
npages = DIV_ROUND_UP(size, PAGE_SIZE);
pages = kcalloc(npages, sizeof(*pages), GFP_KERNEL);
@@ -1311,14 +1310,9 @@ struct drm_gem_object *omap_gem_new_dmabuf(struct 
drm_device *dev, size_t size,
}
 
omap_obj->pages = pages;
-
-   for_each_sg_page(sgt->sgl, , sgt->orig_nents, 0) {
-   pages[i++] = sg_page_iter_page();
-   if (i > npages)
-   break;
-   }
-
-   if (WARN_ON(i != npages)) {
+   ret = drm_prime_sg_to_page_addr_arrays(sgt, pages, NULL,
+  npages);
+   if (WARN_ON(ret)) {
omap_gem_free_object(obj);
obj = ERR_PTR(-ENOMEM);
goto done;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 04/36] drm: amdgpu: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c  | 6 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c  | 9 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 8 
 3 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 43d8ed7dbd00..519ce4427fce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -307,8 +307,8 @@ static struct sg_table *amdgpu_dma_buf_map(struct 
dma_buf_attachment *attach,
if (IS_ERR(sgt))
return sgt;
 
-   if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC))
+   if (dma_map_sgtable(attach->dev, sgt, dir,
+   DMA_ATTR_SKIP_CPU_SYNC))
goto error_free;
break;
 
@@ -349,7 +349,7 @@ static void amdgpu_dma_buf_unmap(struct dma_buf_attachment 
*attach,
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
 
if (sgt->sgl->page_link) {
-   dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
+   dma_unmap_sgtable(attach->dev, sgt, dir, 0);
sg_free_table(sgt);
kfree(sgt);
} else {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 5129a996e941..97fb73e5a6ae 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1025,7 +1025,6 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 {
struct amdgpu_device *adev = amdgpu_ttm_adev(ttm->bdev);
struct amdgpu_ttm_tt *gtt = (void *)ttm;
-   unsigned nents;
int r;
 
int write = !(gtt->userflags & AMDGPU_GEM_USERPTR_READONLY);
@@ -1040,9 +1039,8 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
goto release_sg;
 
/* Map SG to device */
-   r = -ENOMEM;
-   nents = dma_map_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-   if (nents == 0)
+   r = dma_map_sgtable(adev->dev, ttm->sg, direction, 0);
+   if (r)
goto release_sg;
 
/* convert SG to linear array of pages and dma addresses */
@@ -1073,8 +1071,7 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt 
*ttm)
return;
 
/* unmap the pages mapped to the device */
-   dma_unmap_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents, direction);
-
+   dma_unmap_sgtable(adev->dev, ttm->sg, direction, 0);
sg_free_table(ttm->sg);
 
 #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index d399e5893170..c281aa13f5ec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -477,11 +477,11 @@ int amdgpu_vram_mgr_alloc_sgt(struct amdgpu_device *adev,
if (r)
goto error_free;
 
-   for_each_sg((*sgt)->sgl, sg, num_entries, i)
+   for_each_sgtable_sg((*sgt), sg, i)
sg->length = 0;
 
node = mem->mm_node;
-   for_each_sg((*sgt)->sgl, sg, num_entries, i) {
+   for_each_sgtable_sg((*sgt), sg, i) {
phys_addr_t phys = (node->start << PAGE_SHIFT) +
adev->gmc.aper_base;
size_t size = node->size << PAGE_SHIFT;
@@ -501,7 +501,7 @@ int amdgpu_vram_mgr_alloc_sgt(struct amdgpu_device *adev,
return 0;
 
 error_unmap:
-   for_each_sg((*sgt)->sgl, sg, num_entries, i) {
+   for_each_sgtable_sg((*sgt), sg, i) {

[PATCH v7 06/36] drm: etnaviv: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

To avoid such issues, lets use a common dma-mapping wrappers operating
directly on the struct sg_table objects and use scatterlist page
iterators where possible. This, almost always, hides references to the
nents and orig_nents entries, making the code robust, easier to follow
and copy/paste safe.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.c | 12 +---
 drivers/gpu/drm/etnaviv/etnaviv_mmu.c | 13 +++--
 2 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index f5e5bb8ba953..9f4613f7e255 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -27,7 +27,7 @@ static void etnaviv_gem_scatter_map(struct etnaviv_gem_object 
*etnaviv_obj)
 * because display controller, GPU, etc. are not coherent.
 */
if (etnaviv_obj->flags & ETNA_BO_CACHE_MASK)
-   dma_map_sg(dev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL);
+   dma_map_sgtable(dev->dev, sgt, DMA_BIDIRECTIONAL, 0);
 }
 
 static void etnaviv_gem_scatterlist_unmap(struct etnaviv_gem_object 
*etnaviv_obj)
@@ -51,7 +51,7 @@ static void etnaviv_gem_scatterlist_unmap(struct 
etnaviv_gem_object *etnaviv_obj
 * discard those writes.
 */
if (etnaviv_obj->flags & ETNA_BO_CACHE_MASK)
-   dma_unmap_sg(dev->dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL);
+   dma_unmap_sgtable(dev->dev, sgt, DMA_BIDIRECTIONAL, 0);
 }
 
 /* called with etnaviv_obj->lock held */
@@ -404,9 +404,8 @@ int etnaviv_gem_cpu_prep(struct drm_gem_object *obj, u32 op,
}
 
if (etnaviv_obj->flags & ETNA_BO_CACHED) {
-   dma_sync_sg_for_cpu(dev->dev, etnaviv_obj->sgt->sgl,
-   etnaviv_obj->sgt->nents,
-   etnaviv_op_to_dma_dir(op));
+   dma_sync_sgtable_for_cpu(dev->dev, etnaviv_obj->sgt,
+etnaviv_op_to_dma_dir(op));
etnaviv_obj->last_cpu_prep_op = op;
}
 
@@ -421,8 +420,7 @@ int etnaviv_gem_cpu_fini(struct drm_gem_object *obj)
if (etnaviv_obj->flags & ETNA_BO_CACHED) {
/* fini without a prep is almost certainly a userspace error */
WARN_ON(etnaviv_obj->last_cpu_prep_op == 0);
-   dma_sync_sg_for_device(dev->dev, etnaviv_obj->sgt->sgl,
-   etnaviv_obj->sgt->nents,
+   dma_sync_sgtable_for_device(dev->dev, etnaviv_obj->sgt,
etnaviv_op_to_dma_dir(etnaviv_obj->last_cpu_prep_op));
etnaviv_obj->last_cpu_prep_op = 0;
}
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c 
b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
index 3607d348c298..13b100553a0b 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
@@ -79,7 +79,7 @@ static int etnaviv_iommu_map(struct etnaviv_iommu_context 
*context, u32 iova,
if (!context || !sgt)
return -EINVAL;
 
-   for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+   for_each_sgtable_dma_sg(sgt, sg, i) {
u32 pa = sg_dma_address(sg) - sg->offset;
size_t bytes = sg_dma_len(sg) + sg->offset;
 
@@ -95,14 +95,7 @@ static int etnaviv_iommu_map(struct etnaviv_iommu_context 
*context, u32 iova,
return 0;
 
 fail:
-   da = iova;
-
-   for_each_sg(sgt->sgl, sg, i, j) {
-   size_t bytes = sg_dma_len(sg) + sg->offset;
-
-   etnaviv_context_unmap(context, da, bytes);
-   da += bytes;
-   }
+   etnaviv_context_unmap(context, iova, da - iova);
return ret;
 }
 
@@ -113,7 +106,7 @@ static void etnaviv_iommu_unmap(struct 
etnaviv_iommu_context *context, u32 iova,
unsigned int da = iova;
int i;
 
-   for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+   for_each_sgtable_dma_sg(sgt, sg, i) {
size_t bytes = sg_dma_len(sg) + sg->offset;
 
etnaviv_context_unmap(context, da, bytes);
-- 
2.17.1


[PATCH v7 15/36] drm: omapdrm: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

Fix the code to refer to proper nents or orig_nents entries. This driver
checks for a buffer contiguity in DMA address space, so it should test
sg_table->nents entry.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/omapdrm/omap_gem.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c 
b/drivers/gpu/drm/omapdrm/omap_gem.c
index ff0c4b0c3fd0..a7a9a0afe2b6 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem.c
@@ -48,7 +48,7 @@ struct omap_gem_object {
 *   OMAP_BO_MEM_DMA_API flag set)
 *
 * - buffers imported from dmabuf (with the OMAP_BO_MEM_DMABUF flag set)
-*   if they are physically contiguous (when sgt->orig_nents == 1)
+*   if they are physically contiguous (when sgt->nents == 1)
 *
 * - buffers mapped through the TILER when dma_addr_cnt is not zero, in
 *   which case the DMA address points to the TILER aperture
@@ -1279,7 +1279,7 @@ struct drm_gem_object *omap_gem_new_dmabuf(struct 
drm_device *dev, size_t size,
union omap_gem_size gsize;
 
/* Without a DMM only physically contiguous buffers can be supported. */
-   if (sgt->orig_nents != 1 && !priv->has_dmm)
+   if (sgt->nents != 1 && !priv->has_dmm)
return ERR_PTR(-EINVAL);
 
gsize.bytes = PAGE_ALIGN(size);
@@ -1293,7 +1293,7 @@ struct drm_gem_object *omap_gem_new_dmabuf(struct 
drm_device *dev, size_t size,
 
omap_obj->sgt = sgt;
 
-   if (sgt->orig_nents == 1) {
+   if (sgt->nents == 1) {
omap_obj->dma_addr = sg_dma_address(sgt->sgl);
} else {
/* Create pages list from sgt */
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 09/36] drm: i915: fix common struct sg_table related issues

2020-06-19 Thread Marek Szyprowski
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a non-contiguous
memory buffer, used commonly in the DRM and graphics subsystems. It
consists of a scatterlist with memory pages and DMA addresses (sgl entry),
as well as the number of scatterlist entries: CPU pages (orig_nents entry)
and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and orig_nents
entries, calling DMA-mapping functions with a wrong number of entries or
ignoring the number of mapped entries returned by the dma_map_sg()
function.

This driver creatively uses sg_table->orig_nents to store the size of the
allocated scatterlist and ignores the number of the entries returned by
dma_map_sg function. The sg_table->orig_nents is (mis)used to properly
free the (over)allocated scatterlist.

This patch only introduces the common DMA-mapping wrappers operating
directly on the struct sg_table objects to the dmabuf related functions,
so the other drivers, which might share buffers with i915 could rely on
the properly set nents and orig_nents values.

Signed-off-by: Marek Szyprowski 
---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c   | 11 +++
 drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c |  7 +++
 2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 2679380159fc..8a988592715b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -48,12 +48,9 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
dma_buf_attachment *attachme
src = sg_next(src);
}
 
-   if (!dma_map_sg_attrs(attachment->dev,
- st->sgl, st->nents, dir,
- DMA_ATTR_SKIP_CPU_SYNC)) {
-   ret = -ENOMEM;
+   ret = dma_map_sgtable(attachment->dev, st, dir, DMA_ATTR_SKIP_CPU_SYNC);
+   if (ret)
goto err_free_sg;
-   }
 
return st;
 
@@ -73,9 +70,7 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment 
*attachment,
 {
struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
 
-   dma_unmap_sg_attrs(attachment->dev,
-  sg->sgl, sg->nents, dir,
-  DMA_ATTR_SKIP_CPU_SYNC);
+   dma_unmap_sgtable(attachment->dev, sg, dir, DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sg);
kfree(sg);
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c 
b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
index debaf7b18ab5..be30b27e2926 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
@@ -28,10 +28,9 @@ static struct sg_table *mock_map_dma_buf(struct 
dma_buf_attachment *attachment,
sg = sg_next(sg);
}
 
-   if (!dma_map_sg(attachment->dev, st->sgl, st->nents, dir)) {
-   err = -ENOMEM;
+   err = dma_map_sgtable(attachment->dev, st, dir, 0);
+   if (err)
goto err_st;
-   }
 
return st;
 
@@ -46,7 +45,7 @@ static void mock_unmap_dma_buf(struct dma_buf_attachment 
*attachment,
   struct sg_table *st,
   enum dma_data_direction dir)
 {
-   dma_unmap_sg(attachment->dev, st->sgl, st->nents, dir);
+   dma_unmap_sgtable(attachment->dev, st, dir, 0);
sg_free_table(st);
kfree(st);
 }
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 35/36] videobuf2: use sgtable-based scatterlist wrappers

2020-06-19 Thread Robin Murphy

On 2020-06-18 16:39, Marek Szyprowski wrote:

Use recently introduced common wrappers operating directly on the struct
sg_table objects and scatterlist page iterators to make the code a bit
more compact, robust, easier to follow and copy/paste safe.

No functional change, because the code already properly did all the
scaterlist related calls.

Signed-off-by: Marek Szyprowski 
---
  .../common/videobuf2/videobuf2-dma-contig.c   | 41 ---
  .../media/common/videobuf2/videobuf2-dma-sg.c | 32 ++-
  .../common/videobuf2/videobuf2-vmalloc.c  | 12 ++
  3 files changed, 34 insertions(+), 51 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c 
b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index f4b4a7c135eb..ba01a8692d88 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -48,16 +48,15 @@ struct vb2_dc_buf {
  
  static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt)

  {
-   struct scatterlist *s;
dma_addr_t expected = sg_dma_address(sgt->sgl);
-   unsigned int i;
+   struct sg_dma_page_iter dma_iter;
unsigned long size = 0;
  
-	for_each_sg(sgt->sgl, s, sgt->nents, i) {

-   if (sg_dma_address(s) != expected)
+   for_each_sgtable_dma_page(sgt, _iter, 0) {
+   if (sg_page_iter_dma_address(_iter) != expected)
break;
-   expected = sg_dma_address(s) + sg_dma_len(s);
-   size += sg_dma_len(s);
+   expected += PAGE_SIZE;
+   size += PAGE_SIZE;


Same comment as for the DRM version. In fact, given that it's the same 
function with the same purpose, might it be worth hoisting out as a 
generic helper for the sg_table API itself?



}
return size;
  }

[...]

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c 
b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
index 92072a08af25..6ddf953efa11 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
@@ -142,9 +142,8 @@ static void *vb2_dma_sg_alloc(struct device *dev, unsigned 
long dma_attrs,
 * No need to sync to the device, this will happen later when the
 * prepare() memop is called.
 */
-   sgt->nents = dma_map_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
- buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
-   if (!sgt->nents)
+   if (dma_map_sgtable(buf->dev, sgt, buf->dma_dir,
+   DMA_ATTR_SKIP_CPU_SYNC)) {


As 0-day's explosions of nonsense imply, there's a rogue bracket here...


goto fail_map;
  
  	buf->handler.refcount = >refcount;

@@ -180,8 +179,8 @@ static void vb2_dma_sg_put(void *buf_priv)
if (refcount_dec_and_test(>refcount)) {
dprintk(1, "%s: Freeing buffer of %d pages\n", __func__,
buf->num_pages);
-   dma_unmap_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
-  buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
+   dma_unmap_sgtable(buf->dev, sgt, buf->dma_dir,
+ DMA_ATTR_SKIP_CPU_SYNC);
if (buf->vaddr)
vm_unmap_ram(buf->vaddr, buf->num_pages);
sg_free_table(buf->dma_sgt);
@@ -202,8 +201,7 @@ static void vb2_dma_sg_prepare(void *buf_priv)
if (buf->db_attach)
return;
  
-	dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,

-  buf->dma_dir);
+   dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
  }
  
  static void vb2_dma_sg_finish(void *buf_priv)

@@ -215,7 +213,7 @@ static void vb2_dma_sg_finish(void *buf_priv)
if (buf->db_attach)
return;
  
-	dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);

+   dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
  }
  
  static void *vb2_dma_sg_get_userptr(struct device *dev, unsigned long vaddr,

@@ -258,9 +256,8 @@ static void *vb2_dma_sg_get_userptr(struct device *dev, 
unsigned long vaddr,
 * No need to sync to the device, this will happen later when the
 * prepare() memop is called.
 */
-   sgt->nents = dma_map_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
- buf->dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
-   if (!sgt->nents)
+   if (dma_map_sgtable(buf->dev, sgt, buf->dma_dir,
+   DMA_ATTR_SKIP_CPU_SYNC)) {


... and here.

Robin.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 12/12] bus: fsl-mc: Add ACPI support for fsl-mc

2020-06-19 Thread Lorenzo Pieralisi
From: Makarand Pawagi 

Add ACPI support in the fsl-mc driver. Driver parses MC DSDT table to
extract memory and other resources.

Interrupt (GIC ITS) information is extracted from the MADT table
by drivers/irqchip/irq-gic-v3-its-fsl-mc-msi.c.

IORT table is parsed to configure DMA.

Signed-off-by: Makarand Pawagi 
Signed-off-by: Diana Craciun 
Signed-off-by: Laurentiu Tudor 
---
 drivers/bus/fsl-mc/fsl-mc-bus.c | 73 
 drivers/bus/fsl-mc/fsl-mc-msi.c | 37 +
 drivers/irqchip/irq-gic-v3-its-fsl-mc-msi.c | 92 -
 3 files changed, 150 insertions(+), 52 deletions(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 824ff77bbe86..324d49d6df89 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -18,6 +18,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "fsl-mc-private.h"
 
@@ -38,6 +40,7 @@ struct fsl_mc {
struct fsl_mc_device *root_mc_bus_dev;
u8 num_translation_ranges;
struct fsl_mc_addr_translation_range *translation_ranges;
+   void *fsl_mc_regs;
 };
 
 /**
@@ -56,6 +59,10 @@ struct fsl_mc_addr_translation_range {
phys_addr_t start_phys_addr;
 };
 
+#define FSL_MC_FAPR0x28
+#define MC_FAPR_PL BIT(18)
+#define MC_FAPR_BMTBIT(17)
+
 /**
  * fsl_mc_bus_match - device to driver matching callback
  * @dev: the fsl-mc device to match against
@@ -124,7 +131,10 @@ static int fsl_mc_dma_configure(struct device *dev)
while (dev_is_fsl_mc(dma_dev))
dma_dev = dma_dev->parent;
 
-   return of_dma_configure_id(dev, dma_dev->of_node, 0, _id);
+   if (dev_of_node(dma_dev))
+   return of_dma_configure_id(dev, dma_dev->of_node, 0, _id);
+
+   return acpi_dma_configure_id(dev, DEV_DMA_COHERENT, _id);
 }
 
 static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
@@ -865,8 +875,11 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
struct fsl_mc_io *mc_io = NULL;
int container_id;
phys_addr_t mc_portal_phys_addr;
-   u32 mc_portal_size;
-   struct resource res;
+   u32 mc_portal_size, mc_stream_id;
+   struct resource *plat_res;
+
+   if (!iommu_present(_mc_bus_type))
+   return -EPROBE_DEFER;
 
mc = devm_kzalloc(>dev, sizeof(*mc), GFP_KERNEL);
if (!mc)
@@ -874,19 +887,33 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
 
platform_set_drvdata(pdev, mc);
 
+   plat_res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+   mc->fsl_mc_regs = devm_ioremap_resource(>dev, plat_res);
+   if (IS_ERR(mc->fsl_mc_regs))
+   return PTR_ERR(mc->fsl_mc_regs);
+
+   if (IS_ENABLED(CONFIG_ACPI) && !dev_of_node(>dev)) {
+   mc_stream_id = readl(mc->fsl_mc_regs + FSL_MC_FAPR);
+   /*
+* HW ORs the PL and BMT bit, places the result in bit 15 of
+* the StreamID and ORs in the ICID. Calculate it accordingly.
+*/
+   mc_stream_id = (mc_stream_id & 0x) |
+   ((mc_stream_id & (MC_FAPR_PL | MC_FAPR_BMT)) ?
+   0x4000 : 0);
+   error = acpi_dma_configure_id(>dev, DEV_DMA_COHERENT,
+ _stream_id);
+   if (error)
+   dev_warn(>dev, "failed to configure dma: %d.\n",
+error);
+   }
+
/*
 * Get physical address of MC portal for the root DPRC:
 */
-   error = of_address_to_resource(pdev->dev.of_node, 0, );
-   if (error < 0) {
-   dev_err(>dev,
-   "of_address_to_resource() failed for %pOF\n",
-   pdev->dev.of_node);
-   return error;
-   }
-
-   mc_portal_phys_addr = res.start;
-   mc_portal_size = resource_size();
+   plat_res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+   mc_portal_phys_addr = plat_res->start;
+   mc_portal_size = resource_size(plat_res);
error = fsl_create_mc_io(>dev, mc_portal_phys_addr,
 mc_portal_size, NULL,
 FSL_MC_IO_ATOMIC_CONTEXT_PORTAL, _io);
@@ -903,11 +930,13 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
dev_info(>dev, "MC firmware version: %u.%u.%u\n",
 mc_version.major, mc_version.minor, mc_version.revision);
 
-   error = get_mc_addr_translation_ranges(>dev,
-  >translation_ranges,
-  >num_translation_ranges);
-   if (error < 0)
-   goto error_cleanup_mc_io;
+   if (dev_of_node(>dev)) {
+   error = get_mc_addr_translation_ranges(>dev,
+   

[PATCH v2 10/12] of/irq: Make of_msi_map_rid() PCI bus agnostic

2020-06-19 Thread Lorenzo Pieralisi
There is nothing PCI bus specific in the of_msi_map_rid()
implementation other than the requester ID tag for the input
ID space. Rename requester ID to a more generic ID so that
the translation code can be used by all busses that require
input/output ID translations.

No functional change intended.

Signed-off-by: Lorenzo Pieralisi 
Cc: Bjorn Helgaas 
Cc: Rob Herring 
Cc: Marc Zyngier 
---
 drivers/of/irq.c   | 28 ++--
 drivers/pci/msi.c  |  2 +-
 include/linux/of_irq.h |  8 
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/of/irq.c b/drivers/of/irq.c
index 1005e4f349ef..25d17b8a1a1a 100644
--- a/drivers/of/irq.c
+++ b/drivers/of/irq.c
@@ -576,43 +576,43 @@ void __init of_irq_init(const struct of_device_id 
*matches)
}
 }
 
-static u32 __of_msi_map_rid(struct device *dev, struct device_node **np,
-   u32 rid_in)
+static u32 __of_msi_map_id(struct device *dev, struct device_node **np,
+   u32 id_in)
 {
struct device *parent_dev;
-   u32 rid_out = rid_in;
+   u32 id_out = id_in;
 
/*
 * Walk up the device parent links looking for one with a
 * "msi-map" property.
 */
for (parent_dev = dev; parent_dev; parent_dev = parent_dev->parent)
-   if (!of_map_id(parent_dev->of_node, rid_in, "msi-map",
-   "msi-map-mask", np, _out))
+   if (!of_map_id(parent_dev->of_node, id_in, "msi-map",
+   "msi-map-mask", np, _out))
break;
-   return rid_out;
+   return id_out;
 }
 
 /**
- * of_msi_map_rid - Map a MSI requester ID for a device.
+ * of_msi_map_id - Map a MSI ID for a device.
  * @dev: device for which the mapping is to be done.
  * @msi_np: device node of the expected msi controller.
- * @rid_in: unmapped MSI requester ID for the device.
+ * @id_in: unmapped MSI ID for the device.
  *
  * Walk up the device hierarchy looking for devices with a "msi-map"
- * property.  If found, apply the mapping to @rid_in.
+ * property.  If found, apply the mapping to @id_in.
  *
- * Returns the mapped MSI requester ID.
+ * Returns the mapped MSI ID.
  */
-u32 of_msi_map_rid(struct device *dev, struct device_node *msi_np, u32 rid_in)
+u32 of_msi_map_id(struct device *dev, struct device_node *msi_np, u32 id_in)
 {
-   return __of_msi_map_rid(dev, _np, rid_in);
+   return __of_msi_map_id(dev, _np, id_in);
 }
 
 /**
  * of_msi_map_get_device_domain - Use msi-map to find the relevant MSI domain
  * @dev: device for which the mapping is to be done.
- * @rid: Requester ID for the device.
+ * @id: Device ID.
  * @bus_token: Bus token
  *
  * Walk up the device hierarchy looking for devices with a "msi-map"
@@ -625,7 +625,7 @@ struct irq_domain *of_msi_map_get_device_domain(struct 
device *dev, u32 id,
 {
struct device_node *np = NULL;
 
-   __of_msi_map_rid(dev, , id);
+   __of_msi_map_id(dev, , id);
return irq_find_matching_host(np, bus_token);
 }
 
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index b4bfe0b03b2d..19aeadb22f11 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1535,7 +1535,7 @@ u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, 
struct pci_dev *pdev)
pci_for_each_dma_alias(pdev, get_msi_id_cb, );
 
of_node = irq_domain_get_of_node(domain);
-   rid = of_node ? of_msi_map_rid(>dev, of_node, rid) :
+   rid = of_node ? of_msi_map_id(>dev, of_node, rid) :
iort_msi_map_id(>dev, rid);
 
return rid;
diff --git a/include/linux/of_irq.h b/include/linux/of_irq.h
index 7142a3722758..e8b78139f78c 100644
--- a/include/linux/of_irq.h
+++ b/include/linux/of_irq.h
@@ -55,7 +55,7 @@ extern struct irq_domain *of_msi_map_get_device_domain(struct 
device *dev,
u32 id,
u32 bus_token);
 extern void of_msi_configure(struct device *dev, struct device_node *np);
-u32 of_msi_map_rid(struct device *dev, struct device_node *msi_np, u32 rid_in);
+u32 of_msi_map_id(struct device *dev, struct device_node *msi_np, u32 id_in);
 #else
 static inline int of_irq_count(struct device_node *dev)
 {
@@ -93,10 +93,10 @@ static inline struct irq_domain 
*of_msi_map_get_device_domain(struct device *dev
 static inline void of_msi_configure(struct device *dev, struct device_node *np)
 {
 }
-static inline u32 of_msi_map_rid(struct device *dev,
-struct device_node *msi_np, u32 rid_in)
+static inline u32 of_msi_map_id(struct device *dev,
+struct device_node *msi_np, u32 id_in)
 {
-   return rid_in;
+   return id_in;
 }
 #endif
 
-- 
2.26.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 09/12] of/irq: make of_msi_map_get_device_domain() bus agnostic

2020-06-19 Thread Lorenzo Pieralisi
From: Diana Craciun 

of_msi_map_get_device_domain() is PCI specific but it need not be and
can be easily changed to be bus agnostic in order to be used by other
busses by adding an IRQ domain bus token as an input parameter.

Signed-off-by: Diana Craciun 
Signed-off-by: Lorenzo Pieralisi 
Acked-by: Bjorn Helgaas# pci/msi.c
Cc: Bjorn Helgaas 
Cc: Rob Herring 
Cc: Marc Zyngier 
---
 drivers/of/irq.c   | 8 +---
 drivers/pci/msi.c  | 2 +-
 include/linux/of_irq.h | 5 +++--
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/of/irq.c b/drivers/of/irq.c
index d632bc5b3a2d..1005e4f349ef 100644
--- a/drivers/of/irq.c
+++ b/drivers/of/irq.c
@@ -613,18 +613,20 @@ u32 of_msi_map_rid(struct device *dev, struct device_node 
*msi_np, u32 rid_in)
  * of_msi_map_get_device_domain - Use msi-map to find the relevant MSI domain
  * @dev: device for which the mapping is to be done.
  * @rid: Requester ID for the device.
+ * @bus_token: Bus token
  *
  * Walk up the device hierarchy looking for devices with a "msi-map"
  * property.
  *
  * Returns: the MSI domain for this device (or NULL on failure)
  */
-struct irq_domain *of_msi_map_get_device_domain(struct device *dev, u32 rid)
+struct irq_domain *of_msi_map_get_device_domain(struct device *dev, u32 id,
+   u32 bus_token)
 {
struct device_node *np = NULL;
 
-   __of_msi_map_rid(dev, , rid);
-   return irq_find_matching_host(np, DOMAIN_BUS_PCI_MSI);
+   __of_msi_map_rid(dev, , id);
+   return irq_find_matching_host(np, bus_token);
 }
 
 /**
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 77f48b95e277..b4bfe0b03b2d 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1556,7 +1556,7 @@ struct irq_domain *pci_msi_get_device_domain(struct 
pci_dev *pdev)
u32 rid = pci_dev_id(pdev);
 
pci_for_each_dma_alias(pdev, get_msi_id_cb, );
-   dom = of_msi_map_get_device_domain(>dev, rid);
+   dom = of_msi_map_get_device_domain(>dev, rid, DOMAIN_BUS_PCI_MSI);
if (!dom)
dom = iort_get_device_domain(>dev, rid,
 DOMAIN_BUS_PCI_MSI);
diff --git a/include/linux/of_irq.h b/include/linux/of_irq.h
index 1214cabb2247..7142a3722758 100644
--- a/include/linux/of_irq.h
+++ b/include/linux/of_irq.h
@@ -52,7 +52,8 @@ extern struct irq_domain *of_msi_get_domain(struct device 
*dev,
struct device_node *np,
enum irq_domain_bus_token token);
 extern struct irq_domain *of_msi_map_get_device_domain(struct device *dev,
-  u32 rid);
+   u32 id,
+   u32 bus_token);
 extern void of_msi_configure(struct device *dev, struct device_node *np);
 u32 of_msi_map_rid(struct device *dev, struct device_node *msi_np, u32 rid_in);
 #else
@@ -85,7 +86,7 @@ static inline struct irq_domain *of_msi_get_domain(struct 
device *dev,
return NULL;
 }
 static inline struct irq_domain *of_msi_map_get_device_domain(struct device 
*dev,
- u32 rid)
+   u32 id, u32 bus_token)
 {
return NULL;
 }
-- 
2.26.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 11/12] bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver

2020-06-19 Thread Lorenzo Pieralisi
From: Diana Craciun 

The DPRC driver is not taking into account the msi-map property
and assumes that the icid is the same as the stream ID. Although
this assumption is correct, generalize the code to include a
translation between icid and streamID.

Furthermore do not just copy the MSI domain from parent (for child
containers), but use the information provided by the msi-map property.

If the msi-map property is missing from the device tree retain the old
behaviour for backward compatibility ie the child DPRC objects
inherit the MSI domain from the parent.

Signed-off-by: Diana Craciun 
---
 drivers/bus/fsl-mc/dprc-driver.c| 31 ++---
 drivers/bus/fsl-mc/fsl-mc-bus.c |  4 +--
 drivers/bus/fsl-mc/fsl-mc-msi.c | 31 +
 drivers/bus/fsl-mc/fsl-mc-private.h |  6 ++--
 drivers/irqchip/irq-gic-v3-its-fsl-mc-msi.c | 15 +-
 5 files changed, 47 insertions(+), 40 deletions(-)

diff --git a/drivers/bus/fsl-mc/dprc-driver.c b/drivers/bus/fsl-mc/dprc-driver.c
index c8b1c3842c1a..189bff2115a8 100644
--- a/drivers/bus/fsl-mc/dprc-driver.c
+++ b/drivers/bus/fsl-mc/dprc-driver.c
@@ -592,6 +592,7 @@ static int dprc_probe(struct fsl_mc_device *mc_dev)
bool mc_io_created = false;
bool msi_domain_set = false;
u16 major_ver, minor_ver;
+   struct irq_domain *mc_msi_domain;
 
if (!is_fsl_mc_bus_dprc(mc_dev))
return -EINVAL;
@@ -621,31 +622,15 @@ static int dprc_probe(struct fsl_mc_device *mc_dev)
return error;
 
mc_io_created = true;
+   }
 
-   /*
-* Inherit parent MSI domain:
-*/
-   dev_set_msi_domain(_dev->dev,
-  dev_get_msi_domain(parent_dev));
-   msi_domain_set = true;
+   mc_msi_domain = fsl_mc_find_msi_domain(_dev->dev);
+   if (!mc_msi_domain) {
+   dev_warn(_dev->dev,
+"WARNING: MC bus without interrupt support\n");
} else {
-   /*
-* This is a root DPRC
-*/
-   struct irq_domain *mc_msi_domain;
-
-   if (dev_is_fsl_mc(parent_dev))
-   return -EINVAL;
-
-   error = fsl_mc_find_msi_domain(parent_dev,
-  _msi_domain);
-   if (error < 0) {
-   dev_warn(_dev->dev,
-"WARNING: MC bus without interrupt support\n");
-   } else {
-   dev_set_msi_domain(_dev->dev, mc_msi_domain);
-   msi_domain_set = true;
-   }
+   dev_set_msi_domain(_dev->dev, mc_msi_domain);
+   msi_domain_set = true;
}
 
error = dprc_open(mc_dev->mc_io, 0, mc_dev->obj_desc.id,
diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 8ead3f0238f2..824ff77bbe86 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -370,8 +370,8 @@ EXPORT_SYMBOL_GPL(fsl_mc_get_version);
 /**
  * fsl_mc_get_root_dprc - function to traverse to the root dprc
  */
-static void fsl_mc_get_root_dprc(struct device *dev,
-struct device **root_dprc_dev)
+void fsl_mc_get_root_dprc(struct device *dev,
+struct device **root_dprc_dev)
 {
if (!dev) {
*root_dprc_dev = NULL;
diff --git a/drivers/bus/fsl-mc/fsl-mc-msi.c b/drivers/bus/fsl-mc/fsl-mc-msi.c
index 8b9c66d7c4ff..e7bbff445a83 100644
--- a/drivers/bus/fsl-mc/fsl-mc-msi.c
+++ b/drivers/bus/fsl-mc/fsl-mc-msi.c
@@ -177,23 +177,30 @@ struct irq_domain *fsl_mc_msi_create_irq_domain(struct 
fwnode_handle *fwnode,
return domain;
 }
 
-int fsl_mc_find_msi_domain(struct device *mc_platform_dev,
-  struct irq_domain **mc_msi_domain)
+struct irq_domain *fsl_mc_find_msi_domain(struct device *dev)
 {
-   struct irq_domain *msi_domain;
-   struct device_node *mc_of_node = mc_platform_dev->of_node;
+   struct irq_domain *msi_domain = NULL;
+   struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
 
-   msi_domain = of_msi_get_domain(mc_platform_dev, mc_of_node,
-  DOMAIN_BUS_FSL_MC_MSI);
-   if (!msi_domain) {
-   pr_err("Unable to find fsl-mc MSI domain for %pOF\n",
-  mc_of_node);
+   msi_domain = of_msi_map_get_device_domain(dev, mc_dev->icid,
+ DOMAIN_BUS_FSL_MC_MSI);
 
-   return -ENOENT;
+   /*
+* if the msi-map property is missing assume that all the
+* child containers inherit the domain from the parent
+*/
+   if (!msi_domain) {
+   struct device *root_dprc_dev;
+   struct device *bus_dev;
+
+   

[PATCH v2 05/12] ACPI/IORT: Add an input ID to acpi_dma_configure()

2020-06-19 Thread Lorenzo Pieralisi
Some HW devices are created as child devices of proprietary busses,
that have a bus specific policy defining how the child devices
wires representing the devices ID are translated into IOMMU and
IRQ controllers device IDs.

Current IORT code provides translations for:

- PCI devices, where the device ID is well identified at bus level
  as the requester ID (RID)
- Platform devices that are endpoint devices where the device ID is
  retrieved from the ACPI object IORT mappings (Named components single
  mappings). A platform device is represented in IORT as a named
  component node

For devices that are child devices of proprietary busses the IORT
firmware represents the bus node as a named component node in IORT
and it is up to that named component node to define in/out bus
specific ID translations for the bus child devices that are
allocated and created in a bus specific manner.

In order to make IORT ID translations available for proprietary
bus child devices, the current ACPI (and IORT) code must be
augmented to provide an additional ID parameter to acpi_dma_configure()
representing the child devices input ID. This ID is bus specific
and it is retrieved in bus specific code.

By adding an ID parameter to acpi_dma_configure(), the IORT
code can map the child device ID to an IOMMU stream ID through
the IORT named component representing the bus in/out ID mappings.

Signed-off-by: Lorenzo Pieralisi 
Cc: Will Deacon 
Cc: Hanjun Guo 
Cc: Sudeep Holla 
Cc: Catalin Marinas 
Cc: Robin Murphy 
Cc: "Rafael J. Wysocki" 
---
 drivers/acpi/arm64/iort.c | 59 +--
 drivers/acpi/scan.c   |  8 --
 include/acpi/acpi_bus.h   |  9 --
 include/linux/acpi.h  |  7 +
 include/linux/acpi_iort.h |  7 +++--
 5 files changed, 67 insertions(+), 23 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 421c6976ab81..ec782e4a0fe4 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -978,19 +978,54 @@ static void iort_named_component_init(struct device *dev,
   nc->node_flags);
 }
 
+static int iort_nc_iommu_map(struct device *dev, struct acpi_iort_node *node)
+{
+   struct acpi_iort_node *parent;
+   int err = -ENODEV, i = 0;
+   u32 streamid = 0;
+
+   do {
+
+   parent = iort_node_map_platform_id(node, ,
+  IORT_IOMMU_TYPE,
+  i++);
+
+   if (parent)
+   err = iort_iommu_xlate(dev, parent, streamid);
+   } while (parent && !err);
+
+   return err;
+}
+
+static int iort_nc_iommu_map_id(struct device *dev,
+   struct acpi_iort_node *node,
+   const u32 *in_id)
+{
+   struct acpi_iort_node *parent;
+   u32 streamid;
+
+   parent = iort_node_map_id(node, *in_id, , IORT_IOMMU_TYPE);
+   if (parent)
+   return iort_iommu_xlate(dev, parent, streamid);
+
+   return -ENODEV;
+}
+
+
 /**
- * iort_iommu_configure - Set-up IOMMU configuration for a device.
+ * iort_iommu_configure_id - Set-up IOMMU configuration for a device.
  *
  * @dev: device to configure
+ * @id_in: optional input id const value pointer
  *
  * Returns: iommu_ops pointer on configuration success
  *  NULL on configuration failure
  */
-const struct iommu_ops *iort_iommu_configure(struct device *dev)
+const struct iommu_ops *iort_iommu_configure_id(struct device *dev,
+   const u32 *id_in)
 {
-   struct acpi_iort_node *node, *parent;
+   struct acpi_iort_node *node;
const struct iommu_ops *ops;
-   u32 streamid = 0;
int err = -ENODEV;
 
/*
@@ -1019,21 +1054,13 @@ const struct iommu_ops *iort_iommu_configure(struct 
device *dev)
if (fwspec && iort_pci_rc_supports_ats(node))
fwspec->flags |= IOMMU_FWSPEC_PCI_RC_ATS;
} else {
-   int i = 0;
-
node = iort_scan_node(ACPI_IORT_NODE_NAMED_COMPONENT,
  iort_match_node_callback, dev);
if (!node)
return NULL;
 
-   do {
-   parent = iort_node_map_platform_id(node, ,
-  IORT_IOMMU_TYPE,
-  i++);
-
-   if (parent)
-   err = iort_iommu_xlate(dev, parent, streamid);
-   } while (parent && !err);
+   err = id_in ? iort_nc_iommu_map_id(dev, node, id_in) :
+ iort_nc_iommu_map(dev, node);
 
if (!err)
iort_named_component_init(dev, node);
@@ -1058,6 +1085,7 @@ const struct iommu_ops *iort_iommu_configure(struct 
device *dev)
 

[PATCH v2 06/12] of/iommu: Make of_map_rid() PCI agnostic

2020-06-19 Thread Lorenzo Pieralisi
There is nothing PCI specific (other than the RID - requester ID)
in the of_map_rid() implementation, so the same function can be
reused for input/output IDs mapping for other busses just as well.

Rename the RID instances/names to a generic "id" tag.

No functionality change intended.

Signed-off-by: Lorenzo Pieralisi 
Cc: Rob Herring 
Cc: Joerg Roedel 
Cc: Robin Murphy 
Cc: Marc Zyngier 
---
 drivers/iommu/of_iommu.c |  4 ++--
 drivers/of/base.c| 42 
 drivers/of/irq.c |  2 +-
 include/linux/of.h   |  4 ++--
 4 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 20738aacac89..016316244737 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -129,7 +129,7 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 
alias, void *data)
struct of_phandle_args iommu_spec = { .args_count = 1 };
int err;
 
-   err = of_map_rid(info->np, alias, "iommu-map", "iommu-map-mask",
+   err = of_map_id(info->np, alias, "iommu-map", "iommu-map-mask",
 _spec.np, iommu_spec.args);
if (err)
return err == -ENODEV ? NO_IOMMU : err;
@@ -145,7 +145,7 @@ static int of_fsl_mc_iommu_init(struct fsl_mc_device 
*mc_dev,
struct of_phandle_args iommu_spec = { .args_count = 1 };
int err;
 
-   err = of_map_rid(master_np, mc_dev->icid, "iommu-map",
+   err = of_map_id(master_np, mc_dev->icid, "iommu-map",
 "iommu-map-mask", _spec.np,
 iommu_spec.args);
if (err)
diff --git a/drivers/of/base.c b/drivers/of/base.c
index ae03b1218b06..ea44fea99813 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -2201,15 +2201,15 @@ int of_find_last_cache_level(unsigned int cpu)
 }
 
 /**
- * of_map_rid - Translate a requester ID through a downstream mapping.
+ * of_map_id - Translate an ID through a downstream mapping.
  * @np: root complex device node.
- * @rid: device requester ID to map.
+ * @id: device ID to map.
  * @map_name: property name of the map to use.
  * @map_mask_name: optional property name of the mask to use.
  * @target: optional pointer to a target device node.
  * @id_out: optional pointer to receive the translated ID.
  *
- * Given a device requester ID, look up the appropriate implementation-defined
+ * Given a device ID, look up the appropriate implementation-defined
  * platform ID and/or the target device which receives transactions on that
  * ID, as per the "iommu-map" and "msi-map" bindings. Either of @target or
  * @id_out may be NULL if only the other is required. If @target points to
@@ -2219,11 +2219,11 @@ int of_find_last_cache_level(unsigned int cpu)
  *
  * Return: 0 on success or a standard error code on failure.
  */
-int of_map_rid(struct device_node *np, u32 rid,
+int of_map_id(struct device_node *np, u32 id,
   const char *map_name, const char *map_mask_name,
   struct device_node **target, u32 *id_out)
 {
-   u32 map_mask, masked_rid;
+   u32 map_mask, masked_id;
int map_len;
const __be32 *map = NULL;
 
@@ -2235,7 +2235,7 @@ int of_map_rid(struct device_node *np, u32 rid,
if (target)
return -ENODEV;
/* Otherwise, no map implies no translation */
-   *id_out = rid;
+   *id_out = id;
return 0;
}
 
@@ -2255,22 +2255,22 @@ int of_map_rid(struct device_node *np, u32 rid,
if (map_mask_name)
of_property_read_u32(np, map_mask_name, _mask);
 
-   masked_rid = map_mask & rid;
+   masked_id = map_mask & id;
for ( ; map_len > 0; map_len -= 4 * sizeof(*map), map += 4) {
struct device_node *phandle_node;
-   u32 rid_base = be32_to_cpup(map + 0);
+   u32 id_base = be32_to_cpup(map + 0);
u32 phandle = be32_to_cpup(map + 1);
u32 out_base = be32_to_cpup(map + 2);
-   u32 rid_len = be32_to_cpup(map + 3);
+   u32 id_len = be32_to_cpup(map + 3);
 
-   if (rid_base & ~map_mask) {
-   pr_err("%pOF: Invalid %s translation - %s-mask (0x%x) 
ignores rid-base (0x%x)\n",
+   if (id_base & ~map_mask) {
+   pr_err("%pOF: Invalid %s translation - %s-mask (0x%x) 
ignores id-base (0x%x)\n",
np, map_name, map_name,
-   map_mask, rid_base);
+   map_mask, id_base);
return -EFAULT;
}
 
-   if (masked_rid < rid_base || masked_rid >= rid_base + rid_len)
+   if (masked_id < id_base || masked_id >= id_base + id_len)
continue;
 
phandle_node = of_find_node_by_phandle(phandle);
@@ -2288,20 +2288,20 @@ int of_map_rid(struct 

[PATCH v2 04/12] ACPI/IORT: Remove useless PCI bus walk

2020-06-19 Thread Lorenzo Pieralisi
The PCI bus domain number (used in the iort_match_node_callback() -
pci_domain_nr() call) is cascaded through the PCI bus hierarchy at PCI
bus enumeration time, therefore there is no need in iort_find_dev_node()
to walk the PCI bus upwards to grab the root bus to be passed to
iort_scan_node(), the device->bus PCI bus pointer will do.

Remove this useless code.

Signed-off-by: Lorenzo Pieralisi 
Cc: Will Deacon 
Cc: Hanjun Guo 
Cc: Sudeep Holla 
Cc: Catalin Marinas 
Cc: Robin Murphy 
Cc: "Rafael J. Wysocki" 
---
 drivers/acpi/arm64/iort.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 53f9ef515089..421c6976ab81 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -558,10 +558,7 @@ static struct acpi_iort_node *iort_find_dev_node(struct 
device *dev)
  iort_match_node_callback, dev);
}
 
-   /* Find a PCI root bus */
pbus = to_pci_dev(dev)->bus;
-   while (!pci_is_root_bus(pbus))
-   pbus = pbus->parent;
 
return iort_scan_node(ACPI_IORT_NODE_PCI_ROOT_COMPLEX,
  iort_match_node_callback, >dev);
-- 
2.26.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 08/12] dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus

2020-06-19 Thread Lorenzo Pieralisi
From: Laurentiu Tudor 

The existing bindings cannot be used to specify the relationship
between fsl-mc devices and GIC ITSes.
Add a generic binding for mapping fsl-mc devices to GIC ITSes, using
msi-map property.
In addition, deprecate msi-parent property which no longer makes sense
now that we support translating the MSIs.

Signed-off-by: Laurentiu Tudor 
Signed-off-by: Diana Craciun 
Cc: Rob Herring 
---
 .../devicetree/bindings/misc/fsl,qoriq-mc.txt | 50 ---
 1 file changed, 44 insertions(+), 6 deletions(-)

diff --git a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt 
b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
index 9134e9bcca56..ebd329181c14 100644
--- a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
+++ b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
@@ -28,6 +28,16 @@ Documentation/devicetree/bindings/iommu/iommu.txt.
 For arm-smmu binding, see:
 Documentation/devicetree/bindings/iommu/arm,smmu.yaml.
 
+The MSI writes are accompanied by sideband data which is derived from the ICID.
+The msi-map property is used to associate the devices with both the ITS
+controller and the sideband data which accompanies the writes.
+
+For generic MSI bindings, see
+Documentation/devicetree/bindings/interrupt-controller/msi.txt.
+
+For GICv3 and GIC ITS bindings, see:
+Documentation/devicetree/bindings/interrupt-controller/arm,gic-v3.yaml.
+
 Required properties:
 
 - compatible
@@ -49,11 +59,6 @@ Required properties:
 region may not be present in some scenarios, such
 as in the device tree presented to a virtual machine.
 
-- msi-parent
-Value type: 
-Definition: Must be present and point to the MSI controller node
-handling message interrupts for the MC.
-
 - ranges
 Value type: 
 Definition: A standard property.  Defines the mapping between the child
@@ -119,6 +124,28 @@ Optional properties:
   associated with the listed IOMMU, with the iommu-specifier
   (i - icid-base + iommu-base).
 
+- msi-map: Maps an ICID to a GIC ITS and associated msi-specifier
+  data.
+
+  The property is an arbitrary number of tuples of
+  (icid-base,gic-its,msi-base,length).
+
+  Any ICID in the interval [icid-base, icid-base + length) is
+  associated with the listed GIC ITS, with the msi-specifier
+  (i - icid-base + msi-base).
+
+Deprecated properties:
+
+- msi-parent
+Value type: 
+Definition: Describes the MSI controller node handling message
+interrupts for the MC. When there is no translation
+between the ICID and deviceID this property can be used
+to describe the MSI controller used by the devices on the
+mc-bus.
+The use of this property for mc-bus is deprecated. Please
+use msi-map.
+
 Example:
 
 smmu: iommu@500 {
@@ -128,13 +155,24 @@ Example:
...
 };
 
+gic: interrupt-controller@600 {
+   compatible = "arm,gic-v3";
+   ...
+}
+its: gic-its@602 {
+   compatible = "arm,gic-v3-its";
+   msi-controller;
+   ...
+};
+
 fsl_mc: fsl-mc@80c00 {
 compatible = "fsl,qoriq-mc";
 reg = <0x0008 0x0c00 0 0x40>,/* MC portal base */
   <0x 0x0834 0 0x4>; /* MC control reg */
-msi-parent = <>;
 /* define map for ICIDs 23-64 */
 iommu-map = <23  23 41>;
+/* define msi map for ICIDs 23-64 */
+msi-map = <23  23 41>;
 #address-cells = <3>;
 #size-cells = <1>;
 
-- 
2.26.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 07/12] of/device: Add input id to of_dma_configure()

2020-06-19 Thread Lorenzo Pieralisi
Devices sitting on proprietary busses have a device ID space that
is owned by the respective bus and related firmware bindings. In order
to let the generic OF layer handle the input translations to
an IOMMU id, for such busses the current of_dma_configure() interface
should be extended in order to allow the bus layer to provide the
device input id parameter - that is retrieved/assigned in bus
specific code and firmware.

Augment of_dma_configure() to add an optional input_id parameter,
leaving current functionality unchanged.

Signed-off-by: Lorenzo Pieralisi 
Cc: Rob Herring 
Cc: Robin Murphy 
Cc: Joerg Roedel 
Cc: Laurentiu Tudor 
---
 drivers/bus/fsl-mc/fsl-mc-bus.c |  4 +-
 drivers/iommu/of_iommu.c| 81 ++---
 drivers/of/device.c |  8 ++--
 include/linux/of_device.h   | 16 ++-
 include/linux/of_iommu.h|  6 ++-
 5 files changed, 70 insertions(+), 45 deletions(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 40526da5c6a6..8ead3f0238f2 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -118,11 +118,13 @@ static int fsl_mc_bus_uevent(struct device *dev, struct 
kobj_uevent_env *env)
 static int fsl_mc_dma_configure(struct device *dev)
 {
struct device *dma_dev = dev;
+   struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
+   u32 input_id = mc_dev->icid;
 
while (dev_is_fsl_mc(dma_dev))
dma_dev = dma_dev->parent;
 
-   return of_dma_configure(dev, dma_dev->of_node, 0);
+   return of_dma_configure_id(dev, dma_dev->of_node, 0, _id);
 }
 
 static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 016316244737..e505b9130a1c 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -118,46 +118,66 @@ static int of_iommu_xlate(struct device *dev,
return ret;
 }
 
-struct of_pci_iommu_alias_info {
-   struct device *dev;
-   struct device_node *np;
-};
-
-static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
+static int of_iommu_configure_dev_id(struct device_node *master_np,
+struct device *dev,
+const u32 *id)
 {
-   struct of_pci_iommu_alias_info *info = data;
struct of_phandle_args iommu_spec = { .args_count = 1 };
int err;
 
-   err = of_map_id(info->np, alias, "iommu-map", "iommu-map-mask",
-_spec.np, iommu_spec.args);
+   err = of_map_id(master_np, *id, "iommu-map",
+"iommu-map-mask", _spec.np,
+iommu_spec.args);
if (err)
return err == -ENODEV ? NO_IOMMU : err;
 
-   err = of_iommu_xlate(info->dev, _spec);
+   err = of_iommu_xlate(dev, _spec);
of_node_put(iommu_spec.np);
return err;
 }
 
-static int of_fsl_mc_iommu_init(struct fsl_mc_device *mc_dev,
-   struct device_node *master_np)
+static int of_iommu_configure_dev(struct device_node *master_np,
+ struct device *dev)
 {
-   struct of_phandle_args iommu_spec = { .args_count = 1 };
-   int err;
-
-   err = of_map_id(master_np, mc_dev->icid, "iommu-map",
-"iommu-map-mask", _spec.np,
-iommu_spec.args);
-   if (err)
-   return err == -ENODEV ? NO_IOMMU : err;
+   struct of_phandle_args iommu_spec;
+   int err = NO_IOMMU, idx = 0;
+
+   while (!of_parse_phandle_with_args(master_np, "iommus",
+  "#iommu-cells",
+  idx, _spec)) {
+   err = of_iommu_xlate(dev, _spec);
+   of_node_put(iommu_spec.np);
+   idx++;
+   if (err)
+   break;
+   }
 
-   err = of_iommu_xlate(_dev->dev, _spec);
-   of_node_put(iommu_spec.np);
return err;
 }
 
+struct of_pci_iommu_alias_info {
+   struct device *dev;
+   struct device_node *np;
+};
+
+static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
+{
+   struct of_pci_iommu_alias_info *info = data;
+   u32 input_id = alias;
+
+   return of_iommu_configure_dev_id(info->np, info->dev, _id);
+}
+
+static int of_iommu_configure_device(struct device_node *master_np,
+struct device *dev, const u32 *id)
+{
+   return (id) ? of_iommu_configure_dev_id(master_np, dev, id) :
+ of_iommu_configure_dev(master_np, dev);
+}
+
 const struct iommu_ops *of_iommu_configure(struct device *dev,
-  struct device_node *master_np)
+  struct device_node *master_np,
+  const u32 *id)
 {
  

[PATCH v2 02/12] ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic

2020-06-19 Thread Lorenzo Pieralisi
iort_get_device_domain() is PCI specific but it need not be,
since it can be used to retrieve IRQ domain nexus of any kind
by adding an irq_domain_bus_token input to it.

Make it PCI agnostic by also renaming the requestor ID input
to a more generic ID name.

Signed-off-by: Lorenzo Pieralisi 
Acked-by: Bjorn Helgaas# pci/msi.c
Cc: Will Deacon 
Cc: Hanjun Guo 
Cc: Bjorn Helgaas 
Cc: Sudeep Holla 
Cc: Catalin Marinas 
Cc: Robin Murphy 
Cc: "Rafael J. Wysocki" 
---
 drivers/acpi/arm64/iort.c | 14 +++---
 drivers/pci/msi.c |  3 ++-
 include/linux/acpi_iort.h |  7 ---
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 5eee81758184..902e2aaca946 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -550,7 +550,6 @@ static struct acpi_iort_node *iort_find_dev_node(struct 
device *dev)
node = iort_get_iort_node(dev->fwnode);
if (node)
return node;
-
/*
 * if not, then it should be a platform device defined in
 * DSDT/SSDT (with Named Component node in IORT)
@@ -641,13 +640,13 @@ static int __maybe_unused iort_find_its_base(u32 its_id, 
phys_addr_t *base)
 /**
  * iort_dev_find_its_id() - Find the ITS identifier for a device
  * @dev: The device.
- * @req_id: Device's requester ID
+ * @id: Device's ID
  * @idx: Index of the ITS identifier list.
  * @its_id: ITS identifier.
  *
  * Returns: 0 on success, appropriate error value otherwise
  */
-static int iort_dev_find_its_id(struct device *dev, u32 req_id,
+static int iort_dev_find_its_id(struct device *dev, u32 id,
unsigned int idx, int *its_id)
 {
struct acpi_iort_its_group *its;
@@ -657,7 +656,7 @@ static int iort_dev_find_its_id(struct device *dev, u32 
req_id,
if (!node)
return -ENXIO;
 
-   node = iort_node_map_id(node, req_id, NULL, IORT_MSI_TYPE);
+   node = iort_node_map_id(node, id, NULL, IORT_MSI_TYPE);
if (!node)
return -ENXIO;
 
@@ -680,19 +679,20 @@ static int iort_dev_find_its_id(struct device *dev, u32 
req_id,
  *
  * Returns: the MSI domain for this device, NULL otherwise
  */
-struct irq_domain *iort_get_device_domain(struct device *dev, u32 req_id)
+struct irq_domain *iort_get_device_domain(struct device *dev, u32 id,
+ enum irq_domain_bus_token bus_token)
 {
struct fwnode_handle *handle;
int its_id;
 
-   if (iort_dev_find_its_id(dev, req_id, 0, _id))
+   if (iort_dev_find_its_id(dev, id, 0, _id))
return NULL;
 
handle = iort_find_domain_token(its_id);
if (!handle)
return NULL;
 
-   return irq_find_matching_fwnode(handle, DOMAIN_BUS_PCI_MSI);
+   return irq_find_matching_fwnode(handle, bus_token);
 }
 
 static void iort_set_device_domain(struct device *dev,
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 6b43a5455c7a..74a91f52ecc0 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1558,7 +1558,8 @@ struct irq_domain *pci_msi_get_device_domain(struct 
pci_dev *pdev)
pci_for_each_dma_alias(pdev, get_msi_id_cb, );
dom = of_msi_map_get_device_domain(>dev, rid);
if (!dom)
-   dom = iort_get_device_domain(>dev, rid);
+   dom = iort_get_device_domain(>dev, rid,
+DOMAIN_BUS_PCI_MSI);
return dom;
 }
 #endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
index 8e7e2ec37f1b..08ec6bd2297f 100644
--- a/include/linux/acpi_iort.h
+++ b/include/linux/acpi_iort.h
@@ -29,7 +29,8 @@ struct fwnode_handle *iort_find_domain_token(int trans_id);
 #ifdef CONFIG_ACPI_IORT
 void acpi_iort_init(void);
 u32 iort_msi_map_rid(struct device *dev, u32 req_id);
-struct irq_domain *iort_get_device_domain(struct device *dev, u32 req_id);
+struct irq_domain *iort_get_device_domain(struct device *dev, u32 id,
+ enum irq_domain_bus_token bus_token);
 void acpi_configure_pmsi_domain(struct device *dev);
 int iort_pmsi_get_dev_id(struct device *dev, u32 *dev_id);
 /* IOMMU interface */
@@ -40,8 +41,8 @@ int iort_iommu_msi_get_resv_regions(struct device *dev, 
struct list_head *head);
 static inline void acpi_iort_init(void) { }
 static inline u32 iort_msi_map_rid(struct device *dev, u32 req_id)
 { return req_id; }
-static inline struct irq_domain *iort_get_device_domain(struct device *dev,
-   u32 req_id)
+static inline struct irq_domain *iort_get_device_domain(
+   struct device *dev, u32 id, enum irq_domain_bus_token bus_token)
 { return NULL; }
 static inline void acpi_configure_pmsi_domain(struct device *dev) { }
 /* IOMMU interface */
-- 
2.26.1


[PATCH v2 00/12] ACPI/OF: Upgrade MSI/IOMMU ID mapping APIs

2020-06-19 Thread Lorenzo Pieralisi
This series is a v2 of a previous posting:

v1 -> v2

- Removed _rid() wrappers
- Fixed !CONFIG_ACPI compilation issue
- Converted of_pci_iommu_init() to use of_iommu_configure_dev_id()

v1: 
https://lore.kernel.org/linux-arm-kernel/20200521130008.8266-1-lorenzo.pieral...@arm.com/

Original cover letter
-

Firmware bindings provided in the ACPI IORT table[1] and device tree
bindings define rules to carry out input/output ID mappings - ie
retrieving an IOMMU/MSI controller input ID for a device with a given
ID.

At the moment these firmware bindings are used exclusively for PCI
devices and their requester ID to IOMMU/MSI id mapping but there is
nothing PCI specific in the ACPI and devicetree bindings that prevent
the firmware and kernel from using the firmware bindings to traslate
device IDs for any bus that requires its devices to carry out
input/output id translations.

The Freescale FSL bus is an example whereby the input/output ID
translation kernel code put in place for PCI can be reused for devices
attached to the bus that are not PCI devices.

This series updates the kernel code to make the MSI/IOMMU input/output
ID translation PCI agnostic and apply the resulting changes to the
device ID space provided by the Freescale FSL bus.

[1] 
http://infocenter.arm.com/help/topic/com.arm.doc.den0049d/DEN0049D_IO_Remapping_Table.pdf

Cc: Rob Herring 
Cc: "Rafael J. Wysocki" 
Cc: "Joerg Roedel 
Cc: Hanjun Guo 
Cc: Bjorn Helgaas 
Cc: Sudeep Holla 
Cc: Robin Murphy 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Marc Zyngier 

Diana Craciun (2):
  of/irq: make of_msi_map_get_device_domain() bus agnostic
  bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver

Laurentiu Tudor (1):
  dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus

Lorenzo Pieralisi (8):
  ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for
NC
  ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic
  ACPI/IORT: Make iort_msi_map_rid() PCI agnostic
  ACPI/IORT: Remove useless PCI bus walk
  ACPI/IORT: Add an input ID to acpi_dma_configure()
  of/iommu: Make of_map_rid() PCI agnostic
  of/device: Add input id to of_dma_configure()
  of/irq: Make of_msi_map_rid() PCI bus agnostic

Makarand Pawagi (1):
  bus: fsl-mc: Add ACPI support for fsl-mc

 .../devicetree/bindings/misc/fsl,qoriq-mc.txt |  50 +++-
 drivers/acpi/arm64/iort.c | 108 --
 drivers/acpi/scan.c   |   8 +-
 drivers/bus/fsl-mc/dprc-driver.c  |  31 ++---
 drivers/bus/fsl-mc/fsl-mc-bus.c   |  79 +
 drivers/bus/fsl-mc/fsl-mc-msi.c   |  36 --
 drivers/bus/fsl-mc/fsl-mc-private.h   |   6 +-
 drivers/iommu/of_iommu.c  |  81 +++--
 drivers/irqchip/irq-gic-v3-its-fsl-mc-msi.c   | 105 ++---
 drivers/of/base.c |  42 +++
 drivers/of/device.c   |   8 +-
 drivers/of/irq.c  |  34 +++---
 drivers/pci/msi.c |   9 +-
 include/acpi/acpi_bus.h   |   9 +-
 include/linux/acpi.h  |   7 ++
 include/linux/acpi_iort.h |  20 ++--
 include/linux/of.h|   4 +-
 include/linux/of_device.h |  16 ++-
 include/linux/of_iommu.h  |   6 +-
 include/linux/of_irq.h|  13 ++-
 20 files changed, 451 insertions(+), 221 deletions(-)

-- 
2.26.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 01/12] ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC

2020-06-19 Thread Lorenzo Pieralisi
When the iort_match_node_callback is invoked for a named component
the match should be executed upon a device with an ACPI companion.

For devices with no ACPI companion set-up the ACPI device tree must be
walked in order to find the first parent node with a companion set and
check the parent node against the named component entry to check whether
there is a match and therefore an IORT node describing the in/out ID
translation for the device has been found.

Signed-off-by: Lorenzo Pieralisi 
Cc: Will Deacon 
Cc: Hanjun Guo 
Cc: Sudeep Holla 
Cc: Catalin Marinas 
Cc: Robin Murphy 
Cc: "Rafael J. Wysocki" 
---
 drivers/acpi/arm64/iort.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 28a6b387e80e..5eee81758184 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -264,15 +264,31 @@ static acpi_status iort_match_node_callback(struct 
acpi_iort_node *node,
 
if (node->type == ACPI_IORT_NODE_NAMED_COMPONENT) {
struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER, NULL };
-   struct acpi_device *adev = to_acpi_device_node(dev->fwnode);
+   struct acpi_device *adev;
struct acpi_iort_named_component *ncomp;
+   struct device *nc_dev = dev;
+
+   /*
+* Walk the device tree to find a device with an
+* ACPI companion; there is no point in scanning
+* IORT for a device matching a named component if
+* the device does not have an ACPI companion to
+* start with.
+*/
+   do {
+   adev = ACPI_COMPANION(nc_dev);
+   if (adev)
+   break;
+
+   nc_dev = nc_dev->parent;
+   } while (nc_dev);
 
if (!adev)
goto out;
 
status = acpi_get_name(adev->handle, ACPI_FULL_PATHNAME, );
if (ACPI_FAILURE(status)) {
-   dev_warn(dev, "Can't get device full path name\n");
+   dev_warn(nc_dev, "Can't get device full path name\n");
goto out;
}
 
-- 
2.26.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 03/12] ACPI/IORT: Make iort_msi_map_rid() PCI agnostic

2020-06-19 Thread Lorenzo Pieralisi
There is nothing PCI specific in iort_msi_map_rid().

Rename the function using a bus protocol agnostic name,
iort_msi_map_id(), and convert current callers to it.

Signed-off-by: Lorenzo Pieralisi 
Cc: Will Deacon 
Cc: Hanjun Guo 
Cc: Bjorn Helgaas 
Cc: Sudeep Holla 
Cc: Catalin Marinas 
Cc: Robin Murphy 
Cc: "Rafael J. Wysocki" 
---
 drivers/acpi/arm64/iort.c | 12 ++--
 drivers/pci/msi.c |  2 +-
 include/linux/acpi_iort.h |  6 +++---
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 902e2aaca946..53f9ef515089 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -568,22 +568,22 @@ static struct acpi_iort_node *iort_find_dev_node(struct 
device *dev)
 }
 
 /**
- * iort_msi_map_rid() - Map a MSI requester ID for a device
+ * iort_msi_map_id() - Map a MSI input ID for a device
  * @dev: The device for which the mapping is to be done.
- * @req_id: The device requester ID.
+ * @input_id: The device input ID.
  *
- * Returns: mapped MSI RID on success, input requester ID otherwise
+ * Returns: mapped MSI ID on success, input ID otherwise
  */
-u32 iort_msi_map_rid(struct device *dev, u32 req_id)
+u32 iort_msi_map_id(struct device *dev, u32 input_id)
 {
struct acpi_iort_node *node;
u32 dev_id;
 
node = iort_find_dev_node(dev);
if (!node)
-   return req_id;
+   return input_id;
 
-   iort_node_map_id(node, req_id, _id, IORT_MSI_TYPE);
+   iort_node_map_id(node, input_id, _id, IORT_MSI_TYPE);
return dev_id;
 }
 
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 74a91f52ecc0..77f48b95e277 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1536,7 +1536,7 @@ u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, 
struct pci_dev *pdev)
 
of_node = irq_domain_get_of_node(domain);
rid = of_node ? of_msi_map_rid(>dev, of_node, rid) :
-   iort_msi_map_rid(>dev, rid);
+   iort_msi_map_id(>dev, rid);
 
return rid;
 }
diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
index 08ec6bd2297f..e51425e083da 100644
--- a/include/linux/acpi_iort.h
+++ b/include/linux/acpi_iort.h
@@ -28,7 +28,7 @@ void iort_deregister_domain_token(int trans_id);
 struct fwnode_handle *iort_find_domain_token(int trans_id);
 #ifdef CONFIG_ACPI_IORT
 void acpi_iort_init(void);
-u32 iort_msi_map_rid(struct device *dev, u32 req_id);
+u32 iort_msi_map_id(struct device *dev, u32 id);
 struct irq_domain *iort_get_device_domain(struct device *dev, u32 id,
  enum irq_domain_bus_token bus_token);
 void acpi_configure_pmsi_domain(struct device *dev);
@@ -39,8 +39,8 @@ const struct iommu_ops *iort_iommu_configure(struct device 
*dev);
 int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head 
*head);
 #else
 static inline void acpi_iort_init(void) { }
-static inline u32 iort_msi_map_rid(struct device *dev, u32 req_id)
-{ return req_id; }
+static inline u32 iort_msi_map_id(struct device *dev, u32 id)
+{ return id; }
 static inline struct irq_domain *iort_get_device_domain(
struct device *dev, u32 id, enum irq_domain_bus_token bus_token)
 { return NULL; }
-- 
2.26.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 03/12] iommu/sva: Add PASID helpers

2020-06-19 Thread Lu Baolu

Hi Jean,

On 2020/6/18 23:51, Jean-Philippe Brucker wrote:

Let IOMMU drivers allocate a single PASID per mm. Store the mm in the
IOASID set to allow refcounting and searching mm by PASID, when handling
an I/O page fault.

Signed-off-by: Jean-Philippe Brucker 


Reviewed-by: Lu Baolu 

Best regards,
baolu


---
v7->v8: rename to IOMMU_SVA_LIB (Lu Baolu)
---
  drivers/iommu/Kconfig |  5 +++
  drivers/iommu/Makefile|  1 +
  drivers/iommu/iommu-sva-lib.h | 15 +++
  drivers/iommu/iommu-sva-lib.c | 85 +++
  4 files changed, 106 insertions(+)
  create mode 100644 drivers/iommu/iommu-sva-lib.h
  create mode 100644 drivers/iommu/iommu-sva-lib.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index b510f67dfa499..74a10e7a8d082 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -102,6 +102,11 @@ config IOMMU_DMA
select IRQ_MSI_IOMMU
select NEED_SG_DMA_LENGTH
  
+# Shared Virtual Addressing library

+config IOMMU_SVA_LIB
+   bool
+   select IOASID
+
  config FSL_PAMU
bool "Freescale IOMMU support"
depends on PCI
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 342190196dfb0..0fe5a7f9bc69c 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -38,3 +38,4 @@ obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
  obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
  obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
  obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
+obj-$(CONFIG_IOMMU_SVA_LIB) += iommu-sva-lib.o
diff --git a/drivers/iommu/iommu-sva-lib.h b/drivers/iommu/iommu-sva-lib.h
new file mode 100644
index 0..b40990aef3fde
--- /dev/null
+++ b/drivers/iommu/iommu-sva-lib.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * SVA library for IOMMU drivers
+ */
+#ifndef _IOMMU_SVA_LIB_H
+#define _IOMMU_SVA_LIB_H
+
+#include 
+#include 
+
+int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max);
+void iommu_sva_free_pasid(struct mm_struct *mm);
+struct mm_struct *iommu_sva_find(ioasid_t pasid);
+
+#endif /* _IOMMU_SVA_LIB_H */
diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c
new file mode 100644
index 0..db7e6c104d6b0
--- /dev/null
+++ b/drivers/iommu/iommu-sva-lib.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Helpers for IOMMU drivers implementing SVA
+ */
+#include 
+#include 
+
+#include "iommu-sva-lib.h"
+
+static DEFINE_MUTEX(iommu_sva_lock);
+static DECLARE_IOASID_SET(iommu_sva_pasid);
+
+/**
+ * iommu_sva_alloc_pasid - Allocate a PASID for the mm
+ * @mm: the mm
+ * @min: minimum PASID value (inclusive)
+ * @max: maximum PASID value (inclusive)
+ *
+ * Try to allocate a PASID for this mm, or take a reference to the existing one
+ * provided it fits within the [min, max] range. On success the PASID is
+ * available in mm->pasid, and must be released with iommu_sva_free_pasid().
+ *
+ * Returns 0 on success and < 0 on error.
+ */
+int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max)
+{
+   int ret = 0;
+   ioasid_t pasid;
+
+   if (min == INVALID_IOASID || max == INVALID_IOASID ||
+   min == 0 || max < min)
+   return -EINVAL;
+
+   mutex_lock(_sva_lock);
+   if (mm->pasid) {
+   if (mm->pasid >= min && mm->pasid <= max)
+   ioasid_get(mm->pasid);
+   else
+   ret = -EOVERFLOW;
+   } else {
+   pasid = ioasid_alloc(_sva_pasid, min, max, mm);
+   if (pasid == INVALID_IOASID)
+   ret = -ENOMEM;
+   else
+   mm->pasid = pasid;
+   }
+   mutex_unlock(_sva_lock);
+   return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid);
+
+/**
+ * iommu_sva_free_pasid - Release the mm's PASID
+ * @mm: the mm.
+ *
+ * Drop one reference to a PASID allocated with iommu_sva_alloc_pasid()
+ */
+void iommu_sva_free_pasid(struct mm_struct *mm)
+{
+   mutex_lock(_sva_lock);
+   if (ioasid_put(mm->pasid))
+   mm->pasid = 0;
+   mutex_unlock(_sva_lock);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_free_pasid);
+
+/* ioasid wants a void * argument */
+static bool __mmget_not_zero(void *mm)
+{
+   return mmget_not_zero(mm);
+}
+
+/**
+ * iommu_sva_find() - Find mm associated to the given PASID
+ * @pasid: Process Address Space ID assigned to the mm
+ *
+ * On success a reference to the mm is taken, and must be released with 
mmput().
+ *
+ * Returns the mm corresponding to this PASID, or an error if not found.
+ */
+struct mm_struct *iommu_sva_find(ioasid_t pasid)
+{
+   return ioasid_find(_sva_pasid, pasid, __mmget_not_zero);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_find);


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu