Re: [Freedreno] [PATCH] of/device: add blacklist for iommu dma_ops

2019-06-05 Thread Tomasz Figa
On Mon, Jun 3, 2019 at 7:48 PM Rob Clark  wrote:
>
> On Sun, Jun 2, 2019 at 11:25 PM Tomasz Figa  wrote:
> >
> > On Mon, Jun 3, 2019 at 4:40 AM Rob Clark  wrote:
> > >
> > > On Fri, May 10, 2019 at 7:35 AM Rob Clark  wrote:
> > > >
> > > > On Tue, Dec 4, 2018 at 2:29 PM Rob Herring  wrote:
> > > > >
> > > > > On Sat, Dec 1, 2018 at 10:54 AM Rob Clark  wrote:
> > > > > >
> > > > > > This solves a problem we see with drm/msm, caused by getting
> > > > > > iommu_dma_ops while we attach our own domain and manage it directly 
> > > > > > at
> > > > > > the iommu API level:
> > > > > >
> > > > > >   [0038] user address but active_mm is swapper
> > > > > >   Internal error: Oops: 9605 [#1] PREEMPT SMP
> > > > > >   Modules linked in:
> > > > > >   CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: GW 
> > > > > > 4.19.3 #90
> > > > > >   Hardware name: xxx (DT)
> > > > > >   Workqueue: events deferred_probe_work_func
> > > > > >   pstate: 80c9 (Nzcv daif +PAN +UAO)
> > > > > >   pc : iommu_dma_map_sg+0x7c/0x2c8
> > > > > >   lr : iommu_dma_map_sg+0x40/0x2c8
> > > > > >   sp : ff80095eb4f0
> > > > > >   x29: ff80095eb4f0 x28: 
> > > > > >   x27: ffc0f9431578 x26: 
> > > > > >   x25:  x24: 0003
> > > > > >   x23: 0001 x22: ffc0fa9ac010
> > > > > >   x21:  x20: ffc0fab40980
> > > > > >   x19: ffc0fab40980 x18: 0003
> > > > > >   x17: 01c4 x16: 0007
> > > > > >   x15: 000e x14: 
> > > > > >   x13:  x12: 0028
> > > > > >   x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
> > > > > >   x9 :  x8 : ffc0fab409a0
> > > > > >   x7 :  x6 : 0002
> > > > > >   x5 : 0001 x4 : 
> > > > > >   x3 : 0001 x2 : 0002
> > > > > >   x1 : ffc0f9431578 x0 : 
> > > > > >   Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb)
> > > > > >   Call trace:
> > > > > >iommu_dma_map_sg+0x7c/0x2c8
> > > > > >__iommu_map_sg_attrs+0x70/0x84
> > > > > >get_pages+0x170/0x1e8
> > > > > >msm_gem_get_iova+0x8c/0x128
> > > > > >_msm_gem_kernel_new+0x6c/0xc8
> > > > > >msm_gem_kernel_new+0x4c/0x58
> > > > > >dsi_tx_buf_alloc_6g+0x4c/0x8c
> > > > > >msm_dsi_host_modeset_init+0xc8/0x108
> > > > > >msm_dsi_modeset_init+0x54/0x18c
> > > > > >_dpu_kms_drm_obj_init+0x430/0x474
> > > > > >dpu_kms_hw_init+0x5f8/0x6b4
> > > > > >msm_drm_bind+0x360/0x6c8
> > > > > >try_to_bring_up_master.part.7+0x28/0x70
> > > > > >component_master_add_with_match+0xe8/0x124
> > > > > >msm_pdev_probe+0x294/0x2b4
> > > > > >platform_drv_probe+0x58/0xa4
> > > > > >really_probe+0x150/0x294
> > > > > >driver_probe_device+0xac/0xe8
> > > > > >__device_attach_driver+0xa4/0xb4
> > > > > >bus_for_each_drv+0x98/0xc8
> > > > > >__device_attach+0xac/0x12c
> > > > > >device_initial_probe+0x24/0x30
> > > > > >bus_probe_device+0x38/0x98
> > > > > >deferred_probe_work_func+0x78/0xa4
> > > > > >process_one_work+0x24c/0x3dc
> > > > > >worker_thread+0x280/0x360
> > > > > >kthread+0x134/0x13c
> > > > > >ret_from_fork+0x10/0x18
> > > > > >   Code: d284 91000725 6b17039f 5400048a (f9401f40)
> > > > > >   ---[ end trace f22dda57f3648e2c ]---
> > > > > >   Kernel panic - not syncing: Fatal exception
> > > > > >   SMP: stopping secondary CPUs
> > > > > >   Kernel Offset: disabled
> > > > > >   CPU features: 0x0,22802a18
> > 

Re: [Freedreno] [PATCH] of/device: add blacklist for iommu dma_ops

2019-06-03 Thread Tomasz Figa
On Mon, Jun 3, 2019 at 4:40 AM Rob Clark  wrote:
>
> On Fri, May 10, 2019 at 7:35 AM Rob Clark  wrote:
> >
> > On Tue, Dec 4, 2018 at 2:29 PM Rob Herring  wrote:
> > >
> > > On Sat, Dec 1, 2018 at 10:54 AM Rob Clark  wrote:
> > > >
> > > > This solves a problem we see with drm/msm, caused by getting
> > > > iommu_dma_ops while we attach our own domain and manage it directly at
> > > > the iommu API level:
> > > >
> > > >   [0038] user address but active_mm is swapper
> > > >   Internal error: Oops: 9605 [#1] PREEMPT SMP
> > > >   Modules linked in:
> > > >   CPU: 7 PID: 70 Comm: kworker/7:1 Tainted: GW 4.19.3 
> > > > #90
> > > >   Hardware name: xxx (DT)
> > > >   Workqueue: events deferred_probe_work_func
> > > >   pstate: 80c9 (Nzcv daif +PAN +UAO)
> > > >   pc : iommu_dma_map_sg+0x7c/0x2c8
> > > >   lr : iommu_dma_map_sg+0x40/0x2c8
> > > >   sp : ff80095eb4f0
> > > >   x29: ff80095eb4f0 x28: 
> > > >   x27: ffc0f9431578 x26: 
> > > >   x25:  x24: 0003
> > > >   x23: 0001 x22: ffc0fa9ac010
> > > >   x21:  x20: ffc0fab40980
> > > >   x19: ffc0fab40980 x18: 0003
> > > >   x17: 01c4 x16: 0007
> > > >   x15: 000e x14: 
> > > >   x13:  x12: 0028
> > > >   x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
> > > >   x9 :  x8 : ffc0fab409a0
> > > >   x7 :  x6 : 0002
> > > >   x5 : 0001 x4 : 
> > > >   x3 : 0001 x2 : 0002
> > > >   x1 : ffc0f9431578 x0 : 
> > > >   Process kworker/7:1 (pid: 70, stack limit = 0x17d08ffb)
> > > >   Call trace:
> > > >iommu_dma_map_sg+0x7c/0x2c8
> > > >__iommu_map_sg_attrs+0x70/0x84
> > > >get_pages+0x170/0x1e8
> > > >msm_gem_get_iova+0x8c/0x128
> > > >_msm_gem_kernel_new+0x6c/0xc8
> > > >msm_gem_kernel_new+0x4c/0x58
> > > >dsi_tx_buf_alloc_6g+0x4c/0x8c
> > > >msm_dsi_host_modeset_init+0xc8/0x108
> > > >msm_dsi_modeset_init+0x54/0x18c
> > > >_dpu_kms_drm_obj_init+0x430/0x474
> > > >dpu_kms_hw_init+0x5f8/0x6b4
> > > >msm_drm_bind+0x360/0x6c8
> > > >try_to_bring_up_master.part.7+0x28/0x70
> > > >component_master_add_with_match+0xe8/0x124
> > > >msm_pdev_probe+0x294/0x2b4
> > > >platform_drv_probe+0x58/0xa4
> > > >really_probe+0x150/0x294
> > > >driver_probe_device+0xac/0xe8
> > > >__device_attach_driver+0xa4/0xb4
> > > >bus_for_each_drv+0x98/0xc8
> > > >__device_attach+0xac/0x12c
> > > >device_initial_probe+0x24/0x30
> > > >bus_probe_device+0x38/0x98
> > > >deferred_probe_work_func+0x78/0xa4
> > > >process_one_work+0x24c/0x3dc
> > > >worker_thread+0x280/0x360
> > > >kthread+0x134/0x13c
> > > >ret_from_fork+0x10/0x18
> > > >   Code: d284 91000725 6b17039f 5400048a (f9401f40)
> > > >   ---[ end trace f22dda57f3648e2c ]---
> > > >   Kernel panic - not syncing: Fatal exception
> > > >   SMP: stopping secondary CPUs
> > > >   Kernel Offset: disabled
> > > >   CPU features: 0x0,22802a18
> > > >   Memory Limit: none
> > > >
> > > > The problem is that when drm/msm does it's own iommu_attach_device(),
> > > > now the domain returned by iommu_get_domain_for_dev() is drm/msm's
> > > > domain, and it doesn't have domain->iova_cookie.
> > > >
> > > > We kind of avoided this problem prior to sdm845/dpu because the iommu
> > > > was attached to the mdp node in dt, which is a child of the toplevel
> > > > mdss node (which corresponds to the dev passed in dma_map_sg()).  But
> > > > with sdm845, now the iommu is attached at the mdss level so we hit the
> > > > iommu_dma_ops in dma_map_sg().
> > > >
> > > > But auto allocating/attaching a domain before the driver is probed was
> > > > already a blocking problem for enabling per-context pagetables for the
> > > > GPU.  This problem is also now solved with this patch.
> > > >
> > > > Fixes: 97890ba9289c dma-mapping: detect and configure IOMMU in 
> > > > of_dma_configure
> > > > Tested-by: Douglas Anderson 
> > > > Signed-off-by: Rob Clark 
> > > > ---
> > > > This is an alternative/replacement for [1].  What it lacks in elegance
> > > > it makes up for in practicality ;-)
> > > >
> > > > [1] https://patchwork.freedesktop.org/patch/264930/
> > > >
> > > >  drivers/of/device.c | 22 ++
> > > >  1 file changed, 22 insertions(+)
> > > >
> > > > diff --git a/drivers/of/device.c b/drivers/of/device.c
> > > > index 5957cd4fa262..15ffee00fb22 100644
> > > > --- a/drivers/of/device.c
> > > > +++ b/drivers/of/device.c
> > > > @@ -72,6 +72,14 @@ int of_device_add(struct platform_device *ofdev)
> > > > return device_add(>dev);
> > > >  }
> > > >
> > > > +static const struct of_device_id iommu_blacklist[] = {
> > > > +   { .compatible = "qcom,mdp4" },
> > > > +

Re: [Freedreno] [PATCH] of/device: add blacklist for iommu dma_ops

2018-12-02 Thread Tomasz Figa
\n",
> iommu ? " " : " not ");
>
> +   /*
> +* There is at least one case where the driver wants to directly
> +* manage the IOMMU, but if we end up with iommu dma_ops, that
> +* interferes with the drivers ability to use dma_map_sg() for
> +* cache operations.  Since we don't currently have a better
> +* solution, and this code runs before the driver is probed and
> +* has a chance to intervene, use a simple blacklist to avoid
> +* ending up with iommu dma_ops:
> +*/
> +   if (of_match_device(iommu_blacklist, dev)) {
> +   dev_dbg(dev, "skipping iommu hookup\n");
> +   iommu = NULL;
> +   }
> +
> arch_setup_dma_ops(dev, dma_addr, size, iommu, coherent);
>
> return 0;
> --
> 2.19.2
>

+Marek Szyprowski who I believe had a similar problem with Exynos DRM before.

Reviewed-by: Tomasz Figa 

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v3 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-12-02 Thread Tomasz Figa
On Sat, Dec 1, 2018 at 3:47 AM Rob Clark  wrote:
>
> On Fri, Nov 30, 2018 at 9:05 PM Tomasz Figa  wrote:
> >
> > On Thu, Nov 29, 2018 at 4:23 PM Tomasz Figa  wrote:
> > >
> > > On Thu, Nov 29, 2018 at 12:03 PM Robin Murphy  
> > > wrote:
> > > >
> > > > On 29/11/2018 19:57, Tomasz Figa wrote:
> > > > > On Thu, Nov 29, 2018 at 11:40 AM Jordan Crouse 
> > > > >  wrote:
> > > > >>
> > > > >> On Thu, Nov 29, 2018 at 01:48:15PM -0500, Rob Clark wrote:
> > > > >>> On Thu, Nov 29, 2018 at 10:54 AM Christoph Hellwig  
> > > > >>> wrote:
> > > > >>>>
> > > > >>>> On Thu, Nov 29, 2018 at 09:42:50AM -0500, Rob Clark wrote:
> > > > >>>>> Maybe the thing we need to do is just implement a blacklist of
> > > > >>>>> compatible strings for devices which should skip the automatic
> > > > >>>>> iommu/dma hookup.  Maybe a bit ugly, but it would also solve a 
> > > > >>>>> problem
> > > > >>>>> preventing us from enabling per-process pagetables for a5xx 
> > > > >>>>> (where we
> > > > >>>>> need to control the domain/context-bank that is allocated by the 
> > > > >>>>> dma
> > > > >>>>> api).
> > > > >>>>
> > > > >>>> You can detach from the dma map attachment using 
> > > > >>>> arm_iommu_detach_device,
> > > > >>>> which a few drm drivers do, but I don't think this is the problem.
> > > > >>>
> > > > >>> I think even with detach, we wouldn't end up with the context-bank
> > > > >>> that the gpu firmware was hard-coded to expect, and so it would
> > > > >>> overwrite the incorrect page table address register.  (I could be
> > > > >>> mis-remembering that, Jordan spent more time looking at that.  But 
> > > > >>> it
> > > > >>> was something along those lines.)
> > > > >>
> > > > >> Right - basically the DMA domain steals context bank 0 and the GPU 
> > > > >> is hard coded
> > > > >> to use that context bank for pagetable switching.
> > > > >>
> > > > >> I believe the Tegra guys also had a similar problem with a hard 
> > > > >> coded context
> > > > >> bank.
> > > >
> > > > AIUI, they don't need a specific hardware context, they just need to
> > > > know which one they're actually using, which the domain abstraction 
> > > > hides.
> > > >
> > > > > Wait, if we detach the GPU/display struct device from the default
> > > > > domain and attach it to a newly allocated domain, wouldn't the newly
> > > > > allocated domain use the context bank we need? Note that we're already
> > > >
> > > > The arm-smmu driver doesn't, but there's no fundamental reason it
> > > > couldn't. That should just need code to refcount domain users and
> > > > release hardware contexts for domains with no devices currently 
> > > > attached.
> > > >
> > > > Robin.
> > > >
> > > > > doing that, except that we're doing it behind the back of the DMA
> > > > > mapping subsystem, so that it keeps using the IOMMU version of the DMA
> > > > > ops for the device and doing any mapping operations on the default
> > > > > domain. If we ask the DMA mapping to detach, wouldn't it essentially
> > > > > solve the problem?
> > >
> > > Thanks Robin.
> > >
> > > Still, my point is that the MSM DRM driver attaches the GPU struct
> > > device to a new domain it allocates using iommu_domain_alloc() and it
> > > seems to work fine, so I believe it's not the problem we're looking
> > > into with this patch.
> >
> > Could we just make the MSM DRM call arch_teardown_dma_ops() and then
> > arch_setup_dma_ops() with the `iommu` argument set to NULL and be done
> > with it?
>
> I don't think those are exported to modules?
>

Indeed, if we compile MSM DRM as modules, it wouldn't work...

> I have actually a simpler patch, that adds a small blacklist to check
> in of_dma_configure() before calling arch_setup_dma_ops(), which can
> replace this patch.  It also solves the problem of dma api allocating
> the context bank that he gpu wants to use for context-switching, and
> should be a simple thing to backport to stable branches.
>
> I was just spending some time trying to figure out what changed
> recently to start causing dma_map_sg() to opps on boot for us, so I
> could write a more detailed commit msg.

Yeah, that sounds much better, thanks. Reviewed that patch.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v3 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-11-30 Thread Tomasz Figa
On Thu, Nov 29, 2018 at 4:23 PM Tomasz Figa  wrote:
>
> On Thu, Nov 29, 2018 at 12:03 PM Robin Murphy  wrote:
> >
> > On 29/11/2018 19:57, Tomasz Figa wrote:
> > > On Thu, Nov 29, 2018 at 11:40 AM Jordan Crouse  
> > > wrote:
> > >>
> > >> On Thu, Nov 29, 2018 at 01:48:15PM -0500, Rob Clark wrote:
> > >>> On Thu, Nov 29, 2018 at 10:54 AM Christoph Hellwig  wrote:
> > >>>>
> > >>>> On Thu, Nov 29, 2018 at 09:42:50AM -0500, Rob Clark wrote:
> > >>>>> Maybe the thing we need to do is just implement a blacklist of
> > >>>>> compatible strings for devices which should skip the automatic
> > >>>>> iommu/dma hookup.  Maybe a bit ugly, but it would also solve a problem
> > >>>>> preventing us from enabling per-process pagetables for a5xx (where we
> > >>>>> need to control the domain/context-bank that is allocated by the dma
> > >>>>> api).
> > >>>>
> > >>>> You can detach from the dma map attachment using 
> > >>>> arm_iommu_detach_device,
> > >>>> which a few drm drivers do, but I don't think this is the problem.
> > >>>
> > >>> I think even with detach, we wouldn't end up with the context-bank
> > >>> that the gpu firmware was hard-coded to expect, and so it would
> > >>> overwrite the incorrect page table address register.  (I could be
> > >>> mis-remembering that, Jordan spent more time looking at that.  But it
> > >>> was something along those lines.)
> > >>
> > >> Right - basically the DMA domain steals context bank 0 and the GPU is 
> > >> hard coded
> > >> to use that context bank for pagetable switching.
> > >>
> > >> I believe the Tegra guys also had a similar problem with a hard coded 
> > >> context
> > >> bank.
> >
> > AIUI, they don't need a specific hardware context, they just need to
> > know which one they're actually using, which the domain abstraction hides.
> >
> > > Wait, if we detach the GPU/display struct device from the default
> > > domain and attach it to a newly allocated domain, wouldn't the newly
> > > allocated domain use the context bank we need? Note that we're already
> >
> > The arm-smmu driver doesn't, but there's no fundamental reason it
> > couldn't. That should just need code to refcount domain users and
> > release hardware contexts for domains with no devices currently attached.
> >
> > Robin.
> >
> > > doing that, except that we're doing it behind the back of the DMA
> > > mapping subsystem, so that it keeps using the IOMMU version of the DMA
> > > ops for the device and doing any mapping operations on the default
> > > domain. If we ask the DMA mapping to detach, wouldn't it essentially
> > > solve the problem?
>
> Thanks Robin.
>
> Still, my point is that the MSM DRM driver attaches the GPU struct
> device to a new domain it allocates using iommu_domain_alloc() and it
> seems to work fine, so I believe it's not the problem we're looking
> into with this patch.

Could we just make the MSM DRM call arch_teardown_dma_ops() and then
arch_setup_dma_ops() with the `iommu` argument set to NULL and be done
with it?

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v3 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-11-29 Thread Tomasz Figa
On Thu, Nov 29, 2018 at 12:03 PM Robin Murphy  wrote:
>
> On 29/11/2018 19:57, Tomasz Figa wrote:
> > On Thu, Nov 29, 2018 at 11:40 AM Jordan Crouse  
> > wrote:
> >>
> >> On Thu, Nov 29, 2018 at 01:48:15PM -0500, Rob Clark wrote:
> >>> On Thu, Nov 29, 2018 at 10:54 AM Christoph Hellwig  wrote:
> >>>>
> >>>> On Thu, Nov 29, 2018 at 09:42:50AM -0500, Rob Clark wrote:
> >>>>> Maybe the thing we need to do is just implement a blacklist of
> >>>>> compatible strings for devices which should skip the automatic
> >>>>> iommu/dma hookup.  Maybe a bit ugly, but it would also solve a problem
> >>>>> preventing us from enabling per-process pagetables for a5xx (where we
> >>>>> need to control the domain/context-bank that is allocated by the dma
> >>>>> api).
> >>>>
> >>>> You can detach from the dma map attachment using arm_iommu_detach_device,
> >>>> which a few drm drivers do, but I don't think this is the problem.
> >>>
> >>> I think even with detach, we wouldn't end up with the context-bank
> >>> that the gpu firmware was hard-coded to expect, and so it would
> >>> overwrite the incorrect page table address register.  (I could be
> >>> mis-remembering that, Jordan spent more time looking at that.  But it
> >>> was something along those lines.)
> >>
> >> Right - basically the DMA domain steals context bank 0 and the GPU is hard 
> >> coded
> >> to use that context bank for pagetable switching.
> >>
> >> I believe the Tegra guys also had a similar problem with a hard coded 
> >> context
> >> bank.
>
> AIUI, they don't need a specific hardware context, they just need to
> know which one they're actually using, which the domain abstraction hides.
>
> > Wait, if we detach the GPU/display struct device from the default
> > domain and attach it to a newly allocated domain, wouldn't the newly
> > allocated domain use the context bank we need? Note that we're already
>
> The arm-smmu driver doesn't, but there's no fundamental reason it
> couldn't. That should just need code to refcount domain users and
> release hardware contexts for domains with no devices currently attached.
>
> Robin.
>
> > doing that, except that we're doing it behind the back of the DMA
> > mapping subsystem, so that it keeps using the IOMMU version of the DMA
> > ops for the device and doing any mapping operations on the default
> > domain. If we ask the DMA mapping to detach, wouldn't it essentially
> > solve the problem?

Thanks Robin.

Still, my point is that the MSM DRM driver attaches the GPU struct
device to a new domain it allocates using iommu_domain_alloc() and it
seems to work fine, so I believe it's not the problem we're looking
into with this patch.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v3 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-11-29 Thread Tomasz Figa
On Thu, Nov 29, 2018 at 11:40 AM Jordan Crouse  wrote:
>
> On Thu, Nov 29, 2018 at 01:48:15PM -0500, Rob Clark wrote:
> > On Thu, Nov 29, 2018 at 10:54 AM Christoph Hellwig  wrote:
> > >
> > > On Thu, Nov 29, 2018 at 09:42:50AM -0500, Rob Clark wrote:
> > > > Maybe the thing we need to do is just implement a blacklist of
> > > > compatible strings for devices which should skip the automatic
> > > > iommu/dma hookup.  Maybe a bit ugly, but it would also solve a problem
> > > > preventing us from enabling per-process pagetables for a5xx (where we
> > > > need to control the domain/context-bank that is allocated by the dma
> > > > api).
> > >
> > > You can detach from the dma map attachment using arm_iommu_detach_device,
> > > which a few drm drivers do, but I don't think this is the problem.
> >
> > I think even with detach, we wouldn't end up with the context-bank
> > that the gpu firmware was hard-coded to expect, and so it would
> > overwrite the incorrect page table address register.  (I could be
> > mis-remembering that, Jordan spent more time looking at that.  But it
> > was something along those lines.)
>
> Right - basically the DMA domain steals context bank 0 and the GPU is hard 
> coded
> to use that context bank for pagetable switching.
>
> I believe the Tegra guys also had a similar problem with a hard coded context
> bank.

Wait, if we detach the GPU/display struct device from the default
domain and attach it to a newly allocated domain, wouldn't the newly
allocated domain use the context bank we need? Note that we're already
doing that, except that we're doing it behind the back of the DMA
mapping subsystem, so that it keeps using the IOMMU version of the DMA
ops for the device and doing any mapping operations on the default
domain. If we ask the DMA mapping to detach, wouldn't it essentially
solve the problem?

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v3 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-11-29 Thread Tomasz Figa
[CC Marek]

On Thu, Nov 29, 2018 at 9:09 AM Daniel Vetter  wrote:
>
> On Thu, Nov 29, 2018 at 5:57 PM Christoph Hellwig  wrote:
> > On Thu, Nov 29, 2018 at 05:28:07PM +0100, Daniel Vetter wrote:
> > > Just spend a bit of time reading through the implementations already
> > > merged. Is the struct device *dev parameter actually needed anywhere?
> > > dma-api definitely needs it, because we need that to pick the right iommu.
> > > But for cache management from what I've seen the target device doesn't
> > > matter, all the target specific stuff will be handled by the iommu.
> >
> > It looks like only mips every uses the dev argument, and even there
> > the function it passes dev to ignores it.  So it could probably be removed.
> >

Whether the cache maintenance operation needs to actually do anything
or not is a function of `dev`. We can have some devices that are
coherent with CPU caches, and some that are not, on the same system.

> > >
> > > Dropping the dev parameter would make this a perfect fit for coherency
> > > management of buffers used by multiple devices. Right now there's all
> > > kinds of nasty tricks for that use cases needed to avoid redundant
> > > flushes.
> >
> > Note that one thing I'd like to avoid is exposing these funtions directly
> > to drivers, as that will get us into all kinds of abuses.
>
> What kind of abuse do you expect? It could very well be that gpu folks
> call that "standard use case" ... At least on x86 with the i915 driver
> we pretty much rely on architectural guarantees for how cache flushes
> work very much. Down to userspace doing the cache flushing for
> mappings the kernel has set up.

i915 is a very specific case of a fully contained,
architecture-specific hardware subsystem, where you can just hardcode
all integration details inside the driver, because nobody else would
care.

In ARM world, you can have the same IP blocks licensed by multiple SoC
vendors with different integration details and that often includes the
option of coherency.

>
> > So I'd much prefer if we could have iommu APIs wrapping these that are
> > specific to actual uses cases that we understand well.
> >
> > As for the buffer sharing: at least for the DMA API side I want to
> > move the current buffer sharing users away from dma_alloc_coherent
> > (and coherent dma_alloc_attrs users) and the remapping done in there
> > required for non-coherent architectures.  Instead I'd like to allocate
> > plain old pages, and then just dma map them for each device separately,
> > with DMA_ATTR_SKIP_CPU_SYNC passed for all but the first user to map
> > or last user to unmap.  On the iommu side it could probably work
> > similar.
>
> I think this is what's often done. Except then there's also the issue
> of how to get at the cma allocator if your device needs something
> contiguous. There's a lot of that still going on in graphics/media.

I suppose one could just expose CMA with the default pool directly.
It's just an allocator, so not sure why it would need any
device-specific information.

There is also the use case of using CMA with device-specific pools of
memory reusable by the system when not used by the device and those
would have to somehow get the pool to allocate from, but I wonder if
struct device is the right way to pass such information. I'd see the
pool given explicitly like cma_alloc(struct cma_pool *, size, flags)
and perhaps a wrapper cma_alloc_default(size, flags) that is just a
simple macro calling cma_alloc(_pool_default, size, flags).

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v2 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-11-27 Thread Tomasz Figa
Hi Vivek,

On Tue, Nov 27, 2018 at 6:37 AM Vivek Gautam
 wrote:
>
> dma_map_sg() expects a DMA domain. However, the drm devices
> have been traditionally using unmanaged iommu domain which
> is non-dma type. Using dma mapping APIs with that domain is bad.
>
> Replace dma_map_sg() calls with dma_sync_sg_for_device{|cpu}()
> to do the cache maintenance.
>
> Signed-off-by: Vivek Gautam 
> Suggested-by: Tomasz Figa 
> Cc: Rob Clark 
> Cc: Jordan Crouse 
> Cc: Sean Paul 
> ---
>
> Changes since v1:
>  - Addressed Jordan's and Tomasz's comments for
>- moving sg dma addresses preparation out of the coditional
>  check to the main path so we do it irrespective of whether
>  the buffer is cached or uncached.
>- Enhance the comment to explain this dma addresses prepartion.
>

Thanks for the patch. Some comments inline.

>  drivers/gpu/drm/msm/msm_gem.c | 31 ++-
>  1 file changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> index 00c795ced02c..1811ac23a31c 100644
> --- a/drivers/gpu/drm/msm/msm_gem.c
> +++ b/drivers/gpu/drm/msm/msm_gem.c
> @@ -81,6 +81,8 @@ static struct page **get_pages(struct drm_gem_object *obj)
> struct drm_device *dev = obj->dev;
> struct page **p;
> int npages = obj->size >> PAGE_SHIFT;
> +   struct scatterlist *s;
> +   int i;
>
> if (use_pages(obj))
> p = drm_gem_get_pages(obj);
> @@ -104,12 +106,21 @@ static struct page **get_pages(struct drm_gem_object 
> *obj)
> return ptr;
> }
>
> -   /* For non-cached buffers, ensure the new pages are clean
> +   /*
> +* dma_sync_sg_*() flush the physical pages, so point
> +* sg->dma_address to the physical ones for the right 
> behavior.

The two halves of the sequence don't really relate to each other. An
sglist has the `page` field for the purpose of pointing to physical
pages. The `dma_address` field is for DMA addresses, which are not
equivalent to physical addresses. I'd rewrite it like this;

/*
 * Some implementations of the DMA mapping ops expect
 * physical addresses of the pages to be stored as DMA
 * addresses of the sglist entries. To work around it,
 * set them here explicitly.
 */

> +*/
> +   for_each_sg(msm_obj->sgt->sgl, s, msm_obj->sgt->nents, i)
> +   sg_dma_address(s) = sg_phys(s);
> +
> +   /*
> +* For non-cached buffers, ensure the new pages are clean
>  * because display controller, GPU, etc. are not coherent:
>  */
> -   if (msm_obj->flags & (MSM_BO_WC|MSM_BO_UNCACHED))
> -   dma_map_sg(dev->dev, msm_obj->sgt->sgl,
> -   msm_obj->sgt->nents, 
> DMA_BIDIRECTIONAL);
> +   if (msm_obj->flags & (MSM_BO_WC | MSM_BO_UNCACHED))
> +   dma_sync_sg_for_device(dev->dev, msm_obj->sgt->sgl,
> +  msm_obj->sgt->nents,
> +  DMA_TO_DEVICE);

Why changing from DMA_BIDIRECTIONAL?

> }
>
> return msm_obj->pages;
> @@ -133,14 +144,16 @@ static void put_pages(struct drm_gem_object *obj)
>
> if (msm_obj->pages) {
> if (msm_obj->sgt) {
> -   /* For non-cached buffers, ensure the new
> +   /*
> +* For non-cached buffers, ensure the new
>  * pages are clean because display controller,
>  * GPU, etc. are not coherent:
>  */
> -   if (msm_obj->flags & (MSM_BO_WC|MSM_BO_UNCACHED))
> -   dma_unmap_sg(obj->dev->dev, msm_obj->sgt->sgl,
> -msm_obj->sgt->nents,
> -DMA_BIDIRECTIONAL);
> +   if (msm_obj->flags & (MSM_BO_WC | MSM_BO_UNCACHED))
> +   dma_sync_sg_for_cpu(obj->dev->dev,
> +   msm_obj->sgt->sgl,
> +   msm_obj->sgt->nents,
> +   DMA_FROM_DEVICE);

Ditto.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom, smmu-v2 variant

2018-11-25 Thread Tomasz Figa
On Sat, Nov 24, 2018 at 3:34 AM Will Deacon  wrote:
>
> On Fri, Nov 23, 2018 at 03:06:29PM +0530, Vivek Gautam wrote:
> > On Fri, Nov 23, 2018 at 2:52 PM Tomasz Figa  wrote:
> > > On Fri, Nov 23, 2018 at 6:13 PM Vivek Gautam
> > >  wrote:
> > > > On Wed, Nov 21, 2018 at 11:09 PM Will Deacon  
> > > > wrote:
> > > > > On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote:
> > > > > > @@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, 
> > > > > > ARM_SMMU_V1_64K, GENERIC_SMMU);
> > > > > >  ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
> > > > > >  ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
> > > > > >
> > > > > > +static const char * const qcom_smmuv2_clks[] = {
> > > > > > + "bus", "iface",
> > > > > > +};
> > > > > > +
> > > > > > +static const struct arm_smmu_match_data qcom_smmuv2 = {
> > > > > > + .version = ARM_SMMU_V2,
> > > > > > + .model = QCOM_SMMUV2,
> > > > > > + .clks = qcom_smmuv2_clks,
> > > > > > + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks),
> > > > > > +};
> > > > >
> > > > > These seems redundant if we go down the route proposed by Thor, where 
> > > > > we
> > > > > just pull all of the clocks out of the device-tree. In which case, why
> > > > > do we need this match_data at all?
> > > >
> > > > Which is better? Driver relying solely on the device tree to tell
> > > > which all clocks
> > > > are required to be enabled,
> > > > or, the driver deciding itself based on the platform's match data,
> > > > that it should
> > > > have X, Y, & Z clocks that should be supplied from the device tree.
> > >
> > > The former would simplify the driver, but would also make it
> > > impossible to spot mistakes in DT, which would ultimately surface out
> > > as very hard to debug bugs (likely complete system lockups).
> >
> > Thanks.
> > Yea, this is how I understand things presently. Relying on device tree
> > puts the things out of driver's control.
>
> But it also has the undesirable effect of having to update the driver
> code whenever we want to add support for a new SMMU implementation. If
> we do this all in the DT, as Thor is trying to do, then older kernels
> will work well with new hardware.

Fair enough, if you're okay with that. Obviously one would still have
to change the DT bindings to list the exact set of clocks for the new
hardware variant, unless the convention changed recently.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [RESEND PATCH v17 5/5] iommu/arm-smmu: Add support for qcom, smmu-v2 variant

2018-11-23 Thread Tomasz Figa
Hi Vivek, Will,

On Fri, Nov 23, 2018 at 6:13 PM Vivek Gautam
 wrote:
>
> Hi Will,
>
> On Wed, Nov 21, 2018 at 11:09 PM Will Deacon  wrote:
> >
> > [+Thor]
> >
> > On Fri, Nov 16, 2018 at 04:54:30PM +0530, Vivek Gautam wrote:
> > > qcom,smmu-v2 is an arm,smmu-v2 implementation with specific
> > > clock and power requirements.
> > > On msm8996, multiple cores, viz. mdss, video, etc. use this
> > > smmu. On sdm845, this smmu is used with gpu.
> > > Add bindings for the same.
> > >
> > > Signed-off-by: Vivek Gautam 
> > > Reviewed-by: Rob Herring 
> > > Reviewed-by: Tomasz Figa 
> > > Tested-by: Srinivas Kandagatla 
> > > Reviewed-by: Robin Murphy 
> > > ---
> > >  drivers/iommu/arm-smmu.c | 13 +
> > >  1 file changed, 13 insertions(+)
> > >
> > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > > index 2098c3141f5f..d315ca637097 100644
> > > --- a/drivers/iommu/arm-smmu.c
> > > +++ b/drivers/iommu/arm-smmu.c
> > > @@ -120,6 +120,7 @@ enum arm_smmu_implementation {
> > >   GENERIC_SMMU,
> > >   ARM_MMU500,
> > >   CAVIUM_SMMUV2,
> > > + QCOM_SMMUV2,
> > >  };
> > >
> > >  struct arm_smmu_s2cr {
> > > @@ -2026,6 +2027,17 @@ ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, 
> > > GENERIC_SMMU);
> > >  ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
> > >  ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
> > >
> > > +static const char * const qcom_smmuv2_clks[] = {
> > > + "bus", "iface",
> > > +};
> > > +
> > > +static const struct arm_smmu_match_data qcom_smmuv2 = {
> > > + .version = ARM_SMMU_V2,
> > > + .model = QCOM_SMMUV2,
> > > + .clks = qcom_smmuv2_clks,
> > > + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks),
> > > +};
> >
> > These seems redundant if we go down the route proposed by Thor, where we
> > just pull all of the clocks out of the device-tree. In which case, why
> > do we need this match_data at all?
>
> Which is better? Driver relying solely on the device tree to tell
> which all clocks
> are required to be enabled,
> or, the driver deciding itself based on the platform's match data,
> that it should
> have X, Y, & Z clocks that should be supplied from the device tree.

The former would simplify the driver, but would also make it
impossible to spot mistakes in DT, which would ultimately surface out
as very hard to debug bugs (likely complete system lockups).

For qcom_smmuv2, I believe we're eventually going to end up with
platform-specific quirks anyway, so specifying the clocks too wouldn't
hurt. Given that, I'd recommend sticking to the latter, i.e. what this
patch does.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH 1/1] drm: msm: Replace dma_map_sg with dma_sync_sg*

2018-11-20 Thread Tomasz Figa
Hi Jordan, Vivek,

On Wed, Nov 21, 2018 at 12:41 AM Jordan Crouse  wrote:
>
> On Tue, Nov 20, 2018 at 03:24:37PM +0530, Vivek Gautam wrote:
> > dma_map_sg() expects a DMA domain. However, the drm devices
> > have been traditionally using unmanaged iommu domain which
> > is non-dma type. Using dma mapping APIs with that domain is bad.
> >
> > Replace dma_map_sg() calls with dma_sync_sg_for_device{|cpu}()
> > to do the cache maintenance.
> >
> > Signed-off-by: Vivek Gautam 
> > Suggested-by: Tomasz Figa 
> > ---
> >
> > Tested on an MTP sdm845:
> > https://github.com/vivekgautam1/linux/tree/v4.19/sdm845-mtp-display-working
> >
> >  drivers/gpu/drm/msm/msm_gem.c | 27 ---
> >  1 file changed, 20 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> > index 00c795ced02c..d7a7af610803 100644
> > --- a/drivers/gpu/drm/msm/msm_gem.c
> > +++ b/drivers/gpu/drm/msm/msm_gem.c
> > @@ -81,6 +81,8 @@ static struct page **get_pages(struct drm_gem_object *obj)
> >   struct drm_device *dev = obj->dev;
> >   struct page **p;
> >   int npages = obj->size >> PAGE_SHIFT;
> > + struct scatterlist *s;
> > + int i;
> >
> >   if (use_pages(obj))
> >   p = drm_gem_get_pages(obj);
> > @@ -107,9 +109,19 @@ static struct page **get_pages(struct drm_gem_object 
> > *obj)
> >   /* For non-cached buffers, ensure the new pages are clean
> >* because display controller, GPU, etc. are not coherent:
> >*/
> > - if (msm_obj->flags & (MSM_BO_WC|MSM_BO_UNCACHED))
> > - dma_map_sg(dev->dev, msm_obj->sgt->sgl,
> > - msm_obj->sgt->nents, 
> > DMA_BIDIRECTIONAL);
> > + if (msm_obj->flags & (MSM_BO_WC | MSM_BO_UNCACHED)) {
> > + /*
> > +  * Fake up the SG table so that dma_sync_sg_*()
> > +  * can be used to flush the pages associated with it.
> > +  */
>
> We aren't really faking.  The table is real, we are just slightly abusing the
> sg_dma_address() which makes this comment a bit misleading. Instead I would
> probably say something like:
>
> /* dma_sync_sg_* flushes pages using sg_dma_address() so point it at the
>  * physical page for the right behavior */
>
> Or something like that.
>

It's actually quite complicated, but I agree that the comment isn't
very precise. The cases are as follows:
- arm64 iommu_dma_ops use sg_phys()
https://elixir.bootlin.com/linux/v4.20-rc3/source/arch/arm64/mm/dma-mapping.c#L599
- swiotlb_dma_ops used on arm64 if no IOMMU is available use
sg->dma_address directly:
https://elixir.bootlin.com/linux/v4.20-rc3/source/kernel/dma/swiotlb.c#L832
- arm_dma_ops use sg_dma_address():
https://elixir.bootlin.com/linux/v4.20-rc3/source/arch/arm/mm/dma-mapping.c#L1130
- arm iommu_ops use sg_page():
https://elixir.bootlin.com/linux/v4.20-rc3/source/arch/arm/mm/dma-mapping.c#L1869

Sounds like a mess...

> > + for_each_sg(msm_obj->sgt->sgl, s,
> > + msm_obj->sgt->nents, i)
> > + sg_dma_address(s) = sg_phys(s);
> > +
>
> I'm wondering - wouldn't we want to do this association for cached buffers to 
> so
> we could sync them correctly in cpu_prep and cpu_fini?  Maybe it wouldn't hurt
> to put this association in the main path (obviously the sync should stay 
> inside
> the conditional for uncached buffers).
>

I guess it wouldn't hurt indeed. Note that cpu_prep/fini seem to be
missing the sync call currently.

P.S. Jordan, not sure if it's my Gmail or your email client, but your
message had all the recipients in a Reply-to header, except you, so
pressing Reply to all in my case led to a message that didn't have you
in recipients anymore...

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v16 2/5] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-09-07 Thread Tomasz Figa
On Fri, Sep 7, 2018 at 6:38 PM Vivek Gautam  wrote:
>
> Hi Tomasz,
>
>
> On 9/7/2018 2:46 PM, Tomasz Figa wrote:
> > Hi Vivek,
> >
> > On Thu, Aug 30, 2018 at 11:46 PM Vivek Gautam
> >  wrote:
> >> From: Sricharan R 
> >>
> >> The smmu device probe/remove and add/remove master device callbacks
> >> gets called when the smmu is not linked to its master, that is without
> >> the context of the master device. So calling runtime apis in those places
> >> separately.
> >> Global locks are also initialized before enabling runtime pm as the
> >> runtime_resume() calls device_reset() which does tlb_sync_global()
> >> that ultimately requires locks to be initialized.
> >>
> >> Signed-off-by: Sricharan R 
> >> [vivek: Cleanup pm runtime calls]
> >> Signed-off-by: Vivek Gautam 
> >> Reviewed-by: Tomasz Figa 
> >> Tested-by: Srinivas Kandagatla 
> >> ---
> >>   drivers/iommu/arm-smmu.c | 89 
> >> +++-
> >>   1 file changed, 81 insertions(+), 8 deletions(-)
> > [snip]
> >> @@ -2215,10 +2281,17 @@ static int arm_smmu_device_remove(struct 
> >> platform_device *pdev)
> >>  if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
> >>  dev_err(>dev, "removing device with active 
> >> domains!\n");
> >>
> >> +   arm_smmu_rpm_get(smmu);
> >>  /* Turn the thing off */
> >>  writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
> >> +   arm_smmu_rpm_put(smmu);
> >> +
> >> +   if (pm_runtime_enabled(smmu->dev))
> >> +   pm_runtime_force_suspend(smmu->dev);
> >> +   else
> >> +   clk_bulk_disable(smmu->num_clks, smmu->clks);
> >>
> >> -   clk_bulk_disable_unprepare(smmu->num_clks, smmu->clks);
> >> +   clk_bulk_unprepare(smmu->num_clks, smmu->clks);
> > Aren't we missing pm_runtime_disable() here? We'll have the enable
> > count unbalanced if the driver is removed and probed again.
>
> pm_runtime_force_suspend() does a pm_runtime_disable() also if i am not
> wrong.
> And, as mentioned in a previous thread [1], we were seeing a warning
> which we avoided
> by keeping force_suspend().
>
> [1] https://lkml.org/lkml/2018/7/8/124

I see, thanks. I didn't realize that pm_runtime_force_suspend()
already disables runtime PM indeed. Sorry for the noise.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v16 2/5] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-09-07 Thread Tomasz Figa
Hi Vivek,

On Thu, Aug 30, 2018 at 11:46 PM Vivek Gautam
 wrote:
>
> From: Sricharan R 
>
> The smmu device probe/remove and add/remove master device callbacks
> gets called when the smmu is not linked to its master, that is without
> the context of the master device. So calling runtime apis in those places
> separately.
> Global locks are also initialized before enabling runtime pm as the
> runtime_resume() calls device_reset() which does tlb_sync_global()
> that ultimately requires locks to be initialized.
>
> Signed-off-by: Sricharan R 
> [vivek: Cleanup pm runtime calls]
> Signed-off-by: Vivek Gautam 
> Reviewed-by: Tomasz Figa 
> Tested-by: Srinivas Kandagatla 
> ---
>  drivers/iommu/arm-smmu.c | 89 
> +++-
>  1 file changed, 81 insertions(+), 8 deletions(-)
[snip]
> @@ -2215,10 +2281,17 @@ static int arm_smmu_device_remove(struct 
> platform_device *pdev)
> if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
> dev_err(>dev, "removing device with active domains!\n");
>
> +   arm_smmu_rpm_get(smmu);
> /* Turn the thing off */
> writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
> +   arm_smmu_rpm_put(smmu);
> +
> +   if (pm_runtime_enabled(smmu->dev))
> +   pm_runtime_force_suspend(smmu->dev);
> +   else
> +   clk_bulk_disable(smmu->num_clks, smmu->clks);
>
> -   clk_bulk_disable_unprepare(smmu->num_clks, smmu->clks);
> +   clk_bulk_unprepare(smmu->num_clks, smmu->clks);

Aren't we missing pm_runtime_disable() here? We'll have the enable
count unbalanced if the driver is removed and probed again.

Also, if we add pm_runtime_disable(), we can reorder things a bit and
simplify into:

   arm_smmu_rpm_get(smmu);

   /* Turn the thing off */
   writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);

   if (pm_runtime_enabled())
   pm_runtime_disable();
arm_smmu_rpm_put(smmu);

clk_bulk_disable_unprepare(smmu->num_clks, smmu->clks);

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [Patch v15 2/5] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-08-30 Thread Tomasz Figa
On Thu, Aug 30, 2018 at 6:22 PM Vivek Gautam
 wrote:
>
> On Mon, Aug 27, 2018 at 4:27 PM Vivek Gautam
>  wrote:
> >
> > From: Sricharan R 
> >
> > The smmu device probe/remove and add/remove master device callbacks
> > gets called when the smmu is not linked to its master, that is without
> > the context of the master device. So calling runtime apis in those places
> > separately.
> >
> > Signed-off-by: Sricharan R 
> > [vivek: Cleanup pm runtime calls]
> > Signed-off-by: Vivek Gautam 
> > Reviewed-by: Tomasz Figa 
> > Tested-by: Srinivas Kandagatla 
> > ---
> >
> > Changes since v14:
> >  - none.
> >
> >  drivers/iommu/arm-smmu.c | 101 
> > +++
> >  1 file changed, 93 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index a81224bc6637..23b4a60149b6 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
>
> [snip]
>
> > @@ -2131,6 +2188,26 @@ static int arm_smmu_device_probe(struct 
> > platform_device *pdev)
> > if (err)
> > return err;
> >
> > +   /*
> > +* We want to avoid touching dev->power.lock in fastpaths unless
> > +* it's really going to do something useful - pm_runtime_enabled()
> > +* can serve as an ideal proxy for that decision. So, conditionally
> > +* enable pm_runtime.
> > +*/
> > +   if (dev->pm_domain)
> > +   pm_runtime_enable(dev);
> > +
> > +   err = arm_smmu_rpm_get(smmu);
>
> We shouldn't be doing a runtime_get() yet, as this eventually calls
> arm_smmu_device_reset().
> arm_smmu_device_reset() should be called only after 
> arm_smmu_device_cfg_probe().
> So, I plan to replace the pm_runtime_get/put() calls in probe() with
> simple clk_bulk_enable()
> to let the driver initialize smmu, and at the end of the probe we can
> disable the clocks and
> enable runtime pm over the device to let it take care of the device 
> further-on.
>

We can avoid the explicit clock disable by just calling
pm_runtime_set_active() before pm_runtime_enable(), assuming that what
probe does is symmetrical with the suspend callback, which would be
called after the latter.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v14 0/4] iommu/arm-smmu: Add runtime pm/sleep support

2018-08-20 Thread Tomasz Figa
Hi Robin,

On Fri, Jul 27, 2018 at 4:02 PM Vivek Gautam
 wrote:
>
> This series provides the support for turning on the arm-smmu's
> clocks/power domains using runtime pm. This is done using
> device links between smmu and client devices. The device link
> framework keeps the two devices in correct order for power-cycling
> across runtime PM or across system-wide PM.
>
> With addition of a new device link flag DL_FLAG_AUTOREMOVE_SUPPLIER [8]
> (available in linux-next of Rafael's linux-pm tree [9]), the device links
> created between arm-smmu and its clients will be automatically purged
> when arm-smmu driver unbinds from its device.
>
> As not all implementations support clock/power gating, we are checking
> for a valid 'smmu->dev's pm_domain' to conditionally enable the runtime
> power management for such smmu implementations that can support it.
> Otherwise, the clocks are turned to be always on in .probe until .remove.
> With conditional runtime pm now, we avoid touching dev->power.lock
> in fastpaths for smmu implementations that don't need to do anything
> useful with pm_runtime.
> This lets us to use the much-argued pm_runtime_get_sync/put_sync()
> calls in map/unmap callbacks so that the clients do not have to
> worry about handling any of the arm-smmu's power.
>
> This series also adds support for Qcom's arm-smmu-v2 variant that
> has different clocks and power requirements.
>
> Previous version of this patch series is @ [2].
>
> Tested this series on msm8996, and sdm845 after pulling in Rafael's linux-pm
> linux-next[9] and Joerg's iommu next[10] branches, and related changes for
> device tree, etc.
>
> Hi Robin, Will,
> I have addressed the comments for v13. If there's still a chance
> can you please consider pulling this for v4.19.
> Thanks.
>
> [v14]
>* Moved arm_smmu_device_reset() from arm_smmu_pm_resume() to
>  arm_smmu_runtime_resume() so that the pm_resume callback calls
>  only runtime_resume to resume the device.
>  This should take care of restoring the state of smmu in systems
>  in which smmu lose register state on power-domain collapse.

It's been a while since this series was posted and no more comments
seem to be left anymore. Would you have some time to take a look
again? Thanks.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v12 1/4] iommu/arm-smmu: Add pm_runtime/sleep ops

2018-07-11 Thread Tomasz Figa
On Wed, Jul 11, 2018 at 8:11 PM Rafael J. Wysocki  wrote:
>
> On Wed, Jul 11, 2018 at 12:55 PM, Vivek Gautam
>  wrote:
> > Hi Rafael,
> >
> >
> > On Wed, Jul 11, 2018 at 3:20 PM, Rafael J. Wysocki  
> > wrote:
> >> On Sunday, July 8, 2018 7:34:10 PM CEST Vivek Gautam wrote:
> >>> From: Sricharan R 
> >>>
> >>> The smmu needs to be functional only when the respective
> >>> master's using it are active. The device_link feature
> >>> helps to track such functional dependencies, so that the
> >>> iommu gets powered when the master device enables itself
> >>> using pm_runtime. So by adapting the smmu driver for
> >>> runtime pm, above said dependency can be addressed.
> >>>
> >>> This patch adds the pm runtime/sleep callbacks to the
> >>> driver and also the functions to parse the smmu clocks
> >>> from DT and enable them in resume/suspend.
> >>>
> >>> Signed-off-by: Sricharan R 
> >>> Signed-off-by: Archit Taneja 
> >>> [vivek: Clock rework to request bulk of clocks]
> >>> Signed-off-by: Vivek Gautam 
> >>> Reviewed-by: Tomasz Figa 
> >>> ---
> >>>
> >>>  - No change since v11.
> >>>
> >>>  drivers/iommu/arm-smmu.c | 60 
> >>> ++--
> >>>  1 file changed, 58 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> >>> index f7a96bcf94a6..a01d0dde21dd 100644
> >>> --- a/drivers/iommu/arm-smmu.c
> >>> +++ b/drivers/iommu/arm-smmu.c
> >>> @@ -48,6 +48,7 @@
> >>>  #include 
> >>>  #include 
> >>>  #include 
> >>> +#include 
> >>>  #include 
> >>>  #include 
> >>>
> >>> @@ -205,6 +206,8 @@ struct arm_smmu_device {
> >>>   u32 num_global_irqs;
> >>>   u32 num_context_irqs;
> >>>   unsigned int*irqs;
> >>> + struct clk_bulk_data*clks;
> >>> + int num_clks;
> >>>
> >>>   u32 cavium_id_base; /* Specific to 
> >>> Cavium */
> >>>
> >>> @@ -1897,10 +1900,12 @@ static int arm_smmu_device_cfg_probe(struct 
> >>> arm_smmu_device *smmu)
> >>>  struct arm_smmu_match_data {
> >>>   enum arm_smmu_arch_version version;
> >>>   enum arm_smmu_implementation model;
> >>> + const char * const *clks;
> >>> + int num_clks;
> >>>  };
> >>>
> >>>  #define ARM_SMMU_MATCH_DATA(name, ver, imp)  \
> >>> -static struct arm_smmu_match_data name = { .version = ver, .model = imp }
> >>> +static const struct arm_smmu_match_data name = { .version = ver, .model 
> >>> = imp }
> >>>
> >>>  ARM_SMMU_MATCH_DATA(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU);
> >>>  ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
> >>> @@ -1919,6 +1924,23 @@ static const struct of_device_id 
> >>> arm_smmu_of_match[] = {
> >>>  };
> >>>  MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
> >>>
> >>> +static void arm_smmu_fill_clk_data(struct arm_smmu_device *smmu,
> >>> +const char * const *clks)
> >>> +{
> >>> + int i;
> >>> +
> >>> + if (smmu->num_clks < 1)
> >>> + return;
> >>> +
> >>> + smmu->clks = devm_kcalloc(smmu->dev, smmu->num_clks,
> >>> +   sizeof(*smmu->clks), GFP_KERNEL);
> >>> + if (!smmu->clks)
> >>> + return;
> >>> +
> >>> + for (i = 0; i < smmu->num_clks; i++)
> >>> + smmu->clks[i].id = clks[i];
> >>> +}
> >>> +
> >>>  #ifdef CONFIG_ACPI
> >>>  static int acpi_smmu_get_data(u32 model, struct arm_smmu_device *smmu)
> >>>  {
> >>> @@ -2001,6 +2023,9 @@ static int arm_smmu_device_dt_probe(struct 
> >>> platform_device *pdev,
> >>>   data = of_device_get_match_data(dev);
> >>>   smmu->version = data->version;
> >>>   smmu->model = data-&g

Re: [Freedreno] [PATCH v12 2/4] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-07-11 Thread Tomasz Figa
Hi Rafael,

Thanks for review.

On Wed, Jul 11, 2018 at 6:53 PM Rafael J. Wysocki  wrote:
>
> On Sunday, July 8, 2018 7:34:11 PM CEST Vivek Gautam wrote:
> > From: Sricharan R 
> >
> > The smmu device probe/remove and add/remove master device callbacks
> > gets called when the smmu is not linked to its master, that is without
> > the context of the master device. So calling runtime apis in those places
> > separately.
> >
> > Signed-off-by: Sricharan R 
> > [vivek: Cleanup pm runtime calls]
> > Signed-off-by: Vivek Gautam 
> > Reviewed-by: Tomasz Figa 
> > ---
> >
> >  - Change since v11
> >* Replaced pm_runtime_disable() with pm_runtime_force_suspend()
> >  to avoid warning about " Unpreparing enabled clock".
> >  Full warning text mentioned in cover patch.
> >
> >  drivers/iommu/arm-smmu.c | 92 
> > +++-
> >  1 file changed, 84 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index a01d0dde21dd..09265e206e2d 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -268,6 +268,20 @@ static struct arm_smmu_option_prop arm_smmu_options[] 
> > = {
> >   { 0, NULL},
> >  };
> >
> > +static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
> > +{
> > + if (pm_runtime_enabled(smmu->dev))
>
> Why do you need the pm_runtime_enabled() checks here and below?
>
> pm_runtime_get_sync() and pm_runtime_put() should work just fine if
> runtime PM is not enabled.

Because pm_runtime_get_sync() acquires a spin lock, even if only for
the short time of checking if runtime PM is enabled and SMMU driver
maintainers didn't want any spin locks in certain IOMMU API code paths
on hardware implementations that don't need runtime PM, while we still
need to be able to control runtime PM there on hardware
implementations that need so.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-14 Thread Tomasz Figa
On Thu, Feb 15, 2018 at 12:17 PM, Tomasz Figa <tf...@chromium.org> wrote:
> On Thu, Feb 15, 2018 at 1:03 AM, Robin Murphy <robin.mur...@arm.com> wrote:
>> On 14/02/18 10:33, Vivek Gautam wrote:
>>>
>>> On Wed, Feb 14, 2018 at 2:46 PM, Tomasz Figa <tf...@chromium.org> wrote:
>>>
>>> Adding Jordan to this thread as well.
>>>
>>>> On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam
>>>> <vivek.gau...@codeaurora.org> wrote:
>>>>>
>>>>> Hi Tomasz,
>>>>>
>>>>> On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa <tf...@chromium.org>
>>>>> wrote:
>>>>>>
>>>>>> On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam
>>>>>> <vivek.gau...@codeaurora.org> wrote:
>>>>>>>
>>>>>>> Hi Tomasz,
>>>>>>>
>>>>>>> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa <tf...@chromium.org>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdcl...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tf...@chromium.org>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdcl...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tf...@chromium.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Vivek,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the patch. Please see my comments inline.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
>>>>>>>>>>>> <vivek.gau...@codeaurora.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> While handling the concerned iommu, there should not be a
>>>>>>>>>>>>> need to power control the drm devices from iommu interface.
>>>>>>>>>>>>> If these drm devices need to be powered around this time,
>>>>>>>>>>>>> the respective drivers should take care of this.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Replace the pm_runtime_get/put_sync() with
>>>>>>>>>>>>> pm_runtime_get/put_suppliers() calls, to power-up
>>>>>>>>>>>>> the connected iommu through the device link interface.
>>>>>>>>>>>>> In case the device link is not setup these get/put_suppliers()
>>>>>>>>>>>>> calls will be a no-op, and the iommu driver should take care of
>>>>>>>>>>>>> powering on its devices accordingly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Vivek Gautam <vivek.gau...@codeaurora.org>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>   drivers/gpu/drm/msm/msm_iommu.c | 16 
>>>>>>>>>>>>>   1 file changed, 8 insertions(+), 8 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>>>>>>>> b/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>>>>>>>> index b23d33622f37..1ab629bbee69 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>>>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu
>>>>>>>>>>>>> *mmu, const char * const *names,
>>>>>>>>>>>>>  struct msm_iommu *iommu = to_msm_iommu(mmu);
>>>>>>>>>>>>>  int ret;
>>>>>>>>>>>>>
>>>>>>>>>>>&

Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-14 Thread Tomasz Figa
On Thu, Feb 15, 2018 at 1:12 AM, Rob Clark <robdcl...@gmail.com> wrote:
> On Wed, Feb 14, 2018 at 10:48 AM, Jordan Crouse <jcro...@codeaurora.org> 
> wrote:
>> On Wed, Feb 14, 2018 at 12:31:29PM +0900, Tomasz Figa wrote:
>>>
>>> - When submitting commands to the GPU, the GPU driver will
>>> pm_runtime_get_sync() on the GPU device, which will automatically do
>>> the same on all the linked suppliers, which would also include the
>>> SMMU itself. The role of device links here is exactly that the GPU
>>> driver doesn't have to care which other devices need to be brought up.
>>
>> This is true.  Assuming that the device link works correctly we would not 
>> need
>> to explicitly power the SMMU which makes my point entirely moot.
>
> Just to point out what motivated this patchset, the biggest problem is
> iommu_unmap() because that can happen when GPU is not powered on (or
> in the v4l2 case, because some other device dropped it's reference to
> the dma-buf allowing it to be free'd).  Currently we pm get/put the
> GPU device around unmap, but it is kinda silly to boot up the GPU just
> to unmap a buffer.

Note that in V4L2 both mapping and unmapping can happen completely
without involving the driver. So AFAICT the approach being implemented
by this patchset will not work, because there will be no one to power
up the IOMMU before the operation. Moreover, there are platforms for
which there is no reason to power up the IOMMU just for map/unmap,
because the hardware state is lost anyway and the only real work
needed is updating the page tables in memory. (I feel like this is
actually true for most of the platforms in the wild, but this is based
purely on the not so small number of platforms I worked with, haven't
bothered looking for more general evidence.)

>
> (Semi-related, I would also like to batch map/unmap's, I just haven't
> gotten around to implementing it yet.. but that would be another case
> where a single get_supplier()/put_supplier() outside of the iommu
> would make sense instead of pm_get/put() inside the iommu driver's
> ->unmap().)
>
> If you really dislike the get/put_supplier() approach, then perhaps we
> need iommu_pm_get()/iommu_pm_put() operations that the iommu user
> could use to accomplish the same thing?

I'm afraid this wouldn't work for V4L2 either. And I still haven't
been given any evidence that the approach I'm suggesting, which relies
only on existing pieces of infrastructure and which worked for both
Exynos and Rockchip, including V4L2, wouldn't work for SMMU and/or QC
SoCs.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-14 Thread Tomasz Figa
On Thu, Feb 15, 2018 at 1:03 AM, Robin Murphy <robin.mur...@arm.com> wrote:
> On 14/02/18 10:33, Vivek Gautam wrote:
>>
>> On Wed, Feb 14, 2018 at 2:46 PM, Tomasz Figa <tf...@chromium.org> wrote:
>>
>> Adding Jordan to this thread as well.
>>
>>> On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam
>>> <vivek.gau...@codeaurora.org> wrote:
>>>>
>>>> Hi Tomasz,
>>>>
>>>> On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa <tf...@chromium.org>
>>>> wrote:
>>>>>
>>>>> On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam
>>>>> <vivek.gau...@codeaurora.org> wrote:
>>>>>>
>>>>>> Hi Tomasz,
>>>>>>
>>>>>> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa <tf...@chromium.org>
>>>>>> wrote:
>>>>>>>
>>>>>>> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdcl...@gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tf...@chromium.org>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdcl...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tf...@chromium.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Vivek,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the patch. Please see my comments inline.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
>>>>>>>>>>> <vivek.gau...@codeaurora.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> While handling the concerned iommu, there should not be a
>>>>>>>>>>>> need to power control the drm devices from iommu interface.
>>>>>>>>>>>> If these drm devices need to be powered around this time,
>>>>>>>>>>>> the respective drivers should take care of this.
>>>>>>>>>>>>
>>>>>>>>>>>> Replace the pm_runtime_get/put_sync() with
>>>>>>>>>>>> pm_runtime_get/put_suppliers() calls, to power-up
>>>>>>>>>>>> the connected iommu through the device link interface.
>>>>>>>>>>>> In case the device link is not setup these get/put_suppliers()
>>>>>>>>>>>> calls will be a no-op, and the iommu driver should take care of
>>>>>>>>>>>> powering on its devices accordingly.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Vivek Gautam <vivek.gau...@codeaurora.org>
>>>>>>>>>>>> ---
>>>>>>>>>>>>   drivers/gpu/drm/msm/msm_iommu.c | 16 
>>>>>>>>>>>>   1 file changed, 8 insertions(+), 8 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>>>>>>> b/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>>>>>>> index b23d33622f37..1ab629bbee69 100644
>>>>>>>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu
>>>>>>>>>>>> *mmu, const char * const *names,
>>>>>>>>>>>>  struct msm_iommu *iommu = to_msm_iommu(mmu);
>>>>>>>>>>>>  int ret;
>>>>>>>>>>>>
>>>>>>>>>>>> -   pm_runtime_get_sync(mmu->dev);
>>>>>>>>>>>> +   pm_runtime_get_suppliers(mmu->dev);
>>>>>>>>>>>>  ret = iommu_attach_device(iommu->domain, mmu->dev);
>>>>>>>>>>>> -   pm_runtime_put_sync(mmu->

Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-14 Thread Tomasz Figa
On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam
<vivek.gau...@codeaurora.org> wrote:
> Hi Tomasz,
>
> On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa <tf...@chromium.org> wrote:
>> On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam
>> <vivek.gau...@codeaurora.org> wrote:
>>> Hi Tomasz,
>>>
>>> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa <tf...@chromium.org> wrote:
>>>> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdcl...@gmail.com> wrote:
>>>>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tf...@chromium.org> wrote:
>>>>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdcl...@gmail.com> wrote:
>>>>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tf...@chromium.org> wrote:
>>>>>>>> Hi Vivek,
>>>>>>>>
>>>>>>>> Thanks for the patch. Please see my comments inline.
>>>>>>>>
>>>>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
>>>>>>>> <vivek.gau...@codeaurora.org> wrote:
>>>>>>>>> While handling the concerned iommu, there should not be a
>>>>>>>>> need to power control the drm devices from iommu interface.
>>>>>>>>> If these drm devices need to be powered around this time,
>>>>>>>>> the respective drivers should take care of this.
>>>>>>>>>
>>>>>>>>> Replace the pm_runtime_get/put_sync() with
>>>>>>>>> pm_runtime_get/put_suppliers() calls, to power-up
>>>>>>>>> the connected iommu through the device link interface.
>>>>>>>>> In case the device link is not setup these get/put_suppliers()
>>>>>>>>> calls will be a no-op, and the iommu driver should take care of
>>>>>>>>> powering on its devices accordingly.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Vivek Gautam <vivek.gau...@codeaurora.org>
>>>>>>>>> ---
>>>>>>>>>  drivers/gpu/drm/msm/msm_iommu.c | 16 
>>>>>>>>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c 
>>>>>>>>> b/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>>>> index b23d33622f37..1ab629bbee69 100644
>>>>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, 
>>>>>>>>> const char * const *names,
>>>>>>>>> struct msm_iommu *iommu = to_msm_iommu(mmu);
>>>>>>>>> int ret;
>>>>>>>>>
>>>>>>>>> -   pm_runtime_get_sync(mmu->dev);
>>>>>>>>> +   pm_runtime_get_suppliers(mmu->dev);
>>>>>>>>> ret = iommu_attach_device(iommu->domain, mmu->dev);
>>>>>>>>> -   pm_runtime_put_sync(mmu->dev);
>>>>>>>>> +   pm_runtime_put_suppliers(mmu->dev);
>>>>>>>>
>>>>>>>> For me, it looks like a wrong place to handle runtime PM of IOMMU
>>>>>>>> here. iommu_attach_device() calls into IOMMU driver's attach_device()
>>>>>>>> callback and that's where necessary runtime PM gets should happen, if
>>>>>>>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing
>>>>>>>> with power state of device controlled by driver B (ARM SMMU).
>>>>>>>
>>>>>>> Note that we end up having to do the same, because of iommu_unmap()
>>>>>>> while DRM driver is powered off..  it might be cleaner if it was all
>>>>>>> self contained in the iommu driver, but that would make it so other
>>>>>>> drivers couldn't call iommu_unmap() from an irq handler, which is
>>>>>>> apparently something that some of them want to do..
>>>>>>
>>>>>> I'd assume that runtime PM status is already guaranteed to be active
>>>>>> when the IRQ handler is running, by some other means (e.g.
>>>>>> pm_runtime_get_sync() called earlier, when queuin

Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-13 Thread Tomasz Figa
On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam
<vivek.gau...@codeaurora.org> wrote:
> Hi Tomasz,
>
> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa <tf...@chromium.org> wrote:
>> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdcl...@gmail.com> wrote:
>>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tf...@chromium.org> wrote:
>>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdcl...@gmail.com> wrote:
>>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tf...@chromium.org> wrote:
>>>>>> Hi Vivek,
>>>>>>
>>>>>> Thanks for the patch. Please see my comments inline.
>>>>>>
>>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
>>>>>> <vivek.gau...@codeaurora.org> wrote:
>>>>>>> While handling the concerned iommu, there should not be a
>>>>>>> need to power control the drm devices from iommu interface.
>>>>>>> If these drm devices need to be powered around this time,
>>>>>>> the respective drivers should take care of this.
>>>>>>>
>>>>>>> Replace the pm_runtime_get/put_sync() with
>>>>>>> pm_runtime_get/put_suppliers() calls, to power-up
>>>>>>> the connected iommu through the device link interface.
>>>>>>> In case the device link is not setup these get/put_suppliers()
>>>>>>> calls will be a no-op, and the iommu driver should take care of
>>>>>>> powering on its devices accordingly.
>>>>>>>
>>>>>>> Signed-off-by: Vivek Gautam <vivek.gau...@codeaurora.org>
>>>>>>> ---
>>>>>>>  drivers/gpu/drm/msm/msm_iommu.c | 16 
>>>>>>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c 
>>>>>>> b/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>> index b23d33622f37..1ab629bbee69 100644
>>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c
>>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, 
>>>>>>> const char * const *names,
>>>>>>> struct msm_iommu *iommu = to_msm_iommu(mmu);
>>>>>>> int ret;
>>>>>>>
>>>>>>> -   pm_runtime_get_sync(mmu->dev);
>>>>>>> +   pm_runtime_get_suppliers(mmu->dev);
>>>>>>> ret = iommu_attach_device(iommu->domain, mmu->dev);
>>>>>>> -   pm_runtime_put_sync(mmu->dev);
>>>>>>> +   pm_runtime_put_suppliers(mmu->dev);
>>>>>>
>>>>>> For me, it looks like a wrong place to handle runtime PM of IOMMU
>>>>>> here. iommu_attach_device() calls into IOMMU driver's attach_device()
>>>>>> callback and that's where necessary runtime PM gets should happen, if
>>>>>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing
>>>>>> with power state of device controlled by driver B (ARM SMMU).
>>>>>
>>>>> Note that we end up having to do the same, because of iommu_unmap()
>>>>> while DRM driver is powered off..  it might be cleaner if it was all
>>>>> self contained in the iommu driver, but that would make it so other
>>>>> drivers couldn't call iommu_unmap() from an irq handler, which is
>>>>> apparently something that some of them want to do..
>>>>
>>>> I'd assume that runtime PM status is already guaranteed to be active
>>>> when the IRQ handler is running, by some other means (e.g.
>>>> pm_runtime_get_sync() called earlier, when queuing some work to the
>>>> hardware). Otherwise, I'm not sure how a powered down device could
>>>> trigger an IRQ.
>>>>
>>>> So, if the master device power is already on, suppliers should be
>>>> powered on as well, thanks to device links.
>>>>
>>>
>>> umm, that is kindof the inverse of the problem..  the problem is
>>> things like gpu driver (and v4l2 drivers that import dma-buf's,
>>> afaict).. they will potentially call iommu->unmap() when device is not
>>> active (due to userspace or things beyond the control of the driver)..
>>> so *they* would want iommu to do p

Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-13 Thread Tomasz Figa
Hi Jordan,

On Wed, Feb 14, 2018 at 1:42 AM, Jordan Crouse <jcro...@codeaurora.org> wrote:
> On Tue, Feb 13, 2018 at 06:10:38PM +0900, Tomasz Figa wrote:
>> Hi Vivek,
>>
>> Thanks for the patch. Please see my comments inline.
>>
>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
>> <vivek.gau...@codeaurora.org> wrote:
>> > While handling the concerned iommu, there should not be a
>> > need to power control the drm devices from iommu interface.
>> > If these drm devices need to be powered around this time,
>> > the respective drivers should take care of this.
>> >
>> > Replace the pm_runtime_get/put_sync() with
>> > pm_runtime_get/put_suppliers() calls, to power-up
>> > the connected iommu through the device link interface.
>> > In case the device link is not setup these get/put_suppliers()
>> > calls will be a no-op, and the iommu driver should take care of
>> > powering on its devices accordingly.
>> >
>> > Signed-off-by: Vivek Gautam <vivek.gau...@codeaurora.org>
>> > ---
>> >  drivers/gpu/drm/msm/msm_iommu.c | 16 
>> >  1 file changed, 8 insertions(+), 8 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/msm/msm_iommu.c 
>> > b/drivers/gpu/drm/msm/msm_iommu.c
>> > index b23d33622f37..1ab629bbee69 100644
>> > --- a/drivers/gpu/drm/msm/msm_iommu.c
>> > +++ b/drivers/gpu/drm/msm/msm_iommu.c
>> > @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const 
>> > char * const *names,
>> > struct msm_iommu *iommu = to_msm_iommu(mmu);
>> > int ret;
>> >
>> > -   pm_runtime_get_sync(mmu->dev);
>> > +   pm_runtime_get_suppliers(mmu->dev);
>> > ret = iommu_attach_device(iommu->domain, mmu->dev);
>> > -   pm_runtime_put_sync(mmu->dev);
>> > +   pm_runtime_put_suppliers(mmu->dev);
>>
>> For me, it looks like a wrong place to handle runtime PM of IOMMU
>> here. iommu_attach_device() calls into IOMMU driver's attach_device()
>> callback and that's where necessary runtime PM gets should happen, if
>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing
>> with power state of device controlled by driver B (ARM SMMU).
>
> This whole thing is confused by the fact that on MSM the GPU and the GPU IOMMU
> share some of the same clocks and power rail so turning on the GPU also
> turned on the IOMMU register banks by extension.

This is surprisingly not a very surprising case. Exactly the same can
be seen on Rockchip SoCs and we're solving the problem using the
solution I suggested. In fact, my suggestions to this thread are based
on the design we chose for Rockchip, due to the high level of
similarity (+/- the GPU directly programming IOMMU registers, which is
not present there, but AFAICT it doesn't pose a problem here).

>
> But if we put that aside the question is who should be responsible for
> controlling the power in this relationship and there are several good reasons 
> to
> leave it up to the client device. The most important reason is when we move to
> the per-instance model where the GPU self-programmings the SMMU registers. In
> that case, the driver will need to make sure that the SMMU is powered up 
> before
> submitting the command and then removing the power vote when the commands
> are done to save energy.

I might need more insight on what's going on in your hardware, but
with my current understanding I'd argue that that is not right,
because:

- When submitting commands to the GPU, the GPU driver will
pm_runtime_get_sync() on the GPU device, which will automatically do
the same on all the linked suppliers, which would also include the
SMMU itself. The role of device links here is exactly that the GPU
driver doesn't have to care which other devices need to be brought up.

- When the GPU is operating, the SMMU power must be supplied anyway,
because it needs to be doing the translations, right? Note that by
"power" I really mean the physical power supply in the SoC, e.g. as
for a power domain. The runtime PM API in its current form (e.g.
binary off or on operation) is unsuitable for managing other things,
such as clocks (and there is ongoing work on improving it, e.g. by
adding support for multiple power states).

   ^^ The above would be actually guaranteed by your hardware design,
where SMMU and GPU share the power domain and clocks. (We used to rely
on this in old downstream implementation of Rockchip IOMMU and master
drivers in Chromium OS kernel, before we moved to handling the clocks
explicitly in the IOMMU driver and properly using device links to
manage the power domain and sta

Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-13 Thread Tomasz Figa
On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdcl...@gmail.com> wrote:
> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tf...@chromium.org> wrote:
>> Hi Vivek,
>>
>> Thanks for the patch. Please see my comments inline.
>>
>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
>> <vivek.gau...@codeaurora.org> wrote:
>>> While handling the concerned iommu, there should not be a
>>> need to power control the drm devices from iommu interface.
>>> If these drm devices need to be powered around this time,
>>> the respective drivers should take care of this.
>>>
>>> Replace the pm_runtime_get/put_sync() with
>>> pm_runtime_get/put_suppliers() calls, to power-up
>>> the connected iommu through the device link interface.
>>> In case the device link is not setup these get/put_suppliers()
>>> calls will be a no-op, and the iommu driver should take care of
>>> powering on its devices accordingly.
>>>
>>> Signed-off-by: Vivek Gautam <vivek.gau...@codeaurora.org>
>>> ---
>>>  drivers/gpu/drm/msm/msm_iommu.c | 16 
>>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c 
>>> b/drivers/gpu/drm/msm/msm_iommu.c
>>> index b23d33622f37..1ab629bbee69 100644
>>> --- a/drivers/gpu/drm/msm/msm_iommu.c
>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c
>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const 
>>> char * const *names,
>>> struct msm_iommu *iommu = to_msm_iommu(mmu);
>>> int ret;
>>>
>>> -   pm_runtime_get_sync(mmu->dev);
>>> +   pm_runtime_get_suppliers(mmu->dev);
>>> ret = iommu_attach_device(iommu->domain, mmu->dev);
>>> -   pm_runtime_put_sync(mmu->dev);
>>> +   pm_runtime_put_suppliers(mmu->dev);
>>
>> For me, it looks like a wrong place to handle runtime PM of IOMMU
>> here. iommu_attach_device() calls into IOMMU driver's attach_device()
>> callback and that's where necessary runtime PM gets should happen, if
>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing
>> with power state of device controlled by driver B (ARM SMMU).
>
> Note that we end up having to do the same, because of iommu_unmap()
> while DRM driver is powered off..  it might be cleaner if it was all
> self contained in the iommu driver, but that would make it so other
> drivers couldn't call iommu_unmap() from an irq handler, which is
> apparently something that some of them want to do..

I'd assume that runtime PM status is already guaranteed to be active
when the IRQ handler is running, by some other means (e.g.
pm_runtime_get_sync() called earlier, when queuing some work to the
hardware). Otherwise, I'm not sure how a powered down device could
trigger an IRQ.

So, if the master device power is already on, suppliers should be
powered on as well, thanks to device links.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v7 1/6] base: power: runtime: Export pm_runtime_get/put_suppliers

2018-02-13 Thread Tomasz Figa
On Tue, Feb 13, 2018 at 9:00 PM, Robin Murphy <robin.mur...@arm.com> wrote:
> On 13/02/18 07:44, Tomasz Figa wrote:
>>
>> Hi Vivek,
>>
>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
>> <vivek.gau...@codeaurora.org> wrote:
>>>
>>> The device link allows the pm framework to tie the supplier and
>>> consumer. So, whenever the consumer is powered-on the supplier
>>> is powered-on first.
>>>
>>> There are however cases in which the consumer wants to power-on
>>> the supplier, but not itself.
>>> E.g., A Graphics or multimedia driver wants to power-on the SMMU
>>> to unmap a buffer and finish the TLB operations without powering
>>> on itself.
>>
>>
>> This sounds strange to me. If the SMMU is powered down, wouldn't the
>> TLB lose its contents as well (and so no flushing needed)?
>
>
> Depends on implementation details - if runtime PM is actually implemented
> via external clock gating (in the absence of fine-grained power domains),
> then "suspended" TLBs might both retain state and not receive invalidation
> requests, which is really the worst case.

Agreed. That's why in "[PATCH v7 3/6] iommu/arm-smmu: Invoke
pm_runtime during probe, add/remove device" I actually suggested
managing clocks separately from runtime PM. At least until runtime PM
framework arrives at a state, where multiple power states can be
managed, i.e. full power state, clock-gated state, domain-off state.
(I think I might have seen some ongoing work on this on LWN though...)

>
>> Other than that, what kind of hardware operations would be needed
>> besides just updating the page tables from the CPU?
>
>
> Domain attach/detach also require updating SMMU hardware state (and possibly
> TLB maintenance), but don't logically require the master device itself to be
> active at the time.

Wouldn't this hardware state need to be reinitialized anyway after
respective power domain power cycles? (In other words, hardware would
only need programming if it's powered on at the moment.)

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v8 5/6] iommu/arm-smmu: Add support for qcom, smmu-v2 variant

2018-02-13 Thread Tomasz Figa
Hi Vivek,

Thanks for the patch. Please see my comments inline.

On Fri, Feb 9, 2018 at 7:57 PM, Vivek Gautam
 wrote:
> qcom,smmu-v2 is an arm,smmu-v2 implementation with specific
> clock and power requirements. This smmu core is used with
> multiple masters on msm8996, viz. mdss, video, etc.
> Add bindings for the same.
>
> Signed-off-by: Vivek Gautam 
> Reviewed-by: Rob Herring 
> ---
>
> Changes in v8:
>  - Added the missing IOMMU_OF_DECLARE declaration for "qcom,smmu-v2"
>
>  .../devicetree/bindings/iommu/arm,smmu.txt | 43 
> ++
>  drivers/iommu/arm-smmu.c   | 14 +++
>  2 files changed, 57 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
> b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> index 8a6ffce12af5..169222ae2706 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> @@ -17,10 +17,19 @@ conditions.
>  "arm,mmu-401"
>  "arm,mmu-500"
>  "cavium,smmu-v2"
> +"qcom,-smmu-v2", "qcom,smmu-v2"
>
>depending on the particular implementation and/or the
>version of the architecture implemented.
>
> +  A number of Qcom SoCs use qcom,smmu-v2 version of the IP.
> +  "qcom,-smmu-v2" represents a soc specific compatible
> +  string that should be present along with the "qcom,smmu-v2"
> +  to facilitate SoC specific clocks/power connections and to
> +  address specific bug fixes.
> +  An example string would be -
> +  "qcom,msm8996-smmu-v2", "qcom,smmu-v2".

Hmm, I remember that for  and similar wildcards we required
explicitly listing allowed values. Rob, has it changed since I
stumbled upon such thing last time (or I just got it wrong before)?

> +
>  - reg   : Base address and size of the SMMU.
>
>  - #global-interrupts : The number of global interrupts exposed by the
> @@ -71,6 +80,23 @@ conditions.
>or using stream matching with #iommu-cells = <2>, and
>may be ignored if present in such cases.
>
> +- clock-names:Should be "bus", and "iface" for "qcom,smmu-v2"
> +  implementation.
> +
> +  "bus" clock for "qcom,smmu-v2" is required for downstream
> +  bus access and for the smmu ptw.
> +
> +  "iface" clock is required to access smmu's registers 
> through
> +  the TCU's programming interface.

nit: Could we have it in a bit more structured way? E.g.

- clock-names: List of names of clocks input to the device. The
required list depends on particular implementation and is as follows:
 - for "qcom,smmu-v2":
   - "bus": clock required for downstream bus access and for the smmu ptw,
   - "iface":  clock required to access smmu's registers through the
TCU's programming interface.
 - unspecified for other implementations.

(+/- wrapping)

> +
> +- clocks: Phandles for respective clocks described by clock-names.

Phandle is just a pointer to another node. Clocks are however
specified using a phandle and a number of arguments, depending on the
clock controller. I'd change it to:

- clocks: Specifiers for all clocks listed in the clock-names
property, as per generic clock bindings.

> +
> +- power-domains:  Phandles to SMMU's power domain specifier. This is
> +  required even if SMMU belongs to the master's power
> +  domain, as the SMMU will have to be enabled and
> +  accessed before master gets enabled and linked to its
> +  SMMU.

From DT point of view, the relationship of SMMU belonging to a master
device doesn't exist. The power-domains property needs to be properly
specified for all devices within power domains represented in DT, with
an exception of the case when the parent-child relationship is
explicitly stated in DT by child devices being represented by child
nodes of the parent device node.

- power-domains: Specifiers for power domains required to be powered
on for the SMMU to operate, as per generic power domain bindings.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-13 Thread Tomasz Figa
Hi Vivek,

Thanks for the patch. Please see my comments inline.

On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
 wrote:
> While handling the concerned iommu, there should not be a
> need to power control the drm devices from iommu interface.
> If these drm devices need to be powered around this time,
> the respective drivers should take care of this.
>
> Replace the pm_runtime_get/put_sync() with
> pm_runtime_get/put_suppliers() calls, to power-up
> the connected iommu through the device link interface.
> In case the device link is not setup these get/put_suppliers()
> calls will be a no-op, and the iommu driver should take care of
> powering on its devices accordingly.
>
> Signed-off-by: Vivek Gautam 
> ---
>  drivers/gpu/drm/msm/msm_iommu.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> index b23d33622f37..1ab629bbee69 100644
> --- a/drivers/gpu/drm/msm/msm_iommu.c
> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char 
> * const *names,
> struct msm_iommu *iommu = to_msm_iommu(mmu);
> int ret;
>
> -   pm_runtime_get_sync(mmu->dev);
> +   pm_runtime_get_suppliers(mmu->dev);
> ret = iommu_attach_device(iommu->domain, mmu->dev);
> -   pm_runtime_put_sync(mmu->dev);
> +   pm_runtime_put_suppliers(mmu->dev);

For me, it looks like a wrong place to handle runtime PM of IOMMU
here. iommu_attach_device() calls into IOMMU driver's attach_device()
callback and that's where necessary runtime PM gets should happen, if
any. In other words, driver A (MSM DRM driver) shouldn't be dealing
with power state of device controlled by driver B (ARM SMMU).

This is also important for the reasons I stated in my comments to
"[PATCH v7 1/6] base: power: runtime: Export
pm_runtime_get/put_suppliers". Quoting for everyone's convenience:

>> There are however cases in which the consumer wants to power-on
>> the supplier, but not itself.
>> E.g., A Graphics or multimedia driver wants to power-on the SMMU
>> to unmap a buffer and finish the TLB operations without powering
>> on itself.
>
>This sounds strange to me. If the SMMU is powered down, wouldn't the
>TLB lose its contents as well (and so no flushing needed)?
>
>Other than that, what kind of hardware operations would be needed
>besides just updating the page tables from the CPU?
>

In other words, the SMMU driver can deal with hardware state based on
return value of pm_runtime_get_sync() or pm_runtime_get_if_in_use()
and decide whether some operations are necessary or not, e.g.
- a state restore is necessary if the domain was powered off, but we
are bringing the master on,
- a flush may not be required when (un)mapping with the domain powered off,
- etc.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


Re: [Freedreno] [PATCH v7 4/6] iommu/arm-smmu: Add the device_link between masters and smmu

2018-02-13 Thread Tomasz Figa
Hi Vivek,

Thanks for the patch. Please see my comments inline.

On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
 wrote:
> From: Sricharan R 
>
> Finally add the device link between the master device and
> smmu, so that the smmu gets runtime enabled/disabled only when the
> master needs it. This is done from add_device callback which gets
> called once when the master is added to the smmu.
>
> Signed-off-by: Sricharan R 
> Signed-off-by: Vivek Gautam 
> ---
>  drivers/iommu/arm-smmu.c | 16 
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index c024f69c1682..c7e924d553bd 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -215,6 +215,9 @@ struct arm_smmu_device {
>
> /* IOMMU core code handle */
> struct iommu_device iommu;
> +
> +   /* runtime PM link to master */
> +   struct device_link *link;
>  };
>
>  enum arm_smmu_context_fmt {
> @@ -1425,6 +1428,17 @@ static int arm_smmu_add_device(struct device *dev)
>
> pm_runtime_put_sync(smmu->dev);
>
> +   /*
> +* Establish the link between smmu and master, so that the
> +* smmu gets runtime enabled/disabled as per the master's
> +* needs.
> +*/
> +   smmu->link = device_link_add(dev, smmu->dev, DL_FLAG_PM_RUNTIME);
> +   if (!smmu->link)
> +   dev_warn(smmu->dev,
> +"Unable to create device link between %s and %s\n",
> +dev_name(smmu->dev), dev_name(dev));

How likely it is that the master can work normally even if the link
add fails? Perhaps we should just return an error here?

> +
> return 0;
>
>  out_rpm_put:
> @@ -1449,6 +1463,8 @@ static void arm_smmu_remove_device(struct device *dev)
> cfg  = fwspec->iommu_priv;
> smmu = cfg->smmu;
>
> +   device_link_del(smmu->link);

We allowed smmu->link in arm_smmu_add_device(), but here we don't
check it. Looking at the code, device_link_del() doesn't seem to check
either.

Note that this problem would go away if we fail add_device on
device_link_add() failure, as I suggested above, so no change would be
necessary.

Best regards,
Tomasz
___
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno