Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-14 Thread Tomasz Figa
On Thu, Feb 15, 2018 at 12:17 PM, Tomasz Figa  wrote:
> On Thu, Feb 15, 2018 at 1:03 AM, Robin Murphy  wrote:
>> On 14/02/18 10:33, Vivek Gautam wrote:
>>>
>>> On Wed, Feb 14, 2018 at 2:46 PM, Tomasz Figa  wrote:
>>>
>>> Adding Jordan to this thread as well.
>>>
 On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam
  wrote:
>
> Hi Tomasz,
>
> On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa 
> wrote:
>>
>> On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam
>>  wrote:
>>>
>>> Hi Tomasz,
>>>
>>> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa 
>>> wrote:

 On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark 
 wrote:
>
> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa 
> wrote:
>>
>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark 
>> wrote:
>>>
>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa 
>>> wrote:

 Hi Vivek,

 Thanks for the patch. Please see my comments inline.

 On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
  wrote:
>
> While handling the concerned iommu, there should not be a
> need to power control the drm devices from iommu interface.
> If these drm devices need to be powered around this time,
> the respective drivers should take care of this.
>
> Replace the pm_runtime_get/put_sync() with
> pm_runtime_get/put_suppliers() calls, to power-up
> the connected iommu through the device link interface.
> In case the device link is not setup these get/put_suppliers()
> calls will be a no-op, and the iommu driver should take care of
> powering on its devices accordingly.
>
> Signed-off-by: Vivek Gautam 
> ---
>   drivers/gpu/drm/msm/msm_iommu.c | 16 
>   1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_iommu.c
> b/drivers/gpu/drm/msm/msm_iommu.c
> index b23d33622f37..1ab629bbee69 100644
> --- a/drivers/gpu/drm/msm/msm_iommu.c
> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu
> *mmu, const char * const *names,
>  struct msm_iommu *iommu = to_msm_iommu(mmu);
>  int ret;
>
> -   pm_runtime_get_sync(mmu->dev);
> +   pm_runtime_get_suppliers(mmu->dev);
>  ret = iommu_attach_device(iommu->domain, mmu->dev);
> -   pm_runtime_put_sync(mmu->dev);
> +   pm_runtime_put_suppliers(mmu->dev);


 For me, it looks like a wrong place to handle runtime PM of IOMMU
 here. iommu_attach_device() calls into IOMMU driver's
 attach_device()
 callback and that's where necessary runtime PM gets should
 happen, if
 any. In other words, driver A (MSM DRM driver) shouldn't be
 dealing
 with power state of device controlled by driver B (ARM SMMU).
>>>
>>>
>>> Note that we end up having to do the same, because of
>>> iommu_unmap()
>>> while DRM driver is powered off..  it might be cleaner if it was
>>> all
>>> self contained in the iommu driver, but that would make it so
>>> other
>>> drivers couldn't call iommu_unmap() from an irq handler, which is
>>> apparently something that some of them want to do..
>>
>>
>> I'd assume that runtime PM status is already guaranteed to be
>> active
>> when the IRQ handler is running, by some other means (e.g.
>> pm_runtime_get_sync() called earlier, when queuing some work to the
>> hardware). Otherwise, I'm not sure how a powered down device could
>> trigger an IRQ.
>>
>> So, if the master device power is already on, suppliers should be
>> powered on as well, thanks to device links.
>>
>
> umm, that is kindof the inverse of the problem..  the problem is
> things like gpu driver (and v4l2 drivers that import dma-buf's,
> afaict).. they will potentially call iommu->unmap() when device is
> not
> active (due to userspace or things beyond the control of the
> driver)..
> so *they* would want iommu to do pm get/put calls.

Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-14 Thread Tomasz Figa
On Thu, Feb 15, 2018 at 1:12 AM, Rob Clark  wrote:
> On Wed, Feb 14, 2018 at 10:48 AM, Jordan Crouse  
> wrote:
>> On Wed, Feb 14, 2018 at 12:31:29PM +0900, Tomasz Figa wrote:
>>>
>>> - When submitting commands to the GPU, the GPU driver will
>>> pm_runtime_get_sync() on the GPU device, which will automatically do
>>> the same on all the linked suppliers, which would also include the
>>> SMMU itself. The role of device links here is exactly that the GPU
>>> driver doesn't have to care which other devices need to be brought up.
>>
>> This is true.  Assuming that the device link works correctly we would not 
>> need
>> to explicitly power the SMMU which makes my point entirely moot.
>
> Just to point out what motivated this patchset, the biggest problem is
> iommu_unmap() because that can happen when GPU is not powered on (or
> in the v4l2 case, because some other device dropped its reference to
> the dma-buf allowing it to be free'd).  Currently we pm get/put the
> GPU device around unmap, but it is kinda silly to boot up the GPU just
> to unmap a buffer.

Note that in V4L2 both mapping and unmapping can happen completely
without involving the driver. So AFAICT the approach being implemented
by this patchset will not work, because there will be no one to power
up the IOMMU before the operation. Moreover, there are platforms for
which there is no reason to power up the IOMMU just for map/unmap,
because the hardware state is lost anyway and the only real work
needed is updating the page tables in memory. (I feel like this is
actually true for most of the platforms in the wild, but this is based
purely on the not so small number of platforms I worked with, haven't
bothered looking for more general evidence.)
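
To illustrate the second point, here is a rough sketch of what such an
unmap path could look like inside an IOMMU driver (all names made up,
not taken from any real driver):

/* Sketch only: update the in-memory page tables unconditionally, but
 * touch the hardware (TLB) only if the IOMMU is currently powered. If
 * it is powered off, its TLB contents are lost anyway and get rebuilt
 * on resume.
 */
static size_t my_iommu_unmap(struct iommu_domain *domain,
			     unsigned long iova, size_t size)
{
	struct my_iommu_domain *dom = to_my_domain(domain);
	size_t unmapped;

	/* Always safe: plain CPU writes to page tables in system memory. */
	unmapped = my_pgtable_unmap(dom, iova, size);

	/* Cheap refcount check; this does not wake the device up. */
	if (pm_runtime_get_if_in_use(dom->iommu_dev) > 0) {
		my_iommu_tlb_invalidate(dom, iova, size);
		pm_runtime_put(dom->iommu_dev);
	}

	return unmapped;
}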

>
> (Semi-related, I would also like to batch map/unmap's, I just haven't
> gotten around to implementing it yet.. but that would be another case
> where a single get_supplier()/put_supplier() outside of the iommu
> would make sense instead of pm_get/put() inside the iommu driver's
> ->unmap().)
>
> If you really dislike the get/put_supplier() approach, then perhaps we
> need iommu_pm_get()/iommu_pm_put() operations that the iommu user
> could use to accomplish the same thing?

I'm afraid this wouldn't work for V4L2 either. And I still haven't
been given any evidence that the approach I'm suggesting, which relies
only on existing pieces of infrastructure and which worked for both
Exynos and Rockchip, including V4L2, wouldn't work for SMMU and/or QC
SoCs.

Best regards,
Tomasz


Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-14 Thread Tomasz Figa
On Thu, Feb 15, 2018 at 1:03 AM, Robin Murphy  wrote:
> On 14/02/18 10:33, Vivek Gautam wrote:
>>
>> On Wed, Feb 14, 2018 at 2:46 PM, Tomasz Figa  wrote:
>>
>> Adding Jordan to this thread as well.
>>
>>> On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam
>>>  wrote:

 Hi Tomasz,

 On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa 
 wrote:
>
> On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam
>  wrote:
>>
>> Hi Tomasz,
>>
>> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa 
>> wrote:
>>>
>>> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark 
>>> wrote:

 On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa 
 wrote:
>
> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark 
> wrote:
>>
>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa 
>> wrote:
>>>
>>> Hi Vivek,
>>>
>>> Thanks for the patch. Please see my comments inline.
>>>
>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
>>>  wrote:

 While handling the concerned iommu, there should not be a
 need to power control the drm devices from iommu interface.
 If these drm devices need to be powered around this time,
 the respective drivers should take care of this.

 Replace the pm_runtime_get/put_sync() with
 pm_runtime_get/put_suppliers() calls, to power-up
 the connected iommu through the device link interface.
 In case the device link is not setup these get/put_suppliers()
 calls will be a no-op, and the iommu driver should take care of
 powering on its devices accordingly.

 Signed-off-by: Vivek Gautam 
 ---
   drivers/gpu/drm/msm/msm_iommu.c | 16 
   1 file changed, 8 insertions(+), 8 deletions(-)

 diff --git a/drivers/gpu/drm/msm/msm_iommu.c
 b/drivers/gpu/drm/msm/msm_iommu.c
 index b23d33622f37..1ab629bbee69 100644
 --- a/drivers/gpu/drm/msm/msm_iommu.c
 +++ b/drivers/gpu/drm/msm/msm_iommu.c
 @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu
 *mmu, const char * const *names,
  struct msm_iommu *iommu = to_msm_iommu(mmu);
  int ret;

 -   pm_runtime_get_sync(mmu->dev);
 +   pm_runtime_get_suppliers(mmu->dev);
  ret = iommu_attach_device(iommu->domain, mmu->dev);
 -   pm_runtime_put_sync(mmu->dev);
 +   pm_runtime_put_suppliers(mmu->dev);
>>>
>>>
>>> For me, it looks like a wrong place to handle runtime PM of IOMMU
>>> here. iommu_attach_device() calls into IOMMU driver's
>>> attach_device()
>>> callback and that's where necessary runtime PM gets should
>>> happen, if
>>> any. In other words, driver A (MSM DRM driver) shouldn't be
>>> dealing
>>> with power state of device controlled by driver B (ARM SMMU).
>>
>>
>> Note that we end up having to do the same, because of
>> iommu_unmap()
>> while DRM driver is powered off..  it might be cleaner if it was
>> all
>> self contained in the iommu driver, but that would make it so
>> other
>> drivers couldn't call iommu_unmap() from an irq handler, which is
>> apparently something that some of them want to do..
>
>
> I'd assume that runtime PM status is already guaranteed to be
> active
> when the IRQ handler is running, by some other means (e.g.
> pm_runtime_get_sync() called earlier, when queuing some work to the
> hardware). Otherwise, I'm not sure how a powered down device could
> trigger an IRQ.
>
> So, if the master device power is already on, suppliers should be
> powered on as well, thanks to device links.
>

 umm, that is kindof the inverse of the problem..  the problem is
 things like gpu driver (and v4l2 drivers that import dma-buf's,
 afaict).. they will potentially call iommu->unmap() when device is
 not
 active (due to userspace or things beyond the control of the
 driver)..
 so *they* would want iommu to do pm get/put calls.
>>>
>>>
>>> Which is fine and which is actually already done by one of the
>>> patches
>>> in this series, not for map/unmap, but probe, add_device,
>>> 

Re: [Freedreno] [RFC PULL] Add Display Support for Qualcomm SDM845

2018-02-14 Thread Rob Clark
On Wed, Feb 14, 2018 at 1:17 PM, jsanka  wrote:
>
>
> On 2/13/2018 12:00 PM, Rob Clark wrote:
>>
>> On Tue, Feb 13, 2018 at 2:18 PM, Sean Paul  wrote:
>>>
>>> Hi dri-devel,
>>> Qualcomm has been working for the past few weeks on forward porting their
>>> downstream drm driver from 4.14 to mainline. Please consider this PR as a
>>> request for review, rather than an attempt at mainlining the code as it
>>> currently stands. The goal is to get this driver in shape over the next
>>> coming
>>> months.
>>>
>>> In the meantime, I'll be hosting a tree here [1] to stage the fixes.
>>> Patches
>>> will be posted and reviewed on linux-arm-...@vger.kernel.org. Once things
>>> look
>>> good, I'll send another pull 4realz.
>>>
>>> To get the ball rolling, I've done some review on the new connector code,
>>> my
>>> comments are below.
>>>
>>> Thanks in advance for your constructive feedback :)
>>>
>>> Sean
>>>
>>> [1]- git://people.freedesktop.org/~seanpaul/dpu-staging
>>>
>>> Review feedback:
>>> 
>>
>> just so others aren't confused, s/sde/dpu/g in the list below (we did
>> our DAL->DC rename before sending first RFC :-P)
>>
>>> - Solve the splash screen handling (or remove it)
>>> - Simplify devicetree binding (remove register offsets)
>>> feedback from reviewing sde_connector.c:
>>> - Rationalize backlight implementation in sde_connector (display_count
>>> static)
>>> - Sort out the dsi event passing between dsi/encoder/connector (move to
>>> encoder)
>>> - include/uapi/drm/msm_drm_pp.h needs opensource userspace (or removal)
>>> - connector->state access violations reading/writing mode_info
>>> - s/sde_rect/drm_rect/
>>> - sde_kms_info usage needs to be replaced with formal data structures
>>> (not
>>>stringified keypairs)
>>> - sde_connector_ops needs to be trimmed, duplicates connector helpers,
>>> info
>>>hooks circumvent state, and other hooks should be stored in state or
>>>prepopulated (get_dst_format)
>>> - sde_connector_get_dpms unused
>>> - esd status check should migrate to encoder from connector
>>> - backlight should be handled in panel drivers, not in the generic
>>> connector/dsi
>>>encoder
>>> - sde_connector_helper_bridge_disable is called from encoder and calls
>>> back into
>>>set_power encoder function. if backlight, and esd status are removed,
>>>pre_kickoff can probably go away
>>> - sde_connector_clk_ctrl is another example of
>>> encoder->connector->encoder call
>>> - RETIRE_FENCE connector property should be removed, opting for the
>>> native
>>>atomic fences
>>> - ROI (regions of interest) should be expressed per-plane instead of
>>> connector.
>>>there is work ongoing to support dirty_rects per-plane by Deepak Singh
>>> Rawat
>>> and Lukasz Spintzyk
>>> 
>>> - Uma Shankar  has proposed HDR source metadata
>>>properties on the list, we should pivot to those instead of
>>> hand-rolling them
>>>in the sde driver
>>> - Convert HDCP implementation to upstream Content Protection property
>>> - Merge dsi and dsi_staging into one driver
>>> - Writeback connector has been proposed by ARM (Liviu Dudau and Brian
>>> Starkey),
>>>we should work with their proposal instead of rolling OUT_FB ourselves
>>> - sde_connector_set_property should be replaced with atomic helper
>>> - dsi hotplug can probably be punted to the panel driver
>>> - dpms should switch to enable/disable (or at least use the atomic
>>> helpers)
>>> - dsi mode handling should also defer to the panel driver
>>> - SDE_WB_CONFIG ioctl should be removed in favor of the existing ioctl to
>>> add
>>>user-defined modes
>>> - dp implementation should use the existing dp helpers wherever possible
>>> - lots of duplicated structures in dsi_defs.h that can be replaced with
>>> existing
>>>drm structs
>>> - mode_valid should be split up and implemented directly in
>>> connector/encoder as
>>>appropriate
>>> - sde_connector->aspace seems like it's unused?
>>>
>> To add to that, a few things I've noticed (but I've mostly only gotten
>> as far as the front-end of the pipeline, ie. dpu_plane):
>>
>>   - It looks like, as was the case with mdp4/mdp5 (and really most other
>> hw)
>> there are still unequal capabilities for planes (ie. some can do YUV,
>> scaling, etc).
>>
>> But drm-hwc (or weston) isn't going to know about that, so I think
>> we'll
>> need to add support for dynamically assigning dpu_plane::pipe, similar
>> to what mdp5 does w/ plane<->hwpipe.  (Which I actually added
>> specifically
>> for drm-hwc ;-))
>
> We are working on plane virtualization focused primarily to support 4K /
> wider displays which cannot
> be catered by one hwpipe. The plan is to gang up two *fixed* hwpipes of
> similar capabilities (in terms of formats,
> sub hw blocks like scalar, post processing ) and expose them as a single
> plane so that user 

Re: [Freedreno] [RFC PULL] Add Display Support for Qualcomm SDM845

2018-02-14 Thread jsanka



On 2/13/2018 12:00 PM, Rob Clark wrote:

On Tue, Feb 13, 2018 at 2:18 PM, Sean Paul  wrote:

Hi dri-devel,
Qualcomm has been working for the past few weeks on forward porting their
downstream drm driver from 4.14 to mainline. Please consider this PR as a
request for review, rather than an attempt at mainlining the code as it
currently stands. The goal is to get this driver in shape over the next coming
months.

In the meantime, I'll be hosting a tree here [1] to stage the fixes. Patches
will be posted and reviewed on linux-arm-...@vger.kernel.org. Once things look
good, I'll send another pull 4realz.

To get the ball rolling, I've done some review on the new connector code, my
comments are below.

Thanks in advance for your constructive feedback :)

Sean

[1]- git://people.freedesktop.org/~seanpaul/dpu-staging

Review feedback:


just so others aren't confused, s/sde/dpu/g in the list below (we did
our DAL->DC rename before sending first RFC :-P)


- Solve the splash screen handling (or remove it)
- Simplify devicetree binding (remove register offsets)
feedback from reviewing sde_connector.c:
- Rationalize backlight implementation in sde_connector (display_count static)
- Sort out the dsi event passing between dsi/encoder/connector (move to encoder)
- include/uapi/drm/msm_drm_pp.h needs opensource userspace (or removal)
- connector->state access violations reading/writing mode_info
- s/sde_rect/drm_rect/
- sde_kms_info usage needs to be replaced with formal data structures (not
   stringified keypairs)
- sde_connector_ops needs to be trimmed, duplicates connector helpers, info
   hooks circumvent state, and other hooks should be stored in state or
   prepopulated (get_dst_format)
- sde_connector_get_dpms unused
- esd status check should migrate to encoder from connector
- backlight should be handled in panel drivers, not in the generic connector/dsi
   encoder
- sde_connector_helper_bridge_disable is called from encoder and calls back into
   set_power encoder function. if backlight, and esd status are removed,
   pre_kickoff can probably go away
- sde_connector_clk_ctrl is another example of encoder->connector->encoder call
- RETIRE_FENCE connector property should be removed, opting for the native
   atomic fences
- ROI (regions of interest) should be expressed per-plane instead of connector.
   there is work ongoing to support dirty_rects per-plane by Deepak Singh Rawat
    and Lukasz Spintzyk 
- Uma Shankar  has proposed HDR source metadata
   properties on the list, we should pivot to those instead of hand-rolling them
   in the sde driver
- Convert HDCP implementation to upstream Content Protection property
- Merge dsi and dsi_staging into one driver
- Writeback connector has been proposed by ARM (Liviu Dudau and Brian Starkey),
   we should work with their proposal instead of rolling OUT_FB ourselves
- sde_connector_set_property should be replaced with atomic helper
- dsi hotplug can probably be punted to the panel driver
- dpms should switch to enable/disable (or at least use the atomic helpers)
- dsi mode handling should also defer to the panel driver
- SDE_WB_CONFIG ioctl should be removed in favor of the existing ioctl to add
   user-defined modes
- dp implementation should use the existing dp helpers wherever possible
- lots of duplicated structures in dsi_defs.h that can be replaced with existing
   drm structs
- mode_valid should be split up and implemented directly in connector/encoder as
   appropriate
- sde_connector->aspace seems like it's unused?


To add to that, a few things I've noticed (but I've mostly only gotten
as far as the front-end of the pipeline, ie. dpu_plane):

  - It looks like, as was the case with mdp4/mdp5 (and really most other hw)
there are still unequal capabilities for planes (ie. some can do YUV,
scaling, etc).

But drm-hwc (or weston) isn't going to know about that, so I think we'll
need to add support for dynamically assigning dpu_plane::pipe, similar
to what mdp5 does w/ plane<->hwpipe.  (Which I actually added specifically
for drm-hwc ;-))
We are working on plane virtualization, focused primarily on supporting 4K /
wider displays that cannot be handled by one hwpipe. The plan is to gang up
two *fixed* hwpipes of similar capabilities (in terms of formats and sub-hw
blocks like the scaler and post processing) and expose them as a single
plane, so that the user space compositor (drm-hwc or weston) need not worry
about the maximum pixel width supported by the planes. But mapping planes
<-> hwpipes dynamically may need more than just virtualization. The kernel
needs to keep track of hwpipes, especially when dealing with multiple
displays, and it gets complicated when planes start moving around CRTCs
for different sized buffers.
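
As a rough illustration of the idea (purely a sketch with made-up names,
not the actual DPU code):

/* A "virtual" plane exposed to userspace, backed by one or two fixed hw
 * pipes that the kernel gangs together depending on the buffer width.
 */
struct dpu_virtual_plane {
	struct drm_plane base;
	struct dpu_hw_pipe *hw_pipes[2];	/* ganged pipes backing this plane */
	unsigned int num_hw_pipes;
};

/* How many hw pipes does a buffer of this width need? */
static unsigned int dpu_pipes_needed(unsigned int fb_width,
				     unsigned int max_pipe_width)
{
	return (fb_width > max_pipe_width) ? 2 : 1;
}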

Given that DRM exposes planes each with their own list of formats/modifiers,
I think it's not that big of a deal for a compositor to 

[Freedreno] [PATCH 2/2] drm/msm: Add A6XX device support

2018-02-14 Thread Jordan Crouse
Add support for the A6XX family of Adreno GPUs. The biggest addition
is the GMU (Graphics Management Unit) which takes over most of the
power management of the GPU itself but in an ironic twist of fate
needs a goodly amount of management itself.  Add support for the
A6XX core code, the GMU and the HFI (hardware firmware interface)
queue that the CPU uses to communicate with the GMU.

Signed-off-by: Jordan Crouse 
---
 drivers/gpu/drm/msm/Makefile   |3 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c  | 1210 
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h  |  162 
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |  809 +++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h  |   60 ++
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c  |  435 ++
 drivers/gpu/drm/msm/adreno/a6xx_hfi.h  |  127 +++
 drivers/gpu/drm/msm/adreno/adreno_device.c |   12 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c|2 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h|5 +-
 drivers/gpu/drm/msm/msm_gpu.c  |2 +-
 11 files changed, 2824 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_gmu.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_gmu.h
 create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_gpu.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_gpu.h
 create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_hfi.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_hfi.h

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index cd40c050b2d7..4affc665c0de 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -10,6 +10,9 @@ msm-y := \
adreno/a5xx_gpu.o \
adreno/a5xx_power.o \
adreno/a5xx_preempt.o \
+   adreno/a6xx_gpu.o \
+   adreno/a6xx_gmu.o \
+   adreno/a6xx_hfi.o \
hdmi/hdmi.o \
hdmi/hdmi_audio.o \
hdmi/hdmi_bridge.o \
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
new file mode 100644
index ..a031c2eaf18c
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -0,0 +1,1210 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2017-2018 The Linux Foundation. All rights reserved. */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "a6xx_gpu.h"
+#include "a6xx_gmu.xml.h"
+
+static irqreturn_t a6xx_gmu_irq(int irq, void *data)
+{
+   struct a6xx_gmu *gmu = data;
+   u32 status;
+
+   status = gmu_read(gmu, REG_A6XX_GMU_AO_HOST_INTERRUPT_STATUS);
+   gmu_write(gmu, REG_A6XX_GMU_AO_HOST_INTERRUPT_CLR, status);
+
+   if (status & A6XX_GMU_AO_HOST_INTERRUPT_STATUS_WDOG_BITE) {
+   dev_err_ratelimited(gmu->dev, "GMU watchdog expired\n");
+
+   /* Temporary until we can recover safely */
+   BUG();
+   }
+
+   if (status &  A6XX_GMU_AO_HOST_INTERRUPT_STATUS_HOST_AHB_BUS_ERROR)
+   dev_err_ratelimited(gmu->dev, "GMU AHB bus error\n");
+
+   if (status & A6XX_GMU_AO_HOST_INTERRUPT_STATUS_FENCE_ERR)
+   dev_err_ratelimited(gmu->dev, "GMU fence error: 0x%x\n",
+   gmu_read(gmu, REG_A6XX_GMU_AHB_FENCE_STATUS));
+
+   return IRQ_HANDLED;
+}
+
+static irqreturn_t a6xx_hfi_irq(int irq, void *data)
+{
+   struct a6xx_gmu *gmu = data;
+   u32 status;
+
+   status = gmu_read(gmu, REG_A6XX_GMU_GMU2HOST_INTR_INFO);
+   gmu_write(gmu, REG_A6XX_GMU_GMU2HOST_INTR_CLR, status);
+
+   if (status & A6XX_GMU_GMU2HOST_INTR_INFO_MSGQ)
+   tasklet_schedule(&gmu->hfi_tasklet);
+
+   if (status & A6XX_GMU_GMU2HOST_INTR_INFO_CM3_FAULT) {
+   dev_err_ratelimited(gmu->dev, "GMU firmware fault\n");
+
+   /* Temporary until we can recover safely */
+   BUG();
+   }
+
+   return IRQ_HANDLED;
+}
+
+/* Check to see if the GX rail is still powered */
+static bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu)
+{
+   u32 val  = gmu_read(gmu, REG_A6XX_GMU_SPTPRAC_PWR_CLK_STATUS);
+
+   return !(val &
+   (A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_SPTPRAC_GDSC_POWER_OFF |
+   A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF));
+}
+
+#define GX_OFF_MASK \
+   (A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_SPTPRAC_GDSC_POWER_OFF | \
+A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF)
+
+static bool a6xx_gmu_check_idle_level(struct a6xx_gmu *gmu)
+{
+   u32 val;
+   int local = gmu->idle_level;
+
+   /* SPTP and IFPC both report as IFPC */
+   if (gmu->idle_level == GMU_IDLE_STATE_SPTP)
+   local = GMU_IDLE_STATE_IFPC;
+
+   val = gmu_read(gmu, REG_A6XX_GPU_GMU_CX_GMU_RPMH_POWER_STATE);
+
+   if (val == local) {
+   if (gmu->idle_level != GMU_IDLE_STATE_IFPC ||
+   !a6xx_gmu_gx_is_on(gmu))
+   return true;
+   }
+
+   return false;
+}
+
+/* Wait for the GMU to get to its most idle 

[Freedreno] [PATCH 1/2] drm/msm: Add generated headers for A6XX

2018-02-14 Thread Jordan Crouse
From: Sharat Masetty 

Add initial register headers for A6XX targets.

Signed-off-by: Sharat Masetty 
Signed-off-by: Jordan Crouse 
---
 drivers/gpu/drm/msm/adreno/a6xx.xml.h | 1600 +
 drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h |  382 +++
 2 files changed, 1982 insertions(+)
 create mode 100644 drivers/gpu/drm/msm/adreno/a6xx.xml.h
 create mode 100644 drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h

diff --git a/drivers/gpu/drm/msm/adreno/a6xx.xml.h 
b/drivers/gpu/drm/msm/adreno/a6xx.xml.h
new file mode 100644
index ..17d12414f347
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a6xx.xml.h
@@ -0,0 +1,1600 @@
+#ifndef A6XX_XML
+#define A6XX_XML
+
+/* Autogenerated file, DO NOT EDIT manually!
+
+This file was generated by the rules-ng-ng headergen tool in this git 
repository:
+http://github.com/freedreno/envytools/
+git clone https://github.com/freedreno/envytools.git
+
+The rules-ng-ng source files this header was generated from are:
+- /local3/projects/drm/envytools/rnndb//adreno.xml   (501 
bytes, from 2017-11-20 17:36:01)
+- /local3/projects/drm/envytools/rnndb//freedreno_copyright.xml  (   1572 
bytes, from 2016-10-24 21:12:27)
+- /local3/projects/drm/envytools/rnndb//adreno/a2xx.xml  (  32901 
bytes, from 2016-10-24 21:12:27)
+- /local3/projects/drm/envytools/rnndb//adreno/adreno_common.xml (  11792 
bytes, from 2017-11-17 23:22:22)
+- /local3/projects/drm/envytools/rnndb//adreno/adreno_pm4.xml(  17205 
bytes, from 2017-11-17 23:22:22)
+- /local3/projects/drm/envytools/rnndb//adreno/a3xx.xml  (  83693 
bytes, from 2017-11-17 23:22:22)
+- /local3/projects/drm/envytools/rnndb//adreno/a4xx.xml  ( 110372 
bytes, from 2017-11-17 23:22:22)
+- /local3/projects/drm/envytools/rnndb//adreno/a5xx.xml  (  66793 
bytes, from 2017-11-17 23:22:22)
+- /local3/projects/drm/envytools/rnndb//adreno/a6xx.xml  (  44551 
bytes, from 2017-11-20 19:30:19)
+- /local3/projects/drm/envytools/rnndb//adreno/a6xx_gmu.xml  (  10431 
bytes, from 2017-11-20 17:59:44)
+- /local3/projects/drm/envytools/rnndb//adreno/ocmem.xml (   1773 
bytes, from 2016-10-24 21:12:27)
+
+Copyright (C) 2013-2017 by the following authors:
+- Rob Clark  (robclark)
+- Ilia Mirkin  (imirkin)
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice (including the
+next paragraph) shall be included in all copies or substantial
+portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+*/
+
+
+enum a6xx_cp_perfcounter_select {
+   PERF_CP_ALWAYS_COUNT = 0,
+};
+
+enum a6xx_event_write {
+   PC_CCU_INVALIDATE_DEPTH = 24,
+   PC_CCU_INVALIDATE_COLOR = 25,
+};
+
+#define A6XX_RBBM_INT_0_MASK_RBBM_GPU_IDLE 0x0001
+#define A6XX_RBBM_INT_0_MASK_CP_AHB_ERROR  0x0002
+#define A6XX_RBBM_INT_0_MASK_RBBM_ATB_ASYNCFIFO_OVERFLOW   0x0040
+#define A6XX_RBBM_INT_0_MASK_RBBM_GPC_ERROR0x0080
+#define A6XX_RBBM_INT_0_MASK_CP_SW 0x0100
+#define A6XX_RBBM_INT_0_MASK_CP_HW_ERROR   0x0200
+#define A6XX_RBBM_INT_0_MASK_CP_CCU_FLUSH_DEPTH_TS 0x0400
+#define A6XX_RBBM_INT_0_MASK_CP_CCU_FLUSH_COLOR_TS 0x0800
+#define A6XX_RBBM_INT_0_MASK_CP_CCU_RESOLVE_TS 0x1000
+#define A6XX_RBBM_INT_0_MASK_CP_IB20x2000
+#define A6XX_RBBM_INT_0_MASK_CP_IB10x4000
+#define A6XX_RBBM_INT_0_MASK_CP_RB 0x8000
+#define A6XX_RBBM_INT_0_MASK_CP_RB_DONE_TS 0x0002
+#define A6XX_RBBM_INT_0_MASK_CP_WT_DONE_TS 0x0004
+#define A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS 0x0010
+#define A6XX_RBBM_INT_0_MASK_RBBM_ATB_BUS_OVERFLOW 0x0040
+#define A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT  0x0080
+#define 

Re: [Freedreno] [PATCH 2/2] drm/msm/adreno: Add A6XX device support

2018-02-14 Thread Jordan Crouse
On Thu, Feb 01, 2018 at 11:32:25AM +0100, Lucas Stach wrote:
> Hi Jordan,
> 
> just some drive-by comments:

Drive by comments are the best.

> Am Mittwoch, den 31.01.2018, 11:25 -0700 schrieb Jordan Crouse:
> > Add support for the A6XX family of Adreno GPUs. The biggest addition
> > is the GMU (Graphics Management Unit) which takes over most of the
> > power management of the GPU itself but in an ironic twist of fate
> > needs a goodly amount of management itself. Add support for the
> > A6XX core code, the GMU and the HFI (hardware firmware interface)
> > queue that the CPU uses to communicate with the GMU.
> > 
> > > Signed-off-by: Jordan Crouse 
> > ---
> [...]
> > +static int a6xx_hfi_queue_read(struct a6xx_hfi_queue *queue, u32 *data,
> > > + u32 dwords)
> > +{
> > > + struct a6xx_hfi_queue_header *header = queue->header;
> > > + u32 i, hdr, index = header->read_index;
> > +
> > > + if (header->read_index == header->write_index) {
> > > + header->rx_request = 1;
> > > + return 0;
> > > + }
> > +
> > > + hdr = queue->data[index];
> > +
> > > + /*
> > > +  * If we are to assume that the GMU firmware is in fact a rational actor
> > > +  * and is programmed to not send us a larger response than we expect
> > > +  * then we can also assume that if the header size is unexpectedly large
> > > +  * that it is due to memory corruption and/or hardware failure. In this
> > > +  * case the only reasonable course of action is to BUG() to help harden
> > > +  * the failure.
> > > +  */
> > +
> > +   BUG_ON(HFI_HEADER_SIZE(hdr) > dwords);
> 
> Don't ever BUG the kernel due to a peripheral acting up, until you are
> really certain that it has corrupted memory it doesn't own, which would
> be written back to persistent storage. That's the only valid
> justification for a BUG.

Acknowledged. In the final version we'll try to zap the hardware and recover
cleanly but I don't have that coded up yet.  Don't worry, I wouldn't let it
merge with a BUG().
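
(Roughly something like the below instead of the BUG_ON(), as a sketch
only - the real recovery path isn't written yet:)

	if (HFI_HEADER_SIZE(hdr) > dwords) {
		dev_err_ratelimited(gmu->dev,
			"HFI response header too large: %u > %u\n",
			HFI_HEADER_SIZE(hdr), dwords);
		/* caller would kick off GMU/GPU recovery here */
		return -EIO;
	}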

> > +
> > > + for (i = 0; i < HFI_HEADER_SIZE(hdr); i++) {
> > > + data[i] = queue->data[index];
> > > + index = (index + 1) % header->size;
> > > + }
> > +
> > > + header->read_index = index;
> > > + return HFI_HEADER_SIZE(hdr);
> > +}
> > +
> > +static int a6xx_hfi_queue_write(struct a6xx_gmu *gmu,
> > > + struct a6xx_hfi_queue *queue, u32 *data, u32 dwords)
> > +{
> > > + struct a6xx_hfi_queue_header *header = queue->header;
> > > + u32 i, space, index = header->write_index;
> > +
> > > + spin_lock(&queue->lock);
> > +
> > > + space = CIRC_SPACE(header->write_index, header->read_index,
> > > + header->size);
> > > + if (space < dwords) {
> > > + header->dropped++;
> > > + spin_unlock(&queue->lock);
> > > + return -ENOSPC;
> > > + }
> > +
> > > + for (i = 0; i < dwords; i++) {
> > > + queue->data[index] = data[i];
> > > + index = (index + 1) % header->size;
> > > + }
> > +
> > > + header->write_index = index;
> > > + spin_unlock(&queue->lock);
> > +
> > > + /* Make sure the command is written to the queue */
> > > + wmb();
> 
> The above memory barrier is unnecessary. The gmu_write below is a
> wrapper around writel, which already includes the write barrier.

Thanks.  I got this one and a few others I found too.

> > +   gmu_write(gmu, REG_A6XX_GMU_HOST2GMU_INTR_SET, 0x01);
> > > + return 0;
> > +}
> 
> [...]
> > +static int a6xx_hfi_send_msg(struct a6xx_gmu *gmu, int id,
> > > + void *data, u32 size, u32 *payload, u32 payload_size)
> > +{
> > > + struct a6xx_hfi_queue *queue = &gmu->queues[HFI_COMMAND_QUEUE];
> > > + struct a6xx_hfi_response resp = { 0 };
> > > + int ret, dwords = size >> 2;
> > > + u32 seqnum;
> > +
> > > + seqnum = atomic_inc_return(&gmu->seqnum) % 0xfff;
> > +
> > > + /* First dword of the message is the message header - fill it in */
> > > + *((u32 *) data) = (seqnum << 20) | (HFI_MSG_CMD << 16) |
> > > + (dwords << 8) | id;
> > +
> > > + init_completion();
> > > + resp.id = id;
> > > + resp.seqnum = seqnum;
> > +
> > > + spin_lock_bh(_ack_lock);
> > > + list_add_tail(, _ack_list);
> > > + spin_unlock_bh(_ack_lock);
> 
> What are you protecting against here by using the _bh spinlock
> variants?

We use a tasklet from the interrupt to process responses. The
commands are entirely serialized right now so this is mostly wasted
cycles but we are planning for a brave future where the commands
may be asynchronous.
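
For reference, the pattern in question looks roughly like this (sketch
with assumed names, not the actual HFI code):

struct hfi_pending_ack {
	struct list_head node;
	u32 id, seqnum;
	struct completion done;
};

static LIST_HEAD(ack_list);
static DEFINE_SPINLOCK(ack_lock);

/* Process context: register an expected ack, then wait for it. The _bh
 * lock variant is needed because the tasklet below takes the same lock.
 */
static int wait_for_ack(struct hfi_pending_ack *ack)
{
	init_completion(&ack->done);

	spin_lock_bh(&ack_lock);
	list_add_tail(&ack->node, &ack_list);
	spin_unlock_bh(&ack_lock);

	return wait_for_completion_timeout(&ack->done,
		msecs_to_jiffies(100)) ? 0 : -ETIMEDOUT;
}

/* Tasklet (softirq) context: match the incoming ack and complete it.
 * Plain spin_lock() is enough here since we are already in BH context.
 */
static void process_ack(u32 id, u32 seqnum)
{
	struct hfi_pending_ack *ack;

	spin_lock(&ack_lock);
	list_for_each_entry(ack, &ack_list, node) {
		if (ack->id == id && ack->seqnum == seqnum) {
			list_del(&ack->node);
			complete(&ack->done);
			break;
		}
	}
	spin_unlock(&ack_lock);
}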

> Regards,
> Lucas

Thanks,
Jordan
-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [Freedreno] [RFT PATCH] drm/msm: Trigger fence completion from GPU

2018-02-14 Thread Rob Clark
On Wed, Feb 14, 2018 at 1:46 AM, Bjorn Andersson
 wrote:
> Interrupt commands cause the CP to trigger an interrupt as the command
> is processed, regardless of the GPU being done processing previous
> commands. This is seen by the interrupt being delivered before the
> fence is written on 8974 and is likely the cause of the additional
> CP_WAIT_FOR_IDLE workaround found for a306, which would cause the CP to
> wait for the GPU to go idle before triggering the interrupt.
>
> Instead we can set the (undocumented) BIT(31) of the CACHE_FLUSH_TS
> which will cause a special CACHE_FLUSH_TS interrupt to be triggered from
> the GPU as the write event is processed.
>
> Add CACHE_FLUSH_TS to the IRQ masks of A3xx and A4xx and remove the
> workaround for A306.
>
> Suggested-by: Jordan Crouse 
> Signed-off-by: Bjorn Andersson 
> ---
>
> This is only tested on 8974.

Tested on db410c (8016).. thanks

BR,
-R

>
>  drivers/gpu/drm/msm/adreno/a3xx_gpu.c   |  1 +
>  drivers/gpu/drm/msm/adreno/a4xx_gpu.c   |  1 +
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 18 ++
>  3 files changed, 4 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
> index 4baef2738178..a3a43be920d0 100644
> --- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
> @@ -35,6 +35,7 @@
>  A3XX_INT0_CP_RB_INT | \
>  A3XX_INT0_CP_REG_PROTECT_FAULT |  \
>  A3XX_INT0_CP_AHB_ERROR_HALT | \
> +A3XX_INT0_CACHE_FLUSH_TS |\
>  A3XX_INT0_UCHE_OOB_ACCESS)
>
>  extern bool hang_debug;
> diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> index 8199a4b9f2fa..b44cd0d90621 100644
> --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> @@ -27,6 +27,7 @@
>  A4XX_INT0_CP_RB_INT | \
>  A4XX_INT0_CP_REG_PROTECT_FAULT |  \
>  A4XX_INT0_CP_AHB_ERROR_HALT | \
> +A4XX_INT0_CACHE_FLUSH_TS |\
>  A4XX_INT0_UCHE_OOB_ACCESS)
>
>  extern bool hang_debug;
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
> b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> index de63ff26a062..5806f9942514 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
> @@ -293,26 +293,12 @@ void adreno_submit(struct msm_gpu *gpu, struct 
> msm_gem_submit *submit,
> OUT_RING(ring, 0x);
> }
>
> +   /* BIT(31) of CACHE_FLUSH_TS triggers CACHE_FLUSH_TS IRQ from GPU */
> OUT_PKT3(ring, CP_EVENT_WRITE, 3);
> -   OUT_RING(ring, CACHE_FLUSH_TS);
> +   OUT_RING(ring, CACHE_FLUSH_TS | BIT(31));
> OUT_RING(ring, rbmemptr(ring, fence));
> OUT_RING(ring, submit->seqno);
>
> -   /* we could maybe be clever and only CP_COND_EXEC the interrupt: */
> -   OUT_PKT3(ring, CP_INTERRUPT, 1);
> -   OUT_RING(ring, 0x8000);
> -
> -   /* Workaround for missing irq issue on 8x16/a306.  Unsure if the
> -* root cause is a platform issue or some a306 quirk, but this
> -* keeps things humming along:
> -*/
> -   if (adreno_is_a306(adreno_gpu)) {
> -   OUT_PKT3(ring, CP_WAIT_FOR_IDLE, 1);
> -   OUT_RING(ring, 0x);
> -   OUT_PKT3(ring, CP_INTERRUPT, 1);
> -   OUT_RING(ring, 0x8000);
> -   }
> -
>  #if 0
> if (adreno_is_a3xx(adreno_gpu)) {
> /* Dummy set-constant to trigger context rollover */
> --
> 2.15.0
>


Re: [Freedreno] [RFC PULL] Add Display Support for Qualcomm SDM845

2018-02-14 Thread Rob Clark
On Tue, Feb 13, 2018 at 3:00 PM, Rob Clark  wrote:
> On Tue, Feb 13, 2018 at 2:18 PM, Sean Paul  wrote:
>> Hi dri-devel,
>> Qualcomm has been working for the past few weeks on forward porting their
>> downstream drm driver from 4.14 to mainline. Please consider this PR as a
>> request for review, rather than an attempt at mainlining the code as it
>> currently stands. The goal is to get this driver in shape over the next coming
>> months.
>>
>> In the meantime, I'll be hosting a tree here [1] to stage the fixes. Patches
>> will be posted and reviewed on linux-arm-...@vger.kernel.org. Once things 
>> look
>> good, I'll send another pull 4realz.
>>
>> To get the ball rolling, I've done some review on the new connector code, my
>> comments are below.
>>
>> Thanks in advance for your constructive feedback :)
>>
>> Sean
>>
>> [1]- git://people.freedesktop.org/~seanpaul/dpu-staging
>>
>> Review feedback:
>> 
>
> just so others aren't confused, s/sde/dpu/g in the list below (we did
> our DAL->DC rename before sending first RFC :-P)
>
>> - Solve the splash screen handling (or remove it)
>> - Simplify devicetree binding (remove register offsets)
>> feedback from reviewing sde_connector.c:
>> - Rationalize backlight implementation in sde_connector (display_count 
>> static)
>> - Sort out the dsi event passing between dsi/encoder/connector (move to 
>> encoder)
>> - include/uapi/drm/msm_drm_pp.h needs opensource userspace (or removal)
>> - connector->state access violations reading/writing mode_info
>> - s/sde_rect/drm_rect/
>> - sde_kms_info usage needs to be replaced with formal data structures (not
>>   stringified keypairs)
>> - sde_connector_ops needs to be trimmed, duplicates connector helpers, info
>>   hooks circumvent state, and other hooks should be stored in state or
>>   prepopulated (get_dst_format)
>> - sde_connector_get_dpms unused
>> - esd status check should migrate to encoder from connector
>> - backlight should be handled in panel drivers, not in the generic 
>> connector/dsi
>>   encoder
>> - sde_connector_helper_bridge_disable is called from encoder and calls back 
>> into
>>   set_power encoder function. if backlight, and esd status are removed,
>>   pre_kickoff can probably go away
>> - sde_connector_clk_ctrl is another example of encoder->connector->encoder 
>> call
>> - RETIRE_FENCE connector property should be removed, opting for the native
>>   atomic fences
>> - ROI (regions of interest) should be expressed per-plane instead of 
>> connector.
>>   there is work ongoing to support dirty_rects per-plane by Deepak Singh 
>> Rawat
>>    and Lukasz Spintzyk 
>> - Uma Shankar  has proposed HDR source metadata
>>   properties on the list, we should pivot to those instead of hand-rolling 
>> them
>>   in the sde driver
>> - Convert HDCP implementation to upstream Content Protection property
>> - Merge dsi and dsi_staging into one driver
>> - Writeback connector has been proposed by ARM (Liviu Dudau and Brian 
>> Starkey),
>>   we should work with their proposal instead of rolling OUT_FB ourselves
>> - sde_connector_set_property should be replaced with atomic helper
>> - dsi hotplug can probably be punted to the panel driver
>> - dpms should switch to enable/disable (or at least use the atomic helpers)
>> - dsi mode handling should also defer to the panel driver
>> - SDE_WB_CONFIG ioctl should be removed in favor of the existing ioctl to add
>>   user-defined modes
>> - dp implementation should use the existing dp helpers wherever possible
>> - lots of duplicated structures in dsi_defs.h that can be replaced with 
>> existing
>>   drm structs
>> - mode_valid should be split up and implemented directly in 
>> connector/encoder as
>>   appropriate
>> - sde_connector->aspace seems like it's unused?
>>
>
> To add to that, a few things I've noticed (but I've mostly only gotten
> as far as the front-end of the pipeline, ie. dpu_plane):
>
>  - It looks like, as was the case with mdp4/mdp5 (and really most other hw)
>there are still unequal capabilities for planes (ie. some can do YUV,
>scaling, etc).
>
>But drm-hwc (or weston) isn't going to know about that, so I think we'll
>need to add support for dynamically assigning dpu_plane::pipe, similar
>to what mdp5 does w/ plane<->hwpipe.  (Which I actually added specifically
>for drm-hwc ;-))
>
>  - I *think* we also need the same trick for combining mixers under one CRTC
>for 4k modes too?
>
>  - in dpu_plane, and perhaps elsewhere, userspace pointers for things like CSC
>and scaling coefficients need to be annotated w/ __user.  (Except the 
> should
>probably be blob properties instead.. and we probably need to strip out the
>custom properties and re-introduce them separately with some userspace +
>review.)
>
>  - dpu_formats.c looks mostly like a superset of mdp_format.c, ie there is 

Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-14 Thread Robin Murphy

On 14/02/18 10:33, Vivek Gautam wrote:

On Wed, Feb 14, 2018 at 2:46 PM, Tomasz Figa  wrote:

Adding Jordan to this thread as well.


On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam
 wrote:

Hi Tomasz,

On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa  wrote:

On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam
 wrote:

Hi Tomasz,

On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa  wrote:

On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark  wrote:

On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa  wrote:

On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark  wrote:

On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa  wrote:

Hi Vivek,

Thanks for the patch. Please see my comments inline.

On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
 wrote:

While handling the concerned iommu, there should not be a
need to power control the drm devices from iommu interface.
If these drm devices need to be powered around this time,
the respective drivers should take care of this.

Replace the pm_runtime_get/put_sync() with
pm_runtime_get/put_suppliers() calls, to power-up
the connected iommu through the device link interface.
In case the device link is not setup these get/put_suppliers()
calls will be a no-op, and the iommu driver should take care of
powering on its devices accordingly.

Signed-off-by: Vivek Gautam 
---
  drivers/gpu/drm/msm/msm_iommu.c | 16 
  1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index b23d33622f37..1ab629bbee69 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * 
const *names,
 struct msm_iommu *iommu = to_msm_iommu(mmu);
 int ret;

-   pm_runtime_get_sync(mmu->dev);
+   pm_runtime_get_suppliers(mmu->dev);
 ret = iommu_attach_device(iommu->domain, mmu->dev);
-   pm_runtime_put_sync(mmu->dev);
+   pm_runtime_put_suppliers(mmu->dev);


For me, it looks like a wrong place to handle runtime PM of IOMMU
here. iommu_attach_device() calls into IOMMU driver's attach_device()
callback and that's where necessary runtime PM gets should happen, if
any. In other words, driver A (MSM DRM driver) shouldn't be dealing
with power state of device controlled by driver B (ARM SMMU).


Note that we end up having to do the same, because of iommu_unmap()
while DRM driver is powered off..  it might be cleaner if it was all
self contained in the iommu driver, but that would make it so other
drivers couldn't call iommu_unmap() from an irq handler, which is
apparently something that some of them want to do..


I'd assume that runtime PM status is already guaranteed to be active
when the IRQ handler is running, by some other means (e.g.
pm_runtime_get_sync() called earlier, when queuing some work to the
hardware). Otherwise, I'm not sure how a powered down device could
trigger an IRQ.

So, if the master device power is already on, suppliers should be
powered on as well, thanks to device links.



umm, that is kindof the inverse of the problem..  the problem is
things like gpu driver (and v4l2 drivers that import dma-buf's,
afaict).. they will potentially call iommu->unmap() when device is not
active (due to userspace or things beyond the control of the driver)..
so *they* would want iommu to do pm get/put calls.


Which is fine and which is actually already done by one of the patches
in this series, not for map/unmap, but probe, add_device,
remove_device. Having parts of the API doing it inside the callback
and other parts outside sounds at least inconsistent.


But other drivers
trying to unmap from irq ctx would not.  Which is the contradictory
requirement that lead to the idea of iommu user powering up iommu for
unmap.


Sorry, maybe I wasn't clear. My last message was supposed to show that
it's not contradictory at all, because "other drivers trying to unmap
from irq ctx" would already have called pm_runtime_get_*() earlier
from a non-irq ctx, which would have also done the same on all the
linked suppliers, including the IOMMU. The ultimate result would be
that the map/unmap() of the IOMMU driver calling pm_runtime_get_sync()
would do nothing besides incrementing the reference count.
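
In other words, the master side of what I'm describing would look roughly
like this (sketch only, assumed names):

/* The runtime PM reference is taken in process context when work is
 * queued; device links keep the IOMMU (supplier) powered for as long as
 * that reference is held, so an iommu_unmap() from the irq handler finds
 * the IOMMU already active.
 */
static int my_master_queue_job(struct my_master *m, struct my_job *job)
{
	pm_runtime_get_sync(m->dev);	/* also resumes linked suppliers */
	my_master_submit_to_hw(m, job);	/* completion irq fires later */
	return 0;
}

static irqreturn_t my_master_irq(int irq, void *data)
{
	struct my_master *m = data;
	struct my_job *job = my_master_complete_job(m);

	iommu_unmap(m->domain, job->iova, job->size);
	pm_runtime_put(m->dev);		/* async, safe in irq context */
	return IRQ_HANDLED;
}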


The entire point was to avoid the slowpath that pm_runtime_get/put_sync()
would add in map/unmap. It would not be correct to add a slowpath in irq_ctx
for taking care of non-irq_ctx and for the situations where master is already
powered-off.


Correct me if I'm wrong, but I believe that with what I'm proposing
there wouldn't be any slow path.


Yea, but only when the power domain is irq-safe? And not all platforms
enable irq-safe power domains. For instance, msm doesn't enable its

Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-14 Thread Jordan Crouse
On Wed, Feb 14, 2018 at 12:31:29PM +0900, Tomasz Figa wrote:
> Hi Jordan,
> 
> On Wed, Feb 14, 2018 at 1:42 AM, Jordan Crouse  wrote:
> > On Tue, Feb 13, 2018 at 06:10:38PM +0900, Tomasz Figa wrote:
> >> Hi Vivek,
> >>
> >> Thanks for the patch. Please see my comments inline.
> >>
> >> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
> >>  wrote:
> >> > While handling the concerned iommu, there should not be a
> >> > need to power control the drm devices from iommu interface.
> >> > If these drm devices need to be powered around this time,
> >> > the respective drivers should take care of this.
> >> >
> >> > Replace the pm_runtime_get/put_sync() with
> >> > pm_runtime_get/put_suppliers() calls, to power-up
> >> > the connected iommu through the device link interface.
> >> > In case the device link is not setup these get/put_suppliers()
> >> > calls will be a no-op, and the iommu driver should take care of
> >> > powering on its devices accordingly.
> >> >
> >> > Signed-off-by: Vivek Gautam 
> >> > ---
> >> >  drivers/gpu/drm/msm/msm_iommu.c | 16 
> >> >  1 file changed, 8 insertions(+), 8 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/msm/msm_iommu.c 
> >> > b/drivers/gpu/drm/msm/msm_iommu.c
> >> > index b23d33622f37..1ab629bbee69 100644
> >> > --- a/drivers/gpu/drm/msm/msm_iommu.c
> >> > +++ b/drivers/gpu/drm/msm/msm_iommu.c
> >> > @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const 
> >> > char * const *names,
> >> > struct msm_iommu *iommu = to_msm_iommu(mmu);
> >> > int ret;
> >> >
> >> > -   pm_runtime_get_sync(mmu->dev);
> >> > +   pm_runtime_get_suppliers(mmu->dev);
> >> > ret = iommu_attach_device(iommu->domain, mmu->dev);
> >> > -   pm_runtime_put_sync(mmu->dev);
> >> > +   pm_runtime_put_suppliers(mmu->dev);
> >>
> >> For me, it looks like a wrong place to handle runtime PM of IOMMU
> >> here. iommu_attach_device() calls into IOMMU driver's attach_device()
> >> callback and that's where necessary runtime PM gets should happen, if
> >> any. In other words, driver A (MSM DRM driver) shouldn't be dealing
> >> with power state of device controlled by driver B (ARM SMMU).
> >
> > This whole thing is confused by the fact that on MSM the GPU and the GPU 
> > IOMMU
> > share some of the same clocks and power rail so turning on the GPU also
> > turned on the IOMMU register banks by extension.
> 
> This is surprisingly not a very surprising case. Exactly the same can
> be seen on Rockchip SoCs and we're solving the problem using the
> solution I suggested. In fact, my suggestions to this thread are based
> on the design we chose for Rockchip, due to the high level of
> similarity (+/- the GPU directly programming IOMMU registers, which is
> not present there, but AFAICT it doesn't pose a problem here).
> 
> >
> > But if we put that aside the question is who should be responsible for
> > controlling the power in this relationship and there are several good 
> > reasons to
> > leave it up to the client device. The most important reason is when we move 
> > to
> > the per-instance model where the GPU self-programmings the SMMU registers. 
> > In
> > that case, the driver will need to make sure that the SMMU is powered up 
> > before
> > submitting the command and then removing the power vote when the commands
> > are done to save energy.
> 
> I might need more insight on what's going on in your hardware, but
> with my current understanding I'd argue that that is not right,
> because:
> 
> - When submitting commands to the GPU, the GPU driver will
> pm_runtime_get_sync() on the GPU device, which will automatically do
> the same on all the linked suppliers, which would also include the
> SMMU itself. The role of device links here is exactly that the GPU
> driver doesn't have to care which other devices need to be brought up.

This is true.  Assuming that the device link works correctly we would not need
to explicitly power the SMMU which makes my point entirely moot.
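
For reference, that relationship is what a DL_FLAG_PM_RUNTIME device link
sets up (sketch; the variable names are illustrative):

	/* GPU is the consumer, SMMU the supplier: pm_runtime_get_sync() on
	 * the GPU device now also resumes the SMMU before the GPU resumes.
	 */
	struct device_link *link;

	link = device_link_add(gpu_dev, smmu_dev, DL_FLAG_PM_RUNTIME);
	if (!link)
		dev_err(smmu_dev, "unable to link to consumer %s\n",
			dev_name(gpu_dev));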

> - When the GPU is operating, the SMMU power must be supplied anyway,
> because it needs to be doing the translations, right? Note that by
> "power" I really mean the physical power supply in the SoC, e.g. as
> for a power domain. The runtime PM API in its current form (e.g.
> binary off or on operation) is unsuitable for managing other things,
> such as clocks (and there is ongoing work on improving it, e.g. by
> adding support for multiple power states).

As others have pointed out, the register banks and the translation unit are
powered separately (or at least, clocked separately).

>^^ The above would be actually guaranteed by your hardware design,
> where SMMU and GPU share the power domain and clocks. (We used to rely
> on this in old downstream implementation of Rockchip IOMMU and master
> drivers in Chromium OS kernel, before we moved to handling the clocks
> explicitly in 

Re: [Freedreno] [RFC PULL] Add Display Support for Qualcomm SDM845

2018-02-14 Thread Daniel Stone
Hi,

On 14 February 2018 at 13:50, Sean Paul  wrote:
> On Wed, Feb 14, 2018 at 07:22:53AM -0500, Rob Clark wrote:
>> On Wed, Feb 14, 2018 at 3:30 AM, Daniel Stone  wrote:
>> > This one I guess you can't get around though. Can they still run from
>> > the one plane, or do you secretly consume two planes?
>>
>> I think it is still the case, like mdp5, that you need two hw pipes..
>> but actually it gets more crazy, since there are some cases where two
>> planes could share a hw pipe.
>
> Right. Large fbs might require 2 pipes, and multiple overlapping or adjacent 
> small fbs
> can be serviced with 1 pipe.

I didn't know about using a single pipe for multiple framebuffers.
That's quite special indeed.

I think virtualising probably makes sense in that case: rather than
just getting around limitations, it's seeming like more of a VC4
situation where they just really aren't actually planes in the first
place.

Cheers,
Daniel


Re: [Freedreno] [RFC PULL] Add Display Support for Qualcomm SDM845

2018-02-14 Thread Sean Paul
On Wed, Feb 14, 2018 at 07:22:53AM -0500, Rob Clark wrote:
> On Wed, Feb 14, 2018 at 3:30 AM, Daniel Stone  wrote:
> > Hi Rob,
> >
> > On 13 February 2018 at 20:00, Rob Clark  wrote:
> >> On Tue, Feb 13, 2018 at 2:18 PM, Sean Paul  wrote:
> >>> Qualcomm has been working for the past few weeks on forward porting their
> >>> downstream drm driver from 4.14 to mainline. Please consider this PR as a
> >>> request for review, rather than an attempt at mainlining the code as it
> >>> currently stands. The goal is to get this driver in shape over the next 
> >>> coming
> >>> months.
> >>
> >> just so others aren't confused, s/sde/dpu/g in the list below (we did
> >> our DAL->DC rename before sending first RFC :-P)
> >
> > Heh, and it is quite a bit of code to dig through. I haven't yet got a
> > chance to fully check it out.
> >
> >>> - include/uapi/drm/msm_drm_pp.h needs opensource userspace (or removal)
> >
> > We don't really have a good story for pixel-processing API anywhere tbh.
> >
> >> To add to that, a few things I've noticed (but I've mostly only gotten
> >> as far as the front-end of the pipeline, ie. dpu_plane):
> >>
> >>  - It looks like, as was the case with mdp4/mdp5 (and really most other hw)
> >>there are still unequal capabilities for planes (ie. some can do YUV,
> >>scaling, etc).
> >>
> >>But drm-hwc (or weston) isn't going to know about that, so I think we'll
> >>need to add support for dynamically assigning dpu_plane::pipe, similar
> >>to what mdp5 does w/ plane<->hwpipe.  (Which I actually added 
> >> specifically
> >>for drm-hwc ;-))
> >
> > Formats/modifiers we do at least get per-plane. We won't know about
> > scaling, but at least Weston will go through trying each plane in
> > sequence until it finds one which accepts the configuration. Could HWC
> > do something like that with atomic, or does it rely on coming up with
> > a plan first rather than the brute-force testing approach? I haven't
> > seen its planner at all recently I'm afraid.

It could definitely build up a plan a la weston, it just doesn't atm.


> >
> >>  - I *think* we also need the same trick for combining mixers under one 
> >> CRTC
> >>for 4k modes too?
> >
> > This one I guess you can't get around though. Can they still run from
> > the one plane, or do you secretly consume two planes?
> >
> 
> I think it is still the case, like mdp5, that you need two hw pipes..
> but actually it gets more crazy, since there are some cases where two
> planes could share a hw pipe.  

Right. Large fbs might require 2 pipes, and multiple overlapping or adjacent 
small fbs
can be serviced with 1 pipe.

> Not sure that weston or drm-hwc is
> going to have much hope in being able to deal with that, so I think
> virtualizing the planes and crtcs is the better route.

Agreed. I think the kernel is in the best place to make these decisions.

Sean

> 
> BR,
> -R
> 
> >>  - in dpu_plane, and perhaps elsewhere, userspace pointers for things like 
> >> CSC
> >>and scaling coefficients need to be annotated w/ __user.  (Except the 
> >> should
> >>probably be blob properties instead.. and we probably need to strip out 
> >> the
> >>custom properties and re-introduce them separately with some userspace +
> >>review.)
> >
> > It'd be good to know if the CSC could use the CRTC GAMMA_LUT -> CTM ->
> > DEGAMMA_LUT properties, if they were moved over to planes. (And if
> > not, why not; if there's any functionality missing from those, what it
> > is, etc.)
> >
> > Cheers,
> > Daniel

-- 
Sean Paul, Software Engineer, Google / Chromium OS


Re: [Freedreno] [RFC PULL] Add Display Support for Qualcomm SDM845

2018-02-14 Thread Rob Clark
On Wed, Feb 14, 2018 at 3:30 AM, Daniel Stone  wrote:
> Hi Rob,
>
> On 13 February 2018 at 20:00, Rob Clark  wrote:
>> On Tue, Feb 13, 2018 at 2:18 PM, Sean Paul  wrote:
>>> Qualcomm has been working for the past few weeks on forward porting their
>>> downstream drm driver from 4.14 to mainline. Please consider this PR as a
>>> request for review, rather than an attempt at mainlining the code as it
>>> currently stands. The goal is to get this driver in shape over the next coming
>>> months.
>>
>> just so others aren't confused, s/sde/dpu/g in the list below (we did
>> our DAL->DC rename before sending first RFC :-P)
>
> Heh, and it is quite a bit of code to dig through. I haven't yet got a
> chance to fully check it out.
>
>>> - include/uapi/drm/msm_drm_pp.h needs opensource userspace (or removal)
>
> We don't really have a good story for pixel-processing API anywhere tbh.
>
>> To add to that, a few things I've noticed (but I've mostly only gotten
>> as far as the front-end of the pipeline, ie. dpu_plane):
>>
>>  - It looks like, as was the case with mdp4/mdp5 (and really most other hw)
>>there are still unequal capabilities for planes (ie. some can do YUV,
>>scaling, etc).
>>
>>But drm-hwc (or weston) isn't going to know about that, so I think we'll
>>need to add support for dynamically assigning dpu_plane::pipe, similar
>>to what mdp5 does w/ plane<->hwpipe.  (Which I actually added specifically
>>for drm-hwc ;-))
>
> Formats/modifiers we do at least get per-plane. We won't know about
> scaling, but at least Weston will go through trying each plane in
> sequence until it finds one which accepts the configuration. Could HWC
> do something like that with atomic, or does it rely on coming up with
> a plan first rather than the brute-force testing approach? I haven't
> seen its planner at all recently I'm afraid.
>
>>  - I *think* we also need the same trick for combining mixers under one CRTC
>>for 4k modes too?
>
> This one I guess you can't get around though. Can they still run from
> the one plane, or do you secretly consume two planes?
>

I think it is still the case, like mdp5, that you need two hw pipes..
but actually it gets more crazy, since there are some cases where two
planes could share a hw pipe.  Not sure that weston or drm-hwc is
going to have much hope in being able to deal with that, so I think
virtualizing the planes and crtcs is the better route.
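As a rough illustration of that kind of virtualization (none of the names
below come from the real mdp5/dpu code), the DRM plane stays fixed while a
suitable hardware pipe is picked per commit, e.g. from atomic_check:

#include <linux/kernel.h>
#include <linux/types.h>

/* Purely illustrative pipe description. */
struct example_hw_pipe {
        bool in_use;
        bool supports_yuv;
        bool supports_scaling;
};

static struct example_hw_pipe example_pipes[4];

/*
 * Pick any free pipe whose capabilities match what the plane state needs.
 * A real implementation would track assignments in the atomic state (as
 * mdp5 does with its hwpipe objects) instead of a static array, so that a
 * failed or TEST_ONLY commit leaves nothing behind.
 */
static struct example_hw_pipe *example_pipe_assign(bool need_yuv,
                                                   bool need_scaling)
{
        int i;

        for (i = 0; i < ARRAY_SIZE(example_pipes); i++) {
                struct example_hw_pipe *p = &example_pipes[i];

                if (p->in_use)
                        continue;
                if (need_yuv && !p->supports_yuv)
                        continue;
                if (need_scaling && !p->supports_scaling)
                        continue;

                p->in_use = true;
                return p;
        }

        return NULL; /* no suitable pipe; reject the commit with -EINVAL */
}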

BR,
-R

>>  - in dpu_plane, and perhaps elsewhere, userspace pointers for things like CSC
>>and scaling coefficients need to be annotated w/ __user.  (Except they should
>>probably be blob properties instead.. and we probably need to strip out the
>>custom properties and re-introduce them separately with some userspace +
>>review.)
>
> It'd be good to know if the CSC could use the CRTC GAMMA_LUT -> CTM ->
> DEGAMMA_LUT properties, if they were moved over to planes. (And if
> not, why not; if there's any functionality missing from those, what it
> is, etc.)
>
> Cheers,
> Daniel


Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-14 Thread Vivek Gautam
On Wed, Feb 14, 2018 at 2:46 PM, Tomasz Figa  wrote:

Adding Jordan to this thread as well.

> On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam
>  wrote:
>> Hi Tomasz,
>>
>> On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa  wrote:
>>> On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam
>>>  wrote:
 Hi Tomasz,

 On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa  wrote:
> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark  wrote:
>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa  wrote:
>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark  wrote:
 On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa  
 wrote:
> Hi Vivek,
>
> Thanks for the patch. Please see my comments inline.
>
> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
>  wrote:
>> While handling the concerned iommu, there should not be a
>> need to power control the drm devices from iommu interface.
>> If these drm devices need to be powered around this time,
>> the respective drivers should take care of this.
>>
>> Replace the pm_runtime_get/put_sync() with
>> pm_runtime_get/put_suppliers() calls, to power-up
>> the connected iommu through the device link interface.
>> In case the device link is not setup these get/put_suppliers()
>> calls will be a no-op, and the iommu driver should take care of
>> powering on its devices accordingly.
>>
>> Signed-off-by: Vivek Gautam 
>> ---
>>  drivers/gpu/drm/msm/msm_iommu.c | 16 
>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c 
>> b/drivers/gpu/drm/msm/msm_iommu.c
>> index b23d33622f37..1ab629bbee69 100644
>> --- a/drivers/gpu/drm/msm/msm_iommu.c
>> +++ b/drivers/gpu/drm/msm/msm_iommu.c
>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, 
>> const char * const *names,
>> struct msm_iommu *iommu = to_msm_iommu(mmu);
>> int ret;
>>
>> -   pm_runtime_get_sync(mmu->dev);
>> +   pm_runtime_get_suppliers(mmu->dev);
>> ret = iommu_attach_device(iommu->domain, mmu->dev);
>> -   pm_runtime_put_sync(mmu->dev);
>> +   pm_runtime_put_suppliers(mmu->dev);
>
> For me, it looks like a wrong place to handle runtime PM of IOMMU
> here. iommu_attach_device() calls into IOMMU driver's attach_device()
> callback and that's where necessary runtime PM gets should happen, if
> any. In other words, driver A (MSM DRM driver) shouldn't be dealing
> with power state of device controlled by driver B (ARM SMMU).

 Note that we end up having to do the same, because of iommu_unmap()
 while DRM driver is powered off..  it might be cleaner if it was all
 self contained in the iommu driver, but that would make it so other
 drivers couldn't call iommu_unmap() from an irq handler, which is
 apparently something that some of them want to do..
>>>
>>> I'd assume that runtime PM status is already guaranteed to be active
>>> when the IRQ handler is running, by some other means (e.g.
>>> pm_runtime_get_sync() called earlier, when queuing some work to the
>>> hardware). Otherwise, I'm not sure how a powered down device could
>>> trigger an IRQ.
>>>
>>> So, if the master device power is already on, suppliers should be
>>> powered on as well, thanks to device links.
>>>
>>
>> umm, that is kindof the inverse of the problem..  the problem is
>> things like gpu driver (and v4l2 drivers that import dma-buf's,
>> afaict).. they will potentially call iommu->unmap() when device is not
>> active (due to userspace or things beyond the control of the driver)..
>> so *they* would want iommu to do pm get/put calls.
>
> Which is fine and which is actually already done by one of the patches
> in this series, not for map/unmap, but probe, add_device,
> remove_device. Having parts of the API doing it inside the callback
> and other parts outside sounds at least inconsistent.
>
>> But other drivers
>> trying to unmap from irq ctx would not.  Which is the contradictory
>> requirement that lead to the idea of iommu user powering up iommu for
>> unmap.
>
> Sorry, maybe I wasn't clear. My last message was supposed to show that
> it's not contradictory at all, because "other drivers trying to unmap
> from irq ctx" would already have called pm_runtime_get_*() earlier
> from a non-irq 
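For reference, a minimal sketch of the device-link mechanism this sub-thread
relies on, assuming the consumer (e.g. the GPU) and supplier (the SMMU)
struct device pointers are at hand; the function name is illustrative:

#include <linux/device.h>
#include <linux/pm_runtime.h>

/*
 * With a PM runtime link in place, runtime-resuming the consumer also
 * runtime-resumes the supplier, and pm_runtime_get/put_suppliers() on the
 * consumer operate only on its linked suppliers.
 */
static int example_link_gpu_to_smmu(struct device *gpu_dev,
                                    struct device *smmu_dev)
{
        struct device_link *link;

        link = device_link_add(gpu_dev, smmu_dev, DL_FLAG_PM_RUNTIME);
        if (!link)
                return -ENODEV;

        return 0;
}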

Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers

2018-02-14 Thread Tomasz Figa
On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam
 wrote:
> Hi Tomasz,
>
> On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa  wrote:
>> On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam
>>  wrote:
>>> Hi Tomasz,
>>>
>>> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa  wrote:
 On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark  wrote:
> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa  wrote:
>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark  wrote:
>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa  wrote:
 Hi Vivek,

 Thanks for the patch. Please see my comments inline.

 On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
  wrote:
> While handling the concerned iommu, there should not be a
> need to power control the drm devices from iommu interface.
> If these drm devices need to be powered around this time,
> the respective drivers should take care of this.
>
> Replace the pm_runtime_get/put_sync() with
> pm_runtime_get/put_suppliers() calls, to power-up
> the connected iommu through the device link interface.
> In case the device link is not setup these get/put_suppliers()
> calls will be a no-op, and the iommu driver should take care of
> powering on its devices accordingly.
>
> Signed-off-by: Vivek Gautam 
> ---
>  drivers/gpu/drm/msm/msm_iommu.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_iommu.c 
> b/drivers/gpu/drm/msm/msm_iommu.c
> index b23d33622f37..1ab629bbee69 100644
> --- a/drivers/gpu/drm/msm/msm_iommu.c
> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, 
> const char * const *names,
> struct msm_iommu *iommu = to_msm_iommu(mmu);
> int ret;
>
> -   pm_runtime_get_sync(mmu->dev);
> +   pm_runtime_get_suppliers(mmu->dev);
> ret = iommu_attach_device(iommu->domain, mmu->dev);
> -   pm_runtime_put_sync(mmu->dev);
> +   pm_runtime_put_suppliers(mmu->dev);

 For me, it looks like a wrong place to handle runtime PM of IOMMU
 here. iommu_attach_device() calls into IOMMU driver's attach_device()
 callback and that's where necessary runtime PM gets should happen, if
 any. In other words, driver A (MSM DRM driver) shouldn't be dealing
 with power state of device controlled by driver B (ARM SMMU).
>>>
>>> Note that we end up having to do the same, because of iommu_unmap()
>>> while DRM driver is powered off..  it might be cleaner if it was all
>>> self contained in the iommu driver, but that would make it so other
>>> drivers couldn't call iommu_unmap() from an irq handler, which is
>>> apparently something that some of them want to do..
>>
>> I'd assume that runtime PM status is already guaranteed to be active
>> when the IRQ handler is running, by some other means (e.g.
>> pm_runtime_get_sync() called earlier, when queuing some work to the
>> hardware). Otherwise, I'm not sure how a powered down device could
>> trigger an IRQ.
>>
>> So, if the master device power is already on, suppliers should be
>> powered on as well, thanks to device links.
>>
>
> umm, that is kindof the inverse of the problem..  the problem is
> things like gpu driver (and v4l2 drivers that import dma-buf's,
> afaict).. they will potentially call iommu->unmap() when device is not
> active (due to userspace or things beyond the control of the driver)..
> so *they* would want iommu to do pm get/put calls.

 Which is fine and which is actually already done by one of the patches
 in this series, not for map/unmap, but probe, add_device,
 remove_device. Having parts of the API doing it inside the callback
 and other parts outside sounds at least inconsistent.

> But other drivers
> trying to unmap from irq ctx would not.  Which is the contradictory
> requirement that lead to the idea of iommu user powering up iommu for
> unmap.

 Sorry, maybe I wasn't clear. My last message was supposed to show that
 it's not contradictory at all, because "other drivers trying to unmap
 from irq ctx" would already have called pm_runtime_get_*() earlier
 from a non-irq ctx, which would have also done the same on all the
 linked suppliers, including the IOMMU. The ultimate result would be
 that the map/unmap() of the IOMMU driver calling pm_runtime_get_sync()
 would do 
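A sketch of the alternative being argued for here, with the IOMMU driver
taking its own runtime-PM reference around the page-table update. The
structures and names below are illustrative, not the actual arm-smmu code,
and this naturally only works when map/unmap is not called from atomic
context:

#include <linux/iommu.h>
#include <linux/kernel.h>
#include <linux/pm_runtime.h>

/* Illustrative domain wrapper. */
struct example_smmu_domain {
        struct iommu_domain domain;
        struct device *smmu_dev;
};

static int example_smmu_map(struct iommu_domain *domain, unsigned long iova,
                            phys_addr_t paddr, size_t size, int prot)
{
        struct example_smmu_domain *d =
                container_of(domain, struct example_smmu_domain, domain);
        int ret;

        /* Power the SMMU up around the page-table update. */
        ret = pm_runtime_get_sync(d->smmu_dev);
        if (ret < 0) {
                pm_runtime_put_noidle(d->smmu_dev);
                return ret;
        }

        /* ... actual page-table programming would go here ... */
        ret = 0;

        pm_runtime_put(d->smmu_dev);
        return ret;
}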

Re: [Freedreno] [RFC PULL] Add Display Support for Qualcomm SDM845

2018-02-14 Thread Daniel Stone
Hi Rob,

On 13 February 2018 at 20:00, Rob Clark  wrote:
> On Tue, Feb 13, 2018 at 2:18 PM, Sean Paul  wrote:
>> Qualcomm has been working for the past few weeks on forward porting their
>> downstream drm driver from 4.14 to mainline. Please consider this PR as a
>> request for review, rather than an attempt at mainlining the code as it
>> currently stands. The goal is get this driver in shape over the next coming
>> months.
>
> just so others aren't confused, s/sde/dpu/g in the list below (we did
> our DAL->DC rename before sending first RFC :-P)

Heh, and it is quite a bit of code to dig through. I haven't yet got a
chance to fully check it out.

>> - include/uapi/drm/msm_drm_pp.h needs opensource userspace (or removal)

We don't really have a good story for pixel-processing API anywhere tbh.

> To add to that, a few things I've noticed (but I've mostly only gotten
> as far as the front-end of the pipeline, ie. dpu_plane):
>
>  - It looks like, as was the case with mdp4/mdp5 (and really most other hw)
>there are still unequal capabilities for planes (ie. some can do YUV,
>scaling, etc).
>
>But drm-hwc (or weston) isn't going to know about that, so I think we'll
>need to add support for dynamically assigning dpu_plane::pipe, similar
>to what mdp5 does w/ plane<->hwpipe.  (Which I actually added specifically
>for drm-hwc ;-))

Formats/modifiers we do at least get per-plane. We won't know about
scaling, but at least Weston will go through trying each plane in
sequence until it finds one which accepts the configuration. Could HWC
do something like that with atomic, or does it rely on coming up with
a plan first rather than the brute-force testing approach? I haven't
seen its planner at all recently I'm afraid.

>  - I *think* we also need the same trick for combining mixers under one CRTC
>for 4k modes too?

This one I guess you can't get around though. Can they still run from
the one plane, or do you secretly consume two planes?

>  - in dpu_plane, and perhaps elsewhere, userspace pointers for things like CSC
>and scaling coefficients need to be annotated w/ __user.  (Except they should
>probably be blob properties instead.. and we probably need to strip out the
>custom properties and re-introduce them separately with some userspace +
>review.)
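To illustrate the __user point in the bullet above, a minimal sketch with
hypothetical struct and field names of how a userspace coefficient pointer
would be annotated and copied in:

#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/uaccess.h>

/* Hypothetical ioctl payload carrying a pointer to CSC coefficients. */
struct example_csc_cfg {
        __u64 coeff_ptr;        /* userspace address of the coefficient table */
        __u32 coeff_len;        /* length in bytes */
};

static int example_copy_csc(const struct example_csc_cfg *cfg, u32 *dst,
                            size_t dst_len)
{
        /* __user marks the pointer so sparse flags any direct dereference. */
        const u32 __user *src = u64_to_user_ptr(cfg->coeff_ptr);

        if (cfg->coeff_len > dst_len)
                return -EINVAL;

        if (copy_from_user(dst, src, cfg->coeff_len))
                return -EFAULT;

        return 0;
}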

It'd be good to know if the CSC could use the CRTC GAMMA_LUT -> CTM ->
DEGAMMA_LUT properties, if they were moved over to planes. (And if
not, why not; if there's any functionality missing from those, what it
is, etc.)
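For comparison, a minimal sketch of how the existing CRTC-level CTM is fed
from userspace today, assuming the "CTM" property ID has already been looked
up (error handling trimmed):

#include <stdint.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

/*
 * The CTM property takes a blob containing struct drm_color_ctm: a 3x3
 * matrix in S31.32 fixed point, row-major (the struct comes from the DRM
 * UAPI headers pulled in by xf86drmMode.h).
 */
static int set_crtc_ctm(int fd, uint32_t crtc_id, uint32_t ctm_prop_id,
                        const struct drm_color_ctm *ctm)
{
        uint32_t blob_id;
        int ret;

        ret = drmModeCreatePropertyBlob(fd, ctm, sizeof(*ctm), &blob_id);
        if (ret)
                return ret;

        return drmModeObjectSetProperty(fd, crtc_id, DRM_MODE_OBJECT_CRTC,
                                        ctm_prop_id, blob_id);
}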

Cheers,
Daniel


Re: [Freedreno] [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-02-14 Thread Vivek Gautam
Hi Tomasz,


On Tue, Feb 13, 2018 at 1:54 PM, Tomasz Figa  wrote:
> Hi Vivek,
>
> Thanks for the patch. Please see my comments inline.
>
> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
>  wrote:
>> From: Sricharan R 
>>
>> The smmu device probe/remove and add/remove master device callbacks
>> gets called when the smmu is not linked to its master, that is without
>> the context of the master device. So calling runtime apis in those places
>> separately.
>>
>> Signed-off-by: Sricharan R 
>> [vivek: Cleanup pm runtime calls]
>> Signed-off-by: Vivek Gautam 
>> ---
>>  drivers/iommu/arm-smmu.c | 42 ++
>>  1 file changed, 38 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 9e2f917e16c2..c024f69c1682 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct 
>> iommu_domain *domain)
>> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>> struct arm_smmu_device *smmu = smmu_domain->smmu;
>> struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
>> -   int irq;
>> +   int ret, irq;
>>
>> if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
>> return;
>>
>> +   ret = pm_runtime_get_sync(smmu->dev);
>> +   if (ret)
>> +   return;
>
> pm_runtime_get_sync() will return 0 if the device was powered off, 1
> if it was already/still powered on or runtime PM is not compiled in,
> or a negative value on error, so shouldn't the test be (ret < 0)?

Yes, I too noticed it while I was testing on a different platform, and
was hitting a failure case. Will update at all places.
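For reference, a sketch of the corrected check along the lines suggested
above, relative to the hunk quoted earlier (the put_noidle on the error path
is the usual idiom, since get_sync bumps the usage count even on failure):

        ret = pm_runtime_get_sync(smmu->dev);
        if (ret < 0) {
                /*
                 * 0 means the device was resumed, 1 means it was already
                 * active (or runtime PM is disabled); only < 0 is an error.
                 */
                pm_runtime_put_noidle(smmu->dev);
                return;
        }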

>
> Moreover, I'm actually wondering if it makes any sense to power up the
> hardware just to program it and power it down again. In a system where
> the IOMMU is located within a power domain, it would cause the IOMMU
> block to lose its state anyway.
>
> Actually, reflecting back on "[PATCH v7 2/6] iommu/arm-smmu: Add
> pm_runtime/sleep ops", perhaps it would make more sense to just
> control the clocks independently of runtime PM? Then, runtime PM could
> be used for real power management, e.g. really powering the block up
> and down, for further power saving.
>
> +Generally similar comments for other places in this patch.
>
>> +
>> /*
>>  * Disable the context bank and free the page tables before freeing
>>  * it.
>> @@ -932,6 +936,8 @@ static void arm_smmu_destroy_domain_context(struct 
>> iommu_domain *domain)
>>
>> free_io_pgtable_ops(smmu_domain->pgtbl_ops);
>> __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
>> +
>> +   pm_runtime_put_sync(smmu->dev);
>
> Is there any point in the put being sync here?

No, I don't think. Can manage with just a 'put' here. Will modify.

best regards
Vivek

>
> [snip]
>
>> @@ -2131,6 +2152,14 @@ static int arm_smmu_device_probe(struct 
>> platform_device *pdev)
>> if (err)
>> return err;
>>
>> +   platform_set_drvdata(pdev, smmu);
>> +
>> +   pm_runtime_enable(dev);
>
> I suspect this may be a disaster for systems where IOMMUs are located
> inside power domains, because the driver doesn't take care of the
> IOMMU block losing its state on physical power down, as I mentioned in
> my comments above.
>
> Best regards,
> Tomasz



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


Re: [Freedreno] [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2018-02-14 Thread Vivek Gautam
On Tue, Feb 13, 2018 at 7:22 PM, Tomasz Figa  wrote:
> On Tue, Feb 13, 2018 at 9:57 PM, Robin Murphy  wrote:
>> On 13/02/18 08:24, Tomasz Figa wrote:
>>>
>>> Hi Vivek,
>>>
>>> Thanks for the patch. Please see my comments inline.
>>>
>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam
>>>  wrote:

 From: Sricharan R 

 The smmu device probe/remove and add/remove master device callbacks
 gets called when the smmu is not linked to its master, that is without
 the context of the master device. So calling runtime apis in those places
 separately.

 Signed-off-by: Sricharan R 
 [vivek: Cleanup pm runtime calls]
 Signed-off-by: Vivek Gautam 
 ---
   drivers/iommu/arm-smmu.c | 42
 ++
   1 file changed, 38 insertions(+), 4 deletions(-)

 diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
 index 9e2f917e16c2..c024f69c1682 100644
 --- a/drivers/iommu/arm-smmu.c
 +++ b/drivers/iommu/arm-smmu.c
 @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct
 iommu_domain *domain)
  struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
  struct arm_smmu_device *smmu = smmu_domain->smmu;
  struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
 -   int irq;
 +   int ret, irq;

  if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
  return;

 +   ret = pm_runtime_get_sync(smmu->dev);
 +   if (ret)
 +   return;
>>>
>>>
>>> pm_runtime_get_sync() will return 0 if the device was powered off, 1
>>> if it was already/still powered on or runtime PM is not compiled in,
>>> or a negative value on error, so shouldn't the test be (ret < 0)?
>>>
>>> Moreover, I'm actually wondering if it makes any sense to power up the
>>> hardware just to program it and power it down again. In a system where
>>> the IOMMU is located within a power domain, it would cause the IOMMU
>>> block to lose its state anyway.
>>
>>
>> This is generally for the case where the SMMU internal state remains active,
>> but the programming interface needs to be powered up in order to access it.
>
> That's true for Qualcomm SMMU, but I think that would be different for
> existing users of the driver?
>
>>
>>> Actually, reflecting back on "[PATCH v7 2/6] iommu/arm-smmu: Add
>>> pm_runtime/sleep ops", perhaps it would make more sense to just
>>> control the clocks independently of runtime PM? Then, runtime PM could
>>> be used for real power management, e.g. really powering the block up
>>> and down, for further power saving.
>>
>>
>> Unfortunately that ends up pretty much unmanageable, because there are
>> numerous different SMMU microarchitectures with fundamentally different
>> clock/power domain schemes (multiplied by individual SoC integration
>> possibilities). Since this is fundamentally a generic architectural driver,
>> adding explicit clock support would probably make the whole thing about 50%
>> clock code, with complicated decision trees around every hardware access
>> calculating which clocks are necessary for a given operation on a given
>> system. That maintainability aspect is why we've already nacked such a
>> fine-grained approach in the past.
>
> Hmm, I think we are talking about different things here. My suggestion
> would not add much more code to the driver than this patch does, calls
> to arm_smmu_enable_clocks() instead of pm_runtime_get_sync() and
> arm_smmu_disable_clocks() instead of pm_runtime_put(). The
> implementation of both functions would be a simple call to clk_bulk_
> API (possibly even no need to put this into functions, just call
> directly).
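A minimal sketch of what that suggestion would look like, assuming the clock
names below (purely illustrative) and that the clocks were obtained with
devm_clk_bulk_get() at probe time:

#include <linux/clk.h>
#include <linux/kernel.h>

/* Illustrative clock set; the real names depend on the SoC integration. */
static struct clk_bulk_data example_smmu_clks[] = {
        { .id = "iface" },
        { .id = "bus"   },
};

static int example_smmu_enable_clocks(void)
{
        return clk_bulk_prepare_enable(ARRAY_SIZE(example_smmu_clks),
                                       example_smmu_clks);
}

static void example_smmu_disable_clocks(void)
{
        clk_bulk_disable_unprepare(ARRAY_SIZE(example_smmu_clks),
                                   example_smmu_clks);
}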

Well, things are not so straightforward on msm. The IP clocks on msm are usually
powered by (or I should rather say, controlled by) the same power domain
that provides the VDD supply to the iommu block. This is the behavior on msm8996
at least, which we are testing on right now.
On later SoCs too, things don't change drastically.

So, you can't keep the block in a low power state while you program the
register space, and then power on the block to let it do its magic.
Clocks and power domains are linked, and that's why we add them to the
pm callbacks.

This approach also looks generic to me, since a platform will either have such
a link or it won't. In either case you will have power and clocks
available at the time when you need them.


Thanks & regards
Vivek

>
> Best regards,
> Tomasz
> --
> To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora