Re: [PATCH v4 25/27] drm/vmwgfx: Don't set struct drm_device.irq_enabled

2021-06-25 Thread Zack Rusin


> On Jun 25, 2021, at 04:22, Thomas Zimmermann  wrote:
> 
> The field drm_device.irq_enabled is only used by legacy drivers
> with userspace modesetting. Don't set it in vmwgfx. All usage of
> the field within vmwgfx can safely be removed.
> 
> Signed-off-by: Thomas Zimmermann 
> Reviewed-by: Laurent Pinchart 
> Acked-by: Daniel Vetter 

Looks good.

Reviewed-by: Zack Rusin 



Re: [PATCH 12/12] media: hantro: Add support for the Rockchip PX30

2021-06-25 Thread Ezequiel Garcia
Hi Alex,

On Fri, 2021-06-25 at 00:39 +0200, Alex Bee wrote:
> Hi Ezequiel,
> 
> > On 24.06.21 at 20:26, Ezequiel Garcia wrote:
> > From: Paul Kocialkowski 
> > 
> > The PX30 SoC includes both the VDPU2 and VEPU2 blocks which are similar
> > to the RK3399 (Hantro G1/H1 with shuffled registers).
> > 
> > Signed-off-by: Paul Kocialkowski 
> > Signed-off-by: Ezequiel Garcia 
> > ---
> >   drivers/staging/media/hantro/hantro_drv.c |  1 +
> >   drivers/staging/media/hantro/hantro_hw.h  |  1 +
> >   .../staging/media/hantro/rockchip_vpu_hw.c    | 28 +++
> >   3 files changed, 30 insertions(+)
> > 
> > diff --git a/drivers/staging/media/hantro/hantro_drv.c 
> > b/drivers/staging/media/hantro/hantro_drv.c
> > index 9b5415176bfe..8a2edd67f2c6 100644
> > --- a/drivers/staging/media/hantro/hantro_drv.c
> > +++ b/drivers/staging/media/hantro/hantro_drv.c
> > @@ -582,6 +582,7 @@ static const struct v4l2_file_operations hantro_fops = {
> >   
> >   static const struct of_device_id of_hantro_match[] = {
> >   #ifdef CONFIG_VIDEO_HANTRO_ROCKCHIP
> > +   { .compatible = "rockchip,px30-vpu",   .data = _vpu_variant, },
> > { .compatible = "rockchip,rk3036-vpu", .data = _vpu_variant, 
> > },
> > { .compatible = "rockchip,rk3066-vpu", .data = _vpu_variant, 
> > },
> > { .compatible = "rockchip,rk3288-vpu", .data = _vpu_variant, 
> > },
> > diff --git a/drivers/staging/media/hantro/hantro_hw.h 
> > b/drivers/staging/media/hantro/hantro_hw.h
> > index 9296624654a6..df7b5e3a57b9 100644
> > --- a/drivers/staging/media/hantro/hantro_hw.h
> > +++ b/drivers/staging/media/hantro/hantro_hw.h
> > @@ -209,6 +209,7 @@ enum hantro_enc_fmt {
> >   
> >   extern const struct hantro_variant imx8mq_vpu_g2_variant;
> >   extern const struct hantro_variant imx8mq_vpu_variant;
> > +extern const struct hantro_variant px30_vpu_variant;
> >   extern const struct hantro_variant rk3036_vpu_variant;
> >   extern const struct hantro_variant rk3066_vpu_variant;
> >   extern const struct hantro_variant rk3288_vpu_variant;
> > diff --git a/drivers/staging/media/hantro/rockchip_vpu_hw.c 
> > b/drivers/staging/media/hantro/rockchip_vpu_hw.c
> > index e4e3b5e7689b..e7f56e30b4a8 100644
> > --- a/drivers/staging/media/hantro/rockchip_vpu_hw.c
> > +++ b/drivers/staging/media/hantro/rockchip_vpu_hw.c
> > @@ -16,6 +16,7 @@
> >   
> >   #define RK3066_ACLK_MAX_FREQ (300 * 1000 * 1000)
> >   #define RK3288_ACLK_MAX_FREQ (400 * 1000 * 1000)
> > +#define PX30_ACLK_MAX_FREQ (300 * 1000 * 1000)
> >   
> 
> Not sure it is required (besides semantics) to introduce a new 
> *ACLK_MAX_FREQ here. rk3036_vpu_hw_init could be used to entirely 
> replace px30_vpu_hw_init in px30_vpu_variant.
> 
> (Maybe we can find some more common names, after we know which variant 
> combinations exist)
> 

TBH, I considered getting rid of all the macros and just using something
like 300 * MHZ.

Another alternative is to encode the clock rate in struct hantro_variant
itself.

In any case, I don't see this adding any value, so maybe I'll
just reuse rk3036_vpu_hw_init as you suggest.
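
For illustration only (nothing in this series does this), here is a minimal
sketch of the "encode the clock rate in struct hantro_variant" alternative.
The aclk_max_freq field and the helper name are hypothetical; the rest
mirrors the driver code quoted above:

static int hantro_common_hw_init(struct hantro_dev *vpu)
{
	/* clocks[0] is "aclk" for the Rockchip variants in this driver */
	clk_set_rate(vpu->clocks[0].clk, vpu->variant->aclk_max_freq);
	return 0;
}

With that, px30_vpu_variant and rk3036_vpu_variant would only differ in the
rate stored in the variant, and the per-SoC *_vpu_hw_init helpers (plus
their *_ACLK_MAX_FREQ macros) could go away.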

> >   /*
> >    * Supported formats.
> > @@ -279,6 +280,12 @@ static int rockchip_vpu_hw_init(struct hantro_dev *vpu)
> > return 0;
> >   }
> >   
> > +static int px30_vpu_hw_init(struct hantro_dev *vpu)
> > +{
> > +   clk_set_rate(vpu->clocks[0].clk, PX30_ACLK_MAX_FREQ);
> > +   return 0;
> > +}
> > +
> >   static void rk3066_vpu_dec_reset(struct hantro_ctx *ctx)
> >   {
> > struct hantro_dev *vpu = ctx->dev;
> > @@ -452,6 +459,10 @@ static const char * const rockchip_vpu_clk_names[] = {
> > "aclk", "hclk"
> >   };
> >   
> > +static const char * const px30_clk_names[] = {
> > +   "aclk", "hclk"
> > +};
> > +
> >   /* VDPU1/VEPU1 */
> >   
> >   const struct hantro_variant rk3036_vpu_variant = {
> > @@ -548,3 +559,20 @@ const struct hantro_variant rk3399_vpu_variant = {
> > .clk_names = rockchip_vpu_clk_names,
> > .num_clocks = ARRAY_SIZE(rockchip_vpu_clk_names)
> >   };
> > +
> > +const struct hantro_variant px30_vpu_variant = {
> > +   .enc_offset = 0x0,
> > +   .enc_fmts = rockchip_vpu_enc_fmts,
> > +   .num_enc_fmts = ARRAY_SIZE(rockchip_vpu_enc_fmts),
> > +   .dec_offset = 0x400,
> > +   .dec_fmts = rk3399_vpu_dec_fmts,
> > +   .num_dec_fmts = ARRAY_SIZE(rk3399_vpu_dec_fmts),
> > +   .codec = HANTRO_JPEG_ENCODER | HANTRO_MPEG2_DECODER |
> > +    HANTRO_VP8_DECODER | HANTRO_H264_DECODER,
> > +   .codec_ops = rk3399_vpu_codec_ops,
> > +   .irqs = rockchip_vpu2_irqs,
> > +   .num_irqs = ARRAY_SIZE(rockchip_vpu2_irqs),
> > +   .init = px30_vpu_hw_init,
> > +   .clk_names = px30_clk_names,
> > +   .num_clocks = ARRAY_SIZE(px30_clk_names)
> Better re-use rockchip_vpu_clk_names for these two.

Ah, this slipped through. You are right of course.

-- 
Kindly,
Ezequiel



Re: [PATCH 10/12] dt-bindings: media: rockchip-vpu: Add PX30 compatible

2021-06-25 Thread Ezequiel Garcia
Hey Dafna,

Thanks a lot for reviewing this.

On Fri, 2021-06-25 at 12:21 +0300, Dafna Hirschfeld wrote:
> Hi,
> 
> On 24.06.21 21:26, Ezequiel Garcia wrote:
> > From: Paul Kocialkowski 
> > 
> > The Rockchip PX30 SoC has a Hantro VPU that features a decoder (VDPU2)
> > and an encoder (VEPU2).
> > 
> > Signed-off-by: Paul Kocialkowski 
> > Signed-off-by: Ezequiel Garcia 
> > ---
> >   Documentation/devicetree/bindings/media/rockchip-vpu.yaml | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/media/rockchip-vpu.yaml 
> > b/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
> > index b88172a59de7..3b9c5aa91fcc 100644
> > --- a/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
> > +++ b/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
> > @@ -28,6 +28,9 @@ properties:
> >     - items:
> >     - const: rockchip,rk3228-vpu
> >     - const: rockchip,rk3399-vpu
> > +  - items:
> > +  - const: rockchip,px30-vpu
> > +  - const: rockchip,rk3399-vpu
> 
> This rk3399 compatible is already mentioned in the last 'items' list, should 
> we add it again?
> 

What we are mandating here is that "rockchip,px30-vpu" can only be used
with "rockchip,rk3399-vpu".

I.e.:

  compatible = "rockchip,px30-vpu", "rockchip,rk3399-vpu";

-- 
Kindly,
Ezequiel



Re: [PATCH 09/12] media: hantro: Enable H.264 on Rockchip VDPU2

2021-06-25 Thread Ezequiel Garcia
(Adding Nicolas)

Hi Alex,

On Fri, 2021-06-25 at 01:13 +0200, Alex Bee wrote:
> Hi Ezequiel,
> 
> On 24.06.21 at 20:26, Ezequiel Garcia wrote:
> > Given H.264 support for VDPU2 was just added, let's enable it.
> > For now, this is only enabled on platforms that don't have
> > an RKVDEC core, such as RK3328.
> 
> Is there any reason you do not want to enable H.264 on RK3399? I know 
> H.264 can be done by rkvdec already, but from what I understand that 
> shouldn't be an issue: the first decoder found that meets the 
> requirements will be taken.
> 

Thanks a lot for the review.

I really doubt userspace stacks readily support that strategy.

The first decoder device supporting the codec format will be selected;
I doubt features such as profiles and levels are checked to decide
which decoder to use.

I'd rather play it safe on the kernel side and avoid offering
two competing devices for the same codec.

Kindly,
Ezequiel



[PATCH 2/2] drm/i915/adlp: Add ADL-P GuC/HuC firmware files

2021-06-25 Thread John . C . Harrison
From: John Harrison 

Add ADL-P to the list of supported GuC and HuC firmware versions. For
HuC, it reuses the existing TGL firmware file. For GuC, there is a
dedicated firmware release.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index f05b1572e3c3..3a16d08608a5 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -48,6 +48,7 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
  * firmware as TGL.
  */
 #define INTEL_UC_FIRMWARE_DEFS(fw_def, guc_def, huc_def) \
+   fw_def(ALDERLAKE_P, 0, guc_def(adlp, 62, 0, 3), huc_def(tgl, 7, 9, 3)) \
fw_def(ALDERLAKE_S, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
fw_def(ROCKETLAKE,  0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
fw_def(TIGERLAKE,   0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
-- 
2.25.1



[PATCH 1/2] drm/i915/huc: Update TGL and friends to HuC 7.9.3

2021-06-25 Thread John . C . Harrison
From: John Harrison 

A new HuC is available for TGL and compatible platforms, so switch to
using it.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index 9f23e9de3237..f05b1572e3c3 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -48,9 +48,9 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
  * firmware as TGL.
  */
 #define INTEL_UC_FIRMWARE_DEFS(fw_def, guc_def, huc_def) \
-   fw_def(ALDERLAKE_S, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 5, 0)) \
-   fw_def(ROCKETLAKE,  0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 5, 0)) \
-   fw_def(TIGERLAKE,   0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 5, 0)) \
+   fw_def(ALDERLAKE_S, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
+   fw_def(ROCKETLAKE,  0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
+   fw_def(TIGERLAKE,   0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
fw_def(JASPERLAKE,  0, guc_def(ehl, 62, 0, 0), huc_def(ehl,  9, 0, 0)) \
fw_def(ELKHARTLAKE, 0, guc_def(ehl, 62, 0, 0), huc_def(ehl,  9, 0, 0)) \
fw_def(ICELAKE, 0, guc_def(icl, 62, 0, 0), huc_def(icl,  9, 0, 0)) \
-- 
2.25.1



[PATCH 0/2] Update to new HuC for TGL+ and enable GuC/HuC on ADL-P

2021-06-25 Thread John . C . Harrison
From: John Harrison 

There is a new HuC version available for TGL and compatible platforms,
so switch to using it. Also, there is now a GuC and HuC for ADL-P, so
use those too.

Signed-off-by: John Harrison 


John Harrison (2):
  drm/i915/huc: Update TGL and friends to HuC 7.9.3
  drm/i915/adlp: Add ADL-P GuC/HuC firmware files

 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

-- 
2.25.1



[CI] PR for new GuC v62.0.3 and HuC v7.9.3 binaries

2021-06-25 Thread John . C . Harrison
The following changes since commit 0f66b74b6267fce66395316308d88b0535aa3df2:

  cypress: update firmware for cyw54591 pcie (2021-06-09 07:12:02 -0400)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-firmware adlp_updates

for you to fetch changes up to 38983b6f28e0b217aedd7c5430081a60d6591732:

  drm/i915/firmware: Add GuC v62.03 for ADLP (2021-06-25 11:51:11 -0700)


Anusha Srivatsa (2):
  drm/i915/firmware: Add HuC v7.9.3 for TGL
  drm/i915/firmware: Add GuC v62.03 for ADLP

 WHENCE  |   7 +++
 adlp_guc_62.0.3.bin | Bin 0 -> 336704 bytes
 tgl_huc_7.9.3.bin   | Bin 0 -> 589888 bytes
 3 files changed, 7 insertions(+)
 create mode 100644 adlp_guc_62.0.3.bin
 create mode 100644 tgl_huc_7.9.3.bin


Re: [Intel-gfx] [PATCH 06/47] drm/i915/guc: Optimize CTB writes and reads

2021-06-25 Thread Matthew Brost
On Fri, Jun 25, 2021 at 03:09:29PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 24.06.2021 09:04, Matthew Brost wrote:
> > CTB writes are now in the path of command submission and should be
> > optimized for performance. Rather than reading CTB descriptor values
> > (e.g. head, tail) which could result in accesses across the PCIe bus,
> > store shadow local copies and only read/write the descriptor values when
> > absolutely necessary. Also store the current space in each channel
> > locally.
> > 

Missed two comments, addressed below.

> > Signed-off-by: John Harrison 
> > Signed-off-by: Matthew Brost 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 76 ++-
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
> >  2 files changed, 51 insertions(+), 31 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > index 27ec30b5ef47..1fd5c69358ef 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct 
> > guc_ct_buffer_desc *desc)
> >  static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
> >  {
> > ctb->broken = false;
> > +   ctb->tail = 0;
> > +   ctb->head = 0;
> > +   ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
> > +
> > guc_ct_buffer_desc_init(ctb->desc);
> >  }
> >  
> > @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
> >  {
> > struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> > struct guc_ct_buffer_desc *desc = ctb->desc;
> > -   u32 head = desc->head;
> > -   u32 tail = desc->tail;
> > +   u32 tail = ctb->tail;
> > u32 size = ctb->size;
> > -   u32 used;
> > u32 header;
> > u32 hxg;
> > u32 *cmds = ctb->cmds;
> > @@ -398,25 +400,14 @@ static int ct_write(struct intel_guc_ct *ct,
> > if (unlikely(desc->status))
> > goto corrupted;
> >  
> > -   if (unlikely((tail | head) >= size)) {
> > +#ifdef CONFIG_DRM_I915_DEBUG_GUC
> 
> since we are caching tail, we may want to check if it's sill correct:
> 
>   tail = READ_ONCE(desc->tail);
>   if (unlikely(tail != ctb->tail)) {
>   CT_ERROR(ct, "Tail was modified %u != %u\n",
>tail, ctb->tail);
>   desc->status |= GUC_CTB_STATUS_MISMATCH;
>   goto corrupted;
>   }
> 
> and since we own the tail then we can be more strict:
> 
>   GEM_BUG_ON(tail > size);
> 
> and then finally just check GuC head:
> 
>   head = READ_ONCE(desc->head);
>   if (unlikely(head >= size)) {
>   ...
> 
> > +   if (unlikely((desc->tail | desc->head) >= size)) {
> > CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
> > -head, tail, size);
> > +desc->head, desc->tail, size);
> > desc->status |= GUC_CTB_STATUS_OVERFLOW;
> > goto corrupted;
> > }
> > -
> > -   /*
> > -* tail == head condition indicates empty. GuC FW does not support
> > -* using up the entire buffer to get tail == head meaning full.
> > -*/
> > -   if (tail < head)
> > -   used = (size - head) + tail;
> > -   else
> > -   used = tail - head;
> > -
> > -   /* make sure there is a space including extra dw for the fence */
> > -   if (unlikely(used + len + 1 >= size))
> > -   return -ENOSPC;
> > +#endif
> >  
> > /*
> >  * dw0: CT header (including fence)
> > @@ -457,7 +448,9 @@ static int ct_write(struct intel_guc_ct *ct,
> > write_barrier(ct);
> >  
> > /* now update descriptor */
> > +   ctb->tail = tail;
> > WRITE_ONCE(desc->tail, tail);
> > +   ctb->space -= len + 1;
> 
> this magic "1" is likely GUC_CTB_MSG_MIN_LEN, right ?
> 
> >  
> > return 0;
> >  
> > @@ -473,7 +466,7 @@ static int ct_write(struct intel_guc_ct *ct,
> >   * @req:   pointer to pending request
> >   * @status:placeholder for status
> >   *
> > - * For each sent request, Guc shall send bac CT response message.
> > + * For each sent request, GuC shall send back CT response message.
> >   * Our message handler will update status of tracked request once
> >   * response message with given fence is received. Wait here and
> >   * check for valid response status value.
> > @@ -520,24 +513,35 @@ static inline bool ct_deadlocked(struct intel_guc_ct 
> > *ct)
> > return ret;
> >  }
> >  
> > -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 
> > len_dw)
> > +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
> >  {
> > -   struct guc_ct_buffer_desc *desc = ctb->desc;
> > -   u32 head = READ_ONCE(desc->head);
> > +   struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> > +   u32 head;
> > u32 space;
> >  
> > -   space = CIRC_SPACE(desc->tail, head, ctb->size);
> > +   if (ctb->space >= len_dw)
> > +   return true;
> > +
> > +   head = 

Re: [PATCH] drm/amdgpu/dc: Really fix DCN3.1 Makefile for PPC64

2021-06-25 Thread Harry Wentland
On 2021-06-23 6:30 a.m., Michal Suchanek wrote:
> Also copy over the part that makes old gcc handling cross-platform.
> 
> Fixes: df7a1658f257 ("drm/amdgpu/dc: fix DCN3.1 Makefile for PPC64")
> Fixes: 926d6972efb6 ("drm/amd/display: Add DCN3.1 blocks to the DC Makefile")
> Signed-off-by: Michal Suchanek 

Reviewed-by: Harry Wentland 

Harry

> ---
> The fact that the old gcc handling triggers on gcc 10 and 11 is another
> story I don't want to delve into.
> ---
>  drivers/gpu/drm/amd/display/dc/dcn31/Makefile | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile 
> b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
> index 5dcdc5a858fe..4bab97acb155 100644
> --- a/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
> +++ b/drivers/gpu/drm/amd/display/dc/dcn31/Makefile
> @@ -28,6 +28,7 @@ endif
>  CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o += -mhard-float
>  endif
>  
> +ifdef CONFIG_X86
>  ifdef IS_OLD_GCC
>  # Stack alignment mismatch, proceed with caution.
>  # GCC < 7.1 cannot compile code using `double` and 
> -mpreferred-stack-boundary=3
> @@ -36,6 +37,7 @@ CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o += 
> -mpreferred-stack-boundary=4
>  else
>  CFLAGS_$(AMDDALPATH)/dc/dcn31/dcn31_resource.o += -msse2
>  endif
> +endif
>  
>  AMD_DAL_DCN31 = $(addprefix $(AMDDALPATH)/dc/dcn31/,$(DCN31))
>  
> 



Re: [PATCH 09/47] drm/i915/guc: Remove GuC stage descriptor, add lrc descriptor

2021-06-25 Thread John Harrison

On 6/24/2021 00:04, Matthew Brost wrote:

Remove old GuC stage descriptor, add lrc descriptor which will be used
by the new GuC interface implemented in this patch series.

Cc: John Harrison 
Signed-off-by: Matthew Brost 

Reviewed-by: John Harrison 



RE: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map time

2021-06-25 Thread Ruhl, Michael J


>-Original Message-
>From: Thomas Hellström 
>Sent: Friday, June 25, 2021 3:10 PM
>To: Ruhl, Michael J 
>; intel-
>g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
>Cc: Auld, Matthew 
>Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map
>time
>
>
>On 6/25/21 9:07 PM, Ruhl, Michael J wrote:
>>> -Original Message-
>>> From: Thomas Hellström 
>>> Sent: Friday, June 25, 2021 2:50 PM
>>> To: Ruhl, Michael J ; intel-
>>> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
>>> Cc: Auld, Matthew 
>>> Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf
>map
>>> time
>>>
>>> Hi, Mike,
>>>
>>> On 6/25/21 7:57 PM, Ruhl, Michael J wrote:
> -Original Message-
> From: Thomas Hellström 
> Sent: Friday, June 25, 2021 1:52 PM
> To: Ruhl, Michael J ; intel-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Cc: Auld, Matthew 
> Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf
>>> map
> time
>
>
> On 6/25/21 7:38 PM, Ruhl, Michael J wrote:
>>> -Original Message-
>>> From: Thomas Hellström 
>>> Sent: Friday, June 25, 2021 12:18 PM
>>> To: Ruhl, Michael J ; intel-
>>> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
>>> Cc: Auld, Matthew 
>>> Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-
>buf
> map
>>> time
>>>
>>> Hi, Michael,
>>>
>>> thanks for looking at this.
>>>
>>> On 6/25/21 6:02 PM, Ruhl, Michael J wrote:
> -Original Message-
> From: dri-devel  On
>>> Behalf
> Of
> Thomas Hellström
> Sent: Thursday, June 24, 2021 2:31 PM
> To: intel-...@lists.freedesktop.org; dri-
>de...@lists.freedesktop.org
> Cc: Thomas Hellström ; Auld,
>>> Matthew
> 
> Subject: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf
>>> map
>>> time
> Until we support p2p dma or as a complement to that, migrate data
> to system memory at dma-buf map time if possible.
>
> Signed-off-by: Thomas Hellström
>>> 
> ---
> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 9 -
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> index 616c3a2f1baf..a52f885bc09a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> @@ -25,7 +25,14 @@ static struct sg_table
>>> *i915_gem_map_dma_buf(struct
> dma_buf_attachment *attachme
>   struct scatterlist *src, *dst;
>   int ret, i;
>
> - ret = i915_gem_object_pin_pages_unlocked(obj);
> + ret = i915_gem_object_lock_interruptible(obj, NULL);
 Hmm, I believe in most cases that the caller should be holding the
 lock (object dma-resv) on this object already.
>>> Yes, I agree, In particular for other instances of our own driver,  at
>>> least since the dma_resv introduction.
>>>
>>> But I also think that's a pre-existing bug, since
>>> i915_gem_object_pin_pages_unlocked() will also take the lock.
>> Ouch yes.  Missed that.
>>
>>> I Think we need to initially make the exporter dynamic-capable to
>>> resolve this, and drop the locking here completely, as dma-buf docs
>says
>>> that we're then guaranteed to get called with the object lock held.
>>>
>>> I figure if we make the exporter dynamic, we need to migrate already
>at
>>> dma_buf_pin time so we don't pin the object in the wrong location.
>> The exporter as dynamic  (ops->pin is available) is optional, but
>importer
>> dynamic (ops->move_notify) is required.
>>
>> With that in mind, it would seem that there are three possible
>>> combinations
>> for the migrate to be attempted:
>>
>> 1) in the ops->pin function (export_dynamic != import_dynamic,
>during
> attach)
>> 2) in the ops->pin function (export_dynamic and
> !CONFIG_DMABUF_MOVE_NOTIFY) during mapping
>> 3) and possibly in ops->map_dma_buf (export_dynamic and
> CONFIG_DMABUF_MOVE_NOTIFY)
>> Since one possibility has to be in the mapping function, it seems that if
>we
>> can figure out the locking, that the migrate should probably be
>available
> here.
>> Mike
> So perhaps just to initially fix the bug, we could just implement NOP
> pin() and unpin() callbacks and drop the locking in map_attach() and
> replace it with an assert_object_held();
 That is the sticky part of the move notify API.

 If you do the attach_dynamic you have to have an ops with move_notify.

 (https://elixir.bootlin.com/linux/v5.13-rc7/source/drivers/dma-buf/dma-
>>> buf.c#L730)
 If you don't have that, i.e. just 

Re: [Intel-gfx] [PATCH 43/47] drm/i915/guc: Hook GuC scheduling policies up

2021-06-25 Thread John Harrison

On 6/24/2021 17:59, Matthew Brost wrote:

On Thu, Jun 24, 2021 at 12:05:12AM -0700, Matthew Brost wrote:

From: John Harrison 

Use the official driver default scheduling policies for configuring
the GuC scheduler rather than a bunch of hardcoded values.

Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
Cc: Jose Souza 
---
  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  1 +
  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  2 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 44 ++-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 11 +++--
  4 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 0ceffa2be7a7..37db857bb56c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -455,6 +455,7 @@ struct intel_engine_cs {
  #define I915_ENGINE_IS_VIRTUAL   BIT(5)
  #define I915_ENGINE_HAS_RELATIVE_MMIO BIT(6)
  #define I915_ENGINE_REQUIRES_CMD_PARSER BIT(7)
+#define I915_ENGINE_WANT_FORCED_PREEMPTION BIT(8)
unsigned int flags;
  
  	/*

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index c38365cd5fab..905ecbc7dbe3 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -270,6 +270,8 @@ int intel_guc_engine_failure_process_msg(struct intel_guc 
*guc,
  
  void intel_guc_find_hung_context(struct intel_engine_cs *engine);
  
+int intel_guc_global_policies_update(struct intel_guc *guc);

+
  void intel_guc_submission_reset_prepare(struct intel_guc *guc);
  void intel_guc_submission_reset(struct intel_guc *guc, bool stalled);
  void intel_guc_submission_reset_finish(struct intel_guc *guc);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index d3e86ab7508f..2ad5fcd4e1b7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -77,14 +77,54 @@ static u32 guc_ads_blob_size(struct intel_guc *guc)
   guc_ads_private_data_size(guc);
  }
  
-static void guc_policies_init(struct guc_policies *policies)

+static void guc_policies_init(struct intel_guc *guc, struct guc_policies 
*policies)
  {
+   struct intel_gt *gt = guc_to_gt(guc);
+   struct drm_i915_private *i915 = gt->i915;
+
policies->dpc_promote_time = GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US;
policies->max_num_work_items = GLOBAL_POLICY_MAX_NUM_WI;
+
policies->global_flags = 0;
+   if (i915->params.reset < 2)
+   policies->global_flags |= GLOBAL_POLICY_DISABLE_ENGINE_RESET;
+
policies->is_valid = 1;
  }
  
+static int guc_action_policies_update(struct intel_guc *guc, u32 policy_offset)

+{
+   u32 action[] = {
+   INTEL_GUC_ACTION_GLOBAL_SCHED_POLICY_CHANGE,
+   policy_offset
+   };
+
+   return intel_guc_send(guc, action, ARRAY_SIZE(action));
+}
+
+int intel_guc_global_policies_update(struct intel_guc *guc)
+{
+   struct __guc_ads_blob *blob = guc->ads_blob;
+   struct intel_gt *gt = guc_to_gt(guc);
+   intel_wakeref_t wakeref;
+   int ret;
+
+   if (!blob)
+   return -ENOTSUPP;
+
+   GEM_BUG_ON(!blob->ads.scheduler_policies);
+
> > +   guc_policies_init(guc, &blob->policies);
+
+   if (!intel_guc_is_ready(guc))
+   return 0;
+
> > +   with_intel_runtime_pm(&gt->i915->runtime_pm, wakeref)
> > +   ret = guc_action_policies_update(guc, blob->ads.scheduler_policies);
+
+   return ret;
+}
+
  static void guc_mapping_table_init(struct intel_gt *gt,
   struct guc_gt_system_info *system_info)
  {
@@ -281,7 +321,7 @@ static void __guc_ads_init(struct intel_guc *guc)
u8 engine_class, guc_class;
  
  	/* GuC scheduling policies */

-   guc_policies_init(&blob->policies);
+   guc_policies_init(guc, &blob->policies);
  
  	/*

 * GuC expects a per-engine-class context image and size
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 6188189314d5..a427336ce916 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -873,6 +873,7 @@ void intel_guc_submission_reset_finish(struct intel_guc 
*guc)
	GEM_WARN_ON(atomic_read(&guc->outstanding_submission_g2h));
atomic_set(>outstanding_submission_g2h, 0);
  
+	intel_guc_global_policies_update(guc);

enable_submission(guc);
intel_gt_unpark_heartbeats(guc_to_gt(guc));
  }
@@ -1161,8 +1162,12 @@ static void guc_context_policy_init(struct 
intel_engine_cs *engine,
  {
desc->policy_flags = 0;
  
-	desc->execution_quantum = CONTEXT_POLICY_DEFAULT_EXECUTION_QUANTUM_US;

-   desc->preemption_timeout = CONTEXT_POLICY_DEFAULT_PREEMPTION_TIME_US;
+   if (engine->flags & 

Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map time

2021-06-25 Thread Thomas Hellström



On 6/25/21 9:07 PM, Ruhl, Michael J wrote:

-Original Message-
From: Thomas Hellström 
Sent: Friday, June 25, 2021 2:50 PM
To: Ruhl, Michael J ; intel-
g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: Auld, Matthew 
Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map
time

Hi, Mike,

On 6/25/21 7:57 PM, Ruhl, Michael J wrote:

-Original Message-
From: Thomas Hellström 
Sent: Friday, June 25, 2021 1:52 PM
To: Ruhl, Michael J ; intel-
g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: Auld, Matthew 
Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf

map

time


On 6/25/21 7:38 PM, Ruhl, Michael J wrote:

-Original Message-
From: Thomas Hellström 
Sent: Friday, June 25, 2021 12:18 PM
To: Ruhl, Michael J ; intel-
g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: Auld, Matthew 
Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf

map

time

Hi, Michael,

thanks for looking at this.

On 6/25/21 6:02 PM, Ruhl, Michael J wrote:

-Original Message-
From: dri-devel  On

Behalf

Of

Thomas Hellström
Sent: Thursday, June 24, 2021 2:31 PM
To: intel-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: Thomas Hellström ; Auld,

Matthew


Subject: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf

map

time

Until we support p2p dma or as a complement to that, migrate data
to system memory at dma-buf map time if possible.

Signed-off-by: Thomas Hellström



---
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 9 -
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 616c3a2f1baf..a52f885bc09a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -25,7 +25,14 @@ static struct sg_table

*i915_gem_map_dma_buf(struct

dma_buf_attachment *attachme
struct scatterlist *src, *dst;
int ret, i;

-   ret = i915_gem_object_pin_pages_unlocked(obj);
+   ret = i915_gem_object_lock_interruptible(obj, NULL);

Hmm, I believe in most cases that the caller should be holding the
lock (object dma-resv) on this object already.

Yes, I agree, In particular for other instances of our own driver,  at
least since the dma_resv introduction.

But I also think that's a pre-existing bug, since
i915_gem_object_pin_pages_unlocked() will also take the lock.

Ouch yes.  Missed that.


I Think we need to initially make the exporter dynamic-capable to
resolve this, and drop the locking here completely, as dma-buf docs says
that we're then guaranteed to get called with the object lock held.

I figure if we make the exporter dynamic, we need to migrate already at
dma_buf_pin time so we don't pin the object in the wrong location.

The exporter as dynamic  (ops->pin is available) is optional, but importer
dynamic (ops->move_notify) is required.

With that in mind, it would seem that there are three possible

combinations

for the migrate to be attempted:

1) in the ops->pin function (export_dynamic != import_dynamic, during

attach)

2) in the ops->pin function (export_dynamic and

!CONFIG_DMABUF_MOVE_NOTIFY) during mapping

3) and possibly in ops->map_dma_buf (export_dynamic and

CONFIG_DMABUF_MOVE_NOTIFY)

Since one possibility has to be in the mapping function, it seems that if we
can figure out the locking, that the migrate should probably be available

here.

Mike

So perhaps just to initially fix the bug, we could just implement NOP
pin() and unpin() callbacks and drop the locking in map_attach() and
replace it with an assert_object_held();

That is the sticky part of the move notify API.

If you do the attach_dynamic you have to have an ops with move_notify.

(https://elixir.bootlin.com/linux/v5.13-rc7/source/drivers/dma-buf/dma-

buf.c#L730)

If you don't have that, i.e. just the pin interface, the attach will be
rejected, and you will not get the callbacks.

I understood that as the requirement for move_notify is only if the
*importer* declares dynamic. A dynamic exporter could choose whether to
call move_notify() on eviction or to pin and never evict. If the
importer is non-dynamic, the core calls pin() and the only choice is to
pin and never evict.

So if we temporarily choose to pin and never evict for *everything*, (as
the current code does now), I think we should be good for now, and then
we can implement all fancy p2p and move_notify stuff on top of that.

/sigh.

You are correct.  I was mistakenly placing the pin API (dma_buf_ops) in the
attach_ops.  Must be Friday.

Upon further reflection, I think that your path will work.

However, is doing a pin (with no locking) from the dma_buf_mapping any different
from using the pin API + export_dynamic?

M


Yes, it's different for dynamic importers only that would otherwise 
never pin, and we could mistakenly evict the object without having 
implemented calling move_notify. If 

RE: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map time

2021-06-25 Thread Ruhl, Michael J
>-Original Message-
>From: Thomas Hellström 
>Sent: Friday, June 25, 2021 2:50 PM
>To: Ruhl, Michael J ; intel-
>g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
>Cc: Auld, Matthew 
>Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map
>time
>
>Hi, Mike,
>
>On 6/25/21 7:57 PM, Ruhl, Michael J wrote:
>>> -Original Message-
>>> From: Thomas Hellström 
>>> Sent: Friday, June 25, 2021 1:52 PM
>>> To: Ruhl, Michael J ; intel-
>>> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
>>> Cc: Auld, Matthew 
>>> Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf
>map
>>> time
>>>
>>>
>>> On 6/25/21 7:38 PM, Ruhl, Michael J wrote:
> -Original Message-
> From: Thomas Hellström 
> Sent: Friday, June 25, 2021 12:18 PM
> To: Ruhl, Michael J ; intel-
> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Cc: Auld, Matthew 
> Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf
>>> map
> time
>
> Hi, Michael,
>
> thanks for looking at this.
>
> On 6/25/21 6:02 PM, Ruhl, Michael J wrote:
>>> -Original Message-
>>> From: dri-devel  On
>Behalf
>>> Of
>>> Thomas Hellström
>>> Sent: Thursday, June 24, 2021 2:31 PM
>>> To: intel-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
>>> Cc: Thomas Hellström ; Auld,
> Matthew
>>> 
>>> Subject: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf
>map
> time
>>> Until we support p2p dma or as a complement to that, migrate data
>>> to system memory at dma-buf map time if possible.
>>>
>>> Signed-off-by: Thomas Hellström
>
>>> ---
>>> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 9 -
>>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> index 616c3a2f1baf..a52f885bc09a 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> @@ -25,7 +25,14 @@ static struct sg_table
> *i915_gem_map_dma_buf(struct
>>> dma_buf_attachment *attachme
>>> struct scatterlist *src, *dst;
>>> int ret, i;
>>>
>>> -   ret = i915_gem_object_pin_pages_unlocked(obj);
>>> +   ret = i915_gem_object_lock_interruptible(obj, NULL);
>> Hmm, I believe in most cases that the caller should be holding the
>> lock (object dma-resv) on this object already.
> Yes, I agree, In particular for other instances of our own driver,  at
> least since the dma_resv introduction.
>
> But I also think that's a pre-existing bug, since
> i915_gem_object_pin_pages_unlocked() will also take the lock.
 Ouch yes.  Missed that.

> I Think we need to initially make the exporter dynamic-capable to
> resolve this, and drop the locking here completely, as dma-buf docs says
> that we're then guaranteed to get called with the object lock held.
>
> I figure if we make the exporter dynamic, we need to migrate already at
> dma_buf_pin time so we don't pin the object in the wrong location.
 The exporter as dynamic  (ops->pin is available) is optional, but importer
 dynamic (ops->move_notify) is required.

 With that in mind, it would seem that there are three possible
>combinations
 for the migrate to be attempted:

 1) in the ops->pin function (export_dynamic != import_dynamic, during
>>> attach)
 2) in the ops->pin function (export_dynamic and
>>> !CONFIG_DMABUF_MOVE_NOTIFY) during mapping
 3) and possibly in ops->map_dma_buf (export_dynamic and
>>> CONFIG_DMABUF_MOVE_NOTIFY)
 Since one possibility has to be in the mapping function, it seems that if 
 we
 can figure out the locking, that the migrate should probably be available
>>> here.
 Mike
>>> So perhaps just to initially fix the bug, we could just implement NOP
>>> pin() and unpin() callbacks and drop the locking in map_attach() and
>>> replace it with an assert_object_held();
>> That is the sticky part of the move notify API.
>>
>> If you do the attach_dynamic you have to have an ops with move_notify.
>>
>> (https://elixir.bootlin.com/linux/v5.13-rc7/source/drivers/dma-buf/dma-
>buf.c#L730)
>>
>> If you don't have that, i.e. just the pin interface, the attach will be
>> rejected, and you will not get the callbacks.
>
>I understood that as the requirement for move_notify is only if the
>*importer* declares dynamic. A dynamic exporter could choose whether to
>call move_notify() on eviction or to pin and never evict. If the
>importer is non-dynamic, the core calls pin() and the only choice is to
>pin and never evict.
>
>So if we temporarily choose to pin and never evict for *everything*, (as
>the current code does now), I think we should be good for now, and then
>we can implement all fancy p2p 

Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map time

2021-06-25 Thread Thomas Hellström

Hi, Mike,

On 6/25/21 7:57 PM, Ruhl, Michael J wrote:

-Original Message-
From: Thomas Hellström 
Sent: Friday, June 25, 2021 1:52 PM
To: Ruhl, Michael J ; intel-
g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: Auld, Matthew 
Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map
time


On 6/25/21 7:38 PM, Ruhl, Michael J wrote:

-Original Message-
From: Thomas Hellström 
Sent: Friday, June 25, 2021 12:18 PM
To: Ruhl, Michael J ; intel-
g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: Auld, Matthew 
Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf

map

time

Hi, Michael,

thanks for looking at this.

On 6/25/21 6:02 PM, Ruhl, Michael J wrote:

-Original Message-
From: dri-devel  On Behalf

Of

Thomas Hellström
Sent: Thursday, June 24, 2021 2:31 PM
To: intel-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: Thomas Hellström ; Auld,

Matthew


Subject: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map

time

Until we support p2p dma or as a complement to that, migrate data
to system memory at dma-buf map time if possible.

Signed-off-by: Thomas Hellström 
---
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 9 -
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 616c3a2f1baf..a52f885bc09a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -25,7 +25,14 @@ static struct sg_table

*i915_gem_map_dma_buf(struct

dma_buf_attachment *attachme
struct scatterlist *src, *dst;
int ret, i;

-   ret = i915_gem_object_pin_pages_unlocked(obj);
+   ret = i915_gem_object_lock_interruptible(obj, NULL);

Hmm, I believe in most cases that the caller should be holding the
lock (object dma-resv) on this object already.

Yes, I agree, In particular for other instances of our own driver,  at
least since the dma_resv introduction.

But I also think that's a pre-existing bug, since
i915_gem_object_pin_pages_unlocked() will also take the lock.

Ouch yes.  Missed that.


I Think we need to initially make the exporter dynamic-capable to
resolve this, and drop the locking here completely, as dma-buf docs says
that we're then guaranteed to get called with the object lock held.

I figure if we make the exporter dynamic, we need to migrate already at
dma_buf_pin time so we don't pin the object in the wrong location.

The exporter as dynamic  (ops->pin is available) is optional, but importer
dynamic (ops->move_notify) is required.

With that in mind, it would seem that there are three possible combinations
for the migrate to be attempted:

1) in the ops->pin function (export_dynamic != import_dynamic, during

attach)

2) in the ops->pin function (export_dynamic and

!CONFIG_DMABUF_MOVE_NOTIFY) during mapping

3) and possibly in ops->map_dma_buf (export_dynamic and

CONFIG_DMABUF_MOVE_NOTIFY)

Since one possibility has to be in the mapping function, it seems that if we
can figure out the locking, that the migrate should probably be available

here.

Mike

So perhaps just to initially fix the bug, we could just implement NOP
pin() and unpin() callbacks and drop the locking in map_attach() and
replace it with an assert_object_held();

That is the sticky part of the move notify API.

If you do the attach_dynamic you have to have an ops with move_notify.

(https://elixir.bootlin.com/linux/v5.13-rc7/source/drivers/dma-buf/dma-buf.c#L730)

If you don't have that, i.e. just the pin interface, the attach will be
rejected, and you will not get the callbacks.


I understood that as the requirement for move_notify is only if the 
*importer* declares dynamic. A dynamic exporter could choose whether to 
call move_notify() on eviction or to pin and never evict. If the 
importer is non-dynamic, the core calls pin() and the only choice is to 
pin and never evict.


So if we temporarily choose to pin and never evict for *everything*, (as 
the current code does now), I think we should be good for now, and then 
we can implement all fancy p2p and move_notify stuff on top of that.


/Thomas




So I think that the only thing we can do for now is to drop the locking and add 
the

assert_object_held();

M







/Thomas



Re: [Intel-gfx] [PATCH 06/47] drm/i915/guc: Optimize CTB writes and reads

2021-06-25 Thread Matthew Brost
On Fri, Jun 25, 2021 at 03:09:29PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 24.06.2021 09:04, Matthew Brost wrote:
> > CTB writes are now in the path of command submission and should be
> > optimized for performance. Rather than reading CTB descriptor values
> > (e.g. head, tail) which could result in accesses across the PCIe bus,
> > store shadow local copies and only read/write the descriptor values when
> > absolutely necessary. Also store the current space in each channel
> > locally.
> > 
> > Signed-off-by: John Harrison 
> > Signed-off-by: Matthew Brost 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 76 ++-
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
> >  2 files changed, 51 insertions(+), 31 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > index 27ec30b5ef47..1fd5c69358ef 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct 
> > guc_ct_buffer_desc *desc)
> >  static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
> >  {
> > ctb->broken = false;
> > +   ctb->tail = 0;
> > +   ctb->head = 0;
> > +   ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
> > +
> > guc_ct_buffer_desc_init(ctb->desc);
> >  }
> >  
> > @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
> >  {
> > struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> > struct guc_ct_buffer_desc *desc = ctb->desc;
> > -   u32 head = desc->head;
> > -   u32 tail = desc->tail;
> > +   u32 tail = ctb->tail;
> > u32 size = ctb->size;
> > -   u32 used;
> > u32 header;
> > u32 hxg;
> > u32 *cmds = ctb->cmds;
> > @@ -398,25 +400,14 @@ static int ct_write(struct intel_guc_ct *ct,
> > if (unlikely(desc->status))
> > goto corrupted;
> >  
> > -   if (unlikely((tail | head) >= size)) {
> > +#ifdef CONFIG_DRM_I915_DEBUG_GUC
> 
> since we are caching tail, we may want to check if it's sill correct:
> 
>   tail = READ_ONCE(desc->tail);
>   if (unlikely(tail != ctb->tail)) {
>   CT_ERROR(ct, "Tail was modified %u != %u\n",
>tail, ctb->tail);
>   desc->status |= GUC_CTB_STATUS_MISMATCH;
>   goto corrupted;
>   }
> 
> and since we own the tail then we can be more strict:
> 
>   GEM_BUG_ON(tail > size);
> 
> and then finally just check GuC head:
> 
>   head = READ_ONCE(desc->head);
>   if (unlikely(head >= size)) {
>   ...
> 

Sure, but still hidden behind CONFIG_DRM_I915_DEBUG_GUC, right?

> > +   if (unlikely((desc->tail | desc->head) >= size)) {
> > CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
> > -head, tail, size);
> > +desc->head, desc->tail, size);
> > desc->status |= GUC_CTB_STATUS_OVERFLOW;
> > goto corrupted;
> > }
> > -
> > -   /*
> > -* tail == head condition indicates empty. GuC FW does not support
> > -* using up the entire buffer to get tail == head meaning full.
> > -*/
> > -   if (tail < head)
> > -   used = (size - head) + tail;
> > -   else
> > -   used = tail - head;
> > -
> > -   /* make sure there is a space including extra dw for the fence */
> > -   if (unlikely(used + len + 1 >= size))
> > -   return -ENOSPC;
> > +#endif
> >  
> > /*
> >  * dw0: CT header (including fence)
> > @@ -457,7 +448,9 @@ static int ct_write(struct intel_guc_ct *ct,
> > write_barrier(ct);
> >  
> > /* now update descriptor */
> > +   ctb->tail = tail;
> > WRITE_ONCE(desc->tail, tail);
> > +   ctb->space -= len + 1;
> 
> this magic "1" is likely GUC_CTB_MSG_MIN_LEN, right ?
>

Yes.
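
(Aside: a condensed sketch of the space accounting being agreed on above; it
only restates what the patch does, with h2g_has_room() simplified to take the
send buffer directly and the magic "1" spelled out as GUC_CTB_MSG_MIN_LEN.
CIRC_SPACE() comes from <linux/circ_buf.h>.)

static bool h2g_has_room_sketch(struct intel_guc_ct_buffer *ctb, u32 len_dw)
{
	u32 head;

	/* fast path: trust the locally cached free space, no PCIe read */
	if (ctb->space >= len_dw)
		return true;

	/* slow path: re-read the GuC-owned head and recompute the space */
	head = READ_ONCE(ctb->desc->head);
	ctb->space = CIRC_SPACE(ctb->tail, head, ctb->size);

	return ctb->space >= len_dw;
}

/* and after a successful ct_write() of 'len' action dwords: */
/*	ctb->space -= len + GUC_CTB_MSG_MIN_LEN;	*/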
 
> >  
> > return 0;
> >  
> > @@ -473,7 +466,7 @@ static int ct_write(struct intel_guc_ct *ct,
> >   * @req:   pointer to pending request
> >   * @status:placeholder for status
> >   *
> > - * For each sent request, Guc shall send bac CT response message.
> > + * For each sent request, GuC shall send back CT response message.
> >   * Our message handler will update status of tracked request once
> >   * response message with given fence is received. Wait here and
> >   * check for valid response status value.
> > @@ -520,24 +513,35 @@ static inline bool ct_deadlocked(struct intel_guc_ct 
> > *ct)
> > return ret;
> >  }
> >  
> > -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 
> > len_dw)
> > +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
> >  {
> > -   struct guc_ct_buffer_desc *desc = ctb->desc;
> > -   u32 head = READ_ONCE(desc->head);
> > +   struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> > +   u32 head;
> > u32 space;
> >  
> > -   space = CIRC_SPACE(desc->tail, head, ctb->size);
> > +   if (ctb->space >= len_dw)
> > +   return true;
> > +
> > 

Re: [Intel-gfx] [PATCH 04/47] drm/i915/guc: Add non blocking CTB send function

2021-06-25 Thread Matthew Brost
On Fri, Jun 25, 2021 at 01:50:21PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 25.06.2021 00:41, Matthew Brost wrote:
> > On Thu, Jun 24, 2021 at 07:02:18PM +0200, Michal Wajdeczko wrote:
> >>
> >>
> >> On 24.06.2021 17:49, Matthew Brost wrote:
> >>> On Thu, Jun 24, 2021 at 04:48:32PM +0200, Michal Wajdeczko wrote:
> 
> 
>  On 24.06.2021 09:04, Matthew Brost wrote:
> > Add non blocking CTB send function, intel_guc_send_nb. GuC submission
> > will send CTBs in the critical path and does not need to wait for these
> > CTBs to complete before moving on, hence the need for this new function.
> >
> > The non-blocking CTB now must have a flow control mechanism to ensure
> > the buffer isn't overrun. A lazy spin wait is used as we believe the
> > flow control condition should be rare with a properly sized buffer.
> >
> > The function, intel_guc_send_nb, is exported in this patch but unused.
> > Several patches later in the series make use of this function.
> >
> > Signed-off-by: John Harrison 
> > Signed-off-by: Matthew Brost 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc.h| 12 +++-
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 77 +--
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  3 +-
> >  3 files changed, 82 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > index 4abc59f6f3cd..24b1df6ad4ae 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > @@ -74,7 +74,15 @@ static inline struct intel_guc *log_to_guc(struct 
> > intel_guc_log *log)
> >  static
> >  inline int intel_guc_send(struct intel_guc *guc, const u32 *action, 
> > u32 len)
> >  {
> > -   return intel_guc_ct_send(&guc->ct, action, len, NULL, 0);
> > +   return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, 0);
> > +}
> > +
> > +#define INTEL_GUC_SEND_NB  BIT(31)
> 
>  hmm, this flag really belongs to intel_guc_ct_send() so it should be
>  defined as CTB flag near that function declaration
> 
> >>>
> >>> I can move this up a few lines.
> >>>
> > +static
> > +inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, 
> > u32 len)
> > +{
> > +   return intel_guc_ct_send(&guc->ct, action, len, NULL, 0,
> > +INTEL_GUC_SEND_NB);
> >  }
> >  
> >  static inline int
> > @@ -82,7 +90,7 @@ intel_guc_send_and_receive(struct intel_guc *guc, 
> > const u32 *action, u32 len,
> >u32 *response_buf, u32 response_buf_size)
> >  {
> > return intel_guc_ct_send(&guc->ct, action, len,
> > -response_buf, response_buf_size);
> > +response_buf, response_buf_size, 0);
> >  }
> >  
> >  static inline void intel_guc_to_host_event_handler(struct intel_guc 
> > *guc)
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > index a17215920e58..c9a65d05911f 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > @@ -3,6 +3,11 @@
> >   * Copyright © 2016-2019 Intel Corporation
> >   */
> >  
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> >  #include "i915_drv.h"
> >  #include "intel_guc_ct.h"
> >  #include "gt/intel_gt.h"
> > @@ -373,7 +378,7 @@ static void write_barrier(struct intel_guc_ct *ct)
> >  static int ct_write(struct intel_guc_ct *ct,
> > const u32 *action,
> > u32 len /* in dwords */,
> > -   u32 fence)
> > +   u32 fence, u32 flags)
> >  {
> > struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> > struct guc_ct_buffer_desc *desc = ctb->desc;
> > @@ -421,9 +426,13 @@ static int ct_write(struct intel_guc_ct *ct,
> >  FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) |
> >  FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence);
> >  
> > -   hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
> > - FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
> > -GUC_HXG_REQUEST_MSG_0_DATA0, action[0]);
> > +   hxg = (flags & INTEL_GUC_SEND_NB) ?
> > +   (FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_EVENT) |
> > +FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |
> > +   GUC_HXG_EVENT_MSG_0_DATA0, action[0])) :
> > +   (FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
> > +FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
> > +  

RE: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map time

2021-06-25 Thread Ruhl, Michael J
>-Original Message-
>From: Thomas Hellström 
>Sent: Friday, June 25, 2021 1:52 PM
>To: Ruhl, Michael J ; intel-
>g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
>Cc: Auld, Matthew 
>Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map
>time
>
>
>On 6/25/21 7:38 PM, Ruhl, Michael J wrote:
>>> -Original Message-
>>> From: Thomas Hellström 
>>> Sent: Friday, June 25, 2021 12:18 PM
>>> To: Ruhl, Michael J ; intel-
>>> g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
>>> Cc: Auld, Matthew 
>>> Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf
>map
>>> time
>>>
>>> Hi, Michael,
>>>
>>> thanks for looking at this.
>>>
>>> On 6/25/21 6:02 PM, Ruhl, Michael J wrote:
> -Original Message-
> From: dri-devel  On Behalf
>Of
> Thomas Hellström
> Sent: Thursday, June 24, 2021 2:31 PM
> To: intel-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
> Cc: Thomas Hellström ; Auld,
>>> Matthew
> 
> Subject: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map
>>> time
> Until we support p2p dma or as a complement to that, migrate data
> to system memory at dma-buf map time if possible.
>
> Signed-off-by: Thomas Hellström 
> ---
> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 9 -
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> index 616c3a2f1baf..a52f885bc09a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
> @@ -25,7 +25,14 @@ static struct sg_table
>>> *i915_gem_map_dma_buf(struct
> dma_buf_attachment *attachme
>   struct scatterlist *src, *dst;
>   int ret, i;
>
> - ret = i915_gem_object_pin_pages_unlocked(obj);
> + ret = i915_gem_object_lock_interruptible(obj, NULL);
 Hmm, I believe in most cases that the caller should be holding the
 lock (object dma-resv) on this object already.
>>> Yes, I agree, In particular for other instances of our own driver,  at
>>> least since the dma_resv introduction.
>>>
>>> But I also think that's a pre-existing bug, since
>>> i915_gem_object_pin_pages_unlocked() will also take the lock.
>> Ouch yes.  Missed that.
>>
>>> I Think we need to initially make the exporter dynamic-capable to
>>> resolve this, and drop the locking here completely, as dma-buf docs says
>>> that we're then guaranteed to get called with the object lock held.
>>>
>>> I figure if we make the exporter dynamic, we need to migrate already at
>>> dma_buf_pin time so we don't pin the object in the wrong location.
>> The exporter as dynamic  (ops->pin is available) is optional, but importer
>> dynamic (ops->move_notify) is required.
>>
>> With that in mind, it would seem that there are three possible combinations
>> for the migrate to be attempted:
>>
>> 1) in the ops->pin function (export_dynamic != import_dynamic, during
>attach)
>> 2) in the ops->pin function (export_dynamic and
>!CONFIG_DMABUF_MOVE_NOTIFY) during mapping
>> 3) and possibly in ops->map_dma_buf (export_dynamic and
>CONFIG_DMABUF_MOVE_NOTIFY)
>>
>> Since one possibility has to be in the mapping function, it seems that if we
>> can figure out the locking, that the migrate should probably be available
>here.
>>
>> Mike
>
>So perhaps just to initially fix the bug, we could just implement NOP
>pin() and unpin() callbacks and drop the locking in map_attach() and
>replace it with an assert_object_held();

That is the sticky part of the move notify API.

If you do the attach_dynamic you have to have an ops with move_notify.

(https://elixir.bootlin.com/linux/v5.13-rc7/source/drivers/dma-buf/dma-buf.c#L730)

If you don't have that, i.e. just the pin interface, the attach will be
rejected, and you will not get the callbacks.

So I think that the only thing we can do for now is to drop the locking and add 
the 

assert_object_held();

M

>/Thomas
>
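
(A rough sketch of the direction settled on in this thread, for reference
only; this is not the actual patch, and the callback names are hypothetical.
The point is that providing pin()/unpin() makes i915's dma-buf exporter
dynamic-capable, and a NOP implementation means "pin and never evict" for
now. struct dma_buf_attachment comes from <linux/dma-buf.h>.)

static int i915_gem_dmabuf_pin(struct dma_buf_attachment *attach)
{
	/* NOP for now: objects exported via dma-buf simply stay resident. */
	return 0;
}

static void i915_gem_dmabuf_unpin(struct dma_buf_attachment *attach)
{
	/* NOP */
}

With .pin/.unpin wired into i915's dma_buf_ops, map_dma_buf() is guaranteed
to be called with the object's dma-resv lock already held, so the explicit
locking there can be dropped and replaced with assert_object_held(obj)
before pinning the pages.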



Re: [PATCH 13/47] drm/i915/guc: Implement GuC context operations for new inteface

2021-06-25 Thread Matthew Brost
On Fri, Jun 25, 2021 at 03:25:13PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 24.06.2021 09:04, Matthew Brost wrote:
> > Implement GuC context operations which includes GuC specific operations
> > alloc, pin, unpin, and destroy.
> > 
> > Signed-off-by: John Harrison 
> > Signed-off-by: Matthew Brost 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_context.c   |   5 +
> >  drivers/gpu/drm/i915/gt/intel_context_types.h |  22 +-
> >  drivers/gpu/drm/i915/gt/intel_lrc_reg.h   |   1 -
> >  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  34 +
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |   4 +
> >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 664 --
> >  drivers/gpu/drm/i915/i915_reg.h   |   1 +
> >  drivers/gpu/drm/i915/i915_request.c   |   1 +
> >  8 files changed, 677 insertions(+), 55 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
> > b/drivers/gpu/drm/i915/gt/intel_context.c
> > index 4033184f13b9..2b68af16222c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > @@ -383,6 +383,11 @@ intel_context_init(struct intel_context *ce, struct 
> > intel_engine_cs *engine)
> >  
> > mutex_init(>pin_mutex);
> >  
> > +   spin_lock_init(&ce->guc_state.lock);
> > +
> > +   ce->guc_id = GUC_INVALID_LRC_ID;
> > +   INIT_LIST_HEAD(&ce->guc_id_link);
> > +
> > 	i915_active_init(&ce->active,
> > 			 __intel_context_active, __intel_context_retire, 0);
> >  }
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> > b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > index bb6fef7eae52..ce7c69b34cd1 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > @@ -95,6 +95,7 @@ struct intel_context {
> >  #define CONTEXT_BANNED 6
> >  #define CONTEXT_FORCE_SINGLE_SUBMISSION7
> >  #define CONTEXT_NOPREEMPT  8
> > +#define CONTEXT_LRCA_DIRTY 9
> >  
> > struct {
> > u64 timeout_us;
> > @@ -137,14 +138,29 @@ struct intel_context {
> >  
> > u8 wa_bb_page; /* if set, page num reserved for context workarounds */
> >  
> > +   struct {
> > +   /** lock: protects everything in guc_state */
> > +   spinlock_t lock;
> > +   /**
> > +* sched_state: scheduling state of this context using GuC
> > +* submission
> > +*/
> > +   u8 sched_state;
> > +   } guc_state;
> > +
> > /* GuC scheduling state that does not require a lock. */
> > atomic_t guc_sched_state_no_lock;
> >  
> > +   /* GuC lrc descriptor ID */
> > +   u16 guc_id;
> > +
> > +   /* GuC lrc descriptor reference count */
> > +   atomic_t guc_id_ref;
> > +
> > /*
> > -* GuC lrc descriptor ID - Not assigned in this patch but future patches
> > -* in the series will.
> > +* GuC ID link - in list when unpinned but guc_id still valid in GuC
> >  */
> > -   u16 guc_id;
> > +   struct list_head guc_id_link;
> 
> some fields are being added with kerneldoc, some without
> what's the rule ?
> 

Yea, idk. I think we need to scrub all the structures in the driver and
add kernel doc everywhere. IMO not a blocker though, as I think all the
structures are going to be reworked with OO concepts after the GuC code
lands before moving to DRM scheduler. That would be logical time to
update all the kernel doc too.
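
For reference, the kerneldoc form of the new fields would look something
like this (sketch only, wording taken from the existing comments):

	/** @guc_id: GuC LRC descriptor ID */
	u16 guc_id;

	/** @guc_id_ref: GuC LRC descriptor reference count */
	atomic_t guc_id_ref;

	/** @guc_id_link: in list when unpinned but guc_id still valid in GuC */
	struct list_head guc_id_link;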

> >  };
> >  
> >  #endif /* __INTEL_CONTEXT_TYPES__ */
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h 
> > b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
> > index 41e5350a7a05..49d4857ad9b7 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
> > @@ -87,7 +87,6 @@
> >  #define GEN11_CSB_WRITE_PTR_MASK   (GEN11_CSB_PTR_MASK << 0)
> >  
> >  #define MAX_CONTEXT_HW_ID  (1 << 21) /* exclusive */
> > -#define MAX_GUC_CONTEXT_HW_ID  (1 << 20) /* exclusive */
> >  #define GEN11_MAX_CONTEXT_HW_ID(1 << 11) /* exclusive */
> >  /* in Gen12 ID 0x7FF is reserved to indicate idle */
> >  #define GEN12_MAX_CONTEXT_HW_ID(GEN11_MAX_CONTEXT_HW_ID - 1)
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > index 9ba8219475b2..d44316dc914b 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > @@ -44,6 +44,14 @@ struct intel_guc {
> > void (*disable)(struct intel_guc *guc);
> > } interrupts;
> >  
> > +   /*
> > +* contexts_lock protects the pool of free guc ids and a linked list of
> > +* guc ids available to be stolen
> > +*/
> > +   spinlock_t contexts_lock;
> > +   struct ida guc_ids;
> > +   struct list_head guc_id_list;
> > +
> > bool submission_selected;
> >  
> > struct i915_vma *ads_vma;
> > @@ -102,6 +110,29 @@ intel_guc_send_and_receive(struct intel_guc *guc, 
> > const u32 *action, u32 len,
> >

Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map time

2021-06-25 Thread Thomas Hellström



On 6/25/21 7:38 PM, Ruhl, Michael J wrote:

-Original Message-
From: Thomas Hellström 
Sent: Friday, June 25, 2021 12:18 PM
To: Ruhl, Michael J ; intel-
g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: Auld, Matthew 
Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map
time

Hi, Michael,

thanks for looking at this.

On 6/25/21 6:02 PM, Ruhl, Michael J wrote:

-Original Message-
From: dri-devel  On Behalf Of
Thomas Hellström
Sent: Thursday, June 24, 2021 2:31 PM
To: intel-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: Thomas Hellström ; Auld,

Matthew


Subject: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map

time

Until we support p2p dma or as a complement to that, migrate data
to system memory at dma-buf map time if possible.

Signed-off-by: Thomas Hellström 
---
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 9 -
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 616c3a2f1baf..a52f885bc09a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -25,7 +25,14 @@ static struct sg_table

*i915_gem_map_dma_buf(struct

dma_buf_attachment *attachme
struct scatterlist *src, *dst;
int ret, i;

-   ret = i915_gem_object_pin_pages_unlocked(obj);
+   ret = i915_gem_object_lock_interruptible(obj, NULL);

Hmm, I believe in most cases that the caller should be holding the
lock (object dma-resv) on this object already.

Yes, I agree. In particular for other instances of our own driver, at
least since the dma_resv introduction.

But I also think that's a pre-existing bug, since
i915_gem_object_pin_pages_unlocked() will also take the lock.

Ouch yes.  Missed that.


I think we need to initially make the exporter dynamic-capable to 
resolve this, and drop the locking here completely, as dma-buf docs says
that we're then guaranteed to get called with the object lock held.

I figure if we make the exporter dynamic, we need to migrate already at
dma_buf_pin time so we don't pin the object in the wrong location.

The exporter being dynamic (ops->pin is available) is optional, but a dynamic
importer (ops->move_notify) is required.

With that in mind, it would seem that there are three possible combinations
for the migrate to be attempted:

1) in the ops->pin function (export_dynamic != import_dynamic, during attach)
2) in the ops->pin function (export_dynamic and !CONFIG_DMABUF_MOVE_NOTIFY) 
during mapping
3) and possibly in ops->map_dma_buf (export_dynamic and 
CONFIG_DMABUF_MOVE_NOTIFY)

Since one possibility has to be in the mapping function, it seems that if we
can figure out the locking, that the migrate should probably be available here.

Mike


So perhaps just to initially fix the bug, we could just implement NOP 
pin() and unpin() callbacks and drop the locking in map_attach() and 
replace it with an assert_object_held();


/Thomas




RE: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map time

2021-06-25 Thread Ruhl, Michael J
>-Original Message-
>From: Thomas Hellström 
>Sent: Friday, June 25, 2021 12:18 PM
>To: Ruhl, Michael J ; intel-
>g...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
>Cc: Auld, Matthew 
>Subject: Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map
>time
>
>Hi, Michael,
>
>thanks for looking at this.
>
>On 6/25/21 6:02 PM, Ruhl, Michael J wrote:
>>> -Original Message-
>>> From: dri-devel  On Behalf Of
>>> Thomas Hellström
>>> Sent: Thursday, June 24, 2021 2:31 PM
>>> To: intel-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
>>> Cc: Thomas Hellström ; Auld,
>Matthew
>>> 
>>> Subject: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map
>time
>>>
>>> Until we support p2p dma or as a complement to that, migrate data
>>> to system memory at dma-buf map time if possible.
>>>
>>> Signed-off-by: Thomas Hellström 
>>> ---
>>> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 9 -
>>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> index 616c3a2f1baf..a52f885bc09a 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> @@ -25,7 +25,14 @@ static struct sg_table
>*i915_gem_map_dma_buf(struct
>>> dma_buf_attachment *attachme
>>> struct scatterlist *src, *dst;
>>> int ret, i;
>>>
>>> -   ret = i915_gem_object_pin_pages_unlocked(obj);
>>> +   ret = i915_gem_object_lock_interruptible(obj, NULL);
>> Hmm, I believe in most cases that the caller should be holding the
>> lock (object dma-resv) on this object already.
>
>Yes, I agree. In particular for other instances of our own driver, at
>least since the dma_resv introduction.
>
>But I also think that's a pre-existing bug, since
>i915_gem_object_pin_pages_unlocked() will also take the lock.

Ouch yes.  Missed that.

>I think we need to initially make the exporter dynamic-capable to
>resolve this, and drop the locking here completely, as dma-buf docs says
>that we're then guaranteed to get called with the object lock held.
>
>I figure if we make the exporter dynamic, we need to migrate already at
>dma_buf_pin time so we don't pin the object in the wrong location.

The exporter being dynamic (ops->pin is available) is optional, but a dynamic
importer (ops->move_notify) is required.

With that in mind, it would seem that there are three possible combinations
for the migrate to be attempted:

1) in the ops->pin function (export_dynamic != import_dynamic, during attach)
2) in the ops->pin function (export_dynamic and !CONFIG_DMABUF_MOVE_NOTIFY) 
during mapping
3) and possibly in ops->map_dma_buf (export_dynamic and 
CONFIG_DMABUF_MOVE_NOTIFY)

Since one possibility has to be in the mapping function, it seems that if we
can figure out the locking, that the migrate should probably be available here.
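
As an illustration of combinations 1/2 (doing the migrate at dma_buf_pin()
time), something like the sketch below could work. The callback name is
made up, and it assumes dma-buf calls pin() with the object's dma_resv
lock held:

static int i915_gem_dmabuf_pin(struct dma_buf_attachment *attach)
{
	struct drm_i915_gem_object *obj = dma_buf_to_obj(attach->dmabuf);
	int ret;

	assert_object_held(obj);

	/* Migrate to system memory so we don't pin the object in the
	 * wrong location later on.
	 */
	ret = i915_gem_object_migrate(obj, NULL, INTEL_REGION_SMEM);
	if (ret)
		return ret;

	return i915_gem_object_pin_pages(obj);
}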

Mike


>/Thomas
>
>
>>
>> I know for the dynamic version of dma-buf, there is a check to make
>> sure that the lock is held when called.
>>
>> I think you will run into some issues if you try to get it here as well.
>>
>> Mike
>>
>>> +   if (ret)
>>> +   return ERR_PTR(ret);
>>> +
>>> +   ret = i915_gem_object_migrate(obj, NULL, INTEL_REGION_SMEM);
>>> +   if (!ret)
>>> +   ret = i915_gem_object_pin_pages(obj);
>>> +   i915_gem_object_unlock(obj);
>>> if (ret)
>>> goto err;
>>>
>>> --
>>> 2.31.1


Re: [PATCH 10/47] drm/i915/guc: Add lrc descriptor context lookup array

2021-06-25 Thread Matthew Brost
On Fri, Jun 25, 2021 at 03:17:51PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 24.06.2021 09:04, Matthew Brost wrote:
> > Add lrc descriptor context lookup array which can resolve the
> > intel_context from the lrc descriptor index. In addition to lookup, it
> > can determine if the lrc descriptor context is currently registered with
> > the GuC by checking if an entry for a descriptor index is present.
> > Future patches in the series will make use of this array.
> 
> s/lrc/LRC
> 

I guess? lrc and LRC are used interchangeably throughout the current
code base. 

> > 
> > Cc: John Harrison 
> > Signed-off-by: Matthew Brost 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  5 +++
> >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 32 +--
> >  2 files changed, 35 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > index b28fa54214f2..2313d9fc087b 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > @@ -6,6 +6,8 @@
> >  #ifndef _INTEL_GUC_H_
> >  #define _INTEL_GUC_H_
> >  
> > +#include "linux/xarray.h"
> 
> #include 
> 

Yep.

> > +
> >  #include "intel_uncore.h"
> >  #include "intel_guc_fw.h"
> >  #include "intel_guc_fwif.h"
> > @@ -46,6 +48,9 @@ struct intel_guc {
> > struct i915_vma *lrc_desc_pool;
> > void *lrc_desc_pool_vaddr;
> >  
> > +   /* guc_id to intel_context lookup */
> > +   struct xarray context_lookup;
> > +
> > /* Control params for fw initialization */
> > u32 params[GUC_CTL_MAX_DWORDS];
> 
> btw, IIRC there was idea to move most struct definitions to
> intel_guc_types.h, is this still a plan ?
> 

I don't ever recall discussing this but we can certainly do this. For
what it is worth we do introduce intel_guc_submission_types.h a bit
later. I'll make a note about intel_guc_types.h though.

Matt

> >  
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index a366890fb840..23a94a896a0b 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -65,8 +65,6 @@ static inline struct i915_priolist *to_priolist(struct 
> > rb_node *rb)
> > return rb_entry(rb, struct i915_priolist, node);
> >  }
> >  
> > -/* Future patches will use this function */
> > -__attribute__ ((unused))
> >  static struct guc_lrc_desc *__get_lrc_desc(struct intel_guc *guc, u32 
> > index)
> >  {
> > struct guc_lrc_desc *base = guc->lrc_desc_pool_vaddr;
> > @@ -76,6 +74,15 @@ static struct guc_lrc_desc *__get_lrc_desc(struct 
> > intel_guc *guc, u32 index)
> > return &base[index];
> >  }
> >  
> > +static inline struct intel_context *__get_context(struct intel_guc *guc, 
> > u32 id)
> > +{
> > +   struct intel_context *ce = xa_load(&guc->context_lookup, id);
> > +
> > +   GEM_BUG_ON(id >= GUC_MAX_LRC_DESCRIPTORS);
> > +
> > +   return ce;
> > +}
> > +
> >  static int guc_lrc_desc_pool_create(struct intel_guc *guc)
> >  {
> > u32 size;
> > @@ -96,6 +103,25 @@ static void guc_lrc_desc_pool_destroy(struct intel_guc 
> > *guc)
> > i915_vma_unpin_and_release(>lrc_desc_pool, I915_VMA_RELEASE_MAP);
> >  }
> >  
> > +static inline void reset_lrc_desc(struct intel_guc *guc, u32 id)
> > +{
> > +   struct guc_lrc_desc *desc = __get_lrc_desc(guc, id);
> > +
> > +   memset(desc, 0, sizeof(*desc));
> > +   xa_erase_irq(&guc->context_lookup, id);
> > +}
> > +
> > +static inline bool lrc_desc_registered(struct intel_guc *guc, u32 id)
> > +{
> > +   return __get_context(guc, id);
> > +}
> > +
> > +static inline void set_lrc_desc_registered(struct intel_guc *guc, u32 id,
> > +  struct intel_context *ce)
> > +{
> > +   xa_store_irq(&guc->context_lookup, id, ce, GFP_ATOMIC);
> > +}
> > +
> >  static void guc_add_request(struct intel_guc *guc, struct i915_request *rq)
> >  {
> > /* Leaving stub as this function will be used in future patches */
> > @@ -400,6 +426,8 @@ int intel_guc_submission_init(struct intel_guc *guc)
> >  */
> > GEM_BUG_ON(!guc->lrc_desc_pool);
> >  
> > +   xa_init_flags(&guc->context_lookup, XA_FLAGS_LOCK_IRQ);
> > +
> > return 0;
> >  }
> >  
> > 


[PATCH 2/2] drm/amdgpu: raise error on incorrect mem_type

2021-06-25 Thread Nirmoy Das
Be more defensive and raise an error on a wrong mem_type
argument in amdgpu_gtt_mgr_has_gart_addr().

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index 543000304a1c..0b0fa87b115c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -107,8 +107,12 @@ const struct attribute_group amdgpu_gtt_mgr_attr_group = {
  */
 bool amdgpu_gtt_mgr_has_gart_addr(struct ttm_resource *res)
 {
-   struct amdgpu_gtt_node *node = to_amdgpu_gtt_node(res);
+   struct amdgpu_gtt_node *node;
+
+   if (WARN_ON(res->mem_type != TTM_PL_TT))
+   return false;
 
+   node = to_amdgpu_gtt_node(res);
return drm_mm_node_allocated(&node->base.mm_nodes[0]);
 }
 
-- 
2.32.0



[PATCH 1/2] drm/amdgpu: return early for preempt type BOs

2021-06-25 Thread Nirmoy Das
Return early for AMDGPU_PL_PREEMPT BOs so that we don't pass a
wrong pointer to amdgpu_gtt_mgr_has_gart_addr(), which assumes its
ttm_resource argument to be a TTM_PL_TT type BO.

Signed-off-by: Nirmoy Das 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index b46726e47bce..3df06772a425 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -926,6 +926,11 @@ static int amdgpu_ttm_backend_bind(struct ttm_device *bdev,
bo_mem->mem_type == AMDGPU_PL_OA)
return -EINVAL;
 
+   if (bo_mem->mem_type == AMDGPU_PL_PREEMPT) {
+   gtt->offset = AMDGPU_BO_INVALID_OFFSET;
+   return 0;
+   }
+
if (!amdgpu_gtt_mgr_has_gart_addr(bo_mem)) {
gtt->offset = AMDGPU_BO_INVALID_OFFSET;
return 0;
-- 
2.32.0



Re: [PATCH] drm/panel: ws2401: Add driver for WideChips WS2401

2021-06-25 Thread Doug Anderson
Hi,

On Thu, Jun 24, 2021 at 3:47 PM Linus Walleij  wrote:
>
> @@ -5946,6 +5946,13 @@ S:   Maintained
>  T: git git://anongit.freedesktop.org/drm/drm-misc
>  F: drivers/gpu/drm/vboxvideo/
>
> +DRM DRIVER FOR WIDECHIPS WS2401 PANELS
> +M: Linus Walleij 
> +S: Maintained
> +T: git git://anongit.freedesktop.org/drm/drm-misc
> +F: 
> Documentation/devicetree/bindings/display/panel/samsung,lms380kf01.yaml
> +F: drivers/gpu/drm/panel/panel-widechips-ws2401.c
> +
>  DRM DRIVER FOR VMWARE VIRTUAL GPU

nit: I assume this is supposed to be alphabetized? If so, [W]IDECHIPS
comes after [V]MWARE


> @@ -0,0 +1,404 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Panel driver for the WideChips WS2401 480x800 DPI RGB panel, used in
> + * the Samsung Mobile Display (SMD) LMS380KF01.
> + * Found in the Samsung Galaxy Ace 2 GT-I8160 mobile phone.
> + * Linus Walleij 
> + * Inspired by code and know-how in the vendor driver by Gareth Phillips.
> + */
> +#include 
> +#include 

nit: m[o]des sorts after m[i]pi


> +#define ws2401_command(ws, cmd, seq...) \
> +({ \
> +   struct mipi_dbi *dbi = &ws->dbi; \
> +   int ret; \
> +   ret = mipi_dbi_command(dbi, cmd, seq);  \
> +   if (ret) { \
> +   dev_err(ws->dev, "failure in writing command %02x\n", cmd); \
> +   } \

nit: don't need braces for the "if", right?

optional nit: use %#02x instead of %02x


> +})
> +
> +static void ws2401_read_mtp_id(struct ws2401 *ws)
> +{
> +   struct mipi_dbi *dbi = &ws->dbi;
> +   u8 id1, id2, id3;
> +   int ret;
> +
> +   ret = mipi_dbi_command_read(dbi, WS2401_READ_ID1, &id1);
> +   if (ret) {
> +   dev_err(ws->dev, "unable to read MTP ID 1\n");
> +   return;
> +   }
> +   ret = mipi_dbi_command_read(dbi, WS2401_READ_ID2, &id2);
> +   if (ret) {
> +   dev_err(ws->dev, "unable to read MTP ID 2\n");
> +   return;
> +   }
> +   ret = mipi_dbi_command_read(dbi, WS2401_READ_ID3, &id3);
> +   if (ret) {
> +   dev_err(ws->dev, "unable to read MTP ID 3\n");
> +   return;
> +   }
> +   dev_info(ws->dev, "MTP ID: %02x %02x %02x\n", id1, id2, id3);

Does this need to be printed every time you power on the panel? Seems
like it's going to spam up the logs... I'm not sure what it's used
for.


> +static int ws2401_power_off(struct ws2401 *ws)
> +{
> +   /* Disable backlight */
> +   if (ws->bl)
> +   ws2401_command(ws, WS2401_WRCTRLD, 0x00);

I don't have any real knowledge here, but the location of this seems a
little odd. Just based on inspection of the rest of the driver, I
almost would have thought it would need to be sent _before_ entering
sleep mode, but I certainly could be wrong.


> +static int ws2401_disable(struct drm_panel *panel)
> +{
> +   struct ws2401 *ws = to_ws2401(panel);
> +
> +   ws2401_command(ws, MIPI_DCS_SET_DISPLAY_OFF);
> +   msleep(25);

The split between "disable" and "unprepare" on this panel driver feels
weird / arbitrary compared to the "db7430.c" one. In the other
driver you put the sleep mode here and in this driver you put the
sleep mode in "unprepare". Is that for a reason, or just arbitrary?
Can it be consistent between the two drivers?

I guess maybe this is because in "db7430" the power up order was
slightly different?


> +static const struct backlight_ops ws2401_bl_ops = {
> +   .update_status = ws2401_set_brightness,
> +};
> +
> +const struct backlight_properties ws2401_bl_props = {

"static const" instead of "const"?


> +   ret = drm_panel_of_backlight(&ws->panel);
> +   if (ret) {
> +   dev_info(dev, "no external backlight, using internal 
> backlight\n");
> +   ws->bl = devm_backlight_device_register(dev, "ws2401", dev, 
> ws,
> +   &ws2401_bl_ops, 
> &ws2401_bl_props);
> +   if (IS_ERR(ws->bl)) {
> +   ret = PTR_ERR(ws->bl);
> +   return dev_err_probe(dev, ret,
> +"failed to register backlight 
> device\n");

nit: probably didn't need the separate assignment to "ret". Just pass
"PTR_ERR(ws->bl)" to the function. Then no need for braces for your
"if" too.

> +   }
> +   ws->panel.backlight = ws->bl;
> +   } else {
> +   dev_info(dev, "using external backlight\n");

This (and the other "no extenal backlight") feels a bit chatty to me.
If you really want them and want them at "info" level then I won't
object, but I guess I like short logs even with "info" enabled.

-Doug


Re: [PATCH] drm/msm/dp: Add missing drm_device backpointer

2021-06-25 Thread Lyude Paul
Ah - must have missed this when I added this. Thanks for the fix!

Reviewed-by: Lyude Paul 

On Thu, 2021-06-24 at 20:47 -0700, Bjorn Andersson wrote:
> '6cba3fe43341 ("drm/dp: Add backpointer to drm_device in drm_dp_aux")'
> introduced a mandator drm_device backpointer in struct drm_dp_aux, but
> missed the msm DP driver. Fix this.
> 
> Fixes: 6cba3fe43341 ("drm/dp: Add backpointer to drm_device in drm_dp_aux")
> Signed-off-by: Bjorn Andersson 
> ---
>  drivers/gpu/drm/msm/dp/dp_aux.c | 3 ++-
>  drivers/gpu/drm/msm/dp/dp_aux.h | 2 +-
>  drivers/gpu/drm/msm/dp/dp_display.c | 2 +-
>  3 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c
> b/drivers/gpu/drm/msm/dp/dp_aux.c
> index 4a3293b590b0..88659ed200b9 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> @@ -441,7 +441,7 @@ void dp_aux_deinit(struct drm_dp_aux *dp_aux)
> dp_catalog_aux_enable(aux->catalog, false);
>  }
>  
> -int dp_aux_register(struct drm_dp_aux *dp_aux)
> +int dp_aux_register(struct drm_dp_aux *dp_aux, struct drm_device *drm_dev)
>  {
> struct dp_aux_private *aux;
> int ret;
> @@ -455,6 +455,7 @@ int dp_aux_register(struct drm_dp_aux *dp_aux)
>  
> aux->dp_aux.name = "dpu_dp_aux";
> aux->dp_aux.dev = aux->dev;
> +   aux->dp_aux.drm_dev = drm_dev;
> aux->dp_aux.transfer = dp_aux_transfer;
> ret = drm_dp_aux_register(&aux->dp_aux);
> if (ret) {
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.h
> b/drivers/gpu/drm/msm/dp/dp_aux.h
> index 0728cc09c9ec..7ef0d83b483a 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.h
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.h
> @@ -9,7 +9,7 @@
>  #include "dp_catalog.h"
>  #include 
>  
> -int dp_aux_register(struct drm_dp_aux *dp_aux);
> +int dp_aux_register(struct drm_dp_aux *dp_aux, struct drm_device *drm_dev);
>  void dp_aux_unregister(struct drm_dp_aux *dp_aux);
>  void dp_aux_isr(struct drm_dp_aux *dp_aux);
>  void dp_aux_init(struct drm_dp_aux *dp_aux);
> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c
> b/drivers/gpu/drm/msm/dp/dp_display.c
> index c26562bd85fe..2f0a5c13f251 100644
> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> @@ -259,7 +259,7 @@ static int dp_display_bind(struct device *dev, struct
> device *master,
> return rc;
> }
>  
> -   rc = dp_aux_register(dp->aux);
> +   rc = dp_aux_register(dp->aux, drm);
> if (rc) {
> DRM_ERROR("DRM DP AUX register failed\n");
> return rc;

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat



Re: [PATCH v3 09/15] drm/panfrost: Simplify the reset serialization logic

2021-06-25 Thread Boris Brezillon
On Fri, 25 Jun 2021 15:33:21 +0200
Boris Brezillon  wrote:


> @@ -379,57 +370,73 @@ void panfrost_job_enable_interrupts(struct 
> panfrost_device *pfdev)
>   job_write(pfdev, JOB_INT_MASK, irq_mask);
>  }
>  
> -static bool panfrost_scheduler_stop(struct panfrost_queue_state *queue,
> - struct drm_sched_job *bad)
> +static void panfrost_reset(struct panfrost_device *pfdev,
> +struct drm_sched_job *bad)
>  {
> - enum panfrost_queue_status old_status;
> - bool stopped = false;
> + unsigned int i;
> + bool cookie;
>  
> - mutex_lock(&queue->lock);
> - old_status = atomic_xchg(&queue->status,
> -  PANFROST_QUEUE_STATUS_STOPPED);
> - if (old_status == PANFROST_QUEUE_STATUS_STOPPED)
> - goto out;
> + if (WARN_ON(!atomic_read(&pfdev->reset.pending)))
> + return;
> +
> + /* Stop the schedulers.
> +  *
> +  * FIXME: We temporarily get out of the dma_fence_signalling section
> +  * because the cleanup path generate lockdep splats when taking locks
> +  * to release job resources. We should rework the code to follow this
> +  * pattern:
> +  *
> +  *  try_lock
> +  *  if (locked)
> +  *  release
> +  *  else
> +  *  schedule_work_to_release_later
> +  */
> + for (i = 0; i < NUM_JOB_SLOTS; i++)
> + drm_sched_stop(&pfdev->js->queue[i].sched, bad);
> +
> + cookie = dma_fence_begin_signalling();
>  
> - WARN_ON(old_status != PANFROST_QUEUE_STATUS_ACTIVE);
> - drm_sched_stop(&queue->sched, bad);
>   if (bad)
>   drm_sched_increase_karma(bad);
>  
> - stopped = true;
> + spin_lock(&pfdev->js->job_lock);
> + for (i = 0; i < NUM_JOB_SLOTS; i++) {
> + if (pfdev->jobs[i]) {
> + pm_runtime_put_noidle(pfdev->dev);
> + panfrost_devfreq_record_idle(&pfdev->pfdevfreq);
> + pfdev->jobs[i] = NULL;
> + }
> + }
> + spin_unlock(&pfdev->js->job_lock);
>  
> - /*
> -  * Set the timeout to max so the timer doesn't get started
> -  * when we return from the timeout handler (restored in
> -  * panfrost_scheduler_start()).
> + panfrost_device_reset(pfdev);
> +
> + /* GPU has been reset, we can cancel timeout/fault work that may have
> +  * been queued in the meantime and clear the reset pending bit.
>*/
> - queue->sched.timeout = MAX_SCHEDULE_TIMEOUT;
> + atomic_set(&pfdev->reset.pending, 0);
> + cancel_work_sync(&pfdev->reset.work);

This is introducing a deadlock since panfrost_reset() might be called
from the reset handler, and cancel_work_sync() waits for the handler to
return. Unfortunately there's no cancel_work() variant, so I'll just
remove the

WARN_ON(!atomic_read(&pfdev->reset.pending))

and return directly when the pending bit is cleared.
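
i.e. roughly (sketch, not the final patch):

	/* Bail out if the reset was already handled elsewhere: a cleared
	 * pending bit is now a normal condition rather than a WARN.
	 */
	if (!atomic_read(&pfdev->reset.pending))
		return;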

> + for (i = 0; i < NUM_JOB_SLOTS; i++)
> + cancel_delayed_work(&pfdev->js->queue[i].sched.work_tdr);
>  
> -out:
> - mutex_unlock(>lock);
>  
> - return stopped;
> -}
> + /* Now resubmit jobs that were previously queued but didn't have a
> +  * chance to finish.
> +  * FIXME: We temporarily get out of the DMA fence signalling section
> +  * while resubmitting jobs because the job submission logic will
> +  * allocate memory with the GFP_KERNEL flag which can trigger memory
> +  * reclaim and exposes a lock ordering issue.
> +  */
> + dma_fence_end_signalling(cookie);
> + for (i = 0; i < NUM_JOB_SLOTS; i++)
> + drm_sched_resubmit_jobs(&pfdev->js->queue[i].sched);
> + cookie = dma_fence_begin_signalling();
>  
> -static void panfrost_scheduler_start(struct panfrost_queue_state *queue)
> -{
> - enum panfrost_queue_status old_status;
> + for (i = 0; i < NUM_JOB_SLOTS; i++)
> + drm_sched_start(&pfdev->js->queue[i].sched, true);
>  
> - mutex_lock(&queue->lock);
> - old_status = atomic_xchg(&queue->status,
> -  PANFROST_QUEUE_STATUS_STARTING);
> - WARN_ON(old_status != PANFROST_QUEUE_STATUS_STOPPED);
> -
> - /* Restore the original timeout before starting the scheduler. */
> - queue->sched.timeout = msecs_to_jiffies(JOB_TIMEOUT_MS);
> - drm_sched_resubmit_jobs(&queue->sched);
> - drm_sched_start(&queue->sched, true);
> - old_status = atomic_xchg(&queue->status,
> -  PANFROST_QUEUE_STATUS_ACTIVE);
> - if (old_status == PANFROST_QUEUE_STATUS_FAULT_PENDING)
> - drm_sched_fault(&queue->sched);
> -
> - mutex_unlock(&queue->lock);
> + dma_fence_end_signalling(cookie);
>  }
>  


Re: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map time

2021-06-25 Thread Thomas Hellström

Hi, Michael,

thanks for looking at this.

On 6/25/21 6:02 PM, Ruhl, Michael J wrote:

-Original Message-
From: dri-devel  On Behalf Of
Thomas Hellström
Sent: Thursday, June 24, 2021 2:31 PM
To: intel-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
Cc: Thomas Hellström ; Auld, Matthew

Subject: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map time

Until we support p2p dma or as a complement to that, migrate data
to system memory at dma-buf map time if possible.

Signed-off-by: Thomas Hellström 
---
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 9 -
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 616c3a2f1baf..a52f885bc09a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -25,7 +25,14 @@ static struct sg_table *i915_gem_map_dma_buf(struct
dma_buf_attachment *attachme
struct scatterlist *src, *dst;
int ret, i;

-   ret = i915_gem_object_pin_pages_unlocked(obj);
+   ret = i915_gem_object_lock_interruptible(obj, NULL);

Hmm, I believe in most cases that the caller should be holding the
lock (object dma-resv) on this object already.


Yes, I agree. In particular for other instances of our own driver, at 
least since the dma_resv introduction.


But I also think that's a pre-existing bug, since 
i915_gem_object_pin_pages_unlocked() will also take the lock.


I think we need to initially make the exporter dynamic-capable to 
resolve this, and drop the locking here completely, as dma-buf docs says 
that we're then guaranteed to get called with the object lock held.


I figure if we make the exporter dynamic, we need to migrate already at 
dma_buf_pin time so we don't pin the object in the wrong location.


/Thomas




I know for the dynamic version of dma-buf, there is a check to make
sure that the lock is held when called.

I think you will run into some issues if you try to get it here as well.

Mike


+   if (ret)
+   return ERR_PTR(ret);
+
+   ret = i915_gem_object_migrate(obj, NULL, INTEL_REGION_SMEM);
+   if (!ret)
+   ret = i915_gem_object_pin_pages(obj);
+   i915_gem_object_unlock(obj);
if (ret)
goto err;

--
2.31.1


Re: [PATCH v3 10/15] drm/panfrost: Make sure job interrupts are masked before resetting

2021-06-25 Thread Steven Price
On 25/06/2021 17:02, Boris Brezillon wrote:
> On Fri, 25 Jun 2021 16:55:12 +0100
> Steven Price  wrote:
> 
>> On 25/06/2021 14:33, Boris Brezillon wrote:
>>> This is not yet needed because we let active jobs be killed during
>>> the reset and we don't really bother making sure they can be restarted.
>>> But once we start adding soft-stop support, controlling when we deal
>>> with the remaining interrupts and making sure those are handled before
>>> the reset is issued gets tricky if we keep job interrupts active.
>>>
>>> Let's prepare for that and mask+flush job IRQs before issuing a reset.
>>>
>>> Signed-off-by: Boris Brezillon 
>>> ---
>>>  drivers/gpu/drm/panfrost/panfrost_job.c | 21 +++--
>>>  1 file changed, 15 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
>>> b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> index 88d34fd781e8..0566e2f7e84a 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> @@ -34,6 +34,7 @@ struct panfrost_queue_state {
>>>  struct panfrost_job_slot {
>>> struct panfrost_queue_state queue[NUM_JOB_SLOTS];
>>> spinlock_t job_lock;
>>> +   int irq;
>>>  };
>>>  
>>>  static struct panfrost_job *
>>> @@ -400,7 +401,15 @@ static void panfrost_reset(struct panfrost_device 
>>> *pfdev,
>>> if (bad)
>>> drm_sched_increase_karma(bad);
>>>  
>>> -   spin_lock(&pfdev->js->job_lock);  
>>
>> I'm not sure it's safe to remove this lock as this protects the
>> pfdev->jobs array: I can't see what would prevent panfrost_job_close()
>> running at the same time without the lock. Am I missing something?
> 
> Ah, you're right, I'll add it back.
> 
>>
>>> +   /* Mask job interrupts and synchronize to make sure we won't be
>>> +* interrupted during our reset.
>>> +*/
>>> +   job_write(pfdev, JOB_INT_MASK, 0);
>>> +   synchronize_irq(pfdev->js->irq);
>>> +
>>> +   /* Schedulers are stopped and interrupts are masked+flushed, we don't
>>> +* need to protect the 'evict unfinished jobs' lock with the job_lock.
>>> +*/
>>> for (i = 0; i < NUM_JOB_SLOTS; i++) {
>>> if (pfdev->jobs[i]) {
>>> pm_runtime_put_noidle(pfdev->dev);
>>> @@ -408,7 +417,6 @@ static void panfrost_reset(struct panfrost_device 
>>> *pfdev,
>>> pfdev->jobs[i] = NULL;
>>> }
>>> }
>>> -   spin_unlock(&pfdev->js->job_lock);
>>>  
>>> panfrost_device_reset(pfdev);
>>>  
>>> @@ -504,6 +512,7 @@ static void panfrost_job_handle_irq(struct 
>>> panfrost_device *pfdev, u32 status)
>>>  
>>> job = pfdev->jobs[j];
>>> /* Only NULL if job timeout occurred */
>>> +   WARN_ON(!job);  
>>
>> Was this WARN_ON intentional?
> 
> Yes, now that we mask and synchronize the irq in the reset I don't see
> any reason why we would end up with an event but no job to attach this
> event to, but maybe I missed something.
> 

Ok - but I guess the comment above needs updating then! ;) Job timeouts
are still a thing which definitely can happen!

Steve


Re: [PATCH v3 11/15] drm/panfrost: Disable the AS on unhandled page faults

2021-06-25 Thread Steven Price
On 25/06/2021 14:33, Boris Brezillon wrote:
> If we don't do that, we have to wait for the job timeout to expire
> before the fault jobs gets killed.
> 
> v3:
> * Make sure the AS is re-enabled when new jobs are submitted to the
>   context
> 
> Signed-off-by: Boris Brezillon 

Reviewed-by: Steven Price 

> ---
>  drivers/gpu/drm/panfrost/panfrost_device.h |  1 +
>  drivers/gpu/drm/panfrost/panfrost_mmu.c| 34 --
>  2 files changed, 32 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
> b/drivers/gpu/drm/panfrost/panfrost_device.h
> index bfe32907ba6b..efe9a675b614 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -96,6 +96,7 @@ struct panfrost_device {
>   spinlock_t as_lock;
>   unsigned long as_in_use_mask;
>   unsigned long as_alloc_mask;
> + unsigned long as_faulty_mask;
>   struct list_head as_lru_list;
>  
>   struct panfrost_job_slot *js;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
> b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index b4f0c673cd7f..65e98c51cb66 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -154,6 +154,7 @@ u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, 
> struct panfrost_mmu *mmu)
>   as = mmu->as;
>   if (as >= 0) {
>   int en = atomic_inc_return(>as_count);
> + u32 mask = BIT(as) | BIT(16 + as);
>  
>   /*
>* AS can be retained by active jobs or a perfcnt context,
> @@ -162,6 +163,18 @@ u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, 
> struct panfrost_mmu *mmu)
>   WARN_ON(en >= (NUM_JOB_SLOTS + 1));
>  
>   list_move(>list, >as_lru_list);
> +
> + if (pfdev->as_faulty_mask & mask) {
> + /* Unhandled pagefault on this AS, the MMU was
> +  * disabled. We need to re-enable the MMU after
> +  * clearing+unmasking the AS interrupts.
> +  */
> + mmu_write(pfdev, MMU_INT_CLEAR, mask);
> + mmu_write(pfdev, MMU_INT_MASK, ~pfdev->as_faulty_mask);
> + pfdev->as_faulty_mask &= ~mask;
> + panfrost_mmu_enable(pfdev, mmu);
> + }
> +
>   goto out;
>   }
>  
> @@ -211,6 +224,7 @@ void panfrost_mmu_reset(struct panfrost_device *pfdev)
>   spin_lock(>as_lock);
>  
>   pfdev->as_alloc_mask = 0;
> + pfdev->as_faulty_mask = 0;
>  
>   list_for_each_entry_safe(mmu, mmu_tmp, >as_lru_list, list) {
>   mmu->as = -1;
> @@ -662,7 +676,7 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int 
> irq, void *data)
>   if ((status & mask) == BIT(as) && (exception_type & 0xF8) == 
> 0xC0)
>   ret = panfrost_mmu_map_fault_addr(pfdev, as, addr);
>  
> - if (ret)
> + if (ret) {
>   /* terminal fault, print info about the fault */
>   dev_err(pfdev->dev,
>   "Unhandled Page fault in AS%d at VA 0x%016llX\n"
> @@ -680,14 +694,28 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int 
> irq, void *data)
>   access_type, access_type_name(pfdev, 
> fault_status),
>   source_id);
>  
> + spin_lock(>as_lock);
> + /* Ignore MMU interrupts on this AS until it's been
> +  * re-enabled.
> +  */
> + pfdev->as_faulty_mask |= mask;
> +
> + /* Disable the MMU to kill jobs on this AS. */
> + panfrost_mmu_disable(pfdev, as);
> + spin_unlock(>as_lock);
> + }
> +
>   status &= ~mask;
>  
>   /* If we received new MMU interrupts, process them before 
> returning. */
>   if (!status)
> - status = mmu_read(pfdev, MMU_INT_RAWSTAT);
> + status = mmu_read(pfdev, MMU_INT_RAWSTAT) & 
> ~pfdev->as_faulty_mask;
>   }
>  
> - mmu_write(pfdev, MMU_INT_MASK, ~0);
> + spin_lock(>as_lock);
> + mmu_write(pfdev, MMU_INT_MASK, ~pfdev->as_faulty_mask);
> + spin_unlock(>as_lock);
> +
>   return IRQ_HANDLED;
>  };
>  
> 



Re: [Freedreno] [PATCH] drm/msm/dp: Add missing drm_device backpointer

2021-06-25 Thread abhinavk

On 2021-06-24 20:47, Bjorn Andersson wrote:

'6cba3fe43341 ("drm/dp: Add backpointer to drm_device in drm_dp_aux")'
introduced a mandator drm_device backpointer in struct drm_dp_aux, but

mandatory

missed the msm DP driver. Fix this.

Fixes: 6cba3fe43341 ("drm/dp: Add backpointer to drm_device in 
drm_dp_aux")

Signed-off-by: Bjorn Andersson 

apart from that nit,
Reviewed-by: Abhinav Kumar 

---
 drivers/gpu/drm/msm/dp/dp_aux.c | 3 ++-
 drivers/gpu/drm/msm/dp/dp_aux.h | 2 +-
 drivers/gpu/drm/msm/dp/dp_display.c | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c 
b/drivers/gpu/drm/msm/dp/dp_aux.c

index 4a3293b590b0..88659ed200b9 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.c
+++ b/drivers/gpu/drm/msm/dp/dp_aux.c
@@ -441,7 +441,7 @@ void dp_aux_deinit(struct drm_dp_aux *dp_aux)
dp_catalog_aux_enable(aux->catalog, false);
 }

-int dp_aux_register(struct drm_dp_aux *dp_aux)
+int dp_aux_register(struct drm_dp_aux *dp_aux, struct drm_device 
*drm_dev)

 {
struct dp_aux_private *aux;
int ret;
@@ -455,6 +455,7 @@ int dp_aux_register(struct drm_dp_aux *dp_aux)

aux->dp_aux.name = "dpu_dp_aux";
aux->dp_aux.dev = aux->dev;
+   aux->dp_aux.drm_dev = drm_dev;
aux->dp_aux.transfer = dp_aux_transfer;
ret = drm_dp_aux_register(&aux->dp_aux);
if (ret) {
diff --git a/drivers/gpu/drm/msm/dp/dp_aux.h 
b/drivers/gpu/drm/msm/dp/dp_aux.h

index 0728cc09c9ec..7ef0d83b483a 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.h
+++ b/drivers/gpu/drm/msm/dp/dp_aux.h
@@ -9,7 +9,7 @@
 #include "dp_catalog.h"
 #include 

-int dp_aux_register(struct drm_dp_aux *dp_aux);
+int dp_aux_register(struct drm_dp_aux *dp_aux, struct drm_device 
*drm_dev);

 void dp_aux_unregister(struct drm_dp_aux *dp_aux);
 void dp_aux_isr(struct drm_dp_aux *dp_aux);
 void dp_aux_init(struct drm_dp_aux *dp_aux);
diff --git a/drivers/gpu/drm/msm/dp/dp_display.c
b/drivers/gpu/drm/msm/dp/dp_display.c
index c26562bd85fe..2f0a5c13f251 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -259,7 +259,7 @@ static int dp_display_bind(struct device *dev,
struct device *master,
return rc;
}

-   rc = dp_aux_register(dp->aux);
+   rc = dp_aux_register(dp->aux, drm);
if (rc) {
DRM_ERROR("DRM DP AUX register failed\n");
return rc;


RE: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map time

2021-06-25 Thread Ruhl, Michael J
>-Original Message-
>From: dri-devel  On Behalf Of
>Thomas Hellström
>Sent: Thursday, June 24, 2021 2:31 PM
>To: intel-...@lists.freedesktop.org; dri-devel@lists.freedesktop.org
>Cc: Thomas Hellström ; Auld, Matthew
>
>Subject: [PATCH 4/4] drm/i915/gem: Migrate to system at dma-buf map time
>
>Until we support p2p dma or as a complement to that, migrate data
>to system memory at dma-buf map time if possible.
>
>Signed-off-by: Thomas Hellström 
>---
> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 9 -
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>index 616c3a2f1baf..a52f885bc09a 100644
>--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>@@ -25,7 +25,14 @@ static struct sg_table *i915_gem_map_dma_buf(struct
>dma_buf_attachment *attachme
>   struct scatterlist *src, *dst;
>   int ret, i;
>
>-  ret = i915_gem_object_pin_pages_unlocked(obj);
>+  ret = i915_gem_object_lock_interruptible(obj, NULL);

Hmm, I believe in most cases that the caller should be holding the
lock (object dma-resv) on this object already.

I know for the dynamic version of dma-buf, there is a check to make
sure that the lock is held when called.

I think you will run into some issues if you try to get it here as well.

Mike

>+  if (ret)
>+  return ERR_PTR(ret);
>+
>+  ret = i915_gem_object_migrate(obj, NULL, INTEL_REGION_SMEM);
>+  if (!ret)
>+  ret = i915_gem_object_pin_pages(obj);
>+  i915_gem_object_unlock(obj);
>   if (ret)
>   goto err;
>
>--
>2.31.1



Re: [PATCH v3 10/15] drm/panfrost: Make sure job interrupts are masked before resetting

2021-06-25 Thread Boris Brezillon
On Fri, 25 Jun 2021 16:55:12 +0100
Steven Price  wrote:

> On 25/06/2021 14:33, Boris Brezillon wrote:
> > This is not yet needed because we let active jobs be killed during
> > the reset and we don't really bother making sure they can be restarted.
> > But once we start adding soft-stop support, controlling when we deal
> > with the remaining interrupts and making sure those are handled before
> > the reset is issued gets tricky if we keep job interrupts active.
> > 
> > Let's prepare for that and mask+flush job IRQs before issuing a reset.
> > 
> > Signed-off-by: Boris Brezillon 
> > ---
> >  drivers/gpu/drm/panfrost/panfrost_job.c | 21 +++--
> >  1 file changed, 15 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> > b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index 88d34fd781e8..0566e2f7e84a 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -34,6 +34,7 @@ struct panfrost_queue_state {
> >  struct panfrost_job_slot {
> > struct panfrost_queue_state queue[NUM_JOB_SLOTS];
> > spinlock_t job_lock;
> > +   int irq;
> >  };
> >  
> >  static struct panfrost_job *
> > @@ -400,7 +401,15 @@ static void panfrost_reset(struct panfrost_device 
> > *pfdev,
> > if (bad)
> > drm_sched_increase_karma(bad);
> >  
> > -   spin_lock(&pfdev->js->job_lock);  
> 
> I'm not sure it's safe to remove this lock as this protects the
> pfdev->jobs array: I can't see what would prevent panfrost_job_close()
> running at the same time without the lock. Am I missing something?

Ah, you're right, I'll add it back.
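
i.e. roughly (sketch based on the pre-existing code):

	spin_lock(&pfdev->js->job_lock);
	for (i = 0; i < NUM_JOB_SLOTS; i++) {
		if (pfdev->jobs[i]) {
			pm_runtime_put_noidle(pfdev->dev);
			panfrost_devfreq_record_idle(&pfdev->pfdevfreq);
			pfdev->jobs[i] = NULL;
		}
	}
	spin_unlock(&pfdev->js->job_lock);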

> 
> > +   /* Mask job interrupts and synchronize to make sure we won't be
> > +* interrupted during our reset.
> > +*/
> > +   job_write(pfdev, JOB_INT_MASK, 0);
> > +   synchronize_irq(pfdev->js->irq);
> > +
> > +   /* Schedulers are stopped and interrupts are masked+flushed, we don't
> > +* need to protect the 'evict unfinished jobs' lock with the job_lock.
> > +*/
> > for (i = 0; i < NUM_JOB_SLOTS; i++) {
> > if (pfdev->jobs[i]) {
> > pm_runtime_put_noidle(pfdev->dev);
> > @@ -408,7 +417,6 @@ static void panfrost_reset(struct panfrost_device 
> > *pfdev,
> > pfdev->jobs[i] = NULL;
> > }
> > }
> > -   spin_unlock(&pfdev->js->job_lock);
> >  
> > panfrost_device_reset(pfdev);
> >  
> > @@ -504,6 +512,7 @@ static void panfrost_job_handle_irq(struct 
> > panfrost_device *pfdev, u32 status)
> >  
> > job = pfdev->jobs[j];
> > /* Only NULL if job timeout occurred */
> > +   WARN_ON(!job);  
> 
> Was this WARN_ON intentional?

Yes, now that we mask and synchronize the irq in the reset I don't see
any reason why we would end up with an event but no job to attach this
event to, but maybe I missed something.


Re: [PATCH] drm/panel: Add DT bindings for Samsung LMS380KF01

2021-06-25 Thread Doug Anderson
Hi,

On Thu, Jun 24, 2021 at 3:40 PM Linus Walleij  wrote:
>
> +  spi-cpha:
> +$ref: /schemas/types.yaml#/definitions/flag
> +description: inherited as a SPI client node. Must be set.
> +
> +  spi-cpol:
> +$ref: /schemas/types.yaml#/definitions/flag
> +description: inherited as a SPI client node. Must be set.

I will defer to Rob Herring (added to CC) to confirm if we really need
all that stuff for spi-cpha and spi-cpol. I would have expected just:

spi-cpha: true
spi-cpol: true

As I understand it, the fact that they are flags will already be
validated as part of the "spi-controller.yaml" so you don't need to
specify that. ...and the fact that you have them listed as "required"
properties documents the fact that they must be set for your device,
so I don't think you need more.

NOTE: if you're testing this using your "example" below I think you
will find that you could set this to something other than just a flag
and it won't yell at you. However, that's because your example has a
bogus SPI controller node in it. I think if you put a real SPI
controller in the example then it'll pull in the "spi-controller.yaml"
bindings and magically start validating everything.


> +  spi-max-frequency:
> +$ref: /schemas/types.yaml#/definitions/uint32

You don't need the "$ref" line here either, right? Again it'll be
validated as part of the "spi-controller.yaml".
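
In other words, just something like this minimal sketch (the flags and
spi-max-frequency get their type and validation from spi-controller.yaml,
so nothing more is needed in this binding):

  spi-cpha: true
  spi-cpol: true
  spi-max-frequency: true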


> +required:
> +  - compatible
> +  - reg
> +  - spi-cpha
> +  - spi-cpol

Does "port" need to be listed as required too?


Re: [PATCH v3 10/15] drm/panfrost: Make sure job interrupts are masked before resetting

2021-06-25 Thread Steven Price
On 25/06/2021 14:33, Boris Brezillon wrote:
> This is not yet needed because we let active jobs be killed during
> the reset and we don't really bother making sure they can be restarted.
> But once we start adding soft-stop support, controlling when we deal
> with the remaining interrupts and making sure those are handled before
> the reset is issued gets tricky if we keep job interrupts active.
> 
> Let's prepare for that and mask+flush job IRQs before issuing a reset.
> 
> Signed-off-by: Boris Brezillon 
> ---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 21 +++--
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 88d34fd781e8..0566e2f7e84a 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -34,6 +34,7 @@ struct panfrost_queue_state {
>  struct panfrost_job_slot {
>   struct panfrost_queue_state queue[NUM_JOB_SLOTS];
>   spinlock_t job_lock;
> + int irq;
>  };
>  
>  static struct panfrost_job *
> @@ -400,7 +401,15 @@ static void panfrost_reset(struct panfrost_device *pfdev,
>   if (bad)
>   drm_sched_increase_karma(bad);
>  
> - spin_lock(&pfdev->js->job_lock);

I'm not sure it's safe to remove this lock as this protects the
pfdev->jobs array: I can't see what would prevent panfrost_job_close()
running at the same time without the lock. Am I missing something?

> + /* Mask job interrupts and synchronize to make sure we won't be
> +  * interrupted during our reset.
> +  */
> + job_write(pfdev, JOB_INT_MASK, 0);
> + synchronize_irq(pfdev->js->irq);
> +
> + /* Schedulers are stopped and interrupts are masked+flushed, we don't
> +  * need to protect the 'evict unfinished jobs' lock with the job_lock.
> +  */
>   for (i = 0; i < NUM_JOB_SLOTS; i++) {
>   if (pfdev->jobs[i]) {
>   pm_runtime_put_noidle(pfdev->dev);
> @@ -408,7 +417,6 @@ static void panfrost_reset(struct panfrost_device *pfdev,
>   pfdev->jobs[i] = NULL;
>   }
>   }
> - spin_unlock(&pfdev->js->job_lock);
>  
>   panfrost_device_reset(pfdev);
>  
> @@ -504,6 +512,7 @@ static void panfrost_job_handle_irq(struct 
> panfrost_device *pfdev, u32 status)
>  
>   job = pfdev->jobs[j];
>   /* Only NULL if job timeout occurred */
> + WARN_ON(!job);

Was this WARN_ON intentional?

Steve

>   if (job) {
>   pfdev->jobs[j] = NULL;
>  
> @@ -563,7 +572,7 @@ static void panfrost_reset_work(struct work_struct *work)
>  int panfrost_job_init(struct panfrost_device *pfdev)
>  {
>   struct panfrost_job_slot *js;
> - int ret, j, irq;
> + int ret, j;
>  
>   INIT_WORK(>reset.work, panfrost_reset_work);
>  
> @@ -573,11 +582,11 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>  
>   spin_lock_init(>job_lock);
>  
> - irq = platform_get_irq_byname(to_platform_device(pfdev->dev), "job");
> - if (irq <= 0)
> + js->irq = platform_get_irq_byname(to_platform_device(pfdev->dev), 
> "job");
> + if (js->irq <= 0)
>   return -ENODEV;
>  
> - ret = devm_request_threaded_irq(pfdev->dev, irq,
> + ret = devm_request_threaded_irq(pfdev->dev, js->irq,
>   panfrost_job_irq_handler,
>   panfrost_job_irq_handler_thread,
>   IRQF_SHARED, KBUILD_MODNAME "-job",
> 



Re: [PATCH v3 09/15] drm/panfrost: Simplify the reset serialization logic

2021-06-25 Thread Steven Price
On 25/06/2021 14:33, Boris Brezillon wrote:
> Now that we can pass our own workqueue to drm_sched_init(), we can use

Except that part has somehow slipped through to patch 15:

> @@ -633,8 +849,9 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>  
>   ret = drm_sched_init(>queue[j].sched,
>_sched_ops,
> -  1, 0,
> -  msecs_to_jiffies(JOB_TIMEOUT_MS), NULL,
> +  nslots, 0,
> +  msecs_to_jiffies(JOB_TIMEOUT_MS),
> +  pfdev->reset.wq,
>NULL, "pan_js");
>   if (ret) {
>   dev_err(pfdev->dev, "Failed to create scheduler: %d.", 
> ret);

Steve

> an ordered workqueue for both the scheduler timeout tdr and our own
> reset work (which we use when the reset is not caused by a fault/timeout
> on a specific job, like when we have AS_ACTIVE bit stuck). This
> guarantees that the timeout handlers and reset handler can't run
> concurrently which drastically simplifies the locking.
> 
> Suggested-by: Daniel Vetter 
> Signed-off-by: Boris Brezillon 
> ---
>  drivers/gpu/drm/panfrost/panfrost_device.h |   6 +-
>  drivers/gpu/drm/panfrost/panfrost_job.c| 185 -
>  2 files changed, 71 insertions(+), 120 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
> b/drivers/gpu/drm/panfrost/panfrost_device.h
> index 6024eaf34ba0..bfe32907ba6b 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -108,6 +108,7 @@ struct panfrost_device {
>   struct mutex sched_lock;
>  
>   struct {
> + struct workqueue_struct *wq;
>   struct work_struct work;
>   atomic_t pending;
>   } reset;
> @@ -177,9 +178,8 @@ const char *panfrost_exception_name(u32 exception_code);
>  static inline void
>  panfrost_device_schedule_reset(struct panfrost_device *pfdev)
>  {
> - /* Schedule a reset if there's no reset in progress. */
> - if (!atomic_xchg(>reset.pending, 1))
> - schedule_work(>reset.work);
> + atomic_set(>reset.pending, 1);
> + queue_work(pfdev->reset.wq, >reset.work);
>  }
>  
>  #endif
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> b/drivers/gpu/drm/panfrost/panfrost_job.c
> index e0c479e67304..88d34fd781e8 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -25,17 +25,8 @@
>  #define job_write(dev, reg, data) writel(data, dev->iomem + (reg))
>  #define job_read(dev, reg) readl(dev->iomem + (reg))
>  
> -enum panfrost_queue_status {
> - PANFROST_QUEUE_STATUS_ACTIVE,
> - PANFROST_QUEUE_STATUS_STOPPED,
> - PANFROST_QUEUE_STATUS_STARTING,
> - PANFROST_QUEUE_STATUS_FAULT_PENDING,
> -};
> -
>  struct panfrost_queue_state {
>   struct drm_gpu_scheduler sched;
> - atomic_t status;
> - struct mutex lock;
>   u64 fence_context;
>   u64 emit_seqno;
>  };
> @@ -379,57 +370,73 @@ void panfrost_job_enable_interrupts(struct 
> panfrost_device *pfdev)
>   job_write(pfdev, JOB_INT_MASK, irq_mask);
>  }
>  
> -static bool panfrost_scheduler_stop(struct panfrost_queue_state *queue,
> - struct drm_sched_job *bad)
> +static void panfrost_reset(struct panfrost_device *pfdev,
> +struct drm_sched_job *bad)
>  {
> - enum panfrost_queue_status old_status;
> - bool stopped = false;
> + unsigned int i;
> + bool cookie;
>  
> - mutex_lock(>lock);
> - old_status = atomic_xchg(>status,
> -  PANFROST_QUEUE_STATUS_STOPPED);
> - if (old_status == PANFROST_QUEUE_STATUS_STOPPED)
> - goto out;
> + if (WARN_ON(!atomic_read(>reset.pending)))
> + return;
> +
> + /* Stop the schedulers.
> +  *
> +  * FIXME: We temporarily get out of the dma_fence_signalling section
> +  * because the cleanup path generate lockdep splats when taking locks
> +  * to release job resources. We should rework the code to follow this
> +  * pattern:
> +  *
> +  *  try_lock
> +  *  if (locked)
> +  *  release
> +  *  else
> +  *  schedule_work_to_release_later
> +  */
> + for (i = 0; i < NUM_JOB_SLOTS; i++)
> + drm_sched_stop(>js->queue[i].sched, bad);
> +
> + cookie = dma_fence_begin_signalling();
>  
> - WARN_ON(old_status != PANFROST_QUEUE_STATUS_ACTIVE);
> - drm_sched_stop(>sched, bad);
>   if (bad)
>   drm_sched_increase_karma(bad);
>  
> - stopped = true;
> + spin_lock(>js->job_lock);
> + for (i = 0; i < NUM_JOB_SLOTS; i++) {
> + if (pfdev->jobs[i]) {
> + pm_runtime_put_noidle(pfdev->dev);
> + 

Re: [PATCH v3 08/15] drm/panfrost: Use a threaded IRQ for job interrupts

2021-06-25 Thread Steven Price
On 25/06/2021 14:33, Boris Brezillon wrote:
> This should avoid switching to interrupt context when the GPU is under
> heavy use.
> 
> v3:
> * Don't take the job_lock in panfrost_job_handle_irq()
> 
> Signed-off-by: Boris Brezillon 

Reviewed-by: Steven Price 

> ---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 53 ++---
>  1 file changed, 38 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> b/drivers/gpu/drm/panfrost/panfrost_job.c
> index be8f68f63974..e0c479e67304 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -470,19 +470,12 @@ static const struct drm_sched_backend_ops 
> panfrost_sched_ops = {
>   .free_job = panfrost_job_free
>  };
>  
> -static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
> +static void panfrost_job_handle_irq(struct panfrost_device *pfdev, u32 
> status)
>  {
> - struct panfrost_device *pfdev = data;
> - u32 status = job_read(pfdev, JOB_INT_STAT);
>   int j;
>  
>   dev_dbg(pfdev->dev, "jobslot irq status=%x\n", status);
>  
> - if (!status)
> - return IRQ_NONE;
> -
> - pm_runtime_mark_last_busy(pfdev->dev);
> -
>   for (j = 0; status; j++) {
>   u32 mask = MK_JS_MASK(j);
>  
> @@ -519,7 +512,6 @@ static irqreturn_t panfrost_job_irq_handler(int irq, void 
> *data)
>   if (status & JOB_INT_MASK_DONE(j)) {
>   struct panfrost_job *job;
>  
> - spin_lock(>js->job_lock);
>   job = pfdev->jobs[j];
>   /* Only NULL if job timeout occurred */
>   if (job) {
> @@ -531,21 +523,49 @@ static irqreturn_t panfrost_job_irq_handler(int irq, 
> void *data)
>   dma_fence_signal_locked(job->done_fence);
>   pm_runtime_put_autosuspend(pfdev->dev);
>   }
> - spin_unlock(>js->job_lock);
>   }
>  
>   status &= ~mask;
>   }
> +}
>  
> +static irqreturn_t panfrost_job_irq_handler_thread(int irq, void *data)
> +{
> + struct panfrost_device *pfdev = data;
> + u32 status = job_read(pfdev, JOB_INT_RAWSTAT);
> +
> + while (status) {
> + pm_runtime_mark_last_busy(pfdev->dev);
> +
> + spin_lock(>js->job_lock);
> + panfrost_job_handle_irq(pfdev, status);
> + spin_unlock(>js->job_lock);
> + status = job_read(pfdev, JOB_INT_RAWSTAT);
> + }
> +
> + job_write(pfdev, JOB_INT_MASK,
> +   GENMASK(16 + NUM_JOB_SLOTS - 1, 16) |
> +   GENMASK(NUM_JOB_SLOTS - 1, 0));
>   return IRQ_HANDLED;
>  }
>  
> +static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
> +{
> + struct panfrost_device *pfdev = data;
> + u32 status = job_read(pfdev, JOB_INT_STAT);
> +
> + if (!status)
> + return IRQ_NONE;
> +
> + job_write(pfdev, JOB_INT_MASK, 0);
> + return IRQ_WAKE_THREAD;
> +}
> +
>  static void panfrost_reset(struct work_struct *work)
>  {
>   struct panfrost_device *pfdev = container_of(work,
>struct panfrost_device,
>reset.work);
> - unsigned long flags;
>   unsigned int i;
>   bool cookie;
>  
> @@ -575,7 +595,7 @@ static void panfrost_reset(struct work_struct *work)
>   /* All timers have been stopped, we can safely reset the pending state. 
> */
>   atomic_set(>reset.pending, 0);
>  
> - spin_lock_irqsave(>js->job_lock, flags);
> + spin_lock(>js->job_lock);
>   for (i = 0; i < NUM_JOB_SLOTS; i++) {
>   if (pfdev->jobs[i]) {
>   pm_runtime_put_noidle(pfdev->dev);
> @@ -583,7 +603,7 @@ static void panfrost_reset(struct work_struct *work)
>   pfdev->jobs[i] = NULL;
>   }
>   }
> - spin_unlock_irqrestore(>js->job_lock, flags);
> + spin_unlock(>js->job_lock);
>  
>   panfrost_device_reset(pfdev);
>  
> @@ -610,8 +630,11 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>   if (irq <= 0)
>   return -ENODEV;
>  
> - ret = devm_request_irq(pfdev->dev, irq, panfrost_job_irq_handler,
> -IRQF_SHARED, KBUILD_MODNAME "-job", pfdev);
> + ret = devm_request_threaded_irq(pfdev->dev, irq,
> + panfrost_job_irq_handler,
> + panfrost_job_irq_handler_thread,
> + IRQF_SHARED, KBUILD_MODNAME "-job",
> + pfdev);
>   if (ret) {
>   dev_err(pfdev->dev, "failed to request job irq");
>   return ret;
> 



Re: [PATCH v3 05/15] drm/panfrost: Expose exception types to userspace

2021-06-25 Thread Boris Brezillon
On Fri, 25 Jun 2021 16:32:27 +0100
Steven Price  wrote:

> On 25/06/2021 15:21, Boris Brezillon wrote:
> > On Fri, 25 Jun 2021 09:42:08 -0400
> > Alyssa Rosenzweig  wrote:
> >   
> >> I'm not convinced. Right now most of our UABI is pleasantly
> >> GPU-agnostic. With this suddenly there's divergence between Midgard and
> >> Bifrost uABI.  
> > 
> > Hm, I don't see why. I mean the exception types seem to be the same,
> > there are just some that are not used on Midgard and some that are no
> > used on Bifrost. Are there any collisions I didn't notice?  
> 
> I think the real question is: why are we exporting them if user space
> doesn't want them ;) Should this be in an internal header file at least
> until someone actually requests they be available to user space?

Alright, I'll move it to panfrost_device.h (or panfrost_regs.h) then.


Re: [PATCH v5 3/5] drm/msm: Improve the a6xx page fault handler

2021-06-25 Thread Rob Clark
On Thu, Jun 24, 2021 at 8:39 PM Bjorn Andersson
 wrote:
>
> On Thu 10 Jun 16:44 CDT 2021, Rob Clark wrote:
> [..]
> > diff --git a/drivers/gpu/drm/msm/msm_iommu.c 
> > b/drivers/gpu/drm/msm/msm_iommu.c
> > index 50d881794758..6975b95c3c29 100644
> > --- a/drivers/gpu/drm/msm/msm_iommu.c
> > +++ b/drivers/gpu/drm/msm/msm_iommu.c
> > @@ -211,8 +211,17 @@ static int msm_fault_handler(struct iommu_domain 
> > *domain, struct device *dev,
> >   unsigned long iova, int flags, void *arg)
> >  {
> >   struct msm_iommu *iommu = arg;
> > + struct adreno_smmu_priv *adreno_smmu = 
> > dev_get_drvdata(iommu->base.dev);
> > + struct adreno_smmu_fault_info info, *ptr = NULL;
> > +
> > + if (adreno_smmu->get_fault_info) {
>
> This seemed reasonable when I read it last time, but I didn't realize
> that the msm_fault_handler() is installed for all msm_iommu instances.
>
> So while we're trying to recover from the boot splash and setup the new
> framebuffer we end up here with iommu->base.dev being the mdss device.
> Naturally drvdata of mdss is not a struct adreno_smmu_priv.
>
> > + adreno_smmu->get_fault_info(adreno_smmu->cookie, );
>
> So here we just jump straight out into hyperspace, never to return.
>
> Not sure how to wire this up to avoid the problem, but right now I don't
> think we can boot any device with a boot splash.
>

I think we could do:


diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index eed2a762e9dd..30ee8866154e 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -29,6 +29,9 @@ static struct msm_iommu_pagetable
*to_pagetable(struct msm_mmu *mmu)
  return container_of(mmu, struct msm_iommu_pagetable, base);
 }

+static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
+ unsigned long iova, int flags, void *arg);
+
 static int msm_iommu_pagetable_unmap(struct msm_mmu *mmu, u64 iova,
  size_t size)
 {
@@ -151,6 +154,8 @@ struct msm_mmu *msm_iommu_pagetable_create(struct
msm_mmu *parent)
  struct io_pgtable_cfg ttbr0_cfg;
  int ret;

+ iommu_set_fault_handler(iommu->domain, msm_fault_handler, iommu);
+
  /* Get the pagetable configuration from the domain */
  if (adreno_smmu->cookie)
  ttbr1_cfg = adreno_smmu->get_ttbr1_cfg(adreno_smmu->cookie);
@@ -300,7 +305,6 @@ struct msm_mmu *msm_iommu_new(struct device *dev,
struct iommu_domain *domain)

  iommu->domain = domain;
  msm_mmu_init(&iommu->base, dev, &funcs, MSM_MMU_IOMMU);
- iommu_set_fault_handler(domain, msm_fault_handler, iommu);

  atomic_set(&iommu->pagetables, 0);



That would have the result of setting the same fault handler multiple
times, but that looks harmless.  Mostly the fault handling stuff is to
make it easier to debug userspace issues; the fallback dmesg spam from
arm-smmu should be sufficient for any kernel side issues.

BR,
-R


Re: [PATCH v3 06/15] drm/panfrost: Do the exception -> string translation using a table

2021-06-25 Thread Steven Price
On 25/06/2021 14:33, Boris Brezillon wrote:
> Do the exception -> string translation using a table. This way we get
> rid of those magic numbers and can easily add new fields if we need
> to attach extra information to exception types.
> 
> v3:
> * Drop the error field
> 
> Signed-off-by: Boris Brezillon 

Reviewed-by: Steven Price 

> ---
>  drivers/gpu/drm/panfrost/panfrost_device.c | 130 +
>  1 file changed, 83 insertions(+), 47 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c 
> b/drivers/gpu/drm/panfrost/panfrost_device.c
> index bce6b0aff05e..736854542b05 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c
> @@ -292,55 +292,91 @@ void panfrost_device_fini(struct panfrost_device *pfdev)
>   panfrost_clk_fini(pfdev);
>  }
>  
> -const char *panfrost_exception_name(u32 exception_code)
> -{
> - switch (exception_code) {
> - /* Non-Fault Status code */
> - case 0x00: return "NOT_STARTED/IDLE/OK";
> - case 0x01: return "DONE";
> - case 0x02: return "INTERRUPTED";
> - case 0x03: return "STOPPED";
> - case 0x04: return "TERMINATED";
> - case 0x08: return "ACTIVE";
> - /* Job exceptions */
> - case 0x40: return "JOB_CONFIG_FAULT";
> - case 0x41: return "JOB_POWER_FAULT";
> - case 0x42: return "JOB_READ_FAULT";
> - case 0x43: return "JOB_WRITE_FAULT";
> - case 0x44: return "JOB_AFFINITY_FAULT";
> - case 0x48: return "JOB_BUS_FAULT";
> - case 0x50: return "INSTR_INVALID_PC";
> - case 0x51: return "INSTR_INVALID_ENC";
> - case 0x52: return "INSTR_TYPE_MISMATCH";
> - case 0x53: return "INSTR_OPERAND_FAULT";
> - case 0x54: return "INSTR_TLS_FAULT";
> - case 0x55: return "INSTR_BARRIER_FAULT";
> - case 0x56: return "INSTR_ALIGN_FAULT";
> - case 0x58: return "DATA_INVALID_FAULT";
> - case 0x59: return "TILE_RANGE_FAULT";
> - case 0x5A: return "ADDR_RANGE_FAULT";
> - case 0x60: return "OUT_OF_MEMORY";
> - /* GPU exceptions */
> - case 0x80: return "DELAYED_BUS_FAULT";
> - case 0x88: return "SHAREABILITY_FAULT";
> - /* MMU exceptions */
> - case 0xC1: return "TRANSLATION_FAULT_LEVEL1";
> - case 0xC2: return "TRANSLATION_FAULT_LEVEL2";
> - case 0xC3: return "TRANSLATION_FAULT_LEVEL3";
> - case 0xC4: return "TRANSLATION_FAULT_LEVEL4";
> - case 0xC8: return "PERMISSION_FAULT";
> - case 0xC9 ... 0xCF: return "PERMISSION_FAULT";
> - case 0xD1: return "TRANSTAB_BUS_FAULT_LEVEL1";
> - case 0xD2: return "TRANSTAB_BUS_FAULT_LEVEL2";
> - case 0xD3: return "TRANSTAB_BUS_FAULT_LEVEL3";
> - case 0xD4: return "TRANSTAB_BUS_FAULT_LEVEL4";
> - case 0xD8: return "ACCESS_FLAG";
> - case 0xD9 ... 0xDF: return "ACCESS_FLAG";
> - case 0xE0 ... 0xE7: return "ADDRESS_SIZE_FAULT";
> - case 0xE8 ... 0xEF: return "MEMORY_ATTRIBUTES_FAULT";
> +#define PANFROST_EXCEPTION(id) \
> + [DRM_PANFROST_EXCEPTION_ ## id] = { \
> + .name = #id, \
>   }
>  
> - return "UNKNOWN";
> +struct panfrost_exception_info {
> + const char *name;
> +};
> +
> +static const struct panfrost_exception_info panfrost_exception_infos[] = {
> + PANFROST_EXCEPTION(OK),
> + PANFROST_EXCEPTION(DONE),
> + PANFROST_EXCEPTION(INTERRUPTED),
> + PANFROST_EXCEPTION(STOPPED),
> + PANFROST_EXCEPTION(TERMINATED),
> + PANFROST_EXCEPTION(KABOOM),
> + PANFROST_EXCEPTION(EUREKA),
> + PANFROST_EXCEPTION(ACTIVE),
> + PANFROST_EXCEPTION(JOB_CONFIG_FAULT),
> + PANFROST_EXCEPTION(JOB_POWER_FAULT),
> + PANFROST_EXCEPTION(JOB_READ_FAULT),
> + PANFROST_EXCEPTION(JOB_WRITE_FAULT),
> + PANFROST_EXCEPTION(JOB_AFFINITY_FAULT),
> + PANFROST_EXCEPTION(JOB_BUS_FAULT),
> + PANFROST_EXCEPTION(INSTR_INVALID_PC),
> + PANFROST_EXCEPTION(INSTR_INVALID_ENC),
> + PANFROST_EXCEPTION(INSTR_TYPE_MISMATCH),
> + PANFROST_EXCEPTION(INSTR_OPERAND_FAULT),
> + PANFROST_EXCEPTION(INSTR_TLS_FAULT),
> + PANFROST_EXCEPTION(INSTR_BARRIER_FAULT),
> + PANFROST_EXCEPTION(INSTR_ALIGN_FAULT),
> + PANFROST_EXCEPTION(DATA_INVALID_FAULT),
> + PANFROST_EXCEPTION(TILE_RANGE_FAULT),
> + PANFROST_EXCEPTION(ADDR_RANGE_FAULT),
> + PANFROST_EXCEPTION(IMPRECISE_FAULT),
> + PANFROST_EXCEPTION(OOM),
> + PANFROST_EXCEPTION(OOM_AFBC),
> + PANFROST_EXCEPTION(UNKNOWN),
> + PANFROST_EXCEPTION(DELAYED_BUS_FAULT),
> + PANFROST_EXCEPTION(GPU_SHAREABILITY_FAULT),
> + PANFROST_EXCEPTION(SYS_SHAREABILITY_FAULT),
> + PANFROST_EXCEPTION(GPU_CACHEABILITY_FAULT),
> + PANFROST_EXCEPTION(TRANSLATION_FAULT_0),
> + PANFROST_EXCEPTION(TRANSLATION_FAULT_1),
> + PANFROST_EXCEPTION(TRANSLATION_FAULT_2),
> + PANFROST_EXCEPTION(TRANSLATION_FAULT_3),
> + PANFROST_EXCEPTION(TRANSLATION_FAULT_4),
> + PANFROST_EXCEPTION(TRANSLATION_FAULT_IDENTITY),
> + 
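
[Editorial note: the table above is truncated by the archive. For completeness, the lookup side that pairs with such a sparse, designated-initializer table is a bounds-checked accessor along these lines (sketch only; the exact fallback string used in the patch may differ):]

const char *panfrost_exception_name(u32 exception_code)
{
	/* Entries not listed in panfrost_exception_infos[] are left
	 * zero-initialized, so their .name pointer is NULL. */
	if (WARN_ON(exception_code >= ARRAY_SIZE(panfrost_exception_infos) ||
		    !panfrost_exception_infos[exception_code].name))
		return "Unknown exception type";

	return panfrost_exception_infos[exception_code].name;
}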

Re: [PATCH v3 05/15] drm/panfrost: Expose exception types to userspace

2021-06-25 Thread Steven Price
On 25/06/2021 15:21, Boris Brezillon wrote:
> On Fri, 25 Jun 2021 09:42:08 -0400
> Alyssa Rosenzweig  wrote:
> 
>> I'm not convinced. Right now most of our UABI is pleasantly
>> GPU-agnostic. With this suddenly there's divergence between Midgard and
>> Bifrost uABI.
> 
> Hm, I don't see why. I mean the exception types seem to be the same,
> there are just some that are not used on Midgard and some that are not
> used on Bifrost. Are there any collisions I didn't notice?

I think the real question is: why are we exporting them if user space
doesn't want them ;) Should this be in an internal header file at least
until someone actually requests they be available to user space?

>> With that drawback in mind, could you explain the benefit?
> 
> Well, I thought having these definitions in a central place would be a
> good thing given they're not expected to change even if they might
> be per-GPU. I don't know if that changes with CSF, maybe the exception
> codes are no longer set in stone and can change with FW update...

CSF certainly means the firmware controls a lot more of this sort of
thing but AFAIK the exception types still fit in the same scheme.

Steve

>>
>> On Fri, Jun 25, 2021 at 03:33:17PM +0200, Boris Brezillon wrote:
>>> Job headers contain an exception type field which might be read and
>>> converted to a human readable string by tracing tools. Let's expose
>>> the exception type as an enum so we share the same definition.
>>>
>>> v3:
>>> * Add missing values
>>>
>>> Signed-off-by: Boris Brezillon 
>>> ---
>>>  include/uapi/drm/panfrost_drm.h | 71 +
>>>  1 file changed, 71 insertions(+)
>>>
>>> diff --git a/include/uapi/drm/panfrost_drm.h 
>>> b/include/uapi/drm/panfrost_drm.h
>>> index ec19db1eead8..899cd6d952d4 100644
>>> --- a/include/uapi/drm/panfrost_drm.h
>>> +++ b/include/uapi/drm/panfrost_drm.h
>>> @@ -223,6 +223,77 @@ struct drm_panfrost_madvise {
>>> __u32 retained;   /* out, whether backing store still exists */
>>>  };
>>>  
>>> +/* The exception types */
>>> +
>>> +enum drm_panfrost_exception_type {
>>> +   DRM_PANFROST_EXCEPTION_OK = 0x00,
>>> +   DRM_PANFROST_EXCEPTION_DONE = 0x01,
>>> +   DRM_PANFROST_EXCEPTION_INTERRUPTED = 0x02,
>>> +   DRM_PANFROST_EXCEPTION_STOPPED = 0x03,
>>> +   DRM_PANFROST_EXCEPTION_TERMINATED = 0x04,
>>> +   DRM_PANFROST_EXCEPTION_KABOOM = 0x05,
>>> +   DRM_PANFROST_EXCEPTION_EUREKA = 0x06,
>>> +   DRM_PANFROST_EXCEPTION_ACTIVE = 0x08,
>>> +   DRM_PANFROST_EXCEPTION_JOB_CONFIG_FAULT = 0x40,
>>> +   DRM_PANFROST_EXCEPTION_JOB_POWER_FAULT = 0x41,
>>> +   DRM_PANFROST_EXCEPTION_JOB_READ_FAULT = 0x42,
>>> +   DRM_PANFROST_EXCEPTION_JOB_WRITE_FAULT = 0x43,
>>> +   DRM_PANFROST_EXCEPTION_JOB_AFFINITY_FAULT = 0x44,
>>> +   DRM_PANFROST_EXCEPTION_JOB_BUS_FAULT = 0x48,
>>> +   DRM_PANFROST_EXCEPTION_INSTR_INVALID_PC = 0x50,
>>> +   DRM_PANFROST_EXCEPTION_INSTR_INVALID_ENC = 0x51,
>>> +   DRM_PANFROST_EXCEPTION_INSTR_TYPE_MISMATCH = 0x52,
>>> +   DRM_PANFROST_EXCEPTION_INSTR_OPERAND_FAULT = 0x53,
>>> +   DRM_PANFROST_EXCEPTION_INSTR_TLS_FAULT = 0x54,
>>> +   DRM_PANFROST_EXCEPTION_INSTR_BARRIER_FAULT = 0x55,
>>> +   DRM_PANFROST_EXCEPTION_INSTR_ALIGN_FAULT = 0x56,
>>> +   DRM_PANFROST_EXCEPTION_DATA_INVALID_FAULT = 0x58,
>>> +   DRM_PANFROST_EXCEPTION_TILE_RANGE_FAULT = 0x59,
>>> +   DRM_PANFROST_EXCEPTION_ADDR_RANGE_FAULT = 0x5a,
>>> +   DRM_PANFROST_EXCEPTION_IMPRECISE_FAULT = 0x5b,
>>> +   DRM_PANFROST_EXCEPTION_OOM = 0x60,
>>> +   DRM_PANFROST_EXCEPTION_OOM_AFBC = 0x61,
>>> +   DRM_PANFROST_EXCEPTION_UNKNOWN = 0x7f,
>>> +   DRM_PANFROST_EXCEPTION_DELAYED_BUS_FAULT = 0x80,
>>> +   DRM_PANFROST_EXCEPTION_GPU_SHAREABILITY_FAULT = 0x88,
>>> +   DRM_PANFROST_EXCEPTION_SYS_SHAREABILITY_FAULT = 0x89,
>>> +   DRM_PANFROST_EXCEPTION_GPU_CACHEABILITY_FAULT = 0x8a,
>>> +   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_0 = 0xc0,
>>> +   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_1 = 0xc1,
>>> +   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_2 = 0xc2,
>>> +   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_3 = 0xc3,
>>> +   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_4 = 0xc4,
>>> +   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_IDENTITY = 0xc7,
>>> +   DRM_PANFROST_EXCEPTION_PERM_FAULT_0 = 0xc8,
>>> +   DRM_PANFROST_EXCEPTION_PERM_FAULT_1 = 0xc9,
>>> +   DRM_PANFROST_EXCEPTION_PERM_FAULT_2 = 0xca,
>>> +   DRM_PANFROST_EXCEPTION_PERM_FAULT_3 = 0xcb,
>>> +   DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_0 = 0xd0,
>>> +   DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_1 = 0xd1,
>>> +   DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_2 = 0xd2,
>>> +   DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_3 = 0xd3,
>>> +   DRM_PANFROST_EXCEPTION_ACCESS_FLAG_0 = 0xd8,
>>> +   DRM_PANFROST_EXCEPTION_ACCESS_FLAG_1 = 0xd9,
>>> +   DRM_PANFROST_EXCEPTION_ACCESS_FLAG_2 = 0xda,
>>> +   DRM_PANFROST_EXCEPTION_ACCESS_FLAG_3 = 0xdb,
>>> +   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN0 = 0xe0,
>>> +   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN1 = 0xe1,
>>> +   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN2 

Re: [PATCH v3 01/15] drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr

2021-06-25 Thread Boris Brezillon
On Fri, 25 Jun 2021 16:07:03 +0100
Steven Price  wrote:

> On 25/06/2021 14:33, Boris Brezillon wrote:
> > Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
> > reset. This leads to extra complexity when we need to synchronize timeout
> > works with the reset work. One solution to address that is to have an
> > ordered workqueue at the driver level that will be used by the different
> > schedulers to queue their timeout work. Thanks to the serialization
> > provided by the ordered workqueue we are guaranteed that timeout
> > handlers are executed sequentially, and can thus easily reset the GPU
> > from the timeout handler without extra synchronization.
> > 
> > Signed-off-by: Boris Brezillon   
> 
> I feel like I'm missing something here - I can't see where
> sched->timeout_wq is ever actually used in this series. There's clearly
> no point passing it into the drm core if the drm core never accesses it.
> AFAICT the changes are all in patch 9 and that doesn't depend on this one.

Oops, indeed, I forgot to patch sched_main.c to use the timeout_wq (below
is a version doing that). We really need a way to trigger this sort of
race...
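
For context, the driver-side usage of the new argument is small; a rough
sketch of what panfrost does with it in patch 9 (field names approximate,
error handling trimmed):

	/* At init time, create one ordered (max_active == 1) workqueue
	 * shared by all job slots, so timeout/reset handlers never run
	 * concurrently: */
	pfdev->reset.wq = alloc_ordered_workqueue("panfrost-reset", 0);
	if (!pfdev->reset.wq)
		return -ENOMEM;

	/* ... and hand it to each scheduler instead of the default
	 * system_wq: */
	ret = drm_sched_init(&js->queue[j].sched, &panfrost_sched_ops,
			     1, 0, msecs_to_jiffies(JOB_TIMEOUT_MS),
			     pfdev->reset.wq, NULL, "pan_js");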

--->8---
From 18bb739da5a5fc3e36d2c4378408c6938198993c Mon Sep 17 00:00:00 2001
From: Boris Brezillon 
Date: Wed, 23 Jun 2021 16:14:01 +0200
Subject: [PATCH] drm/sched: Allow using a dedicated workqueue for the
 timeout/fault tdr

Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize timeout
works with the reset work. One solution to address that is to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. Thanks to the serialization
provided by the ordered workqueue we are guaranteed that timeout
handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.

Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c |  2 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c   |  3 ++-
 drivers/gpu/drm/lima/lima_sched.c |  3 ++-
 drivers/gpu/drm/panfrost/panfrost_job.c   |  3 ++-
 drivers/gpu/drm/scheduler/sched_main.c| 14 +-
 drivers/gpu/drm/v3d/v3d_sched.c   | 10 +-
 include/drm/gpu_scheduler.h   |  5 -
 7 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 47ea46859618..532636ea20bc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -488,7 +488,7 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
 
 r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
   num_hw_submission, amdgpu_job_hang_limit,
-  timeout, sched_score, ring->name);
+  timeout, NULL, sched_score, ring->name);
if (r) {
DRM_ERROR("Failed to create scheduler on ring %s.\n",
  ring->name);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 19826e504efc..feb6da1b6ceb 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -190,7 +190,8 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
 
 ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,
 etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
-msecs_to_jiffies(500), NULL, dev_name(gpu->dev));
+msecs_to_jiffies(500), NULL, NULL,
+dev_name(gpu->dev));
if (ret)
return ret;
 
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index ecf3267334ff..dba8329937a3 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -508,7 +508,8 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, 
const char *name)
 INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
 
 return drm_sched_init(&pipe->base, &lima_sched_ops, 1,
- lima_job_hang_limit, msecs_to_jiffies(timeout),
+ lima_job_hang_limit,
+ msecs_to_jiffies(timeout), NULL,
  NULL, name);
 }
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 682f2161b999..8ff79fd49577 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -626,7 +626,8 @@ int panfrost_job_init(struct panfrost_device *pfdev)
 
 ret = drm_sched_init(&js->queue[j].sched,
			     &panfrost_sched_ops,
-1, 0, msecs_to_jiffies(JOB_TIMEOUT_MS),
+1, 0,
+

Re: [PATCH v3 01/15] drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr

2021-06-25 Thread Steven Price
On 25/06/2021 14:33, Boris Brezillon wrote:
> Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
> reset. This leads to extra complexity when we need to synchronize timeout
> works with the reset work. One solution to address that is to have an
> ordered workqueue at the driver level that will be used by the different
> schedulers to queue their timeout work. Thanks to the serialization
> provided by the ordered workqueue we are guaranteed that timeout
> handlers are executed sequentially, and can thus easily reset the GPU
> from the timeout handler without extra synchronization.
> 
> Signed-off-by: Boris Brezillon 

I feel like I'm missing something here - I can't see where
sched->timeout_wq is ever actually used in this series. There's clearly
no point passing it into the drm core if the drm core never accesses it.
AFAICT the changes are all in patch 9 and that doesn't depend on this one.

Steve

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c |  2 +-
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c   |  3 ++-
>  drivers/gpu/drm/lima/lima_sched.c |  3 ++-
>  drivers/gpu/drm/panfrost/panfrost_job.c   |  3 ++-
>  drivers/gpu/drm/scheduler/sched_main.c|  6 +-
>  drivers/gpu/drm/v3d/v3d_sched.c   | 10 +-
>  include/drm/gpu_scheduler.h   |  5 -
>  7 files changed, 21 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index 47ea46859618..532636ea20bc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -488,7 +488,7 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring 
> *ring,
>  
>   r = drm_sched_init(>sched, _sched_ops,
>  num_hw_submission, amdgpu_job_hang_limit,
> -timeout, sched_score, ring->name);
> +timeout, NULL, sched_score, ring->name);
>   if (r) {
>   DRM_ERROR("Failed to create scheduler on ring %s.\n",
> ring->name);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index 19826e504efc..feb6da1b6ceb 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -190,7 +190,8 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
>  
>   ret = drm_sched_init(>sched, _sched_ops,
>etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
> -  msecs_to_jiffies(500), NULL, dev_name(gpu->dev));
> +  msecs_to_jiffies(500), NULL, NULL,
> +  dev_name(gpu->dev));
>   if (ret)
>   return ret;
>  
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index ecf3267334ff..dba8329937a3 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -508,7 +508,8 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, 
> const char *name)
>   INIT_WORK(>recover_work, lima_sched_recover_work);
>  
>   return drm_sched_init(>base, _sched_ops, 1,
> -   lima_job_hang_limit, msecs_to_jiffies(timeout),
> +   lima_job_hang_limit,
> +   msecs_to_jiffies(timeout), NULL,
> NULL, name);
>  }
>  
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 682f2161b999..8ff79fd49577 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -626,7 +626,8 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>  
>   ret = drm_sched_init(>queue[j].sched,
>_sched_ops,
> -  1, 0, msecs_to_jiffies(JOB_TIMEOUT_MS),
> +  1, 0,
> +  msecs_to_jiffies(JOB_TIMEOUT_MS), NULL,
>NULL, "pan_js");
>   if (ret) {
>   dev_err(pfdev->dev, "Failed to create scheduler: %d.", 
> ret);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index c0a2f8f8d472..a937d0529944 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -837,6 +837,8 @@ static int drm_sched_main(void *param)
>   * @hw_submission: number of hw submissions that can be in flight
>   * @hang_limit: number of times to allow a job to hang before dropping it
>   * @timeout: timeout value in jiffies for the scheduler
> + * @timeout_wq: workqueue to use for timeout work. If NULL, the system_wq is
> + *   used
>   * @score: optional score atomic shared with other schedulers
>   * @name: name used for debugging
>   *
> @@ -844,7 +846,8 @@ static int drm_sched_main(void *param)
> 

Re: [PATCH v3 14/15] drm/panfrost: Kill in-flight jobs on FD close

2021-06-25 Thread Boris Brezillon
On Fri, 25 Jun 2021 15:43:45 +0200
Lucas Stach  wrote:

On Friday, 2021-06-25 at 15:33 +0200, Boris Brezillon wrote:
> > If the process that submitted these jobs decided to close the FD before
> > the jobs are done, it probably means it doesn't care about the result.
> > 
> > v3:
> > * Set fence error to ECANCELED when a TERMINATED exception is received
> > 
> > Signed-off-by: Boris Brezillon 
> > ---
> >  drivers/gpu/drm/panfrost/panfrost_job.c | 43 +
> >  1 file changed, 37 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> > b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index 948bd174ff99..aa1e6542adde 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -498,14 +498,21 @@ static void panfrost_job_handle_irq(struct 
> > panfrost_device *pfdev, u32 status)
> >  
> > if (status & JOB_INT_MASK_ERR(j)) {
> > u32 js_status = job_read(pfdev, JS_STATUS(j));
> > +   const char *exception_name = 
> > panfrost_exception_name(js_status);
> >  
> > job_write(pfdev, JS_COMMAND_NEXT(j), JS_COMMAND_NOP);
> >  
> > -   dev_err(pfdev->dev, "js fault, js=%d, status=%s, 
> > head=0x%x, tail=0x%x",
> > -   j,
> > -   panfrost_exception_name(js_status),
> > -   job_read(pfdev, JS_HEAD_LO(j)),
> > -   job_read(pfdev, JS_TAIL_LO(j)));
> > +   if (js_status < 
> > DRM_PANFROST_EXCEPTION_JOB_CONFIG_FAULT) {
> > +   dev_dbg(pfdev->dev, "js interrupt, js=%d, 
> > status=%s, head=0x%x, tail=0x%x",
> > +   j, exception_name,
> > +   job_read(pfdev, JS_HEAD_LO(j)),
> > +   job_read(pfdev, JS_TAIL_LO(j)));
> > +   } else {
> > +   dev_err(pfdev->dev, "js fault, js=%d, 
> > status=%s, head=0x%x, tail=0x%x",
> > +   j, exception_name,
> > +   job_read(pfdev, JS_HEAD_LO(j)),
> > +   job_read(pfdev, JS_TAIL_LO(j)));
> > +   }
> >  
> > /* If we need a reset, signal it to the timeout
> >  * handler, otherwise, update the fence error field and
> > @@ -514,7 +521,16 @@ static void panfrost_job_handle_irq(struct 
> > panfrost_device *pfdev, u32 status)
> > if (panfrost_exception_needs_reset(pfdev, js_status)) {
> > drm_sched_fault(>js->queue[j].sched);
> > } else {
> > -   dma_fence_set_error(pfdev->jobs[j]->done_fence, 
> > -EINVAL);
> > +   int error = 0;
> > +
> > +   if (js_status == 
> > DRM_PANFROST_EXCEPTION_TERMINATED)
> > +   error = -ECANCELED;
> > +   else if (js_status >= 
> > DRM_PANFROST_EXCEPTION_JOB_CONFIG_FAULT)
> > +   error = -EINVAL;
> > +
> > +   if (error)
> > +   
> > dma_fence_set_error(pfdev->jobs[j]->done_fence, error);
> > +
> > status |= JOB_INT_MASK_DONE(j);
> > }
> > }
> > @@ -673,10 +689,25 @@ int panfrost_job_open(struct panfrost_file_priv 
> > *panfrost_priv)
> >  
> >  void panfrost_job_close(struct panfrost_file_priv *panfrost_priv)
> >  {
> > +   struct panfrost_device *pfdev = panfrost_priv->pfdev;
> > +   unsigned long flags;
> > int i;
> >  
> > for (i = 0; i < NUM_JOB_SLOTS; i++)
> > drm_sched_entity_destroy(_priv->sched_entity[i]);
> > +
> > +   /* Kill in-flight jobs */
> > +   spin_lock_irqsave(>js->job_lock, flags);  
> 
> Micro-optimization, but this code is never called from IRQ context, so
> a spin_lock_irq would do here, no need to save/restore flags.

Ah, right, I moved patches around. This patch was before the 'move to
threaded-irq' one in v2, but now that it's coming after, we can use a
regular lock here.
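
For the archives, a sketch of what the v4 close path could look like once
this patch lands after the threaded-irq one (plain spin_lock, since
job_lock is then never taken from hard-IRQ context; this is an editorial
sketch, not the posted v4):

void panfrost_job_close(struct panfrost_file_priv *panfrost_priv)
{
	struct panfrost_device *pfdev = panfrost_priv->pfdev;
	int i;

	for (i = 0; i < NUM_JOB_SLOTS; i++)
		drm_sched_entity_destroy(&panfrost_priv->sched_entity[i]);

	/* Kill in-flight jobs: a plain spin_lock is enough here because
	 * the job IRQ now runs in a thread, so job_lock is only ever
	 * taken from process context. */
	spin_lock(&pfdev->js->job_lock);
	for (i = 0; i < NUM_JOB_SLOTS; i++) {
		struct drm_sched_entity *entity = &panfrost_priv->sched_entity[i];
		struct panfrost_job *job = pfdev->jobs[i];

		if (!job || job->base.entity != entity)
			continue;

		job_write(pfdev, JS_COMMAND(i), JS_COMMAND_HARD_STOP);
	}
	spin_unlock(&pfdev->js->job_lock);
}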

> 
> Regards,
> Lucas
> 
> > +   for (i = 0; i < NUM_JOB_SLOTS; i++) {
> > +   struct drm_sched_entity *entity = 
> > _priv->sched_entity[i];
> > +   struct panfrost_job *job = pfdev->jobs[i];
> > +
> > +   if (!job || job->base.entity != entity)
> > +   continue;
> > +
> > +   job_write(pfdev, JS_COMMAND(i), JS_COMMAND_HARD_STOP);
> > +   }
> > +   spin_unlock_irqrestore(>js->job_lock, flags);
> >  }
> >  
> >  int panfrost_job_is_idle(struct panfrost_device *pfdev)  
> 
> 



Re: [PATCH v3 08/15] drm/panfrost: Use a threaded IRQ for job interrupts

2021-06-25 Thread Boris Brezillon
On Fri, 25 Jun 2021 09:47:59 -0400
Alyssa Rosenzweig  wrote:

> A-b, but could you explain the context? Thanks

The rationale behind this change is the complexity added to the
interrupt handler in patch 15. That means we might spend more time in
interrupt context after that patch and block other things on the system
while we dequeue job irqs. Moving things to a thread also helps
performance when the GPU gets faster at executing jobs than the CPU is
at queueing them. In that case we keep switching back and forth between
interrupt and non-interrupt context, which has a cost.

One drawback is increased latency when receiving job events and the
thread is idle, since you need to wake up the thread in that case.
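
[Editorial note: for readers less familiar with threaded interrupts, the
pattern this series switches to looks roughly like the minimal sketch
below. It is a generic illustration only; the mydev_* names and register
offsets are invented and this is not the panfrost code.]

#include <linux/interrupt.h>
#include <linux/io.h>

/* Hypothetical device, register offsets invented for illustration. */
#define INT_STAT	0x00	/* masked interrupt status */
#define INT_RAWSTAT	0x04	/* raw (unmasked) interrupt status */
#define INT_MASK	0x08	/* interrupt enable mask */

struct mydev {
	struct device *dev;
	void __iomem *regs;
};

static void mydev_handle_events(struct mydev *mdev)
{
	/* Drain and process the hardware event queue (elided). */
}

/* Hard IRQ handler: runs in interrupt context, does the bare minimum. */
static irqreturn_t mydev_irq(int irq, void *data)
{
	struct mydev *mdev = data;

	if (!readl(mdev->regs + INT_STAT))
		return IRQ_NONE;

	/* Mask the source so the line stops firing, defer the real work. */
	writel(0, mdev->regs + INT_MASK);
	return IRQ_WAKE_THREAD;
}

/* Threaded handler: runs in process context. */
static irqreturn_t mydev_irq_thread(int irq, void *data)
{
	struct mydev *mdev = data;

	while (readl(mdev->regs + INT_RAWSTAT))
		mydev_handle_events(mdev);

	/* Re-enable the interrupt source before returning. */
	writel(~0, mdev->regs + INT_MASK);
	return IRQ_HANDLED;
}

/* Both handlers are registered on the same line with:
 *
 *	devm_request_threaded_irq(mdev->dev, irq, mydev_irq,
 *				  mydev_irq_thread, IRQF_SHARED,
 *				  "mydev", mdev);
 */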

> 
> On Fri, Jun 25, 2021 at 03:33:20PM +0200, Boris Brezillon wrote:
> > This should avoid switching to interrupt context when the GPU is under
> > heavy use.
> > 
> > v3:
> > * Don't take the job_lock in panfrost_job_handle_irq()
> > 
> > Signed-off-by: Boris Brezillon 
> > ---
> >  drivers/gpu/drm/panfrost/panfrost_job.c | 53 ++---
> >  1 file changed, 38 insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> > b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index be8f68f63974..e0c479e67304 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -470,19 +470,12 @@ static const struct drm_sched_backend_ops 
> > panfrost_sched_ops = {
> > .free_job = panfrost_job_free
> >  };
> >  
> > -static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
> > +static void panfrost_job_handle_irq(struct panfrost_device *pfdev, u32 
> > status)
> >  {
> > -   struct panfrost_device *pfdev = data;
> > -   u32 status = job_read(pfdev, JOB_INT_STAT);
> > int j;
> >  
> > dev_dbg(pfdev->dev, "jobslot irq status=%x\n", status);
> >  
> > -   if (!status)
> > -   return IRQ_NONE;
> > -
> > -   pm_runtime_mark_last_busy(pfdev->dev);
> > -
> > for (j = 0; status; j++) {
> > u32 mask = MK_JS_MASK(j);
> >  
> > @@ -519,7 +512,6 @@ static irqreturn_t panfrost_job_irq_handler(int irq, 
> > void *data)
> > if (status & JOB_INT_MASK_DONE(j)) {
> > struct panfrost_job *job;
> >  
> > -   spin_lock(>js->job_lock);
> > job = pfdev->jobs[j];
> > /* Only NULL if job timeout occurred */
> > if (job) {
> > @@ -531,21 +523,49 @@ static irqreturn_t panfrost_job_irq_handler(int irq, 
> > void *data)
> > dma_fence_signal_locked(job->done_fence);
> > pm_runtime_put_autosuspend(pfdev->dev);
> > }
> > -   spin_unlock(>js->job_lock);
> > }
> >  
> > status &= ~mask;
> > }
> > +}
> >  
> > +static irqreturn_t panfrost_job_irq_handler_thread(int irq, void *data)
> > +{
> > +   struct panfrost_device *pfdev = data;
> > +   u32 status = job_read(pfdev, JOB_INT_RAWSTAT);
> > +
> > +   while (status) {
> > +   pm_runtime_mark_last_busy(pfdev->dev);
> > +
> > +   spin_lock(>js->job_lock);
> > +   panfrost_job_handle_irq(pfdev, status);
> > +   spin_unlock(>js->job_lock);
> > +   status = job_read(pfdev, JOB_INT_RAWSTAT);
> > +   }
> > +
> > +   job_write(pfdev, JOB_INT_MASK,
> > + GENMASK(16 + NUM_JOB_SLOTS - 1, 16) |
> > + GENMASK(NUM_JOB_SLOTS - 1, 0));
> > return IRQ_HANDLED;
> >  }
> >  
> > +static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
> > +{
> > +   struct panfrost_device *pfdev = data;
> > +   u32 status = job_read(pfdev, JOB_INT_STAT);
> > +
> > +   if (!status)
> > +   return IRQ_NONE;
> > +
> > +   job_write(pfdev, JOB_INT_MASK, 0);
> > +   return IRQ_WAKE_THREAD;
> > +}
> > +
> >  static void panfrost_reset(struct work_struct *work)
> >  {
> > struct panfrost_device *pfdev = container_of(work,
> >  struct panfrost_device,
> >  reset.work);
> > -   unsigned long flags;
> > unsigned int i;
> > bool cookie;
> >  
> > @@ -575,7 +595,7 @@ static void panfrost_reset(struct work_struct *work)
> > /* All timers have been stopped, we can safely reset the pending state. 
> > */
> > atomic_set(>reset.pending, 0);
> >  
> > -   spin_lock_irqsave(>js->job_lock, flags);
> > +   spin_lock(>js->job_lock);
> > for (i = 0; i < NUM_JOB_SLOTS; i++) {
> > if (pfdev->jobs[i]) {
> > pm_runtime_put_noidle(pfdev->dev);
> > @@ -583,7 +603,7 @@ static void panfrost_reset(struct work_struct *work)
> > pfdev->jobs[i] = NULL;
> > }
> > }
> > -   spin_unlock_irqrestore(>js->job_lock, flags);
> > +   spin_unlock(>js->job_lock);
> >  
> > panfrost_device_reset(pfdev);
> >  
> > @@ -610,8 +630,11 @@ int 

Re: [PATCH v3 05/15] drm/panfrost: Expose exception types to userspace

2021-06-25 Thread Boris Brezillon
On Fri, 25 Jun 2021 09:42:08 -0400
Alyssa Rosenzweig  wrote:

> I'm not convinced. Right now most of our UABI is pleasantly
> GPU-agnostic. With this suddenly there's divergence between Midgard and
> Bifrost uABI.

Hm, I don't see why. I mean the exception types seem to be the same,
there are just some that are not used on Midgard and some that are not
used on Bifrost. Are there any collisions I didn't notice?

> With that drawback in mind, could you explain the benefit?

Well, I thought having these definitions in a central place would be a
good thing given they're not expected to change even if they might
be per-GPU. I don't know if that changes with CSF, maybe the exception
codes are no longer set in stone and can change with FW update...

> 
> On Fri, Jun 25, 2021 at 03:33:17PM +0200, Boris Brezillon wrote:
> > Job headers contain an exception type field which might be read and
> > converted to a human readable string by tracing tools. Let's expose
> > the exception type as an enum so we share the same definition.
> > 
> > v3:
> > * Add missing values
> > 
> > Signed-off-by: Boris Brezillon 
> > ---
> >  include/uapi/drm/panfrost_drm.h | 71 +
> >  1 file changed, 71 insertions(+)
> > 
> > diff --git a/include/uapi/drm/panfrost_drm.h 
> > b/include/uapi/drm/panfrost_drm.h
> > index ec19db1eead8..899cd6d952d4 100644
> > --- a/include/uapi/drm/panfrost_drm.h
> > +++ b/include/uapi/drm/panfrost_drm.h
> > @@ -223,6 +223,77 @@ struct drm_panfrost_madvise {
> > __u32 retained;   /* out, whether backing store still exists */
> >  };
> >  
> > +/* The exception types */
> > +
> > +enum drm_panfrost_exception_type {
> > +   DRM_PANFROST_EXCEPTION_OK = 0x00,
> > +   DRM_PANFROST_EXCEPTION_DONE = 0x01,
> > +   DRM_PANFROST_EXCEPTION_INTERRUPTED = 0x02,
> > +   DRM_PANFROST_EXCEPTION_STOPPED = 0x03,
> > +   DRM_PANFROST_EXCEPTION_TERMINATED = 0x04,
> > +   DRM_PANFROST_EXCEPTION_KABOOM = 0x05,
> > +   DRM_PANFROST_EXCEPTION_EUREKA = 0x06,
> > +   DRM_PANFROST_EXCEPTION_ACTIVE = 0x08,
> > +   DRM_PANFROST_EXCEPTION_JOB_CONFIG_FAULT = 0x40,
> > +   DRM_PANFROST_EXCEPTION_JOB_POWER_FAULT = 0x41,
> > +   DRM_PANFROST_EXCEPTION_JOB_READ_FAULT = 0x42,
> > +   DRM_PANFROST_EXCEPTION_JOB_WRITE_FAULT = 0x43,
> > +   DRM_PANFROST_EXCEPTION_JOB_AFFINITY_FAULT = 0x44,
> > +   DRM_PANFROST_EXCEPTION_JOB_BUS_FAULT = 0x48,
> > +   DRM_PANFROST_EXCEPTION_INSTR_INVALID_PC = 0x50,
> > +   DRM_PANFROST_EXCEPTION_INSTR_INVALID_ENC = 0x51,
> > +   DRM_PANFROST_EXCEPTION_INSTR_TYPE_MISMATCH = 0x52,
> > +   DRM_PANFROST_EXCEPTION_INSTR_OPERAND_FAULT = 0x53,
> > +   DRM_PANFROST_EXCEPTION_INSTR_TLS_FAULT = 0x54,
> > +   DRM_PANFROST_EXCEPTION_INSTR_BARRIER_FAULT = 0x55,
> > +   DRM_PANFROST_EXCEPTION_INSTR_ALIGN_FAULT = 0x56,
> > +   DRM_PANFROST_EXCEPTION_DATA_INVALID_FAULT = 0x58,
> > +   DRM_PANFROST_EXCEPTION_TILE_RANGE_FAULT = 0x59,
> > +   DRM_PANFROST_EXCEPTION_ADDR_RANGE_FAULT = 0x5a,
> > +   DRM_PANFROST_EXCEPTION_IMPRECISE_FAULT = 0x5b,
> > +   DRM_PANFROST_EXCEPTION_OOM = 0x60,
> > +   DRM_PANFROST_EXCEPTION_OOM_AFBC = 0x61,
> > +   DRM_PANFROST_EXCEPTION_UNKNOWN = 0x7f,
> > +   DRM_PANFROST_EXCEPTION_DELAYED_BUS_FAULT = 0x80,
> > +   DRM_PANFROST_EXCEPTION_GPU_SHAREABILITY_FAULT = 0x88,
> > +   DRM_PANFROST_EXCEPTION_SYS_SHAREABILITY_FAULT = 0x89,
> > +   DRM_PANFROST_EXCEPTION_GPU_CACHEABILITY_FAULT = 0x8a,
> > +   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_0 = 0xc0,
> > +   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_1 = 0xc1,
> > +   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_2 = 0xc2,
> > +   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_3 = 0xc3,
> > +   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_4 = 0xc4,
> > +   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_IDENTITY = 0xc7,
> > +   DRM_PANFROST_EXCEPTION_PERM_FAULT_0 = 0xc8,
> > +   DRM_PANFROST_EXCEPTION_PERM_FAULT_1 = 0xc9,
> > +   DRM_PANFROST_EXCEPTION_PERM_FAULT_2 = 0xca,
> > +   DRM_PANFROST_EXCEPTION_PERM_FAULT_3 = 0xcb,
> > +   DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_0 = 0xd0,
> > +   DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_1 = 0xd1,
> > +   DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_2 = 0xd2,
> > +   DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_3 = 0xd3,
> > +   DRM_PANFROST_EXCEPTION_ACCESS_FLAG_0 = 0xd8,
> > +   DRM_PANFROST_EXCEPTION_ACCESS_FLAG_1 = 0xd9,
> > +   DRM_PANFROST_EXCEPTION_ACCESS_FLAG_2 = 0xda,
> > +   DRM_PANFROST_EXCEPTION_ACCESS_FLAG_3 = 0xdb,
> > +   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN0 = 0xe0,
> > +   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN1 = 0xe1,
> > +   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN2 = 0xe2,
> > +   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN3 = 0xe3,
> > +   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_OUT0 = 0xe4,
> > +   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_OUT1 = 0xe5,
> > +   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_OUT2 = 0xe6,
> > +   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_OUT3 = 0xe7,
> > +   DRM_PANFROST_EXCEPTION_MEM_ATTR_FAULT_0 = 0xe8,
> > +   DRM_PANFROST_EXCEPTION_MEM_ATTR_FAULT_1 = 0xe9,
> 

Re: [PATCH v3 2/2] drivers/firmware: consolidate EFI framebuffer setup for all arches

2021-06-25 Thread Thomas Zimmermann



On 25.06.21 at 15:13, Javier Martinez Canillas wrote:

The register_gop_device() function registers an "efi-framebuffer" platform
device to match against the efifb driver, to have an early framebuffer for
EFI platforms.

But the Generic System Framebuffers (sysfb) driver already provides
support to do exactly the same. This used to be X86-only, but it has been
moved to drivers/firmware and can be reused by other architectures.

Also, besides supporting the registration of an "efi-framebuffer", this
driver can register a "simple-framebuffer", allowing the simple{fb,drm}
drivers to be used on non-X86 EFI platforms. For example, on aarch64
these drivers can currently only be used with DT, since there is no code
to register a "simple-framebuffer" platform device when booting with EFI.

For these reasons, let's remove the duplicated register_gop_device() code
and instead move the platform-specific logic that's there to the sysfb
driver.

Signed-off-by: Javier Martinez Canillas 
Acked-by: Borislav Petkov 


Acked-by: Thomas Zimmermann 
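
[Editorial note: as background for readers unfamiliar with the simplefb
side being consolidated here, registering a "simple-framebuffer" platform
device boils down to something like the sketch below. Illustrative only;
the mode values and the register_bootfb_sketch() helper are made up and
are not part of this patch.]

#include <linux/err.h>
#include <linux/ioport.h>
#include <linux/platform_device.h>
#include <linux/platform_data/simplefb.h>

static int register_bootfb_sketch(phys_addr_t base, resource_size_t size)
{
	struct simplefb_platform_data pd = {
		.width	= 1024,		/* made-up mode; normally taken	  */
		.height	= 768,		/* from screen_info / GOP data	  */
		.stride	= 1024 * 4,
		.format	= "a8r8g8b8",
	};
	struct resource res = DEFINE_RES_MEM_NAMED(base, size, "BOOTFB");
	struct platform_device *pdev;

	/* simple{fb,drm} bind against the "simple-framebuffer" name. */
	pdev = platform_device_register_resndata(NULL, "simple-framebuffer",
						 0, &res, 1, &pd, sizeof(pd));
	return PTR_ERR_OR_ZERO(pdev);
}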


---

Changes in v3:
- Also update the SYSFB_SIMPLEFB symbol name in drivers/gpu/drm/tiny/Kconfig.
- We have a max 100 char limit now, use it to avoid multi-line statements.
- Figure out the platform device name before allocating the platform device.

Changes in v2:
- Use "depends on" for the supported architectures instead of selecting it.
- Improve commit message to explain the benefits of reusing sysfb for !X86.

  arch/arm/include/asm/efi.h|  5 +-
  arch/arm64/include/asm/efi.h  |  5 +-
  arch/riscv/include/asm/efi.h  |  5 +-
  drivers/firmware/Kconfig  |  8 +--
  drivers/firmware/Makefile |  2 +-
  drivers/firmware/efi/efi-init.c   | 90 ---
  drivers/firmware/efi/sysfb_efi.c  | 76 +-
  drivers/firmware/sysfb.c  | 35 
  drivers/firmware/sysfb_simplefb.c | 31 +++
  drivers/gpu/drm/tiny/Kconfig  |  4 +-
  include/linux/sysfb.h | 26 -
  11 files changed, 143 insertions(+), 144 deletions(-)

diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h
index 9de7ab2ce05d..a6f3b179e8a9 100644
--- a/arch/arm/include/asm/efi.h
+++ b/arch/arm/include/asm/efi.h
@@ -17,6 +17,7 @@
  
  #ifdef CONFIG_EFI

  void efi_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
  
  int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md);

  int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
@@ -52,10 +53,6 @@ void efi_virtmap_unload(void);
  struct screen_info *alloc_screen_info(void);
  void free_screen_info(struct screen_info *si);
  
-static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt)

-{
-}
-
  /*
   * A reasonable upper bound for the uncompressed kernel size is 32 MBytes,
   * so we will reserve that amount of memory. We have no easy way to tell what
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 1bed37eb013a..d3e1825337be 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -14,6 +14,7 @@
  
  #ifdef CONFIG_EFI

  extern void efi_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
  #else
  #define efi_init()
  #endif
@@ -85,10 +86,6 @@ static inline void free_screen_info(struct screen_info *si)
  {
  }
  
-static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt)

-{
-}
-
  #define EFI_ALLOC_ALIGN   SZ_64K
  
  /*

diff --git a/arch/riscv/include/asm/efi.h b/arch/riscv/include/asm/efi.h
index 6d98cd999680..7a8f0d45b13a 100644
--- a/arch/riscv/include/asm/efi.h
+++ b/arch/riscv/include/asm/efi.h
@@ -13,6 +13,7 @@
  
  #ifdef CONFIG_EFI

  extern void efi_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
  #else
  #define efi_init()
  #endif
@@ -39,10 +40,6 @@ static inline void free_screen_info(struct screen_info *si)
  {
  }
  
-static inline void efifb_setup_from_dmi(struct screen_info *si, const char *opt)

-{
-}
-
  void efi_virtmap_load(void);
  void efi_virtmap_unload(void);
  
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig

index 5991071e9d7f..6822727a5e98 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -254,9 +254,9 @@ config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
  config SYSFB
bool
default y
-   depends on X86 || COMPILE_TEST
+   depends on X86 || ARM || ARM64 || RISCV || COMPILE_TEST
  
-config X86_SYSFB

+config SYSFB_SIMPLEFB
bool "Mark VGA/VBE/EFI FB as generic system framebuffer"
depends on SYSFB
help
@@ -264,10 +264,10 @@ config X86_SYSFB
  bootloader or kernel can show basic video-output during boot for
  user-guidance and debugging. Historically, x86 used the VESA BIOS
  Extensions and EFI-framebuffers for this, which are mostly limited
- to x86.
+ to x86 BIOS 

Re: [PATCH v3 08/15] drm/panfrost: Use a threaded IRQ for job interrupts

2021-06-25 Thread Alyssa Rosenzweig
A-b, but could you explain the context? Thanks

On Fri, Jun 25, 2021 at 03:33:20PM +0200, Boris Brezillon wrote:
> This should avoid switching to interrupt context when the GPU is under
> heavy use.
> 
> v3:
> * Don't take the job_lock in panfrost_job_handle_irq()
> 
> Signed-off-by: Boris Brezillon 
> ---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 53 ++---
>  1 file changed, 38 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> b/drivers/gpu/drm/panfrost/panfrost_job.c
> index be8f68f63974..e0c479e67304 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -470,19 +470,12 @@ static const struct drm_sched_backend_ops 
> panfrost_sched_ops = {
>   .free_job = panfrost_job_free
>  };
>  
> -static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
> +static void panfrost_job_handle_irq(struct panfrost_device *pfdev, u32 
> status)
>  {
> - struct panfrost_device *pfdev = data;
> - u32 status = job_read(pfdev, JOB_INT_STAT);
>   int j;
>  
>   dev_dbg(pfdev->dev, "jobslot irq status=%x\n", status);
>  
> - if (!status)
> - return IRQ_NONE;
> -
> - pm_runtime_mark_last_busy(pfdev->dev);
> -
>   for (j = 0; status; j++) {
>   u32 mask = MK_JS_MASK(j);
>  
> @@ -519,7 +512,6 @@ static irqreturn_t panfrost_job_irq_handler(int irq, void 
> *data)
>   if (status & JOB_INT_MASK_DONE(j)) {
>   struct panfrost_job *job;
>  
> - spin_lock(>js->job_lock);
>   job = pfdev->jobs[j];
>   /* Only NULL if job timeout occurred */
>   if (job) {
> @@ -531,21 +523,49 @@ static irqreturn_t panfrost_job_irq_handler(int irq, 
> void *data)
>   dma_fence_signal_locked(job->done_fence);
>   pm_runtime_put_autosuspend(pfdev->dev);
>   }
> - spin_unlock(>js->job_lock);
>   }
>  
>   status &= ~mask;
>   }
> +}
>  
> +static irqreturn_t panfrost_job_irq_handler_thread(int irq, void *data)
> +{
> + struct panfrost_device *pfdev = data;
> + u32 status = job_read(pfdev, JOB_INT_RAWSTAT);
> +
> + while (status) {
> + pm_runtime_mark_last_busy(pfdev->dev);
> +
> + spin_lock(>js->job_lock);
> + panfrost_job_handle_irq(pfdev, status);
> + spin_unlock(>js->job_lock);
> + status = job_read(pfdev, JOB_INT_RAWSTAT);
> + }
> +
> + job_write(pfdev, JOB_INT_MASK,
> +   GENMASK(16 + NUM_JOB_SLOTS - 1, 16) |
> +   GENMASK(NUM_JOB_SLOTS - 1, 0));
>   return IRQ_HANDLED;
>  }
>  
> +static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
> +{
> + struct panfrost_device *pfdev = data;
> + u32 status = job_read(pfdev, JOB_INT_STAT);
> +
> + if (!status)
> + return IRQ_NONE;
> +
> + job_write(pfdev, JOB_INT_MASK, 0);
> + return IRQ_WAKE_THREAD;
> +}
> +
>  static void panfrost_reset(struct work_struct *work)
>  {
>   struct panfrost_device *pfdev = container_of(work,
>struct panfrost_device,
>reset.work);
> - unsigned long flags;
>   unsigned int i;
>   bool cookie;
>  
> @@ -575,7 +595,7 @@ static void panfrost_reset(struct work_struct *work)
>   /* All timers have been stopped, we can safely reset the pending state. 
> */
>   atomic_set(>reset.pending, 0);
>  
> - spin_lock_irqsave(>js->job_lock, flags);
> + spin_lock(>js->job_lock);
>   for (i = 0; i < NUM_JOB_SLOTS; i++) {
>   if (pfdev->jobs[i]) {
>   pm_runtime_put_noidle(pfdev->dev);
> @@ -583,7 +603,7 @@ static void panfrost_reset(struct work_struct *work)
>   pfdev->jobs[i] = NULL;
>   }
>   }
> - spin_unlock_irqrestore(>js->job_lock, flags);
> + spin_unlock(>js->job_lock);
>  
>   panfrost_device_reset(pfdev);
>  
> @@ -610,8 +630,11 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>   if (irq <= 0)
>   return -ENODEV;
>  
> - ret = devm_request_irq(pfdev->dev, irq, panfrost_job_irq_handler,
> -IRQF_SHARED, KBUILD_MODNAME "-job", pfdev);
> + ret = devm_request_threaded_irq(pfdev->dev, irq,
> + panfrost_job_irq_handler,
> + panfrost_job_irq_handler_thread,
> + IRQF_SHARED, KBUILD_MODNAME "-job",
> + pfdev);
>   if (ret) {
>   dev_err(pfdev->dev, "failed to request job irq");
>   return ret;
> -- 
> 2.31.1
> 


Re: [PATCH v3 07/15] drm/panfrost: Expose a helper to trigger a GPU reset

2021-06-25 Thread Alyssa Rosenzweig
R-b


Re: [PATCH v3 02/15] drm/panfrost: Make ->run_job() return an ERR_PTR() when appropriate

2021-06-25 Thread Alyssa Rosenzweig
R-b

On Fri, Jun 25, 2021 at 03:33:14PM +0200, Boris Brezillon wrote:
> If the fence creation fails, we can return the error pointer directly.
> The core will update the fence error accordingly.
> 
> Signed-off-by: Boris Brezillon 
> Reviewed-by: Steven Price 
> ---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 8ff79fd49577..d6c9698bca3b 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -355,7 +355,7 @@ static struct dma_fence *panfrost_job_run(struct 
> drm_sched_job *sched_job)
>  
>   fence = panfrost_fence_create(pfdev, slot);
>   if (IS_ERR(fence))
> - return NULL;
> + return fence;
>  
>   if (job->done_fence)
>   dma_fence_put(job->done_fence);
> -- 
> 2.31.1
> 


Re: [PATCH v3 04/15] drm/panfrost: Drop the pfdev argument passed to panfrost_exception_name()

2021-06-25 Thread Alyssa Rosenzweig
> Currently unused. We'll add it back if we need per-GPU definitions.

We will, if not for Valhall v9, certainly for Valhall v10 with the CSF..


Re: [PATCH v3 14/15] drm/panfrost: Kill in-flight jobs on FD close

2021-06-25 Thread Lucas Stach
On Friday, 2021-06-25 at 15:33 +0200, Boris Brezillon wrote:
> If the process that submitted these jobs decided to close the FD before
> the jobs are done, it probably means it doesn't care about the result.
> 
> v3:
> * Set fence error to ECANCELED when a TERMINATED exception is received
> 
> Signed-off-by: Boris Brezillon 
> ---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 43 +
>  1 file changed, 37 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
> b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 948bd174ff99..aa1e6542adde 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -498,14 +498,21 @@ static void panfrost_job_handle_irq(struct 
> panfrost_device *pfdev, u32 status)
>  
>   if (status & JOB_INT_MASK_ERR(j)) {
>   u32 js_status = job_read(pfdev, JS_STATUS(j));
> + const char *exception_name = 
> panfrost_exception_name(js_status);
>  
>   job_write(pfdev, JS_COMMAND_NEXT(j), JS_COMMAND_NOP);
>  
> - dev_err(pfdev->dev, "js fault, js=%d, status=%s, 
> head=0x%x, tail=0x%x",
> - j,
> - panfrost_exception_name(js_status),
> - job_read(pfdev, JS_HEAD_LO(j)),
> - job_read(pfdev, JS_TAIL_LO(j)));
> + if (js_status < 
> DRM_PANFROST_EXCEPTION_JOB_CONFIG_FAULT) {
> + dev_dbg(pfdev->dev, "js interrupt, js=%d, 
> status=%s, head=0x%x, tail=0x%x",
> + j, exception_name,
> + job_read(pfdev, JS_HEAD_LO(j)),
> + job_read(pfdev, JS_TAIL_LO(j)));
> + } else {
> + dev_err(pfdev->dev, "js fault, js=%d, 
> status=%s, head=0x%x, tail=0x%x",
> + j, exception_name,
> + job_read(pfdev, JS_HEAD_LO(j)),
> + job_read(pfdev, JS_TAIL_LO(j)));
> + }
>  
>   /* If we need a reset, signal it to the timeout
>* handler, otherwise, update the fence error field and
> @@ -514,7 +521,16 @@ static void panfrost_job_handle_irq(struct 
> panfrost_device *pfdev, u32 status)
>   if (panfrost_exception_needs_reset(pfdev, js_status)) {
>   drm_sched_fault(>js->queue[j].sched);
>   } else {
> - dma_fence_set_error(pfdev->jobs[j]->done_fence, 
> -EINVAL);
> + int error = 0;
> +
> + if (js_status == 
> DRM_PANFROST_EXCEPTION_TERMINATED)
> + error = -ECANCELED;
> + else if (js_status >= 
> DRM_PANFROST_EXCEPTION_JOB_CONFIG_FAULT)
> + error = -EINVAL;
> +
> + if (error)
> + 
> dma_fence_set_error(pfdev->jobs[j]->done_fence, error);
> +
>   status |= JOB_INT_MASK_DONE(j);
>   }
>   }
> @@ -673,10 +689,25 @@ int panfrost_job_open(struct panfrost_file_priv 
> *panfrost_priv)
>  
>  void panfrost_job_close(struct panfrost_file_priv *panfrost_priv)
>  {
> + struct panfrost_device *pfdev = panfrost_priv->pfdev;
> + unsigned long flags;
>   int i;
>  
>   for (i = 0; i < NUM_JOB_SLOTS; i++)
>   drm_sched_entity_destroy(_priv->sched_entity[i]);
> +
> + /* Kill in-flight jobs */
> + spin_lock_irqsave(>js->job_lock, flags);

Micro-optimization, but this code is never called from IRQ context, so
a spin_lock_irq would do here, no need to save/restore flags.

Regards,
Lucas

> + for (i = 0; i < NUM_JOB_SLOTS; i++) {
> + struct drm_sched_entity *entity = 
> _priv->sched_entity[i];
> + struct panfrost_job *job = pfdev->jobs[i];
> +
> + if (!job || job->base.entity != entity)
> + continue;
> +
> + job_write(pfdev, JS_COMMAND(i), JS_COMMAND_HARD_STOP);
> + }
> + spin_unlock_irqrestore(>js->job_lock, flags);
>  }
>  
>  int panfrost_job_is_idle(struct panfrost_device *pfdev)




Re: [PATCH v3 05/15] drm/panfrost: Expose exception types to userspace

2021-06-25 Thread Alyssa Rosenzweig
I'm not convinced. Right now most of our UABI is pleasantly
GPU-agnostic. With this suddenly there's divergence between Midgard and
Bifrost uABI. With that drawback in mind, could you explain the benefit?

On Fri, Jun 25, 2021 at 03:33:17PM +0200, Boris Brezillon wrote:
> Job headers contain an exception type field which might be read and
> converted to a human readable string by tracing tools. Let's expose
> the exception type as an enum so we share the same definition.
> 
> v3:
> * Add missing values
> 
> Signed-off-by: Boris Brezillon 
> ---
>  include/uapi/drm/panfrost_drm.h | 71 +
>  1 file changed, 71 insertions(+)
> 
> diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
> index ec19db1eead8..899cd6d952d4 100644
> --- a/include/uapi/drm/panfrost_drm.h
> +++ b/include/uapi/drm/panfrost_drm.h
> @@ -223,6 +223,77 @@ struct drm_panfrost_madvise {
>   __u32 retained;   /* out, whether backing store still exists */
>  };
>  
> +/* The exception types */
> +
> +enum drm_panfrost_exception_type {
> + DRM_PANFROST_EXCEPTION_OK = 0x00,
> + DRM_PANFROST_EXCEPTION_DONE = 0x01,
> + DRM_PANFROST_EXCEPTION_INTERRUPTED = 0x02,
> + DRM_PANFROST_EXCEPTION_STOPPED = 0x03,
> + DRM_PANFROST_EXCEPTION_TERMINATED = 0x04,
> + DRM_PANFROST_EXCEPTION_KABOOM = 0x05,
> + DRM_PANFROST_EXCEPTION_EUREKA = 0x06,
> + DRM_PANFROST_EXCEPTION_ACTIVE = 0x08,
> + DRM_PANFROST_EXCEPTION_JOB_CONFIG_FAULT = 0x40,
> + DRM_PANFROST_EXCEPTION_JOB_POWER_FAULT = 0x41,
> + DRM_PANFROST_EXCEPTION_JOB_READ_FAULT = 0x42,
> + DRM_PANFROST_EXCEPTION_JOB_WRITE_FAULT = 0x43,
> + DRM_PANFROST_EXCEPTION_JOB_AFFINITY_FAULT = 0x44,
> + DRM_PANFROST_EXCEPTION_JOB_BUS_FAULT = 0x48,
> + DRM_PANFROST_EXCEPTION_INSTR_INVALID_PC = 0x50,
> + DRM_PANFROST_EXCEPTION_INSTR_INVALID_ENC = 0x51,
> + DRM_PANFROST_EXCEPTION_INSTR_TYPE_MISMATCH = 0x52,
> + DRM_PANFROST_EXCEPTION_INSTR_OPERAND_FAULT = 0x53,
> + DRM_PANFROST_EXCEPTION_INSTR_TLS_FAULT = 0x54,
> + DRM_PANFROST_EXCEPTION_INSTR_BARRIER_FAULT = 0x55,
> + DRM_PANFROST_EXCEPTION_INSTR_ALIGN_FAULT = 0x56,
> + DRM_PANFROST_EXCEPTION_DATA_INVALID_FAULT = 0x58,
> + DRM_PANFROST_EXCEPTION_TILE_RANGE_FAULT = 0x59,
> + DRM_PANFROST_EXCEPTION_ADDR_RANGE_FAULT = 0x5a,
> + DRM_PANFROST_EXCEPTION_IMPRECISE_FAULT = 0x5b,
> + DRM_PANFROST_EXCEPTION_OOM = 0x60,
> + DRM_PANFROST_EXCEPTION_OOM_AFBC = 0x61,
> + DRM_PANFROST_EXCEPTION_UNKNOWN = 0x7f,
> + DRM_PANFROST_EXCEPTION_DELAYED_BUS_FAULT = 0x80,
> + DRM_PANFROST_EXCEPTION_GPU_SHAREABILITY_FAULT = 0x88,
> + DRM_PANFROST_EXCEPTION_SYS_SHAREABILITY_FAULT = 0x89,
> + DRM_PANFROST_EXCEPTION_GPU_CACHEABILITY_FAULT = 0x8a,
> + DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_0 = 0xc0,
> + DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_1 = 0xc1,
> + DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_2 = 0xc2,
> + DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_3 = 0xc3,
> + DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_4 = 0xc4,
> + DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_IDENTITY = 0xc7,
> + DRM_PANFROST_EXCEPTION_PERM_FAULT_0 = 0xc8,
> + DRM_PANFROST_EXCEPTION_PERM_FAULT_1 = 0xc9,
> + DRM_PANFROST_EXCEPTION_PERM_FAULT_2 = 0xca,
> + DRM_PANFROST_EXCEPTION_PERM_FAULT_3 = 0xcb,
> + DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_0 = 0xd0,
> + DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_1 = 0xd1,
> + DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_2 = 0xd2,
> + DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_3 = 0xd3,
> + DRM_PANFROST_EXCEPTION_ACCESS_FLAG_0 = 0xd8,
> + DRM_PANFROST_EXCEPTION_ACCESS_FLAG_1 = 0xd9,
> + DRM_PANFROST_EXCEPTION_ACCESS_FLAG_2 = 0xda,
> + DRM_PANFROST_EXCEPTION_ACCESS_FLAG_3 = 0xdb,
> + DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN0 = 0xe0,
> + DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN1 = 0xe1,
> + DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN2 = 0xe2,
> + DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN3 = 0xe3,
> + DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_OUT0 = 0xe4,
> + DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_OUT1 = 0xe5,
> + DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_OUT2 = 0xe6,
> + DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_OUT3 = 0xe7,
> + DRM_PANFROST_EXCEPTION_MEM_ATTR_FAULT_0 = 0xe8,
> + DRM_PANFROST_EXCEPTION_MEM_ATTR_FAULT_1 = 0xe9,
> + DRM_PANFROST_EXCEPTION_MEM_ATTR_FAULT_2 = 0xea,
> + DRM_PANFROST_EXCEPTION_MEM_ATTR_FAULT_3 = 0xeb,
> + DRM_PANFROST_EXCEPTION_MEM_ATTR_NONCACHE_0 = 0xec,
> + DRM_PANFROST_EXCEPTION_MEM_ATTR_NONCACHE_1 = 0xed,
> + DRM_PANFROST_EXCEPTION_MEM_ATTR_NONCACHE_2 = 0xee,
> + DRM_PANFROST_EXCEPTION_MEM_ATTR_NONCACHE_3 = 0xef,
> +};
> +
>  #if defined(__cplusplus)
>  }
>  #endif
> -- 
> 2.31.1
> 


Re: [PATCH v3 06/15] drm/panfrost: Do the exception -> string translation using a table

2021-06-25 Thread Alyssa Rosenzweig
R-b


Re: [PATCH v3 03/15] drm/panfrost: Get rid of the unused JS_STATUS_EVENT_ACTIVE definition

2021-06-25 Thread Alyssa Rosenzweig
> Exception types will be defined as an enum in panfrost_drm.h so userspace
> and use the same definitions if needed.

s/and/can/, with that R-b


[PATCH v3 11/15] drm/panfrost: Disable the AS on unhandled page faults

2021-06-25 Thread Boris Brezillon
If we don't do that, we have to wait for the job timeout to expire
before the faulty jobs get killed.

v3:
* Make sure the AS is re-enabled when new jobs are submitted to the
  context

Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/panfrost/panfrost_device.h |  1 +
 drivers/gpu/drm/panfrost/panfrost_mmu.c| 34 --
 2 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
b/drivers/gpu/drm/panfrost/panfrost_device.h
index bfe32907ba6b..efe9a675b614 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -96,6 +96,7 @@ struct panfrost_device {
spinlock_t as_lock;
unsigned long as_in_use_mask;
unsigned long as_alloc_mask;
+   unsigned long as_faulty_mask;
struct list_head as_lru_list;
 
struct panfrost_job_slot *js;
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index b4f0c673cd7f..65e98c51cb66 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -154,6 +154,7 @@ u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, 
struct panfrost_mmu *mmu)
as = mmu->as;
if (as >= 0) {
 	int en = atomic_inc_return(&mmu->as_count);
+   u32 mask = BIT(as) | BIT(16 + as);
 
/*
 * AS can be retained by active jobs or a perfcnt context,
@@ -162,6 +163,18 @@ u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, 
struct panfrost_mmu *mmu)
WARN_ON(en >= (NUM_JOB_SLOTS + 1));
 
 	list_move(&mmu->list, &pfdev->as_lru_list);
+
+   if (pfdev->as_faulty_mask & mask) {
+   /* Unhandled pagefault on this AS, the MMU was
+* disabled. We need to re-enable the MMU after
+* clearing+unmasking the AS interrupts.
+*/
+   mmu_write(pfdev, MMU_INT_CLEAR, mask);
+   mmu_write(pfdev, MMU_INT_MASK, ~pfdev->as_faulty_mask);
+   pfdev->as_faulty_mask &= ~mask;
+   panfrost_mmu_enable(pfdev, mmu);
+   }
+
goto out;
}
 
@@ -211,6 +224,7 @@ void panfrost_mmu_reset(struct panfrost_device *pfdev)
 	spin_lock(&pfdev->as_lock);
 
pfdev->as_alloc_mask = 0;
+   pfdev->as_faulty_mask = 0;
 
 	list_for_each_entry_safe(mmu, mmu_tmp, &pfdev->as_lru_list, list) {
mmu->as = -1;
@@ -662,7 +676,7 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, 
void *data)
if ((status & mask) == BIT(as) && (exception_type & 0xF8) == 
0xC0)
ret = panfrost_mmu_map_fault_addr(pfdev, as, addr);
 
-   if (ret)
+   if (ret) {
/* terminal fault, print info about the fault */
dev_err(pfdev->dev,
"Unhandled Page fault in AS%d at VA 0x%016llX\n"
@@ -680,14 +694,28 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int 
irq, void *data)
access_type, access_type_name(pfdev, 
fault_status),
source_id);
 
+   spin_lock(>as_lock);
+   /* Ignore MMU interrupts on this AS until it's been
+* re-enabled.
+*/
+   pfdev->as_faulty_mask |= mask;
+
+   /* Disable the MMU to kill jobs on this AS. */
+   panfrost_mmu_disable(pfdev, as);
+   spin_unlock(>as_lock);
+   }
+
status &= ~mask;
 
/* If we received new MMU interrupts, process them before 
returning. */
if (!status)
-   status = mmu_read(pfdev, MMU_INT_RAWSTAT);
+   status = mmu_read(pfdev, MMU_INT_RAWSTAT) & 
~pfdev->as_faulty_mask;
}
 
-   mmu_write(pfdev, MMU_INT_MASK, ~0);
+   spin_lock(>as_lock);
+   mmu_write(pfdev, MMU_INT_MASK, ~pfdev->as_faulty_mask);
+   spin_unlock(>as_lock);
+
return IRQ_HANDLED;
 };
 
-- 
2.31.1



[PATCH v3 02/15] drm/panfrost: Make ->run_job() return an ERR_PTR() when appropriate

2021-06-25 Thread Boris Brezillon
If the fence creation fail, we can return the error pointer directly.
The core will update the fence error accordingly.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 8ff79fd49577..d6c9698bca3b 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -355,7 +355,7 @@ static struct dma_fence *panfrost_job_run(struct 
drm_sched_job *sched_job)
 
fence = panfrost_fence_create(pfdev, slot);
if (IS_ERR(fence))
-   return NULL;
+   return fence;
 
if (job->done_fence)
dma_fence_put(job->done_fence);
-- 
2.31.1



[PATCH v3 05/15] drm/panfrost: Expose exception types to userspace

2021-06-25 Thread Boris Brezillon
Job headers contain an exception type field which might be read and
converted to a human-readable string by tracing tools. Let's expose
the exception type as an enum so we share the same definition.

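As an illustration of the intent (a hypothetical userspace snippet, not part
of this patch), a tracing tool could map raw exception codes read from job
headers straight to the uapi names, assuming the installed header is on its
include path:

#include <stdio.h>
#include "panfrost_drm.h"	/* uapi header carrying the enum below */

/* Map a raw exception code from a job header to a printable name
 * (only a few of the enum values are handled here for brevity). */
static const char *exception_name(unsigned int code)
{
	switch (code) {
	case DRM_PANFROST_EXCEPTION_OK:			 return "OK";
	case DRM_PANFROST_EXCEPTION_JOB_BUS_FAULT:	 return "JOB_BUS_FAULT";
	case DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_2: return "TRANSLATION_FAULT_2";
	default:					 return "UNKNOWN";
	}
}

int main(void)
{
	printf("0x48 -> %s\n", exception_name(0x48));	/* JOB_BUS_FAULT */
	return 0;
}
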
v3:
* Add missing values

Signed-off-by: Boris Brezillon 
---
 include/uapi/drm/panfrost_drm.h | 71 +
 1 file changed, 71 insertions(+)

diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index ec19db1eead8..899cd6d952d4 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -223,6 +223,77 @@ struct drm_panfrost_madvise {
__u32 retained;   /* out, whether backing store still exists */
 };
 
+/* The exception types */
+
+enum drm_panfrost_exception_type {
+   DRM_PANFROST_EXCEPTION_OK = 0x00,
+   DRM_PANFROST_EXCEPTION_DONE = 0x01,
+   DRM_PANFROST_EXCEPTION_INTERRUPTED = 0x02,
+   DRM_PANFROST_EXCEPTION_STOPPED = 0x03,
+   DRM_PANFROST_EXCEPTION_TERMINATED = 0x04,
+   DRM_PANFROST_EXCEPTION_KABOOM = 0x05,
+   DRM_PANFROST_EXCEPTION_EUREKA = 0x06,
+   DRM_PANFROST_EXCEPTION_ACTIVE = 0x08,
+   DRM_PANFROST_EXCEPTION_JOB_CONFIG_FAULT = 0x40,
+   DRM_PANFROST_EXCEPTION_JOB_POWER_FAULT = 0x41,
+   DRM_PANFROST_EXCEPTION_JOB_READ_FAULT = 0x42,
+   DRM_PANFROST_EXCEPTION_JOB_WRITE_FAULT = 0x43,
+   DRM_PANFROST_EXCEPTION_JOB_AFFINITY_FAULT = 0x44,
+   DRM_PANFROST_EXCEPTION_JOB_BUS_FAULT = 0x48,
+   DRM_PANFROST_EXCEPTION_INSTR_INVALID_PC = 0x50,
+   DRM_PANFROST_EXCEPTION_INSTR_INVALID_ENC = 0x51,
+   DRM_PANFROST_EXCEPTION_INSTR_TYPE_MISMATCH = 0x52,
+   DRM_PANFROST_EXCEPTION_INSTR_OPERAND_FAULT = 0x53,
+   DRM_PANFROST_EXCEPTION_INSTR_TLS_FAULT = 0x54,
+   DRM_PANFROST_EXCEPTION_INSTR_BARRIER_FAULT = 0x55,
+   DRM_PANFROST_EXCEPTION_INSTR_ALIGN_FAULT = 0x56,
+   DRM_PANFROST_EXCEPTION_DATA_INVALID_FAULT = 0x58,
+   DRM_PANFROST_EXCEPTION_TILE_RANGE_FAULT = 0x59,
+   DRM_PANFROST_EXCEPTION_ADDR_RANGE_FAULT = 0x5a,
+   DRM_PANFROST_EXCEPTION_IMPRECISE_FAULT = 0x5b,
+   DRM_PANFROST_EXCEPTION_OOM = 0x60,
+   DRM_PANFROST_EXCEPTION_OOM_AFBC = 0x61,
+   DRM_PANFROST_EXCEPTION_UNKNOWN = 0x7f,
+   DRM_PANFROST_EXCEPTION_DELAYED_BUS_FAULT = 0x80,
+   DRM_PANFROST_EXCEPTION_GPU_SHAREABILITY_FAULT = 0x88,
+   DRM_PANFROST_EXCEPTION_SYS_SHAREABILITY_FAULT = 0x89,
+   DRM_PANFROST_EXCEPTION_GPU_CACHEABILITY_FAULT = 0x8a,
+   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_0 = 0xc0,
+   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_1 = 0xc1,
+   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_2 = 0xc2,
+   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_3 = 0xc3,
+   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_4 = 0xc4,
+   DRM_PANFROST_EXCEPTION_TRANSLATION_FAULT_IDENTITY = 0xc7,
+   DRM_PANFROST_EXCEPTION_PERM_FAULT_0 = 0xc8,
+   DRM_PANFROST_EXCEPTION_PERM_FAULT_1 = 0xc9,
+   DRM_PANFROST_EXCEPTION_PERM_FAULT_2 = 0xca,
+   DRM_PANFROST_EXCEPTION_PERM_FAULT_3 = 0xcb,
+   DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_0 = 0xd0,
+   DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_1 = 0xd1,
+   DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_2 = 0xd2,
+   DRM_PANFROST_EXCEPTION_TRANSTAB_BUS_FAULT_3 = 0xd3,
+   DRM_PANFROST_EXCEPTION_ACCESS_FLAG_0 = 0xd8,
+   DRM_PANFROST_EXCEPTION_ACCESS_FLAG_1 = 0xd9,
+   DRM_PANFROST_EXCEPTION_ACCESS_FLAG_2 = 0xda,
+   DRM_PANFROST_EXCEPTION_ACCESS_FLAG_3 = 0xdb,
+   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN0 = 0xe0,
+   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN1 = 0xe1,
+   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN2 = 0xe2,
+   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_IN3 = 0xe3,
+   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_OUT0 = 0xe4,
+   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_OUT1 = 0xe5,
+   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_OUT2 = 0xe6,
+   DRM_PANFROST_EXCEPTION_ADDR_SIZE_FAULT_OUT3 = 0xe7,
+   DRM_PANFROST_EXCEPTION_MEM_ATTR_FAULT_0 = 0xe8,
+   DRM_PANFROST_EXCEPTION_MEM_ATTR_FAULT_1 = 0xe9,
+   DRM_PANFROST_EXCEPTION_MEM_ATTR_FAULT_2 = 0xea,
+   DRM_PANFROST_EXCEPTION_MEM_ATTR_FAULT_3 = 0xeb,
+   DRM_PANFROST_EXCEPTION_MEM_ATTR_NONCACHE_0 = 0xec,
+   DRM_PANFROST_EXCEPTION_MEM_ATTR_NONCACHE_1 = 0xed,
+   DRM_PANFROST_EXCEPTION_MEM_ATTR_NONCACHE_2 = 0xee,
+   DRM_PANFROST_EXCEPTION_MEM_ATTR_NONCACHE_3 = 0xef,
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.31.1



[PATCH v3 15/15] drm/panfrost: Queue jobs on the hardware

2021-06-25 Thread Boris Brezillon
From: Steven Price 

The hardware has a set of '_NEXT' registers that can hold a second job
while the first is executing. Make use of these registers to enqueue a
second job per slot.

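In short (a condensed sketch of what the diff below implements, not new
code): each slot keeps a two-entry software queue mirroring the running job
and the _NEXT registers, and on GPUs with jobchain disambiguation the parity
of the fence seqno picks the JOB_CHAIN_FLAG so the two queued jobs can be
told apart when stopping or completing them:

/* Per-slot, two-deep queue: [0] is the running job, [1] sits in _NEXT. */
struct panfrost_job *jobs[NUM_JOB_SLOTS][2];

/* Submission enqueues into the first free sub-slot... */
subslot = panfrost_enqueue_job(pfdev, js, job);

/* ...and jobs sharing a slot must carry opposite chain flags. */
cfg |= panfrost_get_job_chain_flag(job);  /* JS_CONFIG_JOB_CHAIN_FLAG or 0 */
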
v3:
* Fix the done/err job dequeuing logic to get a valid active state
* Only enable the second slot on GPUs supporting jobchain disambiguation
* Split interrupt handling in sub-functions

Signed-off-by: Steven Price 
Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/panfrost/panfrost_device.h |   2 +-
 drivers/gpu/drm/panfrost/panfrost_job.c| 473 -
 2 files changed, 357 insertions(+), 118 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
b/drivers/gpu/drm/panfrost/panfrost_device.h
index ecbc79ad0006..65a7b9b08f3a 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -101,7 +101,7 @@ struct panfrost_device {
 
struct panfrost_job_slot *js;
 
-   struct panfrost_job *jobs[NUM_JOB_SLOTS];
+   struct panfrost_job *jobs[NUM_JOB_SLOTS][2];
struct list_head scheduled_jobs;
 
struct panfrost_perfcnt *perfcnt;
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index aa1e6542adde..0d0011cbe864 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -140,9 +141,52 @@ static void panfrost_job_write_affinity(struct 
panfrost_device *pfdev,
job_write(pfdev, JS_AFFINITY_NEXT_HI(js), affinity >> 32);
 }
 
+static u32
+panfrost_get_job_chain_flag(const struct panfrost_job *job)
+{
+   struct panfrost_fence *f = to_panfrost_fence(job->done_fence);
+
+   if (!panfrost_has_hw_feature(job->pfdev, 
HW_FEATURE_JOBCHAIN_DISAMBIGUATION))
+   return 0;
+
+   return (f->seqno & 1) ? JS_CONFIG_JOB_CHAIN_FLAG : 0;
+}
+
+static struct panfrost_job *
+panfrost_dequeue_job(struct panfrost_device *pfdev, int slot)
+{
+   struct panfrost_job *job = pfdev->jobs[slot][0];
+
+   WARN_ON(!job);
+   pfdev->jobs[slot][0] = pfdev->jobs[slot][1];
+   pfdev->jobs[slot][1] = NULL;
+
+   return job;
+}
+
+static unsigned int
+panfrost_enqueue_job(struct panfrost_device *pfdev, int slot,
+struct panfrost_job *job)
+{
+   if (WARN_ON(!job))
+   return 0;
+
+   if (!pfdev->jobs[slot][0]) {
+   pfdev->jobs[slot][0] = job;
+   return 0;
+   }
+
+   WARN_ON(pfdev->jobs[slot][1]);
+   pfdev->jobs[slot][1] = job;
+   WARN_ON(panfrost_get_job_chain_flag(job) ==
+   panfrost_get_job_chain_flag(pfdev->jobs[slot][0]));
+   return 1;
+}
+
 static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
 {
struct panfrost_device *pfdev = job->pfdev;
+   unsigned int subslot;
u32 cfg;
u64 jc_head = job->jc;
int ret;
@@ -168,7 +212,8 @@ static void panfrost_job_hw_submit(struct panfrost_job 
*job, int js)
 * start */
cfg |= JS_CONFIG_THREAD_PRI(8) |
JS_CONFIG_START_FLUSH_CLEAN_INVALIDATE |
-   JS_CONFIG_END_FLUSH_CLEAN_INVALIDATE;
+   JS_CONFIG_END_FLUSH_CLEAN_INVALIDATE |
+   panfrost_get_job_chain_flag(job);
 
if (panfrost_has_hw_feature(pfdev, HW_FEATURE_FLUSH_REDUCTION))
cfg |= JS_CONFIG_ENABLE_FLUSH_REDUCTION;
@@ -182,10 +227,17 @@ static void panfrost_job_hw_submit(struct panfrost_job 
*job, int js)
job_write(pfdev, JS_FLUSH_ID_NEXT(js), job->flush_id);
 
/* GO ! */
-   dev_dbg(pfdev->dev, "JS: Submitting atom %p to js[%d] with head=0x%llx",
-   job, js, jc_head);
 
-   job_write(pfdev, JS_COMMAND_NEXT(js), JS_COMMAND_START);
+   spin_lock(>js->job_lock);
+   subslot = panfrost_enqueue_job(pfdev, js, job);
+   /* Don't queue the job if a reset is in progress */
+   if (!atomic_read(>reset.pending)) {
+   job_write(pfdev, JS_COMMAND_NEXT(js), JS_COMMAND_START);
+   dev_dbg(pfdev->dev,
+   "JS: Submitting atom %p to js[%d][%d] with head=0x%llx 
AS %d",
+   job, js, subslot, jc_head, cfg & 0xf);
+   }
+   spin_unlock(>js->job_lock);
 }
 
 static void panfrost_acquire_object_fences(struct drm_gem_object **bos,
@@ -343,7 +395,11 @@ static struct dma_fence *panfrost_job_run(struct 
drm_sched_job *sched_job)
if (unlikely(job->base.s_fence->finished.error))
return NULL;
 
-   pfdev->jobs[slot] = job;
+   /* Nothing to execute: can happen if the job has finished while
+* we were resetting the GPU.
+*/
+   if (!job->jc)
+   return NULL;
 
fence = panfrost_fence_create(pfdev, slot);
if (IS_ERR(fence))
@@ -371,11 +427,218 @@ void panfrost_job_enable_interrupts(struct 

[PATCH v3 06/15] drm/panfrost: Do the exception -> string translation using a table

2021-06-25 Thread Boris Brezillon
Do the exception -> string translation using a table. This way we get
rid of those magic numbers and can easily add new fields if we need
to attach extra information to exception types.

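The lookup side (not visible in the truncated diff below) then reduces to a
bounds-checked array access; roughly, as a sketch (the exact fallback string
may differ):

const char *panfrost_exception_name(u32 exception_code)
{
	/* Guard against codes beyond the table or holes left unnamed. */
	if (WARN_ON(exception_code >= ARRAY_SIZE(panfrost_exception_infos)) ||
	    !panfrost_exception_infos[exception_code].name)
		return "UNKNOWN";

	return panfrost_exception_infos[exception_code].name;
}
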
v3:
* Drop the error field

Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/panfrost/panfrost_device.c | 130 +
 1 file changed, 83 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c 
b/drivers/gpu/drm/panfrost/panfrost_device.c
index bce6b0aff05e..736854542b05 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.c
+++ b/drivers/gpu/drm/panfrost/panfrost_device.c
@@ -292,55 +292,91 @@ void panfrost_device_fini(struct panfrost_device *pfdev)
panfrost_clk_fini(pfdev);
 }
 
-const char *panfrost_exception_name(u32 exception_code)
-{
-   switch (exception_code) {
-   /* Non-Fault Status code */
-   case 0x00: return "NOT_STARTED/IDLE/OK";
-   case 0x01: return "DONE";
-   case 0x02: return "INTERRUPTED";
-   case 0x03: return "STOPPED";
-   case 0x04: return "TERMINATED";
-   case 0x08: return "ACTIVE";
-   /* Job exceptions */
-   case 0x40: return "JOB_CONFIG_FAULT";
-   case 0x41: return "JOB_POWER_FAULT";
-   case 0x42: return "JOB_READ_FAULT";
-   case 0x43: return "JOB_WRITE_FAULT";
-   case 0x44: return "JOB_AFFINITY_FAULT";
-   case 0x48: return "JOB_BUS_FAULT";
-   case 0x50: return "INSTR_INVALID_PC";
-   case 0x51: return "INSTR_INVALID_ENC";
-   case 0x52: return "INSTR_TYPE_MISMATCH";
-   case 0x53: return "INSTR_OPERAND_FAULT";
-   case 0x54: return "INSTR_TLS_FAULT";
-   case 0x55: return "INSTR_BARRIER_FAULT";
-   case 0x56: return "INSTR_ALIGN_FAULT";
-   case 0x58: return "DATA_INVALID_FAULT";
-   case 0x59: return "TILE_RANGE_FAULT";
-   case 0x5A: return "ADDR_RANGE_FAULT";
-   case 0x60: return "OUT_OF_MEMORY";
-   /* GPU exceptions */
-   case 0x80: return "DELAYED_BUS_FAULT";
-   case 0x88: return "SHAREABILITY_FAULT";
-   /* MMU exceptions */
-   case 0xC1: return "TRANSLATION_FAULT_LEVEL1";
-   case 0xC2: return "TRANSLATION_FAULT_LEVEL2";
-   case 0xC3: return "TRANSLATION_FAULT_LEVEL3";
-   case 0xC4: return "TRANSLATION_FAULT_LEVEL4";
-   case 0xC8: return "PERMISSION_FAULT";
-   case 0xC9 ... 0xCF: return "PERMISSION_FAULT";
-   case 0xD1: return "TRANSTAB_BUS_FAULT_LEVEL1";
-   case 0xD2: return "TRANSTAB_BUS_FAULT_LEVEL2";
-   case 0xD3: return "TRANSTAB_BUS_FAULT_LEVEL3";
-   case 0xD4: return "TRANSTAB_BUS_FAULT_LEVEL4";
-   case 0xD8: return "ACCESS_FLAG";
-   case 0xD9 ... 0xDF: return "ACCESS_FLAG";
-   case 0xE0 ... 0xE7: return "ADDRESS_SIZE_FAULT";
-   case 0xE8 ... 0xEF: return "MEMORY_ATTRIBUTES_FAULT";
+#define PANFROST_EXCEPTION(id) \
+   [DRM_PANFROST_EXCEPTION_ ## id] = { \
+   .name = #id, \
}
 
-   return "UNKNOWN";
+struct panfrost_exception_info {
+   const char *name;
+};
+
+static const struct panfrost_exception_info panfrost_exception_infos[] = {
+   PANFROST_EXCEPTION(OK),
+   PANFROST_EXCEPTION(DONE),
+   PANFROST_EXCEPTION(INTERRUPTED),
+   PANFROST_EXCEPTION(STOPPED),
+   PANFROST_EXCEPTION(TERMINATED),
+   PANFROST_EXCEPTION(KABOOM),
+   PANFROST_EXCEPTION(EUREKA),
+   PANFROST_EXCEPTION(ACTIVE),
+   PANFROST_EXCEPTION(JOB_CONFIG_FAULT),
+   PANFROST_EXCEPTION(JOB_POWER_FAULT),
+   PANFROST_EXCEPTION(JOB_READ_FAULT),
+   PANFROST_EXCEPTION(JOB_WRITE_FAULT),
+   PANFROST_EXCEPTION(JOB_AFFINITY_FAULT),
+   PANFROST_EXCEPTION(JOB_BUS_FAULT),
+   PANFROST_EXCEPTION(INSTR_INVALID_PC),
+   PANFROST_EXCEPTION(INSTR_INVALID_ENC),
+   PANFROST_EXCEPTION(INSTR_TYPE_MISMATCH),
+   PANFROST_EXCEPTION(INSTR_OPERAND_FAULT),
+   PANFROST_EXCEPTION(INSTR_TLS_FAULT),
+   PANFROST_EXCEPTION(INSTR_BARRIER_FAULT),
+   PANFROST_EXCEPTION(INSTR_ALIGN_FAULT),
+   PANFROST_EXCEPTION(DATA_INVALID_FAULT),
+   PANFROST_EXCEPTION(TILE_RANGE_FAULT),
+   PANFROST_EXCEPTION(ADDR_RANGE_FAULT),
+   PANFROST_EXCEPTION(IMPRECISE_FAULT),
+   PANFROST_EXCEPTION(OOM),
+   PANFROST_EXCEPTION(OOM_AFBC),
+   PANFROST_EXCEPTION(UNKNOWN),
+   PANFROST_EXCEPTION(DELAYED_BUS_FAULT),
+   PANFROST_EXCEPTION(GPU_SHAREABILITY_FAULT),
+   PANFROST_EXCEPTION(SYS_SHAREABILITY_FAULT),
+   PANFROST_EXCEPTION(GPU_CACHEABILITY_FAULT),
+   PANFROST_EXCEPTION(TRANSLATION_FAULT_0),
+   PANFROST_EXCEPTION(TRANSLATION_FAULT_1),
+   PANFROST_EXCEPTION(TRANSLATION_FAULT_2),
+   PANFROST_EXCEPTION(TRANSLATION_FAULT_3),
+   PANFROST_EXCEPTION(TRANSLATION_FAULT_4),
+   PANFROST_EXCEPTION(TRANSLATION_FAULT_IDENTITY),
+   PANFROST_EXCEPTION(PERM_FAULT_0),
+   PANFROST_EXCEPTION(PERM_FAULT_1),
+   PANFROST_EXCEPTION(PERM_FAULT_2),
+   PANFROST_EXCEPTION(PERM_FAULT_3),

[PATCH v3 13/15] drm/panfrost: Don't reset the GPU on job faults unless we really have to

2021-06-25 Thread Boris Brezillon
If we can recover from a fault without a reset, there's no reason to
issue one.

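Should a future GPU revision turn out to need a reset for some job
exceptions, the helper introduced below is where that knowledge would live.
A purely hypothetical shape (the HW_ISSUE_* name is invented for
illustration):

bool panfrost_exception_needs_reset(const struct panfrost_device *pfdev,
				    u32 exception_code)
{
	/* Hypothetical: only reset on bus faults on affected revisions. */
	if (panfrost_has_hw_issue(pfdev, HW_ISSUE_RESET_ON_BUS_FAULT))
		return exception_code == DRM_PANFROST_EXCEPTION_JOB_BUS_FAULT;

	return false;
}
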
v3:
* Drop the mention of Valhall requiring a reset on JOB_BUS_FAULT
* Set the fence error to -EINVAL instead of having per-exception
  error codes

Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/panfrost/panfrost_device.c |  9 +
 drivers/gpu/drm/panfrost/panfrost_device.h |  2 ++
 drivers/gpu/drm/panfrost/panfrost_job.c| 16 ++--
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c 
b/drivers/gpu/drm/panfrost/panfrost_device.c
index 736854542b05..f4e42009526d 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.c
+++ b/drivers/gpu/drm/panfrost/panfrost_device.c
@@ -379,6 +379,15 @@ const char *panfrost_exception_name(u32 exception_code)
return panfrost_exception_infos[exception_code].name;
 }
 
+bool panfrost_exception_needs_reset(const struct panfrost_device *pfdev,
+   u32 exception_code)
+{
+   /* Right now, none of the GPU we support need a reset, but this
+* might change.
+*/
+   return false;
+}
+
 void panfrost_device_reset(struct panfrost_device *pfdev)
 {
panfrost_gpu_soft_reset(pfdev);
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
b/drivers/gpu/drm/panfrost/panfrost_device.h
index efe9a675b614..ecbc79ad0006 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -175,6 +175,8 @@ int panfrost_device_resume(struct device *dev);
 int panfrost_device_suspend(struct device *dev);
 
 const char *panfrost_exception_name(u32 exception_code);
+bool panfrost_exception_needs_reset(const struct panfrost_device *pfdev,
+   u32 exception_code);
 
 static inline void
 panfrost_device_schedule_reset(struct panfrost_device *pfdev)
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 0566e2f7e84a..948bd174ff99 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -497,14 +497,26 @@ static void panfrost_job_handle_irq(struct 
panfrost_device *pfdev, u32 status)
job_write(pfdev, JOB_INT_CLEAR, mask);
 
if (status & JOB_INT_MASK_ERR(j)) {
+   u32 js_status = job_read(pfdev, JS_STATUS(j));
+
job_write(pfdev, JS_COMMAND_NEXT(j), JS_COMMAND_NOP);
 
dev_err(pfdev->dev, "js fault, js=%d, status=%s, 
head=0x%x, tail=0x%x",
j,
-   panfrost_exception_name(job_read(pfdev, 
JS_STATUS(j))),
+   panfrost_exception_name(js_status),
job_read(pfdev, JS_HEAD_LO(j)),
job_read(pfdev, JS_TAIL_LO(j)));
-   drm_sched_fault(>js->queue[j].sched);
+
+   /* If we need a reset, signal it to the timeout
+* handler, otherwise, update the fence error field and
+* signal the job fence.
+*/
+   if (panfrost_exception_needs_reset(pfdev, js_status)) {
+   drm_sched_fault(>js->queue[j].sched);
+   } else {
+   dma_fence_set_error(pfdev->jobs[j]->done_fence, 
-EINVAL);
+   status |= JOB_INT_MASK_DONE(j);
+   }
}
 
if (status & JOB_INT_MASK_DONE(j)) {
-- 
2.31.1



[PATCH v3 10/15] drm/panfrost: Make sure job interrupts are masked before resetting

2021-06-25 Thread Boris Brezillon
This is not yet needed because we let active jobs be killed by the
reset and we don't really bother making sure they can be restarted.
But once we start adding soft-stop support, controlling when we deal
with the remaining interrupts and making sure those are handled before
the reset is issued gets tricky if we keep job interrupts active.

Let's prepare for that and mask+flush job IRQs before issuing a reset.

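The quiesce sequence at the heart of the diff below, in isolation (sketch):

/* 1. Stop new job interrupts from reaching the CPU. */
job_write(pfdev, JOB_INT_MASK, 0);

/* 2. Wait for any handler already running on another CPU to finish,
 *    so nothing races with us while we evict pfdev->jobs[]. */
synchronize_irq(pfdev->js->irq);

/* 3. The reset path now owns the job state; no job_lock needed here. */
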
Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/panfrost/panfrost_job.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 88d34fd781e8..0566e2f7e84a 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -34,6 +34,7 @@ struct panfrost_queue_state {
 struct panfrost_job_slot {
struct panfrost_queue_state queue[NUM_JOB_SLOTS];
spinlock_t job_lock;
+   int irq;
 };
 
 static struct panfrost_job *
@@ -400,7 +401,15 @@ static void panfrost_reset(struct panfrost_device *pfdev,
if (bad)
drm_sched_increase_karma(bad);
 
-   spin_lock(>js->job_lock);
+   /* Mask job interrupts and synchronize to make sure we won't be
+* interrupted during our reset.
+*/
+   job_write(pfdev, JOB_INT_MASK, 0);
+   synchronize_irq(pfdev->js->irq);
+
+   /* Schedulers are stopped and interrupts are masked+flushed, we don't
+* need to protect the 'evict unfinished jobs' lock with the job_lock.
+*/
for (i = 0; i < NUM_JOB_SLOTS; i++) {
if (pfdev->jobs[i]) {
pm_runtime_put_noidle(pfdev->dev);
@@ -408,7 +417,6 @@ static void panfrost_reset(struct panfrost_device *pfdev,
pfdev->jobs[i] = NULL;
}
}
-   spin_unlock(>js->job_lock);
 
panfrost_device_reset(pfdev);
 
@@ -504,6 +512,7 @@ static void panfrost_job_handle_irq(struct panfrost_device 
*pfdev, u32 status)
 
job = pfdev->jobs[j];
/* Only NULL if job timeout occurred */
+   WARN_ON(!job);
if (job) {
pfdev->jobs[j] = NULL;
 
@@ -563,7 +572,7 @@ static void panfrost_reset_work(struct work_struct *work)
 int panfrost_job_init(struct panfrost_device *pfdev)
 {
struct panfrost_job_slot *js;
-   int ret, j, irq;
+   int ret, j;
 
INIT_WORK(>reset.work, panfrost_reset_work);
 
@@ -573,11 +582,11 @@ int panfrost_job_init(struct panfrost_device *pfdev)
 
spin_lock_init(>job_lock);
 
-   irq = platform_get_irq_byname(to_platform_device(pfdev->dev), "job");
-   if (irq <= 0)
+   js->irq = platform_get_irq_byname(to_platform_device(pfdev->dev), 
"job");
+   if (js->irq <= 0)
return -ENODEV;
 
-   ret = devm_request_threaded_irq(pfdev->dev, irq,
+   ret = devm_request_threaded_irq(pfdev->dev, js->irq,
panfrost_job_irq_handler,
panfrost_job_irq_handler_thread,
IRQF_SHARED, KBUILD_MODNAME "-job",
-- 
2.31.1



[PATCH v3 08/15] drm/panfrost: Use a threaded IRQ for job interrupts

2021-06-25 Thread Boris Brezillon
This should avoid switching to interrupt context when the GPU is under
heavy use.

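The split follows the usual threaded-IRQ pattern; condensed from the diff
below, the primary handler only quiesces the line and the thread does the
real work:

/* Hard handler: runs in interrupt context, keep it minimal. */
static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
{
	struct panfrost_device *pfdev = data;

	if (!job_read(pfdev, JOB_INT_STAT))
		return IRQ_NONE;

	job_write(pfdev, JOB_INT_MASK, 0);	/* silence the line           */
	return IRQ_WAKE_THREAD;			/* defer to the threaded half */
}

/* The threaded handler then loops on JOB_INT_RAWSTAT under js->job_lock
 * and restores JOB_INT_MASK before returning IRQ_HANDLED. */
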
v3:
* Don't take the job_lock in panfrost_job_handle_irq()

Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/panfrost/panfrost_job.c | 53 ++---
 1 file changed, 38 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index be8f68f63974..e0c479e67304 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -470,19 +470,12 @@ static const struct drm_sched_backend_ops 
panfrost_sched_ops = {
.free_job = panfrost_job_free
 };
 
-static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
+static void panfrost_job_handle_irq(struct panfrost_device *pfdev, u32 status)
 {
-   struct panfrost_device *pfdev = data;
-   u32 status = job_read(pfdev, JOB_INT_STAT);
int j;
 
dev_dbg(pfdev->dev, "jobslot irq status=%x\n", status);
 
-   if (!status)
-   return IRQ_NONE;
-
-   pm_runtime_mark_last_busy(pfdev->dev);
-
for (j = 0; status; j++) {
u32 mask = MK_JS_MASK(j);
 
@@ -519,7 +512,6 @@ static irqreturn_t panfrost_job_irq_handler(int irq, void 
*data)
if (status & JOB_INT_MASK_DONE(j)) {
struct panfrost_job *job;
 
-   spin_lock(>js->job_lock);
job = pfdev->jobs[j];
/* Only NULL if job timeout occurred */
if (job) {
@@ -531,21 +523,49 @@ static irqreturn_t panfrost_job_irq_handler(int irq, void 
*data)
dma_fence_signal_locked(job->done_fence);
pm_runtime_put_autosuspend(pfdev->dev);
}
-   spin_unlock(>js->job_lock);
}
 
status &= ~mask;
}
+}
 
+static irqreturn_t panfrost_job_irq_handler_thread(int irq, void *data)
+{
+   struct panfrost_device *pfdev = data;
+   u32 status = job_read(pfdev, JOB_INT_RAWSTAT);
+
+   while (status) {
+   pm_runtime_mark_last_busy(pfdev->dev);
+
+   spin_lock(>js->job_lock);
+   panfrost_job_handle_irq(pfdev, status);
+   spin_unlock(>js->job_lock);
+   status = job_read(pfdev, JOB_INT_RAWSTAT);
+   }
+
+   job_write(pfdev, JOB_INT_MASK,
+ GENMASK(16 + NUM_JOB_SLOTS - 1, 16) |
+ GENMASK(NUM_JOB_SLOTS - 1, 0));
return IRQ_HANDLED;
 }
 
+static irqreturn_t panfrost_job_irq_handler(int irq, void *data)
+{
+   struct panfrost_device *pfdev = data;
+   u32 status = job_read(pfdev, JOB_INT_STAT);
+
+   if (!status)
+   return IRQ_NONE;
+
+   job_write(pfdev, JOB_INT_MASK, 0);
+   return IRQ_WAKE_THREAD;
+}
+
 static void panfrost_reset(struct work_struct *work)
 {
struct panfrost_device *pfdev = container_of(work,
 struct panfrost_device,
 reset.work);
-   unsigned long flags;
unsigned int i;
bool cookie;
 
@@ -575,7 +595,7 @@ static void panfrost_reset(struct work_struct *work)
/* All timers have been stopped, we can safely reset the pending state. 
*/
atomic_set(>reset.pending, 0);
 
-   spin_lock_irqsave(>js->job_lock, flags);
+   spin_lock(>js->job_lock);
for (i = 0; i < NUM_JOB_SLOTS; i++) {
if (pfdev->jobs[i]) {
pm_runtime_put_noidle(pfdev->dev);
@@ -583,7 +603,7 @@ static void panfrost_reset(struct work_struct *work)
pfdev->jobs[i] = NULL;
}
}
-   spin_unlock_irqrestore(>js->job_lock, flags);
+   spin_unlock(>js->job_lock);
 
panfrost_device_reset(pfdev);
 
@@ -610,8 +630,11 @@ int panfrost_job_init(struct panfrost_device *pfdev)
if (irq <= 0)
return -ENODEV;
 
-   ret = devm_request_irq(pfdev->dev, irq, panfrost_job_irq_handler,
-  IRQF_SHARED, KBUILD_MODNAME "-job", pfdev);
+   ret = devm_request_threaded_irq(pfdev->dev, irq,
+   panfrost_job_irq_handler,
+   panfrost_job_irq_handler_thread,
+   IRQF_SHARED, KBUILD_MODNAME "-job",
+   pfdev);
if (ret) {
dev_err(pfdev->dev, "failed to request job irq");
return ret;
-- 
2.31.1



[PATCH v3 12/15] drm/panfrost: Reset the GPU when the AS_ACTIVE bit is stuck

2021-06-25 Thread Boris Brezillon
Things are unlikely to resolve themselves until we reset the GPU. Let's not
wait for other faults/timeouts to happen before triggering this reset.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 65e98c51cb66..5267c3a1f02f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -36,8 +36,11 @@ static int wait_ready(struct panfrost_device *pfdev, u32 
as_nr)
ret = readl_relaxed_poll_timeout_atomic(pfdev->iomem + AS_STATUS(as_nr),
val, !(val & AS_STATUS_AS_ACTIVE), 10, 1000);
 
-   if (ret)
+   if (ret) {
+   /* The GPU hung, let's trigger a reset */
+   panfrost_device_schedule_reset(pfdev);
dev_err(pfdev->dev, "AS_ACTIVE bit stuck\n");
+   }
 
return ret;
 }
-- 
2.31.1



[PATCH v3 04/15] drm/panfrost: Drop the pfdev argument passed to panfrost_exception_name()

2021-06-25 Thread Boris Brezillon
Currently unused. We'll add it back if we need per-GPU definitions.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_device.c | 2 +-
 drivers/gpu/drm/panfrost/panfrost_device.h | 2 +-
 drivers/gpu/drm/panfrost/panfrost_gpu.c| 2 +-
 drivers/gpu/drm/panfrost/panfrost_job.c| 2 +-
 drivers/gpu/drm/panfrost/panfrost_mmu.c| 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c 
b/drivers/gpu/drm/panfrost/panfrost_device.c
index fbcf5edbe367..bce6b0aff05e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.c
+++ b/drivers/gpu/drm/panfrost/panfrost_device.c
@@ -292,7 +292,7 @@ void panfrost_device_fini(struct panfrost_device *pfdev)
panfrost_clk_fini(pfdev);
 }
 
-const char *panfrost_exception_name(struct panfrost_device *pfdev, u32 
exception_code)
+const char *panfrost_exception_name(u32 exception_code)
 {
switch (exception_code) {
/* Non-Fault Status code */
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
b/drivers/gpu/drm/panfrost/panfrost_device.h
index 4c6bdea5537b..ade8a1974ee9 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -172,6 +172,6 @@ void panfrost_device_reset(struct panfrost_device *pfdev);
 int panfrost_device_resume(struct device *dev);
 int panfrost_device_suspend(struct device *dev);
 
-const char *panfrost_exception_name(struct panfrost_device *pfdev, u32 
exception_code);
+const char *panfrost_exception_name(u32 exception_code);
 
 #endif
diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c 
b/drivers/gpu/drm/panfrost/panfrost_gpu.c
index 2aae636f1cf5..ec59f15940fb 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
@@ -33,7 +33,7 @@ static irqreturn_t panfrost_gpu_irq_handler(int irq, void 
*data)
address |= gpu_read(pfdev, GPU_FAULT_ADDRESS_LO);
 
dev_warn(pfdev->dev, "GPU Fault 0x%08x (%s) at 0x%016llx\n",
-fault_status & 0xFF, panfrost_exception_name(pfdev, 
fault_status),
+fault_status & 0xFF, 
panfrost_exception_name(fault_status),
 address);
 
if (state & GPU_IRQ_MULTIPLE_FAULT)
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index d6c9698bca3b..3cd1aec6c261 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -500,7 +500,7 @@ static irqreturn_t panfrost_job_irq_handler(int irq, void 
*data)
 
dev_err(pfdev->dev, "js fault, js=%d, status=%s, 
head=0x%x, tail=0x%x",
j,
-   panfrost_exception_name(pfdev, job_read(pfdev, 
JS_STATUS(j))),
+   panfrost_exception_name(job_read(pfdev, 
JS_STATUS(j))),
job_read(pfdev, JS_HEAD_LO(j)),
job_read(pfdev, JS_TAIL_LO(j)));
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c 
b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index d76dff201ea6..b4f0c673cd7f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -676,7 +676,7 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, 
void *data)
"TODO",
fault_status,
(fault_status & (1 << 10) ? "DECODER FAULT" : 
"SLAVE FAULT"),
-   exception_type, panfrost_exception_name(pfdev, 
exception_type),
+   exception_type, 
panfrost_exception_name(exception_type),
access_type, access_type_name(pfdev, 
fault_status),
source_id);
 
-- 
2.31.1



[PATCH v3 03/15] drm/panfrost: Get rid of the unused JS_STATUS_EVENT_ACTIVE definition

2021-06-25 Thread Boris Brezillon
Exception types will be defined as an enum in panfrost_drm.h so userspace
and use the same definitions if needed.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_regs.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h 
b/drivers/gpu/drm/panfrost/panfrost_regs.h
index eddaa62ad8b0..151cfebd80a0 100644
--- a/drivers/gpu/drm/panfrost/panfrost_regs.h
+++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
@@ -261,9 +261,6 @@
 #define JS_COMMAND_SOFT_STOP_1 0x06/* Execute SOFT_STOP if 
JOB_CHAIN_FLAG is 1 */
 #define JS_COMMAND_HARD_STOP_1 0x07/* Execute HARD_STOP if 
JOB_CHAIN_FLAG is 1 */
 
-#define JS_STATUS_EVENT_ACTIVE 0x08
-
-
 /* MMU regs */
 #define MMU_INT_RAWSTAT0x2000
 #define MMU_INT_CLEAR  0x2004
-- 
2.31.1



[PATCH v3 07/15] drm/panfrost: Expose a helper to trigger a GPU reset

2021-06-25 Thread Boris Brezillon
Expose a helper to trigger a GPU reset so we can easily trigger reset
operations outside the job timeout handler.

Signed-off-by: Boris Brezillon 
Reviewed-by: Steven Price 
---
 drivers/gpu/drm/panfrost/panfrost_device.h | 8 
 drivers/gpu/drm/panfrost/panfrost_job.c| 4 +---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
b/drivers/gpu/drm/panfrost/panfrost_device.h
index ade8a1974ee9..6024eaf34ba0 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -174,4 +174,12 @@ int panfrost_device_suspend(struct device *dev);
 
 const char *panfrost_exception_name(u32 exception_code);
 
+static inline void
+panfrost_device_schedule_reset(struct panfrost_device *pfdev)
+{
+   /* Schedule a reset if there's no reset in progress. */
+   if (!atomic_xchg(>reset.pending, 1))
+   schedule_work(>reset.work);
+}
+
 #endif
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 3cd1aec6c261..be8f68f63974 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -458,9 +458,7 @@ static enum drm_gpu_sched_stat panfrost_job_timedout(struct 
drm_sched_job
if (!panfrost_scheduler_stop(>js->queue[js], sched_job))
return DRM_GPU_SCHED_STAT_NOMINAL;
 
-   /* Schedule a reset if there's no reset in progress. */
-   if (!atomic_xchg(>reset.pending, 1))
-   schedule_work(>reset.work);
+   panfrost_device_schedule_reset(pfdev);
 
return DRM_GPU_SCHED_STAT_NOMINAL;
 }
-- 
2.31.1



[PATCH v3 14/15] drm/panfrost: Kill in-flight jobs on FD close

2021-06-25 Thread Boris Brezillon
If the process that submitted these jobs decided to close the FD before
the jobs are done, it probably means it doesn't care about the result.

v3:
* Set fence error to ECANCELED when a TERMINATED exception is received

Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/panfrost/panfrost_job.c | 43 +
 1 file changed, 37 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 948bd174ff99..aa1e6542adde 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -498,14 +498,21 @@ static void panfrost_job_handle_irq(struct 
panfrost_device *pfdev, u32 status)
 
if (status & JOB_INT_MASK_ERR(j)) {
u32 js_status = job_read(pfdev, JS_STATUS(j));
+   const char *exception_name = 
panfrost_exception_name(js_status);
 
job_write(pfdev, JS_COMMAND_NEXT(j), JS_COMMAND_NOP);
 
-   dev_err(pfdev->dev, "js fault, js=%d, status=%s, 
head=0x%x, tail=0x%x",
-   j,
-   panfrost_exception_name(js_status),
-   job_read(pfdev, JS_HEAD_LO(j)),
-   job_read(pfdev, JS_TAIL_LO(j)));
+   if (js_status < 
DRM_PANFROST_EXCEPTION_JOB_CONFIG_FAULT) {
+   dev_dbg(pfdev->dev, "js interrupt, js=%d, 
status=%s, head=0x%x, tail=0x%x",
+   j, exception_name,
+   job_read(pfdev, JS_HEAD_LO(j)),
+   job_read(pfdev, JS_TAIL_LO(j)));
+   } else {
+   dev_err(pfdev->dev, "js fault, js=%d, 
status=%s, head=0x%x, tail=0x%x",
+   j, exception_name,
+   job_read(pfdev, JS_HEAD_LO(j)),
+   job_read(pfdev, JS_TAIL_LO(j)));
+   }
 
/* If we need a reset, signal it to the timeout
 * handler, otherwise, update the fence error field and
@@ -514,7 +521,16 @@ static void panfrost_job_handle_irq(struct panfrost_device 
*pfdev, u32 status)
if (panfrost_exception_needs_reset(pfdev, js_status)) {
drm_sched_fault(>js->queue[j].sched);
} else {
-   dma_fence_set_error(pfdev->jobs[j]->done_fence, 
-EINVAL);
+   int error = 0;
+
+   if (js_status == 
DRM_PANFROST_EXCEPTION_TERMINATED)
+   error = -ECANCELED;
+   else if (js_status >= 
DRM_PANFROST_EXCEPTION_JOB_CONFIG_FAULT)
+   error = -EINVAL;
+
+   if (error)
+   
dma_fence_set_error(pfdev->jobs[j]->done_fence, error);
+
status |= JOB_INT_MASK_DONE(j);
}
}
@@ -673,10 +689,25 @@ int panfrost_job_open(struct panfrost_file_priv 
*panfrost_priv)
 
 void panfrost_job_close(struct panfrost_file_priv *panfrost_priv)
 {
+   struct panfrost_device *pfdev = panfrost_priv->pfdev;
+   unsigned long flags;
int i;
 
for (i = 0; i < NUM_JOB_SLOTS; i++)
drm_sched_entity_destroy(_priv->sched_entity[i]);
+
+   /* Kill in-flight jobs */
+   spin_lock_irqsave(>js->job_lock, flags);
+   for (i = 0; i < NUM_JOB_SLOTS; i++) {
+   struct drm_sched_entity *entity = 
_priv->sched_entity[i];
+   struct panfrost_job *job = pfdev->jobs[i];
+
+   if (!job || job->base.entity != entity)
+   continue;
+
+   job_write(pfdev, JS_COMMAND(i), JS_COMMAND_HARD_STOP);
+   }
+   spin_unlock_irqrestore(>js->job_lock, flags);
 }
 
 int panfrost_job_is_idle(struct panfrost_device *pfdev)
-- 
2.31.1



[PATCH v3 09/15] drm/panfrost: Simplify the reset serialization logic

2021-06-25 Thread Boris Brezillon
Now that we can pass our own workqueue to drm_sched_init(), we can use
an ordered workqueue for both the scheduler timeout tdr and our own
reset work (which we use when the reset is not caused by a fault/timeout
on a specific job, like when we have the AS_ACTIVE bit stuck). This
guarantees that the timeout handlers and the reset handler can't run
concurrently, which drastically simplifies the locking.

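Concretely (a sketch of the setup side, which is cut off from the diff
below; the workqueue name is illustrative), the driver allocates one ordered
workqueue and hands it both to its own reset work and to every scheduler as
the timeout workqueue:

/* One ordered queue => the reset work and all tdr works run strictly
 * one after the other, never concurrently. */
pfdev->reset.wq = alloc_ordered_workqueue("panfrost-reset", 0);
if (!pfdev->reset.wq)
	return -ENOMEM;

for (j = 0; j < NUM_JOB_SLOTS; j++) {
	ret = drm_sched_init(&js->queue[j].sched, &panfrost_sched_ops,
			     1, 0, msecs_to_jiffies(JOB_TIMEOUT_MS),
			     pfdev->reset.wq,	/* timeout_wq */
			     NULL, "pan_js");
	if (ret)
		goto err_sched;
}
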
Suggested-by: Daniel Vetter 
Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/panfrost/panfrost_device.h |   6 +-
 drivers/gpu/drm/panfrost/panfrost_job.c| 185 -
 2 files changed, 71 insertions(+), 120 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
b/drivers/gpu/drm/panfrost/panfrost_device.h
index 6024eaf34ba0..bfe32907ba6b 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -108,6 +108,7 @@ struct panfrost_device {
struct mutex sched_lock;
 
struct {
+   struct workqueue_struct *wq;
struct work_struct work;
atomic_t pending;
} reset;
@@ -177,9 +178,8 @@ const char *panfrost_exception_name(u32 exception_code);
 static inline void
 panfrost_device_schedule_reset(struct panfrost_device *pfdev)
 {
-   /* Schedule a reset if there's no reset in progress. */
-   if (!atomic_xchg(>reset.pending, 1))
-   schedule_work(>reset.work);
+   atomic_set(>reset.pending, 1);
+   queue_work(pfdev->reset.wq, >reset.work);
 }
 
 #endif
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index e0c479e67304..88d34fd781e8 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -25,17 +25,8 @@
 #define job_write(dev, reg, data) writel(data, dev->iomem + (reg))
 #define job_read(dev, reg) readl(dev->iomem + (reg))
 
-enum panfrost_queue_status {
-   PANFROST_QUEUE_STATUS_ACTIVE,
-   PANFROST_QUEUE_STATUS_STOPPED,
-   PANFROST_QUEUE_STATUS_STARTING,
-   PANFROST_QUEUE_STATUS_FAULT_PENDING,
-};
-
 struct panfrost_queue_state {
struct drm_gpu_scheduler sched;
-   atomic_t status;
-   struct mutex lock;
u64 fence_context;
u64 emit_seqno;
 };
@@ -379,57 +370,73 @@ void panfrost_job_enable_interrupts(struct 
panfrost_device *pfdev)
job_write(pfdev, JOB_INT_MASK, irq_mask);
 }
 
-static bool panfrost_scheduler_stop(struct panfrost_queue_state *queue,
-   struct drm_sched_job *bad)
+static void panfrost_reset(struct panfrost_device *pfdev,
+  struct drm_sched_job *bad)
 {
-   enum panfrost_queue_status old_status;
-   bool stopped = false;
+   unsigned int i;
+   bool cookie;
 
-   mutex_lock(>lock);
-   old_status = atomic_xchg(>status,
-PANFROST_QUEUE_STATUS_STOPPED);
-   if (old_status == PANFROST_QUEUE_STATUS_STOPPED)
-   goto out;
+   if (WARN_ON(!atomic_read(>reset.pending)))
+   return;
+
+   /* Stop the schedulers.
+*
+* FIXME: We temporarily get out of the dma_fence_signalling section
+* because the cleanup path generate lockdep splats when taking locks
+* to release job resources. We should rework the code to follow this
+* pattern:
+*
+*  try_lock
+*  if (locked)
+*  release
+*  else
+*  schedule_work_to_release_later
+*/
+   for (i = 0; i < NUM_JOB_SLOTS; i++)
+   drm_sched_stop(>js->queue[i].sched, bad);
+
+   cookie = dma_fence_begin_signalling();
 
-   WARN_ON(old_status != PANFROST_QUEUE_STATUS_ACTIVE);
-   drm_sched_stop(>sched, bad);
if (bad)
drm_sched_increase_karma(bad);
 
-   stopped = true;
+   spin_lock(>js->job_lock);
+   for (i = 0; i < NUM_JOB_SLOTS; i++) {
+   if (pfdev->jobs[i]) {
+   pm_runtime_put_noidle(pfdev->dev);
+   panfrost_devfreq_record_idle(>pfdevfreq);
+   pfdev->jobs[i] = NULL;
+   }
+   }
+   spin_unlock(>js->job_lock);
 
-   /*
-* Set the timeout to max so the timer doesn't get started
-* when we return from the timeout handler (restored in
-* panfrost_scheduler_start()).
+   panfrost_device_reset(pfdev);
+
+   /* GPU has been reset, we can cancel timeout/fault work that may have
+* been queued in the meantime and clear the reset pending bit.
 */
-   queue->sched.timeout = MAX_SCHEDULE_TIMEOUT;
+   atomic_set(>reset.pending, 0);
+   cancel_work_sync(>reset.work);
+   for (i = 0; i < NUM_JOB_SLOTS; i++)
+   cancel_delayed_work(>js->queue[i].sched.work_tdr);
 
-out:
-   mutex_unlock(>lock);
 
-   return stopped;
-}
+   /* Now resubmit jobs 

[PATCH v3 01/15] drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr

2021-06-25 Thread Boris Brezillon
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize the
timeout handlers with the reset work. One solution to address that is to
have an ordered workqueue at the driver level that will be used by the
different schedulers to queue their timeout work. Thanks to the
serialization provided by the ordered workqueue, we are guaranteed that
timeout handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.

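Driver-side usage then looks like this (sketch with illustrative names;
passing NULL for the new parameter keeps the old behaviour of using the
system workqueue):

/* One ordered workqueue shared by all queues serializes every timeout
 * handler with the driver's own reset handler. */
struct workqueue_struct *reset_wq = alloc_ordered_workqueue("gpu-reset", 0);

ret = drm_sched_init(&queue->sched, &my_sched_ops,
		     hw_submission_limit, hang_limit,
		     msecs_to_jiffies(timeout_ms),
		     reset_wq,		/* new: timeout_wq, may be NULL */
		     NULL,		/* score */
		     "my-queue");
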
Signed-off-by: Boris Brezillon 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c |  2 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c   |  3 ++-
 drivers/gpu/drm/lima/lima_sched.c |  3 ++-
 drivers/gpu/drm/panfrost/panfrost_job.c   |  3 ++-
 drivers/gpu/drm/scheduler/sched_main.c|  6 +-
 drivers/gpu/drm/v3d/v3d_sched.c   | 10 +-
 include/drm/gpu_scheduler.h   |  5 -
 7 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 47ea46859618..532636ea20bc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -488,7 +488,7 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
 
r = drm_sched_init(>sched, _sched_ops,
   num_hw_submission, amdgpu_job_hang_limit,
-  timeout, sched_score, ring->name);
+  timeout, NULL, sched_score, ring->name);
if (r) {
DRM_ERROR("Failed to create scheduler on ring %s.\n",
  ring->name);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 19826e504efc..feb6da1b6ceb 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -190,7 +190,8 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
 
ret = drm_sched_init(>sched, _sched_ops,
 etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
-msecs_to_jiffies(500), NULL, dev_name(gpu->dev));
+msecs_to_jiffies(500), NULL, NULL,
+dev_name(gpu->dev));
if (ret)
return ret;
 
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index ecf3267334ff..dba8329937a3 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -508,7 +508,8 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, 
const char *name)
INIT_WORK(>recover_work, lima_sched_recover_work);
 
return drm_sched_init(>base, _sched_ops, 1,
- lima_job_hang_limit, msecs_to_jiffies(timeout),
+ lima_job_hang_limit,
+ msecs_to_jiffies(timeout), NULL,
  NULL, name);
 }
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 682f2161b999..8ff79fd49577 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -626,7 +626,8 @@ int panfrost_job_init(struct panfrost_device *pfdev)
 
ret = drm_sched_init(>queue[j].sched,
 _sched_ops,
-1, 0, msecs_to_jiffies(JOB_TIMEOUT_MS),
+1, 0,
+msecs_to_jiffies(JOB_TIMEOUT_MS), NULL,
 NULL, "pan_js");
if (ret) {
dev_err(pfdev->dev, "Failed to create scheduler: %d.", 
ret);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index c0a2f8f8d472..a937d0529944 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -837,6 +837,8 @@ static int drm_sched_main(void *param)
  * @hw_submission: number of hw submissions that can be in flight
  * @hang_limit: number of times to allow a job to hang before dropping it
  * @timeout: timeout value in jiffies for the scheduler
+ * @timeout_wq: workqueue to use for timeout work. If NULL, the system_wq is
+ * used
  * @score: optional score atomic shared with other schedulers
  * @name: name used for debugging
  *
@@ -844,7 +846,8 @@ static int drm_sched_main(void *param)
  */
 int drm_sched_init(struct drm_gpu_scheduler *sched,
   const struct drm_sched_backend_ops *ops,
-  unsigned hw_submission, unsigned hang_limit, long timeout,
+  unsigned hw_submission, unsigned hang_limit,
+  long timeout, struct workqueue_struct *timeout_wq,
   atomic_t *score, const char *name)
 {
int i, ret;
@@ -852,6 +855,7 @@ int drm_sched_init(struct drm_gpu_scheduler 

[PATCH v3 00/15] drm/panfrost: Misc improvements

2021-06-25 Thread Boris Brezillon
Hello,

This is a merge of [1] and [2] since the second series depends on
patches in the preparatory series.

The main change in this v3 is the addition of patches 1 and 9, simplifying
the reset synchronisation as suggested by Daniel.

Also addressed Steve's comments, and IGT tests are now passing reliably
(which doesn't guarantee much, but that's still an improvement since
pan-reset was unreliable with v2).

Regards,

Boris

Boris Brezillon (14):
  drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr
  drm/panfrost: Make ->run_job() return an ERR_PTR() when appropriate
  drm/panfrost: Get rid of the unused JS_STATUS_EVENT_ACTIVE definition
  drm/panfrost: Drop the pfdev argument passed to
panfrost_exception_name()
  drm/panfrost: Expose exception types to userspace
  drm/panfrost: Do the exception -> string translation using a table
  drm/panfrost: Expose a helper to trigger a GPU reset
  drm/panfrost: Use a threaded IRQ for job interrupts
  drm/panfrost: Simplify the reset serialization logic
  drm/panfrost: Make sure job interrupts are masked before resetting
  drm/panfrost: Disable the AS on unhandled page faults
  drm/panfrost: Reset the GPU when the AS_ACTIVE bit is stuck
  drm/panfrost: Don't reset the GPU on job faults unless we really have
to
  drm/panfrost: Kill in-flight jobs on FD close

Steven Price (1):
  drm/panfrost: Queue jobs on the hardware

 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  |   2 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c|   3 +-
 drivers/gpu/drm/lima/lima_sched.c  |   3 +-
 drivers/gpu/drm/panfrost/panfrost_device.c | 139 +++--
 drivers/gpu/drm/panfrost/panfrost_device.h |  15 +-
 drivers/gpu/drm/panfrost/panfrost_gpu.c|   2 +-
 drivers/gpu/drm/panfrost/panfrost_job.c| 630 +++--
 drivers/gpu/drm/panfrost/panfrost_mmu.c|  41 +-
 drivers/gpu/drm/panfrost/panfrost_regs.h   |   3 -
 drivers/gpu/drm/scheduler/sched_main.c |   6 +-
 drivers/gpu/drm/v3d/v3d_sched.c|  10 +-
 include/drm/gpu_scheduler.h|   5 +-
 include/uapi/drm/panfrost_drm.h|  71 +++
 13 files changed, 679 insertions(+), 251 deletions(-)

-- 
2.31.1



Re: [PATCH 13/47] drm/i915/guc: Implement GuC context operations for new inteface

2021-06-25 Thread Michal Wajdeczko



On 24.06.2021 09:04, Matthew Brost wrote:
> Implement GuC context operations which includes GuC specific operations
> alloc, pin, unpin, and destroy.
> 
> Signed-off-by: John Harrison 
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/gt/intel_context.c   |   5 +
>  drivers/gpu/drm/i915/gt/intel_context_types.h |  22 +-
>  drivers/gpu/drm/i915/gt/intel_lrc_reg.h   |   1 -
>  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  34 +
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |   4 +
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 664 --
>  drivers/gpu/drm/i915/i915_reg.h   |   1 +
>  drivers/gpu/drm/i915/i915_request.c   |   1 +
>  8 files changed, 677 insertions(+), 55 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
> b/drivers/gpu/drm/i915/gt/intel_context.c
> index 4033184f13b9..2b68af16222c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.c
> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> @@ -383,6 +383,11 @@ intel_context_init(struct intel_context *ce, struct 
> intel_engine_cs *engine)
>  
>   mutex_init(>pin_mutex);
>  
> + spin_lock_init(>guc_state.lock);
> +
> + ce->guc_id = GUC_INVALID_LRC_ID;
> + INIT_LIST_HEAD(>guc_id_link);
> +
>   i915_active_init(>active,
>__intel_context_active, __intel_context_retire, 0);
>  }
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> b/drivers/gpu/drm/i915/gt/intel_context_types.h
> index bb6fef7eae52..ce7c69b34cd1 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> @@ -95,6 +95,7 @@ struct intel_context {
>  #define CONTEXT_BANNED   6
>  #define CONTEXT_FORCE_SINGLE_SUBMISSION  7
>  #define CONTEXT_NOPREEMPT8
> +#define CONTEXT_LRCA_DIRTY   9
>  
>   struct {
>   u64 timeout_us;
> @@ -137,14 +138,29 @@ struct intel_context {
>  
>   u8 wa_bb_page; /* if set, page num reserved for context workarounds */
>  
> + struct {
> + /** lock: protects everything in guc_state */
> + spinlock_t lock;
> + /**
> +  * sched_state: scheduling state of this context using GuC
> +  * submission
> +  */
> + u8 sched_state;
> + } guc_state;
> +
>   /* GuC scheduling state that does not require a lock. */
>   atomic_t guc_sched_state_no_lock;
>  
> + /* GuC lrc descriptor ID */
> + u16 guc_id;
> +
> + /* GuC lrc descriptor reference count */
> + atomic_t guc_id_ref;
> +
>   /*
> -  * GuC lrc descriptor ID - Not assigned in this patch but future patches
> -  * in the series will.
> +  * GuC ID link - in list when unpinned but guc_id still valid in GuC
>*/
> - u16 guc_id;
> + struct list_head guc_id_link;

some fields are being added with kerneldoc, some without
what's the rule ?

>  };
>  
>  #endif /* __INTEL_CONTEXT_TYPES__ */
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h 
> b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
> index 41e5350a7a05..49d4857ad9b7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h
> @@ -87,7 +87,6 @@
>  #define GEN11_CSB_WRITE_PTR_MASK (GEN11_CSB_PTR_MASK << 0)
>  
>  #define MAX_CONTEXT_HW_ID(1 << 21) /* exclusive */
> -#define MAX_GUC_CONTEXT_HW_ID(1 << 20) /* exclusive */
>  #define GEN11_MAX_CONTEXT_HW_ID  (1 << 11) /* exclusive */
>  /* in Gen12 ID 0x7FF is reserved to indicate idle */
>  #define GEN12_MAX_CONTEXT_HW_ID  (GEN11_MAX_CONTEXT_HW_ID - 1)
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> index 9ba8219475b2..d44316dc914b 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> @@ -44,6 +44,14 @@ struct intel_guc {
>   void (*disable)(struct intel_guc *guc);
>   } interrupts;
>  
> + /*
> +  * contexts_lock protects the pool of free guc ids and a linked list of
> +  * guc ids available to be stolen
> +  */
> + spinlock_t contexts_lock;
> + struct ida guc_ids;
> + struct list_head guc_id_list;
> +
>   bool submission_selected;
>  
>   struct i915_vma *ads_vma;
> @@ -102,6 +110,29 @@ intel_guc_send_and_receive(struct intel_guc *guc, const 
> u32 *action, u32 len,
>response_buf, response_buf_size, 0);
>  }
>  
> +static inline int intel_guc_send_busy_loop(struct intel_guc* guc,
> +const u32 *action,
> +u32 len,
> +bool loop)
> +{
> + int err;
> +
> + /* No sleeping with spin locks, just busy loop */
> + might_sleep_if(loop && (!in_atomic() && !irqs_disabled()));
> +
> +retry:
> + err = intel_guc_send_nb(guc, action, len);
> + if 

Re: [PATCH 10/47] drm/i915/guc: Add lrc descriptor context lookup array

2021-06-25 Thread Michal Wajdeczko



On 24.06.2021 09:04, Matthew Brost wrote:
> Add lrc descriptor context lookup array which can resolve the
> intel_context from the lrc descriptor index. In addition to lookup, it
> can determine if the lrc descriptor context is currently registered with
> the GuC by checking if an entry for a descriptor index is present.
> Future patches in the series will make use of this array.

s/lrc/LRC

> 
> Cc: John Harrison 
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  5 +++
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 32 +--
>  2 files changed, 35 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> index b28fa54214f2..2313d9fc087b 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> @@ -6,6 +6,8 @@
>  #ifndef _INTEL_GUC_H_
>  #define _INTEL_GUC_H_
>  
> +#include "linux/xarray.h"

#include 

> +
>  #include "intel_uncore.h"
>  #include "intel_guc_fw.h"
>  #include "intel_guc_fwif.h"
> @@ -46,6 +48,9 @@ struct intel_guc {
>   struct i915_vma *lrc_desc_pool;
>   void *lrc_desc_pool_vaddr;
>  
> + /* guc_id to intel_context lookup */
> + struct xarray context_lookup;
> +
>   /* Control params for fw initialization */
>   u32 params[GUC_CTL_MAX_DWORDS];

btw, IIRC there was an idea to move most struct definitions to
intel_guc_types.h, is this still the plan?

>  
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index a366890fb840..23a94a896a0b 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -65,8 +65,6 @@ static inline struct i915_priolist *to_priolist(struct 
> rb_node *rb)
>   return rb_entry(rb, struct i915_priolist, node);
>  }
>  
> -/* Future patches will use this function */
> -__attribute__ ((unused))
>  static struct guc_lrc_desc *__get_lrc_desc(struct intel_guc *guc, u32 index)
>  {
>   struct guc_lrc_desc *base = guc->lrc_desc_pool_vaddr;
> @@ -76,6 +74,15 @@ static struct guc_lrc_desc *__get_lrc_desc(struct 
> intel_guc *guc, u32 index)
>   return [index];
>  }
>  
> +static inline struct intel_context *__get_context(struct intel_guc *guc, u32 
> id)
> +{
> + struct intel_context *ce = xa_load(>context_lookup, id);
> +
> + GEM_BUG_ON(id >= GUC_MAX_LRC_DESCRIPTORS);
> +
> + return ce;
> +}
> +
>  static int guc_lrc_desc_pool_create(struct intel_guc *guc)
>  {
>   u32 size;
> @@ -96,6 +103,25 @@ static void guc_lrc_desc_pool_destroy(struct intel_guc 
> *guc)
>   i915_vma_unpin_and_release(>lrc_desc_pool, I915_VMA_RELEASE_MAP);
>  }
>  
> +static inline void reset_lrc_desc(struct intel_guc *guc, u32 id)
> +{
> + struct guc_lrc_desc *desc = __get_lrc_desc(guc, id);
> +
> + memset(desc, 0, sizeof(*desc));
> + xa_erase_irq(>context_lookup, id);
> +}
> +
> +static inline bool lrc_desc_registered(struct intel_guc *guc, u32 id)
> +{
> + return __get_context(guc, id);
> +}
> +
> +static inline void set_lrc_desc_registered(struct intel_guc *guc, u32 id,
> +struct intel_context *ce)
> +{
> + xa_store_irq(>context_lookup, id, ce, GFP_ATOMIC);
> +}
> +
>  static void guc_add_request(struct intel_guc *guc, struct i915_request *rq)
>  {
>   /* Leaving stub as this function will be used in future patches */
> @@ -400,6 +426,8 @@ int intel_guc_submission_init(struct intel_guc *guc)
>*/
>   GEM_BUG_ON(!guc->lrc_desc_pool);
>  
> + xa_init_flags(>context_lookup, XA_FLAGS_LOCK_IRQ);
> +
>   return 0;
>  }
>  
> 


[PATCH v3 2/2] drivers/firmware: consolidate EFI framebuffer setup for all arches

2021-06-25 Thread Javier Martinez Canillas
The register_gop_device() function registers an "efi-framebuffer" platform
device to match against the efifb driver, to have an early framebuffer for
EFI platforms.

But the Generic System Framebuffers (sysfb) driver already provides exactly
the same support. This used to be X86-only, but it has been moved to
drivers/firmware and can be reused by other architectures.

Also, besides registering an "efi-framebuffer", this driver can register a
"simple-framebuffer", allowing the simple{fb,drm} drivers to be used on
non-X86 EFI platforms. For example, on aarch64 these drivers can only be
used with DT, since there is no code to register a "simple-framebuffer"
platform device when booting with EFI.

For these reasons, let's remove the duplicated register_gop_device() code
and instead move the platform specific logic that's there to the sysfb driver.

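For context, the arch-agnostic decision sysfb already implements (and which
this patch extends beyond x86) is roughly the following; a sketch based on
the existing code, exact helper signatures may differ:

/* Prefer a "simple-framebuffer" if the firmware-provided mode is usable,
 * otherwise fall back to the legacy platform devices. */
const char *name;

if (IS_ENABLED(CONFIG_SYSFB_SIMPLEFB) && sysfb_parse_mode(si, &mode))
	return sysfb_create_simplefb(si, &mode);

if (screen_info.orig_video_isVGA == VIDEO_TYPE_EFI)
	name = "efi-framebuffer";	/* matched by efifb */
else if (screen_info.orig_video_isVGA == VIDEO_TYPE_VLFB)
	name = "vesa-framebuffer";
else
	name = "platform-framebuffer";

pd = platform_device_register_resndata(NULL, name, 0, NULL, 0,
				       si, sizeof(*si));
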
Signed-off-by: Javier Martinez Canillas 
Acked-by: Borislav Petkov 
---

Changes in v3:
- Also update the SYSFB_SIMPLEFB symbol name in drivers/gpu/drm/tiny/Kconfig.
- We have a a max 100 char limit now, use it to avoid multi-line statements.
- Figure out the platform device name before allocating the platform device.

Changes in v2:
- Use "depends on" for the supported architectures instead of selecting it.
- Improve commit message to explain the benefits of reusing sysfb for !X86.

 arch/arm/include/asm/efi.h|  5 +-
 arch/arm64/include/asm/efi.h  |  5 +-
 arch/riscv/include/asm/efi.h  |  5 +-
 drivers/firmware/Kconfig  |  8 +--
 drivers/firmware/Makefile |  2 +-
 drivers/firmware/efi/efi-init.c   | 90 ---
 drivers/firmware/efi/sysfb_efi.c  | 76 +-
 drivers/firmware/sysfb.c  | 35 
 drivers/firmware/sysfb_simplefb.c | 31 +++
 drivers/gpu/drm/tiny/Kconfig  |  4 +-
 include/linux/sysfb.h | 26 -
 11 files changed, 143 insertions(+), 144 deletions(-)

diff --git a/arch/arm/include/asm/efi.h b/arch/arm/include/asm/efi.h
index 9de7ab2ce05d..a6f3b179e8a9 100644
--- a/arch/arm/include/asm/efi.h
+++ b/arch/arm/include/asm/efi.h
@@ -17,6 +17,7 @@
 
 #ifdef CONFIG_EFI
 void efi_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
 
 int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md);
 int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
@@ -52,10 +53,6 @@ void efi_virtmap_unload(void);
 struct screen_info *alloc_screen_info(void);
 void free_screen_info(struct screen_info *si);
 
-static inline void efifb_setup_from_dmi(struct screen_info *si, const char 
*opt)
-{
-}
-
 /*
  * A reasonable upper bound for the uncompressed kernel size is 32 MBytes,
  * so we will reserve that amount of memory. We have no easy way to tell what
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 1bed37eb013a..d3e1825337be 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -14,6 +14,7 @@
 
 #ifdef CONFIG_EFI
 extern void efi_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
 #else
 #define efi_init()
 #endif
@@ -85,10 +86,6 @@ static inline void free_screen_info(struct screen_info *si)
 {
 }
 
-static inline void efifb_setup_from_dmi(struct screen_info *si, const char 
*opt)
-{
-}
-
 #define EFI_ALLOC_ALIGN	SZ_64K
 
 /*
diff --git a/arch/riscv/include/asm/efi.h b/arch/riscv/include/asm/efi.h
index 6d98cd999680..7a8f0d45b13a 100644
--- a/arch/riscv/include/asm/efi.h
+++ b/arch/riscv/include/asm/efi.h
@@ -13,6 +13,7 @@
 
 #ifdef CONFIG_EFI
 extern void efi_init(void);
+extern void efifb_setup_from_dmi(struct screen_info *si, const char *opt);
 #else
 #define efi_init()
 #endif
@@ -39,10 +40,6 @@ static inline void free_screen_info(struct screen_info *si)
 {
 }
 
-static inline void efifb_setup_from_dmi(struct screen_info *si, const char 
*opt)
-{
-}
-
 void efi_virtmap_load(void);
 void efi_virtmap_unload(void);
 
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 5991071e9d7f..6822727a5e98 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -254,9 +254,9 @@ config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
 config SYSFB
bool
default y
-   depends on X86 || COMPILE_TEST
+   depends on X86 || ARM || ARM64 || RISCV || COMPILE_TEST
 
-config X86_SYSFB
+config SYSFB_SIMPLEFB
bool "Mark VGA/VBE/EFI FB as generic system framebuffer"
depends on SYSFB
help
@@ -264,10 +264,10 @@ config X86_SYSFB
  bootloader or kernel can show basic video-output during boot for
  user-guidance and debugging. Historically, x86 used the VESA BIOS
  Extensions and EFI-framebuffers for this, which are mostly limited
- to x86.
+ to x86 BIOS or EFI systems.
  This option, if enabled, marks VGA/VBE/EFI framebuffers as generic
  framebuffers so the new generic 

[PATCH v3 1/2] drivers/firmware: move x86 Generic System Framebuffers support

2021-06-25 Thread Javier Martinez Canillas
The x86 architecture has generic support to register a system framebuffer
platform device. It either registers a "simple-framebuffer" if the config
option CONFIG_X86_SYSFB is enabled, or a legacy VGA/VBE/EFI FB device.
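
For reference, the registration flow being moved does roughly the following
(an illustrative sketch only, not the literal arch/x86/kernel/sysfb.c code;
the simple-framebuffer path is elided):

	struct screen_info *si = &screen_info;
	struct platform_device *pd;
	const char *name;

	/*
	 * With CONFIG_X86_SYSFB, first try to describe the firmware FB as a
	 * "simple-framebuffer" platform device; if that succeeds we are done.
	 * Otherwise fall back to a legacy device for efifb/vesafb:
	 */
	if (si->orig_video_isVGA == VIDEO_TYPE_EFI)
		name = "efi-framebuffer";
	else
		name = "platform-framebuffer";

	pd = platform_device_register_resndata(NULL, name, 0, NULL, 0,
					       si, sizeof(*si));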

But the code is generic enough to be reused by other architectures and can
be moved out of the arch/x86 directory.

This will also allow the simple{fb,drm} drivers to be supported on non-x86 EFI
platforms, such as aarch64, where these drivers are only supported with DT.

Signed-off-by: Javier Martinez Canillas 
Acked-by: Borislav Petkov 
Acked-by: Greg Kroah-Hartman 
---

Changes in v3:
- Add Borislav and Greg Acked-by tags.

Changes in v2:
- Use default y and depends on X86 instead of doing a select in arch/x86/Kconfig.
- Also enable the SYSFB Kconfig option when COMPILE_TEST.
- Improve commit message to explain why it is useful for other arches to use this.

 arch/x86/Kconfig  | 26 ---
 arch/x86/kernel/Makefile  |  3 --
 drivers/firmware/Kconfig  | 32 +++
 drivers/firmware/Makefile |  2 ++
 drivers/firmware/efi/Makefile |  2 ++
 .../firmware/efi}/sysfb_efi.c |  2 +-
 {arch/x86/kernel => drivers/firmware}/sysfb.c |  2 +-
 .../firmware}/sysfb_simplefb.c|  2 +-
 .../x86/include/asm => include/linux}/sysfb.h |  6 ++--
 9 files changed, 42 insertions(+), 35 deletions(-)
 rename {arch/x86/kernel => drivers/firmware/efi}/sysfb_efi.c (99%)
 rename {arch/x86/kernel => drivers/firmware}/sysfb.c (98%)
 rename {arch/x86/kernel => drivers/firmware}/sysfb_simplefb.c (99%)
 rename {arch/x86/include/asm => include/linux}/sysfb.h (95%)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0ae3eccfec52..f169a30db768 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2766,32 +2766,6 @@ config AMD_NB
def_bool y
depends on CPU_SUP_AMD && PCI
 
-config X86_SYSFB
-   bool "Mark VGA/VBE/EFI FB as generic system framebuffer"
-   help
- Firmwares often provide initial graphics framebuffers so the BIOS,
- bootloader or kernel can show basic video-output during boot for
- user-guidance and debugging. Historically, x86 used the VESA BIOS
- Extensions and EFI-framebuffers for this, which are mostly limited
- to x86.
- This option, if enabled, marks VGA/VBE/EFI framebuffers as generic
- framebuffers so the new generic system-framebuffer drivers can be
- used on x86. If the framebuffer is not compatible with the generic
- modes, it is advertised as fallback platform framebuffer so legacy
- drivers like efifb, vesafb and uvesafb can pick it up.
- If this option is not selected, all system framebuffers are always
- marked as fallback platform framebuffers as usual.
-
- Note: Legacy fbdev drivers, including vesafb, efifb, uvesafb, will
- not be able to pick up generic system framebuffers if this option
- is selected. You are highly encouraged to enable simplefb as
- replacement if you select this option. simplefb can correctly deal
- with generic system framebuffers. But you should still keep vesafb
- and others enabled as fallback if a system framebuffer is
- incompatible with simplefb.
-
- If unsure, say Y.
-
 endmenu
 
 
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 0f66682ac02a..4114ea47def2 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -135,9 +135,6 @@ obj-$(CONFIG_X86_CHECK_BIOS_CORRUPTION) += check.o
 obj-$(CONFIG_SWIOTLB)  += pci-swiotlb.o
 obj-$(CONFIG_OF)   += devicetree.o
 obj-$(CONFIG_UPROBES)  += uprobes.o
-obj-y  += sysfb.o
-obj-$(CONFIG_X86_SYSFB)+= sysfb_simplefb.o
-obj-$(CONFIG_EFI)  += sysfb_efi.o
 
 obj-$(CONFIG_PERF_EVENTS)  += perf_regs.o
 obj-$(CONFIG_TRACING)  += tracepoint.o
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 1db738d5b301..5991071e9d7f 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -251,6 +251,38 @@ config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
 
  Say Y here to enable "download mode" by default.
 
+config SYSFB
+   bool
+   default y
+   depends on X86 || COMPILE_TEST
+
+config X86_SYSFB
+   bool "Mark VGA/VBE/EFI FB as generic system framebuffer"
+   depends on SYSFB
+   help
+ Firmwares often provide initial graphics framebuffers so the BIOS,
+ bootloader or kernel can show basic video-output during boot for
+ user-guidance and debugging. Historically, x86 used the VESA BIOS
+ Extensions and EFI-framebuffers for this, which are mostly limited
+ to x86.
+ This option, if enabled, marks VGA/VBE/EFI framebuffers as generic
+

[PATCH v3 0/2] allow simple{fb, drm} drivers to be used on non-x86 EFI platforms

2021-06-25 Thread Javier Martinez Canillas
The simplefb and simpledrm drivers match against a "simple-framebuffer"
device, but for aarch64 this is only registered when using Device Trees
and there's a node with a "simple-framebuffer" compatible string.

There is no code to register a "simple-framebuffer" platform device when
using EFI instead. In fact, the only platform device that's registered in
this case is an "efi-framebuffer", which means that the efifb driver is
the only driver supported to have an early console with EFI on aarch64.

The x86 architecture platform has Generic System Framebuffers (sysfb)
support, which registers a system framebuffer platform device. It either
registers a "simple-framebuffer" for the simple{fb,drm} drivers or legacy
VGA/EFI FB devices for the vgafb/efifb drivers.

The sysfb is generic enough to be reused by other architectures and can be
moved out of the arch/x86 directory to drivers/firmware, allowing the EFI
logic used by non-x86 architectures to be folded into sysfb as well.

Patch #1 in this series does the former while patch #2 does the latter. It has
been tested on x86_64 and aarch64 machines using the efifb, simplefb and
simpledrm drivers. But more testing will be highly appreciated, to make
sure that no regressions are being introduced by these changes.

The series touches different subsystems and will need coordination between
maintainers but the patches have already been acked by the x86 folks. Ard
Biesheuvel said that these could be merged through the EFI tree if needed.

Best regards,
Javier

Changes in v3:
- Add Borislav and Greg Acked-by tags.
- Also update the SYSFB_SIMPLEFB symbol name in drivers/gpu/drm/tiny/Kconfig.
- We have a max 100 char limit now, use it to avoid multi-line statements.
- Figure out the platform device name before allocating the platform device.

Changes in v2:
- Use default y and depends on X86 instead of doing a select in arch/x86/Kconfig.
- Also enable the SYSFB Kconfig option when COMPILE_TEST.
- Improve commit message to explain why it is useful for other arches to use this.
- Use "depends on" for the supported architectures instead of selecting it.
- Improve commit message to explain the benefits of reusing sysfb for !X86.

Javier Martinez Canillas (2):
  drivers/firmware: move x86 Generic System Framebuffers support
  drivers/firmware: consolidate EFI framebuffer setup for all arches

 arch/arm/include/asm/efi.h|  5 +-
 arch/arm64/include/asm/efi.h  |  5 +-
 arch/riscv/include/asm/efi.h  |  5 +-
 arch/x86/Kconfig  | 26 --
 arch/x86/kernel/Makefile  |  3 -
 drivers/firmware/Kconfig  | 32 +++
 drivers/firmware/Makefile |  2 +
 drivers/firmware/efi/Makefile |  2 +
 drivers/firmware/efi/efi-init.c   | 90 ---
 .../firmware/efi}/sysfb_efi.c | 78 +++-
 {arch/x86/kernel => drivers/firmware}/sysfb.c | 37 +---
 .../firmware}/sysfb_simplefb.c| 33 ---
 drivers/gpu/drm/tiny/Kconfig  |  4 +-
 .../x86/include/asm => include/linux}/sysfb.h | 32 +++
 14 files changed, 180 insertions(+), 174 deletions(-)
 rename {arch/x86/kernel => drivers/firmware/efi}/sysfb_efi.c (84%)
 rename {arch/x86/kernel => drivers/firmware}/sysfb.c (75%)
 rename {arch/x86/kernel => drivers/firmware}/sysfb_simplefb.c (81%)
 rename {arch/x86/include/asm => include/linux}/sysfb.h (70%)

-- 
2.31.1



Re: [Intel-gfx] [PATCH 06/47] drm/i915/guc: Optimize CTB writes and reads

2021-06-25 Thread Michal Wajdeczko



On 24.06.2021 09:04, Matthew Brost wrote:
> CTB writes are now in the path of command submission and should be
> optimized for performance. Rather than reading CTB descriptor values
> (e.g. head, tail) which could result in accesses across the PCIe bus,
> store shadow local copies and only read/write the descriptor values when
> absolutely necessary. Also store the current space in each channel
> locally.
> 
> Signed-off-by: John Harrison 
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 76 ++-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
>  2 files changed, 51 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index 27ec30b5ef47..1fd5c69358ef 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct 
> guc_ct_buffer_desc *desc)
>  static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
>  {
>   ctb->broken = false;
> + ctb->tail = 0;
> + ctb->head = 0;
> + ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
> +
>   guc_ct_buffer_desc_init(ctb->desc);
>  }
>  
> @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
>  {
>   struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>   struct guc_ct_buffer_desc *desc = ctb->desc;
> - u32 head = desc->head;
> - u32 tail = desc->tail;
> + u32 tail = ctb->tail;
>   u32 size = ctb->size;
> - u32 used;
>   u32 header;
>   u32 hxg;
>   u32 *cmds = ctb->cmds;
> @@ -398,25 +400,14 @@ static int ct_write(struct intel_guc_ct *ct,
>   if (unlikely(desc->status))
>   goto corrupted;
>  
> - if (unlikely((tail | head) >= size)) {
> +#ifdef CONFIG_DRM_I915_DEBUG_GUC

since we are caching tail, we may want to check if it's still correct:

tail = READ_ONCE(desc->tail);
if (unlikely(tail != ctb->tail)) {
CT_ERROR(ct, "Tail was modified %u != %u\n",
 tail, ctb->tail);
desc->status |= GUC_CTB_STATUS_MISMATCH;
goto corrupted;
}

and since we own the tail then we can be more strict:

GEM_BUG_ON(tail > size);

and then finally just check GuC head:

head = READ_ONCE(desc->head);
if (unlikely(head >= size)) {
...

> + if (unlikely((desc->tail | desc->head) >= size)) {
>   CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
> -  head, tail, size);
> +  desc->head, desc->tail, size);
>   desc->status |= GUC_CTB_STATUS_OVERFLOW;
>   goto corrupted;
>   }
> -
> - /*
> -  * tail == head condition indicates empty. GuC FW does not support
> -  * using up the entire buffer to get tail == head meaning full.
> -  */
> - if (tail < head)
> - used = (size - head) + tail;
> - else
> - used = tail - head;
> -
> - /* make sure there is a space including extra dw for the fence */
> - if (unlikely(used + len + 1 >= size))
> - return -ENOSPC;
> +#endif
>  
>   /*
>* dw0: CT header (including fence)
> @@ -457,7 +448,9 @@ static int ct_write(struct intel_guc_ct *ct,
>   write_barrier(ct);
>  
>   /* now update descriptor */
> + ctb->tail = tail;
>   WRITE_ONCE(desc->tail, tail);
> + ctb->space -= len + 1;

this magic "1" is likely GUC_CTB_MSG_MIN_LEN, right ?
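
i.e. something like this (just a sketch, assuming that define is meant to
cover the extra header dword):

	ctb->space -= len + GUC_CTB_MSG_MIN_LEN;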

>  
>   return 0;
>  
> @@ -473,7 +466,7 @@ static int ct_write(struct intel_guc_ct *ct,
>   * @req: pointer to pending request
>   * @status:  placeholder for status
>   *
> - * For each sent request, Guc shall send bac CT response message.
> + * For each sent request, GuC shall send back CT response message.
>   * Our message handler will update status of tracked request once
>   * response message with given fence is received. Wait here and
>   * check for valid response status value.
> @@ -520,24 +513,35 @@ static inline bool ct_deadlocked(struct intel_guc_ct 
> *ct)
>   return ret;
>  }
>  
> -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
> +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
>  {
> - struct guc_ct_buffer_desc *desc = ctb->desc;
> - u32 head = READ_ONCE(desc->head);
> + struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
> + u32 head;
>   u32 space;
>  
> - space = CIRC_SPACE(desc->tail, head, ctb->size);
> + if (ctb->space >= len_dw)
> + return true;
> +
> + head = READ_ONCE(ctb->desc->head);
> + if (unlikely(head > ctb->size)) {
> + CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
> +   ctb->desc->head, ctb->desc->tail, ctb->size);
> + ctb->desc->status |= 

[Bug 213569] Amdgpu temperature reaching dangerous levels

2021-06-25 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213569

--- Comment #2 from Martin (martin...@gmx.com) ---
In my case it was watching a video that made the gpu reach 70°C

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v15 00/12] Restricted DMA

2021-06-25 Thread Will Deacon
On Thu, Jun 24, 2021 at 03:19:48PM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Jun 24, 2021 at 11:55:14PM +0800, Claire Chang wrote:
> > This series implements mitigations for lack of DMA access control on
> > systems without an IOMMU, which could result in the DMA accessing the
> > system memory at unexpected times and/or unexpected addresses, possibly
> > leading to data leakage or corruption.
> > 
> > For example, we plan to use the PCI-e bus for Wi-Fi and that PCI-e bus is
> > not behind an IOMMU. As PCI-e, by design, gives the device full access to
> > system memory, a vulnerability in the Wi-Fi firmware could easily escalate
> > to a full system exploit (remote wifi exploits: [1a], [1b] that shows a
> > full chain of exploits; [2], [3]).
> > 
> > To mitigate the security concerns, we introduce restricted DMA. Restricted
> > DMA utilizes the existing swiotlb to bounce streaming DMA in and out of a
> > specially allocated region and does memory allocation from the same region.
> > The feature on its own provides a basic level of protection against the DMA
> > overwriting buffer contents at unexpected times. However, to protect
> > against general data leakage and system memory corruption, the system needs
> > to provide a way to restrict the DMA to a predefined memory region (this is
> > usually done at firmware level, e.g. MPU in ATF on some ARM platforms [4]).
> > 
> > [1a] 
> > https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_4.html
> > [1b] 
> > https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_11.html
> > [2] https://blade.tencent.com/en/advisories/qualpwn/
> > [3] 
> > https://www.bleepingcomputer.com/news/security/vulnerabilities-found-in-highly-popular-firmware-for-wifi-chips/
> > [4] 
> > https://github.com/ARM-software/arm-trusted-firmware/blob/master/plat/mediatek/mt8183/drivers/emi_mpu/emi_mpu.c#L132
> > 
> > v15:
> > - Apply Will's diff 
> > (https://lore.kernel.org/patchwork/patch/1448957/#1647521)
> >   to fix the crash reported by Qian.
> > - Add Stefano's Acked-by tag for patch 01/12 from v14
> 
> That all should be now be on
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb.git/
> devel/for-linus-5.14 (and linux-next)

Thanks Konrad!

Will


[PATCH v2 2/2] drm/i915/gem: only allow WB for smem only placements

2021-06-25 Thread Matthew Auld
We only support single mode and this should be immutable. For smem only
placements on DGFX this should be WB. On DG1 everything is snooped,
always, and so should be coherent.

I915_GEM_DOMAIN_GTT looks like it's for the aperture which is now gone
for DGFX, so it can hopefully also be safely rejected.
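
The userspace-visible effect is roughly the following (an illustrative
sketch against the existing uAPI in include/uapi/drm/i915_drm.h, not part
of this patch):

	struct drm_i915_gem_mmap_offset arg = {
		.handle = handle,	/* smem-only object on DGFX */
		.flags = I915_MMAP_OFFSET_WC,
	};

	/* now expected to fail with -ENODEV; only I915_MMAP_OFFSET_WB is
	 * accepted for such objects, and set_domain must use
	 * I915_GEM_DOMAIN_CPU.
	 */
	ret = ioctl(fd, DRM_IOCTL_I915_GEM_MMAP_OFFSET, &arg);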

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Maarten Lankhorst 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c |  7 +++
 drivers/gpu/drm/i915/gem/i915_gem_mman.c   | 10 ++
 2 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index d0c91697bb22..e3459a524e64 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -577,6 +577,13 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void 
*data,
goto out_unpin;
}
 
+   if (IS_DGFX(to_i915(obj->base.dev)) && obj->mm.n_placements == 1 &&
+   i915_gem_object_placements_contain_type(obj, INTEL_MEMORY_SYSTEM) &&
+   read_domains != I915_GEM_DOMAIN_CPU) {
+   err = -EINVAL;
+   goto out_unpin;
+   }
+
if (read_domains & I915_GEM_DOMAIN_WC)
err = i915_gem_object_set_to_wc_domain(obj, write_domain);
else if (read_domains & I915_GEM_DOMAIN_GTT)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index f3586b36dd53..afc9f3dc38b9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -673,6 +673,7 @@ __assign_mmap_offset(struct drm_i915_gem_object *obj,
 enum i915_mmap_type mmap_type,
 u64 *offset, struct drm_file *file)
 {
+   struct drm_i915_private *i915 = to_i915(obj->base.dev);
struct i915_mmap_offset *mmo;
 
if (i915_gem_object_never_mmap(obj))
@@ -697,6 +698,15 @@ __assign_mmap_offset(struct drm_i915_gem_object *obj,
i915_gem_object_placements_contain_type(obj, INTEL_MEMORY_LOCAL))
return -ENODEV;
 
+   /*
+* For smem only placements on DGFX we need to default to WB. On DG1
+* everything is snooped always, so should always be coherent.
+*/
+if (IS_DGFX(i915) &&
+mmap_type != I915_MMAP_TYPE_WB && obj->mm.n_placements == 1 &&
+i915_gem_object_placements_contain_type(obj, INTEL_MEMORY_SYSTEM))
+   return -ENODEV;
+
mmo = mmap_offset_attach(obj, mmap_type, file);
if (IS_ERR(mmo))
return PTR_ERR(mmo);
-- 
2.26.3



[PATCH v2 1/2] drm/i915/gem: only allow WC for lmem

2021-06-25 Thread Matthew Auld
This is already the case for our kernel internal mappings, and since we
now only support a single mode this should always be WC if the object
can be placed in lmem.

v2: rebase and also update set_domain
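
For illustration, the set_domain side of this from userspace (a sketch
against the existing uAPI, not part of this patch):

	struct drm_i915_gem_set_domain arg = {
		.handle = handle,	/* object that can be placed in lmem */
		.read_domains = I915_GEM_DOMAIN_CPU,
		.write_domain = I915_GEM_DOMAIN_CPU,
	};

	/* now expected to fail with -EINVAL; only I915_GEM_DOMAIN_WC is
	 * accepted for lmem-capable objects.
	 */
	ret = ioctl(fd, DRM_IOCTL_I915_GEM_SET_DOMAIN, &arg);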

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Cc: Maarten Lankhorst 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c |  6 ++
 drivers/gpu/drm/i915/gem/i915_gem_mman.c   |  9 +
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 21 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  4 
 4 files changed, 40 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 073822100da7..d0c91697bb22 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -571,6 +571,12 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void 
*data,
if (READ_ONCE(obj->write_domain) == read_domains)
goto out_unpin;
 
+   if (i915_gem_object_placements_contain_type(obj, INTEL_MEMORY_LOCAL) &&
+   read_domains != I915_GEM_DOMAIN_WC) {
+   err = -EINVAL;
+   goto out_unpin;
+   }
+
if (read_domains & I915_GEM_DOMAIN_WC)
err = i915_gem_object_set_to_wc_domain(obj, write_domain);
else if (read_domains & I915_GEM_DOMAIN_GTT)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index a90f796e85c0..f3586b36dd53 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -688,6 +688,15 @@ __assign_mmap_offset(struct drm_i915_gem_object *obj,
!i915_gem_object_has_iomem(obj))
return -ENODEV;
 
+   /*
+* Note that even if the object can also be placed in smem then we still
+* map as WC here, since we can only support a single mode. On DG1 this
+* sucks since we can't turn off snooping for this case.
+*/
+   if (mmap_type != I915_MMAP_TYPE_WC &&
+   i915_gem_object_placements_contain_type(obj, INTEL_MEMORY_LOCAL))
+   return -ENODEV;
+
mmo = mmap_offset_attach(obj, mmap_type, file);
if (IS_ERR(mmo))
return PTR_ERR(mmo);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 07e8ff9a8aae..326956c18f76 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -513,6 +513,27 @@ bool i915_gem_object_has_iomem(const struct 
drm_i915_gem_object *obj)
return obj->mem_flags & I915_BO_FLAG_IOMEM;
 }
 
+/**
+ * i915_gem_object_placements_contain_type - Check whether the object can be
+ * placed at certain memory type
+ * @obj: Pointer to the object
+ * @type: The memory type to check
+ *
+ * Return: True if the object can be placed in @type. False otherwise.
+ */
+bool i915_gem_object_placements_contain_type(struct drm_i915_gem_object *obj,
+enum intel_memory_type type)
+{
+   unsigned int i;
+
+   for (i = 0; i < obj->mm.n_placements; i++) {
+   if (obj->mm.placements[i]->type == type)
+   return true;
+   }
+
+   return false;
+}
+
 void i915_gem_init__objects(struct drm_i915_private *i915)
 {
	INIT_WORK(&i915->mm.free_work, __i915_gem_free_work);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index ea3224a480c4..e1daa58bc225 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -12,6 +12,7 @@
 #include 
 
 #include "display/intel_frontbuffer.h"
+#include "intel_memory_region.h"
 #include "i915_gem_object_types.h"
 #include "i915_gem_gtt.h"
 #include "i915_gem_ww.h"
@@ -597,6 +598,9 @@ bool i915_gem_object_migratable(struct drm_i915_gem_object 
*obj);
 
 bool i915_gem_object_validates_to_lmem(struct drm_i915_gem_object *obj);
 
+bool i915_gem_object_placements_contain_type(struct drm_i915_gem_object *obj,
+enum intel_memory_type type);
+
 #ifdef CONFIG_MMU_NOTIFIER
 static inline bool
 i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
-- 
2.26.3



[PATCH 2/2] drm/panel: Add Innolux EJ030NA 3.0" 320x480 panel

2021-06-25 Thread Paul Cercueil
From: Christophe Branchereau 

Add support for the Innolux/Chimei EJ030NA 3.0"
320x480 TFT panel.

This panel can be found in the LDKs, RS97 V2.1 and RG300 (non IPS)
handheld gaming consoles.

While being 320x480, it is actually a horizontal 4:3
panel with non-square pixels in delta arrangement.

Signed-off-by: Christophe Branchereau 
Signed-off-by: Paul Cercueil 
---
 drivers/gpu/drm/panel/Kconfig |   9 +
 drivers/gpu/drm/panel/Makefile|   1 +
 drivers/gpu/drm/panel/panel-innolux-ej030na.c | 289 ++
 3 files changed, 299 insertions(+)
 create mode 100644 drivers/gpu/drm/panel/panel-innolux-ej030na.c

diff --git a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig
index 09acd18b3592..bfc6c23b2509 100644
--- a/drivers/gpu/drm/panel/Kconfig
+++ b/drivers/gpu/drm/panel/Kconfig
@@ -134,6 +134,15 @@ config DRM_PANEL_ILITEK_ILI9881C
  Say Y if you want to enable support for panels based on the
  Ilitek ILI9881c controller.
 
+config DRM_PANEL_INNOLUX_EJ030NA
+tristate "Innolux EJ030NA 320x480 LCD panel"
+depends on OF && SPI
+select REGMAP_SPI
+help
+  Say Y here to enable support for the Innolux/Chimei EJ030NA
+  320x480 3.0" panel as found in the RS97 V2.1, RG300(non-ips)
+  and LDK handheld gaming consoles.
+
 config DRM_PANEL_INNOLUX_P079ZCA
tristate "Innolux P079ZCA panel"
depends on OF
diff --git a/drivers/gpu/drm/panel/Makefile b/drivers/gpu/drm/panel/Makefile
index a350e0990d17..1b865e8ea7c9 100644
--- a/drivers/gpu/drm/panel/Makefile
+++ b/drivers/gpu/drm/panel/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_DRM_PANEL_FEIXIN_K101_IM2BA02) += 
panel-feixin-k101-im2ba02.o
 obj-$(CONFIG_DRM_PANEL_FEIYANG_FY07024DI26A30D) += 
panel-feiyang-fy07024di26a30d.o
 obj-$(CONFIG_DRM_PANEL_ILITEK_IL9322) += panel-ilitek-ili9322.o
 obj-$(CONFIG_DRM_PANEL_ILITEK_ILI9881C) += panel-ilitek-ili9881c.o
+obj-$(CONFIG_DRM_PANEL_INNOLUX_EJ030NA) += panel-innolux-ej030na.o
 obj-$(CONFIG_DRM_PANEL_INNOLUX_P079ZCA) += panel-innolux-p079zca.o
 obj-$(CONFIG_DRM_PANEL_JDI_LT070ME05000) += panel-jdi-lt070me05000.o
 obj-$(CONFIG_DRM_PANEL_KHADAS_TS050) += panel-khadas-ts050.o
diff --git a/drivers/gpu/drm/panel/panel-innolux-ej030na.c 
b/drivers/gpu/drm/panel/panel-innolux-ej030na.c
new file mode 100644
index ..4160b99ef544
--- /dev/null
+++ b/drivers/gpu/drm/panel/panel-innolux-ej030na.c
@@ -0,0 +1,289 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Innolux/Chimei EJ030NA TFT LCD panel driver
+ *
+ * Copyright (C) 2020, Paul Cercueil 
+ * Copyright (C) 2020, Christophe Branchereau 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+struct ej030na_info {
+   const struct drm_display_mode *display_modes;
+   unsigned int num_modes;
+   u16 width_mm, height_mm;
+   u32 bus_format, bus_flags;
+};
+
+struct ej030na {
+   struct drm_panel panel;
+   struct spi_device *spi;
+   struct regmap *map;
+
+   const struct ej030na_info *panel_info;
+
+   struct regulator *supply;
+   struct gpio_desc *reset_gpio;
+};
+
+static inline struct ej030na *to_ej030na(struct drm_panel *panel)
+{
+   return container_of(panel, struct ej030na, panel);
+}
+
+static const struct reg_sequence ej030na_init_sequence[] = {
+   { 0x05, 0x1e },
+   { 0x05, 0x5c },
+   { 0x02, 0x14 },
+   { 0x03, 0x40 },
+   { 0x04, 0x07 },
+   { 0x06, 0x12 },
+   { 0x07, 0xd2 },
+   { 0x0c, 0x06 },
+   { 0x0d, 0x40 },
+   { 0x0e, 0x40 },
+   { 0x0f, 0x40 },
+   { 0x10, 0x40 },
+   { 0x11, 0x40 },
+   { 0x2f, 0x40 },
+   { 0x5a, 0x02 },
+
+   { 0x30, 0x07 },
+   { 0x31, 0x57 },
+   { 0x32, 0x53 },
+   { 0x33, 0x77 },
+   { 0x34, 0xb8 },
+   { 0x35, 0xbd },
+   { 0x36, 0xb8 },
+   { 0x37, 0xe7 },
+   { 0x38, 0x04 },
+   { 0x39, 0xff },
+
+   { 0x40, 0x0b },
+   { 0x41, 0xb8 },
+   { 0x42, 0xab },
+   { 0x43, 0xb9 },
+   { 0x44, 0x6a },
+   { 0x45, 0x56 },
+   { 0x46, 0x61 },
+   { 0x47, 0x08 },
+   { 0x48, 0x0f },
+   { 0x49, 0x0f },
+
+   { 0x2b, 0x01 },
+};
+
+static int ej030na_prepare(struct drm_panel *panel)
+{
+   struct ej030na *priv = to_ej030na(panel);
+   struct device *dev = &priv->spi->dev;
+   int err;
+
+   err = regulator_enable(priv->supply);
+   if (err) {
+   dev_err(dev, "Failed to enable power supply: %d\n", err);
+   return err;
+   }
+
+   /* Reset the chip */
+   gpiod_set_value_cansleep(priv->reset_gpio, 1);
+   usleep_range(50, 150);
+   gpiod_set_value_cansleep(priv->reset_gpio, 0);
+   usleep_range(50, 150);
+
+   err = regmap_multi_reg_write(priv->map, ej030na_init_sequence,
+ARRAY_SIZE(ej030na_init_sequence));
+   if (err) {
+

[PATCH 1/2] dt-bindings: display/panel: Add Innolux EJ030NA

2021-06-25 Thread Paul Cercueil
Add binding for the Innolux EJ030NA panel, which is a 320x480 3.0" 4:3
24-bit TFT LCD panel with non-square pixels and a delta-RGB 8-bit
interface.

Signed-off-by: Paul Cercueil 
---
 .../display/panel/innolux,ej030na.yaml| 62 +++
 1 file changed, 62 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/display/panel/innolux,ej030na.yaml

diff --git 
a/Documentation/devicetree/bindings/display/panel/innolux,ej030na.yaml 
b/Documentation/devicetree/bindings/display/panel/innolux,ej030na.yaml
new file mode 100644
index ..cda36c04e85c
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/panel/innolux,ej030na.yaml
@@ -0,0 +1,62 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/panel/innolux,ej030na.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Innolux EJ030NA 3.0" (320x480 pixels) 24-bit TFT LCD panel
+
+description: |
+  The panel must obey the rules for a SPI slave device as specified in
+  spi/spi-controller.yaml
+
+maintainers:
+  - Paul Cercueil 
+
+allOf:
+  - $ref: panel-common.yaml#
+
+properties:
+  compatible:
+const: innolux,ej030na
+
+  backlight: true
+  port: true
+  power-supply: true
+  reg: true
+  reset-gpios: true
+
+required:
+  - compatible
+  - reg
+  - power-supply
+  - reset-gpios
+
+unevaluatedProperties: false
+
+examples:
+  - |
+#include 
+
+spi {
+#address-cells = <1>;
+#size-cells = <0>;
+
+panel@0 {
+compatible = "innolux,ej030na";
+reg = <0>;
+
+spi-max-frequency = <1000>;
+
+reset-gpios = < 4 GPIO_ACTIVE_LOW>;
+power-supply = <_power>;
+
+backlight = <>;
+
+port {
+panel_input: endpoint {
+remote-endpoint = <_output>;
+};
+};
+};
+};
-- 
2.30.2



Re: [PATCH 03/47] drm/i915/guc: Increase size of CTB buffers

2021-06-25 Thread Michal Wajdeczko



On 24.06.2021 17:41, Matthew Brost wrote:
> On Thu, Jun 24, 2021 at 03:49:55PM +0200, Michal Wajdeczko wrote:
>>
>>
>> On 24.06.2021 09:04, Matthew Brost wrote:
>>> With the introduction of non-blocking CTBs more than one CTB can be in
>>> flight at a time. Increasing the size of the CTBs should reduce how
>>> often software hits the case where no space is available in the CTB
>>> buffer.
>>>
>>> Cc: John Harrison 
>>> Signed-off-by: Matthew Brost 
>>> ---
>>>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ---
>>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
>>> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>> index 07f080ddb9ae..a17215920e58 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>> @@ -58,11 +58,16 @@ static inline struct drm_device *ct_to_drm(struct 
>>> intel_guc_ct *ct)
>>>   *  ++---+--+
>>>   *
>>>   * Size of each `CT Buffer`_ must be multiple of 4K.
>>> - * As we don't expect too many messages, for now use minimum sizes.
>>> + * We don't expect too many messages in flight at any time, unless we are
>>> + * using the GuC submission. In that case each request requires a minimum
>>> + * 2 dwords which gives us a maximum 256 queue'd requests. Hopefully this
>>> + * enough space to avoid backpressure on the driver. We increase the size
>>> + * of the receive buffer (relative to the send) to ensure a G2H response
>>> + * CTB has a landing spot.
>>>   */
>>>  #define CTB_DESC_SIZE  ALIGN(sizeof(struct 
>>> guc_ct_buffer_desc), SZ_2K)
>>>  #define CTB_H2G_BUFFER_SIZE	(SZ_4K)
>>> -#define CTB_G2H_BUFFER_SIZE	(SZ_4K)
>>> +#define CTB_G2H_BUFFER_SIZE	(4 * CTB_H2G_BUFFER_SIZE)
>>>  
>>>  struct ct_request {
>>> struct list_head link;
>>> @@ -641,7 +646,7 @@ static int ct_read(struct intel_guc_ct *ct, struct 
>>> ct_incoming_msg **msg)
>>> /* beware of buffer wrap case */
>>> if (unlikely(available < 0))
>>> available += size;
>>> -   CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
>>> +   CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size);
>>
>> CTB size is already printed in intel_guc_ct_init() and is fixed so not
>> sure if repeating it on every ct_read has any benefit
>>
> 
> I'd say more debug the better and if CT_DEBUG is enabled the logs are
> very verbose so an extra value doesn't really hurt.

fair, but this doesn't mean we should add a little/no-value item; anyway,
since DEBUG_GUC is off by default, this is:

Reviewed-by: Michal Wajdeczko 

> 
> Matt
> 
>>> GEM_BUG_ON(available < 0);
>>>  
>>> header = cmds[head];
>>>


Re: [Intel-gfx] [PATCH 02/47] drm/i915/guc: Improve error message for unsolicited CT response

2021-06-25 Thread Michal Wajdeczko



On 24.06.2021 09:04, Matthew Brost wrote:
> Improve the error message when an unsolicited CT response is received by
> printing the fence that couldn't be found, the last fence, and all requests
> with a response outstanding.
> 
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index a59e239497ee..07f080ddb9ae 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -730,12 +730,16 @@ static int ct_handle_response(struct intel_guc_ct *ct, 
> struct ct_incoming_msg *r
>   found = true;
>   break;
>   }
> - spin_unlock_irqrestore(&ct->requests.lock, flags);
> -
>   if (!found) {
>   CT_ERROR(ct, "Unsolicited response (fence %u)\n", fence);
> - return -ENOKEY;
> + CT_ERROR(ct, "Could not find fence=%u, last_fence=%u\n", fence,
> +  ct->requests.last_fence);
> + list_for_each_entry(req, &ct->requests.pending, link)
> + CT_ERROR(ct, "request %u awaits response\n",
> +  req->fence);

not quite sure how listing of awaiting requests could help here (if we
suspect that this is a duplicated reply, then we should rather track
a short list of already processed messages to look there) but since it
does not hurt too much, this is:

Reviewed-by: Michal Wajdeczko 

> + err = -ENOKEY;
>   }
> + spin_unlock_irqrestore(&ct->requests.lock, flags);
>  
>   if (unlikely(err))
>   return err;
> 


Re: [Intel-gfx] [PATCH 04/47] drm/i915/guc: Add non blocking CTB send function

2021-06-25 Thread Michal Wajdeczko



On 25.06.2021 00:41, Matthew Brost wrote:
> On Thu, Jun 24, 2021 at 07:02:18PM +0200, Michal Wajdeczko wrote:
>>
>>
>> On 24.06.2021 17:49, Matthew Brost wrote:
>>> On Thu, Jun 24, 2021 at 04:48:32PM +0200, Michal Wajdeczko wrote:


 On 24.06.2021 09:04, Matthew Brost wrote:
> Add non blocking CTB send function, intel_guc_send_nb. GuC submission
> will send CTBs in the critical path and does not need to wait for these
> CTBs to complete before moving on, hence the need for this new function.
>
> The non-blocking CTB now must have a flow control mechanism to ensure
> the buffer isn't overrun. A lazy spin wait is used as we believe the
> flow control condition should be rare with a properly sized buffer.
>
> The function, intel_guc_send_nb, is exported in this patch but unused.
> Several patches later in the series make use of this function.
>
> Signed-off-by: John Harrison 
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc.h| 12 +++-
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 77 +--
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  3 +-
>  3 files changed, 82 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> index 4abc59f6f3cd..24b1df6ad4ae 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> @@ -74,7 +74,15 @@ static inline struct intel_guc *log_to_guc(struct 
> intel_guc_log *log)
>  static
>  inline int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 
> len)
>  {
> - return intel_guc_ct_send(&guc->ct, action, len, NULL, 0);
> + return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, 0);
> +}
> +
> +#define INTEL_GUC_SEND_NB	BIT(31)

 hmm, this flag really belongs to intel_guc_ct_send() so it should be
 defined as CTB flag near that function declaration

>>>
>>> I can move this up a few lines.
>>>
> +static
> +inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, 
> u32 len)
> +{
> + return intel_guc_ct_send(&guc->ct, action, len, NULL, 0,
> +  INTEL_GUC_SEND_NB);
>  }
>  
>  static inline int
> @@ -82,7 +90,7 @@ intel_guc_send_and_receive(struct intel_guc *guc, const 
> u32 *action, u32 len,
>  u32 *response_buf, u32 response_buf_size)
>  {
>   return intel_guc_ct_send(&guc->ct, action, len,
> -  response_buf, response_buf_size);
> +  response_buf, response_buf_size, 0);
>  }
>  
>  static inline void intel_guc_to_host_event_handler(struct intel_guc *guc)
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index a17215920e58..c9a65d05911f 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -3,6 +3,11 @@
>   * Copyright © 2016-2019 Intel Corporation
>   */
>  
> +#include 
> +#include 
> +#include 
> +#include 
> +
>  #include "i915_drv.h"
>  #include "intel_guc_ct.h"
>  #include "gt/intel_gt.h"
> @@ -373,7 +378,7 @@ static void write_barrier(struct intel_guc_ct *ct)
>  static int ct_write(struct intel_guc_ct *ct,
>   const u32 *action,
>   u32 len /* in dwords */,
> - u32 fence)
> + u32 fence, u32 flags)
>  {
>   struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
>   struct guc_ct_buffer_desc *desc = ctb->desc;
> @@ -421,9 +426,13 @@ static int ct_write(struct intel_guc_ct *ct,
>FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) |
>FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence);
>  
> - hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
> -   FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
> -  GUC_HXG_REQUEST_MSG_0_DATA0, action[0]);
> + hxg = (flags & INTEL_GUC_SEND_NB) ?
> + (FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_EVENT) |
> +  FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |
> + GUC_HXG_EVENT_MSG_0_DATA0, action[0])) :
> + (FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
> +  FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
> + GUC_HXG_REQUEST_MSG_0_DATA0, action[0]));

 or, as we already switched to accepting and returning whole HXG messages in
 guc_send_mmio(), maybe we should do the same for the CTB variant too: instead
 of using an extra flag, just let the caller prepare a proper HXG header with
 the HXG_EVENT type, and then in the CTB code just look at this type to decide
 which code path to use

>>>
>>> Not sure I follow. Anyways 

Re: [PATCH 1/4] drm/i915/gem: Implement object migration

2021-06-25 Thread Matthew Auld

On 24/06/2021 19:31, Thomas Hellström wrote:

Introduce an interface to migrate objects between regions.
This is primarily intended to migrate objects to LMEM for display and
to SYSTEM for dma-buf, but might be reused in one form or another for
performance-based migration.

Signed-off-by: Thomas Hellström 
---
  drivers/gpu/drm/i915/gem/i915_gem_object.c| 91 +++
  drivers/gpu/drm/i915/gem/i915_gem_object.h| 12 +++
  .../gpu/drm/i915/gem/i915_gem_object_types.h  |  9 ++
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 69 ++
  drivers/gpu/drm/i915/gem/i915_gem_wait.c  | 19 
  5 files changed, 183 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 07e8ff9a8aae..6421c3a8b2f3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -513,6 +513,97 @@ bool i915_gem_object_has_iomem(const struct 
drm_i915_gem_object *obj)
return obj->mem_flags & I915_BO_FLAG_IOMEM;
  }
  
+/**

+ * i915_gem_object_can_migrate - Whether an object likely can be migrated
+ *
+ * @obj: The object to migrate
+ * @id: The region intended to migrate to
+ *
+ * Check whether the object backend supports migration to the
+ * given region. Note that pinning may affect the ability to migrate.
+ *
+ * Return: true if migration is possible, false otherwise.
+ */
+bool i915_gem_object_can_migrate(struct drm_i915_gem_object *obj,
+enum intel_region_id id)
+{
+   struct drm_i915_private *i915 = to_i915(obj->base.dev);
+   unsigned int num_allowed = obj->mm.n_placements;
+   struct intel_memory_region *mr;
+   unsigned int i;
+
+   GEM_BUG_ON(id >= INTEL_REGION_UNKNOWN);
+   GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
+
+   if (!obj->ops->migrate)
+   return -EOPNOTSUPP;
+
+   mr = i915->mm.regions[id];


if (!mr)
return false;

?


+   if (obj->mm.region == mr)
+   return true;
+
+   if (!i915_gem_object_evictable(obj))
+   return false;
+
+   if (!(obj->flags & I915_BO_ALLOC_USER))
+   return true;
+
+   if (num_allowed == 0)
+   return false;
+
+   for (i = 0; i < num_allowed; ++i) {
+   if (mr == obj->mm.placements[i])
+   return true;
+   }
+
+   return false;
+}
+
+/**
+ * i915_gem_object_migrate - Migrate an object to the desired region id
+ * @obj: The object to migrate.
+ * @ww: An optional struct i915_gem_ww_ctx. If NULL, the backend may
+ * not be successful in evicting other objects to make room for this object.
+ * @id: The region id to migrate to.
+ *
+ * Attempt to migrate the object to the desired memory region. The
+ * object backend must support migration and the object may not be
+ * pinned, (explicitly pinned pages or pinned vmas). The object must
+ * be locked.
+ * On successful completion, the object will have pages pointing to
+ * memory in the new region, but an async migration task may not have
+ * completed yet, and to accomplish that, i915_gem_object_wait_migration()
+ * must be called.
+ *
+ * Return: 0 on success. Negative error code on failure. In particular may
+ * return -ENXIO on lack of region space, -EDEADLK for deadlock avoidance
+ * if @ww is set, -EINTR or -ERESTARTSYS if signal pending, and
+ * -EBUSY if the object is pinned.
+ */
+int i915_gem_object_migrate(struct drm_i915_gem_object *obj,
+   struct i915_gem_ww_ctx *ww,
+   enum intel_region_id id)
+{
+   struct drm_i915_private *i915 = to_i915(obj->base.dev);
+   struct intel_memory_region *mr;
+
+   GEM_BUG_ON(id >= INTEL_REGION_UNKNOWN);
+   GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED);
+   assert_object_held(obj);
+
+   mr = i915->mm.regions[id];


GEM_BUG_ON(!mr) ?


+   if (obj->mm.region == mr)
+   return 0;
+
+   if (!i915_gem_object_evictable(obj))
+   return -EBUSY;
+
+   if (!obj->ops->migrate)
+   return -EOPNOTSUPP;
+
+   return obj->ops->migrate(obj, mr);
+}
+
  void i915_gem_init__objects(struct drm_i915_private *i915)
  {
	INIT_WORK(&i915->mm.free_work, __i915_gem_free_work);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index ea3224a480c4..8cbd7a5334e2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -17,6 +17,8 @@
  #include "i915_gem_ww.h"
  #include "i915_vma_types.h"
  
+enum intel_region_id;

+
  /*
   * XXX: There is a prevalence of the assumption that we fit the
   * object's page count inside a 32bit _signed_ variable. Let's document
@@ -597,6 +599,16 @@ bool i915_gem_object_migratable(struct drm_i915_gem_object 
*obj);
  
  bool i915_gem_object_validates_to_lmem(struct drm_i915_gem_object *obj);
  
+int 

[PATCH v3 2/2] drm/i915/gtt: ignore min_page_size for paging structures

2021-06-25 Thread Matthew Auld
The min_page_size is only needed for pages inserted into the GTT, and
for our paging structures we only need at most 4K bytes, so simply
ignore the min_page_size restrictions here, otherwise we might see some
severe overallocation on some devices.
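
For example (illustrative numbers only): on a device with a 64K
min_page_size, every 4K paging structure would otherwise be rounded up to a
64K allocation, wasting 60K (~94%) per structure.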

v2(Thomas): add some commentary

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Reviewed-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/gt/intel_gtt.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 084ea65d59c0..f7e0352edb62 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -16,7 +16,19 @@ struct drm_i915_gem_object *alloc_pt_lmem(struct 
i915_address_space *vm, int sz)
 {
struct drm_i915_gem_object *obj;
 
-   obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
+   /*
+* To avoid severe over-allocation when dealing with min_page_size
+* restrictions, we override that behaviour here by allowing an object
+* size and page layout which can be smaller. In practice this should be
+* totally fine, since GTT paging structures are not typically inserted
+* into the GTT.
+*
+* Note that we also hit this path for the scratch page, and for this
+* case it might need to be 64K, but that should work fine here since we
+* used the passed in size for the page size, which should ensure it
+* also has the same alignment.
+*/
+   obj = __i915_gem_object_create_lmem_with_ps(vm->i915, sz, sz, 0);
/*
 * Ensure all paging structures for this vm share the same dma-resv
 * object underneath, with the idea that one object_lock() will lock
-- 
2.26.3



[PATCH v3 1/2] drm/i915: support forcing the page size with lmem

2021-06-25 Thread Matthew Auld
For some specialised objects we might need something larger than the
region's min_page_size due to some hw restriction, and slightly more
hairy is needing something smaller with the guarantee that such objects
will never be inserted into any GTT, which is the case for the paging
structures.

This also fixes how we setup the BO page_alignment, if we later migrate
the object somewhere else. For example if the placements are {SMEM,
LMEM}, then we might get this wrong. Pushing the min_page_size behaviour
into the manager should fix this.

v2(Thomas): push the default page size behaviour into buddy_man, and let
the user override it with the page-alignment, which looks cleaner

v3: rebase on ttm sys changes
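
A caller-side sketch of the new helper (illustrative only, sizes made up;
the in-tree user is the paging-structure allocation in the next patch):

	/* force a 64K minimum page size for this object due to some hw
	 * restriction, regardless of the region default */
	obj = __i915_gem_object_create_lmem_with_ps(i915, SZ_2M, SZ_64K, 0);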

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
Reviewed-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c|  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c  | 33 +-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.h  |  5 ++
 drivers/gpu/drm/i915/gem/i915_gem_region.c| 13 +++-
 drivers/gpu/drm/i915/gem/i915_gem_region.h|  1 +
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c   |  6 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.h   |  1 +
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  3 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  8 +--
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 14 -
 drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |  2 +-
 drivers/gpu/drm/i915/intel_memory_region.h|  1 +
 drivers/gpu/drm/i915/intel_region_ttm.c   |  4 +-
 .../drm/i915/selftests/intel_memory_region.c  | 63 ++-
 drivers/gpu/drm/i915/selftests/mock_region.c  |  1 +
 17 files changed, 143 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index 93bf63bbaff1..51f92e4b1a69 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -90,7 +90,7 @@ i915_gem_setup(struct drm_i915_gem_object *obj, u64 size)
 */
flags = I915_BO_ALLOC_USER;
 
-   ret = mr->ops->init_object(mr, obj, size, flags);
+   ret = mr->ops->init_object(mr, obj, size, 0, flags);
if (ret)
return ret;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index 41d5182cd367..a795dd38aca7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -93,11 +93,42 @@ bool __i915_gem_object_is_lmem(struct drm_i915_gem_object 
*obj)
  mr->type == INTEL_MEMORY_STOLEN_LOCAL);
 }
 
+/**
+ * __i915_gem_object_create_lmem_with_ps - Create lmem object and force the
+ * minimum page size for the backing pages.
+ * @i915: The i915 instance.
+ * @size: The size in bytes for the object. Note that we need to round the size
+ * up depending on the @page_size. The final object size can be fished out from
+ * the drm GEM object.
+ * @page_size: The requested minimum page size in bytes for this object. This 
is
+ * useful if we need something bigger than the regions min_page_size due to 
some
+ * hw restriction, or in some very specialised cases where it needs to be
+ * smaller, where the internal fragmentation cost is too great when rounding up
+ * the object size.
+ * @flags: The optional BO allocation flags.
+ *
+ * Note that this interface assumes you know what you are doing when forcing 
the
+ * @page_size. If this is smaller than the regions min_page_size then it can
+ * never be inserted into any GTT, otherwise it might lead to undefined
+ * behaviour.
+ *
+ * Return: The object pointer, which might be an ERR_PTR in the case of 
failure.
+ */
+struct drm_i915_gem_object *
+__i915_gem_object_create_lmem_with_ps(struct drm_i915_private *i915,
+ resource_size_t size,
+ resource_size_t page_size,
+ unsigned int flags)
+{
+   return 
i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_LMEM],
+size, page_size, flags);
+}
+
 struct drm_i915_gem_object *
 i915_gem_object_create_lmem(struct drm_i915_private *i915,
resource_size_t size,
unsigned int flags)
 {
return 
i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_LMEM],
-size, flags);
+size, 0, flags);
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h 
b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
index 27a611deba47..4ee81fc66302 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
@@ -23,6 +23,11 @@ bool i915_gem_object_is_lmem(struct drm_i915_gem_object 
*obj);
 
 bool 

Re: [RFC PATCH 2/9] drm: bridge: Add Samsung SEC MIPI DSIM bridge driver

2021-06-25 Thread Krzysztof Kozlowski
On 25/06/2021 12:08, Jagan Teki wrote:
> Hi Krzysztof,
> 
> On Fri, Jun 25, 2021 at 2:51 PM Krzysztof Kozlowski
>  wrote:
>>
>> On Thu, 24 Jun 2021 at 14:19, Laurent Pinchart
>>  wrote:
>>>
>>> Hi Jagan,
>>>
>>> On Thu, Jun 24, 2021 at 05:42:43PM +0530, Jagan Teki wrote:
 On Thu, Jun 24, 2021 at 8:18 AM Fabio Estevam wrote:
> On Wed, Jun 23, 2021 at 7:23 PM Laurent Pinchart wrote:
>
>> Looking at the register set, it seems to match the Exynos 5433,
>> supported by drivers/gpu/drm/exynos/exynos_drm_dsi.c. Can we leverage
>> that driver instead of adding a new one for the same IP core ?
>
> Yes. there was an attempt from Michael in this direction:
> https://patchwork.kernel.org/project/dri-devel/cover/20200911135413.3654800-1-m.tret...@pengutronix.de/

 Thanks for the reference, I will check it out and see I can send any
 updated versions wrt my i.MX8MM platform.
>>>
>>> Thanks.
>>>
>>> I had a brief look at the exynos driver, and I think it should be turned
>>> into a DRM bridge as part of this rework to be used with the i.MX8MM.
>>>
>>> Is there someone from Samsung who could assist, at least to test the
>>> changes ?
>>
>> Yes, I mentioned few guys in reply to PHY. Around the DRM drivers you
>> can get in touch with:
>> Inki Dae 
>> Seung-Woo Kim 
>> Marek Szyprowski 
>> Andrzej Hajda 
> 
> Thanks for the information.
> 
>>
>> The easiest testing of the display stack would be on Hardkernel's Odroid
>> XU4 (https://www.hardkernel.com/shop/odroid-xu4-special-price/) however
>> you will not test the DSI/DSIM directly (it has only HDMI port).
> 
> Looks like I found one board with Exynos5430 with DSI. Is this SoC the
> same as the mainline Exynos5433?

No, Exynos5430 is ARMv7. It looks like an improvement over Exynos5422.
Exynos5422 has very good support in mainline, while Exynos5430 was
never touched at all.

Exynos5433 is ARMv8, although many things are shared with 5422. About
DSI I have no clue.


Best regards,
Krzysztof


Re: [RFC PATCH 2/9] drm: bridge: Add Samsung SEC MIPI DSIM bridge driver

2021-06-25 Thread Jagan Teki
Hi Krzysztof,

On Fri, Jun 25, 2021 at 2:51 PM Krzysztof Kozlowski
 wrote:
>
> On Thu, 24 Jun 2021 at 14:19, Laurent Pinchart
>  wrote:
> >
> > Hi Jagan,
> >
> > On Thu, Jun 24, 2021 at 05:42:43PM +0530, Jagan Teki wrote:
> > > On Thu, Jun 24, 2021 at 8:18 AM Fabio Estevam wrote:
> > > > On Wed, Jun 23, 2021 at 7:23 PM Laurent Pinchart wrote:
> > > >
> > > > > Looking at the register set, it seems to match the Exynos 5433,
> > > > > supported by drivers/gpu/drm/exynos/exynos_drm_dsi.c. Can we leverage
> > > > > that driver instead of adding a new one for the same IP core ?
> > > >
> > > > Yes. there was an attempt from Michael in this direction:
> > > > https://patchwork.kernel.org/project/dri-devel/cover/20200911135413.3654800-1-m.tret...@pengutronix.de/
> > >
> > > Thanks for the reference, I will check it out and see I can send any
> > > updated versions wrt my i.MX8MM platform.
> >
> > Thanks.
> >
> > I had a brief look at the exynos driver, and I think it should be turned
> > into a DRM bridge as part of this rework to be used with the i.MX8MM.
> >
> > Is there someone from Samsung who could assist, at least to test the
> > changes ?
>
> Yes, I mentioned few guys in reply to PHY. Around the DRM drivers you
> can get in touch with:
> Inki Dae 
> Seung-Woo Kim 
> Marek Szyprowski 
> Andrzej Hajda 

Thanks for the information.

>
> The easiest testing of the display stack would be on Hardkernel's Odroid
> XU4 (https://www.hardkernel.com/shop/odroid-xu4-special-price/) however
> you will not test the DSI/DSIM directly (it has only HDMI port).

Looks like I found one board with Exynos5430 with DSI. Is this SoC the
same as the mainline Exynos5433?

Jagan.


Re: [PATCH v2] drm/panfrost:report the full raw fault information instead

2021-06-25 Thread Chunyou Tang
Hi Steve,
	Thanks for your reply.
	When I only set pte |= ARM_LPAE_PTE_SH_NS, there is no "GPU
Fault". When I set pte |= ARM_LPAE_PTE_SH_IS (or
ARM_LPAE_PTE_SH_OS), there is a "GPU Fault". I don't know how the pte
shareability affects this issue.
	Can you give me some suggestions again?

Thanks.

Chunyou

On Thu, 24 Jun 2021 14:22:04 +0100,
Steven Price  wrote:

> On 22/06/2021 02:40, Chunyou Tang wrote:
> > Hi Steve,
> > I will send a new patch with suitable subject/commit
> > message. But I send a V3 or a new patch?
> 
> Send a V3 - it is a new version of this patch.
> 
> > I met a bug about the GPU,I have no idea about how to fix
> > it, If you can give me some suggestion,it is perfect.
> > 
> > You can see such kernel log:
> > 
> > Jun 20 10:20:13 icube kernel: [  774.566760] mvp_gpu :05:00.0:
> > GPU Fault 0x0088 (SHAREABILITY_FAULT) at 0x0310fd00 Jun
> > 20 10:20:13 icube kernel: [  774.566764] mvp_gpu :05:00.0:
> > There were multiple GPU faults - some have not been reported Jun 20
> > 10:20:13 icube kernel: [  774.667542] mvp_gpu :05:00.0:
> > AS_ACTIVE bit stuck Jun 20 10:20:13 icube kernel: [  774.767900]
> > mvp_gpu :05:00.0: AS_ACTIVE bit stuck Jun 20 10:20:13 icube
> > kernel: [  774.868546] mvp_gpu :05:00.0: AS_ACTIVE bit stuck
> > Jun 20 10:20:13 icube kernel: [  774.968910] mvp_gpu :05:00.0:
> > AS_ACTIVE bit stuck Jun 20 10:20:13 icube kernel: [  775.069251]
> > mvp_gpu :05:00.0: AS_ACTIVE bit stuck Jun 20 10:20:22 icube
> > kernel: [  783.693971] mvp_gpu :05:00.0: gpu sched timeout,
> > js=1, config=0x7300, status=0x8, head=0x362c900, tail=0x362c100,
> > sched_job=3252fb84
> > 
> > In
> > https://lore.kernel.org/dri-devel/20200510165538.19720-1-peron.c...@gmail.com/
> > there had a same bug like mine,and I found you at the mail list,I
> > don't know how it fixed?
> 
> The GPU_SHAREABILITY_FAULT error means that a cache line has been
> accessed both as shareable and non-shareable and therefore coherency
> cannot be guaranteed. Although the "multiple GPU faults" means that
> this may not be the underlying cause.
> 
> The fact that your dmesg log has PCI style identifiers
> (":05:00.0") suggests this is an unusual platform - I've not
> previously been aware of a Mali device behind PCI. Is this device
> working with the kbase/DDK proprietary driver? It would be worth
> looking at the kbase kernel code for the platform to see if there is
> anything special done for the platform.
> 
> From the dmesg logs all I can really tell is that the GPU seems
> unhappy about the memory system.
> 
> Steve
> 
> > I need your help!
> > 
> > thinks very much!
> > 
> > Chunyou
> > 
> > On Mon, 21 Jun 2021 11:45:20 +0100,
> > Steven Price  wrote:
> > 
> >> On 19/06/2021 04:18, Chunyou Tang wrote:
> >>> Hi Steve,
> >>>   1,Now I know how to write the subject
> >>>   2,the low 8 bits is the exception type in spec.
> >>>
> >>> and you can see prnfrost_exception_name()
> >>>
> >>> switch (exception_code) {
> >>> /* Non-Fault Status code */
> >>> case 0x00: return "NOT_STARTED/IDLE/OK";
> >>> case 0x01: return "DONE";
> >>> case 0x02: return "INTERRUPTED";
> >>> case 0x03: return "STOPPED";
> >>> case 0x04: return "TERMINATED";
> >>> case 0x08: return "ACTIVE";
> >>> 
> >>> 
> >>> case 0xD8: return "ACCESS_FLAG";
> >>> case 0xD9 ... 0xDF: return "ACCESS_FLAG";
> >>> case 0xE0 ... 0xE7: return "ADDRESS_SIZE_FAULT";
> >>> case 0xE8 ... 0xEF: return "MEMORY_ATTRIBUTES_FAULT";
> >>> }
> >>> return "UNKNOWN";
> >>> }
> >>>
> >>> The exception_code in each case label is only 8 bits, so if
> >>> fault_status in panfrost_gpu_irq_handler() is not masked with & 0xFF,
> >>> the lookup cannot find the correct exception reason and always
> >>> returns UNKNOWN.
> >>
> >> Yes, I'm happy with the change - I just need a patch that I can
> >> apply. At the moment this patch only changes the first '0x%08x'
> >> output rather than the call to panfrost_exception_name() as well.
> >> So we just need a patch which does:
> >>
> >> - fault_status & 0xFF, panfrost_exception_name(pfdev,
> >> fault_status),
> >> + fault_status, panfrost_exception_name(pfdev, fault_status &
> >> 0xFF),
> >>
> >> along with a suitable subject/commit message describing the
> >> change. If you can send me that I can apply it.
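
For the archive: with that one-line change applied, the resulting warning
would look roughly like the sketch below. The surrounding dev_warn() call
is assumed to match mainline's panfrost_gpu_irq_handler(); check the exact
format string and variable names against your tree:

	/* Print the full raw status, but mask to the low 8 bits (the
	 * exception type) only for the name lookup. */
	dev_warn(pfdev->dev, "GPU Fault 0x%08x (%s) at 0x%08llx\n",
		 fault_status,
		 panfrost_exception_name(pfdev, fault_status & 0xFF),
		 address);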
> >>
> >> Thanks,
> >>
> >> Steve
> >>
> >> PS. Sorry for going round in circles here - I'm trying to help you
> >> get setup so you'll be able to contribute patches easily in
> >> future. An important part of that is ensuring you can send a
> >> properly formatted patch to the list.
> >>
> >> PPS. I'm still not receiving your emails directly. I don't think
> >> it's a problem at my end because I'm receiving other emails, but
> >> if you can somehow fix the problem you're likely to receive a
> >> faster response.
> >>
> >>> On Fri, 18 Jun 2021 13:43:24 +0100, Steven Price wrote:
> >>>
>  On 17/06/2021 07:20, ChunyouTang wrote:
> > From: ChunyouTang 
> >
> > of the 

Re: [RFC PATCH 2/9] drm: bridge: Add Samsung SEC MIPI DSIM bridge driver

2021-06-25 Thread Krzysztof Kozlowski
On Thu, 24 Jun 2021 at 14:19, Laurent Pinchart
 wrote:
>
> Hi Jagan,
>
> On Thu, Jun 24, 2021 at 05:42:43PM +0530, Jagan Teki wrote:
> > On Thu, Jun 24, 2021 at 8:18 AM Fabio Estevam wrote:
> > > On Wed, Jun 23, 2021 at 7:23 PM Laurent Pinchart wrote:
> > >
> > > > Looking at the register set, it seems to match the Exynos 5433,
> > > > supported by drivers/gpu/drm/exynos/exynos_drm_dsi.c. Can we leverage
> > > > that driver instead of adding a new one for the same IP core ?
> > >
> > > Yes. there was an attempt from Michael in this direction:
> > > https://patchwork.kernel.org/project/dri-devel/cover/20200911135413.3654800-1-m.tret...@pengutronix.de/
> >
> > Thanks for the reference, I will check it out and see I can send any
> > updated versions wrt my i.MX8MM platform.
>
> Thanks.
>
> I had a brief look at the exynos driver, and I think it should be turned
> into a DRM bridge as part of this rework to be used with the i.MX8MM.
>
> Is there someone from Samsung who could assist, at least to test the
> changes ?

Yes, I mentioned a few people in my reply to the PHY patch. Around the DRM drivers you
can get in touch with:
Inki Dae 
Seung-Woo Kim 
Marek Szyprowski 
Andrzej Hajda 

The easiest testing of the display stack would be on Hardkernel's Odroid
XU4 (https://www.hardkernel.com/shop/odroid-xu4-special-price/) however
you will not test the DSI/DSIM directly (it has only HDMI port).

Best regards,
Krzysztof


Re: [PATCH 10/12] dt-bindings: media: rockchip-vpu: Add PX30 compatible

2021-06-25 Thread Dafna Hirschfeld

Hi,

On 24.06.21 21:26, Ezequiel Garcia wrote:

From: Paul Kocialkowski 

The Rockchip PX30 SoC has a Hantro VPU that features a decoder (VDPU2)
and an encoder (VEPU2).

Signed-off-by: Paul Kocialkowski 
Signed-off-by: Ezequiel Garcia 
---
  Documentation/devicetree/bindings/media/rockchip-vpu.yaml | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/media/rockchip-vpu.yaml 
b/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
index b88172a59de7..3b9c5aa91fcc 100644
--- a/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
+++ b/Documentation/devicetree/bindings/media/rockchip-vpu.yaml
@@ -28,6 +28,9 @@ properties:
- items:
- const: rockchip,rk3228-vpu
- const: rockchip,rk3399-vpu
+  - items:
+  - const: rockchip,px30-vpu
+  - const: rockchip,rk3399-vpu


This rk3399 compatible is already mentioned in the last 'items' list, should we 
add it again?

Thanks,
Dafna

  
reg:

  maxItems: 1
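
As a usage illustration, a PX30 .dtsi node would then combine the two
strings, with the rk3399 compatible as fallback. This is only a sketch:
the unit address, interrupt numbers and clock specifiers are placeholders,
not taken from the PX30 documentation:

	vpu: video-codec@ff442000 {
		compatible = "rockchip,px30-vpu", "rockchip,rk3399-vpu";
		reg = <0x0 0xff442000 0x0 0x800>;
		interrupts = <GIC_SPI 57 IRQ_TYPE_LEVEL_HIGH>,
			     <GIC_SPI 58 IRQ_TYPE_LEVEL_HIGH>;
		interrupt-names = "vepu", "vdpu";
		clocks = <&cru ACLK_VPU>, <&cru HCLK_VPU>;
		clock-names = "aclk", "hclk";
	};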



Re: [PATCH 0/6] KVM: Remove uses of struct page from x86 and arm64 MMU

2021-06-25 Thread Christian Borntraeger




On 24.06.21 14:57, Nicholas Piggin wrote:

Excerpts from Paolo Bonzini's message of June 24, 2021 10:41 pm:

On 24/06/21 13:42, Nicholas Piggin wrote:

Excerpts from Nicholas Piggin's message of June 24, 2021 8:34 pm:

Excerpts from David Stevens's message of June 24, 2021 1:57 pm:

KVM supports mapping VM_IO and VM_PFNMAP memory into the guest by using
follow_pte in gfn_to_pfn. However, the resolved pfns may not have
associated struct pages, so they should not be passed to pfn_to_page.
This series removes such calls from the x86 and arm64 secondary MMU. To
do this, this series modifies gfn_to_pfn to return a struct page in
addition to a pfn, if the hva was resolved by gup. This allows the
caller to call put_page only when necessitated by gup.

This series provides a helper function that unwraps the new return type
of gfn_to_pfn to provide behavior identical to the old behavior. As I
have no hardware to test powerpc/mips changes, the function is used
there for minimally invasive changes. Additionally, as gfn_to_page and
gfn_to_pfn_cache are not integrated with mmu notifier, they cannot be
easily changed over to only use pfns.

This addresses CVE-2021-22543 on x86 and arm64.


Does this fix the problem? (untested I don't have a POC setup at hand,
but at least in concept)


This one actually compiles at least. Unfortunately I don't have much
time in the near future to test, and I only just found out about this
CVE a few hours ago.


And it also works (the reproducer gets an infinite stream of userspace
exits and especially does not crash).  We can still go for David's
solution later since MMU notifiers are able to deal with these pages, but
it's a very nice patch for stable kernels.


Oh nice, thanks for testing. How's this?

Thanks,
Nick

---

KVM: Fix page ref underflow for regions with valid but non-refcounted pages

It's possible to create a region which maps valid but non-refcounted
pages (e.g., tail pages of non-compound higher order allocations). These
host pages can then be returned by gfn_to_page, gfn_to_pfn, etc., family
of APIs, which take a reference to the page, which takes it from 0 to 1.
When the reference is dropped, this will free the page incorrectly.

Fix this by only taking a reference on the page if it was non-zero,
which indicates it is participating in normal refcounting (and can be
released with put_page).

Signed-off-by: Nicholas Piggin 
---
  virt/kvm/kvm_main.c | 19 +--
  1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6a6bc7af0e28..46fb042837d2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2055,6 +2055,13 @@ static bool vma_is_valid(struct vm_area_struct *vma, 
bool write_fault)
return true;
  }
  
+static int kvm_try_get_pfn(kvm_pfn_t pfn)

+{
+   if (kvm_is_reserved_pfn(pfn))
+   return 1;
+   return get_page_unless_zero(pfn_to_page(pfn));
+}
+
  static int hva_to_pfn_remapped(struct vm_area_struct *vma,
   unsigned long addr, bool *async,
   bool write_fault, bool *writable,
@@ -2104,13 +2111,21 @@ static int hva_to_pfn_remapped(struct vm_area_struct 
*vma,
 * Whoever called remap_pfn_range is also going to call e.g.
 * unmap_mapping_range before the underlying pages are freed,
 * causing a call to our MMU notifier.
+*
+* Certain IO or PFNMAP mappings can be backed with valid
+* struct pages, but be allocated without refcounting e.g.,
+* tail pages of non-compound higher order allocations, which
+* would then underflow the refcount when the caller does the
+* required put_page. Don't allow those pages here.
 */
-   kvm_get_pfn(pfn);
+   if (!kvm_try_get_pfn(pfn))
+   r = -EFAULT;
  


Right. That should also take care of s390 (pin_guest_page in vsie.c
which calls gfn_to_page).
FWIW, the current API is really hard to follow as it does not tell
which functions take a reference and which don't.
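
To illustrate the caller-side pattern this ambiguity affects, a sketch
(not taken from any particular in-tree caller; the function name is made
up):

	/*
	 * gfn_to_pfn() takes a reference on the backing page when there is
	 * one; the caller must drop it again once it is done with the pfn.
	 */
	static int map_gfn_example(struct kvm *kvm, gfn_t gfn)
	{
		kvm_pfn_t pfn = gfn_to_pfn(kvm, gfn);

		if (is_error_noslot_pfn(pfn))
			return -EFAULT;

		/* ... install pfn into the stage-2/EPT tables ... */

		kvm_release_pfn_clean(pfn);	/* drop the reference again */
		return 0;
	}

With the kvm_try_get_pfn() change above, pfns that look page-backed but
are not refcounted should make hva_to_pfn_remapped() fail with -EFAULT
instead of having their refcount underflowed by the final put.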

Anyway, this patch (with cc stable?)

Reviewed-by: Christian Borntraeger 


  out:
pte_unmap_unlock(ptep, ptl);
*p_pfn = pfn;
-   return 0;
+
+   return r;
  }
  
  /*




Re: [PATCH v4 15/17] drm/uAPI: Move "Broadcast RGB" property from driver specific to general context

2021-06-25 Thread Werner Sembach


On 23.06.21 at 13:26, Pekka Paalanen wrote:
> On Wed, 23 Jun 2021 12:10:14 +0200
> Werner Sembach  wrote:
>
>> On 23.06.21 at 09:48, Pekka Paalanen wrote:
>>> On Tue, 22 Jun 2021 11:57:53 +0200
>>> Werner Sembach  wrote:
>>>  
 On 22.06.21 at 09:25, Pekka Paalanen wrote:
> On Fri, 18 Jun 2021 11:11:14 +0200
> Werner Sembach  wrote:
>
>> Add "Broadcast RGB" to general drm context so that more drivers besides
>> i915 and gma500 can implement it without duplicating code.
>>
>> Userspace can use this property to tell the graphic driver to use full or
>> limited color range for a given connector, overwriting the default
>> behaviour/automatic detection.
>>
>> Possible options are:
>> - Automatic (default/current behaviour)
>> - Full
>> - Limited 16:235
>>
>> In theory the driver should be able to automatically detect the monitors
>> capabilities, but because of flawed standard implementations in Monitors,
>> this might fail. In this case a manual overwrite is required to not have
>> washed out colors or lose details in very dark or bright scenes.
>>
>> Signed-off-by: Werner Sembach 
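
For the archive, one way userspace could exercise the new property once it
is exposed is libdrm's legacy property API. A minimal sketch, assuming only
the property name "Broadcast RGB" and the enum string "Full" from this
patch; the function name and error handling are made up:

	#include <stdint.h>
	#include <string.h>
	#include <xf86drm.h>
	#include <xf86drmMode.h>

	static int set_broadcast_rgb_full(int fd, uint32_t connector_id)
	{
		drmModeObjectProperties *props;
		int ret = -1;

		props = drmModeObjectGetProperties(fd, connector_id,
						   DRM_MODE_OBJECT_CONNECTOR);
		if (!props)
			return -1;

		for (uint32_t i = 0; i < props->count_props; i++) {
			drmModePropertyRes *prop =
				drmModeGetProperty(fd, props->props[i]);

			if (prop && !strcmp(prop->name, "Broadcast RGB")) {
				/* Look up the enum entry named "Full". */
				for (int j = 0; j < prop->count_enums; j++) {
					if (!strcmp(prop->enums[j].name, "Full"))
						ret = drmModeObjectSetProperty(fd,
							connector_id,
							DRM_MODE_OBJECT_CONNECTOR,
							prop->prop_id,
							prop->enums[j].value);
				}
			}
			drmModeFreeProperty(prop);
		}
		drmModeFreeObjectProperties(props);
		return ret;
	}

An atomic commit with DRM_MODE_ATOMIC_ALLOW_MODESET would be the modern
equivalent; since the patch marks connectors_changed, a range change may
require a full modeset.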
>> ---
>>  drivers/gpu/drm/drm_atomic_helper.c |  4 +++
>>  drivers/gpu/drm/drm_atomic_uapi.c   |  4 +++
>>  drivers/gpu/drm/drm_connector.c | 43 +
>>  include/drm/drm_connector.h | 16 +++
>>  4 files changed, 67 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
>> b/drivers/gpu/drm/drm_atomic_helper.c
>> index 90d62f305257..0c89d32efbd0 100644
>> --- a/drivers/gpu/drm/drm_atomic_helper.c
>> +++ b/drivers/gpu/drm/drm_atomic_helper.c
>> @@ -691,6 +691,10 @@ drm_atomic_helper_check_modeset(struct drm_device 
>> *dev,
>>  if (old_connector_state->preferred_color_format 
>> !=
>>  new_connector_state->preferred_color_format)
>>  new_crtc_state->connectors_changed = 
>> true;
>> +
>> +if (old_connector_state->preferred_color_range 
>> !=
>> +new_connector_state->preferred_color_range)
>> +new_crtc_state->connectors_changed = 
>> true;
>>  }
>>  
>>  if (funcs->atomic_check)
>> diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
>> b/drivers/gpu/drm/drm_atomic_uapi.c
>> index c536f5e22016..c589bb1a8163 100644
>> --- a/drivers/gpu/drm/drm_atomic_uapi.c
>> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
>> @@ -798,6 +798,8 @@ static int drm_atomic_connector_set_property(struct 
>> drm_connector *connector,
>>  state->max_requested_bpc = val;
>>  } else if (property == 
>> connector->preferred_color_format_property) {
>>  state->preferred_color_format = val;
>> +} else if (property == 
>> connector->preferred_color_range_property) {
>> +state->preferred_color_range = val;
>>  } else if (connector->funcs->atomic_set_property) {
>>  return connector->funcs->atomic_set_property(connector,
>>  state, property, val);
>> @@ -877,6 +879,8 @@ drm_atomic_connector_get_property(struct 
>> drm_connector *connector,
>>  *val = state->max_requested_bpc;
>>  } else if (property == 
>> connector->preferred_color_format_property) {
>>  *val = state->preferred_color_format;
>> +} else if (property == 
>> connector->preferred_color_range_property) {
>> +*val = state->preferred_color_range;
>>  } else if (connector->funcs->atomic_get_property) {
>>  return connector->funcs->atomic_get_property(connector,
>>  state, property, val);
>> diff --git a/drivers/gpu/drm/drm_connector.c 
>> b/drivers/gpu/drm/drm_connector.c
>> index aea03dd02e33..9bc596638613 100644
>> --- a/drivers/gpu/drm/drm_connector.c
>> +++ b/drivers/gpu/drm/drm_connector.c
>> @@ -905,6 +905,12 @@ static const struct drm_prop_enum_list 
>> drm_active_color_format_enum_list[] = {
>>  { DRM_COLOR_FORMAT_YCRCB420, "ycbcr420" },
>>  };
>>  
>> +static const struct drm_prop_enum_list 
>> drm_preferred_color_range_enum_list[] = {
>> +{ DRM_MODE_COLOR_RANGE_UNSET, "Automatic" },
>> +{ DRM_MODE_COLOR_RANGE_FULL, "Full" },
>> +{ DRM_MODE_COLOR_RANGE_LIMITED_16_235, "Limited 16:235" },
> Hi,
>
> the same question here about these numbers as I asked on the "active
> color range" property.
>
>> +};
>> +
>>  static const struct drm_prop_enum_list 
>> 

[PATCH] drm/i915: Drop all references to DRM IRQ midlayer

2021-06-25 Thread Thomas Zimmermann
Remove all references to DRM's IRQ midlayer.

The code in xcs_resume() probably didn't work as intended. It uses
struct drm_device.irq, which is zero-initialized at allocation but never
set by i915 to the device's interrupt number.

Signed-off-by: Thomas Zimmermann 
Fixes: 536f77b1caa0 ("drm/i915/gt: Call stop_ring() from ring resume, again")
Cc: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Daniel Vetter 
Cc: Rodrigo Vivi 
Cc: Joonas Lahtinen 
Cc: Maarten Lankhorst 
Cc: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/intel_ring_submission.c | 3 ++-
 drivers/gpu/drm/i915/i915_drv.c | 1 -
 drivers/gpu/drm/i915/i915_irq.c | 1 -
 3 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c 
b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 5d42a12ef3d6..d893aaaed74f 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -180,12 +180,13 @@ static bool stop_ring(struct intel_engine_cs *engine)
 static int xcs_resume(struct intel_engine_cs *engine)
 {
struct intel_ring *ring = engine->legacy.ring;
+   struct pci_dev *pdev = to_pci_dev(engine->i915->drm.dev);
 
ENGINE_TRACE(engine, "ring:{HEAD:%04x, TAIL:%04x}\n",
 ring->head, ring->tail);
 
/* Double check the ring is empty & disabled before we resume */
-   synchronize_hardirq(engine->i915->drm.irq);
+   synchronize_hardirq(pdev->irq);
if (!stop_ring(engine))
goto err;
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 850b499c71c8..73de45472f60 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -42,7 +42,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index a11bdb667241..eef616d96f12 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -33,7 +33,6 @@
 #include 
 
 #include 
-#include 
 
 #include "display/intel_de.h"
 #include "display/intel_display_types.h"

base-commit: 8c1323b422f8473421682ba783b5949ddd89a3f4
prerequisite-patch-id: c2b2f08f0eccc9f5df0c0da49fa1d36267deb11d
prerequisite-patch-id: c67e5d886a47b7d0266d81100837557fda34cb24
-- 
2.32.0



Re: [PATCH v3 2/2] backlight: lm3630a: convert to atomic PWM API and check for errors

2021-06-25 Thread Lee Jones
On Thu, 24 Jun 2021, Uwe Kleine-König wrote:

> Hi Lee,
> 
> On Tue, Jun 22, 2021 at 02:12:57PM +0100, Lee Jones wrote:
> > On Mon, 21 Jun 2021, Uwe Kleine-König wrote:
> > 
> > > The practical upside here is that this only needs a single API call to
> > > program the hardware which (depending on the underlaying hardware) can
> > > be more effective and prevents glitches.
> > > 
> > > Up to now the return value of the pwm functions was ignored. Fix this
> > > and propagate the error to the caller.
> > > 
> > > Signed-off-by: Uwe Kleine-König 
> > > ---
> > >  drivers/video/backlight/lm3630a_bl.c | 42 +---
> > >  1 file changed, 19 insertions(+), 23 deletions(-)
> > 
> > Fixed the subject line and applied, thanks.
> 
> It's not obvious to me what needed fixing here, and I don't find where
> you the patches, neither in next nor in
> https://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight.git; so I
> cannot check what you actually changed.
> 
> I assume you did s/lm3630a/lm3630a_bl/ ? I didn't because it felt
> tautological.

No, but perhaps I should have.  Format goes:

<subsystem>: <file>: <description>

Where <file> has the file extension removed.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


[PATCH v4 27/27] drm/zte: Don't set struct drm_device.irq_enabled

2021-06-25 Thread Thomas Zimmermann
The field drm_device.irq_enabled is only used by legacy drivers
with userspace modesetting. Don't set it in zte.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Laurent Pinchart 
Acked-by: Daniel Vetter 
---
 drivers/gpu/drm/zte/zx_drm_drv.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/zte/zx_drm_drv.c b/drivers/gpu/drm/zte/zx_drm_drv.c
index 5506336594e2..064056503ebb 100644
--- a/drivers/gpu/drm/zte/zx_drm_drv.c
+++ b/drivers/gpu/drm/zte/zx_drm_drv.c
@@ -75,12 +75,6 @@ static int zx_drm_bind(struct device *dev)
goto out_unbind;
}
 
-   /*
-* We will manage irq handler on our own.  In this case, irq_enabled
-* need to be true for using vblank core support.
-*/
-   drm->irq_enabled = true;
-
drm_mode_config_reset(drm);
drm_kms_helper_poll_init(drm);
 
-- 
2.32.0


