date:20230419

Re: [PATCH v3] drm/fbdev-generic: prohibit potential out-of-bounds access

2023-04-19 Thread Sui Jingfeng


Hi,

On 2023/4/20 00:31, Daniel Vetter wrote:

On Thu, Apr 20, 2023 at 12:00:41AM +0800, Sui Jingfeng wrote:

Hi,

Sorry for replying so late.

Testing turned up a few bugs in our downstream (product kernel side)
userspace GPU/DC driver, and I have been asked to spend all day on that
part.

I may be slow to reply, but I really do love to reply.


On 2023/4/19 23:09, Daniel Vetter wrote:

On Tue, 18 Apr 2023 at 20:16, Sui Jingfeng <15330273...@189.cn> wrote:

Hi,

On 2023/4/19 01:52, Sui Jingfeng wrote:

Hi,

On 2023/4/18 16:32, Daniel Vetter wrote:

On Mon, Apr 17, 2023 at 07:32:19PM +0800, Sui Jingfeng wrote:

The fbdev test of IGT may write after EOF, which leads to out-of-bounds
access for the drm drivers using fbdev-generic. For example, on an x86
+ aspeed bmc card platform, with a 1680x1050 resolution display,
running the fbdev test of IGT will cause the linux kernel to hang with
the following call trace:

 Oops:  [#1] PREEMPT SMP PTI
 [IGT] fbdev: starting subtest eof
 Workqueue: events drm_fb_helper_damage_work [drm_kms_helper]
 [IGT] fbdev: starting subtest nullptr

 RIP: 0010:memcpy_erms+0xa/0x20
 RSP: 0018:a17d40167d98 EFLAGS: 00010246
 RAX: a17d4eb7fa80 RBX: a17d40e0aa80 RCX: 14c0
 RDX: 1a40 RSI: a17d40e0b000 RDI: a17d4eb8
 RBP: a17d40167e20 R08:  R09: 89522ecff8c0
 R10: a17d4e4c5000 R11:  R12: a17d4eb7fa80
 R13: 1a40 R14: 041a R15: a17d40167e30
 FS:  () GS:89525738()
knlGS:
 CS:  0010 DS:  ES:  CR0: 80050033
 CR2: a17d40e0b000 CR3: 0001eaeca006 CR4: 001706e0
 Call Trace:
  
  ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper]
  drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper]
  process_one_work+0x21f/0x430
  worker_thread+0x4e/0x3c0
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xf4/0x120
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x2c/0x50
  
 CR2: a17d40e0b000
 ---[ end trace  ]---

The direct reason is that the damage rectangle computed by
drm_fb_helper_memory_range_to_clip() is not guaranteed to be
in-bounds. This has already resulted in workaround code spreading
elsewhere. Another reason is that exposing a larger buffer size than
actually needed helps to trigger this bug, which is intrinsic to
drm_fb_helper_memory_range_to_clip().

Other fbdev emulation solutions write to the GEM buffer directly; they
won't reproduce this bug because the .fb_dirty function callback is not
hooked, so drm_fb_helper_memory_range_to_clip() is given no chance
to generate an out-of-bounds rectangle when drm_fb_helper_sys_write()
is called.

This patch breaks the trigger condition of this bug by shrinking the
shadow buffer size to sizes->surface_height * buffer->fb->pitches[0].

Fixes: 8fbc9af55de0 ("drm/fbdev-generic: Set screen size to size of GEM buffer")

Signed-off-by: Sui Jingfeng 
---
drivers/gpu/drm/drm_fbdev_generic.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_fbdev_generic.c
b/drivers/gpu/drm/drm_fbdev_generic.c
index 8e5148bf40bb..b057cfbba938 100644
--- a/drivers/gpu/drm/drm_fbdev_generic.c
+++ b/drivers/gpu/drm/drm_fbdev_generic.c
@@ -94,7 +94,7 @@ static int
drm_fbdev_generic_helper_fb_probe(struct drm_fb_helper *fb_helper,
fb_helper->buffer = buffer;
fb_helper->fb = buffer->fb;
-screen_size = buffer->gem->size;
+screen_size = sizes->surface_height * buffer->fb->pitches[0];

So I read the core code some more and stumbled over
drm_fb_helper_deferred_io(), which has all the code and comments about
this, including limiting.

I think it would be clearer if we fix the issue there, instead of
passing limits around in obscure places that then again get broken?

No, it is more obscure to do it that way...


The size of the shadow screen buffer is exposed to userspace.

The size 'helper->fb->height * helper->fb->pitches[0]' is an
exact (best) fit:

increasing it by one byte is guaranteed to waste at least one byte,

and decreasing it by one byte means it cannot store all pixels (in the
case where `helper->fb->pitches[0] = helper->fb->width * 4`).

It implicitly tells userspace not to go beyond that boundary,

although a userspace program can still choose to write after EOF.

But that is for test purposes, to test whether the kernel returns
-EFBIG or not.
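
To make the exact-fit argument concrete, here is a minimal userspace
sketch, not kernel code. exact_screen_size() and bounded_write_len() are
hypothetical names, and the write check is only a simplified model,
assumed for illustration, of rejecting an offset past the end with
-EFBIG.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Exact-fit size of the visible framebuffer: one pitch per scanline. */
static size_t exact_screen_size(uint32_t height, uint32_t pitch)
{
	return (size_t)height * pitch;
}

/*
 * Hypothetical model of a bounded write: returns the number of bytes
 * that fit, or -EFBIG when the offset is already past the end.
 */
static long bounded_write_len(size_t screen_size, size_t pos, size_t count)
{
	if (pos > screen_size)
		return -EFBIG;
	/* Trim a write that would run past the end of the buffer. */
	if (count > screen_size - pos)
		count = screen_size - pos;
	return (long)count;
}
```

For the 1680x1050 case with a pitch of 1680 * 4 bytes, the exact-fit size
is 7056000 bytes, and a write starting one byte past that is rejected.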


The thing is,
Thomas authored both the limit checks in drm_fb_helper_deferred_io() and
the patch which broke them again, so clearly this isn't very
obvious. I'm thinking of something like this:


diff --git a/drivers/gpu/drm/drm_fb_helper.c
b/drivers/gpu/drm/drm_fb_helper.c
index ef4eb8b12766..726dab67c359 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -697,10 +697,7 @@ void drm_fb_helper_deferred_io(struct fb_info
*info,

[pull] amdgpu drm-fixes-6.3

2023-04-19 Thread Alex Deucher

Hi Dave, Daniel,

Fixes for 6.3.

The following changes since commit 6a8f57ae2eb07ab39a6f0ccad60c760743051026:

  Linux 6.3-rc7 (2023-04-16 15:23:53 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-fixes-6.3-2023-04-19

for you to fetch changes up to 0b5dfe12755f87ec014bb4cc1930485026167430:

  drm/amd/display: fix a divided-by-zero error (2023-04-18 17:20:00 -0400)


amd-drm-fixes-6.3-2023-04-19:

amdgpu:
- GPU reset fix
- DCN 3.1.5 line buffer fix
- Display fix for single channel memory configs
- Fix a possible divide by 0


Alan Liu (1):
  drm/amdgpu: Fix desktop freezed after gpu-reset

Alex Hung (1):
  drm/amd/display: fix a divided-by-zero error

Daniel Miess (1):
  drm/amd/display: limit timing for single dimm memory

Dmytro Laktyushkin (1):
  drm/amd/display: set dcn315 lb bpp to 48

 drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c  |  3 +++
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c   | 17 ++---
 .../gpu/drm/amd/display/dc/dcn314/dcn314_resource.c  | 20 
 drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c |  2 +-
 .../drm/amd/display/modules/power/power_helpers.c|  4 
 5 files changed, 42 insertions(+), 4 deletions(-)

[PATCH v5] drm/fbdev-generic: prohibit potential out-of-bounds access

2023-04-19 Thread Sui Jingfeng

The fbdev test of IGT may write after EOF, which leads to out-of-bounds
access for drm drivers that use fbdev-generic. For example, running the
fbdev test on an x86+ast2400 platform, with 1680x1050 resolution, will
cause the linux kernel to hang with the following call trace:

  Oops:  [#1] PREEMPT SMP PTI
  [IGT] fbdev: starting subtest eof
  Workqueue: events drm_fb_helper_damage_work [drm_kms_helper]
  [IGT] fbdev: starting subtest nullptr

  RIP: 0010:memcpy_erms+0xa/0x20
  RSP: 0018:a17d40167d98 EFLAGS: 00010246
  RAX: a17d4eb7fa80 RBX: a17d40e0aa80 RCX: 14c0
  RDX: 1a40 RSI: a17d40e0b000 RDI: a17d4eb8
  RBP: a17d40167e20 R08:  R09: 89522ecff8c0
  R10: a17d4e4c5000 R11:  R12: a17d4eb7fa80
  R13: 1a40 R14: 041a R15: a17d40167e30
  FS:  () GS:89525738() knlGS:
  CS:  0010 DS:  ES:  CR0: 80050033
  CR2: a17d40e0b000 CR3: 0001eaeca006 CR4: 001706e0
  Call Trace:
   
   ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper]
   drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper]
   process_one_work+0x21f/0x430
   worker_thread+0x4e/0x3c0
   ? __pfx_worker_thread+0x10/0x10
   kthread+0xf4/0x120
   ? __pfx_kthread+0x10/0x10
   ret_from_fork+0x2c/0x50
   
  CR2: a17d40e0b000
  ---[ end trace  ]---

This is because the damage rectangles computed by the
drm_fb_helper_memory_range_to_clip() function are not guaranteed to be
bounded by the screen's active display area. Possible reasons are:

1) Buffers are allocated at page-size granularity, for mmap system
   call support. The shadow screen buffer consumed by fbdev emulation may
   also be chosen to be page-size aligned.

2) The DIV_ROUND_UP() used in drm_fb_helper_memory_range_to_clip()
   will introduce an off-by-one error.

For example, on a 16KB page size system, in order to store a 1920x1080
XRGB framebuffer, we need to allocate 507 pages. Unfortunately, the size
1920*1080*4 cannot be divided exactly by 16KB.

 1920 * 1080 * 4 = 8294400 bytes
 506 * 16 * 1024 = 8290304 bytes
 507 * 16 * 1024 = 8306688 bytes

 line_length = 1920*4 = 7680 bytes

 507 * 16 * 1024 / 7680 = 1081.6

 off / line_length = 507 * 16 * 1024 / 7680 = 1081
 DIV_ROUND_UP(507 * 16 * 1024, 7680) will yield 1082

memcpy_toio() typically issues the copy line by line; when copying the
last line, an out-of-bounds access will happen, because:

 1082 * line_length = 1082 * 7680 = 8309760, and 8309760 > 8306688
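
The arithmetic above can be checked with a small userspace sketch.
range_to_rows() and struct row_range are hypothetical names; the clamping
follows the idea of limiting y1 and y2, under the assumption that
fb_height is the visible height in scanlines.

```c
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Half-open range of scanlines [y1, y2) touched by a memory range. */
struct row_range {
	uint32_t y1;
	uint32_t y2;
};

static struct row_range range_to_rows(uint64_t off, uint64_t len,
				      uint32_t line_length, uint32_t fb_height)
{
	uint64_t end = off + len;
	uint64_t y1 = off / line_length;
	uint64_t y2 = DIV_ROUND_UP(end, line_length);
	struct row_range r;

	/* Do not let either bound go below the bottom of the display area. */
	if (y1 > fb_height)
		y1 = fb_height;
	if (y2 > fb_height)
		y2 = fb_height;

	r.y1 = (uint32_t)y1;
	r.y2 = (uint32_t)y2;
	return r;
}
```

With off = 0, len = 507 * 16 * 1024, line_length = 7680 and
fb_height = 1080, the unclamped y2 would be 1082, while the clamped
result stays at 1080.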

Note that userspace may still write to the invisible area if a larger
buffer than height x stride is exposed. But it is not a big issue as
long as there is still memory to resolve the access, provided it does
not drift too far.

 - Also limit the y1 (Daniel)
 - Keep the fix patch minimal (Daniel)
 - screen_size is page size aligned because it needs mmap (Thomas)
 - Add Fixes tag (Thomas)

Fixes: aa15c677cc34 ("drm/fb-helper: Fix vertical damage clipping")

Signed-off-by: Sui Jingfeng 
Reviewed-by: Thomas Zimmermann 
Tested-by: Geert Uytterhoeven 
Link: https://lore.kernel.org/dri-devel/ad44df29-3241-0d9e-e708-b0338bf3c...@189.cn/
---
 drivers/gpu/drm/drm_fb_helper.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 64458982be40..6bb1b8b27d7a 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -641,19 +641,27 @@ static void drm_fb_helper_damage(struct drm_fb_helper 
*helper, u32 x, u32 y,
 static void drm_fb_helper_memory_range_to_clip(struct fb_info *info, off_t 
off, size_t len,
   struct drm_rect *clip)
 {
+   u32 line_length = info->fix.line_length;
+   u32 fb_height = info->var.yres;
off_t end = off + len;
u32 x1 = 0;
-   u32 y1 = off / info->fix.line_length;
+   u32 y1 = off / line_length;
u32 x2 = info->var.xres;
-   u32 y2 = DIV_ROUND_UP(end, info->fix.line_length);
+   u32 y2 = DIV_ROUND_UP(end, line_length);
+
+   /* Don't allow any of them beyond the bottom bound of display area */
+   if (y1 > fb_height)
+   y1 = fb_height;
+   if (y2 > fb_height)
+   y2 = fb_height;
 
if ((y2 - y1) == 1) {
/*
 * We've only written to a single scanline. Try to reduce
 * the number of horizontal pixels that need an update.
 */
-   off_t bit_off = (off % info->fix.line_length) * 8;
-   off_t bit_end = (end % info->fix.line_length) * 8;
+   off_t bit_off = (off % line_length) * 8;
+   off_t bit_end = (end % line_length) * 8;
 
x1 = bit_off / info->var.bits_per_pixel;
x2 = DIV_ROUND_UP(bit_end, info->var.bits_per_pixel);
-- 
2.25.1

Re: [PATCH 5/6] drm: bridge: samsung-dsim: Support non-burst mode

2023-04-19 Thread Adam Ford

On Mon, Apr 17, 2023 at 6:23 PM Marek Vasut  wrote:
>
> On 4/18/23 00:24, Adam Ford wrote:
> > On Mon, Apr 17, 2023 at 3:08 PM Marek Vasut  wrote:
> >>
> >> On 4/17/23 13:57, Adam Ford wrote:
> >>> On Sun, Apr 16, 2023 at 5:13 PM Marek Vasut  wrote:
> 
> >>>> On 4/15/23 12:41, Adam Ford wrote:
> >>>>> The high-speed clock is hard-coded to the burst-clock
> >>>>> frequency specified in the device tree.  However, when
> >>>>> using devices like certain bridge chips without burst mode
> >>>>> and varying resolutions and refresh rates, it may be
> >>>>> necessary to set the high-speed clock dynamically based
> >>>>> on the desired pixel clock for the connected device.
> >>>>
> >>>> The link rate negotiation should happen internally between the nearest
> >>>> bridge and DSIM, so please add that to DRM core instead of hacking
> >>>> around it by tweaking the HS clock again.
> >>>
> >>> I thought you tried to add something like this before and had some 
> >>> resistance.
> >>
> >> Yes, all my attempts were rejected by a single reviewer. I suspended my
> >> efforts in that area for now.
> >>
> >>> The Pixel clock is set by the bridge already without any new code
> >>> added to the DRM core..  I am just reading that value that's there,
> >>> and setting the clock accordingly.  I don't see how this is a hack.
> >>
> >> Assume you have a DSI-to-HDMI bridge attached to your DSIM bridge, it
> >> operates in non-burst mode, like ADV7533 . How would you configure the
> >
> > I have an ADV7535
> >
> >> HS clock rate for such a bridge in DT ? (hint: you cannot, because the
> >> required clock comes from the EDID, which may not be available just yet)
> >
> > The whole idea is that you wouldn't want to or need to configure the
> > clock speed in the device tree because it comes from the
> > EDID->bridge->DSI.
> >
> > I've tested this configuration on imx8mm, imx8mn, and imx8mp and I can
> > change the resolution and refresh rate on the fly and the DSI will
> > automatically readjust accordingly.   If you fixed the clock in the
> > device tree, you wouldn't be able to do that, and that was the point
> > of this patch.
>
> Uh, I retract my comment, I was clearly confused here and we're talking
> about the same thing.

I'm working on a V2 for this series.  Are you OK with this if I update
the commit message a bit to make it more clear?

adam
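
For illustration only, a rough sketch of deriving the high-speed clock
from the pixel clock. The formula (pixel clock times bits per pixel,
divided by lane count) is an assumption about the usual DSI video-mode
relationship and is not quoted from the patch; dsi_hs_clock_hz() is a
hypothetical name.

```c
#include <stdint.h>

/*
 * Hypothetical helper: DSI bit clock (Hz) needed to carry a pixel
 * stream in non-burst mode. Integer division truncates any remainder.
 */
static uint64_t dsi_hs_clock_hz(uint64_t pixel_clock_hz, unsigned int bpp,
				unsigned int lanes)
{
	if (lanes == 0)
		return 0; /* no lanes configured, nothing sensible to return */

	return pixel_clock_hz * bpp / lanes;
}
```

Under that assumption, a 148.5 MHz pixel clock at 24 bits per pixel over
4 lanes needs 891 MHz, and the value follows the mode whenever the
resolution or refresh rate changes.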

[PATCH] drm/amd/display: remove unused variables otg_inst and cmd

2023-04-19 Thread Tom Rix

gcc reports
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn21/dcn21_hwseq.c:
  In function ‘dcn21_set_backlight_level’:
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn21/dcn21_hwseq.c:229:18:
  error: unused variable ‘otg_inst’ [-Werror=unused-variable]
  229 | uint32_t otg_inst = pipe_ctx->stream_res.tg->inst;
  |  ^~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn21/dcn21_hwseq.c:226:27:
  error: unused variable ‘cmd’ [-Werror=unused-variable]
  226 | union dmub_rb_cmd cmd;
  |   ^~~

These variables are not used, so remove them.

Fixes: e97cc04fe0fb ("drm/amd/display: refactor dmub commands into single function")
Signed-off-by: Tom Rix 
---
 drivers/gpu/drm/amd/display/dc/dcn21/dcn21_hwseq.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_hwseq.c
index 55a464a39529..43463d08f21b 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_hwseq.c
@@ -223,10 +223,8 @@ bool dcn21_set_backlight_level(struct pipe_ctx *pipe_ctx,
uint32_t backlight_pwm_u16_16,
uint32_t frame_ramp)
 {
-   union dmub_rb_cmd cmd;
struct dc_context *dc = pipe_ctx->stream->ctx;
struct abm *abm = pipe_ctx->stream_res.abm;
-   uint32_t otg_inst = pipe_ctx->stream_res.tg->inst;
struct panel_cntl *panel_cntl = pipe_ctx->stream->link->panel_cntl;
 
if (dc->dc->res_pool->dmcu) {
-- 
2.27.0

Re: [PATCH 0/2] DPU1 GC1.8 wiring-up

2023-04-19 Thread Konrad Dybcio




On 20.04.2023 03:28, Abhinav Kumar wrote:
> 
> 
> On 4/19/2023 6:26 PM, Konrad Dybcio wrote:
>>
>>
>> On 20.04.2023 03:25, Dmitry Baryshkov wrote:
>>> On 20/04/2023 04:14, Konrad Dybcio wrote:
>>>> Almost all SoCs from SDM845 to SM8550 inclusive feature a GC1.8
>>>> dspp sub-block in addition to PCCv4. The other blocks differ a bit
>>>> more, but none of them are supported upstream.
>>>>
>>>> This series configures the GCv1.8 on all the relevant SoCs.
>>>
>>> Does this mean that we will see gamma_lut support soon?
>> No promises, my plate is not even full, it's beyond overflowing! :P
>>
>> Konrad
> 
> So I think I wrote about this before during the catalog rework/fixes that the 
> gc registers are not written to / programmed.
> 
> If that's not done, is there any benefit to this series?
Completeness and preparation for the code itself, if nothing else?

Konrad
> 
>>>

>>>> Signed-off-by: Konrad Dybcio 
>>>> ---
>>>> Konrad Dybcio (2):
>>>>     drm/msm/dpu1: Rename sm8150_dspp_blk to sdm845_dspp_blk
>>>>     drm/msm/dpu1: Enable GCv1.8 on many SoCs
>>>>
>>>>    drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h   | 16
>>>>    drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h   | 16
>>>>    drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h   |  4 ++--
>>>>    drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h  |  4 ++--
>>>>    drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h   | 16
>>>>    drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 16
>>>>    drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h   | 16
>>>>    drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h   | 16
>>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c   |  4 +++-
>>>>    9 files changed, 55 insertions(+), 53 deletions(-)
>>>> ---
>>>> base-commit: 3cdbc01c40e34c57697f8934f2727a88551696be
>>>> change-id: 20230420-topic-dpu_gc-6901f75768db
>>>>
>>>> Best regards,
>>>

Re: [PATCH 0/2] DPU1 GC1.8 wiring-up

2023-04-19 Thread Abhinav Kumar





On 4/19/2023 6:26 PM, Konrad Dybcio wrote:



On 20.04.2023 03:25, Dmitry Baryshkov wrote:

On 20/04/2023 04:14, Konrad Dybcio wrote:

Almost all SoCs from SDM845 to SM8550 inclusive feature a GC1.8
dspp sub-block in addition to PCCv4. The other blocks differ a bit
more, but none of them are supported upstream.

This series configures the GCv1.8 on all the relevant SoCs.


Does this mean that we will see gamma_lut support soon?

No promises, my plate is not even full, it's beyond overflowing! :P

Konrad


So I think I wrote about this before during the catalog rework/fixes 
that the gc registers are not written to / programmed.


If that's not done, is there any benefit to this series?





Signed-off-by: Konrad Dybcio 
---
Konrad Dybcio (2):
    drm/msm/dpu1: Rename sm8150_dspp_blk to sdm845_dspp_blk
    drm/msm/dpu1: Enable GCv1.8 on many SoCs

   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h   | 16 

   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h   | 16 

   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h   |  4 ++--
   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h  |  4 ++--
   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h   | 16 

   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 16 

   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h   | 16 

   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h   | 16 

   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c   |  4 +++-
   9 files changed, 55 insertions(+), 53 deletions(-)
---
base-commit: 3cdbc01c40e34c57697f8934f2727a88551696be
change-id: 20230420-topic-dpu_gc-6901f75768db

Best regards,

Re: [PATCH 0/2] DPU1 GC1.8 wiring-up

2023-04-19 Thread Konrad Dybcio




On 20.04.2023 03:25, Dmitry Baryshkov wrote:
> On 20/04/2023 04:14, Konrad Dybcio wrote:
>> Almost all SoCs from SDM845 to SM8550 inclusive feature a GC1.8
>> dspp sub-block in addition to PCCv4. The other blocks differ a bit
>> more, but none of them are supported upstream.
>>
>> This series configures the GCv1.8 on all the relevant SoCs.
> 
> Does this mean that we will see gamma_lut support soon?
No promises, my plate is not even full, it's beyond overflowing! :P

Konrad
> 
>>
>> Signed-off-by: Konrad Dybcio 
>> ---
>> Konrad Dybcio (2):
>>    drm/msm/dpu1: Rename sm8150_dspp_blk to sdm845_dspp_blk
>>    drm/msm/dpu1: Enable GCv1.8 on many SoCs
>>
>>   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h   | 16 
>> 
>>   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h   | 16 
>> 
>>   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h   |  4 ++--
>>   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h  |  4 ++--
>>   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h   | 16 
>> 
>>   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 16 
>> 
>>   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h   | 16 
>> 
>>   drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h   | 16 
>> 
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c   |  4 +++-
>>   9 files changed, 55 insertions(+), 53 deletions(-)
>> ---
>> base-commit: 3cdbc01c40e34c57697f8934f2727a88551696be
>> change-id: 20230420-topic-dpu_gc-6901f75768db
>>
>> Best regards,
>

Re: [PATCH 0/2] DPU1 GC1.8 wiring-up

2023-04-19 Thread Dmitry Baryshkov


On 20/04/2023 04:14, Konrad Dybcio wrote:

Almost all SoCs from SDM845 to SM8550 inclusive feature a GC1.8
dspp sub-block in addition to PCCv4. The other blocks differ a bit
more, but none of them are supported upstream.

This series configures the GCv1.8 on all the relevant SoCs.


Does this mean that we will see gamma_lut support soon?



Signed-off-by: Konrad Dybcio 
---
Konrad Dybcio (2):
   drm/msm/dpu1: Rename sm8150_dspp_blk to sdm845_dspp_blk
   drm/msm/dpu1: Enable GCv1.8 on many SoCs

  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h   | 16 
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h   | 16 
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h   |  4 ++--
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h  |  4 ++--
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h   | 16 
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 16 
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h   | 16 
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h   | 16 
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c   |  4 +++-
  9 files changed, 55 insertions(+), 53 deletions(-)
---
base-commit: 3cdbc01c40e34c57697f8934f2727a88551696be
change-id: 20230420-topic-dpu_gc-6901f75768db

Best regards,


--
With best wishes
Dmitry

[PATCH 1/2] drm/msm/dpu1: Rename sm8150_dspp_blk to sdm845_dspp_blk

2023-04-19 Thread Konrad Dybcio

SDM845 was the first SoC to include both PCC v4 and GC v1.8.
We don't currently support any other blocks, but the common config
for these two can be reused for a large number of SoCs.

Rename it to indicate the origin of that combo.

Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h   | 8 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h   | 8 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h   | 2 +-
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h  | 2 +-
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h   | 8 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 8 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h   | 8 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h   | 8 
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c   | 2 +-
 9 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
index 282d410269ff..c555d43ef0e0 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
@@ -118,13 +118,13 @@ static const struct dpu_lm_cfg sm8150_lm[] = {
 
 static const struct dpu_dspp_cfg sm8150_dspp[] = {
DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
DSPP_BLK("dspp_1", DSPP_1, 0x56000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
DSPP_BLK("dspp_2", DSPP_2, 0x58000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
DSPP_BLK("dspp_3", DSPP_3, 0x5a000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
 };
 
 static const struct dpu_pingpong_cfg sm8150_pp[] = {
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
index 2c40229ea515..c8a174352ede 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
@@ -119,13 +119,13 @@ static const struct dpu_lm_cfg sm8250_lm[] = {
 
 static const struct dpu_dspp_cfg sm8250_dspp[] = {
DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
DSPP_BLK("dspp_1", DSPP_1, 0x56000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
DSPP_BLK("dspp_2", DSPP_2, 0x58000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
DSPP_BLK("dspp_3", DSPP_3, 0x5a000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
 };
 
 static const struct dpu_pingpong_cfg sm8250_pp[] = {
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
index 6f04d8f85c92..00f82b2c18ff 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
@@ -56,7 +56,7 @@ static const struct dpu_lm_cfg sm6115_lm[] = {
 
 static const struct dpu_dspp_cfg sm6115_dspp[] = {
DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
 };
 
 static const struct dpu_pingpong_cfg sm6115_pp[] = {
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h
index 303492d62a5c..5f103140abc7 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h
@@ -53,7 +53,7 @@ static const struct dpu_lm_cfg qcm2290_lm[] = {
 
 static const struct dpu_dspp_cfg qcm2290_dspp[] = {
DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
 };
 
 static const struct dpu_pingpong_cfg qcm2290_pp[] = {
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
index ca107ca8de46..257e898fea18 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
@@ -117,13 +117,13 @@ static const struct dpu_lm_cfg sm8350_lm[] = {
 
 static const struct dpu_dspp_cfg sm8350_dspp[] = {
DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
DSPP_BLK("dspp_1", DSPP_1, 0x56000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
DSPP_BLK("dspp_2", DSPP_2, 0x58000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
DSPP_BLK("dspp_3", DSPP_3, 0x5a000, DSPP_SC7180_MASK,
-_dspp_sblk),
+_dspp_sblk),
 };
 
 static const struct dpu_pingpong_cfg sm8350_pp[]

[PATCH 2/2] drm/msm/dpu1: Enable GCv1.8 on many SoCs

2023-04-19 Thread Konrad Dybcio

There's a plethora of S(D)M-era SoCs that have a GC v1.8 but never
declared, let alone enabled it. Do so!

Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h   | 8 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h   | 8 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h   | 2 +-
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h  | 2 +-
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h   | 8 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 8 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h   | 8 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h   | 8 
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c   | 2 ++
 9 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
index c555d43ef0e0..a49e4d265b73 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
@@ -117,13 +117,13 @@ static const struct dpu_lm_cfg sm8150_lm[] = {
 };
 
 static const struct dpu_dspp_cfg sm8150_dspp[] = {
-   DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_MSM8998_MASK,
 _dspp_sblk),
-   DSPP_BLK("dspp_1", DSPP_1, 0x56000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_1", DSPP_1, 0x56000, DSPP_MSM8998_MASK,
 _dspp_sblk),
-   DSPP_BLK("dspp_2", DSPP_2, 0x58000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_2", DSPP_2, 0x58000, DSPP_MSM8998_MASK,
 _dspp_sblk),
-   DSPP_BLK("dspp_3", DSPP_3, 0x5a000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_3", DSPP_3, 0x5a000, DSPP_MSM8998_MASK,
 _dspp_sblk),
 };
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
index c8a174352ede..80252a96c2fd 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h
@@ -118,13 +118,13 @@ static const struct dpu_lm_cfg sm8250_lm[] = {
 };
 
 static const struct dpu_dspp_cfg sm8250_dspp[] = {
-   DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_MSM8998_MASK,
 _dspp_sblk),
-   DSPP_BLK("dspp_1", DSPP_1, 0x56000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_1", DSPP_1, 0x56000, DSPP_MSM8998_MASK,
 _dspp_sblk),
-   DSPP_BLK("dspp_2", DSPP_2, 0x58000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_2", DSPP_2, 0x58000, DSPP_MSM8998_MASK,
 _dspp_sblk),
-   DSPP_BLK("dspp_3", DSPP_3, 0x5a000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_3", DSPP_3, 0x5a000, DSPP_MSM8998_MASK,
 _dspp_sblk),
 };
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
index 00f82b2c18ff..ea89ba1ab0fd 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
@@ -55,7 +55,7 @@ static const struct dpu_lm_cfg sm6115_lm[] = {
 };
 
 static const struct dpu_dspp_cfg sm6115_dspp[] = {
-   DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_MSM8998_MASK,
 _dspp_sblk),
 };
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h
index 5f103140abc7..739c1a4f6618 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h
@@ -52,7 +52,7 @@ static const struct dpu_lm_cfg qcm2290_lm[] = {
 };
 
 static const struct dpu_dspp_cfg qcm2290_dspp[] = {
-   DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_MSM8998_MASK,
 _dspp_sblk),
 };
 
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
index 257e898fea18..f90eb457ff3d 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h
@@ -116,13 +116,13 @@ static const struct dpu_lm_cfg sm8350_lm[] = {
 };
 
 static const struct dpu_dspp_cfg sm8350_dspp[] = {
-   DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_0", DSPP_0, 0x54000, DSPP_MSM8998_MASK,
 _dspp_sblk),
-   DSPP_BLK("dspp_1", DSPP_1, 0x56000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_1", DSPP_1, 0x56000, DSPP_MSM8998_MASK,
 _dspp_sblk),
-   DSPP_BLK("dspp_2", DSPP_2, 0x58000, DSPP_SC7180_MASK,
+   DSPP_BLK("dspp_2", DSPP_2, 0x58000, DSPP_MSM8998_MASK,
 _dspp_sblk),
-   DSPP_BLK("dspp_3", DSPP_3, 0x5a000, DSPP_SC7180_MASK,
+

[PATCH 0/2] DPU1 GC1.8 wiring-up

2023-04-19 Thread Konrad Dybcio

Almost all SoCs from SDM845 to SM8550 inclusive feature a GC1.8
dspp sub-block in addition to PCCv4. The other blocks differ a bit
more, but none of them are supported upstream.

This series configures the GCv1.8 on all the relevant SoCs.

Signed-off-by: Konrad Dybcio 
---
Konrad Dybcio (2):
  drm/msm/dpu1: Rename sm8150_dspp_blk to sdm845_dspp_blk
  drm/msm/dpu1: Enable GCv1.8 on many SoCs

 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h   | 16 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h   | 16 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h   |  4 ++--
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h  |  4 ++--
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h   | 16 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 16 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h   | 16 
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h   | 16 
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c   |  4 +++-
 9 files changed, 55 insertions(+), 53 deletions(-)
---
base-commit: 3cdbc01c40e34c57697f8934f2727a88551696be
change-id: 20230420-topic-dpu_gc-6901f75768db

Best regards,
-- 
Konrad Dybcio
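
A minimal sketch of how a DSPP feature mask can advertise GC next to
PCC. The enum values, macro names and dspp_has() are hypothetical
stand-ins chosen for illustration, not the definitions from
dpu_hw_catalog.c.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1UL << (n))

/* Hypothetical feature bits for a DSPP block. */
enum dspp_feature {
	DSPP_FEAT_PCC, /* colour correction sub-block */
	DSPP_FEAT_GC,  /* gamma correction sub-block */
};

/* A mask that declares PCC only, and one that also declares GC. */
#define MASK_PCC_ONLY   BIT(DSPP_FEAT_PCC)
#define MASK_PCC_AND_GC (BIT(DSPP_FEAT_PCC) | BIT(DSPP_FEAT_GC))

/* True when the given feature bit is set in the block's mask. */
static bool dspp_has(unsigned long features, enum dspp_feature f)
{
	return (features & BIT(f)) != 0;
}
```

Switching a block from the first mask to the second only declares that
the sub-block exists; whether anything programs the GC registers is a
separate question, as raised in the replies.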

Re: [PATCH v2 16/17] drm/msm/dpu: Implement tearcheck support on INTF block

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

Since DPU 5.0.0 the TEARCHECK registers and interrupts moved out of the
PINGPONG block and into the INTF.  Implement the necessary callbacks in
the INTF block, and use these callbacks together with the INTF_TEAR
interrupts.

Signed-off-by: Marijn Suijten 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c|  11 ++
  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h   |  10 +-
  .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c   | 160 +--
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c| 214 +
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.h|  25 +++
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h|   2 +
  drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h  |  14 ++
  7 files changed, 378 insertions(+), 58 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [Freedreno] [PATCH v2 15/17] drm/msm/dpu: Merge setup_- and enable_tearcheck pingpong callbacks

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

These functions are always called consecutively and are best bundled
together for simplicity, especially when the same structure of callbacks
will be replicated later on the interface block for INTF TE support.
The enable_tearcheck(false) case is now replaced with a more obvious
disable_tearcheck(), encapsulating the original register write with 0.

Suggested-by: Dmitry Baryshkov 
Signed-off-by: Marijn Suijten 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c | 10 --
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c  | 10 ++
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h  | 11 +--
  3 files changed, 15 insertions(+), 16 deletions(-)



Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry
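
For readers following the callback merge described in the commit message above, here is a minimal sketch of the resulting shape. It is not taken from the patch: the register struct, field names and the `te_cfg` parameters are all hypothetical stand-ins, and a plain struct simulates the PINGPONG registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file for one PINGPONG block (illustrative names). */
struct pp_regs {
	uint32_t sync_config_vsync;
	uint32_t sync_config_height;
	uint32_t tear_check_en;
};

/* Hypothetical tearcheck configuration. */
struct te_cfg {
	uint32_t vsync_count;
	uint32_t sync_cfg_height;
};

struct pp_ops {
	int (*enable_tearcheck)(struct pp_regs *pp, const struct te_cfg *cfg);
	void (*disable_tearcheck)(struct pp_regs *pp);
};

/* One call now does what setup_tearcheck() + enable_tearcheck(true) did. */
static int pp_enable_tearcheck(struct pp_regs *pp, const struct te_cfg *cfg)
{
	if (!pp || !cfg)
		return -1;

	/* former "setup" step: program the configuration */
	pp->sync_config_vsync = cfg->vsync_count;
	pp->sync_config_height = cfg->sync_cfg_height;

	/* former "enable(true)" step */
	pp->tear_check_en = 1;
	return 0;
}

/* Replaces enable_tearcheck(false): just the register write with 0. */
static void pp_disable_tearcheck(struct pp_regs *pp)
{
	if (pp)
		pp->tear_check_en = 0;
}

static const struct pp_ops pp_tearcheck_ops = {
	.enable_tearcheck = pp_enable_tearcheck,
	.disable_tearcheck = pp_disable_tearcheck,
};

/* Small helper so callers cannot forget the NULL-callback check. */
static int pp_try_enable(const struct pp_ops *ops, struct pp_regs *pp,
			 const struct te_cfg *cfg)
{
	assert(ops != NULL);
	if (!ops->enable_tearcheck)
		return -1;
	return ops->enable_tearcheck(pp, cfg);
}
```

The point of the split is that callers never pass a boolean: the enable path always carries a configuration, and the disable path needs none.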

Re: [PATCH v2 14/17] drm/msm/dpu: Document and enable TEAR interrupts on DSI interfaces

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

All SoCs since DPU 5.0.0 have the tear interrupt registers moved out of
the PINGPONG block and into the INTF block.  Wire up these interrupts
and IRQ masks on all supported hardware.

Signed-off-by: Marijn Suijten 
---
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h | 12 ++
  .../drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h| 12 ++
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h | 12 ++
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h |  8 ---
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h |  8 ---
  .../drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h|  8 ---
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h | 12 ++
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h |  6 +++--
  .../drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h   | 12 ++
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h | 12 ++
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h | 12 ++
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 15 
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h |  6 +++--
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c  | 27 ++
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h  |  4 
  15 files changed, 125 insertions(+), 41 deletions(-)


If there is a v3 for some reason, please split this into two patches:
core/interrupts and SoC catalog changes.


--
With best wishes
Dmitry

Re: [PATCH 3/4] drm/i915/gsc: add initial support for GSC proxy

2023-04-19 Thread Teres Alexis, Alan Previn

I have a number of comments, but most are personal preferences, so I labelled
them nits.
I did catch a few minor coding style issues and am assuming those need to be
enforced as per i915/kernel rules?
That said, since they are so minor (or maybe they are not strict), I'm
providing a conditional RB, contingent on fixing those 4 issues
(i.e. the header inclusion alphabetical ordering and struct '{' bracket
position).

Reviewed-by: Alan Previn 

On Wed, 2023-03-29 at 09:56 -0700, Ceraolo Spurio, Daniele wrote:
> The GSC uC needs to communicate with the CSME to perform certain
> operations. Since the GSC can't perform this communication directly
> on platforms where it is integrated in GT, i915 needs to transfer the
> messages from GSC to CSME and back.
> The proxy flow is as follow:
> 1 - i915 submits a request to GSC asking for the message to CSME
> 2 - GSC replies with the proxy header + payload for CSME
> 3 - i915 sends the reply from GSC as-is to CSME via the mei proxy
> component
> 4 - CSME replies with the proxy header + payload for GSC
> 5 - i915 submits a request to GSC with the reply from CSME
> 6 - GSC replies either with a new header + payload (same as step 2,
> so we restart from there) or with an end message.
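
The six steps quoted above amount to a simple ping-pong loop. The following is only a sketch of that loop, not the code from the patch: the message struct, the `proxy_ops` callbacks and the fake endpoints are all hypothetical, and the real FW-defined headers are not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message kinds; the real proxy header is not modelled. */
enum proxy_msg_kind { PROXY_MSG_PAYLOAD, PROXY_MSG_END };

struct proxy_msg {
	enum proxy_msg_kind kind;
	int seq;	/* illustrative sequence number only */
};

/* Hypothetical transports standing in for the GSC and mei/CSME paths. */
struct proxy_ops {
	/* steps 1-2 and 5-6: submit a request to GSC, receive its reply */
	int (*gsc_exchange)(void *ctx, const struct proxy_msg *in,
			    struct proxy_msg *out);
	/* steps 3-4: forward the GSC reply to CSME, receive its reply */
	int (*csme_exchange)(void *ctx, const struct proxy_msg *in,
			     struct proxy_msg *out);
};

/* Runs one message chain; returns CSME round-trips done, or < 0 on error. */
static int proxy_run_chain(const struct proxy_ops *ops, void *ctx,
			   int max_rounds)
{
	struct proxy_msg to_gsc = { .kind = PROXY_MSG_PAYLOAD, .seq = 0 };
	struct proxy_msg from_gsc, from_csme;
	int rounds = 0, err;

	assert(ops && ops->gsc_exchange && ops->csme_exchange);

	for (;;) {
		err = ops->gsc_exchange(ctx, &to_gsc, &from_gsc);
		if (err)
			return err;
		if (from_gsc.kind == PROXY_MSG_END)
			return rounds;		/* step 6: end message */
		if (rounds >= max_rounds)
			return -1;		/* guard against endless chains */

		err = ops->csme_exchange(ctx, &from_gsc, &from_csme);
		if (err)
			return err;

		to_gsc = from_csme;		/* step 5: relay CSME reply */
		rounds++;
	}
}

/* Fake endpoints for demonstration: GSC asks for N round-trips, then ends. */
struct fake_state { int payloads_left; };

static int fake_gsc(void *ctx, const struct proxy_msg *in,
		    struct proxy_msg *out)
{
	struct fake_state *st = ctx;

	out->seq = in->seq + 1;
	if (st->payloads_left > 0) {
		st->payloads_left--;
		out->kind = PROXY_MSG_PAYLOAD;
	} else {
		out->kind = PROXY_MSG_END;
	}
	return 0;
}

static int fake_csme(void *ctx, const struct proxy_msg *in,
		     struct proxy_msg *out)
{
	(void)ctx;
	out->kind = PROXY_MSG_PAYLOAD;
	out->seq = in->seq + 1;
	return 0;
}

static const struct proxy_ops fake_ops = { fake_gsc, fake_csme };
```

The `max_rounds` bound is an addition of this sketch; the quoted description does not say whether the real flow caps the chain length.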
> 
> After GSC load, i915 is expected to start the first proxy message chain,
> while all subsequent ones will be triggered by the GSC via interrupt.
> 
> To communicate with the CSME, we use a dedicated mei component, which
> means that we need to wait for it to bind before we can initialize the
> proxies. This usually happens quite fast, but given that there is a
> chance that we'll have to wait a few seconds the GSC work has been moved
> to a dedicated WQ to not stall other processes.
> 
> Signed-off-by: Daniele Ceraolo Spurio 
> Cc: Alan Previn 
> ---
>  drivers/gpu/drm/i915/Makefile |   1 +
>  drivers/gpu/drm/i915/gt/uc/intel_gsc_proxy.c  | 384 ++
>  drivers/gpu/drm/i915/gt/uc/intel_gsc_proxy.h  |  17 +
>  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c |  40 +-
>  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.h |  14 +-
>  .../i915/gt/uc/intel_gsc_uc_heci_cmd_submit.h |   1 +
>  6 files changed, 452 insertions(+), 5 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_proxy.c
>  create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_gsc_proxy.h
> 
alan:snip

> new file mode 100644
> index ..ed8f68e78c26
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_proxy.c
> @@ -0,0 +1,384 @@
> +#include "intel_gsc_proxy.h"
> +
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2022 Intel Corporation
alan: nit: 2022 - 2023?

> + */
> +
> +#include 
> +#include "drm/i915_gsc_proxy_mei_interface.h"
alan: alphabetical
> +#include "drm/i915_component.h"
alan: snip

> +/*
> + * GSC proxy:
> + * The GSC uC needs to communicate with the CSME to perform certain 
> operations.
> + * Since the GSC can't perform this communication directly on platforms 
> where it
> + * is integrated in GT, i915 needs to transfer the messages from GSC to CSME
> + * and back. i915 must manually start the proxy flow after the GSC is loaded 
> to
> + * signal to GSC that we're ready to handle its messages and allow it to 
> query
> + * its init data from CSME; GSC will then trigger an HECI2 interrupt if it 
> needs
> + * to send messages to CSME again.
> + * The proxy flow is as follow:
> + * 1 - i915 submits a request to GSC asking for the message to CSME
> + * 2 - GSC replies with the proxy header + payload for CSME
> + * 3 - i915 sends the reply from GSC as-is to CSME via the mei proxy 
> component
> + * 4 - CSME replies with the proxy header + payload for GSC
> + * 5 - i915 submits a request to GSC with the reply from CSME
> + * 6 - GSC replies either with a new header + payload (same as step 2, so we
> + * restart from there) or with an end message.
> + */
> +
> +/*
> + * The component should load quite quickly in most cases, but it could take
> + * a bit. Using a very big timeout just to cover the worst case scenario
> + */
> +#define GSC_PROXY_INIT_TIMEOUT_MS 2
> +
> +/* the protocol supports up to 32K in each direction */
> +#define GSC_PROXY_BUFFER_SIZE SZ_32K
> +#define GSC_PROXY_CHANNEL_SIZE (GSC_PROXY_BUFFER_SIZE * 2)
> +#define GSC_PROXY_MAX_MSG_SIZE (GSC_PROXY_BUFFER_SIZE - sizeof(struct 
> intel_gsc_mtl_header))
> +
> +/* FW-defined proxy header */
> +struct intel_gsc_proxy_header
> +{
alan: i thought we typically put the '{' on the same line as the struct name 
> 
alan:snip

> +struct gsc_proxy_msg
> +{
alan: shouldnt the '{' be above?
> + struct intel_gsc_mtl_header header;
> + struct intel_gsc_proxy_header proxy_header;
> +} __packed;
> +
> +static int proxy_send_to_csme(struct intel_gsc_uc *gsc)
> +{
> + struct intel_gt *gt = gsc_uc_to_gt(gsc);
> + struct i915_gsc_proxy_component *comp = gsc->proxy.component;
> + struct intel_gsc_mtl_header *hdr;
> + void *in = gsc->proxy.to_csme;
> + void *out =

Re: [PATCH v2 11/17] drm/msm/dpu: Disable MDP vsync source selection on DPU 5.0.0 and above

2023-04-19 Thread Dmitry Baryshkov


On 20/04/2023 04:01, Konrad Dybcio wrote:



On 20.04.2023 03:00, Dmitry Baryshkov wrote:

On 17/04/2023 23:21, Marijn Suijten wrote:

Since hardware revision 5.0.0 the TE configuration moved out of the
PINGPONG block into the INTF block, including vsync source selection
that was previously part of MDP top.  Writing to the MDP_VSYNC_SEL
register has no effect anymore and is omitted downstream via the
DPU/SDE_MDP_VSYNC_SEL feature flag.  This flag is only added to INTF
blocks used by hardware prior to 5.0.0.

The code that writes to these registers in the INTF block will follow in
subsequent patches.

Signed-off-by: Marijn Suijten 
Reviewed-by: Dmitry Baryshkov 
---
   .../drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h    |  2 +-
   .../gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h |  2 +-
   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h |  3 ++
   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c | 52 
+++---
   4 files changed, 41 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
index b7845591c384..6906f8046b9e 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
@@ -30,7 +30,7 @@ static const struct dpu_mdp_cfg msm8998_mdp[] = {
   {
   .name = "top_0", .id = MDP_TOP,
   .base = 0x0, .len = 0x458,
-    .features = 0,
+    .features = BIT(DPU_MDP_VSYNC_SEL),
   .clk_ctrls[DPU_CLK_CTRL_VIG0] = { .reg_off = 0x2ac, .bit_off = 0 },
   .clk_ctrls[DPU_CLK_CTRL_VIG1] = { .reg_off = 0x2b4, .bit_off = 0 },
   .clk_ctrls[DPU_CLK_CTRL_VIG2] = { .reg_off = 0x2bc, .bit_off = 0 },
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
index 5b9b3b99f1b5..14ce397800d5 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
@@ -30,7 +30,7 @@ static const struct dpu_mdp_cfg sdm845_mdp[] = {
   {
   .name = "top_0", .id = MDP_TOP,
   .base = 0x0, .len = 0x45c,
-    .features = BIT(DPU_MDP_AUDIO_SELECT),
+    .features = BIT(DPU_MDP_AUDIO_SELECT) | BIT(DPU_MDP_VSYNC_SEL),
   .clk_ctrls[DPU_CLK_CTRL_VIG0] = { .reg_off = 0x2ac, .bit_off = 0 },
   .clk_ctrls[DPU_CLK_CTRL_VIG1] = { .reg_off = 0x2b4, .bit_off = 0 },
   .clk_ctrls[DPU_CLK_CTRL_VIG2] = { .reg_off = 0x2bc, .bit_off = 0 },
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index 71584cd56fd7..599e177b89dd 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -48,6 +48,8 @@ enum {
    * @DPU_MDP_UBWC_1_5,  Universal Bandwidth compression version 1.5
    * @DPU_MDP_PERIPH_0_REMOVED Indicates that access to periph top0 block 
results
    *   in a failure
+ * @DPU_MDP_VSYNC_SEL  Enables vsync source selection via MDP_VSYNC_SEL 
register
+ * (moved into INTF block since DPU 5.0.0)
    * @DPU_MDP_MAX    Maximum value
      */
@@ -59,6 +61,7 @@ enum {
   DPU_MDP_UBWC_1_5,
   DPU_MDP_AUDIO_SELECT,
   DPU_MDP_PERIPH_0_REMOVED,
+    DPU_MDP_VSYNC_SEL,
   DPU_MDP_MAX
   };
   diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c
index 2bb02e17ee52..9ea15a647a66 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c
@@ -126,28 +126,16 @@ static void dpu_hw_get_danger_status(struct dpu_hw_mdp 
*mdp,
   status->sspp[SSPP_CURSOR1] = (value >> 26) & 0x3;
   }
   -static void dpu_hw_setup_vsync_source(struct dpu_hw_mdp *mdp,
+static void dpu_hw_setup_vsync_source_v1(struct dpu_hw_mdp *mdp,
   struct dpu_vsync_source_cfg *cfg)


In my opinion _v1 is not really descriptive here. Could you please rename it to 
dpu_hw_setup_vsync_source_no_vsync_sel() ?

v1 refers to the CTL rev 100 a.k.a 1.0.0 a.k.a 1, but that's not
yet very well formulated upstream.. if we even need it..


Yeah, but this is mdp_top, not the ctl. And for CTL I'd probably rename _v1
to _active to follow actual feature name.




Konrad


Or maybe rename dpu_hw_setup_vsync_source() to 
dpu_hw_setup_vsync_source_vsync_sel() and drop _v1 from this function.

Up to you.



   {
   struct dpu_hw_blk_reg_map *c;
-    u32 reg, wd_load_value, wd_ctl, wd_ctl2, i;
-    static const u32 pp_offset[PINGPONG_MAX] = {0xC, 0x8, 0x4, 0x13, 0x18};
+    u32 reg, wd_load_value, wd_ctl, wd_ctl2;
   -    if (!mdp || !cfg || (cfg->pp_count > ARRAY_SIZE(cfg->ppnumber)))
+    if (!mdp || !cfg)
   return;
     c = &mdp->hw;
-    reg = DPU_REG_READ(c, MDP_VSYNC_SEL);
-    for (i = 0; i < cfg->pp_count; i++) {
-    int pp_idx = cfg->ppnumber[i] - PINGPONG_0;
-
-    if (pp_idx >= ARRAY_SIZE(pp_offset))
-    continue;
-
-    reg &=

Re: [PATCH v2 13/17] drm/msm/dpu: Factor out shared interrupt register in INTF_BLK macro

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

As the INTF block is going to attain more interrupts that don't share
the same MDP_SSPP_TOP0_INTR register, factor out the _reg argument for
the caller to construct the right interrupt index (register and bit
index) to not make the interrupt bit arguments depend on one of multiple
interrupt register indices.  This brings us more in line with how PP_BLK
specifies its interrupts and allows for better wrapping in the arrays.

Signed-off-by: Marijn Suijten 
---
  .../drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h| 16 +++---
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h | 16 +++---
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h | 16 +++---
  .../drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h| 24 +++
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h | 16 +++---
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h |  8 +++--
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h |  6 ++--
  .../drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h|  6 ++--
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h | 16 +++---
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h | 12 ++--
  .../drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h   | 36 --
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h | 16 +++---
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h | 16 +++---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c |  6 ++--
  14 files changed, 155 insertions(+), 55 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [PATCH v2 11/17] drm/msm/dpu: Disable MDP vsync source selection on DPU 5.0.0 and above

2023-04-19 Thread Konrad Dybcio




On 20.04.2023 03:00, Dmitry Baryshkov wrote:
> On 17/04/2023 23:21, Marijn Suijten wrote:
>> Since hardware revision 5.0.0 the TE configuration moved out of the
>> PINGPONG block into the INTF block, including vsync source selection
>> that was previously part of MDP top.  Writing to the MDP_VSYNC_SEL
>> register has no effect anymore and is omitted downstream via the
>> DPU/SDE_MDP_VSYNC_SEL feature flag.  This flag is only added to INTF
>> blocks used by hardware prior to 5.0.0.
>>
>> The code that writes to these registers in the INTF block will follow in
>> subsequent patches.
>>
>> Signed-off-by: Marijn Suijten 
>> Reviewed-by: Dmitry Baryshkov 
>> ---
>>   .../drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h    |  2 +-
>>   .../gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h |  2 +-
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h |  3 ++
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c | 52 
>> +++---
>>   4 files changed, 41 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h 
>> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
>> index b7845591c384..6906f8046b9e 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
>> @@ -30,7 +30,7 @@ static const struct dpu_mdp_cfg msm8998_mdp[] = {
>>   {
>>   .name = "top_0", .id = MDP_TOP,
>>   .base = 0x0, .len = 0x458,
>> -    .features = 0,
>> +    .features = BIT(DPU_MDP_VSYNC_SEL),
>>   .clk_ctrls[DPU_CLK_CTRL_VIG0] = { .reg_off = 0x2ac, .bit_off = 0 },
>>   .clk_ctrls[DPU_CLK_CTRL_VIG1] = { .reg_off = 0x2b4, .bit_off = 0 },
>>   .clk_ctrls[DPU_CLK_CTRL_VIG2] = { .reg_off = 0x2bc, .bit_off = 0 },
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h 
>> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
>> index 5b9b3b99f1b5..14ce397800d5 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
>> @@ -30,7 +30,7 @@ static const struct dpu_mdp_cfg sdm845_mdp[] = {
>>   {
>>   .name = "top_0", .id = MDP_TOP,
>>   .base = 0x0, .len = 0x45c,
>> -    .features = BIT(DPU_MDP_AUDIO_SELECT),
>> +    .features = BIT(DPU_MDP_AUDIO_SELECT) | BIT(DPU_MDP_VSYNC_SEL),
>>   .clk_ctrls[DPU_CLK_CTRL_VIG0] = { .reg_off = 0x2ac, .bit_off = 0 },
>>   .clk_ctrls[DPU_CLK_CTRL_VIG1] = { .reg_off = 0x2b4, .bit_off = 0 },
>>   .clk_ctrls[DPU_CLK_CTRL_VIG2] = { .reg_off = 0x2bc, .bit_off = 0 },
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h 
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> index 71584cd56fd7..599e177b89dd 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
>> @@ -48,6 +48,8 @@ enum {
>>    * @DPU_MDP_UBWC_1_5,  Universal Bandwidth compression version 1.5
>>    * @DPU_MDP_PERIPH_0_REMOVED Indicates that access to periph top0 block 
>> results
>>    *   in a failure
>> + * @DPU_MDP_VSYNC_SEL  Enables vsync source selection via MDP_VSYNC_SEL 
>> register
>> + * (moved into INTF block since DPU 5.0.0)
>>    * @DPU_MDP_MAX    Maximum value
>>      */
>> @@ -59,6 +61,7 @@ enum {
>>   DPU_MDP_UBWC_1_5,
>>   DPU_MDP_AUDIO_SELECT,
>>   DPU_MDP_PERIPH_0_REMOVED,
>> +    DPU_MDP_VSYNC_SEL,
>>   DPU_MDP_MAX
>>   };
>>   diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c 
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c
>> index 2bb02e17ee52..9ea15a647a66 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c
>> @@ -126,28 +126,16 @@ static void dpu_hw_get_danger_status(struct dpu_hw_mdp 
>> *mdp,
>>   status->sspp[SSPP_CURSOR1] = (value >> 26) & 0x3;
>>   }
>>   -static void dpu_hw_setup_vsync_source(struct dpu_hw_mdp *mdp,
>> +static void dpu_hw_setup_vsync_source_v1(struct dpu_hw_mdp *mdp,
>>   struct dpu_vsync_source_cfg *cfg)
> 
> In my opinion _v1 is not really descriptive here. Could you please rename it 
> to dpu_hw_setup_vsync_source_no_vsync_sel() ?
v1 refers to the CTL rev 100 a.k.a 1.0.0 a.k.a 1, but that's not
yet very well formulated upstream.. if we even need it..

Konrad
> 
> Or maybe rename dpu_hw_setup_vsync_source() to 
> dpu_hw_setup_vsync_source_vsync_sel() and drop _v1 from this function.
> 
> Up to you.
> 
> 
>>   {
>>   struct dpu_hw_blk_reg_map *c;
>> -    u32 reg, wd_load_value, wd_ctl, wd_ctl2, i;
>> -    static const u32 pp_offset[PINGPONG_MAX] = {0xC, 0x8, 0x4, 0x13, 0x18};
>> +    u32 reg, wd_load_value, wd_ctl, wd_ctl2;
>>   -    if (!mdp || !cfg || (cfg->pp_count > ARRAY_SIZE(cfg->ppnumber)))
>> +    if (!mdp || !cfg)
>>   return;
>>     c = &mdp->hw;
>> -    reg = DPU_REG_READ(c, MDP_VSYNC_SEL);
>> -    for (i = 0; i < cfg->pp_count; i++) {
>> -    int pp_idx = cfg->ppnumber[i] -

Re: [PATCH v2 11/17] drm/msm/dpu: Disable MDP vsync source selection on DPU 5.0.0 and above

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

Since hardware revision 5.0.0 the TE configuration moved out of the
PINGPONG block into the INTF block, including vsync source selection
that was previously part of MDP top.  Writing to the MDP_VSYNC_SEL
register has no effect anymore and is omitted downstream via the
DPU/SDE_MDP_VSYNC_SEL feature flag.  This flag is only added to INTF
blocks used by hardware prior to 5.0.0.

The code that writes to these registers in the INTF block will follow in
subsequent patches.

Signed-off-by: Marijn Suijten 
Reviewed-by: Dmitry Baryshkov 
---
  .../drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h|  2 +-
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h |  2 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h |  3 ++
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c | 52 +++---
  4 files changed, 41 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
index b7845591c384..6906f8046b9e 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
@@ -30,7 +30,7 @@ static const struct dpu_mdp_cfg msm8998_mdp[] = {
{
.name = "top_0", .id = MDP_TOP,
.base = 0x0, .len = 0x458,
-   .features = 0,
+   .features = BIT(DPU_MDP_VSYNC_SEL),
.clk_ctrls[DPU_CLK_CTRL_VIG0] = { .reg_off = 0x2ac, .bit_off = 0 },
.clk_ctrls[DPU_CLK_CTRL_VIG1] = { .reg_off = 0x2b4, .bit_off = 0 },
.clk_ctrls[DPU_CLK_CTRL_VIG2] = { .reg_off = 0x2bc, .bit_off = 0 },
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
index 5b9b3b99f1b5..14ce397800d5 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
@@ -30,7 +30,7 @@ static const struct dpu_mdp_cfg sdm845_mdp[] = {
{
.name = "top_0", .id = MDP_TOP,
.base = 0x0, .len = 0x45c,
-   .features = BIT(DPU_MDP_AUDIO_SELECT),
+   .features = BIT(DPU_MDP_AUDIO_SELECT) | BIT(DPU_MDP_VSYNC_SEL),
.clk_ctrls[DPU_CLK_CTRL_VIG0] = { .reg_off = 0x2ac, .bit_off = 0 },
.clk_ctrls[DPU_CLK_CTRL_VIG1] = { .reg_off = 0x2b4, .bit_off = 0 },
.clk_ctrls[DPU_CLK_CTRL_VIG2] = { .reg_off = 0x2bc, .bit_off = 0 },
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
index 71584cd56fd7..599e177b89dd 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
@@ -48,6 +48,8 @@ enum {
   * @DPU_MDP_UBWC_1_5,  Universal Bandwidth compression version 1.5
   * @DPU_MDP_PERIPH_0_REMOVED Indicates that access to periph top0 block 
results
   *   in a failure
+ * @DPU_MDP_VSYNC_SEL  Enables vsync source selection via MDP_VSYNC_SEL 
register
+ * (moved into INTF block since DPU 5.0.0)
   * @DPU_MDP_MAXMaximum value
  
   */

@@ -59,6 +61,7 @@ enum {
DPU_MDP_UBWC_1_5,
DPU_MDP_AUDIO_SELECT,
DPU_MDP_PERIPH_0_REMOVED,
+   DPU_MDP_VSYNC_SEL,
DPU_MDP_MAX
  };
  
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c

index 2bb02e17ee52..9ea15a647a66 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c
@@ -126,28 +126,16 @@ static void dpu_hw_get_danger_status(struct dpu_hw_mdp 
*mdp,
status->sspp[SSPP_CURSOR1] = (value >> 26) & 0x3;
  }
  
-static void dpu_hw_setup_vsync_source(struct dpu_hw_mdp *mdp,

+static void dpu_hw_setup_vsync_source_v1(struct dpu_hw_mdp *mdp,
struct dpu_vsync_source_cfg *cfg)


In my opinion _v1 is not really descriptive here. Could you please 
rename it to dpu_hw_setup_vsync_source_no_vsync_sel() ?


Or maybe rename dpu_hw_setup_vsync_source() to 
dpu_hw_setup_vsync_source_vsync_sel() and drop _v1 from this function.


Up to you.
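
To make the naming question concrete, here is a rough sketch of the second suggestion: the variant that writes MDP_VSYNC_SEL gets the suffixed name and is chosen by feature flag. This is an illustration under assumptions, not the patch: the structs, the zero-based `pp_index` field, the `mdp_init_ops()` helper and the simulated register are hypothetical, and it assumes the suffixed variant chains into the shared part. Only the 4-bit read-modify-write and the offset table follow the removed hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(n)		(1UL << (n))
#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-ins for the catalog feature bits. */
enum { MDP_AUDIO_SELECT, MDP_VSYNC_SEL };

/* Bit offsets of the 4-bit per-pingpong fields, as in the removed table. */
static const uint32_t pp_offset[] = { 0xC, 0x8, 0x4, 0x13, 0x18 };

struct vsync_cfg {
	uint32_t pp_count;
	uint32_t pp_index[5];	/* zero-based pingpong indices (simplified) */
	uint32_t vsync_source;
};

struct mdp {
	unsigned long features;
	uint32_t vsync_sel_reg;	/* simulated MDP_VSYNC_SEL register */
	uint32_t common_calls;	/* counts runs of the shared part */
	void (*setup_vsync_source)(struct mdp *mdp,
				   const struct vsync_cfg *cfg);
};

/* Shared part; the watchdog timer setup would live here. */
static void setup_vsync_source(struct mdp *mdp, const struct vsync_cfg *cfg)
{
	if (!mdp || !cfg)
		return;
	mdp->common_calls++;
}

/* Pre-5.0.0 variant: also programs the per-pingpong source fields. */
static void setup_vsync_source_vsync_sel(struct mdp *mdp,
					 const struct vsync_cfg *cfg)
{
	uint32_t reg, i;

	if (!mdp || !cfg || cfg->pp_count > ARRAY_SIZE(cfg->pp_index))
		return;

	reg = mdp->vsync_sel_reg;
	for (i = 0; i < cfg->pp_count; i++) {
		uint32_t idx = cfg->pp_index[i];

		if (idx >= ARRAY_SIZE(pp_offset))
			continue;

		/* clear the 4-bit field, then insert the new source */
		reg &= ~(0xfu << pp_offset[idx]);
		reg |= (cfg->vsync_source & 0xfu) << pp_offset[idx];
	}
	mdp->vsync_sel_reg = reg;

	setup_vsync_source(mdp, cfg);
}

/* Pick the callback from the feature flag set in the catalog. */
static void mdp_init_ops(struct mdp *mdp)
{
	assert(mdp != NULL);
	if (mdp->features & BIT(MDP_VSYNC_SEL))
		mdp->setup_vsync_source = setup_vsync_source_vsync_sel;
	else
		mdp->setup_vsync_source = setup_vsync_source;
}
```

With this layout the function name says which hardware path it serves, which is the readability concern raised about _v1.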



  {
struct dpu_hw_blk_reg_map *c;
-   u32 reg, wd_load_value, wd_ctl, wd_ctl2, i;
-   static const u32 pp_offset[PINGPONG_MAX] = {0xC, 0x8, 0x4, 0x13, 0x18};
+   u32 reg, wd_load_value, wd_ctl, wd_ctl2;
  
-	if (!mdp || !cfg || (cfg->pp_count > ARRAY_SIZE(cfg->ppnumber)))

+   if (!mdp || !cfg)
return;
  
  	c = &mdp->hw;

-   reg = DPU_REG_READ(c, MDP_VSYNC_SEL);
-   for (i = 0; i < cfg->pp_count; i++) {
-   int pp_idx = cfg->ppnumber[i] - PINGPONG_0;
-
-   if (pp_idx >= ARRAY_SIZE(pp_offset))
-   continue;
-
-   reg &= ~(0xf << pp_offset[pp_idx]);
-   reg |= (cfg->vsync_source & 0xf) << pp_offset[pp_idx];
-   }
-   DPU_REG_WRITE(c, MDP_VSYNC_SEL, reg);
  
  	if (cfg->vsync_source >= DPU_VSYNC_SOURCE_WD_TIMER_4 &&

Re: [PATCH v2 10/17] drm/msm/dpu: Disable pingpong TE on DPU 5.0.0 and above

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

Since hardware revision 5.0.0 the TE configuration moved out of the
PINGPONG block into the INTF block.  Writing these registers has no
effect, and is omitted downstream via the DPU/SDE_PINGPONG_TE feature
flag.  This flag is only added to PINGPONG blocks used by hardware prior
to 5.0.0.

The existing PP_BLK_TE macro has been removed in favour of directly
passing this feature flag, which has thus far been the only difference
with PP_BLK.  PP_BLK_DITHER has been left in place as its embedded
feature flag already excludes this DPU_PINGPONG_TE bit and differs by
setting the block length to zero, as it only contains a DITHER subblock.

The code that writes to these registers in the INTF block will follow in
subsequent patches.

Signed-off-by: Marijn Suijten 
---
  .../drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h|  8 +++
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h |  8 +++
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h | 12 +--
  .../drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h| 12 +--
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h | 12 +--
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h |  4 ++--
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h |  2 +-
  .../drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h|  2 +-
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h | 12 +--
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h |  8 +++
  .../drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h   | 24 ++---
  .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h | 16 +++---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 25 ++
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c| 12 ++-
  14 files changed, 78 insertions(+), 79 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [PATCH v2 09/17] drm/msm/dpu: Move autorefresh disable from CMD encoder to pingpong

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

This autorefresh disable logic in the physical command-mode encoder
consumes three callbacks to the pingpong block, and will explode in
unnecessary complexity when the same callbacks need to be called on the
interface block instead to accommodate INTF TE support.  To clean this
up, move the logic into the pingpong block under a disable_autorefresh
callback, replacing the aforementioned three get_autorefresh,
setup_autorefresh and get_vsync_info callbacks.

The same logic will have to be replicated to the interface block when it
receives INTF TE support, but it is less complex than constantly
switching on a "has_intf_te" boolean to choose a callback.

Suggested-by: Dmitry Baryshkov 
Signed-off-by: Marijn Suijten 
---
  .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c   | 60 ++
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c| 47 +++--
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h| 25 ++---
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h|  4 ++
  4 files changed, 57 insertions(+), 79 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [PATCH v2 08/17] drm/msm/dpu: Drop unused poll_timeout_wr_ptr PINGPONG callback

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

This callback was migrated from downstream when DPU1 was first
introduced to mainline, but never used by any component.  Drop it to
save some lines and unnecessary confusion.

Suggested-by: Dmitry Baryshkov 
Signed-off-by: Marijn Suijten 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c | 18 --
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h |  6 --
  2 files changed, 24 deletions(-)


Reviewed-by: Dmitry Baryshkov 


--
With best wishes
Dmitry

Re: [PATCH v2 07/17] drm/msm/dpu: Sort INTF registers numerically

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

A bunch of registers were appended at the end in e.g. 91143873a05d
("drm/msm/dpu: Add MISR register support for interface") rather than
being inserted in a place that maintains numerical sorting.  Restore
that.


Assuming that = "sort order":

Reviewed-by: Dmitry Baryshkov 

If I don't forget, I'll fix it when applying.



Signed-off-by: Marijn Suijten 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c | 12 +++-
  1 file changed, 7 insertions(+), 5 deletions(-)




--
With best wishes
Dmitry

Re: [PATCH v2 06/17] drm/msm/dpu: Remove extraneous register define indentation

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

A bunch of registers are indented with two extra spaces, looking as if
these are values corresponding to the previous register which is not the
case, rather these are simply also register offsets and should only have
a single space separating them and the #define keyword.

Signed-off-by: Marijn Suijten 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c | 41 +++--
  1 file changed, 21 insertions(+), 20 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [PATCH v2 05/17] drm/msm/dpu: Remove duplicate register defines from INTF

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

The INTF_FRAME_LINE_COUNT_EN, INTF_FRAME_COUNT and INTF_LINE_COUNT
registers are already defined higher up, in the right place when sorted
numerically.

Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support")
Signed-off-by: Marijn Suijten 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c | 5 -
  1 file changed, 5 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [PATCH v2 04/17] drm/msm/dpu: Fix PP_BLK_DIPHER -> DITHER typo

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

SM8550 only comes with a DITHER subblock inside the PINGPONG block,
hence the name and a block length of zero.  However, the PP_BLK macro
name was typo'd to DIPHER rather than DITHER.

Fixes: efcd0107727c ("drm/msm/dpu: add support for SM8550")
Signed-off-by: Marijn Suijten 
---
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h | 16 
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c |  2 +-
  2 files changed, 9 insertions(+), 9 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [PATCH v2 03/17] drm/msm/dpu: Move non-MDP_TOP INTF_INTR offsets out of hwio header

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

These offsets do not fall under the MDP TOP block and do not fit the
comment right above.  Move them to dpu_hw_interrupts.c next to the
respective MDP_INTF_x_OFF interrupt block offsets.

Fixes: 25fdd5933e4c ("drm/msm: Add SDM845 DPU support")
Signed-off-by: Marijn Suijten 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.c | 5 -
  drivers/gpu/drm/msm/disp/dpu1/dpu_hwio.h  | 3 ---
  2 files changed, 4 insertions(+), 4 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [PATCH v2 02/17] drm/msm/dpu: Remove TE2 block and feature from DPU >= 7.0.0 hardware

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

No hardware beyond kona (sm8250) defines the TE2 PINGPONG sub-block
offset downstream.  Even though neither downstream nor upstream utilizes
these registers in any way, remove the erroneous specification for
SC8280XP, SM8350 and SM8450 to prevent confusion.

Note that downstream enables the PPSPLIT (split-FIFO) topology (single
LM for 2 PP and 2 INTF) based on the presence of a TE2 block.

Fixes: f0a1bdf64dd7 ("drm/msm/dpu: Introduce SC8280XP")
Fixes: 0a72f23f6ef8 ("drm/msm/dpu: Add SM8350 to hw catalog")
Fixes: 8cbbc3396065 ("drm/msm/dpu: add support for SM8450")
Signed-off-by: Marijn Suijten 
---
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h   |  4 ++--
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 12 ++--
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h   |  4 ++--
  3 files changed, 10 insertions(+), 10 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [PATCH v2 01/17] drm/msm/dpu: Remove unused INTF0 interrupt mask from SM6115/QCM2290

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 23:21, Marijn Suijten wrote:

Neither of these SoCs has INTF0, they only have a DSI interface on index
1.  Stop enabling an interrupt that can't fire.

Fixes: 3581b7062cec ("drm/msm/disp/dpu1: add support for display on SM6115")
Fixes: 5334087ee743 ("drm/msm: add support for QCM2290 MDSS")
Signed-off-by: Marijn Suijten 
Reviewed-by: Dmitry Baryshkov 
Reviewed-by: Konrad Dybcio 
---
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h  | 1 -
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h | 1 -
  2 files changed, 2 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [Freedreno] [PATCH 5/5] drm/msm/dpu1: Handle the reg bus ICC path

2023-04-19 Thread Dmitry Baryshkov


On 20/04/2023 00:26, Konrad Dybcio wrote:



On 19.04.2023 22:11, Jeykumar Sankaran wrote:



On 4/19/2023 12:48 PM, Konrad Dybcio wrote:



On 19.04.2023 21:06, Jeykumar Sankaran wrote:



On 4/17/2023 8:30 AM, Konrad Dybcio wrote:

Apart from the already handled data bus (MAS_MDP_Pn<->DDR), there's
another path that needs to be handled to ensure MDSS functions properly,
namely the "reg bus", a.k.a the CPU-MDSS interconnect.

Gating that path may have a variety of effects.. from none to otherwise
inexplicable DSI timeouts..

On the DPU side, we need to keep the bus alive. The vendor driver
kickstarts it to max (300Mbps) throughput on first commit, but in
exchange for some battery life in rare DPU-enabled-panel-disabled
usecases, we can request it at DPU init and gate it at suspend.

Signed-off-by: Konrad Dybcio 
---
    drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 22 --
    drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  1 +
    2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index dd6c1c40ab9e..d1f77faebbc0 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -384,15 +384,17 @@ static int dpu_kms_global_obj_init(struct dpu_kms *dpu_kms)
     return 0;
 }

-static int dpu_kms_parse_data_bus_icc_path(struct dpu_kms *dpu_kms)
+static int dpu_kms_parse_icc_paths(struct dpu_kms *dpu_kms)
 {
     struct icc_path *path0;
     struct icc_path *path1;
+    struct icc_path *reg_bus_path;
     struct drm_device *dev = dpu_kms->dev;
     struct device *dpu_dev = dev->dev;

     path0 = msm_icc_get(dpu_dev, "mdp0-mem");
     path1 = msm_icc_get(dpu_dev, "mdp1-mem");
+    reg_bus_path = msm_icc_get(dpu_dev, "cpu-cfg");

     if (IS_ERR_OR_NULL(path0))
         return PTR_ERR_OR_ZERO(path0);
@@ -404,6 +406,10 @@ static int dpu_kms_parse_data_bus_icc_path(struct dpu_kms *dpu_kms)
         dpu_kms->mdp_path[1] = path1;
         dpu_kms->num_mdp_paths++;
     }
+
+    if (!IS_ERR_OR_NULL(reg_bus_path))
+        dpu_kms->reg_bus_path = reg_bus_path;
+
     return 0;
 }

@@ -1039,7 +1045,7 @@ static int dpu_kms_hw_init(struct msm_kms *kms)
         DPU_DEBUG("REG_DMA is not defined");
     }

-    dpu_kms_parse_data_bus_icc_path(dpu_kms);
+    dpu_kms_parse_icc_paths(dpu_kms);

     rc = pm_runtime_resume_and_get(&dpu_kms->pdev->dev);
     if (rc < 0)
@@ -1241,6 +1247,9 @@ static int __maybe_unused dpu_runtime_suspend(struct device *dev)
     for (i = 0; i < dpu_kms->num_mdp_paths; i++)
         icc_set_bw(dpu_kms->mdp_path[i], 0, 0);

+    if (dpu_kms->reg_bus_path)
+        icc_set_bw(dpu_kms->reg_bus_path, 0, 0);
+
     return 0;
 }

@@ -1261,6 +1270,15 @@ static int __maybe_unused dpu_runtime_resume(struct device *dev)
         return rc;
     }

+    /*
+ * The vendor driver supports setting 76.8 / 150 / 300 Mbps on this

How do you arrive at these distinct BW values? Are they provided by the ICC
framework for the given path?

They're hardcoded in the SDE driver.

Konrad

These bandwidths are derived from the scaling frequencies of all the buses 
participating in the icc-path. So they cannot be constants. Ideally they should 
be read from the hw catalog data of the respective platform.

msm-5.4 : rotator/sde_rotator_base.c

static const struct sde_rot_bus_data sde_rot_reg_bus_table[] = {
 {0, 0},
 {0, 76800},
 {0, 15},
 {0, 30},
};

One of the two voters begs to disagree, but I do indeed see that some
SoCs (lahaina, yupik, shima..) cast votes for 74/148/265 MBps instead
of 77/150/300 from the MDSS device (with rotator being considered
separate), or so say their DTs, thanks for pointing that out.


I wonder if it makes a difference. The values are pretty close.
Anyway, we now have the driver data, so you can push these values to the 
data.
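A minimal sketch of what pushing these values to per-platform data could look like. The struct, table, and function names are hypothetical (this is not the actual driver data layout), and the kBps figures are only the rounded numbers quoted in this thread, not values checked against any DT.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-platform entry; not the real driver data layout. */
struct reg_bus_levels {
	const char *soc;	/* illustrative platform name */
	uint32_t kBps[3];	/* low, nominal, high votes in kBps */
};

/* Rounded figures from the thread: 77/150/300 vs 74/148/265 MBps. */
static const struct reg_bus_levels reg_bus_table[] = {
	{ "default", { 76800, 150000, 300000 } },
	{ "lahaina", { 74000, 148000, 265000 } },
};

/* Return the highest reg bus vote for a platform, or the default set. */
uint32_t reg_bus_peak_kBps(const char *soc)
{
	size_t n = sizeof(reg_bus_table) / sizeof(reg_bus_table[0]);
	size_t i;

	/* Entry 0 is the fallback, so start the search at entry 1. */
	for (i = 1; i < n; i++) {
		if (strcmp(reg_bus_table[i].soc, soc) == 0)
			return reg_bus_table[i].kBps[2];
	}
	return reg_bus_table[0].kBps[2];
}
```

With something like this, the resume path would vote the looked-up peak value instead of a hardcoded 300 MBps, and platforms without their own entry would keep the current behaviour.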




Nonetheless, this code would taste good with bolognese sauce..

Konrad



Jeykumar S.

+ * path, but it seems to go for the highest level when display output
+ * is enabled and zero otherwise. For simplicity, we can assume that
+ * DPU being enabled and running implies that.
+ */
+    if (dpu_kms->reg_bus_path)
+        icc_set_bw(dpu_kms->reg_bus_path, 0, MBps_to_icc(300));
+
     dpu_vbif_init_memtypes(dpu_kms);

     drm_for_each_encoder(encoder, ddev)
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
index d5d9bec90705..c332381d58c4 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
@@ -111,6 +111,7 @@ struct dpu_kms {
     atomic_t bandwidth_ref;
     struct icc_path *mdp_path[2];
     u32 num_mdp_paths;
+    struct icc_path *reg_bus_path;
 };

 struct vsync_info {



--
With best wishes
Dmitry

Re: [PATCH v2 3/5] drm/msm/mdss: Rename path references to mdp_path

2023-04-19 Thread Dmitry Baryshkov


On 18/04/2023 15:10, Konrad Dybcio wrote:

The DPU1 driver needs to handle all MDPn<->DDR paths, as well as


Nit: msm_mdss.c is not DPU1.


CPU<->SLAVE_DISPLAY_CFG. The former ones share how their values are
calculated, but the latter one has static predefines spanning all SoCs.

In preparation for supporting the CPU<->SLAVE_DISPLAY_CFG path, rename
the path-related struct members to include "mdp_".

Signed-off-by: Konrad Dybcio 
---
  drivers/gpu/drm/msm/msm_mdss.c | 20 ++--
  1 file changed, 10 insertions(+), 10 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [PATCH v2 2/5] drm/msm/dpu1: Rename path references to mdp_path

2023-04-19 Thread Dmitry Baryshkov


On 18/04/2023 15:10, Konrad Dybcio wrote:

The DPU1 driver needs to handle all MDPn<->DDR paths, as well as
CPU<->SLAVE_DISPLAY_CFG. The former ones share how their values are
calculated, but the latter one has static predefines spanning all SoCs.

In preparation for supporting the CPU<->SLAVE_DISPLAY_CFG path, rename
the path-related struct members to include "mdp_".

Signed-off-by: Konrad Dybcio 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c | 10 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   | 12 ++--
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h   |  4 ++--
  3 files changed, 13 insertions(+), 13 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [Freedreno] [PATCH 1/4] drm/msm: add some cec register bitfield details

2023-04-19 Thread Dmitry Baryshkov


On 20/04/2023 03:27, Abhinav Kumar wrote:



On 4/19/2023 5:21 PM, Dmitry Baryshkov wrote:

On 20/04/2023 03:17, Abhinav Kumar wrote:



On 4/19/2023 5:11 PM, Dmitry Baryshkov wrote:

On 20/04/2023 03:10, Abhinav Kumar wrote:



On 4/19/2023 4:53 PM, Dmitry Baryshkov wrote:

On 18/04/2023 21:10, Arnaud Vrac wrote:

The register names and bitfields were determined from the downstream
msm-4.4 driver.

Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/hdmi/hdmi.xml.h | 62 
-

  1 file changed, 61 insertions(+), 1 deletion(-)


I have opened an MR against Mesa at 
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22588.


The patch is:

Reviewed-by: Dmitry Baryshkov 

Minor nit below



Also, shouldn't the register updates be done using the rnn tool instead 
of manual edits?


We usually update the rnn and ask Rob to pull it at the beginning of 
the cycle.




Sorry, I didn't get this. So you are saying we will accept manual 
edits and then replace it with the tool generated xml later? I was 
not aware of that, because previously I was always asked by Rob to 
use the tool to generate the xml and push that.


We accept manual edits for the patchset (so that one can test it), but 
before merging the patchset we ask Rob to pull the xml.




Interesting. Does Rob generate the xml at that point, or who does that?

The MR you have created updates the freedreno/registers which is just to 
keep the XML in the driver and mesa in sync.


But I am trying to understand who generates the updated xml to merge it 
with the patchset if it's not the developer who does that anymore.


In this case I went on and created the MR as Arnaud didn't create one. 
Yes, usually we do this on our own when updating the register file (in 
other words: I usually edit the xml, then regen the xml.h, then add it 
to the patchset).
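For readers unfamiliar with the generated header, here is a minimal sketch of how the mask/shift helpers in hdmi.xml.h could be used by a caller. The defines and the FRAME_SIZE helper are copied from the quoted hunk; the two wrapper functions, cec_ctrl_for_frame() and cec_tx_status_from_reg(), are hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Copied from the quoted hdmi.xml.h hunk. */
#define HDMI_CEC_CTRL_ENABLE			0x0001
#define HDMI_CEC_CTRL_SEND_TRIGGER		0x0002
#define HDMI_CEC_CTRL_FRAME_SIZE__MASK		0x01f0
#define HDMI_CEC_CTRL_FRAME_SIZE__SHIFT		4
#define HDMI_CEC_STATUS_TX_STATUS__MASK		0x00f0
#define HDMI_CEC_STATUS_TX_STATUS__SHIFT	4

static inline uint32_t HDMI_CEC_CTRL_FRAME_SIZE(uint32_t val)
{
	return ((val) << HDMI_CEC_CTRL_FRAME_SIZE__SHIFT) &
		HDMI_CEC_CTRL_FRAME_SIZE__MASK;
}

/* Hypothetical: build a CEC_CTRL value that sends a frame of nbytes. */
uint32_t cec_ctrl_for_frame(uint32_t nbytes)
{
	return HDMI_CEC_CTRL_ENABLE | HDMI_CEC_CTRL_SEND_TRIGGER |
	       HDMI_CEC_CTRL_FRAME_SIZE(nbytes);
}

/* Hypothetical: extract the TX_STATUS field from a CEC_STATUS readout. */
uint32_t cec_tx_status_from_reg(uint32_t status)
{
	return (status & HDMI_CEC_STATUS_TX_STATUS__MASK) >>
		HDMI_CEC_STATUS_TX_STATUS__SHIFT;
}
```

Note that the generated helpers only pack a field into a register value; reading a field back, as in the second function, is left to the caller. The mask also silently truncates an oversized frame size rather than reporting an error.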










diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h 
b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h

index 973b460486a5a..b4dd6e8cba6b7 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
@@ -76,6 +76,13 @@ enum hdmi_acr_cts {
  ACR_48 = 3,
  };
+enum hdmi_cec_tx_status {
+    CEC_TX_OK = 0,
+    CEC_TX_NACK = 1,
+    CEC_TX_ARB_LOSS = 2,
+    CEC_TX_MAX_RETRIES = 3,
+};
+
  #define REG_HDMI_CTRL    0x
  #define HDMI_CTRL_ENABLE    0x0001
  #define HDMI_CTRL_HDMI    0x0002
@@ -476,20 +483,73 @@ static inline uint32_t 
HDMI_DDC_REF_REFTIMER(uint32_t val)

  #define REG_HDMI_HDCP_SW_LOWER_AKSV    0x0288
  #define REG_HDMI_CEC_CTRL    0x028c
+#define HDMI_CEC_CTRL_ENABLE    0x0001
+#define HDMI_CEC_CTRL_SEND_TRIGGER    0x0002
+#define HDMI_CEC_CTRL_FRAME_SIZE__MASK    0x01f0
+#define HDMI_CEC_CTRL_FRAME_SIZE__SHIFT    4
+static inline uint32_t HDMI_CEC_CTRL_FRAME_SIZE(uint32_t val)
+{
+    return ((val) << HDMI_CEC_CTRL_FRAME_SIZE__SHIFT) & 
HDMI_CEC_CTRL_FRAME_SIZE__MASK;

+}
+#define HDMI_CEC_CTRL_LINE_OE    0x0200
  #define REG_HDMI_CEC_WR_DATA    0x0290
+#define HDMI_CEC_WR_DATA_BROADCAST    0x0001
+#define HDMI_CEC_WR_DATA_DATA__MASK    0xff00
+#define HDMI_CEC_WR_DATA_DATA__SHIFT    8
+static inline uint32_t HDMI_CEC_WR_DATA_DATA(uint32_t val)
+{
+    return ((val) << HDMI_CEC_WR_DATA_DATA__SHIFT) & 
HDMI_CEC_WR_DATA_DATA__MASK;

+}
-#define REG_HDMI_CEC_CEC_RETRANSMIT    0x0294
+#define REG_HDMI_CEC_RETRANSMIT    0x0294
+#define HDMI_CEC_RETRANSMIT_ENABLE    0x0001
+#define HDMI_CEC_RETRANSMIT_COUNT__MASK    0x00fe
+#define HDMI_CEC_RETRANSMIT_COUNT__SHIFT    1
+static inline uint32_t HDMI_CEC_RETRANSMIT_COUNT(uint32_t val)
+{
+    return ((val) << HDMI_CEC_RETRANSMIT_COUNT__SHIFT) & 
HDMI_CEC_RETRANSMIT_COUNT__MASK;

+}
  #define REG_HDMI_CEC_STATUS    0x0298
+#define HDMI_CEC_STATUS_BUSY    0x0001
+#define HDMI_CEC_STATUS_TX_FRAME_DONE    0x0008
+#define HDMI_CEC_STATUS_TX_STATUS__MASK    0x00f0
+#define HDMI_CEC_STATUS_TX_STATUS__SHIFT    4
+static inline uint32_t HDMI_CEC_STATUS_TX_STATUS(enum 
hdmi_cec_tx_status val)

+{
+    return ((val) << HDMI_CEC_STATUS_TX_STATUS__SHIFT) & 
HDMI_CEC_STATUS_TX_STATUS__MASK;

+}
  #define REG_HDMI_CEC_INT    0x029c
+#define HDMI_CEC_INT_TX_DONE    0x0001
+#define HDMI_CEC_INT_TX_DONE_MASK    0x0002
+#define HDMI_CEC_INT_TX_ERROR    0x0004
+#define HDMI_CEC_INT_TX_ERROR_MASK    0x0008
+#define HDMI_CEC_INT_MONITOR    0x0010
+#define HDMI_CEC_INT_MONITOR_MASK    0x0020
+#define HDMI_CEC_INT_RX_DONE    0x0040
+#define HDMI_CEC_INT_RX_DONE_MASK

Re: [PATCH 3/5] drm/msm/mdss: Rename path references to mdp_path

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 18:30, Konrad Dybcio wrote:

The DPU1 driver needs to handle all MDPn<->DDR paths, as well as
CPU<->SLAVE_DISPLAY_CFG. The former ones share how their values are
calculated, but the latter one has static predefines spanning all SoCs.

In preparation for supporting the CPU<->SLAVE_DISPLAY_CFG path, rename
the path-related struct members to include "mdp_".

Signed-off-by: Konrad Dybcio 
---
  drivers/gpu/drm/msm/msm_mdss.c | 20 ++--
  1 file changed, 10 insertions(+), 10 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [Freedreno] [PATCH 1/4] drm/msm: add some cec register bitfield details

2023-04-19 Thread Abhinav Kumar





On 4/19/2023 5:21 PM, Dmitry Baryshkov wrote:

On 20/04/2023 03:17, Abhinav Kumar wrote:



On 4/19/2023 5:11 PM, Dmitry Baryshkov wrote:

On 20/04/2023 03:10, Abhinav Kumar wrote:



On 4/19/2023 4:53 PM, Dmitry Baryshkov wrote:

On 18/04/2023 21:10, Arnaud Vrac wrote:

The register names and bitfields were determined from the downstream
msm-4.4 driver.

Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/hdmi/hdmi.xml.h | 62 
-

  1 file changed, 61 insertions(+), 1 deletion(-)


I have opened an MR against Mesa at 
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22588.


The patch is:

Reviewed-by: Dmitry Baryshkov 

Minor nit below



Also, shouldn't the register updates be done using the rnn tool instead 
of manual edits?


We usually update the rnn and ask Rob to pull it at the beginning of 
the cycle.




Sorry, I didn't get this. So you are saying we will accept manual 
edits and then replace it with the tool generated xml later? I was not 
aware of that, because previously I was always asked by Rob to use the 
tool to generate the xml and push that.


We accept manual edits for the patchset (so that one can test it), but 
before merging the patchset we ask Rob to pull the xml.




Interesting. Does Rob generate the xml at that point, or who does that?

The MR you have created updates the freedreno/registers which is just to 
keep the XML in the driver and mesa in sync.


But I am trying to understand who generates the updated xml to merge it 
with the patchset if it's not the developer who does that anymore.








diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h 
b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h

index 973b460486a5a..b4dd6e8cba6b7 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
@@ -76,6 +76,13 @@ enum hdmi_acr_cts {
  ACR_48 = 3,
  };
+enum hdmi_cec_tx_status {
+    CEC_TX_OK = 0,
+    CEC_TX_NACK = 1,
+    CEC_TX_ARB_LOSS = 2,
+    CEC_TX_MAX_RETRIES = 3,
+};
+
  #define REG_HDMI_CTRL    0x
  #define HDMI_CTRL_ENABLE    0x0001
  #define HDMI_CTRL_HDMI    0x0002
@@ -476,20 +483,73 @@ static inline uint32_t 
HDMI_DDC_REF_REFTIMER(uint32_t val)

  #define REG_HDMI_HDCP_SW_LOWER_AKSV    0x0288
  #define REG_HDMI_CEC_CTRL    0x028c
+#define HDMI_CEC_CTRL_ENABLE    0x0001
+#define HDMI_CEC_CTRL_SEND_TRIGGER    0x0002
+#define HDMI_CEC_CTRL_FRAME_SIZE__MASK    0x01f0
+#define HDMI_CEC_CTRL_FRAME_SIZE__SHIFT    4
+static inline uint32_t HDMI_CEC_CTRL_FRAME_SIZE(uint32_t val)
+{
+    return ((val) << HDMI_CEC_CTRL_FRAME_SIZE__SHIFT) & 
HDMI_CEC_CTRL_FRAME_SIZE__MASK;

+}
+#define HDMI_CEC_CTRL_LINE_OE    0x0200
  #define REG_HDMI_CEC_WR_DATA    0x0290
+#define HDMI_CEC_WR_DATA_BROADCAST    0x0001
+#define HDMI_CEC_WR_DATA_DATA__MASK    0xff00
+#define HDMI_CEC_WR_DATA_DATA__SHIFT    8
+static inline uint32_t HDMI_CEC_WR_DATA_DATA(uint32_t val)
+{
+    return ((val) << HDMI_CEC_WR_DATA_DATA__SHIFT) & 
HDMI_CEC_WR_DATA_DATA__MASK;

+}
-#define REG_HDMI_CEC_CEC_RETRANSMIT    0x0294
+#define REG_HDMI_CEC_RETRANSMIT    0x0294
+#define HDMI_CEC_RETRANSMIT_ENABLE    0x0001
+#define HDMI_CEC_RETRANSMIT_COUNT__MASK    0x00fe
+#define HDMI_CEC_RETRANSMIT_COUNT__SHIFT    1
+static inline uint32_t HDMI_CEC_RETRANSMIT_COUNT(uint32_t val)
+{
+    return ((val) << HDMI_CEC_RETRANSMIT_COUNT__SHIFT) & 
HDMI_CEC_RETRANSMIT_COUNT__MASK;

+}
  #define REG_HDMI_CEC_STATUS    0x0298
+#define HDMI_CEC_STATUS_BUSY    0x0001
+#define HDMI_CEC_STATUS_TX_FRAME_DONE    0x0008
+#define HDMI_CEC_STATUS_TX_STATUS__MASK    0x00f0
+#define HDMI_CEC_STATUS_TX_STATUS__SHIFT    4
+static inline uint32_t HDMI_CEC_STATUS_TX_STATUS(enum 
hdmi_cec_tx_status val)

+{
+    return ((val) << HDMI_CEC_STATUS_TX_STATUS__SHIFT) & 
HDMI_CEC_STATUS_TX_STATUS__MASK;

+}
  #define REG_HDMI_CEC_INT    0x029c
+#define HDMI_CEC_INT_TX_DONE    0x0001
+#define HDMI_CEC_INT_TX_DONE_MASK    0x0002
+#define HDMI_CEC_INT_TX_ERROR    0x0004
+#define HDMI_CEC_INT_TX_ERROR_MASK    0x0008
+#define HDMI_CEC_INT_MONITOR    0x0010
+#define HDMI_CEC_INT_MONITOR_MASK    0x0020
+#define HDMI_CEC_INT_RX_DONE    0x0040
+#define HDMI_CEC_INT_RX_DONE_MASK    0x0080
  #define REG_HDMI_CEC_ADDR    0x02a0
  #define REG_HDMI_CEC_TIME    0x02a4
+#define HDMI_CEC_TIME_ENABLE    0x0001
+#define HDMI_CEC_TIME_SIGNAL_FREE_TIME__MASK    0xff80
+#define

Re: [PATCH 2/5] drm/msm/dpu1: Rename path references to mdp_path

2023-04-19 Thread Dmitry Baryshkov


On 17/04/2023 18:30, Konrad Dybcio wrote:

The DPU1 driver needs to handle all MDPn<->DDR paths, as well as
CPU<->SLAVE_DISPLAY_CFG. The former ones share how their values are
calculated, but the latter one has static predefines spanning all SoCs.

In preparation for supporting the CPU<->SLAVE_DISPLAY_CFG path, rename
the path-related struct members to include "mdp_".

Signed-off-by: Konrad Dybcio 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_core_perf.c | 10 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   | 12 ++--
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h   |  4 ++--
  3 files changed, 13 insertions(+), 13 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [Freedreno] [PATCH 1/5] dt-bindings: display/msm: Add reg bus interconnect

2023-04-19 Thread Dmitry Baryshkov


On 19/04/2023 23:05, Jeykumar Sankaran wrote:
Resending the question as the previous one was sent only to the 
freedreno list. Apologies for spamming!


On 4/17/2023 8:30 AM, Konrad Dybcio wrote:

Apart from the already handled data bus (MAS_MDP_Pn<->DDR), there's
another path that needs to be handled to ensure MDSS functions properly,
namely the "reg bus", a.k.a the CPU-MDSS interconnect.

Gating that path may have a variety of effects.. from none to otherwise
inexplicable DSI timeouts..

Describe it in bindings to allow for use in device trees.

Signed-off-by: Konrad Dybcio 
---
  Documentation/devicetree/bindings/display/msm/mdss-common.yaml | 1 +
  1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/display/msm/mdss-common.yaml b/Documentation/devicetree/bindings/display/msm/mdss-common.yaml
index ccd7d6417523..9eb5b6d3e0b9 100644
--- a/Documentation/devicetree/bindings/display/msm/mdss-common.yaml
+++ b/Documentation/devicetree/bindings/display/msm/mdss-common.yaml
@@ -72,6 +72,7 @@ properties:
  items:
    - const: mdp0-mem
    - const: mdp1-mem
+  - const: cpu-cfg

If posted already, please point to the DTSI patch for this ICC path.


It's probably worth updating the example in one of the mdss schemas.


    resets:
  items:



--
With best wishes
Dmitry

Re: [PATCH 1/4] drm/msm: add some cec register bitfield details

2023-04-19 Thread Dmitry Baryshkov


On 20/04/2023 03:17, Abhinav Kumar wrote:



On 4/19/2023 5:11 PM, Dmitry Baryshkov wrote:

On 20/04/2023 03:10, Abhinav Kumar wrote:



On 4/19/2023 4:53 PM, Dmitry Baryshkov wrote:

On 18/04/2023 21:10, Arnaud Vrac wrote:

The register names and bitfields were determined from the downstream
msm-4.4 driver.

Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/hdmi/hdmi.xml.h | 62 
-

  1 file changed, 61 insertions(+), 1 deletion(-)


I have opened an MR against Mesa at 
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22588.


The patch is:

Reviewed-by: Dmitry Baryshkov 

Minor nit below



Also, shouldn't the register updates be done using the rnn tool instead of 
manual edits?


We usually update the rnn and ask Rob to pull it at the beginning of 
the cycle.




Sorry, I didn't get this. So you are saying we will accept manual edits 
and then replace it with the tool generated xml later? I was not aware 
of that, because previously I was always asked by Rob to use the tool to 
generate the xml and push that.


We accept manual edits for the patchset (so that one can test it), but 
before merging the patchset we ask Rob to pull the xml.








diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h 
b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h

index 973b460486a5a..b4dd6e8cba6b7 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
@@ -76,6 +76,13 @@ enum hdmi_acr_cts {
  ACR_48 = 3,
  };
+enum hdmi_cec_tx_status {
+    CEC_TX_OK = 0,
+    CEC_TX_NACK = 1,
+    CEC_TX_ARB_LOSS = 2,
+    CEC_TX_MAX_RETRIES = 3,
+};
+
  #define REG_HDMI_CTRL    0x
  #define HDMI_CTRL_ENABLE    0x0001
  #define HDMI_CTRL_HDMI    0x0002
@@ -476,20 +483,73 @@ static inline uint32_t 
HDMI_DDC_REF_REFTIMER(uint32_t val)

  #define REG_HDMI_HDCP_SW_LOWER_AKSV    0x0288
  #define REG_HDMI_CEC_CTRL    0x028c
+#define HDMI_CEC_CTRL_ENABLE    0x0001
+#define HDMI_CEC_CTRL_SEND_TRIGGER    0x0002
+#define HDMI_CEC_CTRL_FRAME_SIZE__MASK    0x01f0
+#define HDMI_CEC_CTRL_FRAME_SIZE__SHIFT    4
+static inline uint32_t HDMI_CEC_CTRL_FRAME_SIZE(uint32_t val)
+{
+    return ((val) << HDMI_CEC_CTRL_FRAME_SIZE__SHIFT) & 
HDMI_CEC_CTRL_FRAME_SIZE__MASK;

+}
+#define HDMI_CEC_CTRL_LINE_OE    0x0200
  #define REG_HDMI_CEC_WR_DATA    0x0290
+#define HDMI_CEC_WR_DATA_BROADCAST    0x0001
+#define HDMI_CEC_WR_DATA_DATA__MASK    0xff00
+#define HDMI_CEC_WR_DATA_DATA__SHIFT    8
+static inline uint32_t HDMI_CEC_WR_DATA_DATA(uint32_t val)
+{
+    return ((val) << HDMI_CEC_WR_DATA_DATA__SHIFT) & 
HDMI_CEC_WR_DATA_DATA__MASK;

+}
-#define REG_HDMI_CEC_CEC_RETRANSMIT    0x0294
+#define REG_HDMI_CEC_RETRANSMIT    0x0294
+#define HDMI_CEC_RETRANSMIT_ENABLE    0x0001
+#define HDMI_CEC_RETRANSMIT_COUNT__MASK    0x00fe
+#define HDMI_CEC_RETRANSMIT_COUNT__SHIFT    1
+static inline uint32_t HDMI_CEC_RETRANSMIT_COUNT(uint32_t val)
+{
+    return ((val) << HDMI_CEC_RETRANSMIT_COUNT__SHIFT) & 
HDMI_CEC_RETRANSMIT_COUNT__MASK;

+}
  #define REG_HDMI_CEC_STATUS    0x0298
+#define HDMI_CEC_STATUS_BUSY    0x0001
+#define HDMI_CEC_STATUS_TX_FRAME_DONE    0x0008
+#define HDMI_CEC_STATUS_TX_STATUS__MASK    0x00f0
+#define HDMI_CEC_STATUS_TX_STATUS__SHIFT    4
+static inline uint32_t HDMI_CEC_STATUS_TX_STATUS(enum 
hdmi_cec_tx_status val)

+{
+    return ((val) << HDMI_CEC_STATUS_TX_STATUS__SHIFT) & 
HDMI_CEC_STATUS_TX_STATUS__MASK;

+}
  #define REG_HDMI_CEC_INT    0x029c
+#define HDMI_CEC_INT_TX_DONE    0x0001
+#define HDMI_CEC_INT_TX_DONE_MASK    0x0002
+#define HDMI_CEC_INT_TX_ERROR    0x0004
+#define HDMI_CEC_INT_TX_ERROR_MASK    0x0008
+#define HDMI_CEC_INT_MONITOR    0x0010
+#define HDMI_CEC_INT_MONITOR_MASK    0x0020
+#define HDMI_CEC_INT_RX_DONE    0x0040
+#define HDMI_CEC_INT_RX_DONE_MASK    0x0080
  #define REG_HDMI_CEC_ADDR    0x02a0
  #define REG_HDMI_CEC_TIME    0x02a4
+#define HDMI_CEC_TIME_ENABLE    0x0001
+#define HDMI_CEC_TIME_SIGNAL_FREE_TIME__MASK    0xff80
+#define HDMI_CEC_TIME_SIGNAL_FREE_TIME__SHIFT    7
+static inline uint32_t HDMI_CEC_TIME_SIGNAL_FREE_TIME(uint32_t val)
+{
+    return ((val) << HDMI_CEC_TIME_SIGNAL_FREE_TIME__SHIFT) & 
HDMI_CEC_TIME_SIGNAL_FREE_TIME__MASK;

+}
  #define REG_HDMI_CEC_REFTIMER    0x02a8
+#define HDMI_CEC_REFTIMER_ENABLE    0x0001


I think this should come after the

Re: [PATCH 2/4] drm/msm: add hdmi cec support

2023-04-19 Thread Dmitry Baryshkov


On 18/04/2023 21:10, Arnaud Vrac wrote:

Some Qualcomm SoCs that support HDMI also support CEC, including MSM8996
and MSM8998. The hardware block can handle a single CEC logical address
and broadcast messages.

Port the CEC driver from downstream msm-4.4 kernel. It has been tested
on MSM8998 and passes the cec-compliance tool tests.

Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/Kconfig |   8 ++
  drivers/gpu/drm/msm/Makefile|   1 +
  drivers/gpu/drm/msm/hdmi/hdmi.c |  15 ++
  drivers/gpu/drm/msm/hdmi/hdmi.h |  18 +++
  drivers/gpu/drm/msm/hdmi/hdmi_cec.c | 280 
  5 files changed, 322 insertions(+)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 85f5ab1d552c4..2a02c74207935 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -165,3 +165,11 @@ config DRM_MSM_HDMI_HDCP
default y
help
  Choose this option to enable HDCP state machine
+
+config DRM_MSM_HDMI_CEC
+   bool "Enable HDMI CEC support in MSM DRM driver"
+   depends on DRM_MSM && DRM_MSM_HDMI
+   select CEC_CORE
+   default y
+   help
+ Choose this option to enable CEC support
diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 7274c41228ed9..0237a2f219ac2 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -131,6 +131,7 @@ msm-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
  
  msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o
  
+msm-$(CONFIG_DRM_MSM_HDMI_CEC) += hdmi/hdmi_cec.o

  msm-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o
  
  msm-$(CONFIG_DRM_MSM_DSI) += dsi/dsi.o \

diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c
index 3132105a2a433..1dde3890e25c0 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.c
@@ -11,6 +11,8 @@
  #include 
  #include 
  
+#include 

+
  #include 
  #include "hdmi.h"
  
@@ -53,6 +55,9 @@ static irqreturn_t msm_hdmi_irq(int irq, void *dev_id)

if (hdmi->hdcp_ctrl)
msm_hdmi_hdcp_irq(hdmi->hdcp_ctrl);
  
+	/* Process CEC: */

+   msm_hdmi_cec_irq(hdmi);
+
/* TODO audio.. */
  
  	return IRQ_HANDLED;

@@ -66,6 +71,8 @@ static void msm_hdmi_destroy(struct hdmi *hdmi)
 */
if (hdmi->workq)
destroy_workqueue(hdmi->workq);
+
+   msm_hdmi_cec_exit(hdmi);
msm_hdmi_hdcp_destroy(hdmi);
  
  	if (hdmi->i2c)

@@ -139,6 +146,8 @@ static int msm_hdmi_init(struct hdmi *hdmi)
hdmi->hdcp_ctrl = NULL;
}
  
+	msm_hdmi_cec_init(hdmi);

+
return 0;
  
  fail:

@@ -198,6 +207,12 @@ int msm_hdmi_modeset_init(struct hdmi *hdmi,
  
  	drm_connector_attach_encoder(hdmi->connector, hdmi->encoder);
  
+	if (hdmi->cec_adap) {

+   struct cec_connector_info conn_info;
+   cec_fill_conn_info_from_drm(&conn_info, hdmi->connector);
+   cec_s_conn_info(hdmi->cec_adap, &conn_info);
+   }
+
ret = devm_request_irq(dev->dev, hdmi->irq,
msm_hdmi_irq, IRQF_TRIGGER_HIGH,
"hdmi_isr", hdmi);
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.h b/drivers/gpu/drm/msm/hdmi/hdmi.h
index e8dbee50637fa..c639bd87f4b8f 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.h
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.h
@@ -29,6 +29,7 @@ struct hdmi_audio {
  };
  
  struct hdmi_hdcp_ctrl;

+struct cec_adapter;
  
  struct hdmi {

struct drm_device *dev;
@@ -73,6 +74,7 @@ struct hdmi {
struct workqueue_struct *workq;
  
  	struct hdmi_hdcp_ctrl *hdcp_ctrl;

+   struct cec_adapter *cec_adap;
  
  	/*

* spinlock to protect registers shared by different execution
@@ -261,4 +263,20 @@ static inline void msm_hdmi_hdcp_off(struct hdmi_hdcp_ctrl *hdcp_ctrl) {}
  static inline void msm_hdmi_hdcp_irq(struct hdmi_hdcp_ctrl *hdcp_ctrl) {}
  #endif
  
+/*

+ * cec
+ */
+#ifdef CONFIG_DRM_MSM_HDMI_CEC
+int msm_hdmi_cec_init(struct hdmi *hdmi);
+void msm_hdmi_cec_exit(struct hdmi *hdmi);
+void msm_hdmi_cec_irq(struct hdmi *hdmi);
+#else
+static inline int msm_hdmi_cec_init(struct hdmi *hdmi)
+{
+   return -ENXIO;
+}
+static inline void msm_hdmi_cec_exit(struct hdmi *hdmi) {}
+static inline void msm_hdmi_cec_irq(struct hdmi *hdmi) {}
+#endif
+
  #endif /* __HDMI_CONNECTOR_H__ */
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_cec.c b/drivers/gpu/drm/msm/hdmi/hdmi_cec.c
new file mode 100644
index 0..51326e493e5da
--- /dev/null
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_cec.c
@@ -0,0 +1,280 @@
+#include 
+#include 
+
+#include "hdmi.h"
+
+#define HDMI_CEC_INT_MASK ( \
+   HDMI_CEC_INT_TX_DONE_MASK | \
+   HDMI_CEC_INT_TX_ERROR_MASK | \
+   HDMI_CEC_INT_RX_DONE_MASK)
+
+struct hdmi_cec_ctrl {
+   struct hdmi *hdmi;
+   struct work_struct work;
+   spinlock_t lock;
+   u32 irq_status;
+   u32 tx_status;
+   u32 tx_retransmits;
+};
+
+static int

Re: [PATCH 1/4] drm/msm: add some cec register bitfield details

2023-04-19 Thread Abhinav Kumar





On 4/19/2023 5:11 PM, Dmitry Baryshkov wrote:

On 20/04/2023 03:10, Abhinav Kumar wrote:



On 4/19/2023 4:53 PM, Dmitry Baryshkov wrote:

On 18/04/2023 21:10, Arnaud Vrac wrote:

The register names and bitfields were determined from the downstream
msm-4.4 driver.

Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/hdmi/hdmi.xml.h | 62 
-

  1 file changed, 61 insertions(+), 1 deletion(-)


I have opened an MR against Mesa at 
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22588.


The patch is:

Reviewed-by: Dmitry Baryshkov 

Minor nit below



Also, shouldn't the register updates be done using the rnn tool instead of 
manual edits?


We usually update the rnn and ask Rob to pull it at the beginning of the 
cycle.




Sorry, I didn't get this. So you are saying we will accept manual edits 
and then replace it with the tool generated xml later? I was not aware 
of that, because previously I was always asked by Rob to use the tool to 
generate the xml and push that.






diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h 
b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h

index 973b460486a5a..b4dd6e8cba6b7 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
@@ -76,6 +76,13 @@ enum hdmi_acr_cts {
  ACR_48 = 3,
  };
+enum hdmi_cec_tx_status {
+    CEC_TX_OK = 0,
+    CEC_TX_NACK = 1,
+    CEC_TX_ARB_LOSS = 2,
+    CEC_TX_MAX_RETRIES = 3,
+};
+
  #define REG_HDMI_CTRL    0x
  #define HDMI_CTRL_ENABLE    0x0001
  #define HDMI_CTRL_HDMI    0x0002
@@ -476,20 +483,73 @@ static inline uint32_t 
HDMI_DDC_REF_REFTIMER(uint32_t val)

  #define REG_HDMI_HDCP_SW_LOWER_AKSV    0x0288
  #define REG_HDMI_CEC_CTRL    0x028c
+#define HDMI_CEC_CTRL_ENABLE    0x0001
+#define HDMI_CEC_CTRL_SEND_TRIGGER    0x0002
+#define HDMI_CEC_CTRL_FRAME_SIZE__MASK    0x01f0
+#define HDMI_CEC_CTRL_FRAME_SIZE__SHIFT    4
+static inline uint32_t HDMI_CEC_CTRL_FRAME_SIZE(uint32_t val)
+{
+    return ((val) << HDMI_CEC_CTRL_FRAME_SIZE__SHIFT) & 
HDMI_CEC_CTRL_FRAME_SIZE__MASK;

+}
+#define HDMI_CEC_CTRL_LINE_OE    0x0200
  #define REG_HDMI_CEC_WR_DATA    0x0290
+#define HDMI_CEC_WR_DATA_BROADCAST    0x0001
+#define HDMI_CEC_WR_DATA_DATA__MASK    0xff00
+#define HDMI_CEC_WR_DATA_DATA__SHIFT    8
+static inline uint32_t HDMI_CEC_WR_DATA_DATA(uint32_t val)
+{
+    return ((val) << HDMI_CEC_WR_DATA_DATA__SHIFT) & 
HDMI_CEC_WR_DATA_DATA__MASK;

+}
-#define REG_HDMI_CEC_CEC_RETRANSMIT    0x0294
+#define REG_HDMI_CEC_RETRANSMIT    0x0294
+#define HDMI_CEC_RETRANSMIT_ENABLE    0x0001
+#define HDMI_CEC_RETRANSMIT_COUNT__MASK    0x00fe
+#define HDMI_CEC_RETRANSMIT_COUNT__SHIFT    1
+static inline uint32_t HDMI_CEC_RETRANSMIT_COUNT(uint32_t val)
+{
+    return ((val) << HDMI_CEC_RETRANSMIT_COUNT__SHIFT) & 
HDMI_CEC_RETRANSMIT_COUNT__MASK;

+}
  #define REG_HDMI_CEC_STATUS    0x0298
+#define HDMI_CEC_STATUS_BUSY    0x0001
+#define HDMI_CEC_STATUS_TX_FRAME_DONE    0x0008
+#define HDMI_CEC_STATUS_TX_STATUS__MASK    0x00f0
+#define HDMI_CEC_STATUS_TX_STATUS__SHIFT    4
+static inline uint32_t HDMI_CEC_STATUS_TX_STATUS(enum 
hdmi_cec_tx_status val)

+{
+    return ((val) << HDMI_CEC_STATUS_TX_STATUS__SHIFT) & 
HDMI_CEC_STATUS_TX_STATUS__MASK;

+}
  #define REG_HDMI_CEC_INT    0x029c
+#define HDMI_CEC_INT_TX_DONE    0x0001
+#define HDMI_CEC_INT_TX_DONE_MASK    0x0002
+#define HDMI_CEC_INT_TX_ERROR    0x0004
+#define HDMI_CEC_INT_TX_ERROR_MASK    0x0008
+#define HDMI_CEC_INT_MONITOR    0x0010
+#define HDMI_CEC_INT_MONITOR_MASK    0x0020
+#define HDMI_CEC_INT_RX_DONE    0x0040
+#define HDMI_CEC_INT_RX_DONE_MASK    0x0080
  #define REG_HDMI_CEC_ADDR    0x02a0
  #define REG_HDMI_CEC_TIME    0x02a4
+#define HDMI_CEC_TIME_ENABLE    0x0001
+#define HDMI_CEC_TIME_SIGNAL_FREE_TIME__MASK    0xff80
+#define HDMI_CEC_TIME_SIGNAL_FREE_TIME__SHIFT    7
+static inline uint32_t HDMI_CEC_TIME_SIGNAL_FREE_TIME(uint32_t val)
+{
+    return ((val) << HDMI_CEC_TIME_SIGNAL_FREE_TIME__SHIFT) & 
HDMI_CEC_TIME_SIGNAL_FREE_TIME__MASK;

+}
  #define REG_HDMI_CEC_REFTIMER    0x02a8
+#define HDMI_CEC_REFTIMER_ENABLE    0x0001


I think this should come after the REFTIMER field.


+#define HDMI_CEC_REFTIMER_REFTIMER__MASK    0x
+#define HDMI_CEC_REFTIMER_REFTIMER__SHIFT    0
+static inline uint32_t

Re: [PATCH 1/4] drm/msm: add some cec register bitfield details

2023-04-19 Thread Dmitry Baryshkov


On 20/04/2023 03:10, Abhinav Kumar wrote:



On 4/19/2023 4:53 PM, Dmitry Baryshkov wrote:

On 18/04/2023 21:10, Arnaud Vrac wrote:

The register names and bitfields were determined from the downstream
msm-4.4 driver.

Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/hdmi/hdmi.xml.h | 62 
-

  1 file changed, 61 insertions(+), 1 deletion(-)


I have opened an MR against Mesa at 
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22588.


The patch is:

Reviewed-by: Dmitry Baryshkov 

Minor nit below



Also, shouldn't the register updates be done using the rnn tool instead of 
manual edits?


We usually update the rnn and ask Rob to pull it at the beginning of the 
cycle.






diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h 
b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h

index 973b460486a5a..b4dd6e8cba6b7 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
@@ -76,6 +76,13 @@ enum hdmi_acr_cts {
  ACR_48 = 3,
  };
+enum hdmi_cec_tx_status {
+    CEC_TX_OK = 0,
+    CEC_TX_NACK = 1,
+    CEC_TX_ARB_LOSS = 2,
+    CEC_TX_MAX_RETRIES = 3,
+};
+
  #define REG_HDMI_CTRL    0x
  #define HDMI_CTRL_ENABLE    0x0001
  #define HDMI_CTRL_HDMI    0x0002
@@ -476,20 +483,73 @@ static inline uint32_t 
HDMI_DDC_REF_REFTIMER(uint32_t val)

  #define REG_HDMI_HDCP_SW_LOWER_AKSV    0x0288
  #define REG_HDMI_CEC_CTRL    0x028c
+#define HDMI_CEC_CTRL_ENABLE    0x0001
+#define HDMI_CEC_CTRL_SEND_TRIGGER    0x0002
+#define HDMI_CEC_CTRL_FRAME_SIZE__MASK    0x01f0
+#define HDMI_CEC_CTRL_FRAME_SIZE__SHIFT    4
+static inline uint32_t HDMI_CEC_CTRL_FRAME_SIZE(uint32_t val)
+{
+    return ((val) << HDMI_CEC_CTRL_FRAME_SIZE__SHIFT) & 
HDMI_CEC_CTRL_FRAME_SIZE__MASK;

+}
+#define HDMI_CEC_CTRL_LINE_OE    0x0200
  #define REG_HDMI_CEC_WR_DATA    0x0290
+#define HDMI_CEC_WR_DATA_BROADCAST    0x0001
+#define HDMI_CEC_WR_DATA_DATA__MASK    0xff00
+#define HDMI_CEC_WR_DATA_DATA__SHIFT    8
+static inline uint32_t HDMI_CEC_WR_DATA_DATA(uint32_t val)
+{
+    return ((val) << HDMI_CEC_WR_DATA_DATA__SHIFT) & 
HDMI_CEC_WR_DATA_DATA__MASK;

+}
-#define REG_HDMI_CEC_CEC_RETRANSMIT    0x0294
+#define REG_HDMI_CEC_RETRANSMIT    0x0294
+#define HDMI_CEC_RETRANSMIT_ENABLE    0x0001
+#define HDMI_CEC_RETRANSMIT_COUNT__MASK    0x00fe
+#define HDMI_CEC_RETRANSMIT_COUNT__SHIFT    1
+static inline uint32_t HDMI_CEC_RETRANSMIT_COUNT(uint32_t val)
+{
+    return ((val) << HDMI_CEC_RETRANSMIT_COUNT__SHIFT) & 
HDMI_CEC_RETRANSMIT_COUNT__MASK;

+}
  #define REG_HDMI_CEC_STATUS    0x0298
+#define HDMI_CEC_STATUS_BUSY    0x0001
+#define HDMI_CEC_STATUS_TX_FRAME_DONE    0x0008
+#define HDMI_CEC_STATUS_TX_STATUS__MASK    0x00f0
+#define HDMI_CEC_STATUS_TX_STATUS__SHIFT    4
+static inline uint32_t HDMI_CEC_STATUS_TX_STATUS(enum 
hdmi_cec_tx_status val)

+{
+    return ((val) << HDMI_CEC_STATUS_TX_STATUS__SHIFT) & 
HDMI_CEC_STATUS_TX_STATUS__MASK;

+}
  #define REG_HDMI_CEC_INT    0x029c
+#define HDMI_CEC_INT_TX_DONE    0x0001
+#define HDMI_CEC_INT_TX_DONE_MASK    0x0002
+#define HDMI_CEC_INT_TX_ERROR    0x0004
+#define HDMI_CEC_INT_TX_ERROR_MASK    0x0008
+#define HDMI_CEC_INT_MONITOR    0x0010
+#define HDMI_CEC_INT_MONITOR_MASK    0x0020
+#define HDMI_CEC_INT_RX_DONE    0x0040
+#define HDMI_CEC_INT_RX_DONE_MASK    0x0080
  #define REG_HDMI_CEC_ADDR    0x02a0
  #define REG_HDMI_CEC_TIME    0x02a4
+#define HDMI_CEC_TIME_ENABLE    0x0001
+#define HDMI_CEC_TIME_SIGNAL_FREE_TIME__MASK    0xff80
+#define HDMI_CEC_TIME_SIGNAL_FREE_TIME__SHIFT    7
+static inline uint32_t HDMI_CEC_TIME_SIGNAL_FREE_TIME(uint32_t val)
+{
+    return ((val) << HDMI_CEC_TIME_SIGNAL_FREE_TIME__SHIFT) & 
HDMI_CEC_TIME_SIGNAL_FREE_TIME__MASK;

+}
  #define REG_HDMI_CEC_REFTIMER    0x02a8
+#define HDMI_CEC_REFTIMER_ENABLE    0x0001


I think this should come after the REFTIMER field.


+#define HDMI_CEC_REFTIMER_REFTIMER__MASK    0x
+#define HDMI_CEC_REFTIMER_REFTIMER__SHIFT    0
+static inline uint32_t HDMI_CEC_REFTIMER_REFTIMER(uint32_t val)
+{
+    return ((val) << HDMI_CEC_REFTIMER_REFTIMER__SHIFT) & 
HDMI_CEC_REFTIMER_REFTIMER__MASK;

+}
  #define REG_HDMI_CEC_RD_DATA    0x02ac





--
With best wishes
Dmitry

Re: [PATCH 1/4] drm/msm: add some cec register bitfield details

2023-04-19 Thread Abhinav Kumar





On 4/19/2023 4:53 PM, Dmitry Baryshkov wrote:

On 18/04/2023 21:10, Arnaud Vrac wrote:

The register names and bitfields were determined from the downstream
msm-4.4 driver.

Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/hdmi/hdmi.xml.h | 62 
-

  1 file changed, 61 insertions(+), 1 deletion(-)


I have opened an MR against Mesa at 
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22588.


The patch is:

Reviewed-by: Dmitry Baryshkov 

Minor nit below



Also, shouldn't the register updates be done using the rnn tool instead of 
manual edits?




diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h 
b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h

index 973b460486a5a..b4dd6e8cba6b7 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
@@ -76,6 +76,13 @@ enum hdmi_acr_cts {
  ACR_48 = 3,
  };
+enum hdmi_cec_tx_status {
+    CEC_TX_OK = 0,
+    CEC_TX_NACK = 1,
+    CEC_TX_ARB_LOSS = 2,
+    CEC_TX_MAX_RETRIES = 3,
+};
+
  #define REG_HDMI_CTRL    0x
  #define HDMI_CTRL_ENABLE    0x0001
  #define HDMI_CTRL_HDMI    0x0002
@@ -476,20 +483,73 @@ static inline uint32_t 
HDMI_DDC_REF_REFTIMER(uint32_t val)

  #define REG_HDMI_HDCP_SW_LOWER_AKSV    0x0288
  #define REG_HDMI_CEC_CTRL    0x028c
+#define HDMI_CEC_CTRL_ENABLE    0x0001
+#define HDMI_CEC_CTRL_SEND_TRIGGER    0x0002
+#define HDMI_CEC_CTRL_FRAME_SIZE__MASK    0x01f0
+#define HDMI_CEC_CTRL_FRAME_SIZE__SHIFT    4
+static inline uint32_t HDMI_CEC_CTRL_FRAME_SIZE(uint32_t val)
+{
+    return ((val) << HDMI_CEC_CTRL_FRAME_SIZE__SHIFT) & 
HDMI_CEC_CTRL_FRAME_SIZE__MASK;

+}
+#define HDMI_CEC_CTRL_LINE_OE    0x0200
  #define REG_HDMI_CEC_WR_DATA    0x0290
+#define HDMI_CEC_WR_DATA_BROADCAST    0x0001
+#define HDMI_CEC_WR_DATA_DATA__MASK    0xff00
+#define HDMI_CEC_WR_DATA_DATA__SHIFT    8
+static inline uint32_t HDMI_CEC_WR_DATA_DATA(uint32_t val)
+{
+    return ((val) << HDMI_CEC_WR_DATA_DATA__SHIFT) & 
HDMI_CEC_WR_DATA_DATA__MASK;

+}
-#define REG_HDMI_CEC_CEC_RETRANSMIT    0x0294
+#define REG_HDMI_CEC_RETRANSMIT    0x0294
+#define HDMI_CEC_RETRANSMIT_ENABLE    0x0001
+#define HDMI_CEC_RETRANSMIT_COUNT__MASK    0x00fe
+#define HDMI_CEC_RETRANSMIT_COUNT__SHIFT    1
+static inline uint32_t HDMI_CEC_RETRANSMIT_COUNT(uint32_t val)
+{
+    return ((val) << HDMI_CEC_RETRANSMIT_COUNT__SHIFT) & 
HDMI_CEC_RETRANSMIT_COUNT__MASK;

+}
  #define REG_HDMI_CEC_STATUS    0x0298
+#define HDMI_CEC_STATUS_BUSY    0x0001
+#define HDMI_CEC_STATUS_TX_FRAME_DONE    0x0008
+#define HDMI_CEC_STATUS_TX_STATUS__MASK    0x00f0
+#define HDMI_CEC_STATUS_TX_STATUS__SHIFT    4
+static inline uint32_t HDMI_CEC_STATUS_TX_STATUS(enum 
hdmi_cec_tx_status val)

+{
+    return ((val) << HDMI_CEC_STATUS_TX_STATUS__SHIFT) & 
HDMI_CEC_STATUS_TX_STATUS__MASK;

+}
  #define REG_HDMI_CEC_INT    0x029c
+#define HDMI_CEC_INT_TX_DONE    0x0001
+#define HDMI_CEC_INT_TX_DONE_MASK    0x0002
+#define HDMI_CEC_INT_TX_ERROR    0x0004
+#define HDMI_CEC_INT_TX_ERROR_MASK    0x0008
+#define HDMI_CEC_INT_MONITOR    0x0010
+#define HDMI_CEC_INT_MONITOR_MASK    0x0020
+#define HDMI_CEC_INT_RX_DONE    0x0040
+#define HDMI_CEC_INT_RX_DONE_MASK    0x0080
  #define REG_HDMI_CEC_ADDR    0x02a0
  #define REG_HDMI_CEC_TIME    0x02a4
+#define HDMI_CEC_TIME_ENABLE    0x0001
+#define HDMI_CEC_TIME_SIGNAL_FREE_TIME__MASK    0xff80
+#define HDMI_CEC_TIME_SIGNAL_FREE_TIME__SHIFT    7
+static inline uint32_t HDMI_CEC_TIME_SIGNAL_FREE_TIME(uint32_t val)
+{
+    return ((val) << HDMI_CEC_TIME_SIGNAL_FREE_TIME__SHIFT) & 
HDMI_CEC_TIME_SIGNAL_FREE_TIME__MASK;

+}
  #define REG_HDMI_CEC_REFTIMER    0x02a8
+#define HDMI_CEC_REFTIMER_ENABLE    0x0001


I think this should come after the REFTIMER field.


+#define HDMI_CEC_REFTIMER_REFTIMER__MASK    0x
+#define HDMI_CEC_REFTIMER_REFTIMER__SHIFT    0
+static inline uint32_t HDMI_CEC_REFTIMER_REFTIMER(uint32_t val)
+{
+    return ((val) << HDMI_CEC_REFTIMER_REFTIMER__SHIFT) & 
HDMI_CEC_REFTIMER_REFTIMER__MASK;

+}
  #define REG_HDMI_CEC_RD_DATA    0x02ac

Re: [PATCH 3/4] drm/msm: expose edid to hdmi cec adapter

2023-04-19 Thread Dmitry Baryshkov


On 18/04/2023 21:10, Arnaud Vrac wrote:

When edid has been read after hpd, pass it to the cec adapter so that it
can extract the physical address of the device on the cec bus.
Invalidate the physical address when hpd is low.


If there is another bridge in the chain (e.g. display-connector) that 
handles HPD, then msm_hdmi_bridge_detect() might get skipped. Please 
also add an hpd_notify() callback that invalidates the physical 
address. See adv7511, which does that both in its own HPD path and in 
hpd_notify().
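
A minimal user-space sketch of the hpd_notify() idea, assuming the callback only needs to drop the CEC physical address on disconnect. Every model_* name below is a hypothetical stand-in for the kernel types and for cec_phys_addr_invalidate(); this is not the msm driver code.

```c
#include <stdint.h>

/* Stand-in for CEC_PHYS_ADDR_INVALID (0xffff in the kernel's CEC UAPI). */
#define MODEL_PHYS_ADDR_INVALID 0xffff

/* Hypothetical mirror of the connector states the callback cares about. */
enum model_connector_status {
	MODEL_STATUS_CONNECTED = 1,
	MODEL_STATUS_DISCONNECTED = 2,
};

/* Hypothetical stand-ins for the CEC adapter and the hdmi device. */
struct model_cec_adapter {
	uint16_t phys_addr;
};

struct model_hdmi {
	struct model_cec_adapter *cec_adap;
};

/* Models cec_phys_addr_invalidate(): tolerate a missing adapter. */
static void model_cec_phys_addr_invalidate(struct model_cec_adapter *adap)
{
	if (adap)
		adap->phys_addr = MODEL_PHYS_ADDR_INVALID;
}

/*
 * Models an hpd_notify() callback: it runs even when another bridge in
 * the chain owns HPD detection, so the address is dropped on unplug
 * regardless of whether the detect path of this bridge was called.
 */
void model_hdmi_hpd_notify(struct model_hdmi *hdmi,
			   enum model_connector_status status)
{
	if (status == MODEL_STATUS_DISCONNECTED)
		model_cec_phys_addr_invalidate(hdmi->cec_adap);
}
```

The sketch compares against the disconnected state explicitly instead of testing for a zero status.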




Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/hdmi/hdmi_bridge.c |  2 ++
  drivers/gpu/drm/msm/hdmi/hdmi_hpd.c| 17 +
  2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c 
b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
index 9b1391d27ed39..efc3bd4908e83 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
@@ -7,6 +7,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include "msm_kms.h"

  #include "hdmi.h"
@@ -256,6 +257,7 @@ static struct edid *msm_hdmi_bridge_get_edid(struct 
drm_bridge *bridge,
hdmi_write(hdmi, REG_HDMI_CTRL, hdmi_ctrl | HDMI_CTRL_ENABLE);
  
  	edid = drm_get_edid(connector, hdmi->i2c);

+   cec_s_phys_addr_from_edid(hdmi->cec_adap, edid);
  
  	hdmi_write(hdmi, REG_HDMI_CTRL, hdmi_ctrl);
  
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c

index bfa827b479897..cb3eb2625ff63 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_hpd.c
@@ -7,6 +7,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include "msm_kms.h"

  #include "hdmi.h"
@@ -230,15 +231,17 @@ enum drm_connector_status msm_hdmi_bridge_detect(
  {
struct hdmi_bridge *hdmi_bridge = to_hdmi_bridge(bridge);
struct hdmi *hdmi = hdmi_bridge->hdmi;
-   enum drm_connector_status stat_gpio, stat_reg;
+   enum drm_connector_status status, stat_gpio, stat_reg;
int retry = 20;
  
  	/*

 * some platforms may not have hpd gpio. Rely only on the status
 * provided by REG_HDMI_HPD_INT_STATUS in this case.
 */
-   if (!hdmi->hpd_gpiod)
-   return detect_reg(hdmi);
+   if (!hdmi->hpd_gpiod) {
+   status = detect_reg(hdmi);
+   goto out;
+   }
  
  	do {

stat_gpio = detect_gpio(hdmi);
@@ -259,5 +262,11 @@ enum drm_connector_status msm_hdmi_bridge_detect(
DBG("hpd gpio tells us: %d", stat_gpio);
}
  
-	return stat_gpio;

+   status = stat_gpio;
+
+out:
+   if (!status)
+   cec_phys_addr_invalidate(hdmi->cec_adap);
+
+   return status;
  }



--
With best wishes
Dmitry

Re: [PATCH 1/4] drm/msm: add some cec register bitfield details

2023-04-19 Thread Dmitry Baryshkov


On 18/04/2023 21:10, Arnaud Vrac wrote:

The register names and bitfields were determined from the downstream
msm-4.4 driver.

Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/hdmi/hdmi.xml.h | 62 -
  1 file changed, 61 insertions(+), 1 deletion(-)


I have opened MR against Mesa at 
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22588.


The patch is:

Reviewed-by: Dmitry Baryshkov 

Minor nit below



diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h 
b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
index 973b460486a5a..b4dd6e8cba6b7 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.xml.h
@@ -76,6 +76,13 @@ enum hdmi_acr_cts {
ACR_48 = 3,
  };
  
+enum hdmi_cec_tx_status {

+   CEC_TX_OK = 0,
+   CEC_TX_NACK = 1,
+   CEC_TX_ARB_LOSS = 2,
+   CEC_TX_MAX_RETRIES = 3,
+};
+
  #define REG_HDMI_CTRL 0x
  #define HDMI_CTRL_ENABLE  0x0001
  #define HDMI_CTRL_HDMI    0x0002
@@ -476,20 +483,73 @@ static inline uint32_t HDMI_DDC_REF_REFTIMER(uint32_t val)
  #define REG_HDMI_HDCP_SW_LOWER_AKSV   0x0288
  
  #define REG_HDMI_CEC_CTRL	0x028c

+#define HDMI_CEC_CTRL_ENABLE   0x0001
+#define HDMI_CEC_CTRL_SEND_TRIGGER 0x0002
+#define HDMI_CEC_CTRL_FRAME_SIZE__MASK 0x01f0
+#define HDMI_CEC_CTRL_FRAME_SIZE__SHIFT    4
+static inline uint32_t HDMI_CEC_CTRL_FRAME_SIZE(uint32_t val)
+{
+   return ((val) << HDMI_CEC_CTRL_FRAME_SIZE__SHIFT) & 
HDMI_CEC_CTRL_FRAME_SIZE__MASK;
+}
+#define HDMI_CEC_CTRL_LINE_OE  0x0200
  
  #define REG_HDMI_CEC_WR_DATA	0x0290

+#define HDMI_CEC_WR_DATA_BROADCAST 0x0001
+#define HDMI_CEC_WR_DATA_DATA__MASK    0xff00
+#define HDMI_CEC_WR_DATA_DATA__SHIFT   8
+static inline uint32_t HDMI_CEC_WR_DATA_DATA(uint32_t val)
+{
+   return ((val) << HDMI_CEC_WR_DATA_DATA__SHIFT) & 
HDMI_CEC_WR_DATA_DATA__MASK;
+}
  
-#define REG_HDMI_CEC_CEC_RETRANSMIT    0x0294
+#define REG_HDMI_CEC_RETRANSMIT    0x0294
+#define HDMI_CEC_RETRANSMIT_ENABLE 0x0001
+#define HDMI_CEC_RETRANSMIT_COUNT__MASK    0x00fe
+#define HDMI_CEC_RETRANSMIT_COUNT__SHIFT   1
+static inline uint32_t HDMI_CEC_RETRANSMIT_COUNT(uint32_t val)
+{
+   return ((val) << HDMI_CEC_RETRANSMIT_COUNT__SHIFT) & 
HDMI_CEC_RETRANSMIT_COUNT__MASK;
+}
  
  #define REG_HDMI_CEC_STATUS	0x0298

+#define HDMI_CEC_STATUS_BUSY   0x0001
+#define HDMI_CEC_STATUS_TX_FRAME_DONE  0x0008
+#define HDMI_CEC_STATUS_TX_STATUS__MASK    0x00f0
+#define HDMI_CEC_STATUS_TX_STATUS__SHIFT   4
+static inline uint32_t HDMI_CEC_STATUS_TX_STATUS(enum hdmi_cec_tx_status val)
+{
+   return ((val) << HDMI_CEC_STATUS_TX_STATUS__SHIFT) & 
HDMI_CEC_STATUS_TX_STATUS__MASK;
+}
  
  #define REG_HDMI_CEC_INT	0x029c

+#define HDMI_CEC_INT_TX_DONE   0x0001
+#define HDMI_CEC_INT_TX_DONE_MASK  0x0002
+#define HDMI_CEC_INT_TX_ERROR  0x0004
+#define HDMI_CEC_INT_TX_ERROR_MASK 0x0008
+#define HDMI_CEC_INT_MONITOR   0x0010
+#define HDMI_CEC_INT_MONITOR_MASK  0x0020
+#define HDMI_CEC_INT_RX_DONE   0x0040
+#define HDMI_CEC_INT_RX_DONE_MASK  0x0080
  
  #define REG_HDMI_CEC_ADDR	0x02a0
  
  #define REG_HDMI_CEC_TIME	0x02a4

+#define HDMI_CEC_TIME_ENABLE   0x0001
+#define HDMI_CEC_TIME_SIGNAL_FREE_TIME__MASK   0xff80
+#define HDMI_CEC_TIME_SIGNAL_FREE_TIME__SHIFT  7
+static inline uint32_t HDMI_CEC_TIME_SIGNAL_FREE_TIME(uint32_t val)
+{
+   return ((val) << HDMI_CEC_TIME_SIGNAL_FREE_TIME__SHIFT) & 
HDMI_CEC_TIME_SIGNAL_FREE_TIME__MASK;
+}
  
  #define REG_HDMI_CEC_REFTIMER	0x02a8

+#define HDMI_CEC_REFTIMER_ENABLE   0x0001


I think this should come after the REFTIMER field.


+#define HDMI_CEC_REFTIMER_REFTIMER__MASK   0x
+#define HDMI_CEC_REFTIMER_REFTIMER__SHIFT  0
+static inline uint32_t HDMI_CEC_REFTIMER_REFTIMER(uint32_t val)
+{
+   return ((val) << HDMI_CEC_REFTIMER_REFTIMER__SHIFT) & 
HDMI_CEC_REFTIMER_REFTIMER__MASK;
+}
  
  #define REG_HDMI_CEC_RD_DATA

Re: [PATCH v3 0/3] drm/xe: switch to using drm_exec

2023-04-19 Thread Matthew Brost

On Wed, Apr 19, 2023 at 07:56:47PM +0200, Francois Dugast wrote:
> This makes Xe use the new drm_exec helpers provided by this series,
> which is not merged yet:
> https://patchwork.freedesktop.org/series/114464/
> 
> with this fix:
> https://patchwork.freedesktop.org/patch/530670/?series=112994=4
> 
> v3 includes code shared by Matthew Brost.
> 
> v2: add a first patch with squashed dependencies (Lucas De Marchi)
> v3:
>   - remove "RFC"
>   - add dependencies as original patches
>   - move drm_exec calls to xe_vm_lock_dma_resv/xe_vm_unlock_dma_resv,
> use new helper functions xe_vm_bo_lock/xe_vm_bo_unlock, fixes in
> drm_exec calls (Matthew Brost)
> 

For this series in general, I'd personally be inclined to include it in
the merge of [1], as the large GPUVA change isn't going to apply after
this series: GPUVA is really invasive and the rebase is non-trivial. Also,
based on a conversation with dakr [2] [3], we probably want to move some
of our locking helpers to GPUVA and not build DRM EXEC as a module.

Matt

[1] https://gitlab.freedesktop.org/drm/xe/kernel/-/merge_requests/340
[2] 
https://gitlab.freedesktop.org/drm/xe/kernel/-/merge_requests/340#note_1875039
[3] 
https://gitlab.freedesktop.org/nouvelles/kernel/-/tree/wip-gpuva?ref_type=heads

> Christian König (1):
>   drm: execution context for GEM buffers v3
> 
> Danilo Krummrich (1):
>   drm_exec: fix double dma_resv unlock
> 
> Francois Dugast (1):
>   drm/xe: switch to using drm_exec
> 
>  Documentation/gpu/drm-mm.rst |  12 ++
>  drivers/gpu/drm/Kconfig  |   6 +
>  drivers/gpu/drm/Makefile |   2 +
>  drivers/gpu/drm/drm_exec.c   | 248 +++
>  drivers/gpu/drm/xe/Kconfig   |   1 +
>  drivers/gpu/drm/xe/tests/xe_bo.c |  17 +-
>  drivers/gpu/drm/xe/xe_bo.c   |  29 +--
>  drivers/gpu/drm/xe/xe_bo.h   |   6 +-
>  drivers/gpu/drm/xe/xe_bo_evict.c |  24 ++-
>  drivers/gpu/drm/xe/xe_bo_types.h |   1 -
>  drivers/gpu/drm/xe/xe_exec.c |  30 +--
>  drivers/gpu/drm/xe/xe_gt_pagefault.c |  56 +-
>  drivers/gpu/drm/xe/xe_vm.c   | 287 +--
>  drivers/gpu/drm/xe/xe_vm.h   |  29 +--
>  drivers/gpu/drm/xe/xe_vm_madvise.c   |  36 ++--
>  include/drm/drm_exec.h   | 115 +++
>  16 files changed, 615 insertions(+), 284 deletions(-)
>  create mode 100644 drivers/gpu/drm/drm_exec.c
>  create mode 100644 include/drm/drm_exec.h
> 
> -- 
> 2.25.1
>

Re: [PATCH v3 3/3] drm/xe: switch to using drm_exec

2023-04-19 Thread Matthew Brost

On Wed, Apr 19, 2023 at 07:56:50PM +0200, Francois Dugast wrote:
> Replace the use of ttm_execbuf_util helpers with the drm_exec helpers.
> 
> Signed-off-by: Francois Dugast 
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/xe/Kconfig   |   1 +
>  drivers/gpu/drm/xe/tests/xe_bo.c |  17 +-
>  drivers/gpu/drm/xe/xe_bo.c   |  29 +--
>  drivers/gpu/drm/xe/xe_bo.h   |   6 +-
>  drivers/gpu/drm/xe/xe_bo_evict.c |  24 ++-
>  drivers/gpu/drm/xe/xe_bo_types.h |   1 -
>  drivers/gpu/drm/xe/xe_exec.c |  30 +--
>  drivers/gpu/drm/xe/xe_gt_pagefault.c |  56 +-
>  drivers/gpu/drm/xe/xe_vm.c   | 287 +--
>  drivers/gpu/drm/xe/xe_vm.h   |  29 +--
>  drivers/gpu/drm/xe/xe_vm_madvise.c   |  36 ++--
>  11 files changed, 232 insertions(+), 284 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig
> index f6f3b491d162..bbcc9b64b776 100644
> --- a/drivers/gpu/drm/xe/Kconfig
> +++ b/drivers/gpu/drm/xe/Kconfig
> @@ -8,6 +8,7 @@ config DRM_XE
>   select SHMEM
>   select TMPFS
>   select DRM_BUDDY
> + select DRM_EXEC
>   select DRM_KMS_HELPER
>   select DRM_PANEL
>   select DRM_SUBALLOC_HELPER
> diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c 
> b/drivers/gpu/drm/xe/tests/xe_bo.c
> index 9bd381e5b7a6..78e43fd5c909 100644
> --- a/drivers/gpu/drm/xe/tests/xe_bo.c
> +++ b/drivers/gpu/drm/xe/tests/xe_bo.c
> @@ -176,6 +176,7 @@ static int evict_test_run_gt(struct xe_device *xe, struct 
> xe_gt *gt, struct kuni
>   XE_BO_CREATE_VRAM_IF_DGFX(gt);
>   struct xe_vm *vm = xe_migrate_get_vm(xe->gt[0].migrate);
>   struct ww_acquire_ctx ww;
> + struct drm_exec exec;
>   int err, i;
>  
>   kunit_info(test, "Testing device %s gt id %u vram id %u\n",
> @@ -198,9 +199,9 @@ static int evict_test_run_gt(struct xe_device *xe, struct 
> xe_gt *gt, struct kuni
>   goto cleanup_bo;
>   }
>  
> - xe_bo_lock(external, , 0, false);
> + xe_bo_lock(external, , 0, false);
>   err = xe_bo_pin_external(external);
> - xe_bo_unlock(external, );
> + xe_bo_unlock(external, );
>   if (err) {
>   KUNIT_FAIL(test, "external bo pin err=%pe\n",
>  ERR_PTR(err));
> @@ -249,9 +250,9 @@ static int evict_test_run_gt(struct xe_device *xe, struct 
> xe_gt *gt, struct kuni
>  ERR_PTR(err));
>   goto cleanup_all;
>   }
> - xe_bo_lock(external, , 0, false);
> + xe_bo_lock(external, , 0, false);
>   err = xe_bo_validate(external, NULL, false);
> - xe_bo_unlock(external, );
> + xe_bo_unlock(external, );
>   if (err) {
>   KUNIT_FAIL(test, "external bo valid err=%pe\n",
>  ERR_PTR(err));
> @@ -259,18 +260,18 @@ static int evict_test_run_gt(struct xe_device *xe, 
> struct xe_gt *gt, struct kuni
>   }
>   }
>  
> - xe_bo_lock(external, , 0, false);
> + xe_bo_lock(external, , 0, false);
>   xe_bo_unpin_external(external);
> - xe_bo_unlock(external, );
> + xe_bo_unlock(external, );
>  
>   xe_bo_put(external);
>   xe_bo_put(bo);
>   continue;
>  
>  cleanup_all:
> - xe_bo_lock(external, , 0, false);
> + xe_bo_lock(external, , 0, false);
>   xe_bo_unpin_external(external);
> - xe_bo_unlock(external, );
> + xe_bo_unlock(external, );
>  cleanup_external:
>   xe_bo_put(external);
>  cleanup_bo:
> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> index 3ab404e33fae..bb185093c5e0 100644
> --- a/drivers/gpu/drm/xe/xe_bo.c
> +++ b/drivers/gpu/drm/xe/xe_bo.c
> @@ -8,6 +8,7 @@
>  #include 
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1720,26 +1721,30 @@ int xe_gem_mmap_offset_ioctl(struct drm_device *dev, 
> void *data,
>   return 0;
>  }
>  
> -int xe_bo_lock(struct xe_bo *bo, struct ww_acquire_ctx *ww,
> +int xe_bo_lock(struct xe_bo *bo, struct drm_exec *exec,
>  int num_resv, bool intr)
>  {
> - struct ttm_validate_buffer tv_bo;
> - LIST_HEAD(objs);
> - LIST_HEAD(dups);
> + int err;
>  
> - XE_BUG_ON(!ww);
> + drm_exec_init(exec, intr);
> + drm_exec_while_not_all_locked(exec) {
> + err = drm_exec_prepare_obj(exec, >ttm.base,
> +num_resv);
> + drm_exec_continue_on_contention(exec);
> + if (err && err != -EALREADY)
> + goto out_err;
> + }
>  
> - tv_bo.num_shared = num_resv;
> - tv_bo.bo =

Re: [PATCH v4 08/12] drm/display/dsc: add YCbCr 4:2:2 and 4:2:0 RC parameters

2023-04-19 Thread Dmitry Baryshkov


On 13/04/2023 20:25, Kandpal, Suraj wrote:

Hi,

Include RC parameters for YCbCr 4:2:2 and 4:2:0 configurations.



Looks Good to me

Reviewed-by: Suraj Kandpal 


And a gentle reminder for patches 9-12. We would kindly ask to get these 
patches reviewed and ready to be merged into drm-intel after -rc1 or 
-rc2, so that we can backmerge the drm tree into msm-next and enqueue 
the msm-specific DSC patches early in the cycle. Thank you for your 
understanding and collaboration.


--
With best wishes
Dmitry

Re: [PATCH 11/11] drm/msm/dpu: do not use mixer that supports dspp when not required

2023-04-19 Thread Dmitry Baryshkov


On 19/04/2023 17:41, Arnaud Vrac wrote:

This avoids using lm blocks that support DSPP when not needed, to
keep those resources available.


This will break some of the platforms. Consider qcm2290 which has a 
single LM with DSPP. So, _dpu_rm_check_lm_and_get_connected_blks should 
be performed in two steps: first skip non-DSPP-enabled LMs when DSPP is 
not required. Then, if the LM (pair) is not found, look for any suitable 
LM(pair).
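
A rough sketch of the two-step lookup described above, assuming the first pass is meant to pass over DSPP-capable mixers when no DSPP is requested, and the second pass accepts any free mixer. The model_* types and the function are hypothetical and only model a single-LM lookup, not the pair handling in _dpu_rm_check_lm_and_get_connected_blks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a layer mixer in the resource manager. */
struct model_lm {
	bool has_dspp;	/* mixer is wired to a DSPP block */
	bool in_use;	/* mixer is already reserved */
};

/* Returns the index of the chosen mixer, or -1 if none is available. */
int model_pick_lm(const struct model_lm *lms, size_t count, bool need_dspp)
{
	size_t i;

	if (need_dspp) {
		/* A DSPP was requested: only DSPP-capable mixers qualify. */
		for (i = 0; i < count; i++)
			if (!lms[i].in_use && lms[i].has_dspp)
				return (int)i;
		return -1;
	}

	/* Pass 1: keep DSPP-capable mixers free if a plain one exists. */
	for (i = 0; i < count; i++)
		if (!lms[i].in_use && !lms[i].has_dspp)
			return (int)i;

	/* Pass 2: nothing else is left, so take any free mixer. */
	for (i = 0; i < count; i++)
		if (!lms[i].in_use)
			return (int)i;

	return -1;
}
```

With a single DSPP-capable mixer (the qcm2290-like case), the second pass still returns it, so the request does not fail.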




Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
index f4dda88a73f7d..4b393d46c743f 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
@@ -362,7 +362,7 @@ static bool _dpu_rm_check_lm_and_get_connected_blks(struct 
dpu_rm *rm,
*pp_idx = idx;
  
  	if (!reqs->topology.num_dspp)

-   return true;
+   return !lm_cfg->dspp;
  
  	idx = lm_cfg->dspp - DSPP_0;

if (idx < 0 || idx >= ARRAY_SIZE(rm->dspp_blks)) {



--
With best wishes
Dmitry

Re: [PATCH 10/11] drm/msm/dpu: tweak lm pairings in msm8998 hw catalog

2023-04-19 Thread Dmitry Baryshkov


On 19/04/2023 17:41, Arnaud Vrac wrote:

Change lm blocks pairs so that lm blocks with the same features are
paired together:

LM_0 and LM_1 with PP and DSPP
LM_2 and LM_5 with PP
LM_3 and LM_4

This matches the sdm845 configuration and allows using pp or dspp when 2
lm blocks are needed in the topology. In the previous config the
reservation code could never find an lm pair without a matching feature
set.
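
The pairing rule above can be checked mechanically. A small sketch, assuming a simplified table that records only the peer and the PP/DSPP capabilities listed in this message; the model_* names and feature flags are hypothetical and do not reproduce the real dpu_lm_cfg layout.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability flags for a layer mixer. */
#define MODEL_FEAT_PP	(1u << 0)
#define MODEL_FEAT_DSPP	(1u << 1)

struct model_lm_cfg {
	int id;			/* LM index, e.g. 0 for LM_0 */
	int pair_id;		/* index of the paired LM */
	unsigned int features;	/* MODEL_FEAT_* bits */
};

/* Pairings as listed above: 0<->1 (PP+DSPP), 2<->5 (PP), 3<->4 (none). */
static const struct model_lm_cfg model_msm8998_lm[] = {
	{ 0, 1, MODEL_FEAT_PP | MODEL_FEAT_DSPP },
	{ 1, 0, MODEL_FEAT_PP | MODEL_FEAT_DSPP },
	{ 2, 5, MODEL_FEAT_PP },
	{ 3, 4, 0 },
	{ 4, 3, 0 },
	{ 5, 2, MODEL_FEAT_PP },
};

static const struct model_lm_cfg *
model_find_lm(const struct model_lm_cfg *lms, size_t count, int id)
{
	for (size_t i = 0; i < count; i++)
		if (lms[i].id == id)
			return &lms[i];
	return NULL;
}

/* True if every LM has a mutual peer with an identical feature set. */
bool model_pairs_match(const struct model_lm_cfg *lms, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		const struct model_lm_cfg *peer =
			model_find_lm(lms, count, lms[i].pair_id);

		if (!peer || peer->pair_id != lms[i].id)
			return false;
		if (peer->features != lms[i].features)
			return false;
	}
	return true;
}
```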


And this matches the hardcoded configuration in msm-4.4

Fixes: 94391a14fc27 ("drm/msm/dpu1: Add MSM8998 to hw catalog")

Reviewed-by: Dmitry Baryshkov 



Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)



--
With best wishes
Dmitry

Re: [PATCH 08/11] drm/msm/dpu: fix cursor block register bit offset in msm8998 hw catalog

2023-04-19 Thread Dmitry Baryshkov


On 19/04/2023 17:41, Arnaud Vrac wrote:

This matches the value for both fbdev and sde implementations in the
downstream msm-4.4 repository.

Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [PATCH 07/11] drm/msm/dpu: add sspp cursor blocks to msm8998 hw catalog

2023-04-19 Thread Dmitry Baryshkov


On 19/04/2023 17:41, Arnaud Vrac wrote:

Now that cursor sspp blocks can be used for cursor planes, enable them
on msm8998. The dma sspp blocks that were assigned to cursor planes can
now be used for overlay planes instead.


While the change is correct, there is more to it. Composers using 
universal planes will see this plane too, and they have no obligation to 
use it only for the cursor. At a minimum, could you please extend 
plane_atomic_check to check the plane dimensions for the CURSOR pipes?
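
A minimal sketch of such a dimension check, modelled outside the kernel. The size limit MODEL_CURSOR_MAX_SIZE is a made-up placeholder, not a value taken from the hardware catalog, and the model_* types are hypothetical stand-ins for the plane state and SSPP type.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder limit; the real maximum cursor size is an assumption here. */
#define MODEL_CURSOR_MAX_SIZE 128

enum model_sspp_type {
	MODEL_SSPP_TYPE_DMA,
	MODEL_SSPP_TYPE_CURSOR,
};

/* Hypothetical subset of the plane state: destination size on the CRTC. */
struct model_plane_state {
	uint32_t crtc_w;
	uint32_t crtc_h;
};

/*
 * Reject oversized framebuffers on cursor pipes so that a compositor
 * treating the plane as a generic overlay gets a clean -EINVAL from the
 * atomic check instead of a misprogrammed pipe.
 */
int model_plane_atomic_check(enum model_sspp_type type,
			     const struct model_plane_state *state)
{
	if (type != MODEL_SSPP_TYPE_CURSOR)
		return 0;

	if (state->crtc_w > MODEL_CURSOR_MAX_SIZE ||
	    state->crtc_h > MODEL_CURSOR_MAX_SIZE)
		return -EINVAL;

	return 0;
}
```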


For this change:

Reviewed-by: Dmitry Baryshkov 



Signed-off-by: Arnaud Vrac 
---
  .../drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h|  8 +++--
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 34 ++
  2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
index b07e8a9941f79..7de393b0f91d7 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
@@ -90,10 +90,14 @@ static const struct dpu_sspp_cfg msm8998_sspp[] = {
sdm845_dma_sblk_0, 1, SSPP_TYPE_DMA, DPU_CLK_CTRL_DMA0),
SSPP_BLK("sspp_9", SSPP_DMA1, 0x26000, 0x1ac, DMA_MSM8998_MASK,
sdm845_dma_sblk_1, 5, SSPP_TYPE_DMA, DPU_CLK_CTRL_DMA1),
-   SSPP_BLK("sspp_10", SSPP_DMA2, 0x28000, 0x1ac, DMA_CURSOR_MSM8998_MASK,
+   SSPP_BLK("sspp_10", SSPP_DMA2, 0x28000, 0x1ac, DMA_MSM8998_MASK,
sdm845_dma_sblk_2, 9, SSPP_TYPE_DMA, DPU_CLK_CTRL_DMA2),
-   SSPP_BLK("sspp_11", SSPP_DMA3, 0x2a000, 0x1ac, DMA_CURSOR_MSM8998_MASK,
+   SSPP_BLK("sspp_11", SSPP_DMA3, 0x2a000, 0x1ac, DMA_MSM8998_MASK,
sdm845_dma_sblk_3, 13, SSPP_TYPE_DMA, DPU_CLK_CTRL_DMA3),
+   SSPP_BLK("sspp_12", SSPP_CURSOR0, 0x34000, 0x1ac, 
DMA_CURSOR_MSM8998_MASK,
+   msm8998_cursor_sblk_0, 2, SSPP_TYPE_CURSOR, 
DPU_CLK_CTRL_CURSOR0),
+   SSPP_BLK("sspp_13", SSPP_CURSOR1, 0x36000, 0x1ac, 
DMA_CURSOR_MSM8998_MASK,
+   msm8998_cursor_sblk_1, 10, SSPP_TYPE_CURSOR, 
DPU_CLK_CTRL_CURSOR1),
  };
  
  static const struct dpu_lm_cfg msm8998_lm[] = {

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index 8d5d782a43398..f34fa704936bc 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -242,6 +242,22 @@ static const uint32_t wb2_formats[] = {
DRM_FORMAT_XBGR,
  };
  
+static const uint32_t cursor_formats[] = {

+   DRM_FORMAT_ARGB,
+   DRM_FORMAT_ABGR,
+   DRM_FORMAT_RGBA,
+   DRM_FORMAT_BGRA,
+   DRM_FORMAT_XRGB,
+   DRM_FORMAT_ARGB1555,
+   DRM_FORMAT_ABGR1555,
+   DRM_FORMAT_RGBA5551,
+   DRM_FORMAT_BGRA5551,
+   DRM_FORMAT_ARGB,
+   DRM_FORMAT_ABGR,
+   DRM_FORMAT_RGBA,
+   DRM_FORMAT_BGRA,
+};
+
  /*
   * SSPP sub blocks config
   */
@@ -300,6 +316,19 @@ static const uint32_t wb2_formats[] = {
.virt_num_formats = ARRAY_SIZE(plane_formats), \
}
  
+#define _CURSOR_SBLK(num) \

+   { \
+   .maxdwnscale = SSPP_UNITY_SCALE, \
+   .maxupscale = SSPP_UNITY_SCALE, \
+   .smart_dma_priority = 0, \
+   .src_blk = {.name = STRCAT("sspp_src_", num), \
+   .id = DPU_SSPP_SRC, .base = 0x00, .len = 0x150,}, \
+   .format_list = cursor_formats, \
+   .num_formats = ARRAY_SIZE(cursor_formats), \
+   .virt_format_list = cursor_formats, \
+   .virt_num_formats = ARRAY_SIZE(cursor_formats), \
+   }
+
  static const struct dpu_sspp_sub_blks msm8998_vig_sblk_0 =
_VIG_SBLK("0", 0, DPU_SSPP_SCALER_QSEED3);
  static const struct dpu_sspp_sub_blks msm8998_vig_sblk_1 =
@@ -309,6 +338,11 @@ static const struct dpu_sspp_sub_blks msm8998_vig_sblk_2 =
  static const struct dpu_sspp_sub_blks msm8998_vig_sblk_3 =
_VIG_SBLK("3", 0, DPU_SSPP_SCALER_QSEED3);
  
+static const struct dpu_sspp_sub_blks msm8998_cursor_sblk_0 =

+   _CURSOR_SBLK("12");
+static const struct dpu_sspp_sub_blks msm8998_cursor_sblk_1 =
+   _CURSOR_SBLK("13");
+
  static const struct dpu_rotation_cfg dpu_rot_sc7280_cfg_v2 = {
.rot_maxheight = 1088,
.rot_num_formats = ARRAY_SIZE(rotation_v2_formats),



--
With best wishes
Dmitry

RE: [Intel-gfx] [PATCH 2/8] drm/i915/mtl: Define MOCS and PAT tables for MTL

2023-04-19 Thread Yang, Fei

> Hi Fei,
>
>> +#define MTL_PPGTT_PTE_PAT3  BIT_ULL(62)
>>  #define GEN12_PPGTT_PTE_LM  BIT_ULL(11)
>> +#define GEN12_PPGTT_PTE_PAT2    BIT_ULL(7)
>> +#define GEN12_PPGTT_PTE_NC  BIT_ULL(5)
>> +#define GEN12_PPGTT_PTE_PAT1    BIT_ULL(4)
>> +#define GEN12_PPGTT_PTE_PAT0    BIT_ULL(3)
>>
>> -#define GEN12_GGTT_PTE_LM   BIT_ULL(1)
>> +#define GEN12_GGTT_PTE_LM   BIT_ULL(1)
>> +#define MTL_GGTT_PTE_PAT0   BIT_ULL(52)
>> +#define MTL_GGTT_PTE_PAT1   BIT_ULL(53)
>> +#define GEN12_GGTT_PTE_ADDR_MASK    GENMASK_ULL(45, 12)
>> +#define MTL_GGTT_PTE_PAT_MASK   GENMASK_ULL(53, 52)
>>
>>  #define GEN12_PDE_64K BIT(6)
>>  #define GEN12_PTE_PS64 BIT(8)
>> @@ -147,6 +156,15 @@ typedef u64 gen8_pte_t;  #define GEN8_PDE_IPS_64K
>> BIT(11)
>>  #define GEN8_PDE_PS_2M   BIT(7)
>>
>> +#define MTL_PPAT_L4_CACHE_POLICY_MASK   REG_GENMASK(3, 2)
>> +#define MTL_PAT_INDEX_COH_MODE_MASK REG_GENMASK(1, 0)
>> +#define MTL_PPAT_L4_3_UC    REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 3)
>> +#define MTL_PPAT_L4_1_WT    REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 1)
>> +#define MTL_PPAT_L4_0_WB    REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 0)
>> +#define MTL_3_COH_2W    REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 3)
>> +#define MTL_2_COH_1W    REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
>> +#define MTL_0_COH_NON   REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)
>
> BTW, are all these defines needed? Not all of them look to be used.

Yes, these are all used, though not in this patch: they are used in the next 
patch, which defines the pte_encode functions.
I think the only one that might not be used is MTL_GGTT_PTE_PAT_MASK, because I 
ended up checking each bit instead of extracting the PAT bits and comparing 
them against the possible values.
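
To illustrate the per-bit approach, here is a small stand-alone sketch that places a two-bit PAT index into GGTT PTE bits 52 and 53, the positions given by the defines quoted above. The MODEL_* macros and the function name are hypothetical and only mirror those bit positions; this is not the actual pte_encode function from the next patch.

```c
#include <stdint.h>

/* Local stand-ins mirroring MTL_GGTT_PTE_PAT0/PAT1 (bits 52 and 53). */
#define MODEL_BIT_ULL(n)	(1ULL << (n))
#define MODEL_GGTT_PTE_PAT0	MODEL_BIT_ULL(52)
#define MODEL_GGTT_PTE_PAT1	MODEL_BIT_ULL(53)

/*
 * Set the PAT bits of a GGTT PTE one bit at a time from pat_index,
 * rather than shifting the index into place through a field mask.
 */
uint64_t model_ggtt_pte_encode_pat(uint64_t pte, unsigned int pat_index)
{
	if (pat_index & 1u)
		pte |= MODEL_GGTT_PTE_PAT0;
	if (pat_index & 2u)
		pte |= MODEL_GGTT_PTE_PAT1;

	return pte;
}
```

Because each bit is tested separately, a field mask such as MTL_GGTT_PTE_PAT_MASK is not needed on this path.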

-Fei

> Andi

[PATCH 7/8] drm/i915: use pat_index instead of cache_level

2023-04-19 Thread fei . yang

From: Fei Yang 

Currently the KMD uses enum i915_cache_level to set the caching policy for
buffer objects. This is flaky because the PAT index, which really controls
the caching behavior in the PTE, has far more levels than the enum defines.
In addition, the PAT index is platform dependent, so translating between
i915_cache_level and PAT index is unreliable and makes the code more
complicated.

From the UMD's perspective there is also a need to set the caching policy for
performance fine-tuning. It's much easier for the UMD to use the PAT index
directly, because the behavior of each PAT index is clearly defined in Bspec.
Having the abstracted i915_cache_level sitting in between would only cause
more ambiguity.

For these reasons this patch replaces i915_cache_level with PAT index. Note
that cache_level is not completely removed yet, because the KMD still needs
to create buffer objects with simple cache settings such as cached,
uncached, or writethrough. For such simple cases, using cache_level helps
simplify the code.
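
As a rough illustration of how the two can coexist, the sketch below keeps a small cache-level enum for in-kernel callers and resolves it to a PAT index through a per-platform table, so objects only ever store the PAT index. All model_* names are hypothetical and the table values are placeholders, not indices taken from Bspec.

```c
/* Hypothetical simple cache settings for in-kernel users. */
enum model_cache_level {
	MODEL_CACHE_UNCACHED,
	MODEL_CACHE_CACHED,
	MODEL_CACHE_WRITETHROUGH,
	MODEL_CACHE_LEVELS,
};

/* Per-platform translation table; the values below are placeholders. */
struct model_platform {
	unsigned int cachelevel_to_pat[MODEL_CACHE_LEVELS];
};

static const struct model_platform model_platform_a = {
	.cachelevel_to_pat = {
		[MODEL_CACHE_UNCACHED] = 0,
		[MODEL_CACHE_CACHED] = 1,
		[MODEL_CACHE_WRITETHROUGH] = 3,
	},
};

static const struct model_platform model_platform_b = {
	.cachelevel_to_pat = {
		[MODEL_CACHE_UNCACHED] = 2,
		[MODEL_CACHE_CACHED] = 3,
		[MODEL_CACHE_WRITETHROUGH] = 1,
	},
};

/* The object records the platform-specific PAT index, not the enum. */
struct model_object {
	unsigned int pat_index;
};

unsigned int model_get_pat_index(const struct model_platform *platform,
				 enum model_cache_level level)
{
	return platform->cachelevel_to_pat[level];
}

void model_object_set_cache(struct model_object *obj,
			    const struct model_platform *platform,
			    enum model_cache_level level)
{
	obj->pat_index = model_get_pat_index(platform, level);
}
```

The same cache level resolves to different PAT indices on the two placeholder platforms.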

Cc: Chris Wilson 
Cc: Matt Roper 
Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
---
 drivers/gpu/drm/i915/display/intel_dpt.c  | 12 +--
 drivers/gpu/drm/i915/gem/i915_gem_domain.c| 27 ++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 52 +++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  4 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 25 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 71 
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |  3 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 82 +--
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 20 ++---
 drivers/gpu/drm/i915/gt/intel_migrate.c   | 47 ++-
 drivers/gpu/drm/i915/gt/intel_migrate.h   | 13 ++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c| 47 ++-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c|  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c   | 55 ++---
 drivers/gpu/drm/i915/i915_gem.c   | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c |  8 +-
 drivers/gpu/drm/i915/i915_vma.c   | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h   |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  2 -
 drivers/gpu/drm/i915/selftests/i915_gem.c |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 drivers/gpu/drm/i915/selftests/mock_gtt.c |  8 +-
 36 files changed, 378 insertions(+), 239 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
b/drivers/gpu/drm/i915/display/intel_dpt.c
index c5eacfdba1a5..7c5fddb203ba 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -43,24 +43,24 @@ static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
 static void dpt_insert_page(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
-   enum i915_cache_level level,
+   unsigned int pat_index,
u32 flags)
 {
struct i915_dpt *dpt = i915_vm_to_dpt(vm);
gen8_pte_t __iomem *base = dpt->iomem;
 
gen8_set_pte(base + offset / I915_GTT_PAGE_SIZE,
-vm->pte_encode(addr, level, flags));
+vm->pte_encode(addr, pat_index, flags));
 }
 
 static void dpt_insert_entries(struct i915_address_space *vm,
   struct i915_vma_resource *vma_res,
-  enum i915_cache_level level,
+  unsigned int pat_index,
   u32 flags)
 {
struct i915_dpt *dpt = i915_vm_to_dpt(vm);
gen8_pte_t __iomem *base = dpt->iomem;
-   const gen8_pte_t pte_encode = vm->pte_encode(0, level, flags);
+   const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
struct sgt_iter sgt_iter;
dma_addr_t addr;
int i;
@@ -83,7 +83,7 @@ static void dpt_clear_range(struct i915_address_space *vm,
 static void dpt_bind_vma(struct i915_address_space *vm,
 struct i915_vm_pt_stash *stash,
 struct

[PATCH 6/8] drm/i915: preparation for using PAT index

2023-04-19 Thread fei . yang

From: Fei Yang 

This patch is a preparation for replacing enum i915_cache_level with PAT
index. Caching policy for buffer objects is set through the PAT index in
PTE, the old i915_cache_level is not sufficient to represent all caching
modes supported by the hardware.

Prepare for the transition by adding some platform dependent data
structures and helper functions to translate the cache_level to pat_index.

cachelevel_to_pat: a platform dependent array mapping cache_level to
   pat_index.

max_pat_index: the maximum PAT index supported by the hardware. Needed for
   validating the PAT index passed in from user space.

i915_gem_get_pat_index: function to convert cache_level to PAT index.

obj_to_i915(obj): macro moved to header file for wider usage.

I915_MAX_CACHE_LEVEL: upper bound of i915_cache_level for the
  convenience of coding.

Cc: Chris Wilson 
Cc: Matt Roper 
Cc: Andi Shyti 
Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  9 +++
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  4 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  |  6 ++
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |  6 ++
 drivers/gpu/drm/i915/i915_pci.c   | 75 +--
 drivers/gpu/drm/i915/intel_device_info.h  |  5 ++
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  9 +++
 9 files changed, 107 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 4666bb82f312..8c70a0ec7d2f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -45,6 +45,15 @@ static struct kmem_cache *slab_objects;
 
 static const struct drm_gem_object_funcs i915_gem_object_funcs;
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+   enum i915_cache_level level)
+{
+   if (drm_WARN_ON(&i915->drm, level >= I915_MAX_CACHE_LEVEL))
+   return 0;
+
+   return INTEL_INFO(i915)->cachelevel_to_pat[level];
+}
+
 struct drm_i915_gem_object *i915_gem_object_alloc(void)
 {
struct drm_i915_gem_object *obj;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 885ccde9dc3c..4c92e17b4337 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -20,6 +20,8 @@
 
 enum intel_region_id;
 
+#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
+
 static inline bool i915_gem_object_size_2big(u64 size)
 {
struct drm_i915_gem_object *obj;
@@ -30,6 +32,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
return false;
 }
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+   enum i915_cache_level level);
 void i915_gem_init__objects(struct drm_i915_private *i915);
 
 void i915_objects_module_exit(void);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 830c11431ee8..41b35abccf88 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -194,6 +194,7 @@ enum i915_cache_level {
 * engine.
 */
I915_CACHE_WT,
+   I915_MAX_CACHE_LEVEL,
 };
 
 enum i915_map_type {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index b1672e054b21..214763942aa2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -460,8 +460,6 @@ void i915_gem_shrinker_taints_mutex(struct drm_i915_private 
*i915,
fs_reclaim_release(GFP_KERNEL);
 }
 
-#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
-
 /**
  * i915_gem_object_make_unshrinkable - Hide the object from the shrinker. By
  * default all object types that support shrinking(see IS_SHRINKABLE), will 
also
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 11b91e0453c8..7a4b1d1afce9 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -78,6 +78,12 @@ static u64 mtl_pte_encode(dma_addr_t addr,
case I915_CACHE_WT:
pte |= GEN12_PPGTT_PTE_PAT0;
break;
+   default:
+   /* This should never happen. Added to deal with the compile
+* error due to the addition of I915_MAX_CACHE_LEVEL. Will
+* be removed by the pat_index patch.
+*/
+   break;
}
 
return pte;
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 20915edc8bd9..c8390d03fce2 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++
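
Editorial sketch of the lookup described in the commit message above: a
per-platform table maps each cache_level to a PAT index, with a bounds
check against I915_MAX_CACHE_LEVEL. This is a standalone userspace
illustration, not the driver code. The demo_* names and the table values
are assumptions (the values simply follow the MTL encoding shown in
patch 3/8 of this series).

```c
#include <stdio.h>

/* Mirrors the enum from the patch; I915_MAX_CACHE_LEVEL is the upper bound. */
enum i915_cache_level {
	I915_CACHE_NONE = 0,
	I915_CACHE_LLC,
	I915_CACHE_L3_LLC,
	I915_CACHE_WT,
	I915_MAX_CACHE_LEVEL,
};

/* Hypothetical stand-in for the per-platform device info. */
struct demo_device_info {
	unsigned int cachelevel_to_pat[I915_MAX_CACHE_LEVEL];
	unsigned int max_pat_index;
};

/* Illustrative values only; real tables live in the platform definitions. */
static const struct demo_device_info demo_mtl_info = {
	.cachelevel_to_pat = {
		[I915_CACHE_NONE]   = 2,
		[I915_CACHE_LLC]    = 3,
		[I915_CACHE_L3_LLC] = 3,
		[I915_CACHE_WT]     = 1,
	},
	.max_pat_index = 4,
};

/* Translate a cache level to a PAT index, returning 0 for bad input. */
static unsigned int demo_get_pat_index(const struct demo_device_info *info,
				       enum i915_cache_level level)
{
	if ((unsigned int)level >= I915_MAX_CACHE_LEVEL) {
		fprintf(stderr, "invalid cache level %d\n", (int)level);
		return 0;
	}

	return info->cachelevel_to_pat[level];
}
```

Sizing the array with I915_MAX_CACHE_LEVEL is what lets the bounds check
and the table stay in sync when a new level is added to the enum.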

[PATCH 4/8] drm/i915/mtl: workaround coherency issue for Media

2023-04-19 Thread fei . yang

From: Fei Yang 

This patch implements Wa_22016122933.

In MTL, memory writes initiated by Media tile update the whole
cache line even for partial writes. This creates a coherency
problem for cacheable memory if both CPU and GPU are writing data
to different locations within a single cache line. CTB communication
is impacted by this issue because the head and tail pointers are
adjacent words within a cache line (see struct guc_ct_buffer_desc),
where one is written by GuC and the other by the host.
This patch circumvents the issue by making CPU/GPU shared memory
uncacheable (WC on CPU side, and PAT index 2 for GPU). Also for
CTB which is being updated by both CPU and GuC, mfence instruction
is added to make sure the CPU writes are visible to GPU right away
(flush the write combining buffer).

While fixing the CTB issue, we noticed random GSC firmware loading
failures because the shared buffers are cacheable (WB) on the CPU side
but uncached on the GPU side. To fix these issues we need to map such
shared buffers as WC on the CPU side. Since such allocations are not all
done through the GuC allocator, to avoid too many code changes,
i915_coherent_map_type() is now hard coded to return WC for MTL.

BSpec: 45101

Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
Acked-by: Nirmoy Das 
---
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |  5 -
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 13 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|  7 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  6 ++
 4 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c 
b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index ecd86130b74f..89fc8ea6bcfc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -469,7 +469,10 @@ enum i915_map_type i915_coherent_map_type(struct 
drm_i915_private *i915,
  struct drm_i915_gem_object *obj,
  bool always_coherent)
 {
-   if (i915_gem_object_is_lmem(obj))
+   /*
+* Wa_22016122933: always return I915_MAP_WC for MTL
+*/
+   if (i915_gem_object_is_lmem(obj) || IS_METEORLAKE(i915))
return I915_MAP_WC;
if (HAS_LLC(i915) || always_coherent)
return I915_MAP_WB;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
index 1d9fdfb11268..236673c02f9a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
@@ -110,6 +110,13 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc)
if (obj->base.size < gsc->fw.size)
return -ENOSPC;
 
+   /*
+* Wa_22016122933: For MTL the shared memory needs to be mapped
+* as WC on CPU side and UC (PAT index 2) on GPU side
+*/
+   if (IS_METEORLAKE(i915))
+   i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
+
dst = i915_gem_object_pin_map_unlocked(obj,
   i915_coherent_map_type(i915, 
obj, true));
if (IS_ERR(dst))
@@ -125,6 +132,12 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc)
memset(dst, 0, obj->base.size);
memcpy(dst, src, gsc->fw.size);
 
+   /*
+* Wa_22016122933: Making sure the data in dst is
+* visible to GSC right away
+*/
+   intel_guc_write_barrier(&gt->uc.guc);
+
i915_gem_object_unpin_map(gsc->fw.obj);
i915_gem_object_unpin_map(obj);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index e89f16ecf1ae..c9f20385f6a0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -744,6 +744,13 @@ struct i915_vma *intel_guc_allocate_vma(struct intel_guc 
*guc, u32 size)
if (IS_ERR(obj))
return ERR_CAST(obj);
 
+   /*
+* Wa_22016122933: For MTL the shared memory needs to be mapped
+* as WC on CPU side and UC (PAT index 2) on GPU side
+*/
+   if (IS_METEORLAKE(gt->i915))
+   i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
+
vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
if (IS_ERR(vma))
goto err;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 1803a633ed64..99a0a89091e7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -902,6 +902,12 @@ static int ct_read(struct intel_guc_ct *ct, struct 
ct_incoming_msg **msg)
/* now update descriptor */
WRITE_ONCE(desc->head, head);
 
+   /*
+* Wa_22016122933: Making sure the head update is
+* visible to GuC right away
+*/
+   intel_guc_write_barrier(ct_to_guc(ct));
+
return available - len;
 
 corrupted:
-- 
2.25.1
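
Editorial sketch of why adjacent head and tail words are exposed to the
whole-cache-line write problem described in the commit message above. It
only checks whether two field offsets fall in the same cache line. The
64-byte line size, the demo_* names, and the struct layout are
assumptions for illustration (this is not struct guc_ct_buffer_desc),
and the struct is assumed to start on a cache-line boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size for this illustration. */
#define DEMO_CACHELINE_BYTES 64u

/* Hypothetical descriptor with two adjacent words owned by different writers. */
struct demo_ct_desc {
	uint32_t head;   /* written by one side */
	uint32_t tail;   /* written by the other side */
	uint32_t status;
};

/* Return 1 if both byte offsets land in the same cache line, else 0. */
static int demo_same_cacheline(size_t off_a, size_t off_b)
{
	return off_a / DEMO_CACHELINE_BYTES == off_b / DEMO_CACHELINE_BYTES;
}

/* Return 1 if head and tail of the hypothetical descriptor share a line. */
static int demo_head_tail_share_line(void)
{
	return demo_same_cacheline(offsetof(struct demo_ct_desc, head),
				   offsetof(struct demo_ct_desc, tail));
}
```

When the two fields share a line and one writer updates the full line on
a partial write, the other writer's update can be overwritten, which is
why the patch maps such memory uncached rather than relying on layout.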

[PATCH 8/8] drm/i915: Allow user to set cache at BO creation

2023-04-19 Thread fei . yang

From: Fei Yang 

To comply with the design that buffer objects shall have an immutable
cache setting throughout their life cycle, the {set, get}_caching ioctls
are no longer supported from MTL onward. With that change the caching
policy can only be set at object creation time. The current code
applies a default (platform dependent) cache setting for all objects.
However this is not optimal for performance tuning. The patch extends
the existing gem_create uAPI to let the user set the PAT index for the
object at creation time.
The new extension is platform independent, so UMDs can switch to using
this extension for older platforms as well, while {set, get}_caching are
still supported on these legacy platforms for compatibility reasons.

Cc: Chris Wilson 
Cc: Matt Roper 
Cc: Andi Shyti 
Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c | 36 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c |  6 
 include/uapi/drm/i915_drm.h| 36 ++
 tools/include/uapi/drm/i915_drm.h  | 36 ++
 4 files changed, 114 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index bfe1dbda4cb7..723c3ddd6c74 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -245,6 +245,7 @@ struct create_ext {
unsigned int n_placements;
unsigned int placement_mask;
unsigned long flags;
+   unsigned int pat_index;
 };
 
 static void repr_placements(char *buf, size_t size,
@@ -394,11 +395,39 @@ static int ext_set_protected(struct i915_user_extension 
__user *base, void *data
return 0;
 }
 
+static int ext_set_pat(struct i915_user_extension __user *base, void *data)
+{
+   struct create_ext *ext_data = data;
+   struct drm_i915_private *i915 = ext_data->i915;
+   struct drm_i915_gem_create_ext_set_pat ext;
+   unsigned int max_pat_index;
+
+   BUILD_BUG_ON(sizeof(struct drm_i915_gem_create_ext_set_pat) !=
+offsetofend(struct drm_i915_gem_create_ext_set_pat, rsvd));
+
+   if (copy_from_user(&ext, base, sizeof(ext)))
+   return -EFAULT;
+
+   max_pat_index = INTEL_INFO(i915)->max_pat_index;
+
+   if (ext.pat_index > max_pat_index) {
+   drm_dbg(&i915->drm, "PAT index is invalid: %u\n",
+   ext.pat_index);
+   return -EINVAL;
+   }
+
+   ext_data->pat_index = ext.pat_index;
+
+   return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
+   [I915_GEM_CREATE_EXT_SET_PAT] = ext_set_pat,
 };
 
+#define PAT_INDEX_NOT_SET  0x
 /**
  * i915_gem_create_ext_ioctl - Creates a new mm object and returns a handle to 
it.
  * @dev: drm device pointer
@@ -418,6 +447,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
if (args->flags & ~I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS)
return -EINVAL;
 
+   ext_data.pat_index = PAT_INDEX_NOT_SET;
ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
   create_extensions,
   ARRAY_SIZE(create_extensions),
@@ -454,5 +484,11 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
if (IS_ERR(obj))
return PTR_ERR(obj);
 
+   if (ext_data.pat_index != PAT_INDEX_NOT_SET) {
+   i915_gem_object_set_pat_index(obj, ext_data.pat_index);
+   /* Mark pat_index is set by UMD */
+   obj->cache_level = I915_CACHE_INVAL;
+   }
+
return i915_gem_publish(obj, file, &args->size, &args->handle);
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 27c948350b5b..61651f7e5806 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -209,6 +209,12 @@ bool i915_gem_object_can_bypass_llc(struct 
drm_i915_gem_object *obj)
if (!(obj->flags & I915_BO_ALLOC_USER))
return false;
 
+   /*
+* Always flush cache for UMD objects at creation time.
+*/
+   if (obj->cache_level == I915_CACHE_INVAL)
+   return true;
+
/*
 * EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
 * possible for userspace to bypass the GTT caching bits set by the
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index dba7c5a5b25e..03c5c314846e 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -3630,9 +3630,13 @@ struct drm_i915_gem_create_ext {
 *
 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
 * struct drm_i915_gem_create_ext_protected_content.
+*
+* For
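
Editorial sketch of the validation flow in ext_set_pat() above: the
extension state starts at a "not set" sentinel, a requested PAT index is
rejected when it exceeds the platform maximum, and the caller can later
tell whether user space supplied a value. All demo_* names are
hypothetical, and the sentinel value is an assumption because the
constant in the quoted patch is truncated in this archive.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed sentinel; the value in the quoted patch is truncated here. */
#define DEMO_PAT_INDEX_NOT_SET UINT32_MAX

/* Hypothetical stand-in for the create_ext bookkeeping. */
struct demo_create_ext {
	uint32_t max_pat_index; /* platform limit */
	uint32_t pat_index;     /* sentinel until user space sets it */
};

/* Initialise the state before any extension is processed. */
static void demo_create_ext_init(struct demo_create_ext *ext, uint32_t max)
{
	ext->max_pat_index = max;
	ext->pat_index = DEMO_PAT_INDEX_NOT_SET;
}

/* Validate and record a user-supplied PAT index. */
static int demo_ext_set_pat(struct demo_create_ext *ext, uint32_t requested)
{
	if (requested > ext->max_pat_index)
		return -EINVAL;

	ext->pat_index = requested;
	return 0;
}

/* Return 1 if user space supplied a PAT index, else 0. */
static int demo_pat_set_by_user(const struct demo_create_ext *ext)
{
	return ext->pat_index != DEMO_PAT_INDEX_NOT_SET;
}
```

A rejected request leaves the sentinel in place, so the object would
still get the platform default rather than a partially applied setting.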

[PATCH 3/8] drm/i915/mtl: Add PTE encode function

2023-04-19 Thread fei . yang

From: Fei Yang 

PTE encode functions are platform dependent. This patch implements
PTE functions for MTL, and ensures the correct PTE encode function
is used by calling pte_encode function pointer instead of the
hardcoded gen8 version of PTE encode.

Signed-off-by: Fei Yang 
Reviewed-by: Andrzej Hajda 
Reviewed-by: Andi Shyti 
Acked-by: Nirmoy Das 
---
 drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 45 
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 36 +--
 3 files changed, 72 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
b/drivers/gpu/drm/i915/display/intel_dpt.c
index b8027392144d..c5eacfdba1a5 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
vm->vma_ops.bind_vma= dpt_bind_vma;
vm->vma_ops.unbind_vma  = dpt_unbind_vma;
 
-   vm->pte_encode = gen8_ggtt_pte_encode;
+   vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
 
dpt->obj = dpt_obj;
dpt->obj->is_dpt = true;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 4daaa6f55668..11b91e0453c8 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
return pte;
 }
 
+static u64 mtl_pte_encode(dma_addr_t addr,
+ enum i915_cache_level level,
+ u32 flags)
+{
+   gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
+
+   if (unlikely(flags & PTE_READ_ONLY))
+   pte &= ~GEN8_PAGE_RW;
+
+   if (flags & PTE_LM)
+   pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
+
+   switch (level) {
+   case I915_CACHE_NONE:
+   pte |= GEN12_PPGTT_PTE_PAT1;
+   break;
+   case I915_CACHE_LLC:
+   case I915_CACHE_L3_LLC:
+   pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
+   break;
+   case I915_CACHE_WT:
+   pte |= GEN12_PPGTT_PTE_PAT0;
+   break;
+   }
+
+   return pte;
+}
+
 static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
 {
struct drm_i915_private *i915 = ppgtt->vm.i915;
@@ -427,7 +455,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
  u32 flags)
 {
struct i915_page_directory *pd;
-   const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+   const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, 
flags);
gen8_pte_t *vaddr;
 
pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
@@ -580,7 +608,7 @@ static void gen8_ppgtt_insert_huge(struct 
i915_address_space *vm,
   enum i915_cache_level cache_level,
   u32 flags)
 {
-   const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+   const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
unsigned int rem = sg_dma_len(iter->sg);
u64 start = vma_res->start;
 
@@ -743,7 +771,7 @@ static void gen8_ppgtt_insert_entry(struct 
i915_address_space *vm,
GEM_BUG_ON(pt->is_compact);
 
vaddr = px_vaddr(pt);
-   vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
+   vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
 }
 
@@ -773,7 +801,7 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct 
i915_address_space *vm,
}
 
vaddr = px_vaddr(pt);
-   vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
+   vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
 }
 
 static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
@@ -820,8 +848,8 @@ static int gen8_init_scratch(struct i915_address_space *vm)
pte_flags |= PTE_LM;
 
vm->scratch[0]->encode =
-   gen8_pte_encode(px_dma(vm->scratch[0]),
-   I915_CACHE_NONE, pte_flags);
+   vm->pte_encode(px_dma(vm->scratch[0]),
+  I915_CACHE_NONE, pte_flags);
 
for (i = 1; i <= vm->top; i++) {
struct drm_i915_gem_object *obj;
@@ -963,7 +991,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 */
ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
-   ppgtt->vm.pte_encode = gen8_pte_encode;
+   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
+   ppgtt->vm.pte_encode = mtl_pte_encode;
+   else
+   ppgtt->vm.pte_encode = gen8_pte_encode;
 
ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
ppgtt->vm.insert_entries = gen8_ppgtt_insert;
diff --git
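
Editorial sketch of the dispatch pattern this patch introduces: the
encoder is chosen once, by graphics version, and stored as a function
pointer so that callers no longer hardcode one platform's encoder. The
demo_* names, the version packing, and the bit positions are assumptions
for illustration and are not taken from the driver headers. The legacy
encoder here ignores the caching argument for brevity.

```c
#include <stdint.h>

/* Assumed packing of (version, release) into one comparable number. */
#define DEMO_IP_VER(ver, rel)	(((unsigned int)(ver) << 8) | (rel))

/* Illustrative bit positions only. */
#define DEMO_PAGE_PRESENT	(1ull << 0)
#define DEMO_PAGE_RW		(1ull << 1)
#define DEMO_PTE_PAT0		(1ull << 3)
#define DEMO_PTE_PAT1		(1ull << 4)

typedef uint64_t demo_pte_t;
typedef demo_pte_t (*demo_pte_encode_fn)(uint64_t addr, unsigned int pat);

/* Simplified older-platform encoder: present and writable, no PAT bits. */
static demo_pte_t demo_legacy_pte_encode(uint64_t addr, unsigned int pat)
{
	(void)pat;
	return addr | DEMO_PAGE_PRESENT | DEMO_PAGE_RW;
}

/* Newer-platform encoder: spreads the low two PAT index bits into the PTE. */
static demo_pte_t demo_mtl_pte_encode(uint64_t addr, unsigned int pat)
{
	demo_pte_t pte = addr | DEMO_PAGE_PRESENT | DEMO_PAGE_RW;

	if (pat & 1u)
		pte |= DEMO_PTE_PAT0;
	if (pat & 2u)
		pte |= DEMO_PTE_PAT1;

	return pte;
}

/* Pick the encoder once, at address-space creation time. */
static demo_pte_encode_fn demo_select_pte_encode(unsigned int graphics_ver_full)
{
	if (graphics_ver_full >= DEMO_IP_VER(12, 70))
		return demo_mtl_pte_encode;

	return demo_legacy_pte_encode;
}
```

Callers then invoke the stored pointer, which is the same shape as the
vm->pte_encode(...) calls that replace gen8_pte_encode(...) in the diff.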

[PATCH 5/8] drm/i915/mtl: end support for set caching ioctl

2023-04-19 Thread fei . yang

From: Fei Yang 

The design is to keep a buffer object's caching policy immutable
throughout its life cycle. This patch ends support for the set caching
ioctl from MTL onward. While doing that we also set BOs to be 1-way
coherent at creation time because the GPU no longer automatically snoops
the CPU cache. For UMDs that need to fine tune the caching policy for
BOs, a follow-up patch will extend the GEM_CREATE uAPI to allow UMDs to
specify the caching mode at BO creation time.

Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
Reviewed-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  | 9 -
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index d2d5a24301b2..bb3575b1479f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -337,6 +337,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void 
*data,
if (IS_DGFX(i915))
return -ENODEV;
 
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+   return -EOPNOTSUPP;
+
switch (args->caching) {
case I915_CACHING_NONE:
level = I915_CACHE_NONE;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 37d1efcd3ca6..cad4a6017f4b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -601,7 +601,14 @@ static int shmem_object_init(struct intel_memory_region 
*mem,
obj->write_domain = I915_GEM_DOMAIN_CPU;
obj->read_domains = I915_GEM_DOMAIN_CPU;
 
-   if (HAS_LLC(i915))
+   /*
+* MTL doesn't snoop CPU cache by default for GPU access (namely
+* 1-way coherency). However some UMD's are currently depending on
+* that. Make 1-way coherent the default setting for MTL. A follow
+* up patch will extend the GEM_CREATE uAPI to allow UMD's specify
+* caching mode at BO creation time
+*/
+   if (HAS_LLC(i915) || (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)))
/* On some devices, we can have the GPU use the LLC (the CPU
 * cache) for about a 10% performance improvement
 * compared to uncached.  Graphics requests other than
-- 
2.25.1
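
Editorial sketch of the two checks this patch adds: the set-caching path
returns -EOPNOTSUPP from graphics version 12.70 onward, and object
creation treats those platforms like LLC platforms when choosing the
default coherency. The demo_* names, the struct, and the version packing
are assumptions for illustration, not the driver's definitions.

```c
#include <errno.h>

/* Assumed packing of (version, release) into one comparable number. */
#define DEMO_IP_VER(ver, rel)	(((unsigned int)(ver) << 8) | (rel))

/* Hypothetical stand-in for the device capabilities consulted here. */
struct demo_device {
	unsigned int graphics_ver_full;
	int is_dgfx;
	int has_llc;
};

/* Gate for the set-caching request: 0 means the request may proceed. */
static int demo_set_caching_check(const struct demo_device *dev)
{
	if (dev->is_dgfx)
		return -ENODEV;

	if (dev->graphics_ver_full >= DEMO_IP_VER(12, 70))
		return -EOPNOTSUPP;

	return 0;
}

/* Return 1 if new objects should default to the coherent (cached) setting. */
static int demo_default_coherent(const struct demo_device *dev)
{
	return dev->has_llc ||
	       dev->graphics_ver_full >= DEMO_IP_VER(12, 70);
}
```

Note that the second check is independent of has_llc, which matters
because patch 1/8 of this series sets has_llc to 0 for MTL.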

[PATCH 1/8] drm/i915/mtl: Set has_llc=0

2023-04-19 Thread fei . yang

From: Fei Yang 

On MTL, LLC is not shared between GT and CPU, set has_llc=0.

Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
Reviewed-by: Andrzej Hajda 
Reviewed-by: Nirmoy Das 
---
 drivers/gpu/drm/i915/i915_pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index d64e074d7457..272a8ba37b64 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1147,6 +1147,7 @@ static const struct intel_device_info mtl_info = {
.has_flat_ccs = 0,
.has_gmd_id = 1,
.has_guc_deprivilege = 1,
+   .has_llc = 0,
.has_mslice_steering = 0,
.has_snoop = 1,
.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
-- 
2.25.1

[PATCH 2/8] drm/i915/mtl: Define MOCS and PAT tables for MTL

2023-04-19 Thread fei . yang

From: Madhumitha Tolakanahalli Pradeep 


On MTL, GT can no longer allocate on LLC - only the CPU can.
This, along with the addition of support for L4 cache, calls for
a MOCS/PAT table update.
Also, the PAT index registers are multicast for the primary GT,
and there is an address jump from index 7 to 8. This patch
makes sure that these registers are programmed in the proper
way.

BSpec: 44509, 45101, 44235

Cc: Matt Roper 
Cc: Lucas De Marchi 
Signed-off-by: Madhumitha Tolakanahalli Pradeep 

Signed-off-by: Aravind Iddamsetty 
Signed-off-by: Nirmoy Das 
Signed-off-by: Fei Yang 
Reviewed-by: Andrzej Hajda 
Reviewed-by: Nirmoy Das 
Reviewed-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h |  6 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c | 47 ++-
 drivers/gpu/drm/i915/gt/intel_gtt.h | 20 ++-
 drivers/gpu/drm/i915/gt/intel_mocs.c| 76 +++--
 drivers/gpu/drm/i915/gt/selftest_mocs.c |  2 +-
 5 files changed, 143 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index fd1f9cd35e9d..e8c3b762a92a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -356,7 +356,11 @@
 #define GEN7_TLB_RD_ADDR   _MMIO(0x4700)
 
 #define GEN12_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4)
-#define XEHP_PAT_INDEX(index)  MCR_REG(0x4800 + (index) * 4)
+#define _PAT_INDEX(index)  _PICK_EVEN_2RANGES(index, 8, \
+  0x4800, 
0x4804, \
+  0x4848, 
0x484c)
+#define XEHP_PAT_INDEX(index)  MCR_REG(_PAT_INDEX(index))
+#define XELPMP_PAT_INDEX(index)_MMIO(_PAT_INDEX(index))
 
 #define XEHP_TILE0_ADDR_RANGE  MCR_REG(0x4900)
 #define   XEHP_TILE_LMEM_RANGE_SHIFT   8
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 4f436ba7a3c8..2f6a9be0ffe6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -468,6 +468,44 @@ void gtt_write_workarounds(struct intel_gt *gt)
}
 }
 
+static void xelpmp_setup_private_ppat(struct intel_uncore *uncore)
+{
+   intel_uncore_write(uncore, XELPMP_PAT_INDEX(0),
+  MTL_PPAT_L4_0_WB);
+   intel_uncore_write(uncore, XELPMP_PAT_INDEX(1),
+  MTL_PPAT_L4_1_WT);
+   intel_uncore_write(uncore, XELPMP_PAT_INDEX(2),
+  MTL_PPAT_L4_3_UC);
+   intel_uncore_write(uncore, XELPMP_PAT_INDEX(3),
+  MTL_PPAT_L4_0_WB | MTL_2_COH_1W);
+   intel_uncore_write(uncore, XELPMP_PAT_INDEX(4),
+  MTL_PPAT_L4_0_WB | MTL_3_COH_2W);
+
+   /*
+* Remaining PAT entries are left at the hardware-default
+* fully-cached setting
+*/
+}
+
+static void xelpg_setup_private_ppat(struct intel_gt *gt)
+{
+   intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0),
+MTL_PPAT_L4_0_WB);
+   intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1),
+MTL_PPAT_L4_1_WT);
+   intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2),
+MTL_PPAT_L4_3_UC);
+   intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3),
+MTL_PPAT_L4_0_WB | MTL_2_COH_1W);
+   intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4),
+MTL_PPAT_L4_0_WB | MTL_3_COH_2W);
+
+   /*
+* Remaining PAT entries are left at the hardware-default
+* fully-cached setting
+*/
+}
+
 static void tgl_setup_private_ppat(struct intel_uncore *uncore)
 {
/* TGL doesn't support LLC or AGE settings */
@@ -603,7 +641,14 @@ void setup_private_pat(struct intel_gt *gt)
 
GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
 
-   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+   if (gt->type == GT_MEDIA) {
+   xelpmp_setup_private_ppat(gt->uncore);
+   return;
+   }
+
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+   xelpg_setup_private_ppat(gt);
+   else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
xehp_setup_private_ppat(gt);
else if (GRAPHICS_VER(i915) >= 12)
tgl_setup_private_ppat(uncore);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 69ce55f517f5..854ec09fd588 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -88,9 +88,18 @@ typedef u64 gen8_pte_t;
 #define BYT_PTE_SNOOPED_BY_CPU_CACHES  REG_BIT(2)
 #define BYT_PTE_WRITEABLE  REG_BIT(1)
 
+#define MTL_PPGTT_PTE_PAT3 BIT_ULL(62)
 #define GEN12_PPGTT_PTE_LM BIT_ULL(11)
+#define
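
Editorial sketch of the "address jump from index 7 to 8" mentioned in
the commit message above, using the four offsets that appear in the
_PAT_INDEX() definition (0x4800, 0x4804, 0x4848, 0x484c). The function
names are hypothetical; the two-range selection is written out as plain
C under the assumption that each range is evenly spaced by the
difference of its two sample offsets.

```c
#include <stdint.h>

/* Evenly spaced register: a is entry 0, b is entry 1. */
static uint32_t demo_pick_even(unsigned int index, uint32_t a, uint32_t b)
{
	return a + index * (b - a);
}

/* PAT index register offset: one range for 0..7, a second range from 8 on. */
static uint32_t demo_pat_index_offset(unsigned int index)
{
	if (index < 8)
		return demo_pick_even(index, 0x4800, 0x4804);

	return demo_pick_even(index - 8, 0x4848, 0x484c);
}
```

With a single linear formula, index 8 would land at 0x4820, whereas the
second range places it at 0x4848, which is the jump the patch handles.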

[PATCH 0/8] drm/i915/mtl: Define MOCS and PAT tables for MTL

2023-04-19 Thread fei . yang

From: Fei Yang 

The series includes patches needed to enable MTL.
Also add new extension for GEM_CREATE uAPI to let
user space set cache policy for buffer objects.

v2: addressing review comments and checkpatch warnings
v3: make mtl_ggtt_pte_encode static

Fei Yang (7):
  drm/i915/mtl: Set has_llc=0
  drm/i915/mtl: Add PTE encode function
  drm/i915/mtl: workaround coherency issue for Media
  drm/i915/mtl: end support for set caching ioctl
  drm/i915: preparation for using PAT index
  drm/i915: use pat_index instead of cache_level
  drm/i915: Allow user to set cache at BO creation

Madhumitha Tolakanahalli Pradeep (1):
  drm/i915/mtl: Define MOCS and PAT tables for MTL

 drivers/gpu/drm/i915/display/intel_dpt.c  | 14 ++--
 drivers/gpu/drm/i915/gem/i915_gem_create.c| 36 
 drivers/gpu/drm/i915/gem/i915_gem_domain.c| 30 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 67 ++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  8 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 26 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |  5 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  9 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 76 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |  3 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 84 +--
 drivers/gpu/drm/i915/gt/intel_gt_regs.h   |  6 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c   | 47 ++-
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 38 ++---
 drivers/gpu/drm/i915/gt/intel_migrate.c   | 47 ++-
 drivers/gpu/drm/i915/gt/intel_migrate.h   | 13 ++-
 drivers/gpu/drm/i915/gt/intel_mocs.c  | 76 -
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c| 47 ++-
 drivers/gpu/drm/i915/gt/selftest_mocs.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c|  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 13 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|  7 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  6 ++
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c   | 55 +---
 drivers/gpu/drm/i915/i915_gem.c   | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c |  8 +-
 drivers/gpu/drm/i915/i915_pci.c   | 76 +++--
 drivers/gpu/drm/i915/i915_vma.c   | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h   |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  2 -
 drivers/gpu/drm/i915/intel_device_info.h  |  5 ++
 drivers/gpu/drm/i915/selftests/i915_gem.c |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  9 ++
 drivers/gpu/drm/i915/selftests/mock_gtt.c |  8 +-
 include/uapi/drm/i915_drm.h   | 36 
 tools/include/uapi/drm/i915_drm.h | 36 
 52 files changed, 812 insertions(+), 226 deletions(-)

-- 
2.25.1

Re: [PATCH 09/11] drm/msm/dpu: set max cursor width to 512x512

2023-04-19 Thread Dmitry Baryshkov


On 19/04/2023 17:41, Arnaud Vrac wrote:

Override the default max cursor size reported to userspace of 64x64.
MSM8998 hw cursor planes support 512x512 size, and other chips use DMA
SSPPs.

Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 3 +++
  1 file changed, 3 insertions(+)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry

Re: [PATCH 06/11] drm/msm/dpu: support cursor sspp hw blocks

2023-04-19 Thread Dmitry Baryshkov


On 19/04/2023 17:41, Arnaud Vrac wrote:

Cursor SSPP must be assigned to the last mixer stage, so we assign an
immutable zpos property with a value higher than primary/overlay planes,
to ensure it will always be on top.


The commit does more than is described in the commit message. Let's do 
it step by step. Please split into several patches. Also see below.




Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   | 19 ++-
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 26 +++---
  2 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 0e7a68714e9e1..6cce0f6cfcb01 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -738,13 +738,22 @@ static int _dpu_kms_drm_obj_init(struct dpu_kms *dpu_kms)
for (i = 0; i < catalog->sspp_count; i++) {
enum drm_plane_type type;
  
-		if ((catalog->sspp[i].features & BIT(DPU_SSPP_CURSOR))

-   && cursor_planes_idx < max_crtc_count)
-   type = DRM_PLANE_TYPE_CURSOR;
-   else if (primary_planes_idx < max_crtc_count)
+   if (catalog->sspp[i].features & BIT(DPU_SSPP_CURSOR)) {
+   if (cursor_planes_idx < max_crtc_count) {
+   type = DRM_PLANE_TYPE_CURSOR;
+   } else if (catalog->sspp[i].type == SSPP_TYPE_CURSOR) {
+   /* Cursor SSPP can only be used in the last
+* mixer stage, so it doesn't make sense to
+* assign two of those to the same CRTC */
+   continue;
+   } else {
+   type = DRM_PLANE_TYPE_OVERLAY;
+   }
+   } else if (primary_planes_idx < max_crtc_count) {
type = DRM_PLANE_TYPE_PRIMARY;
-   else
+   } else {
type = DRM_PLANE_TYPE_OVERLAY;
+   }


Ack. I'm not sure how compositors will cope if we have two planes with 
immutable zpos set to the same value. Also I'd prefer to have this as a 
separate commit.


  
  		DPU_DEBUG("Create plane type %d with features %lx (cur %lx)\n",

  type, catalog->sspp[i].features,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index 128ecdc145260..5a7bb8543866c 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -881,7 +881,14 @@ static int dpu_plane_atomic_check(struct drm_plane *plane,
r_pipe->multirect_mode = DPU_SSPP_MULTIRECT_NONE;
r_pipe->sspp = NULL;
  
-	pstate->stage = DPU_STAGE_BASE + pstate->base.normalized_zpos;

+   if (pipe_hw_caps->type == SSPP_TYPE_CURSOR) {
+   /* enforce cursor sspp to use the last mixer stage */


I'd add here 'we know that it is the last plane in the stack because of 
zpos property ranges'



+   pstate->stage = DPU_STAGE_BASE +
+   pdpu->catalog->caps->max_mixer_blendstages;
+   } else {
+   pstate->stage = DPU_STAGE_BASE + pstate->base.normalized_zpos;
+   }
+
if (pstate->stage > DPU_STAGE_BASE + 
pdpu->catalog->caps->max_mixer_blendstages) {
DPU_ERROR("> %d plane mixer stages assigned\n",
  pdpu->catalog->caps->max_mixer_blendstages);
@@ -1463,6 +1470,7 @@ struct drm_plane *dpu_plane_init(struct drm_device *dev,
struct msm_drm_private *priv = dev->dev_private;
struct dpu_kms *kms = to_dpu_kms(priv->kms);
struct dpu_hw_sspp *pipe_hw;
+   const uint64_t *format_modifiers;
uint32_t num_formats;
uint32_t supported_rotations;
int ret = -EINVAL;
@@ -1489,15 +1497,27 @@ struct drm_plane *dpu_plane_init(struct drm_device *dev,
format_list = pipe_hw->cap->sblk->format_list;
num_formats = pipe_hw->cap->sblk->num_formats;
  
+	if (pipe_hw->cap->type == SSPP_TYPE_CURSOR)

+   format_modifiers = NULL;
+   else
+   format_modifiers = supported_format_modifiers;
+
ret = drm_universal_plane_init(dev, plane, 0xff, &dpu_plane_funcs,
format_list, num_formats,
-   supported_format_modifiers, type, NULL);
+   format_modifiers, type, NULL);



Separate commit please


if (ret)
goto clean_plane;
  
  	pdpu->catalog = kms->catalog;
  
-	ret = drm_plane_create_zpos_property(plane, 0, 0, DPU_ZPOS_MAX);

+   if (pipe_hw->cap->type == SSPP_TYPE_CURSOR) {
+   /* cursor SSPP can only be used in the last mixer stage,
+* enforce it by maxing out the cursor plane zpos */
+   ret =

Re: [PATCH 05/11] drm/msm/dpu: allow using all lm mixer stages

2023-04-19 Thread Dmitry Baryshkov


On 19/04/2023 17:41, Arnaud Vrac wrote:

The max_mixer_blendstages hw catalog property represents the number of
planes that can be blended by the lm mixer, excluding the base stage, so
adjust the check for the number of currently assigned planes accordingly.

Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)


Reviewed-by: Dmitry Baryshkov 

--
With best wishes
Dmitry
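
The adjusted limit described in this commit message can be sketched as below. This is only an illustration of the arithmetic, assuming the base stage has value 0; the names (SKETCH_STAGE_BASE, sketch_stage_for_zpos, sketch_stage_is_valid) are hypothetical and are not the driver's identifiers.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's stage enum; assumes the base stage is 0. */
enum { SKETCH_STAGE_BASE = 0 };

/* Stage assignment as in the quoted patches: base stage plus normalized zpos. */
static int sketch_stage_for_zpos(unsigned int normalized_zpos)
{
    return SKETCH_STAGE_BASE + (int)normalized_zpos;
}

/*
 * max_mixer_blendstages counts the blendable stages excluding the base
 * stage, so the highest usable stage is base + max_mixer_blendstages.
 */
static bool sketch_stage_is_valid(int stage, unsigned int max_mixer_blendstages)
{
    return stage <= SKETCH_STAGE_BASE + (int)max_mixer_blendstages;
}
```

Under these assumptions a mixer with 7 blend stages accepts normalized zpos 0 through 7 and rejects 8.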

Re: [PATCH 04/11] drm/msm/dpu: allow using lm mixer base stage

2023-04-19 Thread Dmitry Baryshkov


On 19/04/2023 17:41, Arnaud Vrac wrote:

The dpu backend already handles applying alpha to the base stage, so we
can use it to render the bottom plane in all cases. This allows mixing
one additional plane with the hardware mixer.

Signed-off-by: Arnaud Vrac 


This might require additional changes. First, the STAGE_BASE pipe 
in the source split mode (iow using two LMs) should be programmed with 
respect to the right LM's x offset (rather than usual left top-left LM). 
See mdss_mdp_pipe_position_update().


Also this might need some interaction with CTL_MIXER_BORDER_OUT being 
set or not. If I remember correctly, if the bottom plane is not 
fullscreen or if there are no planes at all, we should set 
CTL_MIXER_BORDER_OUT (which takes STAGE_BASE) and start assigning them 
from STAGE0. If not, we can use STAGE_BASE.



---
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index 14b5cfe306113..148921ed62f85 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -881,7 +881,7 @@ static int dpu_plane_atomic_check(struct drm_plane *plane,
r_pipe->multirect_mode = DPU_SSPP_MULTIRECT_NONE;
r_pipe->sspp = NULL;
  
-	pstate->stage = DPU_STAGE_0 + pstate->base.normalized_zpos;

+   pstate->stage = DPU_STAGE_BASE + pstate->base.normalized_zpos;
if (pstate->stage >= pdpu->catalog->caps->max_mixer_blendstages) {
DPU_ERROR("> %d plane stages assigned\n",
  pdpu->catalog->caps->max_mixer_blendstages - 
DPU_STAGE_0);



--
With best wishes
Dmitry

Re: [PATCH 03/11] drm/msm/dpu: use hsync/vsync polarity set by the encoder

2023-04-19 Thread Dmitry Baryshkov


On 19/04/2023 17:41, Arnaud Vrac wrote:

Do not override the hsync/vsync polarity passed by the encoder when
setting up intf timings. The same logic was used in both the encoder and
intf code to set the DP and DSI polarities, so those interfaces are not
impacted. However for HDMI, the polarities were overridden to static
values based on the vertical resolution, instead of using the actual
mode polarities.

Signed-off-by: Arnaud Vrac 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_intf.c | 16 +++-
  1 file changed, 3 insertions(+), 13 deletions(-)


Reviewed-by: Dmitry Baryshkov 

As a side note: I think at some point we should get rid of the override 
in dpu_encoder too and move it to dsi bridge code.


--
With best wishes
Dmitry

Re: [PATCH 02/11] drm/msm/dpu: use the actual lm maximum width instead of a hardcoded value

2023-04-19 Thread Dmitry Baryshkov


On 19/04/2023 17:41, Arnaud Vrac wrote:

This avoids using two LMs instead of one when the display width is lower
than the maximum supported value. For example on MSM8996/MSM8998, the
actual maxwidth is 2560, so we would use two LMs for 1280x720 or
1920x1080 resolutions, while one is enough.

Signed-off-by: Arnaud Vrac 


While this looks correct (and following what we have in 4.4), later 
vendor kernels specify the topology explicitly. Probably we should check 
this with the hw guys, because it might be the following case: even 
though a single LM can supply the mode, it will spend more power 
compared to two LMs.




---
  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index 1dc5dbe585723..dd2914726c4f6 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -53,8 +53,6 @@
  
  #define IDLE_SHORT_TIMEOUT	1
  
-#define MAX_HDISPLAY_SPLIT 1080

-
  /* timeout in frames waiting for frame done */
  #define DPU_ENCODER_FRAME_DONE_TIMEOUT_FRAMES 5
  
@@ -568,10 +566,12 @@ static struct msm_display_topology dpu_encoder_get_topology(

 */
if (intf_count == 2)
topology.num_lm = 2;
-   else if (!dpu_kms->catalog->caps->has_3d_merge)
-   topology.num_lm = 1;
+   else if (dpu_kms->catalog->caps->has_3d_merge &&
+dpu_kms->catalog->mixer_count > 0 &&
+mode->hdisplay > dpu_kms->catalog->mixer[0].sblk->maxwidth)
+   topology.num_lm = 2;
else
-   topology.num_lm = (mode->hdisplay > MAX_HDISPLAY_SPLIT) ? 2 : 1;
+   topology.num_lm = 1;
  
  	if (crtc_state->ctm)

topology.num_dspp = topology.num_lm;



--
With best wishes
Dmitry
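
The LM selection in the quoted hunk can be read as the following standalone sketch. It only mirrors the three branches of the diff; the function name and its flat parameter list are hypothetical, standing in for the topology, catalog and mode structures the driver actually uses.

```c
#include <stdbool.h>

/*
 * Hypothetical flattening of the quoted hunk: two LMs when there are two
 * interfaces, or when 3D merge is available and the mode is wider than
 * what the first mixer supports; otherwise one LM.
 */
static int sketch_num_lm(int intf_count, bool has_3d_merge,
                         int mixer_count, unsigned int first_mixer_maxwidth,
                         unsigned int hdisplay)
{
    if (intf_count == 2)
        return 2;
    if (has_3d_merge && mixer_count > 0 && hdisplay > first_mixer_maxwidth)
        return 2;
    return 1;
}
```

With the maxwidth of 2560 quoted for MSM8996/MSM8998, a 1920-wide mode selects one LM in this sketch, whereas the removed MAX_HDISPLAY_SPLIT threshold of 1080 would have selected two.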

Re: [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete

2023-04-19 Thread Dixit, Ashutosh

On Wed, 19 Apr 2023 12:40:44 -0700, Rodrigo Vivi wrote:
>

Hi Rodrigo,

> On Tue, Apr 18, 2023 at 10:23:50AM -0700, Dixit, Ashutosh wrote:
> > On Mon, 17 Apr 2023 22:35:58 -0700, Rodrigo Vivi wrote:
> > >
> >
> > Hi Rodrigo,
> >
> > > On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote:
> > > > Instead of erroring out when GuC reset is in progress, block waiting for
> > > > GuC reset to complete which is a more reasonable uapi behavior.
> > > >
> > > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> > > >
> > > > Signed-off-by: Ashutosh Dixit 
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_hwmon.c | 38 +++
> > > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > > > b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > index 9ab8971679fe3..8471a667dfc71 100644
> > > > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > > > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > > > char name[12];
> > > > int gt_n;
> > > > bool reset_in_progress;
> > > > +   wait_queue_head_t waitq;
> > > >  };
> > > >
> > > >  struct i915_hwmon {
> > > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long 
> > > > *val)
> > > >  static int
> > > >  hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > > >  {
> > > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > > > +
> > > > +   int ret = 0, timeout = GUC_RESET_TIMEOUT;
> > > > struct i915_hwmon *hwmon = ddat->hwmon;
> > > > intel_wakeref_t wakeref;
> > > > -   int ret = 0;
> > > > +   DEFINE_WAIT(wait);
> > > > u32 nval;
> > > >
> > > > -   mutex_lock(&hwmon->hwmon_lock);
> > > > -   if (hwmon->ddat.reset_in_progress) {
> > > > -   ret = -EAGAIN;
> > > > -   goto unlock;
> > > > +   /* Block waiting for GuC reset to complete when needed */
> > > > +   for (;;) {
> > > > +   mutex_lock(&hwmon->hwmon_lock);
> > >
> > > I'm really afraid of how this mutex is handled with the wait queue.
> > > some initial thought it looks like it is trying to reimplement ww_mutex?
> >
> > Sorry, but I am missing the relation with ww_mutex. No such relation is
> > intended.
> >
> > > all other examples of the wait_queue usages like this or didn't use
> > > locks or had it in a total different flow that I could not correlate.
> >
> > Actually there are several examples of prepare_to_wait/finish_wait
> > sequences with both spinlock and mutex in the kernel. See
> > e.g. rpm_suspend(), wait_for_rtrs_disconnection(), softsynthx_read().
> >
> > Also, as I mentioned, except for the lock, the sequence here is identical
> > to intel_guc_wait_for_pending_msg().
> >
> > >
> > > > +
> > > > +   prepare_to_wait(&ddat->waitq, &wait, 
> > > > TASK_INTERRUPTIBLE);
> > > > +
> > > > +   if (!hwmon->ddat.reset_in_progress)
> > > > +   break;
> > >
> > > If this breaks we never unlock it?
> >
> > Correct, this is the original case in Patch 2 where the mutex is acquired
> > in the beginning of the function and released just before the final exit
> > from the function (so the mutex is held for the entire duration of the
> > function).
>
> I got really confused here...

Sorry, the patch is a little confusing/tricky but I thought I'd better
stick to the standard 'for (;;)' loop pattern otherwise it will also be
hard to review.

> I looked at the patch 2 again and I don't see any place where the lock
> remains outside of the function. What was what I asked to remove on the
> initial versions.

So it was in Patch 1 where we changed the code to take the lock in the
beginning of the function and release it at the end of the function (you
can see it in Patch 1).

In Patch 2 the 'unlock' label and 'goto unlock' is introduced and the lock
is released at the 'unlock' label (it is visible in Patch 2).

> But now with this one I'm even more confused because I couldn't follow
> to understand who will remove the lock and when.

In Patch 3 again the lock is released at the 'unlock' label (i.e. the
destination of 'goto unlock', not visible in Patch 3). But we execute 'goto
unlock' only when 'ret != 0' in the 'for (;;)' loop. But when 'ret == 0'
(when 'ddat.reset_in_progress' flag is clear) we hold the mutex, execute
the entire function and finally release the lock at the end of the
function.

Hopefully this helps.

Thanks.
--
Ashutosh
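
The lock state described above can be modeled in plain C as follows. This is a single-threaded control-flow model, not kernel code: the mutex is replaced by a counter, schedule_timeout() by a one-tick decrement, and every name (sketch_state, sketch_wait_for_reset, sketch_power_max_write) is hypothetical. It only shows that each 'break' leaves the loop with the lock held, and that the single 'unlock' label then releases it on both the error and success paths.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model state; lock_depth is 1 while the modeled mutex is held. */
struct sketch_state {
    int lock_depth;
    int reset_ticks_left;   /* ticks until reset_in_progress would clear */
    bool signal_pending;
};

/* Models the 'for (;;)' loop: returns with the lock held on every path. */
static int sketch_wait_for_reset(struct sketch_state *s, int timeout)
{
    int ret = 0;

    for (;;) {
        s->lock_depth++;                 /* models mutex_lock() */

        if (s->reset_ticks_left == 0)    /* models !reset_in_progress */
            break;

        if (s->signal_pending) {
            ret = -EINTR;
            break;
        }

        if (!timeout) {
            ret = -ETIME;
            break;
        }

        s->lock_depth--;                 /* models mutex_unlock() */

        /* models schedule_timeout(): one tick passes while sleeping */
        timeout--;
        s->reset_ticks_left--;
    }

    return ret;
}

/* Models the caller: the lock is dropped exactly once, at 'unlock'. */
static int sketch_power_max_write(struct sketch_state *s, int timeout)
{
    int ret = sketch_wait_for_reset(s, timeout);

    if (ret)
        goto unlock;

    /* the register read-modify-write would happen here, lock held */

unlock:
    s->lock_depth--;                     /* models mutex_unlock() at 'unlock' */
    return ret;
}
```

In this model lock_depth is back to 0 after every call, whether the wait succeeded, timed out or was interrupted.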

>
> >
> > >
> > > > +
> > > > +   if (signal_pending(current)) {
> > > > +   ret = -EINTR;
> > > > +   break;
> > > > +   }
> > > > +
> > > > +   if (!timeout) {
> > > > +   ret = -ETIME;
> > > > +   break;
> > > > +   }
> > > > +
> > > > +   mutex_unlock(&hwmon->hwmon_lock);
> > >
> > > do we need to lock the signal pending and timeout as well?
> > > or only wrapping it around the hwmon->ddat access

Re: [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete

2023-04-19 Thread Dixit, Ashutosh

On Wed, 19 Apr 2023 06:21:27 -0700, Tvrtko Ursulin wrote:
>

Hi Tvrtko,

> On 10/04/2023 23:35, Ashutosh Dixit wrote:
> > Instead of erroring out when GuC reset is in progress, block waiting for
> > GuC reset to complete which is a more reasonable uapi behavior.
> >
> > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> >
> > Signed-off-by: Ashutosh Dixit 
> > ---
> >   drivers/gpu/drm/i915/i915_hwmon.c | 38 +++
> >   1 file changed, 33 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > b/drivers/gpu/drm/i915/i915_hwmon.c
> > index 9ab8971679fe3..8471a667dfc71 100644
> > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > char name[12];
> > int gt_n;
> > bool reset_in_progress;
> > +   wait_queue_head_t waitq;
> >   };
> > struct i915_hwmon {
> > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long 
> > *val)
> >   static int
> >   hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> >   {
> > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > +
> > +   int ret = 0, timeout = GUC_RESET_TIMEOUT;
>
> Patch looks good to me

Great, thanks :)

> apart that I am not sure what is the purpose of the timeout? This is just
> the sysfs write path or has more callers?

It is just the sysfs path, but the sysfs is accessed also by the oneAPI
stack (Level 0). In the initial version I also didn't have the timeout
thinking that the app can send a signal to the blocked thread to unblock
it. I introduced the timeout after Rodrigo brought it up and I am now
thinking maybe it's better to have the timeout in the driver since the app
has no knowledge of how long GuC resets can take. But I can remove it if
you think it's not needed.

> If the
> former perhaps it would be better to just use interruptible everything
> (mutex and sleep) and wait for as long as it takes or until user presses
> Ctrl-C?

Now we are not holding the mutexes for long, just long enough to do register
rmw's. So not holding the mutex across GuC reset as we were
originally. Therefore I am thinking mutex_lock_interruptible is not needed?
The sleep is already interruptible (TASK_INTERRUPTIBLE).

Anyway please let me know if you think we need to change anything.

Thanks.
--
Ashutosh

> > struct i915_hwmon *hwmon = ddat->hwmon;
> > intel_wakeref_t wakeref;
> > -   int ret = 0;
> > +   DEFINE_WAIT(wait);
> > u32 nval;
> >   - mutex_lock(&hwmon->hwmon_lock);
> > -   if (hwmon->ddat.reset_in_progress) {
> > -   ret = -EAGAIN;
> > -   goto unlock;
> > +   /* Block waiting for GuC reset to complete when needed */
> > +   for (;;) {
> > +   mutex_lock(&hwmon->hwmon_lock);
> > +
> > +   prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > +
> > +   if (!hwmon->ddat.reset_in_progress)
> > +   break;
> > +
> > +   if (signal_pending(current)) {
> > +   ret = -EINTR;
> > +   break;
> > +   }
> > +
> > +   if (!timeout) {
> > +   ret = -ETIME;
> > +   break;
> > +   }
> > +
> > +   mutex_unlock(&hwmon->hwmon_lock);
> > +
> > +   timeout = schedule_timeout(timeout);
> > }
> > +   finish_wait(&ddat->waitq, &wait);
> > +   if (ret)
> > +   goto unlock;
> > +
> > wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> > /* Disable PL1 limit and verify, because the limit cannot be
> > disabled on all platforms */
> > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct 
> > drm_i915_private *i915, bool old)
> > intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> >  PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > hwmon->ddat.reset_in_progress = false;
> > +   wake_up_all(&hwmon->ddat.waitq);
> > mutex_unlock(&hwmon->hwmon_lock);
> >   }
> > @@ -784,6 +811,7 @@ void i915_hwmon_register(struct drm_i915_private *i915)
> > ddat->uncore = &i915->uncore;
> > snprintf(ddat->name, sizeof(ddat->name), "i915");
> > ddat->gt_n = -1;
> > +   init_waitqueue_head(&ddat->waitq);
> > for_each_gt(gt, i915, i) {
> > ddat_gt = hwmon->ddat_gt + i;

Re: [PATCH 8/8] drm/i915: Allow user to set cache at BO creation

2023-04-19 Thread Andi Shyti

Hi Fei,

On Wed, Apr 19, 2023 at 02:12:19PM -0700, fei.y...@intel.com wrote:
> From: Fei Yang 
> 
> To comply with the design that buffer objects shall have immutable
> cache setting throughout their life cycle, {set, get}_caching ioctl's
> are no longer supported from MTL onward. With that change caching
> policy can only be set at object creation time. The current code
> applies a default (platform dependent) cache setting for all objects.
> However this is not optimal for performance tuning. The patch extends
> the existing gem_create uAPI to let user set PAT index for the object
> at creation time.
> The new extension is platform independent, so UMD's can switch to using
> this extension for older platforms as well, while {set, get}_caching are
> still supported on these legacy platforms for compatibility reasons.
> 
> Cc: Chris Wilson 
> Cc: Matt Roper 
> Cc: Andi Shyti 
> Signed-off-by: Fei Yang 

That's the last one... I think you addressed all the comments!
Please add the r-b's I updated if you are going to send another
version, if not (and I hope so) they should be added by
patchwork.

Reviewed-by: Andi Shyti  

Thanks,
Andi

Re: [PATCH 6/8] drm/i915: preparation for using PAT index

2023-04-19 Thread Andi Shyti

Hi Fei,

On Wed, Apr 19, 2023 at 02:12:17PM -0700, fei.y...@intel.com wrote:
> From: Fei Yang 
> 
> This patch is a preparation for replacing enum i915_cache_level with PAT
> index. Caching policy for buffer objects is set through the PAT index in
> PTE, the old i915_cache_level is not sufficient to represent all caching
> modes supported by the hardware.
> 
> Preparing the transition by adding some platform dependent data structures
> and helper functions to translate the cache_level to pat_index.
> 
> cachelevel_to_pat: a platform dependent array mapping cache_level to
>                    pat_index.
> 
> max_pat_index: the maximum PAT index supported by the hardware. Needed for
>                validating the PAT index passed in from user space.
> 
> i915_gem_get_pat_index: function to convert cache_level to PAT index.
> 
> obj_to_i915(obj): macro moved to header file for wider usage.
> 
> I915_MAX_CACHE_LEVEL: upper bound of i915_cache_level for the
>   convenience of coding.
> 
> Cc: Chris Wilson 
> Cc: Matt Roper 
> Cc: Andi Shyti 
> Signed-off-by: Fei Yang 

Reviewed-by: Andi Shyti  

Andi
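
The helpers listed in the commit message could look roughly like the sketch below. It is an illustration only: the enum mirrors the four legacy cache levels, but the table values, the struct layout and all sketch_* names are made up here and do not come from the patch or from any real platform.

```c
#include <stdbool.h>

/* Hypothetical mirror of the legacy cache levels, with an upper bound. */
enum sketch_cache_level {
    SKETCH_CACHE_NONE,
    SKETCH_CACHE_LLC,
    SKETCH_CACHE_L3_LLC,
    SKETCH_CACHE_WT,
    SKETCH_MAX_CACHE_LEVEL,
};

/* Per-platform data: a cache_level -> pat_index table and the largest index. */
struct sketch_platform {
    unsigned int cachelevel_to_pat[SKETCH_MAX_CACHE_LEVEL];
    unsigned int max_pat_index;
};

/* Made-up values, for illustration only. */
static const struct sketch_platform sketch_platform_a = {
    .cachelevel_to_pat = {
        [SKETCH_CACHE_NONE]   = 3,
        [SKETCH_CACHE_LLC]    = 0,
        [SKETCH_CACHE_L3_LLC] = 0,
        [SKETCH_CACHE_WT]     = 2,
    },
    .max_pat_index = 3,
};

/* Translate a legacy cache level into this platform's PAT index. */
static unsigned int sketch_get_pat_index(const struct sketch_platform *p,
                                         enum sketch_cache_level level)
{
    return p->cachelevel_to_pat[level];
}

/* Validate a PAT index handed in from user space. */
static bool sketch_pat_index_valid(const struct sketch_platform *p,
                                   unsigned int pat_index)
{
    return pat_index <= p->max_pat_index;
}
```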

Re: [Intel-gfx] [PATCH 1/8] drm/i915/mtl: Set has_llc=0

2023-04-19 Thread Andi Shyti

Hi Fei,

On Wed, Apr 19, 2023 at 10:10:24PM +, Yang, Fei wrote:
> > Hi Fei,
> >
> > On Wed, Apr 19, 2023 at 02:12:12PM -0700, fei.y...@intel.com wrote:
> >> From: Fei Yang 
> >>
> >> On MTL, LLC is not shared between GT and CPU, set has_llc=0.
> >>
> >> Signed-off-by: Fei Yang 
> >
> > just an unanswered question from Nirmoy:
> >
> > This statement is a bit unclear to me.  I would say "On MTL, LLC is not 
> > shared between GT and CPU"
> 
> I have updated the commit message accordingly in this version. see above.

oh... sorry... I got confused... never mind! :)

Thanks!
Andi

> > Reviewed-by: Andi Shyti 
> > Reviewed-by: Andrzej Hajda 
> > Reviewed-by: Nirmoy Das 
> >
> > Andi

Re: [PATCH 01/11] drm/msm/dpu: tweak msm8998 hw catalog values

2023-04-19 Thread Dmitry Baryshkov


On 19/04/2023 17:41, Arnaud Vrac wrote:

Match the values found in the downstream msm-4.4 kernel sde driver.

Signed-off-by: Arnaud Vrac 


Fixes: 94391a14fc27 ("drm/msm/dpu1: Add MSM8998 to hw catalog")

Reviewed-by: Dmitry Baryshkov 


---
  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h |  8 
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c  | 15 +--
  2 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
index 2b3ae84057dfe..b07e8a9941f79 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
@@ -134,10 +134,10 @@ static const struct dpu_dspp_cfg msm8998_dspp[] = {
  };
  
  static const struct dpu_intf_cfg msm8998_intf[] = {

-   INTF_BLK("intf_0", INTF_0, 0x6a000, 0x280, INTF_DP, 0, 25, 
INTF_SDM845_MASK, MDP_SSPP_TOP0_INTR, 24, 25),
-   INTF_BLK("intf_1", INTF_1, 0x6a800, 0x280, INTF_DSI, 0, 25, 
INTF_SDM845_MASK, MDP_SSPP_TOP0_INTR, 26, 27),
-   INTF_BLK("intf_2", INTF_2, 0x6b000, 0x280, INTF_DSI, 1, 25, 
INTF_SDM845_MASK, MDP_SSPP_TOP0_INTR, 28, 29),
-   INTF_BLK("intf_3", INTF_3, 0x6b800, 0x280, INTF_HDMI, 0, 25, 
INTF_SDM845_MASK, MDP_SSPP_TOP0_INTR, 30, 31),
+   INTF_BLK("intf_0", INTF_0, 0x6a000, 0x280, INTF_DP, 0, 21, 
INTF_SDM845_MASK, MDP_SSPP_TOP0_INTR, 24, 25),
+   INTF_BLK("intf_1", INTF_1, 0x6a800, 0x280, INTF_DSI, 0, 21, 
INTF_SDM845_MASK, MDP_SSPP_TOP0_INTR, 26, 27),
+   INTF_BLK("intf_2", INTF_2, 0x6b000, 0x280, INTF_DSI, 1, 21, 
INTF_SDM845_MASK, MDP_SSPP_TOP0_INTR, 28, 29),
+   INTF_BLK("intf_3", INTF_3, 0x6b800, 0x280, INTF_HDMI, 0, 21, 
INTF_SDM845_MASK, MDP_SSPP_TOP0_INTR, 30, 31),
  };
  
  static const struct dpu_perf_cfg msm8998_perf_data = {

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
index 03f162af1a50b..8d5d782a43398 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
@@ -587,12 +587,12 @@ static const u32 sdm845_nrt_pri_lvl[] = {3, 3, 3, 3, 3, 
3, 3, 3};
  
  static const struct dpu_vbif_dynamic_ot_cfg msm8998_ot_rdwr_cfg[] = {

{
-   .pps = 1088 * 1920 * 30,
+   .pps = 1920 * 1080 * 30,
.ot_limit = 2,
},
{
-   .pps = 1088 * 1920 * 60,
-   .ot_limit = 6,
+   .pps = 1920 * 1080 * 60,
+   .ot_limit = 4,
},
{
.pps = 3840 * 2160 * 30,
@@ -705,10 +705,7 @@ static const struct dpu_qos_lut_entry msm8998_qos_linear[] 
= {
{.fl = 10, .lut = 0x1555b},
{.fl = 11, .lut = 0xb},
{.fl = 12, .lut = 0x1b},
-   {.fl = 13, .lut = 0x5b},
-   {.fl = 14, .lut = 0},
-   {.fl = 1,  .lut = 0x1b},
-   {.fl = 0,  .lut = 0}
+   {.fl = 0,  .lut = 0x5b}
  };
  
  static const struct dpu_qos_lut_entry sdm845_qos_linear[] = {

@@ -730,9 +727,7 @@ static const struct dpu_qos_lut_entry 
msm8998_qos_macrotile[] = {
{.fl = 10, .lut = 0x1aaff},
{.fl = 11, .lut = 0x5aaff},
{.fl = 12, .lut = 0x15aaff},
-   {.fl = 13, .lut = 0x55aaff},
-   {.fl = 1,  .lut = 0x1aaff},
-   {.fl = 0,  .lut = 0},
+   {.fl = 0,  .lut = 0x55aaff},
  };
  
  static const struct dpu_qos_lut_entry sc7180_qos_linear[] = {




--
With best wishes
Dmitry

RE: [Intel-gfx] [PATCH 1/8] drm/i915/mtl: Set has_llc=0

2023-04-19 Thread Yang, Fei

> Hi Fei,
>
> On Wed, Apr 19, 2023 at 02:12:12PM -0700, fei.y...@intel.com wrote:
>> From: Fei Yang 
>>
>> On MTL, LLC is not shared between GT and CPU, set has_llc=0.
>>
>> Signed-off-by: Fei Yang 
>
> just an unanswered question from Nirmoy:
>
> This statement is a bit unclear to me.  I would say "On MTL, LLC is not shared 
> between GT and CPU"

I have updated the commit message accordingly in this version. see above.

> Reviewed-by: Andi Shyti 
> Reviewed-by: Andrzej Hajda 
> Reviewed-by: Nirmoy Das 
>
> Andi

Re: [Intel-gfx] [PATCH 7/8] drm/i915: use pat_index instead of cache_level

2023-04-19 Thread Andi Shyti

Hi Fei,

> Currently the KMD is using enum i915_cache_level to set caching policy for
> buffer objects. This is flaky because the PAT index which really controls
> the caching behavior in PTE has far more levels than what's defined in the
> enum. In addition, the PAT index is platform dependent, having to translate
> between i915_cache_level and PAT index is not reliable, and makes the code
> more complicated.
> 
> From UMD's perspective there is also a necessity to set caching policy for
> performance fine tuning. It's much easier for the UMD to directly use PAT
> index because the behavior of each PAT index is clearly defined in Bspec.
> Having the abstracted i915_cache_level sitting in between would only cause
> more ambiguity.
> 
> For these reasons this patch replaces i915_cache_level with PAT index. Also
> note, the cache_level is not completely removed yet, because the KMD still
> has the need of creating buffer objects with simple cache settings such as
> cached, uncached, or writethrough. For such simple cases, using cache_level
> would help simplify the code.
> 
> Cc: Chris Wilson 
> Cc: Matt Roper 
> Signed-off-by: Fei Yang 

Reviewed-by: Andi Shyti  

Andi

Re: [Intel-gfx] [PATCH 5/8] drm/i915/mtl: end support for set caching ioctl

2023-04-19 Thread Andi Shyti

Hi Fei,

On Wed, Apr 19, 2023 at 02:12:16PM -0700, fei.y...@intel.com wrote:
> From: Fei Yang 
> 
> The design is to keep Buffer Object's caching policy immutable
> throughout its life cycle. This patch ends the support for set caching ioctl
> from MTL onward. While doing that we also set BO's to be 1-way coherent
> at creation time because GPU is no longer automatically snooping CPU
> cache. For UMD's need to fine tune the caching policy for BO's, a follow
> up patch will extend the GEM_CREATE uAPI to allow UMD's to specify caching
> mode at BO creation time.
> 
> Signed-off-by: Fei Yang 

Reviewed-by: Andi Shyti  
Reviewed-by: Andrzej Hajda 

Andi

Re: [Intel-gfx] [PATCH 4/8] drm/i915/mtl: workaround coherency issue for Media

2023-04-19 Thread Andi Shyti

Hi Fei,

On Wed, Apr 19, 2023 at 02:12:15PM -0700, fei.y...@intel.com wrote:
> From: Fei Yang 
> 
> This patch implements Wa_22016122933.
> 
> In MTL, memory writes initiated by Media tile update the whole
> cache line even for partial writes. This creates a coherency
> problem for cacheable memory if both CPU and GPU are writing data
> to different locations within a single cache line. CTB communication
> is impacted by this issue because the head and tail pointers are
> adjacent words within a cache line (see struct guc_ct_buffer_desc),
> where one is written by GuC and the other by the host.
> This patch circumvents the issue by making CPU/GPU shared memory
> uncacheable (WC on CPU side, and PAT index 2 for GPU). Also for
> CTB which is being updated by both CPU and GuC, mfence instruction
> is added to make sure the CPU writes are visible to GPU right away
> (flush the write combining buffer).
> 
> While fixing the CTB issue, we noticed some random GSC firmware
> loading failures because the shared buffers are cacheable (WB) on CPU
> side but uncached on GPU side. To fix these issues we need to map
> such shared buffers as WC on CPU side. Since such allocations are
> not all done through GuC allocator, to avoid too many code changes,
> the i915_coherent_map_type() is now hard coded to return WC for MTL.
> 
> BSpec: 45101
> 
> Signed-off-by: Fei Yang 

Reviewed-by: Andi Shyti 
Acked-by: Nirmoy Das 

Still one comment below.

[...]

> + /*
> +  * Wa_22016122933: Making sure the head update is
> +  * visible to GuC right away
> +  */
> + intel_guc_write_barrier(ct_to_guc(ct));

I thought you were going to revert this. Is this really needed,
BTW? I'm fine with leaving it.

Andi
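
The cache-line sharing described in the commit message can be illustrated with a small layout check. This is a sketch under stated assumptions: the struct is a hypothetical two-field stand-in and not the real struct guc_ct_buffer_desc, the descriptor is assumed to start on a cache-line boundary, and the 64-byte line size is an assumption as well.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; not taken from the patch. */
#define SKETCH_CACHE_LINE 64u

/* Hypothetical descriptor with head and tail as adjacent words. */
struct sketch_ctb_desc {
    uint32_t head;  /* written by one side */
    uint32_t tail;  /* written by the other side */
};

/* Two offsets into a line-aligned buffer share a line if they fall in the same 64-byte block. */
static bool sketch_same_cache_line(size_t off_a, size_t off_b)
{
    return off_a / SKETCH_CACHE_LINE == off_b / SKETCH_CACHE_LINE;
}
```

Because head and tail land in the same line here, a writer that updates the whole line on a partial write can overwrite the other side's field, which is the hazard the workaround avoids by making the memory uncacheable.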

Re: [Intel-gfx] [PATCH 3/8] drm/i915/mtl: Add PTE encode function

2023-04-19 Thread Andi Shyti

Hi Fei,

> PTE encode functions are platform dependent. This patch implements
> PTE functions for MTL, and ensures the correct PTE encode function
> is used by calling pte_encode function pointer instead of the
> hardcoded gen8 version of PTE encode.
> 
> Signed-off-by: Fei Yang 

I think nothing is left open here... I see that the one comment from
Nirmoy has been addressed.

Reviewed-by: Andi Shyti 
Reviewed-by: Andrzej Hajda 
Acked-by: Nirmoy Das 

Andi

Re: [Intel-gfx] [PATCH 2/8] drm/i915/mtl: Define MOCS and PAT tables for MTL

2023-04-19 Thread Andi Shyti

Hi Fei,

On Wed, Apr 19, 2023 at 02:12:13PM -0700, fei.y...@intel.com wrote:
> From: Madhumitha Tolakanahalli Pradeep 
> 
> 
> On MTL, GT can no longer allocate on LLC - only the CPU can.
> This, along with addition of support for L4 cache calls for
> a MOCS/PAT table update.
> Also the PAT index registers are multicasted for primary GT,
> and there is an address jump from index 7 to 8. This patch
> makes sure that these registers are programmed in the proper
> way.
> 
> BSpec: 44509, 45101, 44235
> 
> Cc: Matt Roper 
> Cc: Lucas De Marchi 
> Signed-off-by: Madhumitha Tolakanahalli Pradeep 
> 
> Signed-off-by: Aravind Iddamsetty 
> Signed-off-by: Nirmoy Das 
> Signed-off-by: Fei Yang 

I think nothing is left open here.

Reviewed-by: Andi Shyti 
Reviewed-by: Andrzej Hajda 
Reviewed-by: Nirmoy Das 

Andi

Re: [Intel-gfx] [PATCH 1/8] drm/i915/mtl: Set has_llc=0

2023-04-19 Thread Andi Shyti

Hi Fei,

On Wed, Apr 19, 2023 at 02:12:12PM -0700, fei.y...@intel.com wrote:
> From: Fei Yang 
> 
> On MTL, LLC is not shared between GT and CPU, set has_llc=0.
> 
> Signed-off-by: Fei Yang 

just an unanswered question from Nirmoy:

This statement is a bit unclear to me.  I would say "On MTL, LLC
is not shared between GT and CPU"

Reviewed-by: Andi Shyti 
Reviewed-by: Andrzej Hajda 
Reviewed-by: Nirmoy Das 

Andi

Re: [Intel-gfx] [PATCH 2/8] drm/i915/mtl: Define MOCS and PAT tables for MTL

2023-04-19 Thread Andi Shyti

Hi Fei,

> +#define MTL_PPGTT_PTE_PAT3   BIT_ULL(62)
>  #define GEN12_PPGTT_PTE_LM   BIT_ULL(11)
> +#define GEN12_PPGTT_PTE_PAT2 BIT_ULL(7)
> +#define GEN12_PPGTT_PTE_NC   BIT_ULL(5)
> +#define GEN12_PPGTT_PTE_PAT1 BIT_ULL(4)
> +#define GEN12_PPGTT_PTE_PAT0 BIT_ULL(3)
>  
> -#define GEN12_GGTT_PTE_LM        BIT_ULL(1)
> +#define GEN12_GGTT_PTE_LM        BIT_ULL(1)
> +#define MTL_GGTT_PTE_PAT0        BIT_ULL(52)
> +#define MTL_GGTT_PTE_PAT1        BIT_ULL(53)
> +#define GEN12_GGTT_PTE_ADDR_MASK GENMASK_ULL(45, 12)
> +#define MTL_GGTT_PTE_PAT_MASK    GENMASK_ULL(53, 52)
>  
>  #define GEN12_PDE_64K BIT(6)
>  #define GEN12_PTE_PS64 BIT(8)
> @@ -147,6 +156,15 @@ typedef u64 gen8_pte_t;
>  #define GEN8_PDE_IPS_64K BIT(11)
>  #define GEN8_PDE_PS_2M   BIT(7)
>  
> +#define MTL_PPAT_L4_CACHE_POLICY_MASK REG_GENMASK(3, 2)
> +#define MTL_PAT_INDEX_COH_MODE_MASK   REG_GENMASK(1, 0)
> +#define MTL_PPAT_L4_3_UC REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 3)
> +#define MTL_PPAT_L4_1_WT REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 1)
> +#define MTL_PPAT_L4_0_WB REG_FIELD_PREP(MTL_PPAT_L4_CACHE_POLICY_MASK, 0)
> +#define MTL_3_COH_2W REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 3)
> +#define MTL_2_COH_1W REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 2)
> +#define MTL_0_COH_NON REG_FIELD_PREP(MTL_PAT_INDEX_COH_MODE_MASK, 0)

BTW, are all these defines needed? Not all of them look to be
used.

Andi
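
For readers following the question above, here is a minimal userspace sketch of how the quoted L4 cache policy and coherency mode fields would combine into one PAT register value. The SKETCH_* macros are simplified stand-ins written for this illustration, not the kernel's REG_GENMASK()/REG_FIELD_PREP(), and sketch_pat_value() is a hypothetical helper that does not appear in the patch. Only the field positions (bits 3:2 and bits 1:0) come from the quoted defines.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-ins for the kernel's REG_GENMASK() and REG_FIELD_PREP().
 * Simplified for illustration; only valid for fields narrower than 32 bits.
 */
#define SKETCH_GENMASK(high, low) \
	((uint32_t)(((UINT32_C(1) << ((high) - (low) + 1)) - 1) << (low)))

/* Multiply by the mask's lowest set bit (a shift), then clip to the mask. */
#define SKETCH_FIELD_PREP(mask, val) \
	((uint32_t)(((uint32_t)(val) * ((mask) & (~(mask) + 1))) & (mask)))

/* Field positions copied from the quoted defines. */
#define SKETCH_L4_CACHE_POLICY_MASK	SKETCH_GENMASK(3, 2)
#define SKETCH_COH_MODE_MASK		SKETCH_GENMASK(1, 0)

static_assert(SKETCH_L4_CACHE_POLICY_MASK == 0xC, "L4 policy is bits 3:2");
static_assert(SKETCH_COH_MODE_MASK == 0x3, "coherency mode is bits 1:0");

/* Hypothetical helper: combine an L4 policy and a coherency mode. */
static inline uint32_t sketch_pat_value(uint32_t l4_policy, uint32_t coh_mode)
{
	return SKETCH_FIELD_PREP(SKETCH_L4_CACHE_POLICY_MASK, l4_policy) |
	       SKETCH_FIELD_PREP(SKETCH_COH_MODE_MASK, coh_mode);
}
```

With these stand-ins, an uncached L4 policy (3) with no coherency (0) gives 0xC, and write-through (1) with 1-way coherency (2) gives 0x6.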

Re: [Freedreno] [PATCH 5/5] drm/msm/dpu1: Handle the reg bus ICC path

2023-04-19 Thread Konrad Dybcio




On 19.04.2023 22:11, Jeykumar Sankaran wrote:
> 
> 
> On 4/19/2023 12:48 PM, Konrad Dybcio wrote:
>>
>>
>> On 19.04.2023 21:06, Jeykumar Sankaran wrote:
>>>
>>>
>>> On 4/17/2023 8:30 AM, Konrad Dybcio wrote:
 Apart from the already handled data bus (MAS_MDP_Pn<->DDR), there's
 another path that needs to be handled to ensure MDSS functions properly,
 namely the "reg bus", a.k.a the CPU-MDSS interconnect.

 Gating that path may have a variety of effects.. from none to otherwise
 inexplicable DSI timeouts..

 On the DPU side, we need to keep the bus alive. The vendor driver
 kickstarts it to max (300Mbps) throughput on first commit, but in
 exchange for some battery life in rare DPU-enabled-panel-disabled
 usecases, we can request it at DPU init and gate it at suspend.

 Signed-off-by: Konrad Dybcio 
 ---
    drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 22 --
    drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  1 +
    2 files changed, 21 insertions(+), 2 deletions(-)

 diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
 b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
 index dd6c1c40ab9e..d1f77faebbc0 100644
 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
 +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
 @@ -384,15 +384,17 @@ static int dpu_kms_global_obj_init(struct dpu_kms 
 *dpu_kms)
    return 0;
    }
    -static int dpu_kms_parse_data_bus_icc_path(struct dpu_kms *dpu_kms)
 +static int dpu_kms_parse_icc_paths(struct dpu_kms *dpu_kms)
    {
    struct icc_path *path0;
    struct icc_path *path1;
 +    struct icc_path *reg_bus_path;
    struct drm_device *dev = dpu_kms->dev;
    struct device *dpu_dev = dev->dev;
      path0 = msm_icc_get(dpu_dev, "mdp0-mem");
    path1 = msm_icc_get(dpu_dev, "mdp1-mem");
 +    reg_bus_path = msm_icc_get(dpu_dev, "cpu-cfg");
      if (IS_ERR_OR_NULL(path0))
    return PTR_ERR_OR_ZERO(path0);
 @@ -404,6 +406,10 @@ static int dpu_kms_parse_data_bus_icc_path(struct 
 dpu_kms *dpu_kms)
    dpu_kms->mdp_path[1] = path1;
    dpu_kms->num_mdp_paths++;
    }
 +
 +    if (!IS_ERR_OR_NULL(reg_bus_path))
 +    dpu_kms->reg_bus_path = reg_bus_path;
 +
    return 0;
    }
    @@ -1039,7 +1045,7 @@ static int dpu_kms_hw_init(struct msm_kms *kms)
    DPU_DEBUG("REG_DMA is not defined");
    }
    -    dpu_kms_parse_data_bus_icc_path(dpu_kms);
 +    dpu_kms_parse_icc_paths(dpu_kms);
       rc = pm_runtime_resume_and_get(&dpu_kms->pdev->dev);
    if (rc < 0)
 @@ -1241,6 +1247,9 @@ static int __maybe_unused dpu_runtime_suspend(struct 
 device *dev)
    for (i = 0; i < dpu_kms->num_mdp_paths; i++)
    icc_set_bw(dpu_kms->mdp_path[i], 0, 0);
    +    if (dpu_kms->reg_bus_path)
 +    icc_set_bw(dpu_kms->reg_bus_path, 0, 0);
 +
    return 0;
    }
    @@ -1261,6 +1270,15 @@ static int __maybe_unused 
 dpu_runtime_resume(struct device *dev)
    return rc;
    }
    +    /*
 + * The vendor driver supports setting 76.8 / 150 / 300 Mbps on this
>>> How do you arrive at these distinct BW values? Are they provided by the ICC
>>> fwk for the given path?
>> They're hardcoded in the SDE driver.
>>
>> Konrad
> These bandwidths are derived from the scaling frequencies of all the buses 
> participating in the icc-path. So they cannot be constants. Ideally they 
> should be read from the hw catalog data of the respective platform.
msm-5.4 : rotator/sde_rotator_base.c

static const struct sde_rot_bus_data sde_rot_reg_bus_table[] = {
{0, 0},
{0, 76800},
{0, 15},
{0, 30},
};

One of the two voters begs to disagree, but I do indeed see that some
SoCs (lahaina, yupik, shima..) cast votes for 74/148/265 MBps instead
of 77/150/300 from the MDSS device (with rotator being considered
separate), or so say their DTs, thanks for pointing that out.

Nonetheless, this code would taste good with bolognese sauce..

Konrad

> 
> Jeykumar S.
 + * path, but it seems to go for the highest level when display output
 + * is enabled and zero otherwise. For simplicity, we can assume that
 + * DPU being enabled and running implies that.
 + */
 +    if (dpu_kms->reg_bus_path)
 +    icc_set_bw(dpu_kms->reg_bus_path, 0, MBps_to_icc(300));
 +
    dpu_vbif_init_memtypes(dpu_kms);
      drm_for_each_encoder(encoder, ddev)
 diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h 
 b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
 index d5d9bec90705..c332381d58c4 100644
 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
 +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
 @@ -111,6 +111,7 @@ struct dpu_kms {

Re: [PATCH v6 0/3] Add sync object UAPI support to VirtIO-GPU driver

2023-04-19 Thread Dmitry Osipenko

Hello Gurchetan,

On 4/18/23 02:17, Gurchetan Singh wrote:
> On Sun, Apr 16, 2023 at 4:53 AM Dmitry Osipenko <
> dmitry.osipe...@collabora.com> wrote:
> 
>> We have multiple Vulkan context types that are awaiting for the addition
>> of the sync object DRM UAPI support to the VirtIO-GPU kernel driver:
>>
>>  1. Venus context
>>  2. Native contexts (virtio-freedreno, virtio-intel, virtio-amdgpu)
>>
>> Mesa core supports DRM sync object UAPI, providing Vulkan drivers with a
>> generic fencing implementation that we want to utilize.
>>
>> This patch adds initial sync objects support. It creates a foundation for
>> further fencing improvements. Later on we will want to extend the VirtIO-GPU
>> fencing API with passing fence IDs to host for waiting, it will be a new
>> additional VirtIO-GPU IOCTL and more. Today we have several VirtIO-GPU
>> context drivers in the works that require VirtIO-GPU to support the sync
>> objects UAPI.
>>
>> The patch is heavily inspired by the sync object UAPI implementation of the
>> MSM driver.
>>
> 
> The changes seem good, but I would recommend getting a full end-to-end
> solution (i.e, you've proxied the host fence with these changes and shared
> with the host compositor) working first.  You'll never know what you'll
> find after completing this exercise.  Or is that the plan already?
> 
> Typically, you want to land the uAPI and virtio spec changes last.
> Mesa/gfxstream/virglrenderer/crosvm all have the ability to test out
> unstable uAPIs ...

The proxied host fence isn't directly related to sync objects, though I
prepared code such that it could be extended with a proxied fence later
on, based on a prototype that was made some time ago.

The proxied host fence shouldn't require UAPI changes, but only
virtio-gpu proto extension. Normally, all in-fences belong to a job's
context, and thus, waits are skipped by the guest kernel. Hence, fence
proxying is a separate feature from sync objects, it can be added
without sync objects.

Sync objects are primarily wanted by native context drivers because Mesa
relies on the presence of the sync object UAPI. It's one of the direct
blockers for the Intel and AMDGPU drivers, both of which have been using
this sync object UAPI for a few months and now want it to land upstream.

-- 
Best regards,
Dmitry

[PATCH 4/8] drm/i915/mtl: workaround coherency issue for Media

2023-04-19 Thread fei . yang

From: Fei Yang 

This patch implements Wa_22016122933.

In MTL, memory writes initiated by Media tile update the whole
cache line even for partial writes. This creates a coherency
problem for cacheable memory if both CPU and GPU are writing data
to different locations within a single cache line. CTB communication
is impacted by this issue because the head and tail pointers are
adjacent words within a cache line (see struct guc_ct_buffer_desc),
where one is written by GuC and the other by the host.
This patch circumvents the issue by making CPU/GPU shared memory
uncacheable (WC on CPU side, and PAT index 2 for GPU). Also for
CTB which is being updated by both CPU and GuC, mfence instruction
is added to make sure the CPU writes are visible to GPU right away
(flush the write combining buffer).

While fixing the CTB issue, we noticed random GSC firmware loading
failures because the shared buffers are cacheable (WB) on the CPU
side but uncached on the GPU side. To fix these issues we need to map
such shared buffers as WC on the CPU side. Since such allocations are
not all done through the GuC allocator, to avoid too many code changes,
i915_coherent_map_type() is now hard coded to return WC for MTL.
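
The coherency problem described above can be modelled in a few lines of plain C. This is a rough sketch under stated assumptions, not driver code: struct sketch_ctb_desc is a hypothetical two-field layout (the real struct guc_ct_buffer_desc is not reproduced here), the 64-byte cache line size is assumed, and sketch_full_line_write_tail() only simulates what a partial write carried out as a whole-line write would do to a neighbouring field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE_SIZE 64	/* assumed cache line size in bytes */

/* Hypothetical descriptor; NOT the real struct guc_ct_buffer_desc. */
struct sketch_ctb_desc {
	uint32_t head;	/* written by one agent */
	uint32_t tail;	/* written by the other agent */
};

/* Both fields sit in the same (assumed 64-byte) cache line. */
static_assert(offsetof(struct sketch_ctb_desc, head) / SKETCH_CACHE_LINE_SIZE ==
	      offsetof(struct sketch_ctb_desc, tail) / SKETCH_CACHE_LINE_SIZE,
	      "head and tail share a cache line");

/*
 * Model a partial write that is carried out as a whole-line write: the
 * agent starts from its own, possibly stale, copy of the line, changes
 * one field, and writes the entire line back to memory.
 */
static inline void sketch_full_line_write_tail(struct sketch_ctb_desc *mem,
					       const struct sketch_ctb_desc *stale,
					       uint32_t new_tail)
{
	struct sketch_ctb_desc line;

	assert(mem != NULL && stale != NULL);

	line = *stale;
	line.tail = new_tail;
	*mem = line;	/* also overwrites head with the stale value */
}
```

If one side updates head after the other side took its copy of the line, the whole-line write of tail silently reverts head. Keeping the shared memory uncached on both sides, as the patch does, avoids holding such stale copies.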

BSpec: 45101

Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |  5 -
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 13 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|  7 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  6 ++
 4 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c 
b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index ecd86130b74f..89fc8ea6bcfc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -469,7 +469,10 @@ enum i915_map_type i915_coherent_map_type(struct 
drm_i915_private *i915,
  struct drm_i915_gem_object *obj,
  bool always_coherent)
 {
-   if (i915_gem_object_is_lmem(obj))
+   /*
+* Wa_22016122933: always return I915_MAP_WC for MTL
+*/
+   if (i915_gem_object_is_lmem(obj) || IS_METEORLAKE(i915))
return I915_MAP_WC;
if (HAS_LLC(i915) || always_coherent)
return I915_MAP_WB;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
index 1d9fdfb11268..236673c02f9a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
@@ -110,6 +110,13 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc)
if (obj->base.size < gsc->fw.size)
return -ENOSPC;
 
+   /*
+* Wa_22016122933: For MTL the shared memory needs to be mapped
+* as WC on CPU side and UC (PAT index 2) on GPU side
+*/
+   if (IS_METEORLAKE(i915))
+   i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
+
dst = i915_gem_object_pin_map_unlocked(obj,
   i915_coherent_map_type(i915, 
obj, true));
if (IS_ERR(dst))
@@ -125,6 +132,12 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc)
memset(dst, 0, obj->base.size);
memcpy(dst, src, gsc->fw.size);
 
+   /*
+* Wa_22016122933: Making sure the data in dst is
+* visible to GSC right away
+*/
+   intel_guc_write_barrier(&gt->uc.guc);
+
i915_gem_object_unpin_map(gsc->fw.obj);
i915_gem_object_unpin_map(obj);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index e89f16ecf1ae..c9f20385f6a0 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -744,6 +744,13 @@ struct i915_vma *intel_guc_allocate_vma(struct intel_guc 
*guc, u32 size)
if (IS_ERR(obj))
return ERR_CAST(obj);
 
+   /*
+* Wa_22016122933: For MTL the shared memory needs to be mapped
+* as WC on CPU side and UC (PAT index 2) on GPU side
+*/
+   if (IS_METEORLAKE(gt->i915))
+   i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
+
vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
if (IS_ERR(vma))
goto err;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 1803a633ed64..99a0a89091e7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -902,6 +902,12 @@ static int ct_read(struct intel_guc_ct *ct, struct 
ct_incoming_msg **msg)
/* now update descriptor */
WRITE_ONCE(desc->head, head);
 
+   /*
+* Wa_22016122933: Making sure the head update is
+* visible to GuC right away
+*/
+   intel_guc_write_barrier(ct_to_guc(ct));
+
return available - len;
 
 corrupted:
-- 
2.25.1

[PATCH 7/8] drm/i915: use pat_index instead of cache_level

2023-04-19 Thread fei . yang

From: Fei Yang 

Currently the KMD uses enum i915_cache_level to set the caching policy for
buffer objects. This is fragile because the PAT index, which actually
controls the caching behavior in the PTE, has far more levels than the enum
defines. In addition, the PAT index is platform dependent, so translating
between i915_cache_level and PAT index is unreliable and makes the code
more complicated.

From UMD's perspective there is also a necessity to set caching policy for
performance fine tuning. It's much easier for the UMD to directly use PAT
index because the behavior of each PAT index is clearly defined in Bspec.
Having the abstracted i915_cache_level sitting in between would only cause
more ambiguity.

For these reasons this patch replaces i915_cache_level with PAT index. Also
note, the cache_level is not completely removed yet, because the KMD still
has the need of creating buffer objects with simple cache settings such as
cached, uncached, or writethrough. For such simple cases, using cache_level
would help simplify the code.

Cc: Chris Wilson 
Cc: Matt Roper 
Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/display/intel_dpt.c  | 12 +--
 drivers/gpu/drm/i915/gem/i915_gem_domain.c| 27 ++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 52 +++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  4 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 25 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 71 
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |  3 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 82 +--
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 20 ++---
 drivers/gpu/drm/i915/gt/intel_migrate.c   | 47 ++-
 drivers/gpu/drm/i915/gt/intel_migrate.h   | 13 ++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c| 47 ++-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c|  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c   | 55 ++---
 drivers/gpu/drm/i915/i915_gem.c   | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c |  8 +-
 drivers/gpu/drm/i915/i915_vma.c   | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h   |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  2 -
 drivers/gpu/drm/i915/selftests/i915_gem.c |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 drivers/gpu/drm/i915/selftests/mock_gtt.c |  8 +-
 36 files changed, 378 insertions(+), 239 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
b/drivers/gpu/drm/i915/display/intel_dpt.c
index c5eacfdba1a5..7c5fddb203ba 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -43,24 +43,24 @@ static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
 static void dpt_insert_page(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
-   enum i915_cache_level level,
+   unsigned int pat_index,
u32 flags)
 {
struct i915_dpt *dpt = i915_vm_to_dpt(vm);
gen8_pte_t __iomem *base = dpt->iomem;
 
gen8_set_pte(base + offset / I915_GTT_PAGE_SIZE,
-vm->pte_encode(addr, level, flags));
+vm->pte_encode(addr, pat_index, flags));
 }
 
 static void dpt_insert_entries(struct i915_address_space *vm,
   struct i915_vma_resource *vma_res,
-  enum i915_cache_level level,
+  unsigned int pat_index,
   u32 flags)
 {
struct i915_dpt *dpt = i915_vm_to_dpt(vm);
gen8_pte_t __iomem *base = dpt->iomem;
-   const gen8_pte_t pte_encode = vm->pte_encode(0, level, flags);
+   const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
struct sgt_iter sgt_iter;
dma_addr_t addr;
int i;
@@ -83,7 +83,7 @@ static void dpt_clear_range(struct i915_address_space *vm,
 static void dpt_bind_vma(struct i915_address_space *vm,
 struct i915_vm_pt_stash *stash,
 struct i915_vma_resource

[PATCH 8/8] drm/i915: Allow user to set cache at BO creation

2023-04-19 Thread fei . yang

From: Fei Yang 

To comply with the design that buffer objects shall have an immutable
cache setting throughout their life cycle, the {set, get}_caching ioctls
are no longer supported from MTL onward. With that change, the caching
policy can only be set at object creation time. The current code
applies a default (platform dependent) cache setting to all objects.
However, this is not optimal for performance tuning. The patch extends
the existing gem_create uAPI to let the user set the PAT index for the
object at creation time.
The new extension is platform independent, so UMDs can switch to using
this extension on older platforms as well, while {set, get}_caching are
still supported on these legacy platforms for compatibility reasons.
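
The validation flow this extension relies on can be sketched in isolation as follows. This is a minimal userspace illustration, not the driver code: struct sketch_create_ext, SKETCH_PAT_INDEX_NOT_SET (assumed here to be UINT_MAX) and the sketch_* helpers are hypothetical names. What it takes from the patch is the idea of a "not set" sentinel, a range check against the platform's maximum PAT index, and -EINVAL on an out-of-range value.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical sentinel meaning "user space did not pass a PAT index". */
#define SKETCH_PAT_INDEX_NOT_SET UINT_MAX

/* Hypothetical per-ioctl state, reduced to the one field of interest. */
struct sketch_create_ext {
	unsigned int pat_index;
};

/* Start every create call with the sentinel, before parsing extensions. */
static inline void sketch_create_ext_init(struct sketch_create_ext *ext)
{
	assert(ext != NULL);
	ext->pat_index = SKETCH_PAT_INDEX_NOT_SET;
}

/* Accept the requested index only if the platform supports it. */
static inline int sketch_ext_set_pat(struct sketch_create_ext *ext,
				     unsigned int requested,
				     unsigned int max_pat_index)
{
	assert(ext != NULL);

	if (requested > max_pat_index)
		return -EINVAL;

	ext->pat_index = requested;
	return 0;
}

/* After parsing: did user space override the platform default? */
static inline int sketch_pat_was_set(const struct sketch_create_ext *ext)
{
	assert(ext != NULL);
	return ext->pat_index != SKETCH_PAT_INDEX_NOT_SET;
}
```

The sentinel lets the create path tell "index 0 requested" apart from "nothing requested", so the platform default is only replaced when the extension was actually supplied.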

Cc: Chris Wilson 
Cc: Matt Roper 
Cc: Andi Shyti 
Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c | 36 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c |  6 
 include/uapi/drm/i915_drm.h| 36 ++
 tools/include/uapi/drm/i915_drm.h  | 36 ++
 4 files changed, 114 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index bfe1dbda4cb7..723c3ddd6c74 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -245,6 +245,7 @@ struct create_ext {
unsigned int n_placements;
unsigned int placement_mask;
unsigned long flags;
+   unsigned int pat_index;
 };
 
 static void repr_placements(char *buf, size_t size,
@@ -394,11 +395,39 @@ static int ext_set_protected(struct i915_user_extension 
__user *base, void *data
return 0;
 }
 
+static int ext_set_pat(struct i915_user_extension __user *base, void *data)
+{
+   struct create_ext *ext_data = data;
+   struct drm_i915_private *i915 = ext_data->i915;
+   struct drm_i915_gem_create_ext_set_pat ext;
+   unsigned int max_pat_index;
+
+   BUILD_BUG_ON(sizeof(struct drm_i915_gem_create_ext_set_pat) !=
+offsetofend(struct drm_i915_gem_create_ext_set_pat, rsvd));
+
+   if (copy_from_user(&ext, base, sizeof(ext)))
+   return -EFAULT;
+
+   max_pat_index = INTEL_INFO(i915)->max_pat_index;
+
+   if (ext.pat_index > max_pat_index) {
+   drm_dbg(&i915->drm, "PAT index is invalid: %u\n",
+   ext.pat_index);
+   return -EINVAL;
+   }
+
+   ext_data->pat_index = ext.pat_index;
+
+   return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
+   [I915_GEM_CREATE_EXT_SET_PAT] = ext_set_pat,
 };
 
+#define PAT_INDEX_NOT_SET  0x
 /**
  * i915_gem_create_ext_ioctl - Creates a new mm object and returns a handle to 
it.
  * @dev: drm device pointer
@@ -418,6 +447,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
if (args->flags & ~I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS)
return -EINVAL;
 
+   ext_data.pat_index = PAT_INDEX_NOT_SET;
ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
   create_extensions,
   ARRAY_SIZE(create_extensions),
@@ -454,5 +484,11 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
if (IS_ERR(obj))
return PTR_ERR(obj);
 
+   if (ext_data.pat_index != PAT_INDEX_NOT_SET) {
+   i915_gem_object_set_pat_index(obj, ext_data.pat_index);
+   /* Mark pat_index is set by UMD */
+   obj->cache_level = I915_CACHE_INVAL;
+   }
+
return i915_gem_publish(obj, file, &args->size, &args->handle);
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 27c948350b5b..61651f7e5806 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -209,6 +209,12 @@ bool i915_gem_object_can_bypass_llc(struct 
drm_i915_gem_object *obj)
if (!(obj->flags & I915_BO_ALLOC_USER))
return false;
 
+   /*
+* Always flush cache for UMD objects at creation time.
+*/
+   if (obj->cache_level == I915_CACHE_INVAL)
+   return true;
+
/*
 * EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
 * possible for userspace to bypass the GTT caching bits set by the
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index dba7c5a5b25e..03c5c314846e 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -3630,9 +3630,13 @@ struct drm_i915_gem_create_ext {
 *
 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
 * struct drm_i915_gem_create_ext_protected_content.
+*
+* For I915_GEM_CREATE_EXT_SET_PAT

[PATCH 1/8] drm/i915/mtl: Set has_llc=0

2023-04-19 Thread fei . yang

From: Fei Yang 

On MTL, LLC is not shared between GT and CPU, set has_llc=0.

Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/i915_pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index d64e074d7457..272a8ba37b64 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1147,6 +1147,7 @@ static const struct intel_device_info mtl_info = {
.has_flat_ccs = 0,
.has_gmd_id = 1,
.has_guc_deprivilege = 1,
+   .has_llc = 0,
.has_mslice_steering = 0,
.has_snoop = 1,
.__runtime.memory_regions = REGION_SMEM | REGION_STOLEN_LMEM,
-- 
2.25.1

[PATCH 6/8] drm/i915: preparation for using PAT index

2023-04-19 Thread fei . yang

From: Fei Yang 

This patch prepares for replacing enum i915_cache_level with PAT index.
The caching policy for buffer objects is set through the PAT index in the
PTE, and the old i915_cache_level cannot represent all caching modes
supported by the hardware.

Prepare for the transition by adding platform dependent data structures
and helper functions to translate the cache_level to pat_index.

cachelevel_to_pat: a platform dependent array mapping cache_level to
   pat_index.

max_pat_index: the maximum PAT index supported by the hardware. Needed for
   validating the PAT index passed in from user space.

i915_gem_get_pat_index: function to convert cache_level to PAT index.

obj_to_i915(obj): macro moved to header file for wider usage.

I915_MAX_CACHE_LEVEL: upper bound of i915_cache_level for the
  convenience of coding.
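
As an illustration of the items listed above, a bounds-checked table lookup could look like the sketch below. All sketch_* names are hypothetical. The table values are an assumption inferred from the PAT bits that mtl_pte_encode() sets in patch 3/8 (PAT1 alone for uncached, PAT0 and PAT1 for the LLC levels, PAT0 alone for write-through); the real per-platform tables live in i915_pci.c and are not shown in this mail.

```c
#include <assert.h>

/* Mirrors the shape of enum i915_cache_level plus the new upper bound. */
enum sketch_cache_level {
	SKETCH_CACHE_NONE,
	SKETCH_CACHE_LLC,
	SKETCH_CACHE_L3_LLC,
	SKETCH_CACHE_WT,
	SKETCH_MAX_CACHE_LEVEL,
};

/*
 * Assumed MTL-style mapping, inferred from the PAT0/PAT1 bits used by
 * mtl_pte_encode(): index = PAT0 * 1 + PAT1 * 2.
 */
static const unsigned int sketch_cachelevel_to_pat[] = {
	[SKETCH_CACHE_NONE]   = 2,
	[SKETCH_CACHE_LLC]    = 3,
	[SKETCH_CACHE_L3_LLC] = 3,
	[SKETCH_CACHE_WT]     = 1,
};

static_assert(sizeof(sketch_cachelevel_to_pat) /
	      sizeof(sketch_cachelevel_to_pat[0]) == SKETCH_MAX_CACHE_LEVEL,
	      "one table entry per cache level");

/* Translate a cache level; fall back to index 0 on an invalid level. */
static inline unsigned int sketch_get_pat_index(enum sketch_cache_level level)
{
	if ((unsigned int)level >= SKETCH_MAX_CACHE_LEVEL)
		return 0;

	return sketch_cachelevel_to_pat[level];
}
```

Having the enum end in an explicit upper bound is what makes both the table size check and the range check possible without hard-coding a count.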

Cc: Chris Wilson 
Cc: Matt Roper 
Cc: Andi Shyti 
Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  9 +++
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  4 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  |  6 ++
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |  6 ++
 drivers/gpu/drm/i915/i915_pci.c   | 75 +--
 drivers/gpu/drm/i915/intel_device_info.h  |  5 ++
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  9 +++
 9 files changed, 107 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 4666bb82f312..8c70a0ec7d2f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -45,6 +45,15 @@ static struct kmem_cache *slab_objects;
 
 static const struct drm_gem_object_funcs i915_gem_object_funcs;
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+   enum i915_cache_level level)
+{
+   if (drm_WARN_ON(&i915->drm, level >= I915_MAX_CACHE_LEVEL))
+   return 0;
+
+   return INTEL_INFO(i915)->cachelevel_to_pat[level];
+}
+
 struct drm_i915_gem_object *i915_gem_object_alloc(void)
 {
struct drm_i915_gem_object *obj;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 885ccde9dc3c..4c92e17b4337 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -20,6 +20,8 @@
 
 enum intel_region_id;
 
+#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
+
 static inline bool i915_gem_object_size_2big(u64 size)
 {
struct drm_i915_gem_object *obj;
@@ -30,6 +32,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
return false;
 }
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+   enum i915_cache_level level);
 void i915_gem_init__objects(struct drm_i915_private *i915);
 
 void i915_objects_module_exit(void);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 830c11431ee8..41b35abccf88 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -194,6 +194,7 @@ enum i915_cache_level {
 * engine.
 */
I915_CACHE_WT,
+   I915_MAX_CACHE_LEVEL,
 };
 
 enum i915_map_type {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index b1672e054b21..214763942aa2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -460,8 +460,6 @@ void i915_gem_shrinker_taints_mutex(struct drm_i915_private 
*i915,
fs_reclaim_release(GFP_KERNEL);
 }
 
-#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
-
 /**
  * i915_gem_object_make_unshrinkable - Hide the object from the shrinker. By
  * default all object types that support shrinking(see IS_SHRINKABLE), will 
also
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 11b91e0453c8..7a4b1d1afce9 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -78,6 +78,12 @@ static u64 mtl_pte_encode(dma_addr_t addr,
case I915_CACHE_WT:
pte |= GEN12_PPGTT_PTE_PAT0;
break;
+   default:
+   /* This should never happen. Added to deal with the compile
+* error due to the addition of I915_MAX_CACHE_LEVEL. Will
+* be removed by the pat_index patch.
+*/
+   break;
}
 
return pte;
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index d8405e00a862..1054b8e85d62 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -242,6

[PATCH 5/8] drm/i915/mtl: end support for set caching ioctl

2023-04-19 Thread fei . yang

From: Fei Yang 

The design is to keep a buffer object's caching policy immutable
throughout its life cycle. This patch ends support for the set caching
ioctl from MTL onward. While doing that we also set BOs to be 1-way
coherent at creation time because the GPU no longer snoops the CPU cache
automatically. For UMDs that need to fine tune the caching policy for
BOs, a follow-up patch will extend the GEM_CREATE uAPI to allow UMDs to
specify the caching mode at BO creation time.
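
The version gate behind this change can be illustrated with a small standalone sketch. SKETCH_IP_VER() assumes the packing "version in the high byte, release in the low byte", which is how I understand the driver's IP_VER() to work; treat that as an assumption. sketch_set_caching_check() is a hypothetical helper and covers only the new version check, not the rest of the ioctl.

```c
#include <assert.h>
#include <errno.h>

/* Assumed packing: major version in bits 15:8, release in bits 7:0. */
#define SKETCH_IP_VER(ver, rel) (((ver) << 8) | (rel))

static_assert(SKETCH_IP_VER(12, 70) == 0x0C46, "12.70 packs to 0x0C46");
static_assert(SKETCH_IP_VER(12, 70) > SKETCH_IP_VER(12, 55),
	      "packed versions compare in release order");

/*
 * Hypothetical stand-in for the new check at the top of the set caching
 * ioctl: refuse the request on graphics version 12.70 and later.
 */
static inline int sketch_set_caching_check(unsigned int graphics_ver_full)
{
	if (graphics_ver_full >= SKETCH_IP_VER(12, 70))
		return -EOPNOTSUPP;

	return 0;
}
```

Because the packed value orders versions numerically, a single >= comparison covers MTL and everything after it.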

Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  | 9 -
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index d2d5a24301b2..bb3575b1479f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -337,6 +337,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void 
*data,
if (IS_DGFX(i915))
return -ENODEV;
 
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+   return -EOPNOTSUPP;
+
switch (args->caching) {
case I915_CACHING_NONE:
level = I915_CACHE_NONE;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 37d1efcd3ca6..cad4a6017f4b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -601,7 +601,14 @@ static int shmem_object_init(struct intel_memory_region 
*mem,
obj->write_domain = I915_GEM_DOMAIN_CPU;
obj->read_domains = I915_GEM_DOMAIN_CPU;
 
-   if (HAS_LLC(i915))
+   /*
+* MTL doesn't snoop CPU cache by default for GPU access (namely
+* 1-way coherency). However some UMD's are currently depending on
+* that. Make 1-way coherent the default setting for MTL. A follow
+* up patch will extend the GEM_CREATE uAPI to allow UMD's specify
+* caching mode at BO creation time
+*/
+   if (HAS_LLC(i915) || (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)))
/* On some devices, we can have the GPU use the LLC (the CPU
 * cache) for about a 10% performance improvement
 * compared to uncached.  Graphics requests other than
-- 
2.25.1

[PATCH 3/8] drm/i915/mtl: Add PTE encode function

2023-04-19 Thread fei . yang

From: Fei Yang 

PTE encode functions are platform dependent. This patch implements
PTE functions for MTL, and ensures the correct PTE encode function
is used by calling pte_encode function pointer instead of the
hardcoded gen8 version of PTE encode.
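
The dispatch described above can be sketched outside the kernel as follows. The bit positions for LM, NC, PAT0 and PAT1 are taken from the defines quoted elsewhere in this thread; the PRESENT/RW bits, the flag values and the version packing are assumptions, and every sketch_* name is hypothetical. The gen8 function here is a placeholder that omits the real gen8 cache bits, so only the MTL path and the selection logic are meant to be illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BIT_ULL(n)	(UINT64_C(1) << (n))

#define SKETCH_PAGE_PRESENT	SKETCH_BIT_ULL(0)	/* assumed */
#define SKETCH_PAGE_RW		SKETCH_BIT_ULL(1)	/* assumed */
#define SKETCH_PTE_PAT0		SKETCH_BIT_ULL(3)
#define SKETCH_PTE_PAT1		SKETCH_BIT_ULL(4)
#define SKETCH_PTE_NC		SKETCH_BIT_ULL(5)
#define SKETCH_PTE_LM		SKETCH_BIT_ULL(11)

#define SKETCH_FLAG_READ_ONLY	(1u << 0)		/* assumed */
#define SKETCH_FLAG_LM		(1u << 1)		/* assumed */

#define SKETCH_IP_VER(ver, rel)	(((ver) << 8) | (rel))	/* assumed packing */

enum sketch_cache_level {
	SKETCH_CACHE_NONE,
	SKETCH_CACHE_LLC,
	SKETCH_CACHE_L3_LLC,
	SKETCH_CACHE_WT,
};

typedef uint64_t (*sketch_pte_encode_fn)(uint64_t addr,
					 enum sketch_cache_level level,
					 uint32_t flags);

/* Placeholder for the older encoder; real gen8 cache bits are omitted. */
static uint64_t sketch_gen8_pte_encode(uint64_t addr,
				       enum sketch_cache_level level,
				       uint32_t flags)
{
	(void)level;
	(void)flags;
	return addr | SKETCH_PAGE_PRESENT | SKETCH_PAGE_RW;
}

/* Follows the structure of mtl_pte_encode() in the patch. */
static uint64_t sketch_mtl_pte_encode(uint64_t addr,
				      enum sketch_cache_level level,
				      uint32_t flags)
{
	uint64_t pte = addr | SKETCH_PAGE_PRESENT | SKETCH_PAGE_RW;

	if (flags & SKETCH_FLAG_READ_ONLY)
		pte &= ~SKETCH_PAGE_RW;
	if (flags & SKETCH_FLAG_LM)
		pte |= SKETCH_PTE_LM | SKETCH_PTE_NC;

	switch (level) {
	case SKETCH_CACHE_NONE:
		pte |= SKETCH_PTE_PAT1;
		break;
	case SKETCH_CACHE_LLC:
	case SKETCH_CACHE_L3_LLC:
		pte |= SKETCH_PTE_PAT0 | SKETCH_PTE_PAT1;
		break;
	case SKETCH_CACHE_WT:
		pte |= SKETCH_PTE_PAT0;
		break;
	}

	return pte;
}

/* Pick the encoder once, at address space creation time. */
static sketch_pte_encode_fn sketch_select_pte_encode(unsigned int ver_full)
{
	if (ver_full >= SKETCH_IP_VER(12, 70))
		return sketch_mtl_pte_encode;

	return sketch_gen8_pte_encode;
}
```

Callers then go through the stored function pointer instead of naming an encoder directly, which is what lets one insert path serve both page table formats.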

Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 45 
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 36 +--
 3 files changed, 72 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
b/drivers/gpu/drm/i915/display/intel_dpt.c
index b8027392144d..c5eacfdba1a5 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
vm->vma_ops.bind_vma= dpt_bind_vma;
vm->vma_ops.unbind_vma  = dpt_unbind_vma;
 
-   vm->pte_encode = gen8_ggtt_pte_encode;
+   vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
 
dpt->obj = dpt_obj;
dpt->obj->is_dpt = true;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 4daaa6f55668..11b91e0453c8 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
return pte;
 }
 
+static u64 mtl_pte_encode(dma_addr_t addr,
+ enum i915_cache_level level,
+ u32 flags)
+{
+   gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
+
+   if (unlikely(flags & PTE_READ_ONLY))
+   pte &= ~GEN8_PAGE_RW;
+
+   if (flags & PTE_LM)
+   pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
+
+   switch (level) {
+   case I915_CACHE_NONE:
+   pte |= GEN12_PPGTT_PTE_PAT1;
+   break;
+   case I915_CACHE_LLC:
+   case I915_CACHE_L3_LLC:
+   pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
+   break;
+   case I915_CACHE_WT:
+   pte |= GEN12_PPGTT_PTE_PAT0;
+   break;
+   }
+
+   return pte;
+}
+
 static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
 {
struct drm_i915_private *i915 = ppgtt->vm.i915;
@@ -427,7 +455,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
  u32 flags)
 {
struct i915_page_directory *pd;
-   const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+   const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, 
flags);
gen8_pte_t *vaddr;
 
pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
@@ -580,7 +608,7 @@ static void gen8_ppgtt_insert_huge(struct 
i915_address_space *vm,
   enum i915_cache_level cache_level,
   u32 flags)
 {
-   const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+   const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
unsigned int rem = sg_dma_len(iter->sg);
u64 start = vma_res->start;
 
@@ -743,7 +771,7 @@ static void gen8_ppgtt_insert_entry(struct 
i915_address_space *vm,
GEM_BUG_ON(pt->is_compact);
 
vaddr = px_vaddr(pt);
-   vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
+   vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
 }
 
@@ -773,7 +801,7 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct 
i915_address_space *vm,
}
 
vaddr = px_vaddr(pt);
-   vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
+   vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
 }
 
 static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
@@ -820,8 +848,8 @@ static int gen8_init_scratch(struct i915_address_space *vm)
pte_flags |= PTE_LM;
 
vm->scratch[0]->encode =
-   gen8_pte_encode(px_dma(vm->scratch[0]),
-   I915_CACHE_NONE, pte_flags);
+   vm->pte_encode(px_dma(vm->scratch[0]),
+  I915_CACHE_NONE, pte_flags);
 
for (i = 1; i <= vm->top; i++) {
struct drm_i915_gem_object *obj;
@@ -963,7 +991,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 */
ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
-   ppgtt->vm.pte_encode = gen8_pte_encode;
+   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
+   ppgtt->vm.pte_encode = mtl_pte_encode;
+   else
+   ppgtt->vm.pte_encode = gen8_pte_encode;
 
ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
ppgtt->vm.insert_entries = gen8_ppgtt_insert;
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index
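
A rough userspace sketch of the idea in this patch: the PTE encoder becomes a
per-address-space function pointer, chosen once at creation time from the
graphics IP version. All names below (vm_sketch, legacy_pte_encode,
mtl_pte_encode_sketch, pat_index_of) are hypothetical stand-ins, not i915
code. The bit positions used for PAT0/PAT1 and the IP_VER() packing are
assumptions, since the quoted diff does not show them, and PTE_LM handling is
left out for brevity. If PAT0/PAT1 really are the low two bits of the PAT
index, the cache levels would select indices 2 (uncached), 3 (write-back,
1-way coherent) and 1 (write-through) of the table programmed in patch 2/8.

```c
#include <stdint.h>

#define BIT_ULL(n)        (1ULL << (n))
#define IP_VER(ver, rel)  ((ver) << 8 | (rel))   /* assumed packing */

/* Stand-ins for the i915 definitions; bit positions are assumptions. */
#define PAGE_PRESENT   BIT_ULL(0)
#define PAGE_RW        BIT_ULL(1)
#define PTE_PAT0       BIT_ULL(3)
#define PTE_PAT1       BIT_ULL(4)
#define PTE_READ_ONLY  (1u << 0)

enum cache_level { CACHE_NONE, CACHE_LLC, CACHE_L3_LLC, CACHE_WT };

typedef uint64_t pte_t;
typedef pte_t (*pte_encode_fn)(uint64_t addr, enum cache_level level,
			       uint32_t flags);

/* Hypothetical address space: only the encode hook matters here. */
struct vm_sketch {
	pte_encode_fn pte_encode;
};

/* Placeholder for the pre-MTL encoder: present/RW bits only. */
static pte_t legacy_pte_encode(uint64_t addr, enum cache_level level,
			       uint32_t flags)
{
	pte_t pte = addr | PAGE_PRESENT | PAGE_RW;

	(void)level;
	if (flags & PTE_READ_ONLY)
		pte &= ~PAGE_RW;
	return pte;
}

/* Mirrors the cache-level switch of mtl_pte_encode() in the diff above. */
static pte_t mtl_pte_encode_sketch(uint64_t addr, enum cache_level level,
				   uint32_t flags)
{
	pte_t pte = addr | PAGE_PRESENT | PAGE_RW;

	if (flags & PTE_READ_ONLY)
		pte &= ~PAGE_RW;

	switch (level) {
	case CACHE_NONE:
		pte |= PTE_PAT1;
		break;
	case CACHE_LLC:
	case CACHE_L3_LLC:
		pte |= PTE_PAT0 | PTE_PAT1;
		break;
	case CACHE_WT:
		pte |= PTE_PAT0;
		break;
	}
	return pte;
}

/* Pick the encoder once; callers then only go through vm->pte_encode. */
static void vm_init(struct vm_sketch *vm, unsigned int ver_full)
{
	if (ver_full >= IP_VER(12, 70))
		vm->pte_encode = mtl_pte_encode_sketch;
	else
		vm->pte_encode = legacy_pte_encode;
}

/* Recover the 2-bit PAT index, treating PAT0 as bit 0 and PAT1 as bit 1. */
static unsigned int pat_index_of(pte_t pte)
{
	return (unsigned int)!!(pte & PTE_PAT0) |
	       (unsigned int)!!(pte & PTE_PAT1) << 1;
}
```

Routing every call site through the hook keeps the insert paths free of
platform checks; only the creation path needs to know about IP versions.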

[PATCH 0/8] drm/i915/mtl: Define MOCS and PAT tables for MTL

2023-04-19 Thread fei . yang

From: Fei Yang 

The series includes patches needed to enable MTL.
Also add new extension for GEM_CREATE uAPI to let
user space set cache policy for buffer objects.

v2: addressing review comments and checkpatch warnings
v3: make mtl_ggtt_pte_encode static

Fei Yang (7):
  drm/i915/mtl: Set has_llc=0
  drm/i915/mtl: Add PTE encode function
  drm/i915/mtl: workaround coherency issue for Media
  drm/i915/mtl: end support for set caching ioctl
  drm/i915: preparation for using PAT index
  drm/i915: use pat_index instead of cache_level
  drm/i915: Allow user to set cache at BO creation

Madhumitha Tolakanahalli Pradeep (1):
  drm/i915/mtl: Define MOCS and PAT tables for MTL

 drivers/gpu/drm/i915/display/intel_dpt.c  | 14 ++--
 drivers/gpu/drm/i915/gem/i915_gem_create.c| 36 
 drivers/gpu/drm/i915/gem/i915_gem_domain.c| 30 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 67 ++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  8 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 26 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c |  5 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  9 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 76 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |  3 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 84 +--
 drivers/gpu/drm/i915/gt/intel_gt_regs.h   |  6 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c   | 47 ++-
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 38 ++---
 drivers/gpu/drm/i915/gt/intel_migrate.c   | 47 ++-
 drivers/gpu/drm/i915/gt/intel_migrate.h   | 13 ++-
 drivers/gpu/drm/i915/gt/intel_mocs.c  | 76 -
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c| 47 ++-
 drivers/gpu/drm/i915/gt/selftest_mocs.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c|  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 13 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|  7 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  6 ++
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c   | 55 +---
 drivers/gpu/drm/i915/i915_gem.c   | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c |  8 +-
 drivers/gpu/drm/i915/i915_pci.c   | 76 +++--
 drivers/gpu/drm/i915/i915_vma.c   | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h   |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  2 -
 drivers/gpu/drm/i915/intel_device_info.h  |  5 ++
 drivers/gpu/drm/i915/selftests/i915_gem.c |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  9 ++
 drivers/gpu/drm/i915/selftests/mock_gtt.c |  8 +-
 include/uapi/drm/i915_drm.h   | 36 
 tools/include/uapi/drm/i915_drm.h | 36 
 52 files changed, 812 insertions(+), 226 deletions(-)

-- 
2.25.1

[PATCH 2/8] drm/i915/mtl: Define MOCS and PAT tables for MTL

2023-04-19 Thread fei . yang

From: Madhumitha Tolakanahalli Pradeep 


On MTL, GT can no longer allocate on LLC - only the CPU can.
This, along with addition of support for L4 cache calls for
a MOCS/PAT table update.
Also the PAT index registers are multicasted for primary GT,
and there is an address jump from index 7 to 8. This patch
makes sure that these registers are programmed in the proper
way.

BSpec: 44509, 45101, 44235

Cc: Matt Roper 
Cc: Lucas De Marchi 
Signed-off-by: Madhumitha Tolakanahalli Pradeep 

Signed-off-by: Aravind Iddamsetty 
Signed-off-by: Nirmoy Das 
Signed-off-by: Fei Yang 
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h |  6 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c | 47 ++-
 drivers/gpu/drm/i915/gt/intel_gtt.h | 20 ++-
 drivers/gpu/drm/i915/gt/intel_mocs.c| 76 +++--
 drivers/gpu/drm/i915/gt/selftest_mocs.c |  2 +-
 5 files changed, 143 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index fd1f9cd35e9d..e8c3b762a92a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -356,7 +356,11 @@
 #define GEN7_TLB_RD_ADDR   _MMIO(0x4700)
 
 #define GEN12_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4)
-#define XEHP_PAT_INDEX(index)  MCR_REG(0x4800 + (index) * 4)
+#define _PAT_INDEX(index)  _PICK_EVEN_2RANGES(index, 8, \
+  0x4800, 
0x4804, \
+  0x4848, 
0x484c)
+#define XEHP_PAT_INDEX(index)  MCR_REG(_PAT_INDEX(index))
+#define XELPMP_PAT_INDEX(index)_MMIO(_PAT_INDEX(index))
 
 #define XEHP_TILE0_ADDR_RANGE  MCR_REG(0x4900)
 #define   XEHP_TILE_LMEM_RANGE_SHIFT   8
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 4f436ba7a3c8..2f6a9be0ffe6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -468,6 +468,44 @@ void gtt_write_workarounds(struct intel_gt *gt)
}
 }
 
+static void xelpmp_setup_private_ppat(struct intel_uncore *uncore)
+{
+   intel_uncore_write(uncore, XELPMP_PAT_INDEX(0),
+  MTL_PPAT_L4_0_WB);
+   intel_uncore_write(uncore, XELPMP_PAT_INDEX(1),
+  MTL_PPAT_L4_1_WT);
+   intel_uncore_write(uncore, XELPMP_PAT_INDEX(2),
+  MTL_PPAT_L4_3_UC);
+   intel_uncore_write(uncore, XELPMP_PAT_INDEX(3),
+  MTL_PPAT_L4_0_WB | MTL_2_COH_1W);
+   intel_uncore_write(uncore, XELPMP_PAT_INDEX(4),
+  MTL_PPAT_L4_0_WB | MTL_3_COH_2W);
+
+   /*
+* Remaining PAT entries are left at the hardware-default
+* fully-cached setting
+*/
+}
+
+static void xelpg_setup_private_ppat(struct intel_gt *gt)
+{
+   intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0),
+MTL_PPAT_L4_0_WB);
+   intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1),
+MTL_PPAT_L4_1_WT);
+   intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2),
+MTL_PPAT_L4_3_UC);
+   intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3),
+MTL_PPAT_L4_0_WB | MTL_2_COH_1W);
+   intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4),
+MTL_PPAT_L4_0_WB | MTL_3_COH_2W);
+
+   /*
+* Remaining PAT entries are left at the hardware-default
+* fully-cached setting
+*/
+}
+
 static void tgl_setup_private_ppat(struct intel_uncore *uncore)
 {
/* TGL doesn't support LLC or AGE settings */
@@ -603,7 +641,14 @@ void setup_private_pat(struct intel_gt *gt)
 
GEM_BUG_ON(GRAPHICS_VER(i915) < 8);
 
-   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+   if (gt->type == GT_MEDIA) {
+   xelpmp_setup_private_ppat(gt->uncore);
+   return;
+   }
+
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+   xelpg_setup_private_ppat(gt);
+   else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
xehp_setup_private_ppat(gt);
else if (GRAPHICS_VER(i915) >= 12)
tgl_setup_private_ppat(uncore);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 69ce55f517f5..854ec09fd588 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -88,9 +88,18 @@ typedef u64 gen8_pte_t;
 #define BYT_PTE_SNOOPED_BY_CPU_CACHES  REG_BIT(2)
 #define BYT_PTE_WRITEABLE  REG_BIT(1)
 
+#define MTL_PPGTT_PTE_PAT3 BIT_ULL(62)
 #define GEN12_PPGTT_PTE_LM BIT_ULL(11)
+#define GEN12_PPGTT_PTE_PAT2   BIT_ULL(7)
+#define GEN12_PPGTT_PTE_NC BIT_ULL(5)
+#define
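
To make the "address jump from index 7 to 8" concrete, here is a small
userspace sketch of the offset arithmetic behind _PAT_INDEX(). The two macros
are simplified re-implementations written for illustration (the kernel's
_PICK_EVEN_2RANGES also carries build-time checks that are omitted here), and
pat_index_offset() is a hypothetical helper name.

```c
#include <stdint.h>

/* Evenly spaced registers: a is entry 0, b is entry 1. */
#define PICK_EVEN(index, a, b)  ((a) + (index) * ((b) - (a)))

/* Two evenly spaced ranges; the second one starts at c_index. */
#define PICK_EVEN_2RANGES(index, c_index, a, b, c, d)		\
	((index) < (c_index) ? PICK_EVEN(index, a, b) :		\
			       PICK_EVEN((index) - (c_index), c, d))

/* MMIO offset of PAT entry 'index', using the constants from the patch. */
static uint32_t pat_index_offset(uint32_t index)
{
	return PICK_EVEN_2RANGES(index, 8, 0x4800, 0x4804, 0x4848, 0x484c);
}
```

Entries 0..7 sit at 0x4800 + 4 * index, so entry 7 is at 0x481c; entry 8 then
jumps to 0x4848 instead of 0x4820, and the 4-byte stride resumes from there.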

Re: [PATCH v2] drm: use mgr->dev in drm_dbg_kms in drm_dp_add_payload_part2

2023-04-19 Thread Lyude Paul

Reviewed-by: Lyude Paul 

Thanks!

On Wed, 2023-04-19 at 07:24 -0400, Jeff Layton wrote:
> I've been experiencing some intermittent crashes down in the display
> driver code. The symptoms are usually a line like this in dmesg:
> 
> amdgpu :30:00.0: [drm] Failed to create MST payload for port 
> 6d3a3885: -5
> 
> ...followed by an Oops due to a NULL pointer dereference.
> 
> Switch to using mgr->dev instead of state->dev since "state" can be
> NULL in some cases.
> 
> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2184855
> Suggested-by: Jani Nikula 
> Signed-off-by: Jeff Layton 
> ---
>  drivers/gpu/drm/display/drm_dp_mst_topology.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> I've been running this patch for a couple of days, but the problem
> hasn't occurred again as of yet. It seems sane though as long as we can
> assume that mgr->dev will be valid even when "state" is a NULL pointer.
> 
> diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c 
> b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> index 38dab76ae69e..e2e21ce79510 100644
> --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> @@ -3404,7 +3404,7 @@ int drm_dp_add_payload_part2(struct 
> drm_dp_mst_topology_mgr *mgr,
>  
>   /* Skip failed payloads */
>   if (payload->vc_start_slot == -1) {
> - drm_dbg_kms(state->dev, "Part 1 of payload creation for %s 
> failed, skipping part 2\n",
> + drm_dbg_kms(mgr->dev, "Part 1 of payload creation for %s 
> failed, skipping part 2\n",
>   payload->port->connector->name);
>   return -EIO;
>   }

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat

Re: [Intel-gfx] [PATCH 0/8] drm/i915/mtl: Define MOCS and PAT tables for MTL

2023-04-19 Thread Andi Shyti

Hi Fei,

On Wed, Apr 19, 2023 at 11:09:34AM -0700, fei.y...@intel.com wrote:
> From: Fei Yang 
> 
> The series includes patches needed to enable MTL.
> Also add new extension for GEM_CREATE uAPI to let
> user space set cache policy for buffer objects.
> 
> v2: addressing review comments and checkpatch warnings
> 
> Fei Yang (7):
>   drm/i915/mtl: Set has_llc=0
>   drm/i915/mtl: Add PTE encode function
>   drm/i915/mtl: workaround coherency issue for Media
>   drm/i915/mtl: end support for set caching ioctl
>   drm/i915: preparation for using PAT index
>   drm/i915: use pat_index instead of cache_level
>   drm/i915: Allow user to set cache at BO creation
> 
> Madhumitha Tolakanahalli Pradeep (1):
>   drm/i915/mtl: Define MOCS and PAT tables for MTL

next time could you please add all the r-b's you got, as it's
hard to track them down?

And, could you please version your patches with format patch:

   git format-patch -v X

and also add a changelog. The changelog might be a bit annoying
but it's very useful to understand what has changed.

As a reviewer, in order to check the different versions I had to
check the date sent.

Thanks,
Andi

Re: [PATCH v2] firmware/sysfb: Fix VESA format selection

2023-04-19 Thread Pierre Asselin

Thomas Zimmermann  wrote:
> Am 19.04.23 um 06:48 schrieb Pierre Asselin:
>>
>> v2 fixes the warnings from a max3() macro with arguments of different
>> types;  split the bits_per_pixel assignment to avoid uglifying the code
>> with too many typecasts.
>
> What exactly was that warning?

A friendly note from a robot; make W=1 sysfb_simplefb.o .
https://lore.kernel.org/dri-devel/20230418183325.2327-1...@panix.com/T/#m38e859354329ab9f756da91e99b546e3b140fa91

> I liked the all-in-one assignment of the original patch. So I'd rather
> go back to v1 and copy si->lfb_depth to the correct type, like this:
>
>u32 depth = si->lfb_depth;
>bits_per_pixel = max3(max3(colors),
>   rsvd,
>  depth);

Would that work?  If I understand correctly max3() checks that all args
have the same type.  {red,green,blue,rsvd}.{size,pos} are all u8 while
lfb_depth is u16.  The best I can do is

bits_per_pixel = max3((u16)max3(si->red_size + si->red_pos,
si->green_size + si->green_pos,
si->blue_size + si->blue_pos),
  (u16)(si->rsvd_size + si->rsvd_pos),
  si->lfb_depth);

That compiles quietly with W=1 but those two casts are ugly.
If I do that, would K&R read better?

bits_per_pixel = max3(
  (u16)max3(
  si->red_size + si->red_pos,
  si->green_size + si->green_pos,
  si->blue_size + si->blue_pos
  ),
  (u16)(si->rsvd_size + si->rsvd_pos),
  si->lfb_depth
 );

I think it's clearer, but not kernel style and still ugly.

> Or, if you want to get fancy, you could add max3_t() to 
>
>#define max3_t(type, x, y, z)   max_t(type, max_t(type, x, y), z)
>
> and do
>
>bits_per_pixel = max3_t(u32,
>max3(colors),
>rsvd,
>si->lfb_depth)
>
> You could also add a max4_t(type, x, y, z, w) to  and
> compare all values with max4_t().

That would be a two-patch series.  I'd rather keep it to the strict
minimum that fixes the regression.  (You trust me to even *look* at a
kernel header and not break it ?  Dangerous assumption!)

I'm new at this.  Two months ago I didn't know what to type at the
command line after "git".

Incidentally, should I send v3 as a new email or reply to the chain?

--PA
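
For reference, a userspace sketch of the max3_t() variant discussed above. It
is only an illustration: max_t() here is a simplified stand-in for the kernel
macro (it evaluates its arguments more than once and has no type checking),
struct screen_info_sketch is a hypothetical subset of struct screen_info, and
sysfb_bits_per_pixel() is an invented helper name.

```c
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;

/* Simplified stand-ins: cast both operands to one type, then compare. */
#define max_t(type, x, y)      ((type)(x) > (type)(y) ? (type)(x) : (type)(y))
#define max3_t(type, x, y, z)  max_t(type, max_t(type, x, y), z)

/* Hypothetical subset of struct screen_info: u8 channels, u16 depth. */
struct screen_info_sketch {
	u8 red_size, red_pos;
	u8 green_size, green_pos;
	u8 blue_size, blue_pos;
	u8 rsvd_size, rsvd_pos;
	u16 lfb_depth;
};

/* Largest of: highest colour bit, highest reserved bit, reported depth. */
static u32 sysfb_bits_per_pixel(const struct screen_info_sketch *si)
{
	u32 colors = max3_t(u32,
			    si->red_size + si->red_pos,
			    si->green_size + si->green_pos,
			    si->blue_size + si->blue_pos);
	u32 rsvd = si->rsvd_size + si->rsvd_pos;

	return max3_t(u32, colors, rsvd, si->lfb_depth);
}
```

Because every operand is converted to the named type before the comparison,
the u8 sums and the u16 depth can be mixed without per-argument casts at the
call site.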

[PATCH v4] drm/fbdev-generic: prohibit potential out-of-bounds access

2023-04-19 Thread Sui Jingfeng

The fbdev test of IGT may write after EOF, which leads to out-of-bounds
access for the DRM drivers that use fbdev-generic. For example, running the
fbdev test on an x86 + ast2400 platform at 1680x1050 resolution causes the
Linux kernel to hang with the following call trace:

  Oops:  [#1] PREEMPT SMP PTI
  [IGT] fbdev: starting subtest eof
  Workqueue: events drm_fb_helper_damage_work [drm_kms_helper]
  [IGT] fbdev: starting subtest nullptr

  RIP: 0010:memcpy_erms+0xa/0x20
  RSP: 0018:a17d40167d98 EFLAGS: 00010246
  RAX: a17d4eb7fa80 RBX: a17d40e0aa80 RCX: 14c0
  RDX: 1a40 RSI: a17d40e0b000 RDI: a17d4eb8
  RBP: a17d40167e20 R08:  R09: 89522ecff8c0
  R10: a17d4e4c5000 R11:  R12: a17d4eb7fa80
  R13: 1a40 R14: 041a R15: a17d40167e30
  FS:  () GS:89525738() knlGS:
  CS:  0010 DS:  ES:  CR0: 80050033
  CR2: a17d40e0b000 CR3: 0001eaeca006 CR4: 001706e0
  Call Trace:
   
   ? drm_fbdev_generic_helper_fb_dirty+0x207/0x330 [drm_kms_helper]
   drm_fb_helper_damage_work+0x8f/0x170 [drm_kms_helper]
   process_one_work+0x21f/0x430
   worker_thread+0x4e/0x3c0
   ? __pfx_worker_thread+0x10/0x10
   kthread+0xf4/0x120
   ? __pfx_kthread+0x10/0x10
   ret_from_fork+0x2c/0x50
   
  CR2: a17d40e0b000
  ---[ end trace  ]---

This is because the damage rectangle computed by
drm_fb_helper_memory_range_to_clip() is not guaranteed to be bounded by the
screen's active display area. In detail, we typically allocate buffers at
page-size granularity to support the mmap system call.

Exporting a buffer slightly larger than the active display to user space
does allow userspace to write below the bottom of the display. That is not
a big issue by itself, because there is still memory backing the access.

Yet drifting too far from the boundary is dangerous, because such an access
puts the system in an out-of-bounds situation. The root cause is that we do
not validate the range; DIV_ROUND_UP() may also introduce an off-by-one
error.

This patch adds logic to keep the damage rectangle from drifting out of the
visible boundary.

Fixes: aa15c677cc34 ("drm/fb-helper: Fix vertical damage clipping")

Signed-off-by: Sui Jingfeng 
Reviewed-by: Thomas Zimmermann 
Tested-by: Geert Uytterhoeven 
Link: 
https://lore.kernel.org/dri-devel/ad44df29-3241-0d9e-e708-b0338bf3c...@189.cn/
---
 drivers/gpu/drm/drm_fb_helper.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 64458982be40..6bb1b8b27d7a 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -641,19 +641,27 @@ static void drm_fb_helper_damage(struct drm_fb_helper 
*helper, u32 x, u32 y,
 static void drm_fb_helper_memory_range_to_clip(struct fb_info *info, off_t 
off, size_t len,
   struct drm_rect *clip)
 {
+   u32 line_length = info->fix.line_length;
+   u32 fb_height = info->var.yres;
off_t end = off + len;
u32 x1 = 0;
-   u32 y1 = off / info->fix.line_length;
+   u32 y1 = off / line_length;
u32 x2 = info->var.xres;
-   u32 y2 = DIV_ROUND_UP(end, info->fix.line_length);
+   u32 y2 = DIV_ROUND_UP(end, line_length);
+
+   /* Don't allow any of them beyond the bottom bound of display area */
+   if (y1 > fb_height)
+   y1 = fb_height;
+   if (y2 > fb_height)
+   y2 = fb_height;
 
if ((y2 - y1) == 1) {
/*
 * We've only written to a single scanline. Try to reduce
 * the number of horizontal pixels that need an update.
 */
-   off_t bit_off = (off % info->fix.line_length) * 8;
-   off_t bit_end = (end % info->fix.line_length) * 8;
+   off_t bit_off = (off % line_length) * 8;
+   off_t bit_end = (end % line_length) * 8;
 
x1 = bit_off / info->var.bits_per_pixel;
x2 = DIV_ROUND_UP(bit_end, info->var.bits_per_pixel);
-- 
2.25.1
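
A userspace sketch of the clamping logic, for anyone who wants to try the
arithmetic outside the kernel. struct fb_geometry and struct clip_rect are
hypothetical stand-ins for the fb_info and drm_rect fields used by the patch,
and the final assignment into *clip is an assumption about how the result is
stored, since the quoted hunk ends before that point.

```c
#include <stddef.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d)  (((n) + (d) - 1) / (d))

/* Hypothetical stand-ins for the fb_info / drm_rect fields involved. */
struct fb_geometry {
	uint32_t line_length;     /* bytes per scanline */
	uint32_t xres, yres;      /* visible size in pixels */
	uint32_t bits_per_pixel;
};

struct clip_rect {
	uint32_t x1, y1, x2, y2;
};

/* Convert a written byte range into a damage rectangle, clamped to yres. */
static void memory_range_to_clip(const struct fb_geometry *fb,
				 size_t off, size_t len,
				 struct clip_rect *clip)
{
	size_t end = off + len;
	uint32_t x1 = 0;
	uint32_t y1 = off / fb->line_length;
	uint32_t x2 = fb->xres;
	uint32_t y2 = DIV_ROUND_UP(end, fb->line_length);

	/* Don't allow either bound below the bottom of the display area. */
	if (y1 > fb->yres)
		y1 = fb->yres;
	if (y2 > fb->yres)
		y2 = fb->yres;

	if ((y2 - y1) == 1) {
		/* Single scanline: narrow the horizontal extent as well. */
		size_t bit_off = (off % fb->line_length) * 8;
		size_t bit_end = (end % fb->line_length) * 8;

		x1 = bit_off / fb->bits_per_pixel;
		x2 = DIV_ROUND_UP(bit_end, fb->bits_per_pixel);
	}

	clip->x1 = x1;
	clip->y1 = y1;
	clip->x2 = x2;
	clip->y2 = y2;
}
```

With 1680x1050 at 32 bpp (6720 bytes per line), a 4096-byte write that starts
exactly at the end of the visible area would give y1 = 1050 and y2 = 1051
without the clamp; with it, both become 1050 and the rectangle is empty.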

Re: [PATCH v2] firmware/sysfb: Fix VESA format selection

2023-04-19 Thread Pierre Asselin

Javier Martinez Canillas  writes:
> Pierre Asselin  writes:

>> Fixes: f35cd3fa7729 [firmware/sysfb: Fix EFI/VESA format selection]
>
> The convention is f35cd3fa7729 ("firmware/sysfb: Fix EFI/VESA format
> selection")

>> +bits_per_pixel= max(bits_per_pixel, (u32)si->lfb_depth);
>
> You are missing a space here.

OK.  I'll fix that.  Thanks.

--PA

Re: [Freedreno] [PATCH 5/5] drm/msm/dpu1: Handle the reg bus ICC path

2023-04-19 Thread Konrad Dybcio




On 19.04.2023 21:06, Jeykumar Sankaran wrote:
> 
> 
> On 4/17/2023 8:30 AM, Konrad Dybcio wrote:
>> Apart from the already handled data bus (MAS_MDP_Pn<->DDR), there's
>> another path that needs to be handled to ensure MDSS functions properly,
>> namely the "reg bus", a.k.a the CPU-MDSS interconnect.
>>
>> Gating that path may have a variety of effects.. from none to otherwise
>> inexplicable DSI timeouts..
>>
>> On the DPU side, we need to keep the bus alive. The vendor driver
>> kickstarts it to max (300Mbps) throughput on first commit, but in
>> exchange for some battery life in rare DPU-enabled-panel-disabled
>> usecases, we can request it at DPU init and gate it at suspend.
>>
>> Signed-off-by: Konrad Dybcio 
>> ---
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 22 --
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  1 +
>>   2 files changed, 21 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c 
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
>> index dd6c1c40ab9e..d1f77faebbc0 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
>> @@ -384,15 +384,17 @@ static int dpu_kms_global_obj_init(struct dpu_kms 
>> *dpu_kms)
>>   return 0;
>>   }
>>   -static int dpu_kms_parse_data_bus_icc_path(struct dpu_kms *dpu_kms)
>> +static int dpu_kms_parse_icc_paths(struct dpu_kms *dpu_kms)
>>   {
>>   struct icc_path *path0;
>>   struct icc_path *path1;
>> +    struct icc_path *reg_bus_path;
>>   struct drm_device *dev = dpu_kms->dev;
>>   struct device *dpu_dev = dev->dev;
>>     path0 = msm_icc_get(dpu_dev, "mdp0-mem");
>>   path1 = msm_icc_get(dpu_dev, "mdp1-mem");
>> +    reg_bus_path = msm_icc_get(dpu_dev, "cpu-cfg");
>>     if (IS_ERR_OR_NULL(path0))
>>   return PTR_ERR_OR_ZERO(path0);
>> @@ -404,6 +406,10 @@ static int dpu_kms_parse_data_bus_icc_path(struct 
>> dpu_kms *dpu_kms)
>>   dpu_kms->mdp_path[1] = path1;
>>   dpu_kms->num_mdp_paths++;
>>   }
>> +
>> +    if (!IS_ERR_OR_NULL(reg_bus_path))
>> +    dpu_kms->reg_bus_path = reg_bus_path;
>> +
>>   return 0;
>>   }
>>   @@ -1039,7 +1045,7 @@ static int dpu_kms_hw_init(struct msm_kms *kms)
>>   DPU_DEBUG("REG_DMA is not defined");
>>   }
>>   -    dpu_kms_parse_data_bus_icc_path(dpu_kms);
>> +    dpu_kms_parse_icc_paths(dpu_kms);
>>     rc = pm_runtime_resume_and_get(&dpu_kms->pdev->dev);
>>   if (rc < 0)
>> @@ -1241,6 +1247,9 @@ static int __maybe_unused dpu_runtime_suspend(struct 
>> device *dev)
>>   for (i = 0; i < dpu_kms->num_mdp_paths; i++)
>>   icc_set_bw(dpu_kms->mdp_path[i], 0, 0);
>>   +    if (dpu_kms->reg_bus_path)
>> +    icc_set_bw(dpu_kms->reg_bus_path, 0, 0);
>> +
>>   return 0;
>>   }
>>   @@ -1261,6 +1270,15 @@ static int __maybe_unused dpu_runtime_resume(struct 
>> device *dev)
>>   return rc;
>>   }
>>   +    /*
>> + * The vendor driver supports setting 76.8 / 150 / 300 Mbps on this
> How do you arrive at these distinct BW values? Are they provided by the ICC 
> fwk for the given path?
They're hardcoded in the SDE driver.

Konrad
>> + * path, but it seems to go for the highest level when display output
>> + * is enabled and zero otherwise. For simplicity, we can assume that
>> + * DPU being enabled and running implies that.
>> + */
>> +    if (dpu_kms->reg_bus_path)
>> +    icc_set_bw(dpu_kms->reg_bus_path, 0, MBps_to_icc(300));
>> +
>>   dpu_vbif_init_memtypes(dpu_kms);
>>     drm_for_each_encoder(encoder, ddev)
>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h 
>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
>> index d5d9bec90705..c332381d58c4 100644
>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
>> @@ -111,6 +111,7 @@ struct dpu_kms {
>>   atomic_t bandwidth_ref;
>>   struct icc_path *mdp_path[2];
>>   u32 num_mdp_paths;
>> +    struct icc_path *reg_bus_path;
>>   };
>>     struct vsync_info {
>>

Re: [PATCH 3/3] drm/i915/hwmon: Block waiting for GuC reset to complete

2023-04-19 Thread Rodrigo Vivi

On Tue, Apr 18, 2023 at 10:23:50AM -0700, Dixit, Ashutosh wrote:
> On Mon, 17 Apr 2023 22:35:58 -0700, Rodrigo Vivi wrote:
> >
> 
> Hi Rodrigo,
> 
> > On Mon, Apr 10, 2023 at 03:35:09PM -0700, Ashutosh Dixit wrote:
> > > Instead of erroring out when GuC reset is in progress, block waiting for
> > > GuC reset to complete which is a more reasonable uapi behavior.
> > >
> > > v2: Avoid race between wake_up_all and waiting for wakeup (Rodrigo)
> > >
> > > Signed-off-by: Ashutosh Dixit 
> > > ---
> > >  drivers/gpu/drm/i915/i915_hwmon.c | 38 +++
> > >  1 file changed, 33 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
> > > b/drivers/gpu/drm/i915/i915_hwmon.c
> > > index 9ab8971679fe3..8471a667dfc71 100644
> > > --- a/drivers/gpu/drm/i915/i915_hwmon.c
> > > +++ b/drivers/gpu/drm/i915/i915_hwmon.c
> > > @@ -51,6 +51,7 @@ struct hwm_drvdata {
> > >   char name[12];
> > >   int gt_n;
> > >   bool reset_in_progress;
> > > + wait_queue_head_t waitq;
> > >  };
> > >
> > >  struct i915_hwmon {
> > > @@ -395,16 +396,41 @@ hwm_power_max_read(struct hwm_drvdata *ddat, long 
> > > *val)
> > >  static int
> > >  hwm_power_max_write(struct hwm_drvdata *ddat, long val)
> > >  {
> > > +#define GUC_RESET_TIMEOUT msecs_to_jiffies(2000)
> > > +
> > > + int ret = 0, timeout = GUC_RESET_TIMEOUT;
> > >   struct i915_hwmon *hwmon = ddat->hwmon;
> > >   intel_wakeref_t wakeref;
> > > - int ret = 0;
> > > + DEFINE_WAIT(wait);
> > >   u32 nval;
> > >
> > > - mutex_lock(&hwmon->hwmon_lock);
> > > - if (hwmon->ddat.reset_in_progress) {
> > > - ret = -EAGAIN;
> > > - goto unlock;
> > > + /* Block waiting for GuC reset to complete when needed */
> > > + for (;;) {
> > > + mutex_lock(&hwmon->hwmon_lock);
> >
> > I'm really afraid of how this mutex is handled with the wait queue.
> > At first glance it looks like it is trying to reimplement ww_mutex?
> 
> Sorry, but I am missing the relation with ww_mutex. No such relation is
> intended.
> 
> > All the other examples of wait_queue usage like this either didn't use
> > locks or had them in a totally different flow that I could not correlate.
> 
> Actually there are several examples of prepare_to_wait/finish_wait
> sequences with both spinlock and mutex in the kernel. See
> e.g. rpm_suspend(), wait_for_rtrs_disconnection(), softsynthx_read().
> 
> Also, as I mentioned, except for the lock, the sequence here is identical
> to intel_guc_wait_for_pending_msg().
> 
> >
> > > +
> > > + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE);
> > > +
> > > + if (!hwmon->ddat.reset_in_progress)
> > > + break;
> >
> > If this breaks we never unlock it?
> 
> Correct, this is the original case in Patch 2 where the mutex is acquired
> in the beginning of the function and released just before the final exit
> from the function (so the mutex is held for the entire duration of the
> function).

I got really confused here... I looked at patch 2 again and I don't
see any place where the lock remains held outside of the function, which
is what I asked to remove in the initial versions.

But with this one I'm even more confused, because I couldn't work out
who will release the lock and when.

> 
> >
> > > +
> > > + if (signal_pending(current)) {
> > > + ret = -EINTR;
> > > + break;
> > > + }
> > > +
> > > + if (!timeout) {
> > > + ret = -ETIME;
> > > + break;
> > > + }
> > > +
> > > + mutex_unlock(&hwmon->hwmon_lock);
> >
> > do we need to lock the signal pending and timeout as well?
> > or only wrapping it around the hwmon->ddat access would be
> > enough?
> 
> Strictly, the mutex is only needed for the hwmon->ddat.reset_in_progress
> flag. But because this is not a performance path, implementing it as done
> in the patch simplifies the code flow (since there are several if/else,
> goto's, mutex lock/unlock and prepare_to_wait/finish_wait to consider).
> 
> So if possible I *really* want to not try to over-optimize here (I did try
> a few other things when writing the patch but it was getting ugly). The
> only real requirement is to drop the lock before calling schedule_timeout()
> below (and we are reacquiring the lock as soon as we are scheduled back in,
> as you can see in the loop above).
> 
> >
> > > +
> > > + timeout = schedule_timeout(timeout);
> > >   }
> > > + finish_wait(&ddat->waitq, &wait);
> > > + if (ret)
> > > + goto unlock;
> > > +
> > >   wakeref = intel_runtime_pm_get(ddat->uncore->rpm);
> > >
> > >   /* Disable PL1 limit and verify, because the limit cannot be disabled 
> > > on all platforms */
> > > @@ -508,6 +534,7 @@ void i915_hwmon_power_max_restore(struct 
> > > drm_i915_private *i915, bool old)
> > >   intel_uncore_rmw(hwmon->ddat.uncore, hwmon->rg.pkg_rapl_limit,
> > >PKG_PWR_LIM_1_EN, old ? PKG_PWR_LIM_1_EN : 0);
> > >   hwmon->ddat.reset_in_progress
