Re: [PATCH v4 00/14] drm: Add a driver for CSF-based Mali GPUs

2024-01-28 Thread Boris Brezillon
On Mon, 22 Jan 2024 17:30:31 +0100
Boris Brezillon  wrote:

> Hello,
> 
> This is the 4th version of the kernel driver for Mali CSF-based GPUs.
> 
> A branch based on drm-misc-next and containing all the dependencies
> that are not yet available in drm-misc-next is here[1], and another [2]
> contains extra patches to have things working on rk3588. The CSF
> firmware binary can be found here[3], and should be placed under
> /lib/firmware/arm/mali/arch10.8/mali_csffw.bin.
> 
> The mesa MR adding v10 support on top of panthor is available here [4].
> 
> Steve, I intentionally dropped your R-b on "drm/panthor: Add the heap
> logical block" and "drm/panthor: Add the scheduler logical block"
> because the tiler-OOM handling changed enough to require a new review
> IMHO.
> 
> Regarding the GPL2+MIT relicensing, I collected Clément's R-b for the
> devfreq code, but am still lacking Alexey Sheplyakov's for some bits in
> panthor_gpu.c. The rest of the code is either new, or covered by the
> Linaro, Arm and Collabora acks.
> 
> And here is a non-exhaustive changelog, check each commit for a detailed
> changelog.
> 
> v4:
> - Fix various bugs in the VM logic
> - Address comments from Steven, Liviu, Ketil and Chris
> - Move tiler OOM handling out of the scheduler interrupt handling path
>   so we can properly recover when the system runs out of memory, and
>   panthor is blocked trying to allocate heap chunks
> - Rework the heap locking to support concurrent chunk allocation. Not
>   sure if this is supposed to happen, but we need to be robust against
>   userspace passing the same heap context to two scheduling groups.
>   Wasn't needed before the tiler_oom rework, because heap allocation
>   was serialized by the scheduler lock.
> - Make kernel BO destruction robust to NULL/ERR pointers
> 
> v3:
> - Quite a few changes at the MMU/sched level to fix some
>   race conditions and deadlocks
> - Addition of a sync-only VM_BIND operation (to support
>   vkQueueSparseBind with zero commands).
> - Addition of a VM_GET_STATE ioctl
> - Various cosmetic changes (see the commit changelogs for more details)
> - Various fixes (see the commit changelogs for more details)
> 
> v2:
> - Rename the driver (pancsf -> panthor)
> - Split the commit adding the driver to ease review
> - Use drm_sched for dependency tracking/job submission
> - Add a VM_BIND ioctl
> - Add the concept of exclusive VM for BOs that are only ever mapped to a
>   single VM
> - Document the code and uAPI
> - Add a DT binding doc
> 
> Regards,
> 
> Boris
> 
> [1]https://gitlab.freedesktop.org/panfrost/linux/-/tree/panthor-v4
> [2]https://gitlab.freedesktop.org/panfrost/linux/-/tree/panthor-v4+rk3588
> [3]https://gitlab.com/firefly-linux/external/libmali/-/raw/firefly/firmware/g610/mali_csffw.bin

Here's a link to a more recent/maintained libmali tree:

[3]https://github.com/JeffyCN/mirrors/raw/libmali/firmware/g610/mali_csffw.bin


> [4]https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26358
> 
> Boris Brezillon (13):
>   drm/panthor: Add uAPI
>   drm/panthor: Add GPU register definitions
>   drm/panthor: Add the device logical block
>   drm/panthor: Add the GPU logical block
>   drm/panthor: Add GEM logical block
>   drm/panthor: Add the devfreq logical block
>   drm/panthor: Add the MMU/VM logical block
>   drm/panthor: Add the FW logical block
>   drm/panthor: Add the heap logical block
>   drm/panthor: Add the scheduler logical block
>   drm/panthor: Add the driver frontend block
>   drm/panthor: Allow driver compilation
>   drm/panthor: Add an entry to MAINTAINERS
> 
> Liviu Dudau (1):
>   dt-bindings: gpu: mali-valhall-csf: Add support for Arm Mali CSF GPUs
> 
>  .../bindings/gpu/arm,mali-valhall-csf.yaml|  147 +
>  Documentation/gpu/driver-uapi.rst |5 +
>  MAINTAINERS   |   11 +
>  drivers/gpu/drm/Kconfig   |2 +
>  drivers/gpu/drm/Makefile  |1 +
>  drivers/gpu/drm/panthor/Kconfig   |   23 +
>  drivers/gpu/drm/panthor/Makefile  |   15 +
>  drivers/gpu/drm/panthor/panthor_devfreq.c |  283 ++
>  drivers/gpu/drm/panthor/panthor_devfreq.h |   25 +
>  drivers/gpu/drm/panthor/panthor_device.c  |  544 +++
>  drivers/gpu/drm/panthor/panthor_device.h  |  393 ++
>  drivers/gpu/drm/panthor/panthor_drv.c | 1470 +++
>  drivers/gpu/drm/panthor/panthor_fw.c  | 1334 +++
>  drivers/gpu/drm/panthor/panthor_fw.h  |  504 +++
>  drivers/gpu/drm/panthor/panthor_gem.c |  228 ++
>  drivers/gpu/drm/panthor/panthor_gem.h |  144 +
>  drivers/gpu/drm/panthor/panthor_gpu.c |  482 +++
>  drivers/gpu/drm/panthor/panthor_gpu.h |   52 +
>  drivers/gpu/drm/panthor/panthor_heap.c|  596 +++
>  drivers/gpu/drm/panthor/panthor_heap.h|   39 +
>  drivers/gpu/drm/panthor/panthor_mmu.c | 2760 +
>  drivers/gpu/drm/panthor/panthor_mmu.h |  

Re: [PATCH] drm/sched: Drain all entities in DRM sched run job worker

2024-01-28 Thread Vlastimil Babka
On 1/29/24 08:44, Christian König wrote:
> Am 26.01.24 um 17:29 schrieb Matthew Brost:
>> On Fri, Jan 26, 2024 at 11:32:57AM +0100, Christian König wrote:
>>> Am 25.01.24 um 18:30 schrieb Matthew Brost:
>>>> On Thu, Jan 25, 2024 at 04:12:58PM +0100, Christian König wrote:
>>>>> Am 24.01.24 um 22:08 schrieb Matthew Brost:
>>>>>> All entities must be drained in the DRM scheduler run job worker to
>>>>>> avoid the following case. An entity is found that is ready, no job is
>>>>>> found ready on that entity, and the run job worker goes idle with other
>>>>>> entities + jobs ready. Draining all ready entities (i.e. looping over
>>>>>> all ready entities) in the run job worker ensures all jobs that are
>>>>>> ready will be scheduled.
>>>>> That doesn't make sense. drm_sched_select_entity() only returns entities
>>>>> which are "ready", i.e. have a job to run.
>>>>>
>>>> That is what I thought too, hence my original design, but it is not
>>>> exactly true. Let me explain.
>>>>
>>>> drm_sched_select_entity() returns an entity with a non-empty spsc queue
>>>> (job in queue) and no *current* waiting dependencies [1]. Dependencies for
>>>> an entity can be added when drm_sched_entity_pop_job() is called [2][3],
>>>> returning a NULL job. Thus we can get into a scenario where 2 entities
>>>> A and B both have jobs and no current dependencies. A's job is waiting on
>>>> B's job, entity A gets selected first, a dependency gets installed in
>>>> drm_sched_entity_pop_job(), run work goes idle, and now we deadlock.
>>> And here is the real problem. run work doesn't go idle in that moment.
>>>
>>> drm_sched_run_job_work() should restart itself until there is either no
>>> more space in the ring buffer or it can't find a ready entity any more.
>>>
>>> At least that was the original design when that was all still driven by a
>>> kthread.
>>>
>>> It may well be that we messed this up when switching from kthread to a
>>> work item.
>>>
>> Right, that's what this patch does - the run worker does not go idle until
>> no ready entities are found. That was incorrect in the original patch
>> and fixed here. Do you have any issues with this fix? It has been tested
>> 3 times and clearly fixes the issue.
> 
> Ah! Yes, in this case the patch here is a little bit ugly as well.
> 
> The original idea was that run_job restarts so that we are able to pause 
> the submission thread without searching for an entity to submit more.
> 
> I strongly suggest replacing the while loop with a call to 
> drm_sched_run_job_queue() so that when the entity can't provide a job we 
> just restart the queuing work.

Note it's already included in rc2, so any changes need to be a followup fix.
If these are important, then please make sure they get to rc3 :)
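
To make the scenario discussed above concrete, here is a minimal userspace sketch of the two worker behaviours. It is a toy model and not the scheduler code: `toy_entity`, `toy_select_entity()`, `toy_pop_job()`, `toy_run_once()` and `toy_run_drain()` are all hypothetical names, and a dependency is reduced to a flag that the first pop sets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these types exist in the kernel. */
struct toy_entity {
	int queued_jobs;       /* jobs sitting in the entity's queue */
	bool dep_pending;      /* a dependency is installed and not yet signalled */
	bool dep_on_first_pop; /* the first pop discovers a dependency */
};

/*
 * Rough analogue of drm_sched_select_entity(): return the first entity
 * that has a queued job and no *current* dependency.
 */
static struct toy_entity *toy_select_entity(struct toy_entity *e, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (e[i].queued_jobs > 0 && !e[i].dep_pending)
			return &e[i];
	return NULL;
}

/*
 * Rough analogue of drm_sched_entity_pop_job(): the pop itself may
 * install a dependency, in which case no job is returned.
 */
static bool toy_pop_job(struct toy_entity *ent)
{
	if (ent->dep_on_first_pop) {
		ent->dep_on_first_pop = false;
		ent->dep_pending = true;
		return false;
	}
	ent->queued_jobs--;
	return true;
}

/* Worker pass that tries a single entity and then goes idle. */
static bool toy_run_once(struct toy_entity *e, size_t n)
{
	struct toy_entity *ent = toy_select_entity(e, n);

	return ent && toy_pop_job(ent);
}

/* Worker pass that keeps selecting until a job is found or nothing is ready. */
static bool toy_run_drain(struct toy_entity *e, size_t n)
{
	struct toy_entity *ent;

	while ((ent = toy_select_entity(e, n)))
		if (toy_pop_job(ent))
			return true;
	return false;
}
```

With entity A set up to discover a dependency on its first pop and entity B holding a ready job, `toy_run_once()` returns without running anything while B's job stays queued, whereas `toy_run_drain()` moves on to B and runs its job. Requeueing the work item instead of looping, as suggested above, would aim at the same outcome.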



Re: [PATCH] drm/sched: Drain all entities in DRM sched run job worker

2024-01-28 Thread Christian König

Am 26.01.24 um 17:29 schrieb Matthew Brost:
> On Fri, Jan 26, 2024 at 11:32:57AM +0100, Christian König wrote:
>> Am 25.01.24 um 18:30 schrieb Matthew Brost:
>>> On Thu, Jan 25, 2024 at 04:12:58PM +0100, Christian König wrote:
>>>> Am 24.01.24 um 22:08 schrieb Matthew Brost:
>>>>> All entities must be drained in the DRM scheduler run job worker to
>>>>> avoid the following case. An entity is found that is ready, no job is
>>>>> found ready on that entity, and the run job worker goes idle with other
>>>>> entities + jobs ready. Draining all ready entities (i.e. looping over
>>>>> all ready entities) in the run job worker ensures all jobs that are
>>>>> ready will be scheduled.
>>>> That doesn't make sense. drm_sched_select_entity() only returns entities
>>>> which are "ready", i.e. have a job to run.
>>>>
>>> That is what I thought too, hence my original design, but it is not
>>> exactly true. Let me explain.
>>>
>>> drm_sched_select_entity() returns an entity with a non-empty spsc queue
>>> (job in queue) and no *current* waiting dependencies [1]. Dependencies for
>>> an entity can be added when drm_sched_entity_pop_job() is called [2][3],
>>> returning a NULL job. Thus we can get into a scenario where 2 entities
>>> A and B both have jobs and no current dependencies. A's job is waiting on
>>> B's job, entity A gets selected first, a dependency gets installed in
>>> drm_sched_entity_pop_job(), run work goes idle, and now we deadlock.
>> And here is the real problem. run work doesn't go idle in that moment.
>>
>> drm_sched_run_job_work() should restart itself until there is either no
>> more space in the ring buffer or it can't find a ready entity any more.
>>
>> At least that was the original design when that was all still driven by a
>> kthread.
>>
>> It may well be that we messed this up when switching from kthread to a
>> work item.
>>
> Right, that's what this patch does - the run worker does not go idle until
> no ready entities are found. That was incorrect in the original patch
> and fixed here. Do you have any issues with this fix? It has been tested
> 3 times and clearly fixes the issue.

Ah! Yes, in this case the patch here is a little bit ugly as well.

The original idea was that run_job restarts so that we are able to pause
the submission thread without searching for an entity to submit more.

I strongly suggest replacing the while loop with a call to
drm_sched_run_job_queue() so that when the entity can't provide a job we
just restart the queuing work.

Regards,
Christian.

>
> Matt
>
>> Regards,
>> Christian.
>>
>>> The proper solution is to loop over all ready entities until one with a
>>> job is found via drm_sched_entity_pop_job() and then requeue the run
>>> job worker. Or loop over all entities until drm_sched_select_entity()
>>> returns NULL and then let the run job worker go idle. This is what the
>>> old threaded design did too [4]. Hope this clears everything up.
>>>
>>> Matt
>>>
>>> [1] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler/sched_entity.c#L144
>>> [2] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler/sched_entity.c#L464
>>> [3] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler/sched_entity.c#L397
>>> [4] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler/sched_main.c#L1011
>>>
>>>> If that's not the case any more then you have broken something else.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> Cc: Thorsten Leemhuis 
>>>>> Reported-by: Mikhail Gavrilov 
>>>>> Closes: https://lore.kernel.org/all/CABXGCsM2VLs489CH-vF-1539-s3in37=bwuowtoeee+q26z...@mail.gmail.com/
>>>>> Reported-and-tested-by: Mario Limonciello 
>>>>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3124
>>>>> Link: https://lore.kernel.org/all/20240123021155.2775-1-mario.limoncie...@amd.com/
>>>>> Reported-by: Vlastimil Babka 
>>>>> Closes: https://lore.kernel.org/dri-devel/05ddb2da-b182-4791-8ef7-82179fd15...@amd.com/T/#m0c31d4d1b9ae9995bb880974c4f1dbaddc33a48a
>>>>> Signed-off-by: Matthew Brost 
>>>>> ---
>>>>> drivers/gpu/drm/scheduler/sched_main.c | 15 +++
>>>>> 1 file changed, 7 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index 550492a7a031..85f082396d42 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -1178,21 +1178,20 @@ static void drm_sched_run_job_work(struct work_struct *w)
>>>>>  	struct drm_sched_entity *entity;
>>>>>  	struct dma_fence *fence;
>>>>>  	struct drm_sched_fence *s_fence;
>>>>> -	struct drm_sched_job *sched_job;
>>>>> +	struct drm_sched_job *sched_job = NULL;
>>>>>  	int r;
>>>>>  	if (READ_ONCE(sched->pause_submit))
>>>>>  		return;
>>>>> -	entity = drm_sched_select_entity(sched);
>>>>> +	/* Find entity with a ready job */
>>>>> +	while (!sched_job && (entity = drm_sched_select_entity(sched))) {
>>>>> +		sched_job = drm_sched_entity_pop_job(entity);
>>>>> +		if (!sched_job)
>>>>> +			complete_all(&entity->entity_idle);
>>>>> +	}
>>>>>  	if (!entity)
>>>>> -		return;
>>>>> -
>>>>> -	sched_job = drm_sched_entity_pop_job(entity);
>>>>> -	if (!sched_job) {
>>>>> -

Re: [PATCH] dt-bindings: display: bridge: it6505: Add #sound-dai-cells

2024-01-28 Thread Chen-Yu Tsai
On Fri, Jan 26, 2024 at 6:17 PM Krzysztof Kozlowski
 wrote:
>
> On 26/01/2024 08:35, Chen-Yu Tsai wrote:
> > The ITE IT6505 display bridge can take one I2S input and transmit it
> > over the DisplayPort link.
> >
> > Add #sound-dai-cells (= 0) to the binding for it.
> >
> > Signed-off-by: Chen-Yu Tsai 
> > ---
> > The driver side changes [1] are still being worked on, but given the
> > hardware is very simple, it would be nice if we could land the binding
> > first and be able to introduce device trees that have this.
> >
> > [1] 
> > https://lore.kernel.org/linux-arm-kernel/20230730180803.22570-4-jiaxin...@mediatek.com/
> >
> >  .../devicetree/bindings/display/bridge/ite,it6505.yaml | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git 
> > a/Documentation/devicetree/bindings/display/bridge/ite,it6505.yaml 
> > b/Documentation/devicetree/bindings/display/bridge/ite,it6505.yaml
> > index 348b02f26041..7ec4decc9c21 100644
> > --- a/Documentation/devicetree/bindings/display/bridge/ite,it6505.yaml
> > +++ b/Documentation/devicetree/bindings/display/bridge/ite,it6505.yaml
> > @@ -52,6 +52,9 @@ properties:
> >  maxItems: 1
> >  description: extcon specifier for the Power Delivery
> >
> > +  "#sound-dai-cells":
> > +const: 0
>
> In such case you also want to $ref /schemas/sound/dai-common.yaml.

Ack. I assume this also means I should change "additionalProperties: false"
to "unevaluatedProperties: false" in this file.

ChenYu

> Best regards,
> Krzysztof
>


Re: [PATCH 14/17] drm/msm/dpu: modify encoder programming for CDM over DP

2024-01-28 Thread Abhinav Kumar




On 1/28/2024 10:12 PM, Dmitry Baryshkov wrote:

On Mon, 29 Jan 2024 at 07:03, Abhinav Kumar  wrote:




On 1/28/2024 7:42 PM, Dmitry Baryshkov wrote:

On Mon, 29 Jan 2024 at 04:58, Abhinav Kumar  wrote:




On 1/27/2024 9:55 PM, Dmitry Baryshkov wrote:

On Sun, 28 Jan 2024 at 07:48, Paloma Arellano  wrote:



On 1/25/2024 1:57 PM, Dmitry Baryshkov wrote:

On 25/01/2024 21:38, Paloma Arellano wrote:

Adjust the encoder format programming in the case of video mode for DP
to accommodate CDM related changes.

Signed-off-by: Paloma Arellano 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c   | 16 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h   |  8 +
 .../drm/msm/disp/dpu1/dpu_encoder_phys_vid.c  | 35 ---
 drivers/gpu/drm/msm/dp/dp_display.c   | 12 +++
 drivers/gpu/drm/msm/msm_drv.h |  9 -
 5 files changed, 75 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index b0896814c1562..99ec53446ad21 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -222,6 +222,22 @@ static u32 dither_matrix[DITHER_MATRIX_SZ] = {
 15, 7, 13, 5, 3, 11, 1, 9, 12, 4, 14, 6, 0, 8, 2, 10
 };
 +u32 dpu_encoder_get_drm_fmt(const struct drm_encoder *drm_enc,
const struct drm_display_mode *mode)
+{
+const struct dpu_encoder_virt *dpu_enc;
+const struct msm_display_info *disp_info;
+struct msm_drm_private *priv;
+
+dpu_enc = to_dpu_encoder_virt(drm_enc);
+disp_info = &dpu_enc->disp_info;
+priv = drm_enc->dev->dev_private;
+
+if (disp_info->intf_type == INTF_DP &&
+ msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]],
mode))


This should not require interacting with DP. If we got here, we must
be sure that 4:2:0 is supported and can be configured.

Ack. Will drop this function and only check for if the mode is YUV420.



+return DRM_FORMAT_YUV420;
+
+return DRM_FORMAT_RGB888;
+}
   bool dpu_encoder_is_widebus_enabled(const struct drm_encoder
*drm_enc)
 {
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
index 7b4afa71f1f96..62255d0aa4487 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
@@ -162,6 +162,14 @@ int dpu_encoder_get_vsync_count(struct
drm_encoder *drm_enc);
  */
 bool dpu_encoder_is_widebus_enabled(const struct drm_encoder
*drm_enc);
 +/**
+ * dpu_encoder_get_drm_fmt - return DRM fourcc format
+ * @drm_enc:Pointer to previously created drm encoder structure
+ * @mode:Corresponding drm_display_mode for dpu encoder
+ */
+u32 dpu_encoder_get_drm_fmt(const struct drm_encoder *drm_enc,
+const struct drm_display_mode *mode);
+
 /**
  * dpu_encoder_get_crc_values_cnt - get number of physical encoders
contained
  *in virtual encoder that can collect CRC values
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
index e284bf448bdda..a1dde0ff35dc8 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
@@ -234,6 +234,7 @@ static void
dpu_encoder_phys_vid_setup_timing_engine(
 {
 struct drm_display_mode mode;
 struct dpu_hw_intf_timing_params timing_params = { 0 };
+struct dpu_hw_cdm *hw_cdm;
 const struct dpu_format *fmt = NULL;
 u32 fmt_fourcc = DRM_FORMAT_RGB888;
 unsigned long lock_flags;
@@ -254,17 +255,26 @@ static void
dpu_encoder_phys_vid_setup_timing_engine(
 DPU_DEBUG_VIDENC(phys_enc, "enabling mode:\n");
 drm_mode_debug_printmodeline(&mode);
 -if (phys_enc->split_role != ENC_ROLE_SOLO) {
+hw_cdm = phys_enc->hw_cdm;
+if (hw_cdm) {
+intf_cfg.cdm = hw_cdm->idx;
+fmt_fourcc = dpu_encoder_get_drm_fmt(phys_enc->parent, &mode);
+}
+
+if (phys_enc->split_role != ENC_ROLE_SOLO ||
+dpu_encoder_get_drm_fmt(phys_enc->parent, &mode) ==
DRM_FORMAT_YUV420) {
 mode.hdisplay >>= 1;
 mode.htotal >>= 1;
 mode.hsync_start >>= 1;
 mode.hsync_end >>= 1;
+mode.hskew >>= 1;


Separate patch.

Ack.



   DPU_DEBUG_VIDENC(phys_enc,
-"split_role %d, halve horizontal %d %d %d %d\n",
+"split_role %d, halve horizontal %d %d %d %d %d\n",
 phys_enc->split_role,
 mode.hdisplay, mode.htotal,
-mode.hsync_start, mode.hsync_end);
+mode.hsync_start, mode.hsync_end,
+mode.hskew);
 }
   drm_mode_to_intf_timing_params(phys_enc, &mode, &timing_params);
@@ -412,8 +422,15 @@ static int dpu_encoder_phys_vid_control_vblank_irq(
 static void dpu_encoder_phys_vid_enable(struct dpu_encoder_phys
*phys_enc)
 {
 

[PATCH v4, 21/22] media: mediatek: vcodec: move vdec init interface to setup callback

2024-01-28 Thread Yunfei Dong
Get the secure video playback (svp) flag when the output buffer is
requested, then call the init interface to initialize the svp parameters
in optee-os.

Signed-off-by: Yunfei Dong 
---
 .../mediatek/vcodec/decoder/mtk_vcodec_dec.c  | 139 +++---
 1 file changed, 89 insertions(+), 50 deletions(-)

diff --git a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c
index 5d876a31e566..667005ff49c0 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c
@@ -184,6 +184,69 @@ void mtk_vcodec_dec_set_default_params(struct 
mtk_vcodec_dec_ctx *ctx)
q_data->bytesperline[1] = q_data->coded_width;
 }
 
+static int mtk_vcodec_dec_init_pic_info(struct mtk_vcodec_dec_ctx *ctx, enum 
v4l2_buf_type type)
+{
+   const struct mtk_vcodec_dec_pdata *dec_pdata = ctx->dev->vdec_pdata;
+   struct mtk_q_data *q_data;
+   int ret;
+
+   if (!ctx->current_codec)
+   return 0;
+
+   if (V4L2_TYPE_IS_OUTPUT(type) && ctx->state == MTK_STATE_FREE) {
+   q_data = mtk_vdec_get_q_data(ctx, 
V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE);
+   if (!q_data)
+   return -EINVAL;
+
+   ret = vdec_if_init(ctx, q_data->fmt->fourcc);
+   if (ret) {
+   mtk_v4l2_vdec_err(ctx, "[%d]: vdec_if_init() fail 
ret=%d",
+ ctx->id, ret);
+   return -EINVAL;
+   }
+   ctx->state = MTK_STATE_INIT;
+   }
+
+   if (!dec_pdata->uses_stateless_api)
+   return 0;
+
+   /*
+* If get pic info fail, need to use the default pic info params, or
+* v4l2-compliance will fail
+*/
+   ret = vdec_if_get_param(ctx, GET_PARAM_PIC_INFO, &ctx->picinfo);
+   if (ret)
+   mtk_v4l2_vdec_err(ctx, "[%d]Error!! Get GET_PARAM_PICTURE_INFO 
Fail",
+ ctx->id);
+
+   q_data = mtk_vdec_get_q_data(ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
+   if (q_data->fmt->num_planes == 1) {
+   q_data->sizeimage[0] = ctx->picinfo.fb_sz[0] + 
ctx->picinfo.fb_sz[1];
+   q_data->bytesperline[0] = ctx->picinfo.buf_w;
+   } else {
+   if (ctx->is_secure_playback)
+   q_data->sizeimage[0] = ctx->picinfo.fb_sz[0] + 
ctx->picinfo.fb_sz[1];
+   else
+   q_data->sizeimage[0] = ctx->picinfo.fb_sz[0];
+
+   q_data->bytesperline[0] = ctx->picinfo.buf_w;
+   q_data->sizeimage[1] = ctx->picinfo.fb_sz[1];
+   q_data->bytesperline[1] = ctx->picinfo.buf_w;
+   }
+
+   q_data->coded_width = ctx->picinfo.buf_w;
+   q_data->coded_height = ctx->picinfo.buf_h;
+
+   ctx->last_decoded_picinfo = ctx->picinfo;
+   mtk_v4l2_vdec_dbg(2, ctx,
+ "[%d] init() plane:%d wxh=%dx%d pic wxh=%dx%d 
sz=0x%x_0x%x",
+ ctx->id, q_data->fmt->num_planes,
+ ctx->picinfo.buf_w, ctx->picinfo.buf_h,
+ ctx->picinfo.pic_w, ctx->picinfo.pic_h,
+ q_data->sizeimage[0], q_data->sizeimage[1]);
+   return 0;
+}
+
 static int vidioc_vdec_qbuf(struct file *file, void *priv,
struct v4l2_buffer *buf)
 {
@@ -479,17 +542,7 @@ static int vidioc_vdec_s_fmt(struct file *file, void *priv,
ctx->ycbcr_enc = pix_mp->ycbcr_enc;
ctx->quantization = pix_mp->quantization;
ctx->xfer_func = pix_mp->xfer_func;
-
ctx->current_codec = fmt->fourcc;
-   if (ctx->state == MTK_STATE_FREE) {
-   ret = vdec_if_init(ctx, q_data->fmt->fourcc);
-   if (ret) {
-   mtk_v4l2_vdec_err(ctx, "[%d]: vdec_if_init() 
fail ret=%d",
- ctx->id, ret);
-   return -EINVAL;
-   }
-   ctx->state = MTK_STATE_INIT;
-   }
} else {
ctx->capture_fourcc = fmt->fourcc;
}
@@ -502,46 +555,11 @@ static int vidioc_vdec_s_fmt(struct file *file, void 
*priv,
ctx->picinfo.pic_w = pix_mp->width;
ctx->picinfo.pic_h = pix_mp->height;
 
-   /*
-* If get pic info fail, need to use the default pic info 
params, or
-* v4l2-compliance will fail
-*/
-   ret = vdec_if_get_param(ctx, GET_PARAM_PIC_INFO, &ctx->picinfo);
-   if (ret) {
-   mtk_v4l2_vdec_err(ctx, "[%d]Error!! Get 
GET_PARAM_PICTURE_INFO Fail",
- ctx->id);
-   }
-
-   ctx->last_decoded_picinfo = ctx->picinfo;
-
-   

[PATCH v4, 17/22] media: mediatek: vcodec: re-construct h264 driver to support svp mode

2024-01-28 Thread Yunfei Dong
The secure buffer size is needed to convert a secure handle to a secure
pa in optee-os; re-construct the vsi struct to store each secure buffer
size.

Separate the svp and normal wait-interrupt conditions, since svp mode
waits for the hardware interrupt in optee-os.

Signed-off-by: Yunfei Dong 
---
 .../decoder/vdec/vdec_h264_req_multi_if.c | 261 +++---
 .../mediatek/vcodec/decoder/vdec_msg_queue.c  |   9 +-
 2 files changed, 168 insertions(+), 102 deletions(-)

diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c
index 4967e0f0984d..a1a68487131c 100644
--- 
a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c
+++ 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c
@@ -60,14 +60,36 @@ struct vdec_h264_slice_lat_dec_param {
  * @crc:   Used to check whether hardware's status is right
  */
 struct vdec_h264_slice_info {
+   u64 wdma_end_addr_offset;
u16 nal_info;
u16 timeout;
-   u32 bs_buf_size;
-   u64 bs_buf_addr;
-   u64 y_fb_dma;
-   u64 c_fb_dma;
u64 vdec_fb_va;
u32 crc[8];
+   u32 reserved;
+};
+
+/*
+ * struct vdec_h264_slice_mem - memory address and size
+ */
+struct vdec_h264_slice_mem {
+   union {
+   u64 buf;
+   u64 dma_addr;
+   };
+   union {
+   size_t size;
+   u64 dma_addr_end;
+   };
+};
+
+/**
+ * struct vdec_h264_slice_fb - frame buffer for decoding
+ * @y:  current y buffer address info
+ * @c:  current c buffer address info
+ */
+struct vdec_h264_slice_fb {
+   struct vdec_h264_slice_mem y;
+   struct vdec_h264_slice_mem c;
 };
 
 /**
@@ -92,18 +114,16 @@ struct vdec_h264_slice_info {
  */
 struct vdec_h264_slice_vsi {
/* LAT dec addr */
-   u64 wdma_err_addr;
-   u64 wdma_start_addr;
-   u64 wdma_end_addr;
-   u64 slice_bc_start_addr;
-   u64 slice_bc_end_addr;
-   u64 row_info_start_addr;
-   u64 row_info_end_addr;
-   u64 trans_start;
-   u64 trans_end;
-   u64 wdma_end_addr_offset;
+   struct vdec_h264_slice_mem bs;
+   struct vdec_h264_slice_fb fb;
 
-   u64 mv_buf_dma[H264_MAX_MV_NUM];
+   struct vdec_h264_slice_mem ube;
+   struct vdec_h264_slice_mem trans;
+   struct vdec_h264_slice_mem row_info;
+   struct vdec_h264_slice_mem err_map;
+   struct vdec_h264_slice_mem slice_bc;
+
+   struct vdec_h264_slice_mem mv_buf_dma[H264_MAX_MV_NUM];
struct vdec_h264_slice_info dec;
struct vdec_h264_slice_lat_dec_param h264_slice_params;
 };
@@ -392,6 +412,100 @@ static void vdec_h264_slice_get_crop_info(struct 
vdec_h264_slice_inst *inst,
   cr->left, cr->top, cr->width, cr->height);
 }
 
+static void vdec_h264_slice_setup_lat_buffer(struct vdec_h264_slice_inst *inst,
+struct mtk_vcodec_mem *bs,
+struct vdec_lat_buf *lat_buf)
+{
+   struct mtk_vcodec_mem *mem;
+   int i;
+
+   inst->vsi->bs.dma_addr = (u64)bs->dma_addr;
+   inst->vsi->bs.size = bs->size;
+
+   for (i = 0; i < H264_MAX_MV_NUM; i++) {
+   mem = &inst->mv_buf[i];
+   inst->vsi->mv_buf_dma[i].dma_addr = mem->dma_addr;
+   inst->vsi->mv_buf_dma[i].size = mem->size;
+   }
+   inst->vsi->ube.dma_addr = lat_buf->ctx->msg_queue.wdma_addr.dma_addr;
+   inst->vsi->ube.size = lat_buf->ctx->msg_queue.wdma_addr.size;
+
+   inst->vsi->row_info.dma_addr = 0;
+   inst->vsi->row_info.size = 0;
+
+   inst->vsi->err_map.dma_addr = lat_buf->wdma_err_addr.dma_addr;
+   inst->vsi->err_map.size = lat_buf->wdma_err_addr.size;
+
+   inst->vsi->slice_bc.dma_addr = lat_buf->slice_bc_addr.dma_addr;
+   inst->vsi->slice_bc.size = lat_buf->slice_bc_addr.size;
+
+   inst->vsi->trans.dma_addr_end = inst->ctx->msg_queue.wdma_rptr_addr;
+   inst->vsi->trans.dma_addr = inst->ctx->msg_queue.wdma_wptr_addr;
+}
+
+static int vdec_h264_slice_setup_core_buffer(struct vdec_h264_slice_inst *inst,
+struct vdec_h264_slice_share_info 
*share_info,
+struct vdec_lat_buf *lat_buf)
+{
+   struct mtk_vcodec_mem *mem;
+   struct mtk_vcodec_dec_ctx *ctx = inst->ctx;
+   struct vb2_v4l2_buffer *vb2_v4l2;
+   struct vdec_fb *fb;
+   u64 y_fb_dma, c_fb_dma = 0;
+   int i;
+
+   fb = ctx->dev->vdec_pdata->get_cap_buffer(ctx);
+   if (!fb) {
+   mtk_vdec_err(ctx, "fb buffer is NULL");
+   return -EBUSY;
+   }
+
+   y_fb_dma = (u64)fb->base_y.dma_addr;
+   if (!ctx->is_secure_playback) {
+   if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 1)
+   c_fb_dma =
+   y_fb_dma + 

[PATCH v4, 16/22] media: mediatek: vcodec: support one plane capture buffer

2024-01-28 Thread Yunfei Dong
The capture buffer has two planes for format MM21, but user space only
allocates secure memory for plane[0], and its size is Y data + uv data.
The driver needs to support one-plane decoding for svp mode.
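
The size rule described above could be sketched as follows. This is a standalone illustration, assuming a hypothetical helper `toy_plane0_size()` that does not exist in the driver; `y_size` and `uv_size` stand in for `picinfo.fb_sz[0]` and `picinfo.fb_sz[1]`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the driver: size to report for
 * plane[0] of the capture buffer.
 *
 * When the format has a single plane, or when secure playback is in
 * use, plane[0] has to hold both the Y data and the uv data. Otherwise
 * plane[0] only holds the Y data and the uv data lives in plane[1].
 */
static size_t toy_plane0_size(size_t y_size, size_t uv_size,
			      unsigned int num_planes, bool secure_playback)
{
	if (num_planes == 1 || secure_playback)
		return y_size + uv_size;

	return y_size;
}
```

For a two-plane format, `toy_plane0_size(y, uv, 2, false)` yields `y`, while the same call with `secure_playback` set yields `y + uv`, which matches the condition the patch adds in vidioc_vdec_g_fmt().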

Signed-off-by: Yunfei Dong 
---
 .../mediatek/vcodec/decoder/mtk_vcodec_dec.c  |  7 -
 .../vcodec/decoder/mtk_vcodec_dec_stateless.c | 26 ++-
 .../decoder/vdec/vdec_h264_req_common.c   | 18 ++---
 .../mediatek/vcodec/decoder/vdec_drv_if.c |  4 +--
 4 files changed, 31 insertions(+), 24 deletions(-)

diff --git a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c
index 604fdc8ee3ce..5d876a31e566 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c
@@ -653,7 +653,12 @@ static int vidioc_vdec_g_fmt(struct file *file, void *priv,
 * So we just return picinfo yet, and update picinfo in
 * stop_streaming hook function
 */
-   q_data->sizeimage[0] = ctx->picinfo.fb_sz[0];
+
+   if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 1 || 
ctx->is_secure_playback)
+   q_data->sizeimage[0] = ctx->picinfo.fb_sz[0] + 
ctx->picinfo.fb_sz[1];
+   else
+   q_data->sizeimage[0] = ctx->picinfo.fb_sz[0];
+
q_data->sizeimage[1] = ctx->picinfo.fb_sz[1];
q_data->bytesperline[0] = ctx->last_decoded_picinfo.buf_w;
q_data->bytesperline[1] = ctx->last_decoded_picinfo.buf_w;
diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_stateless.c 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_stateless.c
index cc42c942eb8a..707ed57a412e 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_stateless.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_stateless.c
@@ -285,14 +285,14 @@ static struct vdec_fb *vdec_get_cap_buffer(struct 
mtk_vcodec_dec_ctx *ctx)
framebuf = container_of(vb2_v4l2, struct mtk_video_dec_buf, m2m_buf.vb);
 
pfb = &framebuf->frame_buffer;
-   pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
+   if (!ctx->is_secure_playback)
+   pfb->base_y.va = vb2_plane_vaddr(dst_buf, 0);
pfb->base_y.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 0);
pfb->base_y.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[0];
 
-   if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2) {
+   if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2 && 
!ctx->is_secure_playback) {
pfb->base_c.va = vb2_plane_vaddr(dst_buf, 1);
-   pfb->base_c.dma_addr =
-   vb2_dma_contig_plane_dma_addr(dst_buf, 1);
+   pfb->base_c.dma_addr = vb2_dma_contig_plane_dma_addr(dst_buf, 
1);
pfb->base_c.size = ctx->q_data[MTK_Q_DATA_DST].sizeimage[1];
}
mtk_v4l2_vdec_dbg(1, ctx,
@@ -339,16 +339,18 @@ static void mtk_vdec_worker(struct work_struct *work)
mtk_v4l2_vdec_dbg(3, ctx, "[%d] (%d) id=%d, vb=%p", ctx->id,
  vb2_src->vb2_queue->type, vb2_src->index, vb2_src);
 
-   bs_src->va = vb2_plane_vaddr(vb2_src, 0);
-   bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
-   bs_src->size = (size_t)vb2_src->planes[0].bytesused;
-   if (!bs_src->va) {
-   v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
-   mtk_v4l2_vdec_err(ctx, "[%d] id=%d source buffer is NULL", 
ctx->id,
- vb2_src->index);
-   return;
+   if (!ctx->is_secure_playback) {
+   bs_src->va = vb2_plane_vaddr(vb2_src, 0);
+   if (!bs_src->va) {
+   v4l2_m2m_job_finish(dev->m2m_dev_dec, ctx->m2m_ctx);
+   mtk_v4l2_vdec_err(ctx, "[%d] id=%d source buffer is 
NULL", ctx->id,
+ vb2_src->index);
+   return;
+   }
}
 
+   bs_src->dma_addr = vb2_dma_contig_plane_dma_addr(vb2_src, 0);
+   bs_src->size = (size_t)vb2_src->planes[0].bytesused;
mtk_v4l2_vdec_dbg(3, ctx, "[%d] Bitstream VA=%p DMA=%pad Size=%zx 
vb=%p",
  ctx->id, bs_src->va, &bs_src->dma_addr, bs_src->size, 
vb2_src);
/* Apply request controls. */
diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_common.c 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_common.c
index 5ca20d75dc8e..5e0d55218363 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_common.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_common.c
@@ -79,15 +79,15 @@ void mtk_vdec_h264_fill_dpb_info(struct mtk_vcodec_dec_ctx 
*ctx,
vb2_v4l2 = container_of(vb, struct vb2_v4l2_buffer, 

[PATCH v4,09/22] media: mediatek: vcodec: allocate tee share memory

2024-01-28 Thread Yunfei Dong
Allocate two shared memory buffers for each of the lat and core hardware,
used to share information with optee-os. The msg buffer is used to send ipi
commands to and get ack commands from optee-os; the data buffer is used to
store the vsi information used for hardware decode.

Signed-off-by: Yunfei Dong 
---
 .../vcodec/decoder/mtk_vcodec_dec_optee.c | 80 ++-
 .../vcodec/decoder/mtk_vcodec_dec_optee.h | 32 
 2 files changed, 111 insertions(+), 1 deletion(-)

diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.c 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.c
index 38d9c1c1785a..611fb0e56480 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.c
@@ -47,13 +47,69 @@ int mtk_vcodec_dec_optee_private_init(struct 
mtk_vcodec_dec_dev *vcodec_dev)
 }
 EXPORT_SYMBOL_GPL(mtk_vcodec_dec_optee_private_init);
 
+static void mtk_vcodec_dec_optee_deinit_memref(struct mtk_vdec_optee_ca_info 
*ca_info,
+  enum mtk_vdec_optee_data_index 
data_index)
+{
+   tee_shm_free(ca_info->shm_memref[data_index].msg_shm);
+}
+
+static int mtk_vcodec_dec_optee_init_memref(struct tee_context *tee_vdec_ctx,
+   struct mtk_vdec_optee_ca_info 
*ca_info,
+   enum mtk_vdec_optee_data_index 
data_index)
+{
+   struct mtk_vdec_optee_shm_memref *shm_memref;
+   int alloc_size = 0, err = 0;
+   u64 shm_param_type = 0;
+   bool copy_buffer;
+
+   switch (data_index) {
+   case OPTEE_MSG_INDEX:
+   shm_param_type = TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INOUT;
+   alloc_size = MTK_VDEC_OPTEE_MSG_SIZE;
+   copy_buffer = true;
+   break;
+   case OPTEE_DATA_INDEX:
+   shm_param_type = TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INOUT;
+   alloc_size = MTK_VDEC_OPTEE_HW_SIZE;
+   copy_buffer = false;
+   break;
+   default:
+   pr_err(MTK_DBG_VCODEC_STR "tee invalid data_index: %d.\n", 
data_index);
+   return -EINVAL;
+   }
+
+   shm_memref = &ca_info->shm_memref[data_index];
+
+   /* Allocate dynamic shared memory with decoder TA */
+   shm_memref->msg_shm_size = alloc_size;
+   shm_memref->param_type = shm_param_type;
+   shm_memref->copy_to_ta = copy_buffer;
+   shm_memref->msg_shm = tee_shm_alloc_kernel_buf(tee_vdec_ctx, 
shm_memref->msg_shm_size);
+   if (IS_ERR(shm_memref->msg_shm)) {
+   pr_err(MTK_DBG_VCODEC_STR "tee alloc buf fail: 
data_index:%d.\n", data_index);
+   return -ENOMEM;
+   }
+
+   shm_memref->msg_shm_ca_buf = tee_shm_get_va(shm_memref->msg_shm, 0);
+   if (IS_ERR(shm_memref->msg_shm_ca_buf)) {
+   pr_err(MTK_DBG_VCODEC_STR "tee get shm va fail: 
data_index:%d.\n", data_index);
+   err = PTR_ERR(shm_memref->msg_shm_ca_buf);
+   goto err_get_msg_va;
+   }
+
+   return err;
+err_get_msg_va:
+   tee_shm_free(shm_memref->msg_shm);
+   return err;
+}
+
 static int mtk_vcodec_dec_optee_init_hw_info(struct mtk_vdec_optee_private 
*optee_private,
 enum mtk_vdec_hw_id hardware_index)
 {
struct device *dev = &optee_private->vcodec_dev->plat_dev->dev;
struct tee_ioctl_open_session_arg session_arg;
struct mtk_vdec_optee_ca_info *ca_info;
-   int err = 0, session_func;
+   int err, i, j, session_func;
 
/* Open lat and core session with vdec TA. */
switch (hardware_index) {
@@ -87,6 +143,24 @@ static int mtk_vcodec_dec_optee_init_hw_info(struct 
mtk_vdec_optee_private *opte
dev_dbg(dev, MTK_DBG_VCODEC_STR "open vdec tee session hw_id:%d 
session_id=%x.\n",
hardware_index, ca_info->vdec_session_id);
 
+   /* Allocate dynamic shared memory with decoder TA */
+   for (i = 0; i < OPTEE_MAX_INDEX; i++) {
+   err = 
mtk_vcodec_dec_optee_init_memref(optee_private->tee_vdec_ctx, ca_info, i);
+   if (err) {
+   dev_err(dev, MTK_DBG_VCODEC_STR "init vdec memref 
failed: %d.\n", i);
+   goto err_init_memref;
+   }
+   }
+
+   return err;
+err_init_memref:
+   if (i != 0) {
+   for (j = 0; j < i; j++)
+   mtk_vcodec_dec_optee_deinit_memref(ca_info, j);
+   }
+
+   tee_client_close_session(optee_private->tee_vdec_ctx, 
ca_info->vdec_session_id);
+
return err;
 }
 
@@ -94,12 +168,16 @@ static void mtk_vcodec_dec_optee_deinit_hw_info(struct 
mtk_vdec_optee_private *o
enum mtk_vdec_hw_id hw_id)
 {
struct mtk_vdec_optee_ca_info *ca_info;
+   int i;
 
if (hw_id == MTK_VDEC_LAT0)
ca_info = 
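
The error path in mtk_vcodec_dec_optee_init_hw_info() above releases only the memrefs that were already set up before the failing index. A minimal user-space sketch of that unwind pattern follows. It assumes malloc()/free() in place of the TEE shared memory calls, and every sketch_* name is hypothetical rather than part of the driver.

```c
#include <stdlib.h>

#define SKETCH_MAX_INDEX 2	/* hypothetical: one msg buffer, one data buffer */

struct sketch_ca_info {
	void *shm[SKETCH_MAX_INDEX];
};

/* Hypothetical allocator: a size of 0 stands in for an allocation failure. */
static int sketch_init_memref(struct sketch_ca_info *ca, int index, size_t size)
{
	if (size == 0)
		return -1;
	ca->shm[index] = malloc(size);
	return ca->shm[index] ? 0 : -1;
}

void sketch_deinit_memref(struct sketch_ca_info *ca, int index)
{
	free(ca->shm[index]);
	ca->shm[index] = NULL;
}

/* Set up every buffer; on failure release only the ones already allocated. */
int sketch_init_all(struct sketch_ca_info *ca, const size_t *sizes)
{
	int i, err;

	for (i = 0; i < SKETCH_MAX_INDEX; i++) {
		err = sketch_init_memref(ca, i, sizes[i]);
		if (err)
			goto err_unwind;
	}
	return 0;

err_unwind:
	/* i is the failing index, so walk back over 0..i-1 only. */
	while (--i >= 0)
		sketch_deinit_memref(ca, i);
	return err;
}
```

The backwards while loop covers the i == 0 case without a separate check, which is one way to fold the "if (i != 0)" test and the inner for loop into a single statement.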

[PATCH v4, 13/22] media: mediatek: vcodec: using shared memory as vsi address

2024-01-28 Thread Yunfei Dong
In svp mode the vsi buffer is allocated from tee shared memory, so the
shared memory must be used as the vsi address to store vsi data.

Signed-off-by: Yunfei Dong 
---
 .../vcodec/decoder/vdec/vdec_h264_req_multi_if.c | 9 +++--
 .../media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c | 8 ++--
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c
index 0e741e0dc8ba..4967e0f0984d 100644
--- 
a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c
+++ 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c
@@ -417,8 +417,13 @@ static int vdec_h264_slice_init(struct mtk_vcodec_dec_ctx 
*ctx)
 
vsi_size = round_up(sizeof(struct vdec_h264_slice_vsi), 
VCODEC_DEC_ALIGNED_64);
inst->vsi = inst->vpu.vsi;
-   inst->vsi_core =
-   (struct vdec_h264_slice_vsi *)(((char *)inst->vpu.vsi) + 
vsi_size);
+   if (ctx->is_secure_playback)
+   inst->vsi_core =
+   
mtk_vcodec_dec_get_shm_buffer_va(ctx->dev->optee_private, MTK_VDEC_CORE,
+OPTEE_DATA_INDEX);
+   else
+   inst->vsi_core =
+   (struct vdec_h264_slice_vsi *)(((char *)inst->vpu.vsi) 
+ vsi_size);
inst->resolution_changed = true;
inst->realloc_mv_buf = true;
 
diff --git a/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
index 5336769a3fb5..5c31641e9abe 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
@@ -18,8 +18,12 @@ static void handle_init_ack_msg(const struct 
vdec_vpu_ipi_init_ack *msg)
 
/* mapping VPU address to kernel virtual address */
/* the content in vsi is initialized to 0 in VPU */
-   vpu->vsi = mtk_vcodec_fw_map_dm_addr(vpu->ctx->dev->fw_handler,
-msg->vpu_inst_addr);
+   if (vpu->ctx->is_secure_playback)
+   vpu->vsi = 
mtk_vcodec_dec_get_shm_buffer_va(vpu->ctx->dev->optee_private,
+   MTK_VDEC_LAT0, 
OPTEE_DATA_INDEX);
+   else
+   vpu->vsi = mtk_vcodec_fw_map_dm_addr(vpu->ctx->dev->fw_handler,
+msg->vpu_inst_addr);
vpu->inst_addr = msg->vpu_inst_addr;
 
mtk_vdec_debug(vpu->ctx, "- vpu_inst_addr = 0x%x", vpu->inst_addr);
-- 
2.18.0
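
For reference, the address selection in vdec_h264_slice_init() can be sketched in user space as below. In normal mode the core vsi sits right after the lat vsi, at an offset rounded up to 64 bytes; in svp mode it is whatever buffer the shared memory lookup returns. The sketch_* names, the 100-byte struct and the plain pointer standing in for mtk_vcodec_dec_get_shm_buffer_va() are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ALIGN_64 64	/* stands in for VCODEC_DEC_ALIGNED_64 */

/* Hypothetical vsi layout, 100 bytes so the rounding is visible. */
struct sketch_vsi {
	uint8_t payload[100];
};

/* Round x up to the next multiple of a power-of-two alignment. */
size_t sketch_round_up(size_t x, size_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Pick the core vsi address: the shared buffer in secure mode, otherwise
 * an aligned offset into the region that starts with the lat vsi.
 */
void *sketch_core_vsi(void *lat_vsi, void *secure_shm, int is_secure)
{
	size_t vsi_size = sketch_round_up(sizeof(struct sketch_vsi), SKETCH_ALIGN_64);

	if (is_secure)
		return secure_shm;
	return (char *)lat_vsi + vsi_size;
}
```

With a 100-byte vsi the normal-mode core vsi starts 128 bytes after the lat vsi.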



[PATCH v4, 19/22] media: mediatek: vcodec: disable wait interrupt for svp mode

2024-01-28 Thread Yunfei Dong
In svp mode optee-os waits for the decode interrupt, so interrupt
handling must be disabled in the kernel; otherwise the kernel could clear
the interrupt before optee-os sees it.

Signed-off-by: Yunfei Dong 
---
 .../vcodec/decoder/mtk_vcodec_dec_hw.c| 34 +--
 .../vcodec/decoder/mtk_vcodec_dec_pm.c|  6 +-
 .../decoder/vdec/vdec_h264_req_multi_if.c | 57 +++
 3 files changed, 54 insertions(+), 43 deletions(-)

diff --git a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_hw.c 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_hw.c
index 881d5de41e05..1982c088c6da 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_hw.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_hw.c
@@ -72,26 +72,28 @@ static irqreturn_t mtk_vdec_hw_irq_handler(int irq, void 
*priv)
 
ctx = mtk_vcodec_get_curr_ctx(dev->main_dev, dev->hw_idx);
 
-   /* check if HW active or not */
-   cg_status = readl(dev->reg_base[VDEC_HW_SYS] + VDEC_HW_ACTIVE_ADDR);
-   if (cg_status & VDEC_HW_ACTIVE_MASK) {
-   mtk_v4l2_vdec_err(ctx, "vdec active is not 0x0 (0x%08x)", 
cg_status);
-   return IRQ_HANDLED;
-   }
+   if (!ctx->is_secure_playback) {
+   /* check if HW active or not */
+   cg_status = readl(dev->reg_base[VDEC_HW_SYS] + 
VDEC_HW_ACTIVE_ADDR);
+   if (cg_status & VDEC_HW_ACTIVE_MASK) {
+   mtk_v4l2_vdec_err(ctx, "vdec active is not 0x0 
(0x%08x)", cg_status);
+   return IRQ_HANDLED;
+   }
 
-   dec_done_status = readl(vdec_misc_addr);
-   if ((dec_done_status & MTK_VDEC_IRQ_STATUS_DEC_SUCCESS) !=
-   MTK_VDEC_IRQ_STATUS_DEC_SUCCESS)
-   return IRQ_HANDLED;
+   dec_done_status = readl(vdec_misc_addr);
+   if ((dec_done_status & MTK_VDEC_IRQ_STATUS_DEC_SUCCESS) !=
+   MTK_VDEC_IRQ_STATUS_DEC_SUCCESS)
+   return IRQ_HANDLED;
 
-   /* clear interrupt */
-   writel(dec_done_status | VDEC_IRQ_CFG, vdec_misc_addr);
-   writel(dec_done_status & ~VDEC_IRQ_CLR, vdec_misc_addr);
+   /* clear interrupt */
+   writel(dec_done_status | VDEC_IRQ_CFG, vdec_misc_addr);
+   writel(dec_done_status & ~VDEC_IRQ_CLR, vdec_misc_addr);
 
-   wake_up_dec_ctx(ctx, MTK_INST_IRQ_RECEIVED, dev->hw_idx);
+   wake_up_dec_ctx(ctx, MTK_INST_IRQ_RECEIVED, dev->hw_idx);
 
-   mtk_v4l2_vdec_dbg(3, ctx, "wake up ctx %d, dec_done_status=%x",
- ctx->id, dec_done_status);
+   mtk_v4l2_vdec_dbg(3, ctx, "wake up ctx %d, dec_done_status=%x",
+ ctx->id, dec_done_status);
+   }
 
return IRQ_HANDLED;
 }
diff --git a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_pm.c 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_pm.c
index aefd3e9e3061..a94eda936f16 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_pm.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_pm.c
@@ -238,7 +238,8 @@ void mtk_vcodec_dec_enable_hardware(struct 
mtk_vcodec_dec_ctx *ctx, int hw_idx)
mtk_vcodec_dec_child_dev_on(ctx->dev, MTK_VDEC_LAT0);
mtk_vcodec_dec_child_dev_on(ctx->dev, hw_idx);
 
-   mtk_vcodec_dec_enable_irq(ctx->dev, hw_idx);
+   if (!ctx->is_secure_playback)
+   mtk_vcodec_dec_enable_irq(ctx->dev, hw_idx);
 
if (IS_VDEC_INNER_RACING(ctx->dev->dec_capability))
mtk_vcodec_load_racing_info(ctx);
@@ -250,7 +251,8 @@ void mtk_vcodec_dec_disable_hardware(struct 
mtk_vcodec_dec_ctx *ctx, int hw_idx)
if (IS_VDEC_INNER_RACING(ctx->dev->dec_capability))
mtk_vcodec_record_racing_info(ctx);
 
-   mtk_vcodec_dec_disable_irq(ctx->dev, hw_idx);
+   if (!ctx->is_secure_playback)
+   mtk_vcodec_dec_disable_irq(ctx->dev, hw_idx);
 
mtk_vcodec_dec_child_dev_off(ctx->dev, hw_idx);
if (IS_VDEC_LAT_ARCH(ctx->dev->vdec_pdata->hw_arch) &&
diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c
index 2dfb3043493e..3e2270399b6c 100644
--- 
a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c
+++ 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c
@@ -593,14 +593,16 @@ static int vdec_h264_slice_core_decode(struct 
vdec_lat_buf *lat_buf)
goto vdec_dec_end;
}
 
-   /* wait decoder done interrupt */
-   timeout = mtk_vcodec_wait_for_done_ctx(inst->ctx, MTK_INST_IRQ_RECEIVED,
-  WAIT_INTR_TIMEOUT_MS, 
MTK_VDEC_CORE);
-   if (timeout)
-   mtk_vdec_err(ctx, "core decode timeout: pic_%d", 
ctx->decoded_frame_cnt);
-   inst->vsi_core->dec.timeout = !!timeout;
-
-  

[PATCH v4,20/22] media: mediatek: vcodec: support tee decoder

2024-01-28 Thread Yunfei Dong
Initialize the tee private data to support the secure decoder, and
release the tee-related information for each instance when decoding is
done.

Signed-off-by: Yunfei Dong 
---
 .../platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c | 8 
 1 file changed, 8 insertions(+)

diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c
index f47c98faf068..08e7d250487b 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c
@@ -310,6 +310,9 @@ static int fops_vcodec_release(struct file *file)
v4l2_fh_exit(&ctx->fh);
v4l2_ctrl_handler_free(&ctx->ctrl_hdl);
 
+   if (ctx->is_secure_playback)
+   mtk_vcodec_dec_optee_release(dev->optee_private);
+
mtk_vcodec_dbgfs_remove(dev, ctx->id);
list_del_init(&ctx->list);
kfree(ctx);
@@ -466,6 +469,11 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
atomic_set(&dev->dec_active_cnt, 0);
memset(dev->vdec_racing_info, 0, sizeof(dev->vdec_racing_info));
mutex_init(&dev->dec_racing_info_mutex);
+   ret = mtk_vcodec_dec_optee_private_init(dev);
+   if (ret) {
+   dev_err(&pdev->dev, "Failed to init svp private.");
+   goto err_reg_cont;
+   }
 
ret = video_register_device(vfd_dec, VFL_TYPE_VIDEO, -1);
if (ret) {
-- 
2.18.0



[PATCH v4, 22/22] media: mediatek: vcodec: support hevc svp for mt8188

2024-01-28 Thread Yunfei Dong
Change the hevc driver to support secure video playback (svp) on
mt8188. In svp mode the shared memory has to be mapped through the optee
interface, and the interrupt is waited for in optee-os.

Signed-off-by: Yunfei Dong 
---
 .../decoder/vdec/vdec_hevc_req_multi_if.c | 89 +++
 1 file changed, 54 insertions(+), 35 deletions(-)

diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_hevc_req_multi_if.c 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_hevc_req_multi_if.c
index 06ed47df693b..8bf4bc13ae2d 100644
--- 
a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_hevc_req_multi_if.c
+++ 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_hevc_req_multi_if.c
@@ -415,11 +415,14 @@ static void vdec_hevc_fill_dpb_info(struct 
mtk_vcodec_dec_ctx *ctx,
hevc_dpb_info[index].field = dpb->field_pic;
 
hevc_dpb_info[index].y_dma_addr = 
vb2_dma_contig_plane_dma_addr(vb, 0);
-   if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
-   hevc_dpb_info[index].c_dma_addr = 
vb2_dma_contig_plane_dma_addr(vb, 1);
-   else
-   hevc_dpb_info[index].c_dma_addr =
-   hevc_dpb_info[index].y_dma_addr + 
ctx->picinfo.fb_sz[0];
+   if (!ctx->is_secure_playback) {
+   if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 2)
+   hevc_dpb_info[index].c_dma_addr =
+   vb2_dma_contig_plane_dma_addr(vb, 1);
+   else
+   hevc_dpb_info[index].c_dma_addr =
+   hevc_dpb_info[index].y_dma_addr + 
ctx->picinfo.fb_sz[0];
+   }
}
 }
 
@@ -800,7 +803,7 @@ static int vdec_hevc_slice_setup_core_buffer(struct 
vdec_hevc_slice_inst *inst,
struct mtk_vcodec_dec_ctx *ctx = inst->ctx;
struct vb2_v4l2_buffer *vb2_v4l2;
struct vdec_fb *fb;
-   u64 y_fb_dma, c_fb_dma;
+   u64 y_fb_dma, c_fb_dma = 0;
int i;
 
fb = ctx->dev->vdec_pdata->get_cap_buffer(ctx);
@@ -810,18 +813,20 @@ static int vdec_hevc_slice_setup_core_buffer(struct 
vdec_hevc_slice_inst *inst,
}
 
y_fb_dma = (u64)fb->base_y.dma_addr;
-   if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 1)
-   c_fb_dma =
-   y_fb_dma + inst->ctx->picinfo.buf_w * 
inst->ctx->picinfo.buf_h;
-   else
-   c_fb_dma = (u64)fb->base_c.dma_addr;
+   if (!ctx->is_secure_playback) {
+   if (ctx->q_data[MTK_Q_DATA_DST].fmt->num_planes == 1)
+   c_fb_dma =
+   y_fb_dma + inst->ctx->picinfo.buf_w * 
inst->ctx->picinfo.buf_h;
+   else
+   c_fb_dma = (u64)fb->base_c.dma_addr;
+   }
 
mtk_vdec_debug(inst->ctx, "[hevc-core] y/c addr = 0x%llx 0x%llx", 
y_fb_dma, c_fb_dma);
 
inst->vsi_core->fb.y.dma_addr = y_fb_dma;
inst->vsi_core->fb.y.size = ctx->picinfo.fb_sz[0];
inst->vsi_core->fb.c.dma_addr = c_fb_dma;
-   inst->vsi_core->fb.y.size = ctx->picinfo.fb_sz[1];
+   inst->vsi_core->fb.c.size = ctx->picinfo.fb_sz[1];
 
inst->vsi_core->dec.vdec_fb_va = (unsigned long)fb;
 
@@ -878,8 +883,13 @@ static int vdec_hevc_slice_init(struct mtk_vcodec_dec_ctx 
*ctx)
 
vsi_size = round_up(sizeof(struct vdec_hevc_slice_vsi), 
VCODEC_DEC_ALIGNED_64);
inst->vsi = inst->vpu.vsi;
-   inst->vsi_core =
-   (struct vdec_hevc_slice_vsi *)(((char *)inst->vpu.vsi) + 
vsi_size);
+   if (ctx->is_secure_playback)
+   inst->vsi_core =
+   
mtk_vcodec_dec_get_shm_buffer_va(ctx->dev->optee_private, MTK_VDEC_CORE,
+OPTEE_DATA_INDEX);
+   else
+   inst->vsi_core =
+   (struct vdec_hevc_slice_vsi *)(((char *)inst->vpu.vsi) 
+ vsi_size);
 
inst->resolution_changed = true;
inst->realloc_mv_buf = true;
@@ -944,21 +954,22 @@ static int vdec_hevc_slice_core_decode(struct 
vdec_lat_buf *lat_buf)
goto vdec_dec_end;
}
 
-   /* wait decoder done interrupt */
-   timeout = mtk_vcodec_wait_for_done_ctx(inst->ctx, MTK_INST_IRQ_RECEIVED,
-  WAIT_INTR_TIMEOUT_MS, 
MTK_VDEC_CORE);
-   if (timeout)
-   mtk_vdec_err(ctx, "core decode timeout: pic_%d", 
ctx->decoded_frame_cnt);
-   inst->vsi_core->dec.timeout = !!timeout;
+   if (!vpu->ctx->is_secure_playback) {
+   /* wait decoder done interrupt */
+   timeout = mtk_vcodec_wait_for_done_ctx(inst->ctx, 
MTK_INST_IRQ_RECEIVED,
+  WAIT_INTR_TIMEOUT_MS, 
MTK_VDEC_CORE);
+   if (timeout)
+   mtk_vdec_err(ctx, "core decode timeout: pic_%d", 

[PATCH v4, 08/22] media: mediatek: vcodec: add tee client interface to communiate with optee-os

2024-01-28 Thread Yunfei Dong
Open a tee context to initialize the environment for communicating
with optee-os, then open tee sessions as the communication pipelines
through which lat and core send data for hardware decode.

Signed-off-by: Yunfei Dong 
---
 .../platform/mediatek/vcodec/decoder/Makefile |   1 +
 .../vcodec/decoder/mtk_vcodec_dec_drv.h   |   5 +
 .../vcodec/decoder/mtk_vcodec_dec_optee.c | 165 ++
 .../vcodec/decoder/mtk_vcodec_dec_optee.h |  73 
 4 files changed, 244 insertions(+)
 create mode 100644 
drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.c
 create mode 100644 
drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.h

diff --git a/drivers/media/platform/mediatek/vcodec/decoder/Makefile 
b/drivers/media/platform/mediatek/vcodec/decoder/Makefile
index 904cd22def84..1624933dfd5e 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/Makefile
+++ b/drivers/media/platform/mediatek/vcodec/decoder/Makefile
@@ -21,5 +21,6 @@ mtk-vcodec-dec-y := vdec/vdec_h264_if.o \
mtk_vcodec_dec_stateful.o \
mtk_vcodec_dec_stateless.o \
mtk_vcodec_dec_pm.o \
+   mtk_vcodec_dec_optee.o \
 
 mtk-vcodec-dec-hw-y := mtk_vcodec_dec_hw.o
diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
index 849b89dd205c..b1a2107f2a1e 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
@@ -11,6 +11,7 @@
 #include "../common/mtk_vcodec_dbgfs.h"
 #include "../common/mtk_vcodec_fw_priv.h"
 #include "../common/mtk_vcodec_util.h"
+#include "mtk_vcodec_dec_optee.h"
 #include "vdec_msg_queue.h"
 
 #define MTK_VCODEC_DEC_NAME"mtk-vcodec-dec"
@@ -261,6 +262,8 @@ struct mtk_vcodec_dec_ctx {
  * @dbgfs: debug log related information
  *
  * @chip_name: used to distinguish platforms and select the correct codec 
configuration values
+ *
+ * @optee_private: optee private data
  */
 struct mtk_vcodec_dec_dev {
struct v4l2_device v4l2_dev;
@@ -303,6 +306,8 @@ struct mtk_vcodec_dec_dev {
struct mtk_vcodec_dbgfs dbgfs;
 
enum mtk_vcodec_dec_chip_name chip_name;
+
+   struct mtk_vdec_optee_private *optee_private;
 };
 
 static inline struct mtk_vcodec_dec_ctx *fh_to_dec_ctx(struct v4l2_fh *fh)
diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.c 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.c
new file mode 100644
index ..38d9c1c1785a
--- /dev/null
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.c
@@ -0,0 +1,165 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2023 MediaTek Inc.
+ * Author: Yunfei Dong 
+ */
+
+#include "mtk_vcodec_dec_drv.h"
+#include "mtk_vcodec_dec_optee.h"
+
+/*
+ * Randomly generated, and must correspond to the GUID on the TA side.
+ */
+static const uuid_t mtk_vdec_lat_uuid =
+   UUID_INIT(0xBC50D971, 0xD4C9, 0x42C4,
+ 0x82, 0xCB, 0x34, 0x3F, 0xB7, 0xF3, 0x78, 0x90);
+
+static const uuid_t mtk_vdec_core_uuid =
+   UUID_INIT(0xBC50D971, 0xD4C9, 0x42C4,
+ 0x82, 0xCB, 0x34, 0x3F, 0xB7, 0xF3, 0x78, 0x91);
+
+/*
+ * Check whether this driver supports decoder TA in the TEE instance,
+ * represented by the params (ver/data) of this function.
+ */
+static int mtk_vcodec_dec_optee_match(struct tee_ioctl_version_data *ver_data, 
const void *not_used)
+{
+   if (ver_data->impl_id == TEE_IMPL_ID_OPTEE)
+   return 1;
+   else
+   return 0;
+}
+
+int mtk_vcodec_dec_optee_private_init(struct mtk_vcodec_dec_dev *vcodec_dev)
+{
+   vcodec_dev->optee_private = devm_kzalloc(&vcodec_dev->plat_dev->dev,
+
sizeof(*vcodec_dev->optee_private),
+GFP_KERNEL);
+   if (!vcodec_dev->optee_private)
+   return -ENOMEM;
+
+   vcodec_dev->optee_private->vcodec_dev = vcodec_dev;
+
+   atomic_set(&vcodec_dev->optee_private->tee_active_cnt, 0);
+   mutex_init(&vcodec_dev->optee_private->tee_mutex);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(mtk_vcodec_dec_optee_private_init);
+
+static int mtk_vcodec_dec_optee_init_hw_info(struct mtk_vdec_optee_private 
*optee_private,
+enum mtk_vdec_hw_id hardware_index)
+{
+   struct device *dev = &optee_private->vcodec_dev->plat_dev->dev;
+   struct tee_ioctl_open_session_arg session_arg;
+   struct mtk_vdec_optee_ca_info *ca_info;
+   int err = 0, session_func;
+
+   /* Open lat and core session with vdec TA. */
+   switch (hardware_index) {
+   case MTK_VDEC_LAT0:
+   export_uuid(session_arg.uuid, &mtk_vdec_lat_uuid);
+   session_func = MTK_VDEC_OPTEE_TA_LAT_SUBMIT_COMMAND;
+   ca_info = &optee_private->lat_ca;
+

[PATCH v4, 18/22] media: mediatek: vcodec: remove parse nal_info in kernel

2024-01-28 Thread Yunfei Dong
The hardware can parse the syntax to get nal_info, so there is no need
to do it on the cpu.

Signed-off-by: Yunfei Dong 
---
 .../vcodec/decoder/vdec/vdec_h264_req_multi_if.c| 13 ++---
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c
index a1a68487131c..2dfb3043493e 100644
--- 
a/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c
+++ 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec/vdec_h264_req_multi_if.c
@@ -645,11 +645,10 @@ static int vdec_h264_slice_lat_decode(void *h_vdec, 
struct mtk_vcodec_mem *bs,
struct vdec_h264_slice_inst *inst = h_vdec;
struct vdec_vpu_inst *vpu = &inst->vpu;
struct mtk_video_dec_buf *src_buf_info;
-   int nal_start_idx, err, timeout = 0;
+   int err, timeout = 0;
unsigned int data[2];
struct vdec_lat_buf *lat_buf;
struct vdec_h264_slice_share_info *share_info;
-   unsigned char *buf;
 
if (vdec_msg_queue_init(&inst->ctx->msg_queue, inst->ctx,
vdec_h264_slice_core_decode,
@@ -673,14 +672,6 @@ static int vdec_h264_slice_lat_decode(void *h_vdec, struct 
mtk_vcodec_mem *bs,
share_info = lat_buf->private_data;
src_buf_info = container_of(bs, struct mtk_video_dec_buf, bs_buffer);
 
-   buf = (unsigned char *)bs->va;
-   nal_start_idx = mtk_vdec_h264_find_start_code(buf, bs->size);
-   if (nal_start_idx < 0) {
-   err = -EINVAL;
-   goto err_free_fb_out;
-   }
-
-   inst->vsi->dec.nal_info = buf[nal_start_idx];
lat_buf->src_buf_req = src_buf_info->m2m_buf.vb.vb2_buf.req_obj.req;
v4l2_m2m_buf_copy_metadata(&src_buf_info->m2m_buf.vb, 
&lat_buf->ts_info, true);
 
@@ -689,7 +680,7 @@ static int vdec_h264_slice_lat_decode(void *h_vdec, struct 
mtk_vcodec_mem *bs,
goto err_free_fb_out;
 
if (!inst->ctx->is_secure_playback)
-   vdec_h264_insert_startcode(inst->ctx->dev, buf, &bs->size,
+   vdec_h264_insert_startcode(inst->ctx->dev, bs->va, &bs->size,
   &share_info->h264_slice_params.pps);
 
*res_chg = inst->resolution_changed;
-- 
2.18.0
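
For context, the CPU-side parsing that this patch removes amounts to finding an Annex B start code in the bitstream and reading the NAL header byte that follows it. The sketch below is a generic illustration of that step, not the driver's mtk_vdec_h264_find_start_code(); its exact return convention is not visible in this patch, so the one used here (index of the byte after 00 00 01, or -1) is an assumption, and the sketch_* names are hypothetical.

```c
#include <stddef.h>

/*
 * Scan for the three-byte start code 00 00 01 and return the index of the
 * byte that follows it (the NAL header), or -1 if no start code with a
 * following byte is found.
 */
int sketch_find_start_code(const unsigned char *buf, size_t len)
{
	size_t i;

	/* i + 3 < len guarantees that the NAL header byte is inside buf. */
	for (i = 0; i + 3 < len; i++) {
		if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
			return (int)(i + 3);
	}
	return -1;
}

/* nal_unit_type is the low five bits of the H.264 NAL header byte. */
unsigned int sketch_nal_unit_type(unsigned char nal_header)
{
	return nal_header & 0x1f;
}
```

In secure playback the bitstream buffer has no kernel mapping (bs->va is not set), which is a second reason to leave this step to the hardware.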



[PATCH v4, 14/22] media: mediatek: vcodec: Add capture format to support one plane memory

2024-01-28 Thread Yunfei Dong
Define an uncompressed capture format, V4L2_PIX_FMT_MS21, to support
single-plane memory. The buffer size is luma + chroma; luma is stored at
the start and chroma is stored at the end.

Signed-off-by: Yunfei Dong 
---
 Documentation/userspace-api/media/v4l/pixfmt-reserved.rst | 8 
 drivers/media/v4l2-core/v4l2-common.c | 2 ++
 drivers/media/v4l2-core/v4l2-ioctl.c  | 1 +
 include/uapi/linux/videodev2.h| 1 +
 4 files changed, 12 insertions(+)

diff --git a/Documentation/userspace-api/media/v4l/pixfmt-reserved.rst 
b/Documentation/userspace-api/media/v4l/pixfmt-reserved.rst
index 886ba7b08d6b..6ec899649d50 100644
--- a/Documentation/userspace-api/media/v4l/pixfmt-reserved.rst
+++ b/Documentation/userspace-api/media/v4l/pixfmt-reserved.rst
@@ -295,6 +295,14 @@ please make a proposal on the linux-media mailing list.
   - Compressed format used by Nuvoton NPCM video driver. This format is
 defined in Remote Framebuffer Protocol (RFC 6143, chapter 7.7.4 Hextile
 Encoding).
+* .. _V4L2-PIX-FMT-MS21:
+
+  - ``V4L2_PIX_FMT_MS21``
+  - 'MS21'
+  - This format has one plane; luma and chroma are stored in contiguous
+memory. Luma pixels are in 16x32 tiles at the start, chroma pixels in
+16x16 tiles at the end. The image height must be aligned to 32 and the
+image width must be aligned to 16.
 .. raw:: latex
 
 \normalsize
diff --git a/drivers/media/v4l2-core/v4l2-common.c 
b/drivers/media/v4l2-core/v4l2-common.c
index 273d83de2a87..315b906ed730 100644
--- a/drivers/media/v4l2-core/v4l2-common.c
+++ b/drivers/media/v4l2-core/v4l2-common.c
@@ -269,6 +269,8 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
  .block_w = { 16, 8, 0, 0 }, .block_h = { 32, 16, 0, 0 }},
{ .format = V4L2_PIX_FMT_MT2110R, .pixel_enc = 
V4L2_PIXEL_ENC_YUV, .mem_planes = 2, .comp_planes = 2, .bpp = { 5, 10, 0, 0 }, 
.bpp_div = { 4, 4, 1, 1 }, .hdiv = 2, .vdiv = 2,
  .block_w = { 16, 8, 0, 0 }, .block_h = { 32, 16, 0, 0 }},
+   { .format = V4L2_PIX_FMT_MS21,.pixel_enc = 
V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, 
.bpp_div = { 1, 1, 1, 1 }, .hdiv = 2, .vdiv = 2,
+ .block_w = { 16, 8, 0, 0 }, .block_h = { 32, 16, 0, 0 }},
 
/* YUV planar formats */
{ .format = V4L2_PIX_FMT_NV12,.pixel_enc = 
V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 2, .bpp = { 1, 2, 0, 0 }, 
.bpp_div = { 1, 1, 1, 1 }, .hdiv = 2, .vdiv = 2 },
diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c 
b/drivers/media/v4l2-core/v4l2-ioctl.c
index 33076af4dfdb..c38b12511bdb 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -1511,6 +1511,7 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
case V4L2_PIX_FMT_MT2110T:  descr = "Mediatek 10bit Tile 
Mode"; break;
case V4L2_PIX_FMT_MT2110R:  descr = "Mediatek 10bit Raster 
Mode"; break;
case V4L2_PIX_FMT_HEXTILE:  descr = "Hextile Compressed 
Format"; break;
+   case V4L2_PIX_FMT_MS21: descr = "MediaTek One Plane 
Format"; break;
default:
if (fmt->description[0])
return;
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 3e3f8d4b7c81..53a3c908fcba 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -798,6 +798,7 @@ struct v4l2_pix_format {
 #define V4L2_PIX_FMT_MM21 v4l2_fourcc('M', 'M', '2', '1') /* Mediatek 
8-bit block mode, two non-contiguous planes */
 #define V4L2_PIX_FMT_MT2110T  v4l2_fourcc('M', 'T', '2', 'T') /* Mediatek 
10-bit block tile mode */
 #define V4L2_PIX_FMT_MT2110R  v4l2_fourcc('M', 'T', '2', 'R') /* Mediatek 
10-bit block raster mode */
+#define V4L2_PIX_FMT_MS21 v4l2_fourcc('M', 'S', '2', '1') /* MediaTek 
8-bit block mode with one plane */
 #define V4L2_PIX_FMT_INZI v4l2_fourcc('I', 'N', 'Z', 'I') /* Intel Planar 
Greyscale 10-bit and Depth 16-bit */
 #define V4L2_PIX_FMT_CNF4 v4l2_fourcc('C', 'N', 'F', '4') /* Intel 4-bit 
packed depth confidence information */
 #define V4L2_PIX_FMT_HI240v4l2_fourcc('H', 'I', '2', '4') /* BTTV 8-bit 
dithered RGB */
-- 
2.18.0
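
As a rough illustration of the single-plane layout described above, the sketch below computes the luma size, the chroma offset and the total buffer size for an 8-bit 4:2:0 image. It assumes that chroma starts immediately after the aligned luma area with no extra padding, which the commit message suggests but does not spell out; the driver may derive the sizes differently. All sketch_* names are hypothetical.

```c
#include <stddef.h>

struct sketch_ms21_layout {
	size_t luma_size;	/* bytes of luma at the start of the buffer */
	size_t chroma_offset;	/* assumed: chroma follows luma directly */
	size_t total_size;	/* luma + interleaved 4:2:0 chroma */
};

/* Round x up to the next multiple of a. */
static size_t sketch_align(size_t x, size_t a)
{
	return (x + a - 1) / a * a;
}

struct sketch_ms21_layout sketch_ms21_compute(size_t width, size_t height)
{
	struct sketch_ms21_layout l;
	size_t w = sketch_align(width, 16);	/* width aligned to 16 */
	size_t h = sketch_align(height, 32);	/* height aligned to 32 */

	l.luma_size = w * h;			/* 1 byte per luma sample */
	l.chroma_offset = l.luma_size;
	/* 4:2:0 chroma: two components at quarter resolution, so w * h / 2. */
	l.total_size = l.luma_size + l.luma_size / 2;
	return l;
}
```

Under these assumptions a 1920x1080 stream is padded to 1920x1088, giving 2088960 bytes of luma and a 3133440-byte buffer.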



[PATCH v4,15/22] media: mediatek: vcodec: Add one plane format

2024-01-28 Thread Yunfei Dong
Add a capture format to support V4L2_PIX_FMT_MS21. This format has one
plane and is currently used only for secure video playback.

Signed-off-by: Yunfei Dong 
---
 .../platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c| 4 +++-
 .../mediatek/vcodec/decoder/mtk_vcodec_dec_stateless.c   | 9 -
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c
index ba742f0e391d..604fdc8ee3ce 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c
@@ -49,7 +49,9 @@ static bool mtk_vdec_get_cap_fmt(struct mtk_vcodec_dec_ctx 
*ctx, int format_inde
num_frame_count++;
}
 
-   if (num_frame_count == 1 || (!ctx->is_10bit_bitstream && fmt->fourcc == 
V4L2_PIX_FMT_MM21))
+   if ((!ctx->is_10bit_bitstream && fmt->fourcc == V4L2_PIX_FMT_MM21) ||
+   (ctx->is_secure_playback && fmt->fourcc == V4L2_PIX_FMT_MS21) ||
+   num_frame_count == 1)
return true;
 
q_data = &ctx->q_data[MTK_Q_DATA_SRC];
diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_stateless.c 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_stateless.c
index d54b3833790d..cc42c942eb8a 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_stateless.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_stateless.c
@@ -229,7 +229,7 @@ static const struct mtk_stateless_control 
mtk_stateless_controls[] = {
 
 #define NUM_CTRLS ARRAY_SIZE(mtk_stateless_controls)
 
-static struct mtk_video_fmt mtk_video_formats[9];
+static struct mtk_video_fmt mtk_video_formats[10];
 
 static struct mtk_video_fmt default_out_format;
 static struct mtk_video_fmt default_cap_format;
@@ -770,6 +770,11 @@ static void mtk_vcodec_add_formats(unsigned int fourcc,
mtk_video_formats[count_formats].type = MTK_FMT_FRAME;
mtk_video_formats[count_formats].num_planes = 2;
break;
+   case V4L2_PIX_FMT_MS21:
+   mtk_video_formats[count_formats].fourcc = fourcc;
+   mtk_video_formats[count_formats].type = MTK_FMT_FRAME;
+   mtk_video_formats[count_formats].num_planes = 1;
+   break;
default:
mtk_v4l2_vdec_err(ctx, "Can not add unsupported format type");
return;
@@ -798,6 +803,8 @@ static void mtk_vcodec_get_supported_formats(struct 
mtk_vcodec_dec_ctx *ctx)
cap_format_count++;
}
if (ctx->dev->dec_capability & MTK_VDEC_FORMAT_MM21) {
+   mtk_vcodec_add_formats(V4L2_PIX_FMT_MS21, ctx);
+   cap_format_count++;
mtk_vcodec_add_formats(V4L2_PIX_FMT_MM21, ctx);
cap_format_count++;
}
-- 
2.18.0
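
The early-return condition added to mtk_vdec_get_cap_fmt() can be read as a small predicate. The sketch below restates it in user space; the fourcc packing follows v4l2_fourcc() from videodev2.h (first character in the low byte), while the sketch_* and SKETCH_* names and the flattened argument list are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same packing as v4l2_fourcc(): first character in the low byte. */
#define SKETCH_FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define SKETCH_FMT_MM21 SKETCH_FOURCC('M', 'M', '2', '1')
#define SKETCH_FMT_MS21 SKETCH_FOURCC('M', 'S', '2', '1')

/*
 * True when the capture format is accepted without further checks:
 * MM21 for 8-bit streams, MS21 for secure playback, or any format when
 * only one frame format is available.
 */
bool sketch_cap_fmt_allowed(uint32_t fourcc, bool is_10bit, bool is_secure,
			    int num_frame_count)
{
	return (!is_10bit && fourcc == SKETCH_FMT_MM21) ||
	       (is_secure && fourcc == SKETCH_FMT_MS21) ||
	       num_frame_count == 1;
}
```

Note that MS21 is accepted here only when is_secure is set, even though the format is registered whenever the MM21 capability bit is present.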



[PATCH v4, 10/22] media: mediatek: vcodec: send share memory data to optee

2024-01-28 Thread Yunfei Dong
Write the msg and vsi information to the shared buffers, then call the
tee invoke function to send them to optee-os.

Signed-off-by: Yunfei Dong 
---
 .../vcodec/decoder/mtk_vcodec_dec_optee.c | 140 ++
 .../vcodec/decoder/mtk_vcodec_dec_optee.h |  51 +++
 2 files changed, 191 insertions(+)

diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.c 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.c
index 611fb0e56480..f29a8d143fee 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_optee.c
@@ -241,3 +241,143 @@ void mtk_vcodec_dec_optee_release(struct 
mtk_vdec_optee_private *optee_private)
mutex_unlock(&optee_private->tee_mutex);
 }
 EXPORT_SYMBOL_GPL(mtk_vcodec_dec_optee_release);
+
+static int mtk_vcodec_dec_optee_fill_shm(struct tee_param *command_params,
+struct mtk_vdec_optee_shm_memref 
*shm_memref,
+struct mtk_vdec_optee_data_to_shm 
*data,
+int index, struct device *dev)
+{
+   if (!data->msg_buf_size[index] || !data->msg_buf[index]) {
+   pr_err(MTK_DBG_VCODEC_STR "tee invalid buf param: %d.\n", 
index);
+   return -EINVAL;
+   }
+
+   *command_params = (struct tee_param) {
+   .attr = shm_memref->param_type,
+   .u.memref = {
+   .shm = shm_memref->msg_shm,
+   .size = data->msg_buf_size[index],
+   .shm_offs = 0,
+   },
+   };
+
+   if (!shm_memref->copy_to_ta) {
+   dev_dbg(dev, MTK_DBG_VCODEC_STR "share memref data: 0x%x 
param_type:%llu.\n",
+   *((unsigned int *)shm_memref->msg_shm_ca_buf), 
shm_memref->param_type);
+   return 0;
+   }
+
+   memset(shm_memref->msg_shm_ca_buf, 0, shm_memref->msg_shm_size);
+   memcpy(shm_memref->msg_shm_ca_buf, data->msg_buf[index], 
data->msg_buf_size[index]);
+
+   dev_dbg(dev, MTK_DBG_VCODEC_STR "share memref data => msg id:0x%x 0x%x 
param_type:%llu.\n",
+   *((unsigned int *)data->msg_buf[index]),
+   *((unsigned int *)shm_memref->msg_shm_ca_buf),
+   shm_memref->param_type);
+
+   return 0;
+}
+
+void mtk_vcodec_dec_optee_set_data(struct mtk_vdec_optee_data_to_shm *data,
+  void *buf, int buf_size,
+  enum mtk_vdec_optee_data_index index)
+{
+   data->msg_buf[index] = buf;
+   data->msg_buf_size[index] = buf_size;
+}
+EXPORT_SYMBOL_GPL(mtk_vcodec_dec_optee_set_data);
+
+int mtk_vcodec_dec_optee_invokd_cmd(struct mtk_vdec_optee_private 
*optee_private,
+   enum mtk_vdec_hw_id hw_id,
+   struct mtk_vdec_optee_data_to_shm *data)
+{
+   struct device *dev = &optee_private->vcodec_dev->plat_dev->dev;
+   struct tee_ioctl_invoke_arg trans_args;
+   struct tee_param command_params[MTK_OPTEE_MAX_TEE_PARAMS];
+   struct mtk_vdec_optee_ca_info *ca_info;
+   struct mtk_vdec_optee_shm_memref *shm_memref;
+   int ret, index;
+
+   if (hw_id == MTK_VDEC_LAT0)
+   ca_info = &optee_private->lat_ca;
+   else
+   ca_info = &optee_private->core_ca;
+
+   memset(&trans_args, 0, sizeof(trans_args));
+   memset(command_params, 0, sizeof(command_params));
+
+   trans_args = (struct tee_ioctl_invoke_arg) {
+   .func = ca_info->vdec_session_func,
+   .session = ca_info->vdec_session_id,
+   .num_params = MTK_OPTEE_MAX_TEE_PARAMS,
+   };
+
+   /* Fill msg command parameters */
+   for (index = 0; index < MTK_OPTEE_MAX_TEE_PARAMS; index++) {
+   shm_memref = &ca_info->shm_memref[index];
+
+   if (shm_memref->param_type == TEE_IOCTL_PARAM_ATTR_TYPE_NONE ||
+   data->msg_buf_size[index] == 0)
+   continue;
+
+   dev_dbg(dev, MTK_DBG_VCODEC_STR "tee share memory data size: %d 
-> %d.\n",
+   data->msg_buf_size[index], shm_memref->msg_shm_size);
+
+   if (data->msg_buf_size[index] > shm_memref->msg_shm_size) {
+   dev_err(dev, MTK_DBG_VCODEC_STR "tee buf size big than 
shm (%d -> %d).\n",
+   data->msg_buf_size[index], 
shm_memref->msg_shm_size);
+   return -EINVAL;
+   }
+
+   ret = mtk_vcodec_dec_optee_fill_shm(&command_params[index], 
shm_memref,
+   data, index, dev);
+   if (ret)
+   return ret;
+   }
+
+   ret = tee_client_invoke_func(optee_private->tee_vdec_ctx, &trans_args, 
command_params);
+   if (ret < 0 || trans_args.ret != 0) {
+   dev_err(dev, 

[PATCH v4, 12/22] media: mediatek: vcodec: add interface to allocate/free secure memory

2024-01-28 Thread Yunfei Dong
Call the dma-heap interface to allocate and free secure memory when playing
secure video.
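
The allocation path in this patch acquires three resources in order (buffer,
attachment, mapping) and unwinds them in reverse on failure. A minimal
userspace sketch of that unwind pattern follows. All names here are
hypothetical and counters stand in for the real dma-buf objects.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical tracker: each field counts live acquisitions. */
struct res_state {
	int buffers;
	int attachments;
	int mappings;
};

/* Which step should fail, for demonstration purposes only. */
enum fail_at { FAIL_NONE, FAIL_ALLOC, FAIL_ATTACH, FAIL_MAP };

int secure_alloc_model(struct res_state *st, enum fail_at fail)
{
	int ret;

	assert(st != NULL);

	if (fail == FAIL_ALLOC)
		return -ENOMEM;		/* nothing acquired yet */
	st->buffers++;

	if (fail == FAIL_ATTACH) {
		ret = -ENODEV;
		goto err_attach;
	}
	st->attachments++;

	if (fail == FAIL_MAP) {
		ret = -EIO;
		goto err_map;
	}
	st->mappings++;

	return 0;

err_map:
	st->attachments--;		/* undo the attach step */
err_attach:
	st->buffers--;			/* undo the allocation step */
	return ret;
}

/* Release in the reverse order of acquisition. */
void secure_free_model(struct res_state *st)
{
	assert(st != NULL);
	st->mappings--;
	st->attachments--;
	st->buffers--;
}
```

Each label undoes exactly the steps that had succeeded before the jump, so a
failure at any stage leaves the counters at zero.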

Signed-off-by: Yunfei Dong 
---
 .../media/platform/mediatek/vcodec/Kconfig|   1 +
 .../mediatek/vcodec/common/mtk_vcodec_util.c  | 122 +-
 .../mediatek/vcodec/common/mtk_vcodec_util.h  |   3 +
 3 files changed, 123 insertions(+), 3 deletions(-)

diff --git a/drivers/media/platform/mediatek/vcodec/Kconfig 
b/drivers/media/platform/mediatek/vcodec/Kconfig
index bc8292232530..707865703e61 100644
--- a/drivers/media/platform/mediatek/vcodec/Kconfig
+++ b/drivers/media/platform/mediatek/vcodec/Kconfig
@@ -17,6 +17,7 @@ config VIDEO_MEDIATEK_VCODEC
depends on VIDEO_MEDIATEK_VPU || !VIDEO_MEDIATEK_VPU
depends on MTK_SCP || !MTK_SCP
depends on MTK_SMI || (COMPILE_TEST && MTK_SMI=n)
+   depends on DMABUF_HEAPS
select VIDEOBUF2_DMA_CONTIG
select V4L2_MEM2MEM_DEV
select VIDEO_MEDIATEK_VCODEC_VPU if VIDEO_MEDIATEK_VPU
diff --git a/drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_util.c 
b/drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_util.c
index 9ce34a3b5ee6..5cb7c347322b 100644
--- a/drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_util.c
+++ b/drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_util.c
@@ -5,9 +5,11 @@
 *  Tiffany Lin 
 */
 
+#include 
 #include 
 #include 
 #include 
+#include 
 
 #include "../decoder/mtk_vcodec_dec_drv.h"
 #include "../encoder/mtk_vcodec_enc_drv.h"
@@ -45,7 +47,7 @@ int mtk_vcodec_write_vdecsys(struct mtk_vcodec_dec_ctx *ctx, 
unsigned int reg,
 }
 EXPORT_SYMBOL(mtk_vcodec_write_vdecsys);
 
-int mtk_vcodec_mem_alloc(void *priv, struct mtk_vcodec_mem *mem)
+static int mtk_vcodec_mem_alloc_nor(void *priv, struct mtk_vcodec_mem *mem)
 {
enum mtk_instance_type inst_type = *((unsigned int *)priv);
struct platform_device *plat_dev;
@@ -76,9 +78,71 @@ int mtk_vcodec_mem_alloc(void *priv, struct mtk_vcodec_mem 
*mem)
 
return 0;
 }
-EXPORT_SYMBOL(mtk_vcodec_mem_alloc);
 
-void mtk_vcodec_mem_free(void *priv, struct mtk_vcodec_mem *mem)
+static int mtk_vcodec_mem_alloc_sec(struct mtk_vcodec_dec_ctx *ctx, struct 
mtk_vcodec_mem *mem)
+{
+   struct device *dev = &ctx->dev->plat_dev->dev;
+   struct dma_buf *dma_buffer;
+   struct dma_heap *vdec_heap;
+   struct dma_buf_attachment *attach;
+   struct sg_table *sgt;
+   unsigned long size = mem->size;
+   int ret = 0;
+
+   if (!size)
+   return -EINVAL;
+
+   vdec_heap = dma_heap_find("restricted_mtk_cm");
+   if (!vdec_heap) {
+   mtk_v4l2_vdec_err(ctx, "dma heap find failed!");
+   return -EPERM;
+   }
+
+   dma_buffer = dma_heap_buffer_alloc(vdec_heap, size, 
DMA_HEAP_VALID_FD_FLAGS,
+  DMA_HEAP_VALID_HEAP_FLAGS);
+   if (IS_ERR_OR_NULL(dma_buffer)) {
+   mtk_v4l2_vdec_err(ctx, "dma heap alloc size=0x%lx failed!", 
size);
+   return PTR_ERR(dma_buffer);
+   }
+
+   attach = dma_buf_attach(dma_buffer, dev);
+   if (IS_ERR_OR_NULL(attach)) {
+   mtk_v4l2_vdec_err(ctx, "dma attach size=0x%lx failed!", size);
+   ret = PTR_ERR(attach);
+   goto err_attach;
+   }
+
+   sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
+   if (IS_ERR_OR_NULL(sgt)) {
+   mtk_v4l2_vdec_err(ctx, "dma map attach size=0x%lx failed!", 
size);
+   ret = PTR_ERR(sgt);
+   goto err_sgt;
+   }
+
+   mem->va = dma_buffer;
+   mem->dma_addr = (dma_addr_t)sg_dma_address((sgt)->sgl);
+
+   if (!mem->va || !mem->dma_addr) {
+   mtk_v4l2_vdec_err(ctx, "dma buffer size=0x%lx failed!", size);
+   ret = -EPERM;
+   goto err_addr;
+   }
+
+   mem->attach = attach;
+   mem->sgt = sgt;
+
+   return 0;
+err_addr:
+   dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL);
+err_sgt:
+   dma_buf_detach(dma_buffer, attach);
+err_attach:
+   dma_buf_put(dma_buffer);
+
+   return ret;
+}
+
+static void mtk_vcodec_mem_free_nor(void *priv, struct mtk_vcodec_mem *mem)
 {
enum mtk_instance_type inst_type = *((unsigned int *)priv);
struct platform_device *plat_dev;
@@ -111,6 +175,57 @@ void mtk_vcodec_mem_free(void *priv, struct mtk_vcodec_mem 
*mem)
mem->dma_addr = 0;
mem->size = 0;
 }
+
+static void mtk_vcodec_mem_free_sec(struct mtk_vcodec_mem *mem)
+{
+   if (mem->sgt)
+   dma_buf_unmap_attachment(mem->attach, mem->sgt, 
DMA_BIDIRECTIONAL);
+   dma_buf_detach((struct dma_buf *)mem->va, mem->attach);
+   dma_buf_put((struct dma_buf *)mem->va);
+
+   mem->attach = NULL;
+   mem->sgt = NULL;
+   mem->va = NULL;
+   mem->dma_addr = 0;
+   mem->size = 0;
+}
+
+int mtk_vcodec_mem_alloc(void *priv, struct mtk_vcodec_mem *mem)
+{
+   enum mtk_instance_type inst_type = 

[PATCH v4, 03/22] v4l2: verify restricted dmabufs are used in restricted queue

2024-01-28 Thread Yunfei Dong
From: Jeffrey Kardatzke 

Verifies in the dmabuf implementations that, if the restricted memory
flag is set for a queue, the dmabuf submitted to the queue is
unmappable.
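
The rule can be modelled in a few lines of userspace C. This is only a
sketch: struct sg_entry_model and check_restricted() are hypothetical, and
the model assumes that "unmappable" means the first scatterlist entry has no
page, as the code comment in this patch describes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the first scatterlist entry of a dmabuf. */
struct sg_entry_model {
	void *page;	/* NULL when the memory has no CPU-visible page */
};

/*
 * A restricted queue accepts only buffers without a page entry.
 * A normal queue accepts either kind.
 */
int check_restricted(bool queue_restricted, const struct sg_entry_model *first)
{
	assert(first != NULL);

	if (queue_restricted && first->page != NULL)
		return -EINVAL;

	return 0;
}
```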

Signed-off-by: Jeffrey Kardatzke 
Signed-off-by: Yunfei Dong 
---
 drivers/media/common/videobuf2/videobuf2-dma-contig.c | 8 
 drivers/media/common/videobuf2/videobuf2-dma-sg.c | 8 
 2 files changed, 16 insertions(+)

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c 
b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index 3d4fd4ef5310..f953570fef27 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -710,6 +710,14 @@ static int vb2_dc_map_dmabuf(void *mem_priv)
return -EINVAL;
}
 
+   /* Verify the dmabuf is restricted if we are in restricted mode, this 
is done
+* by validating there is no page entry for the dmabuf.
+*/
+   if (buf->vb->vb2_queue->restricted_mem && sg_page(sgt->sgl)) {
+   pr_err("restricted queue requires restricted dma_buf");
+   return -EINVAL;
+   }
+
/* checking if dmabuf is big enough to store contiguous chunk */
contig_size = vb2_dc_get_contiguous_size(sgt);
if (contig_size < buf->size) {
diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c 
b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
index 6975a71d740f..f87bd3f40b9b 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c
@@ -570,6 +570,14 @@ static int vb2_dma_sg_map_dmabuf(void *mem_priv)
return -EINVAL;
}
 
+   /* Verify the dmabuf is restricted if we are in restricted mode, this 
is done
+* by validating there is no page entry for the dmabuf.
+*/
+   if (buf->vb->vb2_queue->restricted_mem && sg_page(sgt->sgl)) {
+   pr_err("restricted queue requires restricted dma_buf");
+   return -EINVAL;
+   }
+
buf->dma_sgt = sgt;
buf->vaddr = NULL;
 
-- 
2.18.0



[PATCH v4, 11/22] media: mediatek: vcodec: initialize msg and vsi information

2024-01-28 Thread Yunfei Dong
Initialize the msg and vsi information, then call the optee invoke command
to send it to optee-os.

The optee communication interface differs from the scp one, so a flag is
used to select between them.

Signed-off-by: Yunfei Dong 
---
 .../vcodec/decoder/mtk_vcodec_dec_drv.h   |  2 +
 .../mediatek/vcodec/decoder/vdec_vpu_if.c | 49 ---
 .../mediatek/vcodec/decoder/vdec_vpu_if.h |  4 ++
 3 files changed, 49 insertions(+), 6 deletions(-)

diff --git 
a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h 
b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
index b1a2107f2a1e..47eca245dc07 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
@@ -175,6 +175,7 @@ struct mtk_vcodec_dec_pdata {
  * @vpu_inst: vpu instance pointer.
  *
  * @is_10bit_bitstream: set to true if it's 10bit bitstream
+ * @is_secure_playback: Secure Video Playback (SVP) mode
  */
 struct mtk_vcodec_dec_ctx {
enum mtk_instance_type type;
@@ -220,6 +221,7 @@ struct mtk_vcodec_dec_ctx {
void *vpu_inst;
 
bool is_10bit_bitstream;
+   bool is_secure_playback;
 };
 
 /**
diff --git a/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c 
b/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
index 82e57ae983d5..5336769a3fb5 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
@@ -148,7 +148,10 @@ static void vpu_dec_ipi_handler(void *data, unsigned int 
len, void *priv)
 
 static int vcodec_vpu_send_msg(struct vdec_vpu_inst *vpu, void *msg, int len)
 {
-   int err, id, msgid;
+   struct mtk_vdec_optee_data_to_shm *optee_data;
+   int data_size, id, hw_id, msgid;
+   void *ack_msg, *data_msg;
+   int err;
 
msgid = *(uint32_t *)msg;
mtk_vdec_debug(vpu->ctx, "id=%X", msgid);
@@ -158,16 +161,46 @@ static int vcodec_vpu_send_msg(struct vdec_vpu_inst *vpu, 
void *msg, int len)
 
if (vpu->ctx->dev->vdec_pdata->hw_arch == MTK_VDEC_LAT_SINGLE_CORE) {
if (msgid == AP_IPIMSG_DEC_CORE ||
-   msgid == AP_IPIMSG_DEC_CORE_END)
+   msgid == AP_IPIMSG_DEC_CORE_END) {
+   optee_data = &vpu->core_optee_info;
id = vpu->core_id;
-   else
+   } else {
+   optee_data = &vpu->lat_optee_info;
id = vpu->id;
+   }
} else {
+   optee_data = &vpu->lat_optee_info;
id = vpu->id;
}
 
-   err = mtk_vcodec_fw_ipi_send(vpu->ctx->dev->fw_handler, id, msg,
-len, 2000);
+   if (!vpu->ctx->is_secure_playback) {
+   err = mtk_vcodec_fw_ipi_send(vpu->ctx->dev->fw_handler, id, 
msg, len, 2000);
+   } else {
+   hw_id = (id == SCP_IPI_VDEC_LAT) ? MTK_VDEC_LAT0 : 
MTK_VDEC_CORE;
+
+   mtk_vcodec_dec_optee_set_data(optee_data, msg, len, 
OPTEE_MSG_INDEX);
+
+   /* There is no need to copy the data (VSI) message to shared 
memory,
+* but we still need to set the buffer size to a non-zero value.
+*/
+   if (msgid == AP_IPIMSG_DEC_CORE || msgid == 
AP_IPIMSG_DEC_START) {
+   data_msg = 
mtk_vcodec_dec_get_shm_buffer_va(vpu->ctx->dev->optee_private,
+   hw_id, 
OPTEE_DATA_INDEX);
+   data_size = 
mtk_vcodec_dec_get_shm_buffer_size(vpu->ctx->dev->optee_private,
+  hw_id, 
OPTEE_DATA_INDEX);
+   mtk_vcodec_dec_optee_set_data(optee_data, data_msg, 
data_size,
+ OPTEE_DATA_INDEX);
+   }
+
+   err = 
mtk_vcodec_dec_optee_invokd_cmd(vpu->ctx->dev->optee_private,
+ hw_id, optee_data);
+   vpu->failure = err;
+
+   ack_msg = 
mtk_vcodec_dec_get_shm_buffer_va(vpu->ctx->dev->optee_private, hw_id,
+  OPTEE_MSG_INDEX);
+   vpu_dec_ipi_handler(ack_msg, 0, vpu->ctx->dev);
+   }
+
if (err) {
mtk_vdec_err(vpu->ctx, "send fail vpu_id=%d msg_id=%X 
status=%d",
 id, msgid, err);
@@ -213,7 +246,11 @@ int vpu_dec_init(struct vdec_vpu_inst *vpu)
return err;
}
 
-   if (vpu->ctx->dev->vdec_pdata->hw_arch == MTK_VDEC_LAT_SINGLE_CORE) {
+   /* Using tee interface to communicate with optee os directly for SVP 
mode,
+* fw ipi interface is used for normal playback.
+*/
+   if 

[PATCH v4,04/22] v4l: add documentation for restricted memory flag

2024-01-28 Thread Yunfei Dong
From: Jeffrey Kardatzke 

Adds documentation for V4L2_MEMORY_FLAG_RESTRICTED.

Signed-off-by: Jeffrey Kardatzke 
Signed-off-by: Yunfei Dong 
---
 Documentation/userspace-api/media/v4l/buffer.rst | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/Documentation/userspace-api/media/v4l/buffer.rst 
b/Documentation/userspace-api/media/v4l/buffer.rst
index 52bbee81c080..807e43bfed2b 100644
--- a/Documentation/userspace-api/media/v4l/buffer.rst
+++ b/Documentation/userspace-api/media/v4l/buffer.rst
@@ -696,7 +696,7 @@ enum v4l2_memory
 
 .. _memory-flags:
 
-Memory Consistency Flags
+Memory Flags
 
 
 .. raw:: latex
@@ -728,6 +728,14 @@ Memory Consistency Flags
only if the buffer is used for :ref:`memory mapping ` I/O and the
queue reports the :ref:`V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS
` capability.
+* .. _`V4L2-MEMORY-FLAG-RESTRICTED`:
+
+  - ``V4L2_MEMORY_FLAG_RESTRICTED``
+  - 0x0002
+  - The queued buffers are expected to be in restricted memory. If not, an
+   error will be returned. This flag can only be used with 
``V4L2_MEMORY_DMABUF``.
+   Typically restricted buffers are allocated using a restricted dma-heap. 
This flag
+   can only be specified if the ``V4L2_BUF_CAP_SUPPORTS_RESTRICTED_MEM`` 
is set.
 
 .. raw:: latex
 
-- 
2.18.0



[PATCH v4,06/22] dma-heap: Add proper kref handling on dma-buf heaps

2024-01-28 Thread Yunfei Dong
From: John Stultz 

Add proper refcounting on the dma_heap structure.
While existing heaps are built-in, we may eventually
have heaps loaded from modules, and we'll need to be
able to properly handle the references to the heaps.
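
For readers unfamiliar with the pattern, the lifetime rule this adds can be
sketched in userspace C. This is a simplified model, not the kernel kref
API: the heap_model_* names are hypothetical, and the counter is a plain int
rather than an atomic, so the sketch is single-threaded only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a heap. */
struct heap_model {
	int refcount;
	int *released;	/* set to 1 by the release path; demo hook only */
};

/* A new object starts with one reference, owned by the creator. */
struct heap_model *heap_model_new(int *released)
{
	struct heap_model *h = malloc(sizeof(*h));

	if (!h)
		return NULL;
	h->refcount = 1;
	h->released = released;
	return h;
}

void heap_model_get(struct heap_model *h)
{
	assert(h != NULL && h->refcount > 0);
	h->refcount++;
}

/* Drop one reference; returns 1 if this call freed the object. */
int heap_model_put(struct heap_model *h)
{
	assert(h != NULL && h->refcount > 0);

	if (--h->refcount > 0)
		return 0;

	*h->released = 1;	/* stands in for the teardown work */
	free(h);
	return 1;
}
```

The object is torn down only when the last holder drops its reference, which
is what lets a heap outlive any single user.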

Signed-off-by: John Stultz 
Signed-off-by: T.J. Mercier 
Signed-off-by: Yong Wu 
[Yong: Just add comment for "minor" and "refcount"]
Signed-off-by: Yunfei Dong 
---
 drivers/dma-buf/dma-heap.c | 29 +
 include/linux/dma-heap.h   |  2 ++
 2 files changed, 31 insertions(+)

diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
index 22f6c193db0d..97025ee8500f 100644
--- a/drivers/dma-buf/dma-heap.c
+++ b/drivers/dma-buf/dma-heap.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -30,6 +31,7 @@
  * @heap_devt: heap device node
  * @list:  list head connecting to list of heaps
  * @heap_cdev: heap char device
+ * @refcount:  reference counter for this heap device
  *
  * Represents a heap of memory from which buffers can be made.
  */
@@ -40,6 +42,7 @@ struct dma_heap {
dev_t heap_devt;
struct list_head list;
struct cdev heap_cdev;
+   struct kref refcount;
 };
 
 static LIST_HEAD(heap_list);
@@ -240,6 +243,7 @@ struct dma_heap *dma_heap_add(const struct 
dma_heap_export_info *exp_info)
if (!heap)
return ERR_PTR(-ENOMEM);
 
+   kref_init(&heap->refcount);
heap->name = exp_info->name;
heap->ops = exp_info->ops;
heap->priv = exp_info->priv;
@@ -304,6 +308,31 @@ struct dma_heap *dma_heap_add(const struct 
dma_heap_export_info *exp_info)
return err_ret;
 }
 
+static void dma_heap_release(struct kref *ref)
+{
+   struct dma_heap *heap = container_of(ref, struct dma_heap, refcount);
+   unsigned int minor = MINOR(heap->heap_devt);
+
+   mutex_lock(&heap_list_lock);
+   list_del(&heap->list);
+   mutex_unlock(&heap_list_lock);
+
+   device_destroy(dma_heap_class, heap->heap_devt);
+   cdev_del(&heap->heap_cdev);
+   xa_erase(&dma_heap_minors, minor);
+
+   kfree(heap);
+}
+
+/**
+ * dma_heap_put - drops a reference to a dmabuf heap, potentially freeing it
+ * @heap: DMA-Heap whose reference count to decrement
+ */
+void dma_heap_put(struct dma_heap *heap)
+{
+   kref_put(&heap->refcount, dma_heap_release);
+}
+
 static char *dma_heap_devnode(const struct device *dev, umode_t *mode)
 {
return kasprintf(GFP_KERNEL, "dma_heap/%s", dev_name(dev));
diff --git a/include/linux/dma-heap.h b/include/linux/dma-heap.h
index fbe86ec889a8..d57593f8a1bc 100644
--- a/include/linux/dma-heap.h
+++ b/include/linux/dma-heap.h
@@ -46,4 +46,6 @@ const char *dma_heap_get_name(struct dma_heap *heap);
 
 struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info);
 
+void dma_heap_put(struct dma_heap *heap);
+
 #endif /* _DMA_HEAPS_H */
-- 
2.18.0



[PATCH v4,00/22] media: add driver to support secure video decoder

2024-01-28 Thread Yunfei Dong
This patch series enables secure video playback (SVP) on MediaTek
hardware in the Linux kernel.

Memory Definitions:
secure memory - Memory allocated in the TEE (Trusted Execution
Environment) which is inaccessible in the REE (Rich Execution
Environment, i.e. linux kernel/user space).
secure handle - Integer value which acts as reference to 'secure
memory'. Used in communication between TEE and REE to reference
'secure memory'.
secure buffer - 'secure memory' that is used to store decrypted,
compressed video or for other general purposes in the TEE.
secure surface - 'secure memory' that is used to store graphic buffers.

Memory Usage in SVP:
The overall flow of SVP starts with encrypted video coming in from an
outside source into the REE. The REE will then allocate a 'secure
buffer' and send the corresponding 'secure handle' along with the
encrypted, compressed video data to the TEE. The TEE will then decrypt
the video and store the result in the 'secure buffer'. The REE will
then allocate a 'secure surface'. The REE will pass the 'secure
handles' for both the 'secure buffer' and 'secure surface' into the
TEE for video decoding. The video decoder HW will then decode the
contents of the 'secure buffer' and place the result in the 'secure
surface'. The REE will then attach the 'secure surface' to the overlay
plane for rendering of the video.

Everything relating to ensuring security of the actual contents of the
'secure buffer' and 'secure surface' is out of scope for the REE and
is the responsibility of the TEE.
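
To make the handle-based flow above concrete, here is a toy userspace model.
It is purely illustrative: every svp_* name is hypothetical, XOR stands in
for decryption, a byte copy stands in for decoding, and svp_debug_peek()
exists only so the sketch can be checked (a real REE has no such access).

```c
#include <assert.h>
#include <stddef.h>

#define TEE_SLOTS	4
#define TEE_SLOT_SIZE	16

/* Memory owned by the modelled TEE; REE-side callers only hold handles. */
static unsigned char tee_mem[TEE_SLOTS][TEE_SLOT_SIZE];
static size_t tee_len[TEE_SLOTS];
static int tee_in_use[TEE_SLOTS];

static int svp_valid(int handle)
{
	return handle >= 1 && handle <= TEE_SLOTS && tee_in_use[handle - 1];
}

/* REE requests secure memory; gets a handle (> 0) or 0 on failure. */
int svp_alloc(void)
{
	for (int i = 0; i < TEE_SLOTS; i++) {
		if (!tee_in_use[i]) {
			tee_in_use[i] = 1;
			tee_len[i] = 0;
			return i + 1;
		}
	}
	return 0;
}

/* TEE side: placeholder "decrypt" of REE data into a secure buffer. */
int svp_decrypt(int buf, const unsigned char *enc, size_t len,
		unsigned char key)
{
	assert(enc != NULL);

	if (!svp_valid(buf) || len > TEE_SLOT_SIZE)
		return -1;
	for (size_t i = 0; i < len; i++)
		tee_mem[buf - 1][i] = enc[i] ^ key;
	tee_len[buf - 1] = len;
	return 0;
}

/* TEE side: placeholder "decode" from secure buffer to secure surface. */
int svp_decode(int buf, int surface)
{
	if (!svp_valid(buf) || !svp_valid(surface) || buf == surface)
		return -1;
	for (size_t i = 0; i < tee_len[buf - 1]; i++)
		tee_mem[surface - 1][i] = tee_mem[buf - 1][i];
	tee_len[surface - 1] = tee_len[buf - 1];
	return 0;
}

/* Demo-only accessor so the model can be verified. */
unsigned char svp_debug_peek(int handle, size_t off)
{
	if (!svp_valid(handle) || off >= tee_len[handle - 1])
		return 0;
	return tee_mem[handle - 1][off];
}
```

The REE-facing functions take and return only integers, which mirrors how a
'secure handle' lets the REE refer to memory it cannot read.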

This patch series consists of four parts. The first is from Jeffrey,
adding a secure memory flag to the v4l2 framework to support requesting
secure buffers.

The second and third parts are from John and T.J., adding some heap
interfaces so that kernel users can allocate buffers from a specific
heap. Patch v1 is in the dmabuf link below:
https://lore.kernel.org/linux-mediatek/20230911023038.30649-1-yong...@mediatek.com/
To avoid confusion, they are moved into the vcodec patch set since we use the
new interfaces directly.

The last part is the MediaTek video decoder driver, adding a tee interface to
support the secure video decoder.

---
Changed in v4:
- change the driver according to maintainer advice for patch 1/2/3/4
- replace secure with restricted for patch 1/2/3/4
- fix svp decoder error for patch 21
- add to support hevc for patch 22

Changed in v3:
- rewrite the cover-letter of this patch series
- disable irq for svp mode
- rebase the driver based on the latest media stage

Changed in v2:
- remove setting decoder mode and getting secure handle from decode
- add Jeffrey's patch
- add John and T.J's patch
- getting secure flag with request buffer
- fix some comments from patch v1
---
Jeffrey Kardatzke (4):
  v4l2: add restricted memory flags
  v4l2: handle restricted memory flags in queue setup
  v4l2: verify restricted dmabufs are used in restricted queue
  v4l: add documentation for restricted memory flag

John Stultz (2):
  dma-heap: Add proper kref handling on dma-buf heaps
  dma-heap: Provide accessors so that in-kernel drivers can allocate
dmabufs from specific heaps

T.J. Mercier (1):
  dma-buf: heaps: Deduplicate docs and adopt common format

Yunfei Dong (15):
  media: mediatek: vcodec: add tee client interface to communiate with
optee-os
  media: mediatek: vcodec: allocate tee share memory
  media: mediatek: vcodec: send share memory data to optee
  media: mediatek: vcodec: initialize msg and vsi information
  media: mediatek: vcodec: add interface to allocate/free secure memory
  media: mediatek: vcodec: using shared memory as vsi address
  media: mediatek: vcodec: Add capture format to support one plane
memory
  media: mediatek: vcodec: Add one plane format
  media: mediatek: vcodec: support one plane capture buffer
  media: mediatek: vcodec: re-construct h264 driver to support svp mode
  media: mediatek: vcodec: remove parse nal_info in kernel
  media: mediatek: vcodec: disable wait interrupt for svp mode
  media: mediatek: vcodec: support tee decoder
  media: mediatek: vcodec: move vdec init interface to setup callback
  media: mediatek: vcodec: support hevc svp for mt8188

 .../userspace-api/media/v4l/buffer.rst|  10 +-
 .../media/v4l/pixfmt-reserved.rst |   8 +
 drivers/dma-buf/dma-heap.c| 139 +--
 .../media/common/videobuf2/videobuf2-core.c   |  21 +
 .../common/videobuf2/videobuf2-dma-contig.c   |   8 +
 .../media/common/videobuf2/videobuf2-dma-sg.c |   8 +
 .../media/common/videobuf2/videobuf2-v4l2.c   |   4 +-
 .../media/platform/mediatek/vcodec/Kconfig|   1 +
 .../mediatek/vcodec/common/mtk_vcodec_util.c  | 122 +-
 .../mediatek/vcodec/common/mtk_vcodec_util.h  |   3 +
 .../platform/mediatek/vcodec/decoder/Makefile |   1 +
 .../mediatek/vcodec/decoder/mtk_vcodec_dec.c  | 150 ---
 .../vcodec/decoder/mtk_vcodec_dec_drv.c   |   8 +
 .../vcodec/decoder/mtk_vcodec_dec_drv.h   |   7 +
 

[PATCH v4, 07/22] dma-heap: Provide accessors so that in-kernel drivers can allocate dmabufs from specific heaps

2024-01-28 Thread Yunfei Dong
From: John Stultz 

This allows drivers who don't want to create their own
DMA-BUF exporter to be able to allocate DMA-BUFs directly
from existing DMA-BUF Heaps.

There is some concern that the premise of DMA-BUF heaps is
that userland knows better about what type of heap memory
is needed for a pipeline, so it would likely be best for
drivers to import and fill DMA-BUFs allocated by userland
instead of allocating one themselves, but this is still
up for debate.
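
One visible consequence of this change is that the allocator now returns a
pointer instead of an fd, so failures are reported as error pointers rather
than negative integers. A minimal userspace model of that convention is
sketched below. The model_* helpers are hypothetical stand-ins written for
this example; they are not the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() macros,
and malloc() stands in for the heap's allocate op.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_MAX_ERRNO	4095
#define MODEL_PAGE_SIZE	4096UL

/* Encode a negative errno value in a pointer. */
static inline void *model_err_ptr(long err)
{
	assert(err < 0 && err >= -MODEL_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* Recover the errno value from an error pointer. */
static inline long model_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Error pointers occupy the top MODEL_MAX_ERRNO addresses. */
static inline int model_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Pointer-returning allocator: page-align the length, then allocate. */
void *model_buffer_alloc(size_t len)
{
	size_t aligned = (len + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
	void *buf;

	if (!aligned)
		return model_err_ptr(-EINVAL);

	buf = malloc(aligned);
	return buf ? buf : model_err_ptr(-ENOMEM);
}
```

A caller checks the result with model_is_err() before use, in the same way
the fd-returning wrapper in this patch checks IS_ERR() on the returned
dmabuf.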

Signed-off-by: John Stultz 
Signed-off-by: T.J. Mercier 
Signed-off-by: Yong Wu 
[Yong: Fix the checkpatch alignment warning]
Signed-off-by: Yunfei Dong 
---
 drivers/dma-buf/dma-heap.c | 83 ++
 include/linux/dma-heap.h   |  6 +++
 2 files changed, 73 insertions(+), 16 deletions(-)

diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
index 97025ee8500f..6efe833a4b10 100644
--- a/drivers/dma-buf/dma-heap.c
+++ b/drivers/dma-buf/dma-heap.c
@@ -51,12 +51,24 @@ static dev_t dma_heap_devt;
 static struct class *dma_heap_class;
 static DEFINE_XARRAY_ALLOC(dma_heap_minors);
 
-static int dma_heap_buffer_alloc(struct dma_heap *heap, size_t len,
-unsigned int fd_flags,
-unsigned int heap_flags)
+/**
+ * dma_heap_buffer_alloc - Allocate dma-buf from a dma_heap
+ * @heap:  DMA-Heap to allocate from
+ * @len:   size to allocate in bytes
+ * @fd_flags:  flags to set on returned dma-buf fd
+ * @heap_flags: flags to pass to the dma heap
+ *
+ * This is for internal dma-buf allocations only. Free returned buffers with 
dma_buf_put().
+ */
+struct dma_buf *dma_heap_buffer_alloc(struct dma_heap *heap, size_t len,
+ unsigned int fd_flags,
+ unsigned int heap_flags)
 {
-   struct dma_buf *dmabuf;
-   int fd;
+   if (fd_flags & ~DMA_HEAP_VALID_FD_FLAGS)
+   return ERR_PTR(-EINVAL);
+
+   if (heap_flags & ~DMA_HEAP_VALID_HEAP_FLAGS)
+   return ERR_PTR(-EINVAL);
 
/*
 * Allocations from all heaps have to begin
@@ -64,9 +76,20 @@ static int dma_heap_buffer_alloc(struct dma_heap *heap, 
size_t len,
 */
len = PAGE_ALIGN(len);
if (!len)
-   return -EINVAL;
+   return ERR_PTR(-EINVAL);
+
+   return heap->ops->allocate(heap, len, fd_flags, heap_flags);
+}
+EXPORT_SYMBOL_GPL(dma_heap_buffer_alloc);
 
-   dmabuf = heap->ops->allocate(heap, len, fd_flags, heap_flags);
+static int dma_heap_bufferfd_alloc(struct dma_heap *heap, size_t len,
+  unsigned int fd_flags,
+  unsigned int heap_flags)
+{
+   struct dma_buf *dmabuf;
+   int fd;
+
+   dmabuf = dma_heap_buffer_alloc(heap, len, fd_flags, heap_flags);
if (IS_ERR(dmabuf))
return PTR_ERR(dmabuf);
 
@@ -104,15 +127,9 @@ static long dma_heap_ioctl_allocate(struct file *file, 
void *data)
if (heap_allocation->fd)
return -EINVAL;
 
-   if (heap_allocation->fd_flags & ~DMA_HEAP_VALID_FD_FLAGS)
-   return -EINVAL;
-
-   if (heap_allocation->heap_flags & ~DMA_HEAP_VALID_HEAP_FLAGS)
-   return -EINVAL;
-
-   fd = dma_heap_buffer_alloc(heap, heap_allocation->len,
-  heap_allocation->fd_flags,
-  heap_allocation->heap_flags);
+   fd = dma_heap_bufferfd_alloc(heap, heap_allocation->len,
+heap_allocation->fd_flags,
+heap_allocation->heap_flags);
if (fd < 0)
return fd;
 
@@ -205,6 +222,7 @@ void *dma_heap_get_drvdata(struct dma_heap *heap)
 {
return heap->priv;
 }
+EXPORT_SYMBOL_GPL(dma_heap_get_drvdata);
 
 /**
  * dma_heap_get_name - get heap name
@@ -217,6 +235,7 @@ const char *dma_heap_get_name(struct dma_heap *heap)
 {
return heap->name;
 }
+EXPORT_SYMBOL_GPL(dma_heap_get_name);
 
 /**
  * dma_heap_add - adds a heap to dmabuf heaps
@@ -307,6 +326,37 @@ struct dma_heap *dma_heap_add(const struct 
dma_heap_export_info *exp_info)
kfree(heap);
return err_ret;
 }
+EXPORT_SYMBOL_GPL(dma_heap_add);
+
+/**
+ * dma_heap_find - get the heap registered with the specified name
+ * @name: Name of the DMA-Heap to find
+ *
+ * Returns:
+ * The DMA-Heap with the provided name.
+ *
+ * NOTE: DMA-Heaps returned from this function MUST be released using
+ * dma_heap_put() when the user is done to enable the heap to be unloaded.
+ */
+struct dma_heap *dma_heap_find(const char *name)
+{
+   struct dma_heap *h;
+
+   mutex_lock(&heap_list_lock);
+   list_for_each_entry(h, &heap_list, list) {
+   if (!kref_get_unless_zero(&h->refcount))
+   continue;
+
+   if (!strcmp(h->name, name)) {
+   mutex_unlock(&heap_list_lock);
+   return 

[PATCH v4, 05/22] dma-buf: heaps: Deduplicate docs and adopt common format

2024-01-28 Thread Yunfei Dong
From: "T.J. Mercier" 

The docs for dma_heap_get_name were incorrect, and since they were
duplicated in the header they were wrong there too.

The docs formatting was inconsistent so I tried to make it more
consistent across functions since I'm already in here doing cleanup.

Remove multiple unused includes and alphabetize.

Signed-off-by: T.J. Mercier 
Signed-off-by: Yong Wu 
[Yong: Just add a comment for "priv" to mute build warning]
Signed-off-by: Yunfei Dong 
---
 drivers/dma-buf/dma-heap.c | 27 +++
 include/linux/dma-heap.h   | 21 +
 2 files changed, 16 insertions(+), 32 deletions(-)

diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
index 84ae708fafe7..22f6c193db0d 100644
--- a/drivers/dma-buf/dma-heap.c
+++ b/drivers/dma-buf/dma-heap.c
@@ -7,17 +7,15 @@
  */
 
 #include 
-#include 
 #include 
 #include 
+#include 
 #include 
-#include 
 #include 
-#include 
 #include 
-#include 
 #include 
-#include 
+#include 
+#include 
 #include 
 
 #define DEVNAME "dma_heap"
@@ -28,9 +26,10 @@
  * struct dma_heap - represents a dmabuf heap in the system
  * @name:  used for debugging/device-node name
  * @ops:   ops struct for this heap
- * @heap_devt  heap device node
- * @list   list head connecting to list of heaps
- * @heap_cdev  heap char device
+ * @priv:  private data for this heap
+ * @heap_devt: heap device node
+ * @list:  list head connecting to list of heaps
+ * @heap_cdev: heap char device
  *
  * Represents a heap of memory from which buffers can be made.
  */
@@ -193,11 +192,11 @@ static const struct file_operations dma_heap_fops = {
 };
 
 /**
- * dma_heap_get_drvdata() - get per-subdriver data for the heap
+ * dma_heap_get_drvdata - get per-heap driver data
  * @heap: DMA-Heap to retrieve private data for
  *
  * Returns:
- * The per-subdriver data for the heap.
+ * The per-heap data for the heap.
  */
 void *dma_heap_get_drvdata(struct dma_heap *heap)
 {
@@ -205,8 +204,8 @@ void *dma_heap_get_drvdata(struct dma_heap *heap)
 }
 
 /**
- * dma_heap_get_name() - get heap name
- * @heap: DMA-Heap to retrieve private data for
+ * dma_heap_get_name - get heap name
+ * @heap: DMA-Heap to retrieve the name of
  *
  * Returns:
  * The char* for the heap name.
@@ -216,6 +215,10 @@ const char *dma_heap_get_name(struct dma_heap *heap)
return heap->name;
 }
 
+/**
+ * dma_heap_add - adds a heap to dmabuf heaps
+ * @exp_info: information needed to register this heap
+ */
 struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info)
 {
struct dma_heap *heap, *h, *err_ret;
diff --git a/include/linux/dma-heap.h b/include/linux/dma-heap.h
index 0c05561cad6e..fbe86ec889a8 100644
--- a/include/linux/dma-heap.h
+++ b/include/linux/dma-heap.h
@@ -9,14 +9,13 @@
 #ifndef _DMA_HEAPS_H
 #define _DMA_HEAPS_H
 
-#include 
 #include 
 
 struct dma_heap;
 
 /**
  * struct dma_heap_ops - ops to operate on a given heap
- * @allocate:  allocate dmabuf and return struct dma_buf ptr
+ * @allocate:  allocate dmabuf and return struct dma_buf ptr
  *
  * allocate returns dmabuf on success, ERR_PTR(-errno) on error.
  */
@@ -41,28 +40,10 @@ struct dma_heap_export_info {
void *priv;
 };
 
-/**
- * dma_heap_get_drvdata() - get per-heap driver data
- * @heap: DMA-Heap to retrieve private data for
- *
- * Returns:
- * The per-heap data for the heap.
- */
 void *dma_heap_get_drvdata(struct dma_heap *heap);
 
-/**
- * dma_heap_get_name() - get heap name
- * @heap: DMA-Heap to retrieve private data for
- *
- * Returns:
- * The char* for the heap name.
- */
 const char *dma_heap_get_name(struct dma_heap *heap);
 
-/**
- * dma_heap_add - adds a heap to dmabuf heaps
- * @exp_info:  information needed to register this heap
- */
 struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info);
 
 #endif /* _DMA_HEAPS_H */
-- 
2.18.0



[PATCH v4,01/22] v4l2: add restricted memory flags

2024-01-28 Thread Yunfei Dong
From: Jeffrey Kardatzke 

Adds a V4L2 flag which indicates that a queue is using restricted
dmabufs and the corresponding capability flag.
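
The relationship between the request flag and the capability flag can be
sketched as follows. The three macro values are taken from this patch, but
reqbufs_check_flags() is a hypothetical helper, and returning -EINVAL for an
unsupported request is an assumption of this model rather than confirmed
kernel behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined by this patch. */
#define V4L2_MEMORY_FLAG_NON_COHERENT		(1 << 0)
#define V4L2_MEMORY_FLAG_RESTRICTED		(1 << 1)
#define V4L2_BUF_CAP_SUPPORTS_RESTRICTED_MEM	(1 << 8)

/*
 * Check the flags of a buffer request against the queue capabilities.
 * On success, stores the flags the queue will honour in *accepted.
 */
int reqbufs_check_flags(uint32_t caps, int is_dmabuf, uint32_t flags,
			uint32_t *accepted)
{
	assert(accepted != NULL);

	if (flags & V4L2_MEMORY_FLAG_RESTRICTED) {
		/* The queue must advertise support for restricted memory. */
		if (!(caps & V4L2_BUF_CAP_SUPPORTS_RESTRICTED_MEM))
			return -EINVAL;
		/* Restricted buffers are only meaningful for dmabuf import. */
		if (!is_dmabuf)
			return -EINVAL;
	}

	*accepted = flags & (V4L2_MEMORY_FLAG_NON_COHERENT |
			     V4L2_MEMORY_FLAG_RESTRICTED);
	return 0;
}
```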

Signed-off-by: Jeffrey Kardatzke 
Signed-off-by: Yunfei Dong 
---
 include/media/videobuf2-core.h | 8 +++-
 include/uapi/linux/videodev2.h | 2 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 56719a26a46c..047d4798e423 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -518,6 +518,9 @@ struct vb2_buf_ops {
  * ->finish().
  * @non_coherent_mem: when set queue will attempt to allocate buffers using
  * non-coherent memory.
+ * @allow_restricted_mem: when set user-space can pass the 
%V4L2_MEMORY_FLAG_RESTRICTED
+ * flag to indicate the dma bufs are restricted.
+ * @restricted_mem: when set queue will verify that the dma bufs are 
restricted.
  * @lock:  pointer to a mutex that protects the  vb2_queue. The
  * driver can set this to a mutex to let the v4l2 core serialize
  * the queuing ioctls. If the driver wants to handle locking
@@ -604,6 +607,8 @@ struct vb2_queue {
unsigned intuses_requests:1;
unsigned intallow_cache_hints:1;
unsigned intnon_coherent_mem:1;
+   unsigned intallow_restricted_mem:1;
+   unsigned intrestricted_mem:1;
 
struct mutex*lock;
void*owner;
@@ -773,7 +778,8 @@ void vb2_core_querybuf(struct vb2_queue *q, struct 
vb2_buffer *vb, void *pb);
  * @q: pointer to  vb2_queue with videobuf2 queue.
  * @memory:memory type, as defined by  vb2_memory.
  * @flags: auxiliary queue/buffer management flags. Currently, the only
- * used flag is %V4L2_MEMORY_FLAG_NON_COHERENT.
+ * used flags are %V4L2_MEMORY_FLAG_NON_COHERENT and
+ * %V4L2_MEMORY_FLAG_RESTRICTED.
  * @count: requested buffer count.
  *
  * Videobuf2 core helper to implement VIDIOC_REQBUF() operation. It is called
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 68e7ac178cc2..3e3f8d4b7c81 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -1026,6 +1026,7 @@ struct v4l2_requestbuffers {
 };
 
 #define V4L2_MEMORY_FLAG_NON_COHERENT  (1 << 0)
+#define V4L2_MEMORY_FLAG_RESTRICTED    (1 << 1)
 
 /* capabilities for struct v4l2_requestbuffers and v4l2_create_buffers */
 #define V4L2_BUF_CAP_SUPPORTS_MMAP (1 << 0)
@@ -1036,6 +1037,7 @@ struct v4l2_requestbuffers {
 #define V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF (1 << 5)
 #define V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS (1 << 6)
 #define V4L2_BUF_CAP_SUPPORTS_MAX_NUM_BUFFERS  (1 << 7)
+#define V4L2_BUF_CAP_SUPPORTS_RESTRICTED_MEM   (1 << 8)
 
 /**
  * struct v4l2_plane - plane info for multi-planar buffers
-- 
2.18.0
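
A minimal userspace-side sketch of how the new flag and capability bit could be paired, assuming the values from the patch above. The helper name pick_reqbufs_flags() is hypothetical and is not part of the series; the macros are redefined locally because they are not assumed to be present in installed uapi headers.

```c
#include <stdint.h>

/* Values copied from the patch above; guarded in case a header has them. */
#ifndef V4L2_MEMORY_FLAG_RESTRICTED
#define V4L2_MEMORY_FLAG_RESTRICTED             (1u << 1)
#endif
#ifndef V4L2_BUF_CAP_SUPPORTS_RESTRICTED_MEM
#define V4L2_BUF_CAP_SUPPORTS_RESTRICTED_MEM    (1u << 8)
#endif

/*
 * Hypothetical helper: given the capabilities reported by an earlier
 * VIDIOC_REQBUFS/VIDIOC_CREATE_BUFS call, choose the memory flags to
 * request. The restricted flag is only set when the queue advertises
 * support for it.
 */
static uint32_t pick_reqbufs_flags(uint32_t caps, int want_restricted)
{
	uint32_t flags = 0;

	if (want_restricted && (caps & V4L2_BUF_CAP_SUPPORTS_RESTRICTED_MEM))
		flags |= V4L2_MEMORY_FLAG_RESTRICTED;

	return flags;
}
```

The returned value would go into the flags field of struct v4l2_requestbuffers together with V4L2_MEMORY_DMABUF as the memory type.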



[PATCH v4,02/22] v4l2: handle restricted memory flags in queue setup

2024-01-28 Thread Yunfei Dong
From: Jeffrey Kardatzke 

Validates the restricted memory flags when setting up a queue and
ensures the queue has the proper capability.

Signed-off-by: Jeffrey Kardatzke 
Signed-off-by: Yunfei Dong 
---
 .../media/common/videobuf2/videobuf2-core.c   | 21 +++
 .../media/common/videobuf2/videobuf2-v4l2.c   |  4 +++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index 41a832dd1426..7f5fd9ec9117 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -813,6 +813,15 @@ static bool verify_coherency_flags(struct vb2_queue *q, bool non_coherent_mem)
return true;
 }
 
+static bool verify_restricted_mem_flags(struct vb2_queue *q, bool restricted_mem)
+{
+   if (restricted_mem != q->restricted_mem) {
+   dprintk(q, 1, "restricted memory model mismatch\n");
+   return false;
+   }
+   return true;
+}
+
 int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
 unsigned int flags, unsigned int *count)
 {
@@ -820,6 +829,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
unsigned int q_num_bufs = vb2_get_num_buffers(q);
unsigned plane_sizes[VB2_MAX_PLANES] = { };
bool non_coherent_mem = flags & V4L2_MEMORY_FLAG_NON_COHERENT;
+   bool restricted_mem = flags & V4L2_MEMORY_FLAG_RESTRICTED;
unsigned int i;
int ret = 0;
 
@@ -862,6 +872,9 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
return 0;
}
 
+   if (restricted_mem && (!q->allow_restricted_mem || memory != V4L2_MEMORY_DMABUF))
+   return -EINVAL;
+
/*
 * Make sure the requested values and current defaults are sane.
 */
@@ -882,6 +895,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
if (ret)
return ret;
set_queue_coherency(q, non_coherent_mem);
+   q->restricted_mem = restricted_mem;
 
/*
 * Ask the driver how many buffers and planes per buffer it requires.
@@ -986,6 +1000,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
unsigned plane_sizes[VB2_MAX_PLANES] = { };
bool non_coherent_mem = flags & V4L2_MEMORY_FLAG_NON_COHERENT;
unsigned int q_num_bufs = vb2_get_num_buffers(q);
+   bool restricted_mem = flags & V4L2_MEMORY_FLAG_RESTRICTED;
bool no_previous_buffers = !q_num_bufs;
int ret = 0;
 
@@ -994,6 +1009,9 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
return -ENOBUFS;
}
 
+   if (restricted_mem && (!q->allow_restricted_mem || memory != V4L2_MEMORY_DMABUF))
+   return -EINVAL;
+
if (no_previous_buffers) {
if (q->waiting_in_dqbuf && *count) {
dprintk(q, 1, "another dup()ped fd is waiting for a buffer\n");
@@ -1015,6 +1033,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
return ret;
q->waiting_for_buffers = !q->is_output;
set_queue_coherency(q, non_coherent_mem);
+   q->restricted_mem = restricted_mem;
} else {
if (q->memory != memory) {
dprintk(q, 1, "memory model mismatch\n");
@@ -1022,6 +1041,8 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
}
if (!verify_coherency_flags(q, non_coherent_mem))
return -EINVAL;
+   if (!verify_restricted_mem_flags(q, restricted_mem))
+   return -EINVAL;
}
 
num_buffers = min(*count, q->max_num_buffers - q_num_bufs);
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index 54d572c3b515..e825b1e2e22f 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -686,6 +686,8 @@ static void fill_buf_caps(struct vb2_queue *q, u32 *caps)
*caps |= V4L2_BUF_CAP_SUPPORTS_MMAP_CACHE_HINTS;
if (q->supports_requests)
*caps |= V4L2_BUF_CAP_SUPPORTS_REQUESTS;
+   if (q->allow_restricted_mem && q->io_modes & VB2_DMABUF)
+   *caps |= V4L2_BUF_CAP_SUPPORTS_RESTRICTED_MEM;
 }
 
 static void validate_memory_flags(struct vb2_queue *q,
@@ -700,7 +702,7 @@ static void validate_memory_flags(struct vb2_queue *q,
*flags = 0;
} else {
/* Clear all unknown flags. */
-   *flags &= V4L2_MEMORY_FLAG_NON_COHERENT;
+   *flags &= V4L2_MEMORY_FLAG_NON_COHERENT | V4L2_MEMORY_FLAG_RESTRICTED;
}
 }
 
-- 
2.18.0
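
The checks added by this patch can be modelled outside the kernel roughly as follows. This is a simplified sketch, not the vb2 code: the ex_ types and ex_create_bufs() are hypothetical stand-ins for struct vb2_queue and vb2_core_create_bufs(), and only the restricted-memory logic is kept.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the vb2 memory types and the new flag. */
enum ex_memory { EX_MEMORY_MMAP = 1, EX_MEMORY_USERPTR = 2, EX_MEMORY_DMABUF = 4 };
#define EX_FLAG_RESTRICTED (1u << 1)

struct ex_queue {
	bool allow_restricted_mem;  /* set by the driver */
	bool restricted_mem;        /* latched when the first buffers are created */
	unsigned int num_buffers;
};

static int ex_create_bufs(struct ex_queue *q, enum ex_memory memory,
			  unsigned int flags, unsigned int count)
{
	bool restricted_mem = flags & EX_FLAG_RESTRICTED;

	/* Restricted buffers need driver opt-in and the DMABUF memory type. */
	if (restricted_mem && (!q->allow_restricted_mem || memory != EX_MEMORY_DMABUF))
		return -EINVAL;

	if (q->num_buffers == 0)
		q->restricted_mem = restricted_mem;      /* first allocation decides */
	else if (restricted_mem != q->restricted_mem)
		return -EINVAL;                          /* later calls must match */

	q->num_buffers += count;
	return 0;
}
```

The second branch mirrors verify_restricted_mem_flags(): once a queue holds buffers, a later request cannot switch between restricted and non-restricted memory.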



Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

2024-01-28 Thread Dmitry Osipenko
On 1/26/24 21:12, Boris Brezillon wrote:
> On Fri, 26 Jan 2024 19:27:49 +0300
> Dmitry Osipenko  wrote:
> 
>> On 1/26/24 12:55, Boris Brezillon wrote:
>>> On Fri, 26 Jan 2024 00:56:47 +0300
>>> Dmitry Osipenko  wrote:
>>>   
>>>> On 1/25/24 13:19, Boris Brezillon wrote:
>>>>> On Fri,  5 Jan 2024 21:46:16 +0300
>>>>> Dmitry Osipenko  wrote:
>>>>> 
>>>>>> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object
>>>>>> *shmem)
>>>>>> +{
>>>>>> +return (shmem->madv >= 0) && shmem->base.funcs->evict &&
>>>>>> +refcount_read(&shmem->pages_use_count) &&
>>>>>> +!refcount_read(&shmem->pages_pin_count) &&
>>>>>> +!shmem->base.dma_buf && !shmem->base.import_attach &&
>>>>>> +!shmem->evicted;
>>>>>
>>>>> Are we missing
>>>>>
>>>>> && dma_resv_test_signaled(shmem->base.resv,
>>>>> DMA_RESV_USAGE_BOOKKEEP)
>>>>>
>>>>> to make sure the GPU is done using the BO?
>>>>> The same applies to drm_gem_shmem_is_purgeable() BTW.
>>>>>
>>>>> If you don't want to do this test here, we need a way to let drivers
>>>>> provide a custom is_{evictable,purgeable}() test.
>>>>>
>>>>> I guess we should also expose drm_gem_shmem_shrinker_update_lru_locked()
>>>>> to let drivers move the GEMs that were used most recently (those
>>>>> referenced by a GPU job) at the end of the evictable LRU.
>>>>
>>>> We have the signaled-check in the common drm_gem_evict() helper:
>>>>
>>>> https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_gem.c#L1496
>>>>
>>>
>>> Ah, indeed. I'll need DMA_RESV_USAGE_BOOKKEEP instead of
>>> DMA_RESV_USAGE_READ in panthor, but I can add it in the driver specific  
>>> ->evict() hook (though that means calling dma_resv_test_signaled()  
>>> twice, which is not great, oh well).  
>>
>> Maybe we should change drm_gem_evict() to use BOOKKEEP. The
>> test_signaled(BOOKKEEP) should be a "stronger" check than
>> test_signaled(READ)?
> 
> It is, just wondering if some users have a good reason to want
> READ here.
> 
>>
>>> The problem about the evictable LRU remains though: we need a way to let
>>> drivers put their BOs at the end of the list when the BO has been used
>>> by the GPU, don't we?  
>>
>> If BO is in use, then it won't be evicted, while idling BOs will be
>> evicted. Hence, the used BOs will be naturally moved down the LRU list
>> each time shrinker is invoked.
>>
> 
> That only does the trick if the BOs being used most often are busy when
> the shrinker kicks in though. Let's take this scenario:
> 
> 
> BO 1                BO 2                                shrinker
>
>                     busy
>                     idle (first-pos-in-evictable-LRU)
>
> busy
> idle (second-pos-in-evictable-LRU)
>
>                     busy
>                     idle
>
>                     busy
>                     idle
>
>                     busy
>                     idle
>
>                                                         find a BO to evict
>                                                         pick BO 2
>
>                     busy (swapin)
>                     idle
> 
> If the LRU had been updated at each busy event, BO 1 should have
> been picked for eviction. But we evicted the BO that was first
> recorded idle instead of the one that was least recently
> recorded busy.

You have to swapin(BO) every time BO goes to busy state, and swapin does 
drm_gem_lru_move_tail(BO). Hence, each time BO goes idle->busy, it's moved down 
the LRU list.

For example, please see patch #29 where virtio-gpu invokes swapin for each 
job's BO in the submit()->virtio_gpu_array_prepare() code path.

-- 
Best regards,
Dmitry
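
The move-to-tail behaviour described in this reply can be illustrated with a small standalone model. This is only a sketch under the assumption that every idle->busy transition moves the BO to the LRU tail; struct ex_lru and its helpers are hypothetical and unrelated to the real drm_gem_lru implementation.

```c
#include <stddef.h>

#define EX_LRU_MAX 8

/* Hypothetical array-backed LRU: index 0 is the next eviction candidate. */
struct ex_lru {
	int ids[EX_LRU_MAX];
	size_t len;
};

/*
 * Move a BO id to the tail, adding it if it is not tracked yet.
 * A new id is silently dropped if the list is already full.
 */
static void ex_lru_move_tail(struct ex_lru *lru, int id)
{
	size_t i, j = 0;

	/* Compact the list, skipping the entry being moved. */
	for (i = 0; i < lru->len; i++)
		if (lru->ids[i] != id)
			lru->ids[j++] = lru->ids[i];

	if (j < EX_LRU_MAX)
		lru->ids[j++] = id;
	lru->len = j;
}

/* Return the id at the head of the list, or -1 if the list is empty. */
static int ex_lru_pick_victim(const struct ex_lru *lru)
{
	return lru->len ? lru->ids[0] : -1;
}
```

Replaying the scenario quoted above (BO 2 busy, BO 1 busy, then BO 2 busy three more times) with a move-to-tail on each busy event leaves BO 1 at the head, so BO 1 is the one picked for eviction.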




Re: [PATCH 14/17] drm/msm/dpu: modify encoder programming for CDM over DP

2024-01-28 Thread Dmitry Baryshkov
On Mon, 29 Jan 2024 at 07:03, Abhinav Kumar  wrote:
>
>
>
> On 1/28/2024 7:42 PM, Dmitry Baryshkov wrote:
> > On Mon, 29 Jan 2024 at 04:58, Abhinav Kumar  
> > wrote:
> >>
> >>
> >>
> >> On 1/27/2024 9:55 PM, Dmitry Baryshkov wrote:
> >>> On Sun, 28 Jan 2024 at 07:48, Paloma Arellano  
> >>> wrote:
> 
> 
>  On 1/25/2024 1:57 PM, Dmitry Baryshkov wrote:
> > On 25/01/2024 21:38, Paloma Arellano wrote:
> >> Adjust the encoder format programming in the case of video mode for DP
> >> to accommodate CDM related changes.
> >>
> >> Signed-off-by: Paloma Arellano 
> >> ---
> >> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c   | 16 +
> >> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h   |  8 +
> >> .../drm/msm/disp/dpu1/dpu_encoder_phys_vid.c  | 35 
> >> ---
> >> drivers/gpu/drm/msm/dp/dp_display.c   | 12 +++
> >> drivers/gpu/drm/msm/msm_drv.h |  9 -
> >> 5 files changed, 75 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> >> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> >> index b0896814c1562..99ec53446ad21 100644
> >> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> >> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> >> @@ -222,6 +222,22 @@ static u32 dither_matrix[DITHER_MATRIX_SZ] = {
> >> 15, 7, 13, 5, 3, 11, 1, 9, 12, 4, 14, 6, 0, 8, 2, 10
> >> };
> >> +u32 dpu_encoder_get_drm_fmt(const struct drm_encoder *drm_enc,
> >> const struct drm_display_mode *mode)
> >> +{
> >> +const struct dpu_encoder_virt *dpu_enc;
> >> +const struct msm_display_info *disp_info;
> >> +struct msm_drm_private *priv;
> >> +
> >> +dpu_enc = to_dpu_encoder_virt(drm_enc);
> >> +disp_info = &dpu_enc->disp_info;
> >> +priv = drm_enc->dev->dev_private;
> >> +
> >> +if (disp_info->intf_type == INTF_DP &&
> >> + msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]],
> >> mode))
> >
> > This should not require interacting with DP. If we got here, we must
> > be sure that 4:2:0 is supported and can be configured.
>  Ack. Will drop this function and only check for if the mode is YUV420.
> >
> >> +return DRM_FORMAT_YUV420;
> >> +
> >> +return DRM_FORMAT_RGB888;
> >> +}
> >>   bool dpu_encoder_is_widebus_enabled(const struct drm_encoder
> >> *drm_enc)
> >> {
> >> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
> >> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
> >> index 7b4afa71f1f96..62255d0aa4487 100644
> >> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
> >> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
> >> @@ -162,6 +162,14 @@ int dpu_encoder_get_vsync_count(struct
> >> drm_encoder *drm_enc);
> >>  */
> >> bool dpu_encoder_is_widebus_enabled(const struct drm_encoder
> >> *drm_enc);
> >> +/**
> >> + * dpu_encoder_get_drm_fmt - return DRM fourcc format
> >> + * @drm_enc:Pointer to previously created drm encoder structure
> >> + * @mode:Corresponding drm_display_mode for dpu encoder
> >> + */
> >> +u32 dpu_encoder_get_drm_fmt(const struct drm_encoder *drm_enc,
> >> +const struct drm_display_mode *mode);
> >> +
> >> /**
> >>  * dpu_encoder_get_crc_values_cnt - get number of physical encoders
> >> contained
> >>  *in virtual encoder that can collect CRC values
> >> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
> >> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
> >> index e284bf448bdda..a1dde0ff35dc8 100644
> >> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
> >> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
> >> @@ -234,6 +234,7 @@ static void
> >> dpu_encoder_phys_vid_setup_timing_engine(
> >> {
> >> struct drm_display_mode mode;
> >> struct dpu_hw_intf_timing_params timing_params = { 0 };
> >> +struct dpu_hw_cdm *hw_cdm;
> >> const struct dpu_format *fmt = NULL;
> >> u32 fmt_fourcc = DRM_FORMAT_RGB888;
> >> unsigned long lock_flags;
> >> @@ -254,17 +255,26 @@ static void
> >> dpu_encoder_phys_vid_setup_timing_engine(
> >> DPU_DEBUG_VIDENC(phys_enc, "enabling mode:\n");
> >> drm_mode_debug_printmodeline(&mode);
> >> -if (phys_enc->split_role != ENC_ROLE_SOLO) {
> >> +hw_cdm = phys_enc->hw_cdm;
> >> +if (hw_cdm) {
> >> +intf_cfg.cdm = hw_cdm->idx;
> >> +fmt_fourcc = dpu_encoder_get_drm_fmt(phys_enc->parent, &mode);
> >> +}
> >> +
> >> +if (phys_enc->split_role != ENC_ROLE_SOLO ||
> >> +dpu_encoder_get_drm_fmt(phys_enc->parent, &mode) 

Re: [PATCH 17/17] drm/msm/dp: allow YUV420 mode for DP connector when VSC SDP supported

2024-01-28 Thread Abhinav Kumar




On 1/28/2024 9:05 PM, Dmitry Baryshkov wrote:

On Mon, 29 Jan 2024 at 06:30, Abhinav Kumar  wrote:




On 1/28/2024 7:52 PM, Dmitry Baryshkov wrote:

On Mon, 29 Jan 2024 at 05:17, Abhinav Kumar  wrote:




On 1/25/2024 2:05 PM, Dmitry Baryshkov wrote:

On 25/01/2024 21:38, Paloma Arellano wrote:

All the components of YUV420 over DP are added. Therefore, let's mark the
connector property as true for DP connector when the DP type is not eDP
and when VSC SDP is supported.

Signed-off-by: Paloma Arellano 
---
drivers/gpu/drm/msm/dp/dp_display.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c
b/drivers/gpu/drm/msm/dp/dp_display.c
index 4329435518351..97edd607400b8 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -370,11 +370,14 @@ static int dp_display_process_hpd_high(struct
dp_display_private *dp)
dp_link_process_request(dp->link);
-if (!dp->dp_display.is_edp)
+if (!dp->dp_display.is_edp) {
+if (dp_panel_vsc_sdp_supported(dp->panel))
+dp->dp_display.connector->ycbcr_420_allowed = true;


Please consider fixing a TODO in drm_bridge_connector_init().



I am not totally clear if that TODO can ever go for DP/HDMI usage of
drm_bridge_connector.

We do not know if the sink supports VSC SDP till we read the DPCD and
till we know that sink supports VSC SDP, there is no reason to mark the
YUV modes as supported. This is the same logic followed across vendors.

drm_bridge_connector_init() happens much earlier than the point where we
read DPCD. The only thing which can be done is perhaps add some callback
to update_ycbcr_420_allowed once DPCD is read. But I don't think it's
absolutely necessary to have a callback just for this.


After checking the drm_connector docs, I'd still hold my opinion and
consider this patch to be a misuse of the property. If you check the
drm_connector::ycbcr_420_allowed docs, you'll see that it describes
the output from the source point of view. In other words, it should be
true if the DP connector can send YUV420 rather than being set if the
attached display supports such output. This matches ycbcr420_allowed
usage by AMD, dw-hdmi, intel_hdmi and even intel_dp usage.



hmmm I think I misread intel_dp_update_420(). I saw this is called after
HPD so I thought they unset ycbcr_420_allowed if VSC SDP is not
supported. But they have other DPCD checking there so anyway they will
fail this bridge_connector_init() model.

But one argument which I can give in my defense is, let's say the sink
exposed YUV formats but did not support SDP, then atomic_check() will
keep failing or should keep failing. This will avoid this scenario. But
we can assume that would be a rogue sink.


This should be handled in DP's atomic_check. As usual, bonus point if
this is done via helpers that can be reused by other platforms.


I think we can pass a yuv_supported flag to msm_dp_modeset_init() and
set it to true from dpu_kms if catalog has CDM block and get rid of the
dp_panel_vsc_sdp_supported().


These are two different issues. CDM should be checked in DPU (whether
the DPU can provide YUV data to the DP block).



Yes, I found this issue while discussing this. We need to make this change.



But that doesn't address the TODO you have pointed to. What is really the
expectation of the TODO? Do we need to pass a ycbcr_420_allowed flag to
drm_bridge_connector_init()?


Ugh. No. I was thinking about a `ycbcr420_allowed` flag in the struct
drm_bridge (to follow existing interlace_allowed) flag. But, this
might be not the best option. Each bridge can either pass through YUV
data from the previous bridge or generate YCbCr data on its own. So in
theory this demands two flags plus one flag for the encoder. Which
might be an overkill, until we end up in a situation when the driver
can not decide for the full bridge chain.



Yes.


So let's probably ignore the TODO for the purpose of this series. Just
fix the usage of ycbcr420_allowed according to docs.



Ack.



That would need a tree wide cleanup and that's difficult to sign up for
in this series and I would not as well.

One thing which I can suggest to be less intrusive is have a new API
called drm_bridge_connector_init_with_YUV() which looks something like
below:

struct drm_connector *drm_bridge_connector_init_with_ycbcr_420(struct
drm_device *drm, struct drm_encoder *encoder)
{
 drm_bridge_connector_init();
 connector->ycbcr_420_allowed = true;
}

But I don't know if the community would be interested in this idea or
would find that useful.


drm_dp_set_subconnector_property(dp->dp_display.connector,
 connector_status_connected,
 dp->panel->dpcd,
 dp->panel->downstream_ports);
+}
edid = dp->panel->edid;












Re: [PATCH] drm/sched: Drain all entities in DRM sched run job worker

2024-01-28 Thread Luben Tuikov
On 2024-01-26 11:29, Matthew Brost wrote:
> On Fri, Jan 26, 2024 at 11:32:57AM +0100, Christian König wrote:
>> On 25.01.24 at 18:30, Matthew Brost wrote:
>>> On Thu, Jan 25, 2024 at 04:12:58PM +0100, Christian König wrote:

 Am 24.01.24 um 22:08 schrieb Matthew Brost:
> All entities must be drained in the DRM scheduler run job worker to

Hi Matt,

Thanks for the patch. Under close review, let's use "checked" instead of 
"drained",
to read as follows,

All entities must be checked in the DRM scheduler run job worker to ...

> avoid the following case. An entity found that is ready, no job found

Continue with the example given by using a colon, as follows,

... avoid the following case: an entity is found which is ready, yet
no job is returned for that entity when calling 
drm_sched_entity_pop_job(entity).
This causes the job worker to go idle. The correct behaviour is to loop
over all ready entities, until drm_sched_entity_pop_job(entity) returns 
non-NULL,
or there are no more ready entities.

> ready on entity, and run job worker goes idle with other entities + jobs
> ready. Draining all ready entities (i.e. loop over all ready entities)

You see here how "drain" isn't clear enough, and you clarify in parenthesis
that we in fact "loop over all ready entities". So, perhaps let's not use the
verb "drain" and simply use the sentence in the paragraph I've redacted above.

Also, let's please not use "drain" in the title, as it is confusing and makes me
think of capacitors, transistors, or buckets with water and Archimedes screws 
and siphons,
and instead say,

[PATCH]: drm/sched: Really find a ready entity and job in DRM sched run-job 
worker

Which makes it really simple and accessible a description. :-)

> in the run job worker ensures all job that are ready will be scheduled.
 That doesn't make sense. drm_sched_select_entity() only returns entities
 which are "ready", e.g. have a job to run.

>>> That is what I thought too, hence my original design but it is not
>>> exactly true. Let me explain.
>>>
>>> drm_sched_select_entity() returns an entity with a non-empty spsc queue
>>> (job in queue) and no *current* waiting dependencies [1]. Dependencies for
>>> an entity can be added when drm_sched_entity_pop_job() is called [2][3]
>>> returning a NULL job. Thus we can get into a scenario where 2 entities
>>> A and B both have jobs and no current dependencies. A's job is waiting on
>>> B's job, entity A gets selected first, a dependency gets installed in
>>> drm_sched_entity_pop_job(), run work goes idle, and now we deadlock.
>>
>> And here is the real problem. run work doesn't go idle in that moment.
>>
>> drm_sched_run_job_work() should restart itself until there is either no
>> more space in the ring buffer or it can't find a ready entity any more.
>>
>> At least that was the original design when that was all still driven by a
>> kthread.
>>
>> It can perfectly be that we messed this up when switching from kthread to a
>> work item.
>>
> 
> Right, that is what this patch does - the run worker does not go idle until
> no ready entities are found. That was incorrect in the original patch
> and fixed here. Do you have any issues with this fix? It has been tested
> 3 times and clearly fixes the issue.

Thanks for following up with Christian.

I agree, the fix makes sense and achieves the original intention as described
by Christian. Also, thanks to all who tested it. Good work, thanks!

With the above changes to the patch title and text addressed, this patch would 
be then,

Reviewed-by: Luben Tuikov 

-- 
Regards,
Luben

 
> 
> Matt
> 
>> Regards,
>> Christian.
>>
>>>
>>> The proper solution is to loop over all ready entities until one with a
>>> job is found via drm_sched_entity_pop_job() and then requeue the run
>>> job worker. Or loop over all entities until drm_sched_select_entity()
>>> returns NULL and then let the run job worker go idle. This is what the
>>> old threaded design did too [4]. Hope this clears everything up.
>>>
>>> Matt
>>>
>>> [1] 
>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler/sched_entity.c#L144
>>> [2] 
>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler/sched_entity.c#L464
>>> [3] 
>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler/sched_entity.c#L397
>>> [4] 
>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler/sched_main.c#L1011
>>>
 If that's not the case any more then you have broken something else.

 Regards,
 Christian.

> Cc: Thorsten Leemhuis 
> Reported-by: Mikhail Gavrilov 
> Closes: 
> https://lore.kernel.org/all/CABXGCsM2VLs489CH-vF-1539-s3in37=bwuowtoeee+q26z...@mail.gmail.com/
> Reported-and-tested-by: Mario Limonciello 
> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3124
> Link: 
> https://lore.kernel.org/all/20240123021155.2775-1-mario.limoncie...@amd.com/
> 
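
The selection loop discussed in this thread can be sketched as a standalone model. It assumes, as described above, that an entity can look ready and still return no job because a dependency is only discovered when the job is popped. All ex_ names are hypothetical and do not correspond to the real drm_sched structures.

```c
#include <stdbool.h>
#include <stddef.h>

struct ex_job { int id; };

struct ex_entity {
	struct ex_job *queued;  /* next job in the entity queue, or NULL */
	bool blocked;           /* models a dependency found only at pop time */
	bool waiting;           /* set once that dependency has been installed */
};

/* Ready means: a job is queued and no known dependency is pending. */
static bool ex_entity_is_ready(const struct ex_entity *e)
{
	return e->queued && !e->waiting;
}

/* May return NULL for a ready entity if a dependency shows up now. */
static struct ex_job *ex_entity_pop_job(struct ex_entity *e)
{
	struct ex_job *job;

	if (e->blocked) {
		e->waiting = true;
		return NULL;
	}
	job = e->queued;
	e->queued = NULL;
	return job;
}

/*
 * Check every ready entity until one yields a job. Returning NULL means
 * no ready entity is left, which is the only case where the worker
 * should go idle.
 */
static struct ex_job *ex_select_job(struct ex_entity *ents, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct ex_job *job;

		if (!ex_entity_is_ready(&ents[i]))
			continue;
		job = ex_entity_pop_job(&ents[i]);
		if (job)
			return job;
	}
	return NULL;
}
```

With entity A blocked on entity B's job, a worker that stops after A returns NULL never runs B; the loop above moves on to B and returns its job instead.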

Re: [PATCH 01/17] drm/msm/dpu: allow dpu_encoder_helper_phys_setup_cdm to work for DP

2024-01-28 Thread Dmitry Baryshkov
On Mon, 29 Jan 2024 at 06:33, Abhinav Kumar  wrote:
>
>
>
> On 1/28/2024 8:12 PM, Dmitry Baryshkov wrote:
> > On Mon, 29 Jan 2024 at 06:01, Abhinav Kumar  
> > wrote:
> >>
> >>
> >>
> >> On 1/28/2024 7:23 PM, Dmitry Baryshkov wrote:
> >>> On Mon, 29 Jan 2024 at 05:06, Abhinav Kumar  
> >>> wrote:
> 
> 
> 
>  On 1/26/2024 4:39 PM, Paloma Arellano wrote:
> >
> > On 1/25/2024 1:14 PM, Dmitry Baryshkov wrote:
> >> On 25/01/2024 21:38, Paloma Arellano wrote:
> >>> Generalize dpu_encoder_helper_phys_setup_cdm to be compatible with DP.
> >>>
> >>> Signed-off-by: Paloma Arellano 
> >>> ---
> >>> .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h  |  4 +--
> >>> .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c   | 31 
> >>> ++-
> >>> 2 files changed, 18 insertions(+), 17 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> >>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> >>> index 993f263433314..37ac385727c3b 100644
> >>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> >>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> >>> @@ -153,6 +153,7 @@ enum dpu_intr_idx {
> >>>  * @hw_intf:Hardware interface to the intf registers
> >>>  * @hw_wb:Hardware interface to the wb registers
> >>>  * @hw_cdm:Hardware interface to the CDM registers
> >>> + * @cdm_cfg:CDM block config needed to store WB/DP block's CDM
> >>> configuration
> >>
> >> Please realign the description.
> > Ack
> >>
> >>>  * @dpu_kms:Pointer to the dpu_kms top level
> >>>  * @cached_mode:DRM mode cached at mode_set time, acted on in
> >>> enable
> >>>  * @vblank_ctl_lock:Vblank ctl mutex lock to protect
> >>> vblank_refcount
> >>> @@ -183,6 +184,7 @@ struct dpu_encoder_phys {
> >>> struct dpu_hw_intf *hw_intf;
> >>> struct dpu_hw_wb *hw_wb;
> >>> struct dpu_hw_cdm *hw_cdm;
> >>> +struct dpu_hw_cdm_cfg cdm_cfg;
> >>
> >> It might be slightly better to move it after all the pointers, so
> >> after the dpu_kms.
> > Ack
> >>
> >>> struct dpu_kms *dpu_kms;
> >>> struct drm_display_mode cached_mode;
> >>> struct mutex vblank_ctl_lock;
> >>> @@ -213,7 +215,6 @@ static inline int
> >>> dpu_encoder_phys_inc_pending(struct dpu_encoder_phys *phys)
> >>>  * @wbirq_refcount: Reference count of writeback interrupt
> >>>  * @wb_done_timeout_cnt: number of wb done irq timeout errors
> >>>  * @wb_cfg:  writeback block config to store fb related details
> >>> - * @cdm_cfg: cdm block config needed to store writeback block's CDM
> >>> configuration
> >>>  * @wb_conn: backpointer to writeback connector
> >>>  * @wb_job: backpointer to current writeback job
> >>>  * @dest:   dpu buffer layout for current writeback output buffer
> >>> @@ -223,7 +224,6 @@ struct dpu_encoder_phys_wb {
> >>> atomic_t wbirq_refcount;
> >>> int wb_done_timeout_cnt;
> >>> struct dpu_hw_wb_cfg wb_cfg;
> >>> -struct dpu_hw_cdm_cfg cdm_cfg;
> >>> struct drm_writeback_connector *wb_conn;
> >>> struct drm_writeback_job *wb_job;
> >>> struct dpu_hw_fmt_layout dest;
> >>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> >>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> >>> index 4cd2d9e3131a4..072fc6950e496 100644
> >>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> >>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> >>> @@ -269,28 +269,21 @@ static void
> >>> dpu_encoder_phys_wb_setup_ctl(struct dpu_encoder_phys *phys_enc)
> >>>  * This API does not handle
> >>> DPU_CHROMA_H1V2.
> >>>  * @phys_enc:Pointer to physical encoder
> >>>  */
> >>> -static void dpu_encoder_helper_phys_setup_cdm(struct
> >>> dpu_encoder_phys *phys_enc)
> >>> +static void dpu_encoder_helper_phys_setup_cdm(struct
> >>> dpu_encoder_phys *phys_enc,
> >>> +  const struct dpu_format *dpu_fmt,
> >>> +  u32 output_type)
> >>> {
> >>> struct dpu_hw_cdm *hw_cdm;
> >>> struct dpu_hw_cdm_cfg *cdm_cfg;
> >>> struct dpu_hw_pingpong *hw_pp;
> >>> -struct dpu_encoder_phys_wb *wb_enc;
> >>> -const struct msm_format *format;
> >>> -const struct dpu_format *dpu_fmt;
> >>> -struct drm_writeback_job *wb_job;
> >>> int ret;
> >>>   if (!phys_enc)
> >>> return;
> >>> -wb_enc = to_dpu_encoder_phys_wb(phys_enc);
> >>> -cdm_cfg = &wb_enc->cdm_cfg;
> >>> +cdm_cfg = 

Re: [PATCH 17/17] drm/msm/dp: allow YUV420 mode for DP connector when VSC SDP supported

2024-01-28 Thread Dmitry Baryshkov
On Mon, 29 Jan 2024 at 06:30, Abhinav Kumar  wrote:
>
>
>
> On 1/28/2024 7:52 PM, Dmitry Baryshkov wrote:
> > On Mon, 29 Jan 2024 at 05:17, Abhinav Kumar  
> > wrote:
> >>
> >>
> >>
> >> On 1/25/2024 2:05 PM, Dmitry Baryshkov wrote:
> >>> On 25/01/2024 21:38, Paloma Arellano wrote:
> >>>> All the components of YUV420 over DP are added. Therefore, let's mark the
> >>>> connector property as true for DP connector when the DP type is not eDP
> >>>> and when VSC SDP is supported.
> >>>>
> >>>> Signed-off-by: Paloma Arellano 
> >>>> ---
> >>>>  drivers/gpu/drm/msm/dp/dp_display.c | 5 -
> >>>>  1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c
> >>>> b/drivers/gpu/drm/msm/dp/dp_display.c
> >>>> index 4329435518351..97edd607400b8 100644
> >>>> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> >>>> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> >>>> @@ -370,11 +370,14 @@ static int dp_display_process_hpd_high(struct
> >>>> dp_display_private *dp)
> >>>>  dp_link_process_request(dp->link);
> >>>> -if (!dp->dp_display.is_edp)
> >>>> +if (!dp->dp_display.is_edp) {
> >>>> +if (dp_panel_vsc_sdp_supported(dp->panel))
> >>>> +dp->dp_display.connector->ycbcr_420_allowed = true;
> >>>
> >>> Please consider fixing a TODO in drm_bridge_connector_init().
> >>>
> >>
> >> I am not totally clear if that TODO can ever go for DP/HDMI usage of
> >> drm_bridge_connector.
> >>
> >> We do not know if the sink supports VSC SDP till we read the DPCD and
> >> till we know that sink supports VSC SDP, there is no reason to mark the
> >> YUV modes as supported. This is the same logic followed across vendors.
> >>
> >> drm_bridge_connector_init() happens much earlier than the point where we
> >> read DPCD. The only thing which can be done is perhaps add some callback
> >> to update_ycbcr_420_allowed once DPCD is read. But I don't think it's
> >> absolutely necessary to have a callback just for this.
> >
> > After checking the drm_connector docs, I'd still hold my opinion and
> > consider this patch to be a misuse of the property. If you check the
> > drm_connector::ycbcr_420_allowed docs, you'll see that it describes
> > the output from the source point of view. In other words, it should be
> > true if the DP connector can send YUV420 rather than being set if the
> > attached display supports such output. This matches ycbcr420_allowed
> > usage by AMD, dw-hdmi, intel_hdmi and even intel_dp usage.
> >
>
> hmmm I think I misread intel_dp_update_420(). I saw this is called after
> HPD so I thought they unset ycbcr_420_allowed if VSC SDP is not
> supported. But they have other DPCD checking there so anyway they will
> fail this bridge_connector_init() model.
>
> But one argument which I can give in my defense is, let's say the sink
> exposed YUV formats but did not support SDP, then atomic_check() will
> keep failing or should keep failing. This will avoid this scenario. But
> we can assume that would be a rogue sink.

This should be handled in DP's atomic_check. As usual, bonus point if
this is done via helpers that can be reused by other platforms.

> I think we can pass a yuv_supported flag to msm_dp_modeset_init() and
> set it to true from dpu_kms if catalog has CDM block and get rid of the
> dp_panel_vsc_sdp_supported().

These are two different issues. CDM should be checked in DPU (whether
the DPU can provide YUV data to the DP block).

>
> But that doesn't address the TODO you have pointed to. What is really the
> expectation of the TODO? Do we need to pass a ycbcr_420_allowed flag to
> drm_bridge_connector_init()?

Ugh. No. I was thinking about a `ycbcr420_allowed` flag in the struct
drm_bridge (to follow existing interlace_allowed) flag. But, this
might be not the best option. Each bridge can either pass through YUV
data from the previous bridge or generate YCbCr data on its own. So in
theory this demands two flags plus one flag for the encoder. Which
might be an overkill, until we end up in a situation when the driver
can not decide for the full bridge chain.

So let's probably ignore the TODO for the purpose of this series. Just
fix the usage of ycbcr420_allowed according to docs.

>
> That would need a tree wide cleanup and that's difficult to sign up for
> in this series and I would not as well.
>
> One thing which I can suggest to be less intrusive is have a new API
> called drm_bridge_connector_init_with_YUV() which looks something like
> below:
>
> struct drm_connector *drm_bridge_connector_init_with_ycbcr_420(struct
> drm_device *drm, struct drm_encoder *encoder)
> {
> drm_bridge_connector_init();
> connector->ycbcr_420_allowed = true;
> }
>
> But I don't know if the community would be interested in this idea or
> would find that useful.
>
> drm_dp_set_subconnector_property(dp->dp_display.connector,
>  connector_status_connected,
>  

Re: [PATCH 14/17] drm/msm/dpu: modify encoder programming for CDM over DP

2024-01-28 Thread Abhinav Kumar




On 1/28/2024 7:42 PM, Dmitry Baryshkov wrote:

On Mon, 29 Jan 2024 at 04:58, Abhinav Kumar  wrote:




On 1/27/2024 9:55 PM, Dmitry Baryshkov wrote:

On Sun, 28 Jan 2024 at 07:48, Paloma Arellano  wrote:



On 1/25/2024 1:57 PM, Dmitry Baryshkov wrote:

On 25/01/2024 21:38, Paloma Arellano wrote:

Adjust the encoder format programming in the case of video mode for DP
to accommodate CDM related changes.

Signed-off-by: Paloma Arellano 
---
drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c   | 16 +
drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h   |  8 +
.../drm/msm/disp/dpu1/dpu_encoder_phys_vid.c  | 35 ---
drivers/gpu/drm/msm/dp/dp_display.c   | 12 +++
drivers/gpu/drm/msm/msm_drv.h |  9 -
5 files changed, 75 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index b0896814c1562..99ec53446ad21 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -222,6 +222,22 @@ static u32 dither_matrix[DITHER_MATRIX_SZ] = {
15, 7, 13, 5, 3, 11, 1, 9, 12, 4, 14, 6, 0, 8, 2, 10
};
+u32 dpu_encoder_get_drm_fmt(const struct drm_encoder *drm_enc,
const struct drm_display_mode *mode)
+{
+const struct dpu_encoder_virt *dpu_enc;
+const struct msm_display_info *disp_info;
+struct msm_drm_private *priv;
+
+dpu_enc = to_dpu_encoder_virt(drm_enc);
+disp_info = &dpu_enc->disp_info;
+priv = drm_enc->dev->dev_private;
+
+if (disp_info->intf_type == INTF_DP &&
+ msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]],
mode))


This should not require interacting with DP. If we got here, we must
be sure that 4:2:0 is supported and can be configured.

Ack. Will drop this function and only check for if the mode is YUV420.



+return DRM_FORMAT_YUV420;
+
+return DRM_FORMAT_RGB888;
+}
  bool dpu_encoder_is_widebus_enabled(const struct drm_encoder
*drm_enc)
{
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
index 7b4afa71f1f96..62255d0aa4487 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
@@ -162,6 +162,14 @@ int dpu_encoder_get_vsync_count(struct
drm_encoder *drm_enc);
 */
bool dpu_encoder_is_widebus_enabled(const struct drm_encoder
*drm_enc);
+/**
+ * dpu_encoder_get_drm_fmt - return DRM fourcc format
+ * @drm_enc:Pointer to previously created drm encoder structure
+ * @mode:Corresponding drm_display_mode for dpu encoder
+ */
+u32 dpu_encoder_get_drm_fmt(const struct drm_encoder *drm_enc,
+const struct drm_display_mode *mode);
+
/**
 * dpu_encoder_get_crc_values_cnt - get number of physical encoders
contained
 *in virtual encoder that can collect CRC values
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
index e284bf448bdda..a1dde0ff35dc8 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
@@ -234,6 +234,7 @@ static void
dpu_encoder_phys_vid_setup_timing_engine(
{
struct drm_display_mode mode;
struct dpu_hw_intf_timing_params timing_params = { 0 };
+struct dpu_hw_cdm *hw_cdm;
const struct dpu_format *fmt = NULL;
u32 fmt_fourcc = DRM_FORMAT_RGB888;
unsigned long lock_flags;
@@ -254,17 +255,26 @@ static void
dpu_encoder_phys_vid_setup_timing_engine(
DPU_DEBUG_VIDENC(phys_enc, "enabling mode:\n");
drm_mode_debug_printmodeline(&mode);
-if (phys_enc->split_role != ENC_ROLE_SOLO) {
+hw_cdm = phys_enc->hw_cdm;
+if (hw_cdm) {
+intf_cfg.cdm = hw_cdm->idx;
+fmt_fourcc = dpu_encoder_get_drm_fmt(phys_enc->parent, &mode);
+}
+
+if (phys_enc->split_role != ENC_ROLE_SOLO ||
+dpu_encoder_get_drm_fmt(phys_enc->parent, &mode) ==
DRM_FORMAT_YUV420) {
mode.hdisplay >>= 1;
mode.htotal >>= 1;
mode.hsync_start >>= 1;
mode.hsync_end >>= 1;
+mode.hskew >>= 1;


Separate patch.

Ack.



  DPU_DEBUG_VIDENC(phys_enc,
-"split_role %d, halve horizontal %d %d %d %d\n",
+"split_role %d, halve horizontal %d %d %d %d %d\n",
phys_enc->split_role,
mode.hdisplay, mode.htotal,
-mode.hsync_start, mode.hsync_end);
+mode.hsync_start, mode.hsync_end,
+mode.hskew);
}
  drm_mode_to_intf_timing_params(phys_enc, &mode, &timing_params);
@@ -412,8 +422,15 @@ static int dpu_encoder_phys_vid_control_vblank_irq(
static void dpu_encoder_phys_vid_enable(struct dpu_encoder_phys
*phys_enc)
{
struct dpu_hw_ctl *ctl;
+struct dpu_hw_cdm *hw_cdm;
+const struct dpu_format *fmt = NULL;
+u32 fmt_fourcc = DRM_FORMAT_RGB888;

Re: [PATCH 01/17] drm/msm/dpu: allow dpu_encoder_helper_phys_setup_cdm to work for DP

2024-01-28 Thread Abhinav Kumar




On 1/28/2024 8:12 PM, Dmitry Baryshkov wrote:

On Mon, 29 Jan 2024 at 06:01, Abhinav Kumar  wrote:




On 1/28/2024 7:23 PM, Dmitry Baryshkov wrote:

On Mon, 29 Jan 2024 at 05:06, Abhinav Kumar  wrote:




On 1/26/2024 4:39 PM, Paloma Arellano wrote:


On 1/25/2024 1:14 PM, Dmitry Baryshkov wrote:

On 25/01/2024 21:38, Paloma Arellano wrote:

Generalize dpu_encoder_helper_phys_setup_cdm to be compatible with DP.

Signed-off-by: Paloma Arellano 
---
.../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h  |  4 +--
.../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c   | 31 ++-
2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
index 993f263433314..37ac385727c3b 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
@@ -153,6 +153,7 @@ enum dpu_intr_idx {
 * @hw_intf:Hardware interface to the intf registers
 * @hw_wb:Hardware interface to the wb registers
 * @hw_cdm:Hardware interface to the CDM registers
+ * @cdm_cfg:CDM block config needed to store WB/DP block's CDM
configuration


Please realign the description.

Ack



 * @dpu_kms:Pointer to the dpu_kms top level
 * @cached_mode:DRM mode cached at mode_set time, acted on in
enable
 * @vblank_ctl_lock:Vblank ctl mutex lock to protect
vblank_refcount
@@ -183,6 +184,7 @@ struct dpu_encoder_phys {
struct dpu_hw_intf *hw_intf;
struct dpu_hw_wb *hw_wb;
struct dpu_hw_cdm *hw_cdm;
+struct dpu_hw_cdm_cfg cdm_cfg;


It might be slightly better to move it after all the pointers, so
after the dpu_kms.

Ack



struct dpu_kms *dpu_kms;
struct drm_display_mode cached_mode;
struct mutex vblank_ctl_lock;
@@ -213,7 +215,6 @@ static inline int
dpu_encoder_phys_inc_pending(struct dpu_encoder_phys *phys)
 * @wbirq_refcount: Reference count of writeback interrupt
 * @wb_done_timeout_cnt: number of wb done irq timeout errors
 * @wb_cfg:  writeback block config to store fb related details
- * @cdm_cfg: cdm block config needed to store writeback block's CDM
configuration
 * @wb_conn: backpointer to writeback connector
 * @wb_job: backpointer to current writeback job
 * @dest:   dpu buffer layout for current writeback output buffer
@@ -223,7 +224,6 @@ struct dpu_encoder_phys_wb {
atomic_t wbirq_refcount;
int wb_done_timeout_cnt;
struct dpu_hw_wb_cfg wb_cfg;
-struct dpu_hw_cdm_cfg cdm_cfg;
struct drm_writeback_connector *wb_conn;
struct drm_writeback_job *wb_job;
struct dpu_hw_fmt_layout dest;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
index 4cd2d9e3131a4..072fc6950e496 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
@@ -269,28 +269,21 @@ static void
dpu_encoder_phys_wb_setup_ctl(struct dpu_encoder_phys *phys_enc)
 * This API does not handle
DPU_CHROMA_H1V2.
 * @phys_enc:Pointer to physical encoder
 */
-static void dpu_encoder_helper_phys_setup_cdm(struct
dpu_encoder_phys *phys_enc)
+static void dpu_encoder_helper_phys_setup_cdm(struct
dpu_encoder_phys *phys_enc,
+  const struct dpu_format *dpu_fmt,
+  u32 output_type)
{
struct dpu_hw_cdm *hw_cdm;
struct dpu_hw_cdm_cfg *cdm_cfg;
struct dpu_hw_pingpong *hw_pp;
-struct dpu_encoder_phys_wb *wb_enc;
-const struct msm_format *format;
-const struct dpu_format *dpu_fmt;
-struct drm_writeback_job *wb_job;
int ret;
  if (!phys_enc)
return;
-wb_enc = to_dpu_encoder_phys_wb(phys_enc);
-cdm_cfg = &wb_enc->cdm_cfg;
+cdm_cfg = &phys_enc->cdm_cfg;
hw_pp = phys_enc->hw_pp;
hw_cdm = phys_enc->hw_cdm;
-wb_job = wb_enc->wb_job;
-
-format = msm_framebuffer_format(wb_enc->wb_job->fb);
-dpu_fmt = dpu_get_dpu_format_ext(format->pixel_format,
wb_job->fb->modifier);
  if (!hw_cdm)
return;
@@ -306,10 +299,10 @@ static void
dpu_encoder_helper_phys_setup_cdm(struct dpu_encoder_phys *phys_enc)
  memset(cdm_cfg, 0, sizeof(struct dpu_hw_cdm_cfg));
-cdm_cfg->output_width = wb_job->fb->width;
-cdm_cfg->output_height = wb_job->fb->height;
+cdm_cfg->output_width = phys_enc->cached_mode.hdisplay;
+cdm_cfg->output_height = phys_enc->cached_mode.vdisplay;


This is a semantic change. Instead of passing the FB size, this passes
the mode dimensions. They are not guaranteed to be the same,
especially for the WB case.



The WB job is storing the output FB of WB. I cannot think of a use-case
where this cannot match the current mode programmed to the WB encoder.

Yes, if it was the drm_plane's 

Re: [PATCH 17/17] drm/msm/dp: allow YUV420 mode for DP connector when VSC SDP supported

2024-01-28 Thread Abhinav Kumar




On 1/28/2024 7:52 PM, Dmitry Baryshkov wrote:

On Mon, 29 Jan 2024 at 05:17, Abhinav Kumar  wrote:




On 1/25/2024 2:05 PM, Dmitry Baryshkov wrote:

On 25/01/2024 21:38, Paloma Arellano wrote:

All the components of YUV420 over DP are added. Therefore, let's mark the
connector property as true for DP connector when the DP type is not eDP
and when VSC SDP is supported.

Signed-off-by: Paloma Arellano 
---
   drivers/gpu/drm/msm/dp/dp_display.c | 5 -
   1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c
b/drivers/gpu/drm/msm/dp/dp_display.c
index 4329435518351..97edd607400b8 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -370,11 +370,14 @@ static int dp_display_process_hpd_high(struct
dp_display_private *dp)
   dp_link_process_request(dp->link);
-if (!dp->dp_display.is_edp)
+if (!dp->dp_display.is_edp) {
+if (dp_panel_vsc_sdp_supported(dp->panel))
+dp->dp_display.connector->ycbcr_420_allowed = true;


Please consider fixing a TODO in drm_bridge_connector_init().



I am not totally clear if that TODO can ever go for DP/HDMI usage of
drm_bridge_connector.

We do not know if the sink supports VSC SDP till we read the DPCD and
till we know that sink supports VSC SDP, there is no reason to mark the
YUV modes as supported. This is the same logic followed across vendors.

drm_bridge_connector_init() happens much earlier than the point where we
read DPCD. The only thing which can be done is perhaps add some callback
to update_ycbcr_420_allowed once DPCD is read. But I don't think it's
absolutely necessary to have a callback just for this.


After checking the drm_connector docs, I'd still hold my opinion and
consider this patch to be a misuse of the property. If you check the
drm_connector::ycbcr_420_allowed docs, you'll see that it describes
the output from the source point of view. In other words, it should be
true if the DP connector can send YUV420 rather than being set if the
attached display supports such output. This matches ycbcr420_allowed
usage by AMD, dw-hdmi, intel_hdmi and even intel_dp usage.



hmmm I think I misread intel_dp_update_420(). I saw this is called after 
HPD so I thought they unset ycbcr_420_allowed if VSC SDP is not 
supported. But they have other DPCD checking there so anyway they will 
fail this bridge_connector_init() model.


But one argument which I can give in my defense is, let's say the sink
exposed YUV formats but did not support SDP, then atomic_check() will 
keep failing or should keep failing. This will avoid this scenario. But 
we can assume that would be a rogue sink.


I think we can pass a yuv_supported flag to msm_dp_modeset_init() and 
set it to true from dpu_kms if catalog has CDM block and get rid of the 
dp_panel_vsc_sdp_supported().


But that doesn't address the TODO you have pointed to. What is really the
expectation of the TODO? Do we need to pass a ycbcr_420_allowed flag to
drm_bridge_connector_init()?

That would need a tree wide cleanup and that's difficult to sign up for
in this series and I would not as well.


One thing which I can suggest to be less intrusive is have a new API 
called drm_bridge_connector_init_with_YUV() which looks something like 
below:


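struct drm_connector *drm_bridge_connector_init_with_ycbcr_420(struct
drm_device *drm, struct drm_encoder *encoder)
{
	struct drm_connector *connector;

	/* Reuse the existing helper, then mark YCbCr 4:2:0 as allowed. */
	connector = drm_bridge_connector_init(drm, encoder);
	if (IS_ERR(connector))
		return connector;

	connector->ycbcr_420_allowed = true;
	return connector;
}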

But I don't know if the community would be interested in this idea or 
would find that useful.



   drm_dp_set_subconnector_property(dp->dp_display.connector,
connector_status_connected,
dp->panel->dpcd,
dp->panel->downstream_ports);
+}
   edid = dp->panel->edid;








Re: [PATCH v11 14/26] locking/lockdep, cpu/hotplus: Use a weaker annotation in AP thread

2024-01-28 Thread Byungchul Park
On Fri, Jan 26, 2024 at 06:30:02PM +0100, Thomas Gleixner wrote:
> On Wed, Jan 24 2024 at 20:59, Byungchul Park wrote:
> 
> Why is lockdep in the subsystem prefix here? You are changing the CPU
> hotplug (not hotplus) code, right?

I will fix the typo ;( Thank you.

I referred to the commit cb92173d1f047. I will remove the prefix if that
way is more desirable.

> > cb92173d1f0 ("locking/lockdep, cpu/hotplug: Annotate AP thread") was
> > introduced to make lockdep_assert_cpus_held() work in AP thread.
> >
> > However, the annotation is too strong for that purpose. We don't have to
> > use more than try lock annotation for that.
> 
> This lacks a proper explanation why this is too strong.

rwsem_acquire() implies:

   1. The caller might be a waiter on contention of the lock.
   2. The caller enters the critical section of the lock.

All we need here is 2, not 1. That's why I suggested the trylock
version of the annotation for that purpose.

Now that dept partially relies on lockdep annotations for the waiters
and events, dept is interpreting rwsem_acquire() as a potential waiter
and reports a deadlock by the wait.
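
To make the distinction concrete, here is a small user-space sketch. It
is not lockdep or dept code: struct toy_tracker and the toy_*() helpers
are hypothetical names invented for illustration, and the only thing
taken from the kernel is the idea that the annotation carries a trylock
argument. The sketch assumes a tracker that records a potential wait
only for a non-trylock acquisition, while the "held" state is the same
in both cases.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy tracker; it only mirrors the two effects listed above. */
struct toy_tracker {
	int waits_recorded;	/* effect 1: caller might wait on contention */
	int held_depth;		/* effect 2: caller is inside the critical section */
};

/* trylock != 0 means the caller never blocks, so no wait is recorded. */
static void toy_acquire(struct toy_tracker *t, int trylock)
{
	assert(t);
	if (!trylock)
		t->waits_recorded++;
	t->held_depth++;
}

static void toy_release(struct toy_tracker *t)
{
	assert(t && t->held_depth > 0);
	t->held_depth--;
}

/* What a "lock is held" assertion would look at: only the held state. */
static bool toy_is_held(const struct toy_tracker *t)
{
	assert(t);
	return t->held_depth > 0;
}
```

With this model, a "held" check passes after either kind of acquisition,
but only the non-trylock one leaves a wait behind for a wait-based
checker to reason about.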

Of course, the first priority should be not to change the current
behavior. I think the change from non-trylock to trylock for the
annotation won't. Or am I missing something?

Byungchul

> > Furthermore, now that Dept was introduced, false positive alarms was
> > reported by that. Replaced it with try lock annotation.
> 
> I still have zero idea what this is about.
> 
> Thanks,
> 
> tglx


Re: [PATCH 01/17] drm/msm/dpu: allow dpu_encoder_helper_phys_setup_cdm to work for DP

2024-01-28 Thread Dmitry Baryshkov
On Mon, 29 Jan 2024 at 06:01, Abhinav Kumar  wrote:
>
>
>
> On 1/28/2024 7:23 PM, Dmitry Baryshkov wrote:
> > On Mon, 29 Jan 2024 at 05:06, Abhinav Kumar  
> > wrote:
> >>
> >>
> >>
> >> On 1/26/2024 4:39 PM, Paloma Arellano wrote:
> >>>
> >>> On 1/25/2024 1:14 PM, Dmitry Baryshkov wrote:
>  On 25/01/2024 21:38, Paloma Arellano wrote:
> > Generalize dpu_encoder_helper_phys_setup_cdm to be compatible with DP.
> >
> > Signed-off-by: Paloma Arellano 
> > ---
> >.../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h  |  4 +--
> >.../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c   | 31 
> > ++-
> >2 files changed, 18 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> > b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> > index 993f263433314..37ac385727c3b 100644
> > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> > @@ -153,6 +153,7 @@ enum dpu_intr_idx {
> > * @hw_intf:Hardware interface to the intf registers
> > * @hw_wb:Hardware interface to the wb registers
> > * @hw_cdm:Hardware interface to the CDM registers
> > + * @cdm_cfg:CDM block config needed to store WB/DP block's CDM
> > configuration
> 
>  Please realign the description.
> >>> Ack
> 
> > * @dpu_kms:Pointer to the dpu_kms top level
> > * @cached_mode:DRM mode cached at mode_set time, acted on in
> > enable
> > * @vblank_ctl_lock:Vblank ctl mutex lock to protect
> > vblank_refcount
> > @@ -183,6 +184,7 @@ struct dpu_encoder_phys {
> >struct dpu_hw_intf *hw_intf;
> >struct dpu_hw_wb *hw_wb;
> >struct dpu_hw_cdm *hw_cdm;
> > +struct dpu_hw_cdm_cfg cdm_cfg;
> 
>  It might be slightly better to move it after all the pointers, so
>  after the dpu_kms.
> >>> Ack
> 
> >struct dpu_kms *dpu_kms;
> >struct drm_display_mode cached_mode;
> >struct mutex vblank_ctl_lock;
> > @@ -213,7 +215,6 @@ static inline int
> > dpu_encoder_phys_inc_pending(struct dpu_encoder_phys *phys)
> > * @wbirq_refcount: Reference count of writeback interrupt
> > * @wb_done_timeout_cnt: number of wb done irq timeout errors
> > * @wb_cfg:  writeback block config to store fb related details
> > - * @cdm_cfg: cdm block config needed to store writeback block's CDM
> > configuration
> > * @wb_conn: backpointer to writeback connector
> > * @wb_job: backpointer to current writeback job
> > * @dest:   dpu buffer layout for current writeback output buffer
> > @@ -223,7 +224,6 @@ struct dpu_encoder_phys_wb {
> >atomic_t wbirq_refcount;
> >int wb_done_timeout_cnt;
> >struct dpu_hw_wb_cfg wb_cfg;
> > -struct dpu_hw_cdm_cfg cdm_cfg;
> >struct drm_writeback_connector *wb_conn;
> >struct drm_writeback_job *wb_job;
> >struct dpu_hw_fmt_layout dest;
> > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> > b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> > index 4cd2d9e3131a4..072fc6950e496 100644
> > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> > @@ -269,28 +269,21 @@ static void
> > dpu_encoder_phys_wb_setup_ctl(struct dpu_encoder_phys *phys_enc)
> > * This API does not handle
> > DPU_CHROMA_H1V2.
> > * @phys_enc:Pointer to physical encoder
> > */
> > -static void dpu_encoder_helper_phys_setup_cdm(struct
> > dpu_encoder_phys *phys_enc)
> > +static void dpu_encoder_helper_phys_setup_cdm(struct
> > dpu_encoder_phys *phys_enc,
> > +  const struct dpu_format *dpu_fmt,
> > +  u32 output_type)
> >{
> >struct dpu_hw_cdm *hw_cdm;
> >struct dpu_hw_cdm_cfg *cdm_cfg;
> >struct dpu_hw_pingpong *hw_pp;
> > -struct dpu_encoder_phys_wb *wb_enc;
> > -const struct msm_format *format;
> > -const struct dpu_format *dpu_fmt;
> > -struct drm_writeback_job *wb_job;
> >int ret;
> >  if (!phys_enc)
> >return;
> >-wb_enc = to_dpu_encoder_phys_wb(phys_enc);
> > -cdm_cfg = &wb_enc->cdm_cfg;
> > +cdm_cfg = &phys_enc->cdm_cfg;
> >hw_pp = phys_enc->hw_pp;
> >hw_cdm = phys_enc->hw_cdm;
> > -wb_job = wb_enc->wb_job;
> > -
> > -format = msm_framebuffer_format(wb_enc->wb_job->fb);
> > -dpu_fmt = dpu_get_dpu_format_ext(format->pixel_format,
> > wb_job->fb->modifier);
> >  if (!hw_cdm)
> >return;
> 

Re: [PATCH] drm/sched: Drain all entities in DRM sched run job worker

2024-01-28 Thread Luben Tuikov
On 2024-01-24 16:08, Matthew Brost wrote:
> All entities must be drained in the DRM scheduler run job worker to
> avoid the following case. An entity found that is ready, no job found
> ready on entity, and run job worker goes idle with other entities + jobs
> ready. Draining all ready entities (i.e. loop over all ready entities)
> in the run job worker ensures all job that are ready will be scheduled.
> 
> Cc: Thorsten Leemhuis 
> Reported-by: Mikhail Gavrilov 
> Closes: 
> https://lore.kernel.org/all/CABXGCsM2VLs489CH-vF-1539-s3in37=bwuowtoeee+q26z...@mail.gmail.com/
> Reported-and-tested-by: Mario Limonciello 
> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3124
> Link: 
> https://lore.kernel.org/all/20240123021155.2775-1-mario.limoncie...@amd.com/
> Reported-by: Vlastimil Babka 
> Closes: 
> https://lore.kernel.org/dri-devel/05ddb2da-b182-4791-8ef7-82179fd15...@amd.com/T/#m0c31d4d1b9ae9995bb880974c4f1dbaddc33a48a
> Signed-off-by: Matthew Brost 

Hi Matthew,

Thanks for working on this and sending the patch.

Could we add a Fixes tag to the tag list:

Fixes: f7fe64ad0f22 ("drm/sched: Split free_job into own work item")

This really drives the point home, as shown here:
https://gitlab.freedesktop.org/drm/amd/-/issues/3124
which is mentioned in a Closes tag--thanks!
-- 
Regards,
Luben

> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 550492a7a031..85f082396d42 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -1178,21 +1178,20 @@ static void drm_sched_run_job_work(struct work_struct 
> *w)
>   struct drm_sched_entity *entity;
>   struct dma_fence *fence;
>   struct drm_sched_fence *s_fence;
> - struct drm_sched_job *sched_job;
> + struct drm_sched_job *sched_job = NULL;
>   int r;
>  
>   if (READ_ONCE(sched->pause_submit))
>   return;
>  
> - entity = drm_sched_select_entity(sched);
> + /* Find entity with a ready job */
> + while (!sched_job && (entity = drm_sched_select_entity(sched))) {
> + sched_job = drm_sched_entity_pop_job(entity);
> + if (!sched_job)
> + complete_all(&entity->entity_idle);
> + }
>   if (!entity)
> - return;
> -
> - sched_job = drm_sched_entity_pop_job(entity);
> - if (!sched_job) {
> - complete_all(&entity->entity_idle);
>   return; /* No more work */
> - }
>  
>   s_fence = sched_job->s_fence;
>  
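
For readers following along, the behavioural difference in the hunk
above can be sketched in user space. This is a toy model, not the
scheduler: struct toy_entity, struct toy_sched and the toy_*() helpers
are hypothetical, the "visited" flag is a stand-in for however the real
selector advances between entities, and jobs are popped from the end of
a small array purely for brevity.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_JOBS 4

/* Hypothetical toy types; the real scheduler structures are far richer. */
struct toy_entity {
	int jobs[TOY_MAX_JOBS];
	size_t njobs;		/* queued jobs */
	int job_ready;		/* nonzero if a queued job can run now */
	int visited;		/* set once the selector has handed this entity out */
};

struct toy_sched {
	struct toy_entity *entities;
	size_t nentities;
};

/* Hand out the next not-yet-visited entity that has queued jobs. */
static struct toy_entity *toy_select_entity(struct toy_sched *s)
{
	assert(s);
	for (size_t i = 0; i < s->nentities; i++) {
		struct toy_entity *e = &s->entities[i];

		if (!e->visited && e->njobs > 0) {
			e->visited = 1;
			return e;
		}
	}
	return NULL;
}

/* Return a job id (> 0), or 0 if the entity has nothing runnable now. */
static int toy_pop_job(struct toy_entity *e)
{
	assert(e);
	if (e->njobs == 0 || !e->job_ready)
		return 0;
	return e->jobs[--e->njobs];
}

/* Old shape: look at one entity only and give up if it has no job. */
static int toy_run_once_single(struct toy_sched *s)
{
	struct toy_entity *e = toy_select_entity(s);

	return e ? toy_pop_job(e) : 0;
}

/* New shape: keep selecting until a job is found or no entity is left. */
static int toy_run_once_drain(struct toy_sched *s)
{
	struct toy_entity *e;
	int job = 0;

	while (!job && (e = toy_select_entity(s)))
		job = toy_pop_job(e);
	return job;
}
```

In this model, if the first selected entity has no runnable job, the
single-shot variant returns with nothing to do even though a second
entity has a ready job, while the draining variant moves on and finds it.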



Re: [PATCH 01/17] drm/msm/dpu: allow dpu_encoder_helper_phys_setup_cdm to work for DP

2024-01-28 Thread Abhinav Kumar




On 1/28/2024 7:23 PM, Dmitry Baryshkov wrote:

On Mon, 29 Jan 2024 at 05:06, Abhinav Kumar  wrote:




On 1/26/2024 4:39 PM, Paloma Arellano wrote:


On 1/25/2024 1:14 PM, Dmitry Baryshkov wrote:

On 25/01/2024 21:38, Paloma Arellano wrote:

Generalize dpu_encoder_helper_phys_setup_cdm to be compatible with DP.

Signed-off-by: Paloma Arellano 
---
   .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h  |  4 +--
   .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c   | 31 ++-
   2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
index 993f263433314..37ac385727c3b 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
@@ -153,6 +153,7 @@ enum dpu_intr_idx {
* @hw_intf:Hardware interface to the intf registers
* @hw_wb:Hardware interface to the wb registers
* @hw_cdm:Hardware interface to the CDM registers
+ * @cdm_cfg:CDM block config needed to store WB/DP block's CDM
configuration


Please realign the description.

Ack



* @dpu_kms:Pointer to the dpu_kms top level
* @cached_mode:DRM mode cached at mode_set time, acted on in
enable
* @vblank_ctl_lock:Vblank ctl mutex lock to protect
vblank_refcount
@@ -183,6 +184,7 @@ struct dpu_encoder_phys {
   struct dpu_hw_intf *hw_intf;
   struct dpu_hw_wb *hw_wb;
   struct dpu_hw_cdm *hw_cdm;
+struct dpu_hw_cdm_cfg cdm_cfg;


It might be slightly better to move it after all the pointers, so
after the dpu_kms.

Ack



   struct dpu_kms *dpu_kms;
   struct drm_display_mode cached_mode;
   struct mutex vblank_ctl_lock;
@@ -213,7 +215,6 @@ static inline int
dpu_encoder_phys_inc_pending(struct dpu_encoder_phys *phys)
* @wbirq_refcount: Reference count of writeback interrupt
* @wb_done_timeout_cnt: number of wb done irq timeout errors
* @wb_cfg:  writeback block config to store fb related details
- * @cdm_cfg: cdm block config needed to store writeback block's CDM
configuration
* @wb_conn: backpointer to writeback connector
* @wb_job: backpointer to current writeback job
* @dest:   dpu buffer layout for current writeback output buffer
@@ -223,7 +224,6 @@ struct dpu_encoder_phys_wb {
   atomic_t wbirq_refcount;
   int wb_done_timeout_cnt;
   struct dpu_hw_wb_cfg wb_cfg;
-struct dpu_hw_cdm_cfg cdm_cfg;
   struct drm_writeback_connector *wb_conn;
   struct drm_writeback_job *wb_job;
   struct dpu_hw_fmt_layout dest;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
index 4cd2d9e3131a4..072fc6950e496 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
@@ -269,28 +269,21 @@ static void
dpu_encoder_phys_wb_setup_ctl(struct dpu_encoder_phys *phys_enc)
* This API does not handle
DPU_CHROMA_H1V2.
* @phys_enc:Pointer to physical encoder
*/
-static void dpu_encoder_helper_phys_setup_cdm(struct
dpu_encoder_phys *phys_enc)
+static void dpu_encoder_helper_phys_setup_cdm(struct
dpu_encoder_phys *phys_enc,
+  const struct dpu_format *dpu_fmt,
+  u32 output_type)
   {
   struct dpu_hw_cdm *hw_cdm;
   struct dpu_hw_cdm_cfg *cdm_cfg;
   struct dpu_hw_pingpong *hw_pp;
-struct dpu_encoder_phys_wb *wb_enc;
-const struct msm_format *format;
-const struct dpu_format *dpu_fmt;
-struct drm_writeback_job *wb_job;
   int ret;
 if (!phys_enc)
   return;
   -wb_enc = to_dpu_encoder_phys_wb(phys_enc);
-cdm_cfg = &wb_enc->cdm_cfg;
+cdm_cfg = &phys_enc->cdm_cfg;
   hw_pp = phys_enc->hw_pp;
   hw_cdm = phys_enc->hw_cdm;
-wb_job = wb_enc->wb_job;
-
-format = msm_framebuffer_format(wb_enc->wb_job->fb);
-dpu_fmt = dpu_get_dpu_format_ext(format->pixel_format,
wb_job->fb->modifier);
 if (!hw_cdm)
   return;
@@ -306,10 +299,10 @@ static void
dpu_encoder_helper_phys_setup_cdm(struct dpu_encoder_phys *phys_enc)
 memset(cdm_cfg, 0, sizeof(struct dpu_hw_cdm_cfg));
   -cdm_cfg->output_width = wb_job->fb->width;
-cdm_cfg->output_height = wb_job->fb->height;
+cdm_cfg->output_width = phys_enc->cached_mode.hdisplay;
+cdm_cfg->output_height = phys_enc->cached_mode.vdisplay;


This is a semantic change. Instead of passing the FB size, this passes
the mode dimensions. They are not guaranteed to be the same,
especially for the WB case.



The WB job is storing the output FB of WB. I cannot think of a use-case
where this cannot match the current mode programmed to the WB encoder.

Yes, if it was the drm_plane's FB, then it cannot be guaranteed as the
plane can scale the contents but here thats not the case. Here its the
output FB of WB.


Is it a part of WB 

Re: [PATCH 17/17] drm/msm/dp: allow YUV420 mode for DP connector when VSC SDP supported

2024-01-28 Thread Dmitry Baryshkov
On Mon, 29 Jan 2024 at 05:17, Abhinav Kumar  wrote:
>
>
>
> On 1/25/2024 2:05 PM, Dmitry Baryshkov wrote:
> > On 25/01/2024 21:38, Paloma Arellano wrote:
> >> All the components of YUV420 over DP are added. Therefore, let's mark the
> >> connector property as true for DP connector when the DP type is not eDP
> >> and when VSC SDP is supported.
> >>
> >> Signed-off-by: Paloma Arellano 
> >> ---
> >>   drivers/gpu/drm/msm/dp/dp_display.c | 5 -
> >>   1 file changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/dp/dp_display.c
> >> b/drivers/gpu/drm/msm/dp/dp_display.c
> >> index 4329435518351..97edd607400b8 100644
> >> --- a/drivers/gpu/drm/msm/dp/dp_display.c
> >> +++ b/drivers/gpu/drm/msm/dp/dp_display.c
> >> @@ -370,11 +370,14 @@ static int dp_display_process_hpd_high(struct
> >> dp_display_private *dp)
> >>   dp_link_process_request(dp->link);
> >> -if (!dp->dp_display.is_edp)
> >> +if (!dp->dp_display.is_edp) {
> >> +if (dp_panel_vsc_sdp_supported(dp->panel))
> >> +dp->dp_display.connector->ycbcr_420_allowed = true;
> >
> > Please consider fixing a TODO in drm_bridge_connector_init().
> >
>
> I am not totally clear if that TODO can ever go for DP/HDMI usage of
> drm_bridge_connector.
>
> We do not know if the sink supports VSC SDP till we read the DPCD and
> till we know that sink supports VSC SDP, there is no reason to mark the
> YUV modes as supported. This is the same logic followed across vendors.
>
> drm_bridge_connector_init() happens much earlier than the point where we
> read DPCD. The only thing which can be done is perhaps add some callback
> to update_ycbcr_420_allowed once DPCD is read. But I don't think it's
> absolutely necessary to have a callback just for this.

After checking the drm_connector docs, I'd still hold my opinion and
consider this patch to be a misuse of the property. If you check the
drm_connector::ycbcr_420_allowed docs, you'll see that it describes
the output from the source point of view. In other words, it should be
true if the DP connector can send YUV420 rather than being set if the
attached display supports such output. This matches ycbcr420_allowed
usage by AMD, dw-hdmi, intel_hdmi and even intel_dp usage.
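
A rough user-space sketch of that split, under stated assumptions: the
model_* types and helpers below are hypothetical and are not the DRM
structures or helpers. The idea is only that the connector flag is set
once from what the source can do (here, a made-up has_cdm field), and
that the per-sink facts learned after hotplug (EDID, DPCD) are combined
with it later, at check time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not the real DRM structures. */
struct model_source {
	bool has_cdm;			/* source can produce YUV 4:2:0 data */
};

struct model_sink {
	bool advertises_yuv420;		/* e.g. from EDID */
	bool supports_vsc_sdp;		/* e.g. from DPCD, known only after hotplug */
};

struct model_connector {
	bool ycbcr_420_allowed;		/* describes the source side only */
};

/* Init time: no sink is attached yet, so only source facts are available. */
static void model_connector_init(struct model_connector *conn,
				 const struct model_source *src)
{
	assert(conn && src);
	conn->ycbcr_420_allowed = src->has_cdm;
}

/* Check time: combine the static source flag with the current sink. */
static bool model_can_use_yuv420(const struct model_connector *conn,
				 const struct model_sink *sink)
{
	assert(conn && sink);
	return conn->ycbcr_420_allowed &&
	       sink->advertises_yuv420 &&
	       sink->supports_vsc_sdp;
}
```

In this model, a sink that advertises YUV420 but lacks VSC SDP support
is rejected by the check-time helper, without the connector flag ever
being toggled on hotplug.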

> >>   drm_dp_set_subconnector_property(dp->dp_display.connector,
> >>connector_status_connected,
> >>dp->panel->dpcd,
> >>dp->panel->downstream_ports);
> >> +}
> >>   edid = dp->panel->edid;
> >



-- 
With best wishes
Dmitry


Re: [PATCH 14/17] drm/msm/dpu: modify encoder programming for CDM over DP

2024-01-28 Thread Dmitry Baryshkov
On Mon, 29 Jan 2024 at 04:58, Abhinav Kumar  wrote:
>
>
>
> On 1/27/2024 9:55 PM, Dmitry Baryshkov wrote:
> > On Sun, 28 Jan 2024 at 07:48, Paloma Arellano  
> > wrote:
> >>
> >>
> >> On 1/25/2024 1:57 PM, Dmitry Baryshkov wrote:
> >>> On 25/01/2024 21:38, Paloma Arellano wrote:
>  Adjust the encoder format programming in the case of video mode for DP
>  to accommodate CDM related changes.
> 
>  Signed-off-by: Paloma Arellano 
>  ---
> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c   | 16 +
> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h   |  8 +
> .../drm/msm/disp/dpu1/dpu_encoder_phys_vid.c  | 35 ---
> drivers/gpu/drm/msm/dp/dp_display.c   | 12 +++
> drivers/gpu/drm/msm/msm_drv.h |  9 -
> 5 files changed, 75 insertions(+), 5 deletions(-)
> 
>  diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>  b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>  index b0896814c1562..99ec53446ad21 100644
>  --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>  +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
>  @@ -222,6 +222,22 @@ static u32 dither_matrix[DITHER_MATRIX_SZ] = {
> 15, 7, 13, 5, 3, 11, 1, 9, 12, 4, 14, 6, 0, 8, 2, 10
> };
> +u32 dpu_encoder_get_drm_fmt(const struct drm_encoder *drm_enc,
>  const struct drm_display_mode *mode)
>  +{
>  +const struct dpu_encoder_virt *dpu_enc;
>  +const struct msm_display_info *disp_info;
>  +struct msm_drm_private *priv;
>  +
>  +dpu_enc = to_dpu_encoder_virt(drm_enc);
>  +disp_info = &dpu_enc->disp_info;
>  +priv = drm_enc->dev->dev_private;
>  +
>  +if (disp_info->intf_type == INTF_DP &&
>  + msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]],
>  mode))
> >>>
> >>> This should not require interacting with DP. If we got here, we must
> >>> be sure that 4:2:0 is supported and can be configured.
> >> Ack. Will drop this function and only check for if the mode is YUV420.
> >>>
>  +return DRM_FORMAT_YUV420;
>  +
>  +return DRM_FORMAT_RGB888;
>  +}
>   bool dpu_encoder_is_widebus_enabled(const struct drm_encoder
>  *drm_enc)
> {
>  diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
>  b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
>  index 7b4afa71f1f96..62255d0aa4487 100644
>  --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
>  +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
>  @@ -162,6 +162,14 @@ int dpu_encoder_get_vsync_count(struct
>  drm_encoder *drm_enc);
>  */
> bool dpu_encoder_is_widebus_enabled(const struct drm_encoder
>  *drm_enc);
> +/**
>  + * dpu_encoder_get_drm_fmt - return DRM fourcc format
>  + * @drm_enc:Pointer to previously created drm encoder structure
>  + * @mode:Corresponding drm_display_mode for dpu encoder
>  + */
>  +u32 dpu_encoder_get_drm_fmt(const struct drm_encoder *drm_enc,
>  +const struct drm_display_mode *mode);
>  +
> /**
>  * dpu_encoder_get_crc_values_cnt - get number of physical encoders
>  contained
>  *in virtual encoder that can collect CRC values
>  diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
>  b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
>  index e284bf448bdda..a1dde0ff35dc8 100644
>  --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
>  +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
>  @@ -234,6 +234,7 @@ static void
>  dpu_encoder_phys_vid_setup_timing_engine(
> {
> struct drm_display_mode mode;
> struct dpu_hw_intf_timing_params timing_params = { 0 };
>  +struct dpu_hw_cdm *hw_cdm;
> const struct dpu_format *fmt = NULL;
> u32 fmt_fourcc = DRM_FORMAT_RGB888;
> unsigned long lock_flags;
>  @@ -254,17 +255,26 @@ static void
>  dpu_encoder_phys_vid_setup_timing_engine(
> DPU_DEBUG_VIDENC(phys_enc, "enabling mode:\n");
> drm_mode_debug_printmodeline(&mode);
> -if (phys_enc->split_role != ENC_ROLE_SOLO) {
>  +hw_cdm = phys_enc->hw_cdm;
>  +if (hw_cdm) {
>  +intf_cfg.cdm = hw_cdm->idx;
>  +fmt_fourcc = dpu_encoder_get_drm_fmt(phys_enc->parent, &mode);
>  +}
>  +
>  +if (phys_enc->split_role != ENC_ROLE_SOLO ||
>  +dpu_encoder_get_drm_fmt(phys_enc->parent, &mode) ==
>  DRM_FORMAT_YUV420) {
> mode.hdisplay >>= 1;
> mode.htotal >>= 1;
> mode.hsync_start >>= 1;
> mode.hsync_end >>= 1;
>  +mode.hskew >>= 1;
> >>>
> >>> Separate patch.
> >> Ack.
> >>>
>   DPU_DEBUG_VIDENC(phys_enc,
>  -"split_role %d, halve horizontal 

Re: [PATCH v3 3/3] dt-bindings: mfd: atmel,hlcdc: Convert to DT schema format

2024-01-28 Thread Dharma.B
Hi Conor,

On 26/01/24 9:03 pm, Conor Dooley wrote:
> On Fri, Jan 26, 2024 at 02:22:42PM +,dharm...@microchip.com  wrote:
>> On 25/01/24 1:57 pm, Conor Dooley - M52691 wrote:
> If the lvds pll is an input to the hlcdc, you need to add it here.
> From your description earlier it does sound like it is an input to
> the hlcdc, but now you are claiming that it is not.
 The LVDS PLL serves as an input to both the LCDC and LVDSC
>>> Then it should be an input to both the LCDC and LVDSC in the devicetree.
>> For the LVDSC to operate, the presence of the LVDS PLL is crucial. However, 
>> in the case of the LCDC, LVDS PLL is not essential for its operation unless 
>> LVDS interface is used and when it is used lvds driver will take care of 
>> preparing and enabling the LVDS PLL.
> Please fix your line wrapping, not sure what's going on here, but these
> lines are super long.
> 
>> Consequently, it seems that there might not be any significant actions we 
>> can take within the LCD driver regarding the LVDS PLL.
> You should be getting a reference to the clock and calling enable on it
> etc, even if the LVDSC is also doing so. That will allow the clock
> framework to correctly track users.
> 
>> If there are no intentions to utilize it within the driver, is it necessary 
>> to explicitly designate it as an input in the device tree?
> The binding describes the hardware, so yes it should be there. What the
> driver implementation does with the clock is not relevant. That said, I
> think the driver should actually be using it, as I wrote above.
> 
>> If yes, I will update the bindings with optional LVDS PLL clock.
>>
>> clock-names:
>>items:
>>  - const: periph_clk
>>  - const: sys_clk
>>  - const: slow_clk
>>  - const: lvds_pll  # Optional clock
> This looks correct, but the comment is not needed. Setting minItems: 3
> does this for you.
Sure, thanks.
> 
 with the
 LVDS_PLL multiplied by 7 for the Pixel clock to the LVDS PHY, and
>>> Are you sure? The diagram doesn't show a multiplier, the 7x comment
>>> there seems to be showing relations?
>> Sorry,
>> LVDS PLL = (PCK * 7) goes to LVDSC PHY
>> PCK = (LVDS PLL / 7) goes to LCDC
> I'll take your word for it 
> 
 LVDS_PLL divided by 7 for the Pixel clock to the LCDC.
 I am inclined to believe that appropriately configuring and enabling it
 in the LVDS driver would be the appropriate course of action.
>>> We're talking about bindings here, not drivers, but I would imagine that
>>> if two peripherals are using the same clock then both of them should be
>>> getting a reference to and enabling that clock so that the clock
>>> framework can correctly track the users.
>>>
> I don't know your hardware, so I have no idea which of the two is
> correct, but it sounds like the former. Without digging into how this
> works my assumption about the hardware here looks like is that the lvds
> controller is a clock provider,
 It's a PLL clock from PMC.

> and that the lvds controller's clock is
> an optional input for the hlcdc.
 Again it's a PLL clock from PMC.

 Please refer to Section 39.3
 https://ww1.microchip.com/downloads/aemDocuments/documents/MPU32/ProductDocuments/DataSheets/SAM9X7-Series-Data-Sheet-DS60001813.pdf
>>> It is not the same exact clock as you pointed out above though, so the
>>> by 7 divider should be modelled.
>> Modelled in mfd binding? If possible, could you please provide an example 
>> for better clarity? Thank you.
> Whatever node corresponds to the register range controlling this PLL
> should be a "clock-controller" (like any other clock provider does).
> Your PMC should have this property. I don't know if the correct location
> is the mfd node or somewhere else, you'll have to check your docs.
Sure, noted. I'll do that in a separate patch.
---
I will proceed with updating the clock names to include "lvds_pll" and 
adjusting the clocks' minItems to 3. Does this seem appropriate to you?

Please let me know if there are any additional considerations or 
specific aspects that require attention.

-- 
With Best Regards,
Dharma B.
> 
> Thanks,
> Conor.
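
The point about letting the clock framework track users can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct model_clk and the model_* helpers are hypothetical names invented here, not the kernel clk API, and the divide-by-7 helper simply encodes the PCK = LVDS PLL / 7 relation quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a shared clock; not the kernel's struct clk. */
struct model_clk {
    const char *name;
    unsigned int enable_count;   /* number of consumers holding an enable */
    unsigned long rate_hz;
};

/* A consumer takes its own enable reference. */
void model_clk_enable(struct model_clk *clk)
{
    clk->enable_count++;
}

/* A consumer drops its reference; the clock only gates at zero. */
void model_clk_disable(struct model_clk *clk)
{
    assert(clk->enable_count > 0);
    clk->enable_count--;
}

bool model_clk_is_running(const struct model_clk *clk)
{
    return clk->enable_count > 0;
}

/* Pixel clock seen by the LCDC, per the thread: PCK = LVDS PLL / 7. */
unsigned long model_pck_from_lvds_pll(unsigned long lvds_pll_hz)
{
    return lvds_pll_hz / 7;
}
```

In this model, if only the LVDSC side enables the PLL and later disables it, the count drops to zero while the LCDC may still depend on the derived pixel clock. When both consumers hold a reference, the clock stays running until the last one lets go.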




Re: [PATCH] drm/i915/gvt: Fix uninitialized variable in handle_mmio()

2024-01-28 Thread Zhenyu Wang
On 2024.01.26 11:41:47 +0300, Dan Carpenter wrote:
> This code prints the wrong variable in the warning message.  It should
> print "i" instead of "info->offset".  On the first iteration "info" is
> uninitialized leading to a crash and on subsequent iterations it prints
> the previous offset instead of the current one.
> 
> Fixes: e0f74ed4634d ("i915/gvt: Separate the MMIO tracking table from GVT-g")
> Signed-off-by: Dan Carpenter 
> ---
>  drivers/gpu/drm/i915/gvt/handlers.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gvt/handlers.c 
> b/drivers/gpu/drm/i915/gvt/handlers.c
> index 90f6c1ece57d..efcb00472be2 100644
> --- a/drivers/gpu/drm/i915/gvt/handlers.c
> +++ b/drivers/gpu/drm/i915/gvt/handlers.c
> @@ -2849,8 +2849,7 @@ static int handle_mmio(struct intel_gvt_mmio_table_iter 
> *iter, u32 offset,
>   for (i = start; i < end; i += 4) {
>   p = intel_gvt_find_mmio_info(gvt, i);
>   if (p) {
> - WARN(1, "dup mmio definition offset %x\n",
> - info->offset);
> + WARN(1, "dup mmio definition offset %x\n", i);
>  
>   /* We return -EEXIST here to make GVT-g load fail.
>* So duplicated MMIO can be found as soon as
> -- 
> 2.43.0
>

Thanks for the fix.

Reviewed-by: Zhenyu Wang 
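
For readers following along, the pattern the fix restores can be sketched in plain userspace C: when a duplicate is found, report the loop variable that was just looked up rather than a pointer that may not have been assigned yet. The struct mmio_table type and both function names below are hypothetical stand-ins, not the GVT-g structures.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the MMIO tracking table; not the GVT-g types. */
struct mmio_table {
    const uint32_t *offsets;
    size_t count;
};

/* Return 1 if 'offset' is already registered, 0 otherwise. */
int mmio_table_contains(const struct mmio_table *t, uint32_t offset)
{
    for (size_t n = 0; n < t->count; n++)
        if (t->offsets[n] == offset)
            return 1;
    return 0;
}

/*
 * Walk [start, end) in 4-byte steps. On a duplicate, report the loop
 * variable itself through 'dup' and fail with -EEXIST.
 */
int check_mmio_range(const struct mmio_table *t, uint32_t start,
                     uint32_t end, uint32_t *dup)
{
    for (uint32_t i = start; i < end; i += 4) {
        if (mmio_table_contains(t, i)) {
            *dup = i;
            return -EEXIST;
        }
    }
    return 0;
}
```

The value written to dup is always the offset that was just found, so it is valid on the first iteration and never stale on later ones.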





Re: [PATCH 01/17] drm/msm/dpu: allow dpu_encoder_helper_phys_setup_cdm to work for DP

2024-01-28 Thread Dmitry Baryshkov
On Mon, 29 Jan 2024 at 05:06, Abhinav Kumar  wrote:
>
>
>
> On 1/26/2024 4:39 PM, Paloma Arellano wrote:
> >
> > On 1/25/2024 1:14 PM, Dmitry Baryshkov wrote:
> >> On 25/01/2024 21:38, Paloma Arellano wrote:
> >>> Generalize dpu_encoder_helper_phys_setup_cdm to be compatible with DP.
> >>>
> >>> Signed-off-by: Paloma Arellano 
> >>> ---
> >>>   .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h  |  4 +--
> >>>   .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c   | 31 ++-
> >>>   2 files changed, 18 insertions(+), 17 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> >>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> >>> index 993f263433314..37ac385727c3b 100644
> >>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> >>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
> >>> @@ -153,6 +153,7 @@ enum dpu_intr_idx {
> >>>* @hw_intf:Hardware interface to the intf registers
> >>>* @hw_wb:Hardware interface to the wb registers
> >>>* @hw_cdm:Hardware interface to the CDM registers
> >>> + * @cdm_cfg:CDM block config needed to store WB/DP block's CDM
> >>> configuration
> >>
> >> Please realign the description.
> > Ack
> >>
> >>>* @dpu_kms:Pointer to the dpu_kms top level
> >>>* @cached_mode:DRM mode cached at mode_set time, acted on in
> >>> enable
> >>>* @vblank_ctl_lock:Vblank ctl mutex lock to protect
> >>> vblank_refcount
> >>> @@ -183,6 +184,7 @@ struct dpu_encoder_phys {
> >>>   struct dpu_hw_intf *hw_intf;
> >>>   struct dpu_hw_wb *hw_wb;
> >>>   struct dpu_hw_cdm *hw_cdm;
> >>> +struct dpu_hw_cdm_cfg cdm_cfg;
> >>
> >> It might be slightly better to move it after all the pointers, so
> >> after the dpu_kms.
> > Ack
> >>
> >>>   struct dpu_kms *dpu_kms;
> >>>   struct drm_display_mode cached_mode;
> >>>   struct mutex vblank_ctl_lock;
> >>> @@ -213,7 +215,6 @@ static inline int
> >>> dpu_encoder_phys_inc_pending(struct dpu_encoder_phys *phys)
> >>>* @wbirq_refcount: Reference count of writeback interrupt
> >>>* @wb_done_timeout_cnt: number of wb done irq timeout errors
> >>>* @wb_cfg:  writeback block config to store fb related details
> >>> - * @cdm_cfg: cdm block config needed to store writeback block's CDM
> >>> configuration
> >>>* @wb_conn: backpointer to writeback connector
> >>>* @wb_job: backpointer to current writeback job
> >>>* @dest:   dpu buffer layout for current writeback output buffer
> >>> @@ -223,7 +224,6 @@ struct dpu_encoder_phys_wb {
> >>>   atomic_t wbirq_refcount;
> >>>   int wb_done_timeout_cnt;
> >>>   struct dpu_hw_wb_cfg wb_cfg;
> >>> -struct dpu_hw_cdm_cfg cdm_cfg;
> >>>   struct drm_writeback_connector *wb_conn;
> >>>   struct drm_writeback_job *wb_job;
> >>>   struct dpu_hw_fmt_layout dest;
> >>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> >>> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> >>> index 4cd2d9e3131a4..072fc6950e496 100644
> >>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> >>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
> >>> @@ -269,28 +269,21 @@ static void
> >>> dpu_encoder_phys_wb_setup_ctl(struct dpu_encoder_phys *phys_enc)
> >>>* This API does not handle
> >>> DPU_CHROMA_H1V2.
> >>>* @phys_enc:Pointer to physical encoder
> >>>*/
> >>> -static void dpu_encoder_helper_phys_setup_cdm(struct
> >>> dpu_encoder_phys *phys_enc)
> >>> +static void dpu_encoder_helper_phys_setup_cdm(struct
> >>> dpu_encoder_phys *phys_enc,
> >>> +  const struct dpu_format *dpu_fmt,
> >>> +  u32 output_type)
> >>>   {
> >>>   struct dpu_hw_cdm *hw_cdm;
> >>>   struct dpu_hw_cdm_cfg *cdm_cfg;
> >>>   struct dpu_hw_pingpong *hw_pp;
> >>> -struct dpu_encoder_phys_wb *wb_enc;
> >>> -const struct msm_format *format;
> >>> -const struct dpu_format *dpu_fmt;
> >>> -struct drm_writeback_job *wb_job;
> >>>   int ret;
> >>> if (!phys_enc)
> >>>   return;
> >>>   -wb_enc = to_dpu_encoder_phys_wb(phys_enc);
> >>> -cdm_cfg = &wb_enc->cdm_cfg;
> >>> +cdm_cfg = &phys_enc->cdm_cfg;
> >>>   hw_pp = phys_enc->hw_pp;
> >>>   hw_cdm = phys_enc->hw_cdm;
> >>> -wb_job = wb_enc->wb_job;
> >>> -
> >>> -format = msm_framebuffer_format(wb_enc->wb_job->fb);
> >>> -dpu_fmt = dpu_get_dpu_format_ext(format->pixel_format,
> >>> wb_job->fb->modifier);
> >>> if (!hw_cdm)
> >>>   return;
> >>> @@ -306,10 +299,10 @@ static void
> >>> dpu_encoder_helper_phys_setup_cdm(struct dpu_encoder_phys *phys_enc)
> >>> memset(cdm_cfg, 0, sizeof(struct dpu_hw_cdm_cfg));
> >>>   -cdm_cfg->output_width = wb_job->fb->width;
> >>> -cdm_cfg->output_height = wb_job->fb->height;
> >>> +cdm_cfg->output_width = phys_enc->cached_mode.hdisplay;
> >>> +

Re: [PATCH 17/17] drm/msm/dp: allow YUV420 mode for DP connector when VSC SDP supported

2024-01-28 Thread Abhinav Kumar




On 1/25/2024 2:05 PM, Dmitry Baryshkov wrote:

On 25/01/2024 21:38, Paloma Arellano wrote:

All the components of YUV420 over DP are added. Therefore, let's mark the
connector property as true for DP connector when the DP type is not eDP
and when VSC SDP is supported.

Signed-off-by: Paloma Arellano 
---
  drivers/gpu/drm/msm/dp/dp_display.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c 
b/drivers/gpu/drm/msm/dp/dp_display.c

index 4329435518351..97edd607400b8 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -370,11 +370,14 @@ static int dp_display_process_hpd_high(struct 
dp_display_private *dp)

  dp_link_process_request(dp->link);
-    if (!dp->dp_display.is_edp)
+    if (!dp->dp_display.is_edp) {
+    if (dp_panel_vsc_sdp_supported(dp->panel))
+    dp->dp_display.connector->ycbcr_420_allowed = true;


Please consider fixing a TODO in drm_bridge_connector_init().



I am not sure that TODO can ever be resolved for DP/HDMI usage of 
drm_bridge_connector.


We do not know whether the sink supports VSC SDP until we read the DPCD, 
and until we know that it does, there is no reason to mark the YUV modes 
as supported. This is the same logic followed across vendors.


drm_bridge_connector_init() happens much earlier than the point where we 
read the DPCD. The only thing that could be done is perhaps to add some 
callback to update_ycbcr_420_allowed once the DPCD is read. But I don't 
think it's absolutely necessary to have a callback just for this.



  drm_dp_set_subconnector_property(dp->dp_display.connector,
   connector_status_connected,
   dp->panel->dpcd,
   dp->panel->downstream_ports);
+    }
  edid = dp->panel->edid;




Re: [PATCH 01/17] drm/msm/dpu: allow dpu_encoder_helper_phys_setup_cdm to work for DP

2024-01-28 Thread Abhinav Kumar




On 1/26/2024 4:39 PM, Paloma Arellano wrote:


On 1/25/2024 1:14 PM, Dmitry Baryshkov wrote:

On 25/01/2024 21:38, Paloma Arellano wrote:

Generalize dpu_encoder_helper_phys_setup_cdm to be compatible with DP.

Signed-off-by: Paloma Arellano 
---
  .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h  |  4 +--
  .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c   | 31 ++-
  2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h

index 993f263433314..37ac385727c3b 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h
@@ -153,6 +153,7 @@ enum dpu_intr_idx {
   * @hw_intf:    Hardware interface to the intf registers
   * @hw_wb:    Hardware interface to the wb registers
   * @hw_cdm:    Hardware interface to the CDM registers
+ * @cdm_cfg:    CDM block config needed to store WB/DP block's CDM 
configuration


Please realign the description.

Ack



   * @dpu_kms:    Pointer to the dpu_kms top level
   * @cached_mode:    DRM mode cached at mode_set time, acted on in 
enable
   * @vblank_ctl_lock:    Vblank ctl mutex lock to protect 
vblank_refcount

@@ -183,6 +184,7 @@ struct dpu_encoder_phys {
  struct dpu_hw_intf *hw_intf;
  struct dpu_hw_wb *hw_wb;
  struct dpu_hw_cdm *hw_cdm;
+    struct dpu_hw_cdm_cfg cdm_cfg;


It might be slightly better to move it after all the pointers, so 
after the dpu_kms.

Ack



  struct dpu_kms *dpu_kms;
  struct drm_display_mode cached_mode;
  struct mutex vblank_ctl_lock;
@@ -213,7 +215,6 @@ static inline int 
dpu_encoder_phys_inc_pending(struct dpu_encoder_phys *phys)

   * @wbirq_refcount: Reference count of writeback interrupt
   * @wb_done_timeout_cnt: number of wb done irq timeout errors
   * @wb_cfg:  writeback block config to store fb related details
- * @cdm_cfg: cdm block config needed to store writeback block's CDM 
configuration

   * @wb_conn: backpointer to writeback connector
   * @wb_job: backpointer to current writeback job
   * @dest:   dpu buffer layout for current writeback output buffer
@@ -223,7 +224,6 @@ struct dpu_encoder_phys_wb {
  atomic_t wbirq_refcount;
  int wb_done_timeout_cnt;
  struct dpu_hw_wb_cfg wb_cfg;
-    struct dpu_hw_cdm_cfg cdm_cfg;
  struct drm_writeback_connector *wb_conn;
  struct drm_writeback_job *wb_job;
  struct dpu_hw_fmt_layout dest;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c

index 4cd2d9e3131a4..072fc6950e496 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_wb.c
@@ -269,28 +269,21 @@ static void 
dpu_encoder_phys_wb_setup_ctl(struct dpu_encoder_phys *phys_enc)
   * This API does not handle 
DPU_CHROMA_H1V2.

   * @phys_enc:Pointer to physical encoder
   */
-static void dpu_encoder_helper_phys_setup_cdm(struct 
dpu_encoder_phys *phys_enc)
+static void dpu_encoder_helper_phys_setup_cdm(struct 
dpu_encoder_phys *phys_enc,

+  const struct dpu_format *dpu_fmt,
+  u32 output_type)
  {
  struct dpu_hw_cdm *hw_cdm;
  struct dpu_hw_cdm_cfg *cdm_cfg;
  struct dpu_hw_pingpong *hw_pp;
-    struct dpu_encoder_phys_wb *wb_enc;
-    const struct msm_format *format;
-    const struct dpu_format *dpu_fmt;
-    struct drm_writeback_job *wb_job;
  int ret;
    if (!phys_enc)
  return;
  -    wb_enc = to_dpu_encoder_phys_wb(phys_enc);
-    cdm_cfg = &wb_enc->cdm_cfg;
+    cdm_cfg = &phys_enc->cdm_cfg;
  hw_pp = phys_enc->hw_pp;
  hw_cdm = phys_enc->hw_cdm;
-    wb_job = wb_enc->wb_job;
-
-    format = msm_framebuffer_format(wb_enc->wb_job->fb);
-    dpu_fmt = dpu_get_dpu_format_ext(format->pixel_format, 
wb_job->fb->modifier);

    if (!hw_cdm)
  return;
@@ -306,10 +299,10 @@ static void 
dpu_encoder_helper_phys_setup_cdm(struct dpu_encoder_phys *phys_enc)

    memset(cdm_cfg, 0, sizeof(struct dpu_hw_cdm_cfg));
  -    cdm_cfg->output_width = wb_job->fb->width;
-    cdm_cfg->output_height = wb_job->fb->height;
+    cdm_cfg->output_width = phys_enc->cached_mode.hdisplay;
+    cdm_cfg->output_height = phys_enc->cached_mode.vdisplay;


This is a semantic change. Instead of passing the FB size, this passes 
the mode dimensions. They are not guaranteed to be the same, 
especially for the WB case.




The WB job stores the output FB of WB. I cannot think of a use case 
where this would not match the current mode programmed to the WB encoder.


Yes, if it were the drm_plane's FB, then it could not be guaranteed, as 
the plane can scale the contents, but that's not the case here. Here it's 
the output FB of WB.



  cdm_cfg->output_fmt = dpu_fmt;
-    cdm_cfg->output_type = CDM_CDWN_OUTPUT_WB;
+    cdm_cfg->output_type = output_type;
  

Re: [PATCH 14/17] drm/msm/dpu: modify encoder programming for CDM over DP

2024-01-28 Thread Abhinav Kumar




On 1/27/2024 9:55 PM, Dmitry Baryshkov wrote:

On Sun, 28 Jan 2024 at 07:48, Paloma Arellano  wrote:



On 1/25/2024 1:57 PM, Dmitry Baryshkov wrote:

On 25/01/2024 21:38, Paloma Arellano wrote:

Adjust the encoder format programming in the case of video mode for DP
to accommodate CDM related changes.

Signed-off-by: Paloma Arellano 
---
   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c   | 16 +
   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h   |  8 +
   .../drm/msm/disp/dpu1/dpu_encoder_phys_vid.c  | 35 ---
   drivers/gpu/drm/msm/dp/dp_display.c   | 12 +++
   drivers/gpu/drm/msm/msm_drv.h |  9 -
   5 files changed, 75 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index b0896814c1562..99ec53446ad21 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -222,6 +222,22 @@ static u32 dither_matrix[DITHER_MATRIX_SZ] = {
   15, 7, 13, 5, 3, 11, 1, 9, 12, 4, 14, 6, 0, 8, 2, 10
   };
   +u32 dpu_encoder_get_drm_fmt(const struct drm_encoder *drm_enc,
const struct drm_display_mode *mode)
+{
+const struct dpu_encoder_virt *dpu_enc;
+const struct msm_display_info *disp_info;
+struct msm_drm_private *priv;
+
+dpu_enc = to_dpu_encoder_virt(drm_enc);
+disp_info = &dpu_enc->disp_info;
+priv = drm_enc->dev->dev_private;
+
+if (disp_info->intf_type == INTF_DP &&
+ msm_dp_is_yuv_420_enabled(priv->dp[disp_info->h_tile_instance[0]],
mode))


This should not require interacting with DP. If we got here, we must
be sure that 4:2:0 is supported and can be configured.

Ack. Will drop this function and only check for if the mode is YUV420.



+return DRM_FORMAT_YUV420;
+
+return DRM_FORMAT_RGB888;
+}
 bool dpu_encoder_is_widebus_enabled(const struct drm_encoder
*drm_enc)
   {
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
index 7b4afa71f1f96..62255d0aa4487 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h
@@ -162,6 +162,14 @@ int dpu_encoder_get_vsync_count(struct
drm_encoder *drm_enc);
*/
   bool dpu_encoder_is_widebus_enabled(const struct drm_encoder
*drm_enc);
   +/**
+ * dpu_encoder_get_drm_fmt - return DRM fourcc format
+ * @drm_enc:Pointer to previously created drm encoder structure
+ * @mode:Corresponding drm_display_mode for dpu encoder
+ */
+u32 dpu_encoder_get_drm_fmt(const struct drm_encoder *drm_enc,
+const struct drm_display_mode *mode);
+
   /**
* dpu_encoder_get_crc_values_cnt - get number of physical encoders
contained
*in virtual encoder that can collect CRC values
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
index e284bf448bdda..a1dde0ff35dc8 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
@@ -234,6 +234,7 @@ static void
dpu_encoder_phys_vid_setup_timing_engine(
   {
   struct drm_display_mode mode;
   struct dpu_hw_intf_timing_params timing_params = { 0 };
+struct dpu_hw_cdm *hw_cdm;
   const struct dpu_format *fmt = NULL;
   u32 fmt_fourcc = DRM_FORMAT_RGB888;
   unsigned long lock_flags;
@@ -254,17 +255,26 @@ static void
dpu_encoder_phys_vid_setup_timing_engine(
   DPU_DEBUG_VIDENC(phys_enc, "enabling mode:\n");
   drm_mode_debug_printmodeline(&mode);
   -if (phys_enc->split_role != ENC_ROLE_SOLO) {
+hw_cdm = phys_enc->hw_cdm;
+if (hw_cdm) {
+intf_cfg.cdm = hw_cdm->idx;
+fmt_fourcc = dpu_encoder_get_drm_fmt(phys_enc->parent, &mode);
+}
+
+if (phys_enc->split_role != ENC_ROLE_SOLO ||
+dpu_encoder_get_drm_fmt(phys_enc->parent, &mode) ==
DRM_FORMAT_YUV420) {
   mode.hdisplay >>= 1;
   mode.htotal >>= 1;
   mode.hsync_start >>= 1;
   mode.hsync_end >>= 1;
+mode.hskew >>= 1;


Separate patch.

Ack.



 DPU_DEBUG_VIDENC(phys_enc,
-"split_role %d, halve horizontal %d %d %d %d\n",
+"split_role %d, halve horizontal %d %d %d %d %d\n",
   phys_enc->split_role,
   mode.hdisplay, mode.htotal,
-mode.hsync_start, mode.hsync_end);
+mode.hsync_start, mode.hsync_end,
+mode.hskew);
   }
 drm_mode_to_intf_timing_params(phys_enc, &mode, &timing_params);
@@ -412,8 +422,15 @@ static int dpu_encoder_phys_vid_control_vblank_irq(
   static void dpu_encoder_phys_vid_enable(struct dpu_encoder_phys
*phys_enc)
   {
   struct dpu_hw_ctl *ctl;
+struct dpu_hw_cdm *hw_cdm;
+const struct dpu_format *fmt = NULL;
+u32 fmt_fourcc = DRM_FORMAT_RGB888;
 ctl = phys_enc->hw_ctl;
+hw_cdm = phys_enc->hw_cdm;
+if (hw_cdm)
+fmt_fourcc = dpu_encoder_get_drm_fmt(phys_enc->parent,

[drm-tip:drm-tip 1/7] drivers/gpu/drm/bridge/samsung-dsim.c:1504:3: error: implicit declaration of function 'samsung_dsim_set_stop_state' is invalid in C99

2024-01-28 Thread kernel test robot
tree:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
head:   0f1b42b9d395bd4097b2846230a13869dc638216
commit: cd3a0e22e5de2867cd98b5223094a467a5b0993d [1/7] Merge remote-tracking 
branch 'drm-misc/drm-misc-next' into drm-tip
config: arm-defconfig 
(https://download.01.org/0day-ci/archive/20240129/202401291018.wgyuxgmh-...@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project.git 
f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20240129/202401291018.wgyuxgmh-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202401291018.wgyuxgmh-...@intel.com/

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/bridge/samsung-dsim.c:1504:3: error: implicit declaration of 
>> function 'samsung_dsim_set_stop_state' is invalid in C99 
>> [-Werror,-Wimplicit-function-declaration]
   samsung_dsim_set_stop_state(dsi, true);
   ^
   drivers/gpu/drm/bridge/samsung-dsim.c:1504:3: note: did you mean 
'samsung_dsim_set_phy_ctrl'?
   drivers/gpu/drm/bridge/samsung-dsim.c:749:13: note: 
'samsung_dsim_set_phy_ctrl' declared here
   static void samsung_dsim_set_phy_ctrl(struct samsung_dsim *dsi)
   ^
   drivers/gpu/drm/bridge/samsung-dsim.c:1629:22: error: use of undeclared 
identifier 'samsung_dsim_atomic_disable'; did you mean 
'samsung_dsim_atomic_enable'?
   .atomic_disable = samsung_dsim_atomic_disable,
 ^~~
 samsung_dsim_atomic_enable
   drivers/gpu/drm/bridge/samsung-dsim.c:1487:13: note: 
'samsung_dsim_atomic_enable' declared here
   static void samsung_dsim_atomic_enable(struct drm_bridge *bridge,
   ^
   2 errors generated.


vim +/samsung_dsim_set_stop_state +1504 drivers/gpu/drm/bridge/samsung-dsim.c

e7447128ca4a25 Jagan Teki 2023-03-08  1497  
e7447128ca4a25 Jagan Teki 2023-03-08  1498  static void 
samsung_dsim_atomic_post_disable(struct drm_bridge *bridge,
e7447128ca4a25 Jagan Teki 2023-03-08  1499  
 struct drm_bridge_state *old_bridge_state)
e7447128ca4a25 Jagan Teki 2023-03-08  1500  {
e7447128ca4a25 Jagan Teki 2023-03-08  1501  struct samsung_dsim 
*dsi = bridge_to_dsi(bridge);
e7447128ca4a25 Jagan Teki 2023-03-08  1502  
b2fe2292624ac4 Dario Binacchi 2023-12-18  1503  if 
(!samsung_dsim_hw_is_exynos(dsi->plat_data->hw_type))
b2fe2292624ac4 Dario Binacchi 2023-12-18 @1504  
samsung_dsim_set_stop_state(dsi, true);
e7447128ca4a25 Jagan Teki 2023-03-08  1505  
e7447128ca4a25 Jagan Teki 2023-03-08  1506  dsi->state &= 
~DSIM_STATE_ENABLED;
e7447128ca4a25 Jagan Teki 2023-03-08  1507  
pm_runtime_put_sync(dsi->dev);
e7447128ca4a25 Jagan Teki 2023-03-08  1508  }
e7447128ca4a25 Jagan Teki 2023-03-08  1509  

:: The code at line 1504 was first introduced by commit
:: b2fe2292624ac4fc98dcdaf76c983d3f6e8455e5 drm: bridge: samsung-dsim: 
enter display mode in the enable() callback

:: TO: Dario Binacchi 
:: CC: Robert Foss 

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


[PATCH] nouveau: offload fence uevents work to workqueue

2024-01-28 Thread Dave Airlie
From: Dave Airlie 

This should break the deadlock between the fctx lock and the irq lock.

This offloads the processing of the work from the irq into a workqueue.

Signed-off-by: Dave Airlie 
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 24 ++--
 drivers/gpu/drm/nouveau/nouveau_fence.h |  1 +
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
b/drivers/gpu/drm/nouveau/nouveau_fence.c
index ca762ea55413..93f08f9479d8 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -103,6 +103,7 @@ nouveau_fence_context_kill(struct nouveau_fence_chan *fctx, 
int error)
 void
 nouveau_fence_context_del(struct nouveau_fence_chan *fctx)
 {
+   cancel_work_sync(&fctx->uevent_work);
nouveau_fence_context_kill(fctx, 0);
nvif_event_dtor(&fctx->event);
fctx->dead = 1;
@@ -145,12 +146,13 @@ nouveau_fence_update(struct nouveau_channel *chan, struct 
nouveau_fence_chan *fc
return drop;
 }
 
-static int
-nouveau_fence_wait_uevent_handler(struct nvif_event *event, void *repv, u32 
repc)
+static void
+nouveau_fence_uevent_work(struct work_struct *work)
 {
-   struct nouveau_fence_chan *fctx = container_of(event, typeof(*fctx), 
event);
+   struct nouveau_fence_chan *fctx = container_of(work, struct 
nouveau_fence_chan,
+  uevent_work);
unsigned long flags;
-   int ret = NVIF_EVENT_KEEP;
+   int drop = 0;
 
spin_lock_irqsave(&fctx->lock, flags);
if (!list_empty(&fctx->pending)) {
@@ -160,11 +162,20 @@ nouveau_fence_wait_uevent_handler(struct nvif_event 
*event, void *repv, u32 repc
fence = list_entry(fctx->pending.next, typeof(*fence), head);
chan = rcu_dereference_protected(fence->channel, 
lockdep_is_held(&fctx->lock));
if (nouveau_fence_update(chan, fctx))
-   ret = NVIF_EVENT_DROP;
+   drop = 1;
}
+   if (drop)
+   nvif_event_block(&fctx->event);
+
spin_unlock_irqrestore(&fctx->lock, flags);
+}
 
-   return ret;
+static int
+nouveau_fence_wait_uevent_handler(struct nvif_event *event, void *repv, u32 
repc)
+{
+   struct nouveau_fence_chan *fctx = container_of(event, typeof(*fctx), 
event);
+   schedule_work(&fctx->uevent_work);
+   return NVIF_EVENT_KEEP;
 }
 
 void
@@ -178,6 +189,7 @@ nouveau_fence_context_new(struct nouveau_channel *chan, 
struct nouveau_fence_cha
} args;
int ret;
 
+   INIT_WORK(&fctx->uevent_work, nouveau_fence_uevent_work);
INIT_LIST_HEAD(&fctx->flip);
INIT_LIST_HEAD(&fctx->pending);
spin_lock_init(&fctx->lock);
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h 
b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 64d33ae7f356..8bc065acfe35 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -44,6 +44,7 @@ struct nouveau_fence_chan {
u32 context;
char name[32];
 
+   struct work_struct uevent_work;
struct nvif_event event;
int notify_ref, dead, killed;
 };
-- 
2.43.0
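
The idea behind the patch can be sketched with a small single-threaded userspace model: the event handler only queues work and never takes the context lock, while the locked processing happens later from a worker. Everything here is a hypothetical illustration (struct fence_ctx, event_handler and run_pending_work are invented names, and a boolean stands in for the spinlock), so it should not be read as the nouveau implementation.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model; names do not match the nouveau code. */
struct fence_ctx {
    bool lock_held;        /* stands in for the fctx spinlock */
    bool work_pending;     /* stands in for a queued work item */
    int  pending_fences;   /* fences still waiting to be signalled */
    int  signalled;        /* fences processed by the worker */
};

/*
 * "Interrupt" side: only queue the work and return. It never looks at
 * lock_held, so it cannot block against whoever holds that lock.
 * Returns true if the work was newly queued.
 */
bool event_handler(struct fence_ctx *ctx)
{
    if (ctx->work_pending)
        return false;
    ctx->work_pending = true;
    return true;
}

/*
 * Worker side, run later outside the handler: take the lock, signal the
 * pending fences, release the lock. Returns false if nothing was queued
 * or the lock is busy (a real worker would wait for the lock instead).
 */
bool run_pending_work(struct fence_ctx *ctx)
{
    if (!ctx->work_pending || ctx->lock_held)
        return false;
    ctx->work_pending = false;
    ctx->lock_held = true;
    ctx->signalled += ctx->pending_fences;
    ctx->pending_fences = 0;
    ctx->lock_held = false;
    return true;
}
```

The handler completes even while the lock is held elsewhere, which is the property the commit message relies on to break the lock-ordering cycle.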



Re: [PATCH] drm/amd/display: add panel_power_savings sysfs entry to eDP connectors

2024-01-28 Thread Dominik Förderer

I've applied the patch to 6.7.2. The device then shows up under:

/sys/devices/pci:00/:00:08.1/:c1:00.0/drm/card1/card1-eDP-1/amdgpu/panel_power_savings
(on Framework Laptop 13 amd 7840U with 780M).

After a few tests I can say that, at least on my system, it's not working. 
Setting a value between 0 and 4 in /sys/.../panel_power_savings changes 
nothing in the panel behavior. There are no errors in the kernel log.


Setting an abmlevel via kernel option still works as intended.


The issue can be worked around by setting the panel_power_savings value, 
then changing the display resolution to a lower value and switching 
back. For example, this script works:



xterm -e 'echo 0 | sudo tee 
/sys/devices/pci:00/:00:08.1/:c1:00.0/drm/card1/card1-eDP-1/amdgpu/panel_power_savings'
gnome-randr modify -m 1920x1200@59.999 eDP-1 && gnome-randr modify -m 
2256x1504@59.999 eDP-1



Am 26.01.24 um 23:22 schrieb Hamza Mahfooz:

We want programs besides the compositor to be able to enable or disable
panel power saving features. However, since they are currently only
configurable through DRM properties, that isn't possible. So, to remedy
that issue introduce a new "panel_power_savings" sysfs attribute.

Cc: Mario Limonciello 
Signed-off-by: Hamza Mahfooz 
---
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 59 +++
1 file changed, 59 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index cd98b3565178..b3fcd833015d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6534,6 +6534,58 @@ 
amdgpu_dm_connector_atomic_duplicate_state(struct drm_connector 
*connector)

return _state->base;
}
+static ssize_t panel_power_savings_show(struct device *device,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct drm_connector *connector = dev_get_drvdata(device);
+ struct drm_device *dev = connector->dev;
+ ssize_t val;
+
+ drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
+ val = to_dm_connector_state(connector->state)->abm_level;
+ drm_modeset_unlock(&dev->mode_config.connection_mutex);
+
+ return sysfs_emit(buf, "%lu\n", val);
+}
+
+static ssize_t panel_power_savings_store(struct device *device,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct drm_connector *connector = dev_get_drvdata(device);
+ struct drm_device *dev = connector->dev;
+ long val;
+ int ret;
+
+ ret = kstrtol(buf, 0, &val);
+
+ if (ret)
+ return ret;
+
+ if (val < 0 || val > 4)
+ return -EINVAL;
+
+ drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
+ to_dm_connector_state(connector->state)->abm_level = val ?:
+ ABM_LEVEL_IMMEDIATE_DISABLE;
+ drm_modeset_unlock(&dev->mode_config.connection_mutex);
+
+ return count;
+}
+
+static DEVICE_ATTR_RW(panel_power_savings);
+
+static struct attribute *amdgpu_attrs[] = {
+ &dev_attr_panel_power_savings.attr,
+ NULL
+};
+
+static const struct attribute_group amdgpu_group = {
+ .name = "amdgpu",
+ .attrs = amdgpu_attrs
+};
+
static int
amdgpu_dm_connector_late_register(struct drm_connector *connector)
{
@@ -6541,6 +6593,13 @@ amdgpu_dm_connector_late_register(struct 
drm_connector *connector)

to_amdgpu_dm_connector(connector);
int r;
+ if (connector->connector_type == DRM_MODE_CONNECTOR_eDP) {
+ r = sysfs_create_group(&connector->kdev->kobj,
+ &amdgpu_group);
+ if (r)
+ return r;
+ }
+
amdgpu_dm_register_backlight_device(amdgpu_dm_connector);
if ((connector->connector_type == DRM_MODE_CONNECTOR_DisplayPort) ||



--
Dominik Förderer (Netzwerkadministrator)
Windeck-Gymnasium Bühl
Humboldtstr. 3
Tel. 07223/9409585



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [PATCH] drm/amd/display: add panel_power_savings sysfs entry to eDP connectors

2024-01-28 Thread administrator

I've applied the patch to 6.7.2. The device then shows up under:

/sys/devices/pci:00/:00:08.1/:c1:00.0/drm/card1/card1-eDP-1/amdgpu/panel_power_savings
(on Framework Laptop 13 amd 7840U with 780M).

After a few tests I can say that, at least on my system, it’s not
working. Setting a value between 0 and 4 in
/sys/.../panel_power_savings changes nothing in the panel behavior.
There are no errors in the kernel log.


Setting an abmlevel via kernel option still works as intended.

The issue can be resolved if one sets the panel_power_savings value,
then changes the display resolution to a lower value and then
switches back. For example, this script works:


xterm -e 'echo 0 | sudo tee  
/sys/devices/pci:00/:00:08.1/:c1:00.0/drm/card1/card1-eDP-1/amdgpu/panel_power_savings'
gnome-randr modify -m 1920x1200@59.999 eDP-1 && gnome-randr modify -m  
2256x1504@59.999 eDP-1



On 26.01.24 at 23:22, Hamza Mahfooz wrote:
We want programs besides the compositor to be able to enable or disable
panel power saving features. However, since they are currently only
configurable through DRM properties, that isn't possible. So, to remedy
that issue introduce a new "panel_power_savings" sysfs attribute.

Cc: Mario Limonciello 
Signed-off-by: Hamza Mahfooz 
---
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 59 +++
1 file changed, 59 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index cd98b3565178..b3fcd833015d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6534,6 +6534,58 @@  
amdgpu_dm_connector_atomic_duplicate_state(struct drm_connector  
*connector)

return &new_state->base;
}
+static ssize_t panel_power_savings_show(struct device *device,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct drm_connector *connector = dev_get_drvdata(device);
+ struct drm_device *dev = connector->dev;
+ ssize_t val;
+
+ drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
+ val = to_dm_connector_state(connector->state)->abm_level;
+ drm_modeset_unlock(&dev->mode_config.connection_mutex);
+
+ return sysfs_emit(buf, "%lu\n", val);
+}
+
+static ssize_t panel_power_savings_store(struct device *device,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct drm_connector *connector = dev_get_drvdata(device);
+ struct drm_device *dev = connector->dev;
+ long val;
+ int ret;
+
+ ret = kstrtol(buf, 0, &val);
+
+ if (ret)
+ return ret;
+
+ if (val < 0 || val > 4)
+ return -EINVAL;
+
+ drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
+ to_dm_connector_state(connector->state)->abm_level = val ?:
+ ABM_LEVEL_IMMEDIATE_DISABLE;
+ drm_modeset_unlock(&dev->mode_config.connection_mutex);
+
+ return count;
+}
+
+static DEVICE_ATTR_RW(panel_power_savings);
+
+static struct attribute *amdgpu_attrs[] = {
+ &dev_attr_panel_power_savings.attr,
+ NULL
+};
+
+static const struct attribute_group amdgpu_group = {
+ .name = "amdgpu",
+ .attrs = amdgpu_attrs
+};
+
static int
amdgpu_dm_connector_late_register(struct drm_connector *connector)
{
@@ -6541,6 +6593,13 @@ amdgpu_dm_connector_late_register(struct  
drm_connector *connector)

to_amdgpu_dm_connector(connector);
int r;
+ if (connector->connector_type == DRM_MODE_CONNECTOR_eDP) {
+ r = sysfs_create_group(&connector->kdev->kobj,
+ &amdgpu_group);
+ if (r)
+ return r;
+ }
+
amdgpu_dm_register_backlight_device(amdgpu_dm_connector);
if ((connector->connector_type == DRM_MODE_CONNECTOR_DisplayPort) ||






Re: [PATCH] drm/amd/display: add panel_power_savings sysfs entry to eDP connectors

2024-01-28 Thread Dominik Förderer

I've applied the patch to 6.7.2. The device then shows up under:

/sys/devices/pci:00/:00:08.1/:c1:00.0/drm/card1/card1-eDP-1/amdgpu/panel_power_savings
(on Framework Laptop 13 amd 7840U with 780M).

After a few tests I can say that, at least on my system, it’s not working.
Setting a value between 0 and 4 in /sys/.../panel_power_savings changes
nothing in the panel behavior. There are no errors in the kernel log.


Setting an abmlevel via kernel option still works as intended.


The issue can be resolved if one sets the panel_power_savings value,
then changes the display resolution to a lower value and then
switches back. For example, this script works:



xterm -e 'echo 0 | sudo tee 
/sys/devices/pci:00/:00:08.1/:c1:00.0/drm/card1/card1-eDP-1/amdgpu/panel_power_savings'
gnome-randr modify -m 1920x1200@59.999 eDP-1 && gnome-randr modify -m 
2256x1504@59.999 eDP-1



On 26.01.24 at 23:22, Hamza Mahfooz wrote:

We want programs besides the compositor to be able to enable or disable
panel power saving features. However, since they are currently only
configurable through DRM properties, that isn't possible. So, to remedy
that issue introduce a new "panel_power_savings" sysfs attribute.

Cc: Mario Limonciello 
Signed-off-by: Hamza Mahfooz 
---
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 59 +++
1 file changed, 59 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index cd98b3565178..b3fcd833015d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6534,6 +6534,58 @@ 
amdgpu_dm_connector_atomic_duplicate_state(struct drm_connector 
*connector)

return &new_state->base;
}
+static ssize_t panel_power_savings_show(struct device *device,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct drm_connector *connector = dev_get_drvdata(device);
+ struct drm_device *dev = connector->dev;
+ ssize_t val;
+
+ drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
+ val = to_dm_connector_state(connector->state)->abm_level;
+ drm_modeset_unlock(&dev->mode_config.connection_mutex);
+
+ return sysfs_emit(buf, "%lu\n", val);
+}
+
+static ssize_t panel_power_savings_store(struct device *device,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct drm_connector *connector = dev_get_drvdata(device);
+ struct drm_device *dev = connector->dev;
+ long val;
+ int ret;
+
+ ret = kstrtol(buf, 0, &val);
+
+ if (ret)
+ return ret;
+
+ if (val < 0 || val > 4)
+ return -EINVAL;
+
+ drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
+ to_dm_connector_state(connector->state)->abm_level = val ?:
+ ABM_LEVEL_IMMEDIATE_DISABLE;
+ drm_modeset_unlock(&dev->mode_config.connection_mutex);
+
+ return count;
+}
+
+static DEVICE_ATTR_RW(panel_power_savings);
+
+static struct attribute *amdgpu_attrs[] = {
+ &dev_attr_panel_power_savings.attr,
+ NULL
+};
+
+static const struct attribute_group amdgpu_group = {
+ .name = "amdgpu",
+ .attrs = amdgpu_attrs
+};
+
static int
amdgpu_dm_connector_late_register(struct drm_connector *connector)
{
@@ -6541,6 +6593,13 @@ amdgpu_dm_connector_late_register(struct 
drm_connector *connector)

to_amdgpu_dm_connector(connector);
int r;
+ if (connector->connector_type == DRM_MODE_CONNECTOR_eDP) {
+ r = sysfs_create_group(&connector->kdev->kobj,
+ &amdgpu_group);
+ if (r)
+ return r;
+ }
+
amdgpu_dm_register_backlight_device(amdgpu_dm_connector);
if ((connector->connector_type == DRM_MODE_CONNECTOR_DisplayPort) ||



--
Dominik Förderer (Netzwerkadministrator)
Windeck-Gymnasium Bühl
Humboldtstr. 3
Tel. 07223/9409585



OpenPGP_signature.asc
Description: OpenPGP digital signature
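
For readers following the thread, the input handling in panel_power_savings_store() can be sketched in user space. This is only an illustration under stated assumptions: strtol() stands in for the kernel's kstrtol(), the function name parse_panel_power_savings() is hypothetical, and the value 255 for ABM_LEVEL_IMMEDIATE_DISABLE is assumed here rather than taken from the patch. It does not touch any connector state.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Assumed value, for illustration only; the driver defines the real one. */
#define ABM_LEVEL_IMMEDIATE_DISABLE 255

/*
 * Parse a user-supplied level in the range 0..4.
 * A level of 0 is mapped to the "immediate disable" sentinel, mirroring
 * the "val ?: ABM_LEVEL_IMMEDIATE_DISABLE" expression in the patch.
 * Returns 0 on success or -EINVAL on any malformed or out-of-range input.
 */
int parse_panel_power_savings(const char *buf, long *level_out)
{
    char *end;
    long val;

    errno = 0;
    val = strtol(buf, &end, 0);
    if (errno != 0 || end == buf)
        return -EINVAL;

    /* Accept one trailing newline, as written by "echo 3 > file". */
    if (*end == '\n')
        end++;
    if (*end != '\0')
        return -EINVAL;

    if (val < 0 || val > 4)
        return -EINVAL;

    *level_out = val ? val : ABM_LEVEL_IMMEDIATE_DISABLE;
    return 0;
}
```

The standard conditional operator is used instead of the GNU "?:" shorthand so the sketch builds with any C compiler.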


Re: [PATCH next 10/11] block: Use a boolean expression instead of max() on booleans

2024-01-28 Thread Linus Torvalds
On Sun, 28 Jan 2024 at 14:22, David Laight  wrote:
>
> H blame gcc :-)

I do agree that the gcc warning quoting is unnecessarily ugly (even
just visually), but..

> The error message displays as '0' but is e2:80:98 30 e2:80:99
> I HATE UTF-8, it wouldn't be as bad if it were a bijection.

No, that's not the problem. The UTF-8 that gcc emits is fine.

And your email was also UTF-8:

Content-Type: text/plain; charset=UTF-8

The problem is that you clearly then used some other tool in between
that took the UTF-8 byte stream, and used it as (presumably) Latin1,
which is bogus.

If you just make everything use and stay as UTF-8, it all works out
beautifully. But I suspect you have an editor or a MUA that is fixed
in some 1980s mindset, and when you cut-and-pasted the UTF-8, it
treated it as Latin1.

Just make all your environment be utf-8, like it should be. It's not
the 80s any more. We don't do mullets, and we don't do Latin1, ok?

Linus
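
The mis-decoding described above can be reproduced with a small sketch. It assumes the intermediate tool read each UTF-8 byte as one Latin-1 character and then wrote the result back out as UTF-8; the helper name latin1_to_utf8() is made up for this illustration. Fed the bytes e2 80 98 30 e2 80 99 quoted in the thread, it turns 7 bytes into 13, which is where the extra garbage characters come from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Treat every input byte as a Latin-1 code point and re-encode it as UTF-8.
 * Bytes below 0x80 are copied; bytes 0x80..0xFF become two-byte sequences.
 * The output is NUL-terminated. Returns the number of bytes written
 * (excluding the terminator), or 0 if the output buffer is too small.
 */
size_t latin1_to_utf8(const unsigned char *in, size_t in_len,
                      unsigned char *out, size_t out_cap)
{
    size_t o = 0;

    if (out_cap == 0)
        return 0;

    for (size_t i = 0; i < in_len; i++) {
        unsigned char c = in[i];

        if (c < 0x80) {
            if (o + 1 >= out_cap)
                return 0;
            out[o++] = c;
        } else {
            if (o + 2 >= out_cap)
                return 0;
            out[o++] = (unsigned char)(0xC0 | (c >> 6));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

Keeping the whole pipeline in UTF-8 avoids this step entirely, so the quote characters survive as the original three-byte sequences.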


RE: [PATCH next 10/11] block: Use a boolean expression instead of max() on booleans

2024-01-28 Thread David Laight
From: Linus Torvalds
> Sent: 28 January 2024 19:59
> 
> On Sun, 28 Jan 2024 at 11:36, David Laight  wrote:
> >
> > However it generates:
> > error: comparison of constant ‘0’ with boolean expression is always 
> > true [-Werror=bool-compare]
> > inside the signedness check that max() does unless a '+ 0' is added.
> 
> Please fix your locale. You have random garbage characters there,
> presumably because you have some incorrect locale setting somewhere in
> your toolchain.

H blame gcc :-)
The error message displays as '0' but is e2:80:98 30 e2:80:99
I HATE UTF-8, it wouldn't be as bad if it were a bijection.

Let's see if adding 'LANG=C' to the shell script I use to
do kernel builds is enough.

I also managed to send parts 1 to 6 without deleting the RE:
(I have to cut from wordpad into a 'reply-all' of the first
message I send. Work uses mimecast and it has started bouncing
my copy of every message I send to the lists.)

Maybe I should start using telnet to send raw SMTP :-)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


[PATCH v3 3/3] drm/amdgpu: Implement check_async_props for planes

2024-01-28 Thread André Almeida
AMD GPUs can do async flips with changes on more properties than just
the FB ID, so implement a custom check_async_props for AMD planes.

Allow amdgpu to do async flips with overlay planes as well.

Signed-off-by: André Almeida 
---
v3: allow overlay planes

 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 29 +++
 1 file changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index 116121e647ca..ed75b69636b4 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -25,6 +25,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1430,6 +1431,33 @@ static void 
amdgpu_dm_plane_drm_plane_destroy_state(struct drm_plane *plane,
drm_atomic_helper_plane_destroy_state(plane, state);
 }
 
+static int amdgpu_dm_plane_check_async_props(struct drm_property *prop,
+ struct drm_plane *plane,
+ struct drm_plane_state *plane_state,
+ struct drm_mode_object *obj,
+ u64 prop_value, u64 old_val)
+{
+   struct drm_mode_config *config = &plane->dev->mode_config;
+   int ret;
+
+   if (prop != config->prop_fb_id &&
+   prop != config->prop_in_fence_fd) {
+   ret = drm_atomic_plane_get_property(plane, plane_state,
+   prop, &old_val);
+   return drm_atomic_check_prop_changes(ret, old_val, prop_value, prop);
+   }
+
+   if (plane_state->plane->type != DRM_PLANE_TYPE_PRIMARY &&
+   plane_state->plane->type != DRM_PLANE_TYPE_OVERLAY) {
+   drm_dbg_atomic(prop->dev,
+  "[OBJECT:%d] Only primary or overlay planes can be changed during async flip\n",
+  obj->id);
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
 static const struct drm_plane_funcs dm_plane_funcs = {
.update_plane   = drm_atomic_helper_update_plane,
.disable_plane  = drm_atomic_helper_disable_plane,
@@ -1438,6 +1466,7 @@ static const struct drm_plane_funcs dm_plane_funcs = {
.atomic_duplicate_state = amdgpu_dm_plane_drm_plane_duplicate_state,
.atomic_destroy_state = amdgpu_dm_plane_drm_plane_destroy_state,
.format_mod_supported = amdgpu_dm_plane_format_mod_supported,
+   .check_async_props = amdgpu_dm_plane_check_async_props,
 };
 
 int amdgpu_dm_plane_init(struct amdgpu_display_manager *dm,
-- 
2.43.0



[PATCH v3 0/3] drm/atomic: Allow drivers to write their own plane check for async

2024-01-28 Thread André Almeida
Hi,

AMD hardware can do more on the async flip path than just the primary plane, so
to lift up the current restrictions, this patchset allows drivers to write their
own check for planes for async flips.

This patchset allows for async commits with IN_FENCE_FD in any driver and
overlay planes on AMD. Userspace can query if a driver supports this with
TEST_ONLY commits.

Changes from v2:
 - Allow IN_FENCE_FD for any driver
 - Allow overlay planes again
v2: https://lore.kernel.org/lkml/20240119181235.255060-1-andrealm...@igalia.com/

Changes from v1:
 - Drop overlay planes option for now
v1: https://lore.kernel.org/dri-devel/20240116045159.1015510-1-andrealm...@igalia.com/

André Almeida (3):
  drm/atomic: Allow drivers to write their own plane check for async
flips
  drm/atomic: Allow userspace to use explicit sync with atomic async
flips
  drm/amdgpu: Implement check_async_props for planes

 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 29 +
 drivers/gpu/drm/drm_atomic_uapi.c | 63 ++-
 include/drm/drm_atomic_uapi.h | 12 
 include/drm/drm_plane.h   |  5 ++
 4 files changed, 92 insertions(+), 17 deletions(-)

-- 
2.43.0
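
The dispatch this series introduces (use the driver's check when it provides one, otherwise fall back to the core rule) can be sketched outside the kernel as follows. Every type and name below is a hypothetical stand-in, not the DRM API, and the "changed" flag replaces the old/new property value comparison done by drm_atomic_check_prop_changes().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the plane type and property identifiers. */
enum plane_type { PLANE_PRIMARY, PLANE_OVERLAY, PLANE_CURSOR };
enum prop_id { PROP_FB_ID, PROP_IN_FENCE_FD, PROP_OTHER };

struct plane;

struct plane_funcs {
    /* Optional driver override; NULL means "use the core default". */
    int (*check_async_props)(const struct plane *plane,
                             enum prop_id prop, int changed);
};

struct plane {
    enum plane_type type;
    const struct plane_funcs *funcs;
};

/* Core rule: only FB_ID/IN_FENCE_FD may change, and only on primary planes. */
int default_check(const struct plane *plane, enum prop_id prop, int changed)
{
    if (prop != PROP_FB_ID && prop != PROP_IN_FENCE_FD)
        return changed ? -EINVAL : 0;
    return plane->type == PLANE_PRIMARY ? 0 : -EINVAL;
}

/* Driver rule: same as the core rule, but overlay planes are accepted too. */
int driver_check(const struct plane *plane, enum prop_id prop, int changed)
{
    if (prop != PROP_FB_ID && prop != PROP_IN_FENCE_FD)
        return changed ? -EINVAL : 0;
    if (plane->type != PLANE_PRIMARY && plane->type != PLANE_OVERLAY)
        return -EINVAL;
    return 0;
}

/* Entry point: prefer the driver hook, otherwise apply the default. */
int check_plane_changes(const struct plane *plane, enum prop_id prop,
                        int changed)
{
    if (plane->funcs && plane->funcs->check_async_props)
        return plane->funcs->check_async_props(plane, prop, changed);
    return default_check(plane, prop, changed);
}
```

With this shape, a driver that leaves the hook unset keeps the existing restrictions, which is why userspace is expected to probe support with TEST_ONLY commits.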



[PATCH v3 1/3] drm/atomic: Allow drivers to write their own plane check for async flips

2024-01-28 Thread André Almeida
Some hardware is more flexible about what it can flip asynchronously, so
rework the plane check so drivers can implement their own check, lifting
some of the restrictions.

Signed-off-by: André Almeida 
---
v3: no changes

 drivers/gpu/drm/drm_atomic_uapi.c | 62 ++-
 include/drm/drm_atomic_uapi.h | 12 ++
 include/drm/drm_plane.h   |  5 +++
 3 files changed, 62 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
b/drivers/gpu/drm/drm_atomic_uapi.c
index aee4a65d4959..6d5b9fec90c7 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -620,7 +620,7 @@ static int drm_atomic_plane_set_property(struct drm_plane 
*plane,
return 0;
 }
 
-static int
+int
 drm_atomic_plane_get_property(struct drm_plane *plane,
const struct drm_plane_state *state,
struct drm_property *property, uint64_t *val)
@@ -683,6 +683,7 @@ drm_atomic_plane_get_property(struct drm_plane *plane,
 
return 0;
 }
+EXPORT_SYMBOL(drm_atomic_plane_get_property);
 
 static int drm_atomic_set_writeback_fb_for_connector(
struct drm_connector_state *conn_state,
@@ -1026,18 +1027,54 @@ int drm_atomic_connector_commit_dpms(struct 
drm_atomic_state *state,
return ret;
 }
 
-static int drm_atomic_check_prop_changes(int ret, uint64_t old_val, uint64_t prop_value,
+int drm_atomic_check_prop_changes(int ret, uint64_t old_val, uint64_t prop_value,
 struct drm_property *prop)
 {
if (ret != 0 || old_val != prop_value) {
drm_dbg_atomic(prop->dev,
-  "[PROP:%d:%s] No prop can be changed during async flip\n",
+  "[PROP:%d:%s] This prop cannot be changed during async flip\n",
   prop->base.id, prop->name);
return -EINVAL;
}
 
return 0;
 }
+EXPORT_SYMBOL(drm_atomic_check_prop_changes);
+
+/* plane changes may have exceptions, so we have a special function for them */
+static int drm_atomic_check_plane_changes(struct drm_property *prop,
+ struct drm_plane *plane,
+ struct drm_plane_state *plane_state,
+ struct drm_mode_object *obj,
+ u64 prop_value, u64 old_val)
+{
+   struct drm_mode_config *config = &plane->dev->mode_config;
+   int ret;
+
+   if (plane->funcs->check_async_props)
+   return plane->funcs->check_async_props(prop, plane, plane_state,
+obj, prop_value, old_val);
+
+   /*
+* if you are trying to change something other than the FB ID, your
+* change will be either rejected or ignored, so we can stop the check
+* here
+*/
+   if (prop != config->prop_fb_id) {
+   ret = drm_atomic_plane_get_property(plane, plane_state,
+   prop, &old_val);
+   return drm_atomic_check_prop_changes(ret, old_val, prop_value, prop);
+   }
+
+   if (plane_state->plane->type != DRM_PLANE_TYPE_PRIMARY) {
+   drm_dbg_atomic(prop->dev,
+  "[OBJECT:%d] Only primary planes can be changed during async flip\n",
+  obj->id);
+   return -EINVAL;
+   }
+
+   return 0;
+}
 
 int drm_atomic_set_property(struct drm_atomic_state *state,
struct drm_file *file_priv,
@@ -1100,7 +1137,6 @@ int drm_atomic_set_property(struct drm_atomic_state 
*state,
case DRM_MODE_OBJECT_PLANE: {
struct drm_plane *plane = obj_to_plane(obj);
struct drm_plane_state *plane_state;
-   struct drm_mode_config *config = &plane->dev->mode_config;
 
plane_state = drm_atomic_get_plane_state(state, plane);
if (IS_ERR(plane_state)) {
@@ -1108,19 +1144,11 @@ int drm_atomic_set_property(struct drm_atomic_state 
*state,
break;
}
 
-   if (async_flip && prop != config->prop_fb_id) {
-   ret = drm_atomic_plane_get_property(plane, plane_state,
-   prop, &old_val);
-   ret = drm_atomic_check_prop_changes(ret, old_val, prop_value, prop);
-   break;
-   }
-
-   if (async_flip && plane_state->plane->type != DRM_PLANE_TYPE_PRIMARY) {
-   drm_dbg_atomic(prop->dev,
-  "[OBJECT:%d] Only primary planes can be changed during async flip\n",
-  obj->id);
-   ret = -EINVAL;
-   break;
+   if (async_flip) {
+   

[PATCH v3 2/3] drm/atomic: Allow userspace to use explicit sync with atomic async flips

2024-01-28 Thread André Almeida
Allow userspace to use explicit synchronization with atomic async flips.
That means that the flip will wait for some hardware fence, and then
will flip as soon as possible (async) with regard to the vblank.

Signed-off-by: André Almeida 
---
v3: new patch

 drivers/gpu/drm/drm_atomic_uapi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
b/drivers/gpu/drm/drm_atomic_uapi.c
index 6d5b9fec90c7..edae7924ad69 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -1060,7 +1060,8 @@ static int drm_atomic_check_plane_changes(struct 
drm_property *prop,
 * change will be either rejected or ignored, so we can stop the check
 * here
 */
-   if (prop != config->prop_fb_id) {
+   if (prop != config->prop_fb_id &&
+   prop != config->prop_in_fence_fd) {
ret = drm_atomic_plane_get_property(plane, plane_state,
prop, &old_val);
return drm_atomic_check_prop_changes(ret, old_val, prop_value, prop);
-- 
2.43.0



[PATCH] drm/imagination: Use memdup_user() rather than duplicating its implementation

2024-01-28 Thread Markus Elfring
From: Markus Elfring 
Date: Sun, 28 Jan 2024 20:50:36 +0100

* Reuse existing functionality from memdup_user() instead of keeping
  duplicate source code.

  Generated by: scripts/coccinelle/api/memdup_user.cocci

* Delete labels and statements which became unnecessary
  with this refactoring.

Signed-off-by: Markus Elfring 
---
 drivers/gpu/drm/imagination/pvr_context.c | 21 +++--
 drivers/gpu/drm/imagination/pvr_job.c | 15 +++
 2 files changed, 6 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/imagination/pvr_context.c 
b/drivers/gpu/drm/imagination/pvr_context.c
index eded5e955cc0..27814ae8a8f8 100644
--- a/drivers/gpu/drm/imagination/pvr_context.c
+++ b/drivers/gpu/drm/imagination/pvr_context.c
@@ -66,29 +66,14 @@ static int
 process_static_context_state(struct pvr_device *pvr_dev, const struct 
pvr_stream_cmd_defs *cmd_defs,
 u64 stream_user_ptr, u32 stream_size, void *dest)
 {
-   void *stream;
int err;
+   void *stream = memdup_user(u64_to_user_ptr(stream_user_ptr), 
stream_size);

-   stream = kzalloc(stream_size, GFP_KERNEL);
-   if (!stream)
-   return -ENOMEM;
-
-   if (copy_from_user(stream, u64_to_user_ptr(stream_user_ptr), 
stream_size)) {
-   err = -EFAULT;
-   goto err_free;
-   }
+   if (IS_ERR(stream))
+   return PTR_ERR(stream);

err = pvr_stream_process(pvr_dev, cmd_defs, stream, stream_size, dest);
-   if (err)
-   goto err_free;
-
kfree(stream);
-
-   return 0;
-
-err_free:
-   kfree(stream);
-
return err;
 }

diff --git a/drivers/gpu/drm/imagination/pvr_job.c 
b/drivers/gpu/drm/imagination/pvr_job.c
index 78c2f3c6dce0..e17d53b93b1f 100644
--- a/drivers/gpu/drm/imagination/pvr_job.c
+++ b/drivers/gpu/drm/imagination/pvr_job.c
@@ -87,23 +87,14 @@ static int pvr_fw_cmd_init(struct pvr_device *pvr_dev, 
struct pvr_job *job,
   const struct pvr_stream_cmd_defs *stream_def,
   u64 stream_userptr, u32 stream_len)
 {
-   void *stream;
int err;
+   void *stream = memdup_user(u64_to_user_ptr(stream_userptr), stream_len);

-   stream = kzalloc(stream_len, GFP_KERNEL);
-   if (!stream)
-   return -ENOMEM;
-
-   if (copy_from_user(stream, u64_to_user_ptr(stream_userptr), 
stream_len)) {
-   err = -EFAULT;
-   goto err_free_stream;
-   }
+   if (IS_ERR(stream))
+   return PTR_ERR(stream);

err = pvr_job_process_stream(pvr_dev, stream_def, stream, stream_len, 
job);
-
-err_free_stream:
kfree(stream);
-
return err;
 }

--
2.43.0
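
The simplification in this patch (one helper that allocates and copies, followed by a single exit path) can be illustrated with a user-space analogue. This is a sketch under assumptions: dup_bytes() and checksum_of_copy() are hypothetical names, malloc()/memcpy() stand in for the kernel allocation and user copy, and failure is reported with NULL rather than the ERR_PTR convention that memdup_user() uses with IS_ERR()/PTR_ERR().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate a buffer and copy len bytes of src into it.
 * Returns NULL if the allocation fails; the caller releases it with free().
 */
void *dup_bytes(const void *src, size_t len)
{
    /* Request at least one byte so a zero-length copy is not a failure. */
    void *copy = malloc(len ? len : 1);

    if (copy && len)
        memcpy(copy, src, len);
    return copy;
}

/*
 * Duplicate the input, process the private copy, free it, return the status.
 * Because the helper either fully succeeds or returns NULL, no error labels
 * or goto statements are needed.
 */
int checksum_of_copy(const void *src, size_t len, unsigned int *sum_out)
{
    unsigned char *copy = dup_bytes(src, len);
    unsigned int sum = 0;

    if (!copy)
        return -1;

    for (size_t i = 0; i < len; i++)
        sum += copy[i];

    free(copy);
    *sum_out = sum;
    return 0;
}
```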



Re: [PATCH next 10/11] block: Use a boolean expression instead of max() on booleans

2024-01-28 Thread Linus Torvalds
On Sun, 28 Jan 2024 at 11:36, David Laight  wrote:
>
> However it generates:
> error: comparison of constant ‘0’ with boolean expression is always true 
> [-Werror=bool-compare]
> inside the signedness check that max() does unless a '+ 0' is added.

Please fix your locale. You have random garbage characters there,
presumably because you have some incorrect locale setting somewhere in
your toolchain.

   Linus


[PATCH next 11/11] minmax: min() and max() don't need to return constant expressions

2024-01-28 Thread David Laight
After changing the handful of places max() was used to size an on-stack
array to use max_const() it is no longer necessary for min() and max()
to return constant expressions from constant inputs.
Remove the associated logic to reduce the expanded text.

Remove the 'hack' that allowed max(bool, bool).

Fixup the initial block comment to match current reality.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index c08916588425..5e65c98ff256 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -8,13 +8,10 @@
 #include 
 
 /*
- * min()/max()/clamp() macros must accomplish three things:
+ * min()/max()/clamp() macros must accomplish several things:
  *
  * - Avoid multiple evaluations of the arguments (so side-effects like
  *   "x++" happen only once) when non-constant.
- * - Retain result as a constant expressions when called with only
- *   constant expressions (to avoid tripping VLA warnings in stack
- *   allocation usage).
  * - Perform signed v unsigned type-checking (to generate compile
  *   errors instead of nasty runtime surprises).
  * - Unsigned char/short are always promoted to signed int and can be
@@ -22,13 +19,19 @@
  * - Unsigned arguments can be compared against non-negative signed constants.
  * - Comparison of a signed argument against an unsigned constant fails
  *   even if the constant is below __INT_MAX__ and could be cast to int.
+ *
+ * The return value of min()/max() is not a constant expression for
+ * constant parameters - so will trigger a VLA warning if used to size
+ * an on-stack array.
+ * Instead use min_const() or max_const() which do generate constant
+ * expressions and are also valid for static initialisers.
  */
 #define __typecheck(x, y) \
(!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
 
 /* Allow unsigned compares against non-negative signed constants. */
 #define __is_ok_unsigned(x) \
-   (is_unsigned_type(typeof(x)) || (__is_constexpr(x) ? (x) + 0 >= 0 : 0))
+   (is_unsigned_type(typeof(x)) || (__is_constexpr(x) ? (x) >= 0 : 0))
 
 /* Check for signed after promoting unsigned char/short to int */
 #define __is_ok_signed(x) is_signed_type(typeof((x) + 0))
@@ -53,12 +56,10 @@
typeof(y) __y_##uniq = (y); \
__cmp(op, __x_##uniq, __y_##uniq); })
 
-#define __careful_cmp(op, x, y, uniq)  \
-   __builtin_choose_expr(__is_constexpr((x) - (y)),\
-   __cmp(op, x, y),\
-   ({ _Static_assert(__types_ok(x, y), \
-   #op "(" #x ", " #y ") signedness error, fix types or consider u" #op "() before " #op "_t()"); \
-   __cmp_once(op, x, y, uniq); }))
+#define __careful_cmp(op, x, y, uniq) ({   \
+   _Static_assert(__types_ok(x, y),\
+   #op "(" #x ", " #y ") signedness error, fix types or consider u" #op "() before " #op "_t()"); \
+   __cmp_once(op, x, y, uniq); })
 
 #define __careful_cmp_const(op, x, y)  \
(BUILD_BUG_ON_ZERO(!__is_constexpr((x) - (y))) +\
-- 
2.17.1
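
The split between a constant-expression maximum and a run-time one can be shown in portable C. This is a rough sketch, not the kernel macros: MAX_CONST() and max_once() are hypothetical stand-ins for max_const() and max(), and none of the signedness checking from minmax.h is reproduced.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for max_const(): a plain conditional expression.
 * With constant arguments the result is itself a constant expression, so
 * it can size arrays and appear in static initialisers. Arguments may be
 * evaluated more than once, so they must be free of side effects.
 */
#define MAX_CONST(a, b) ((a) > (b) ? (a) : (b))

enum { SMALL_COUNT = 3, LARGE_COUNT = 8 };

/* File scope: both the array size and the initialiser must be constant. */
static int table[MAX_CONST(SMALL_COUNT, LARGE_COUNT)];
static const int biggest = MAX_CONST(SMALL_COUNT, LARGE_COUNT);

/*
 * Hypothetical stand-in for the run-time max(): each argument is evaluated
 * exactly once, but the result is never a constant expression.
 */
int max_once(int a, int b)
{
    return a > b ? a : b;
}

/* Number of elements in the constant-sized table. */
size_t table_len(void)
{
    return sizeof(table) / sizeof(table[0]);
}

/* Value stored by the static initialiser. */
int biggest_count(void)
{
    return biggest;
}
```

Using max_once() to size the file-scope array would not compile, which is the user-space counterpart of the VLA warning mentioned in the new block comment.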




[PATCH next 10/11] block: Use a boolean expression instead of max() on booleans

2024-01-28 Thread David Laight
blk_stack_limits() contains:
t->zoned = max(t->zoned, b->zoned);
These are bool, so it is just a bitwise or.
However it generates:
error: comparison of constant ‘0’ with boolean expression is always true 
[-Werror=bool-compare]
inside the signedness check that max() does unless a '+ 0' is added.
It is a shame the compiler generates this warning for code that will
be optimised away.

Change so that the extra '+ 0' can be removed.

Signed-off-by: David Laight 
---
 block/blk-settings.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 06ea91e51b8b..9ca21fea039d 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -688,7 +688,7 @@ int blk_stack_limits(struct queue_limits *t, struct 
queue_limits *b,
   b->max_secure_erase_sectors);
t->zone_write_granularity = max(t->zone_write_granularity,
b->zone_write_granularity);
-   t->zoned = max(t->zoned, b->zoned);
+   t->zoned = t->zoned | b->zoned;
return ret;
 }
 EXPORT_SYMBOL(blk_stack_limits);
-- 
2.17.1
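
The claim that max() on two booleans is just a bitwise OR can be checked exhaustively, since there are only four input combinations. A small sketch (the function names are made up for this illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Larger of two bools, written as an explicit comparison. */
bool bool_max(bool a, bool b)
{
    return a > b ? a : b;
}

/* The replacement used in the patch: bitwise OR of two bools. */
bool bool_or(bool a, bool b)
{
    return a | b;
}

/* Returns true if both forms agree on all four input combinations. */
bool max_matches_or(void)
{
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            if (bool_max(a, b) != bool_or(a, b))
                return false;
    return true;
}
```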



[PATCH next 09/11] tree-wide: minmax: Replace all the uses of max() for array sizes with max_const()

2024-01-28 Thread David Laight
These are the only uses of max() that require a constant value
from constant parameters.
There don't seem to be any similar uses of min().

Replacing the max() by max_const() lets min()/max() be simplified,
speeding up compilation.

max_const() will convert enums to int (or unsigned int) so that the
casts added by max_t() are no longer needed.

Signed-off-by: David Laight 
---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 2 +-
 drivers/gpu/drm/drm_color_mgmt.c   | 4 ++--
 drivers/input/touchscreen/cyttsp4_core.c   | 2 +-
 drivers/net/can/usb/etas_es58x/es58x_devlink.c | 2 +-
 fs/btrfs/tree-checker.c| 2 +-
 lib/vsprintf.c | 4 ++--
 net/ipv4/proc.c| 2 +-
 net/ipv6/proc.c| 2 +-
 8 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index 00cd615bbcdc..935fb4014f7c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -708,7 +708,7 @@ static const char *smu_get_feature_name(struct smu_context 
*smu,
 size_t smu_cmn_get_pp_feature_mask(struct smu_context *smu,
   char *buf)
 {
-   int8_t sort_feature[max(SMU_FEATURE_COUNT, SMU_FEATURE_MAX)];
+   int8_t sort_feature[max_const(SMU_FEATURE_COUNT, SMU_FEATURE_MAX)];
uint64_t feature_mask;
int i, feature_index;
uint32_t count = 0;
diff --git a/drivers/gpu/drm/drm_color_mgmt.c b/drivers/gpu/drm/drm_color_mgmt.c
index d021497841b8..43a6bd0ca960 100644
--- a/drivers/gpu/drm/drm_color_mgmt.c
+++ b/drivers/gpu/drm/drm_color_mgmt.c
@@ -532,8 +532,8 @@ int drm_plane_create_color_properties(struct drm_plane 
*plane,
 {
struct drm_device *dev = plane->dev;
struct drm_property *prop;
-   struct drm_prop_enum_list enum_list[max_t(int, DRM_COLOR_ENCODING_MAX,
-  DRM_COLOR_RANGE_MAX)];
+   struct drm_prop_enum_list enum_list[max_const(DRM_COLOR_ENCODING_MAX,
+ DRM_COLOR_RANGE_MAX)];
int i, len;
 
if (WARN_ON(supported_encodings == 0 ||
diff --git a/drivers/input/touchscreen/cyttsp4_core.c 
b/drivers/input/touchscreen/cyttsp4_core.c
index 7cb26929dc73..c6884c3c3fca 100644
--- a/drivers/input/touchscreen/cyttsp4_core.c
+++ b/drivers/input/touchscreen/cyttsp4_core.c
@@ -871,7 +871,7 @@ static void cyttsp4_get_mt_touches(struct cyttsp4_mt_data 
*md, int num_cur_tch)
struct cyttsp4_touch tch;
int sig;
int i, j, t = 0;
-   int ids[max(CY_TMA1036_MAX_TCH, CY_TMA4XX_MAX_TCH)];
+   int ids[max_const(CY_TMA1036_MAX_TCH, CY_TMA4XX_MAX_TCH)];
 
memset(ids, 0, si->si_ofs.tch_abs[CY_TCH_T].max * sizeof(int));
for (i = 0; i < num_cur_tch; i++) {
diff --git a/drivers/net/can/usb/etas_es58x/es58x_devlink.c 
b/drivers/net/can/usb/etas_es58x/es58x_devlink.c
index 635edeb8f68c..28fa87668bf8 100644
--- a/drivers/net/can/usb/etas_es58x/es58x_devlink.c
+++ b/drivers/net/can/usb/etas_es58x/es58x_devlink.c
@@ -215,7 +215,7 @@ static int es58x_devlink_info_get(struct devlink *devlink,
struct es58x_sw_version *fw_ver = _dev->firmware_version;
struct es58x_sw_version *bl_ver = _dev->bootloader_version;
struct es58x_hw_revision *hw_rev = _dev->hardware_revision;
-   char buf[max(sizeof("xx.xx.xx"), sizeof("axxx/xxx"))];
+   char buf[max_const(sizeof("xx.xx.xx"), sizeof("axxx/xxx"))];
int ret = 0;
 
if (es58x_sw_version_is_valid(fw_ver)) {
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 6eccf8496486..aec4729a9a82 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -615,7 +615,7 @@ static int check_dir_item(struct extent_buffer *leaf,
 */
if (key->type == BTRFS_DIR_ITEM_KEY ||
key->type == BTRFS_XATTR_ITEM_KEY) {
-   char namebuf[max(BTRFS_NAME_LEN, XATTR_NAME_MAX)];
+   char namebuf[max_const(BTRFS_NAME_LEN, XATTR_NAME_MAX)];
 
read_extent_buffer(leaf, namebuf,
(unsigned long)(di + 1), name_len);
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 552738f14275..6c3c319afd86 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -1080,8 +1080,8 @@ char *resource_string(char *buf, char *end, struct 
resource *res,
 #define FLAG_BUF_SIZE  (2 * sizeof(res->flags))
 #define DECODED_BUF_SIZE   sizeof("[mem - 64bit pref window disabled]")
 #define RAW_BUF_SIZE   sizeof("[mem - flags 0x]")
-   char sym[max(2*RSRC_BUF_SIZE + DECODED_BUF_SIZE,
-2*RSRC_BUF_SIZE + FLAG_BUF_SIZE + RAW_BUF_SIZE)];
+   char sym[max_const(2*RSRC_BUF_SIZE + DECODED_BUF_SIZE,
+  2*RSRC_BUF_SIZE + FLAG_BUF_SIZE + 

[PATCH next 08/11] minmax: Add min_const() and max_const()

2024-01-28 Thread David Laight
The expansions of min() and max() contain statement expressions so they
are not valid for static initialisers.
min_const() and max_const() are expressions so can be used for static
initialisers.
The arguments are checked for being constant and for negative signed
values being converted to large unsigned values.

Using these to size on-stack arrays lets min/max be simplified.
Zero is added before the compare to convert enum values to integers,
avoiding the need for casts when enums have been used for constants.
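
The idea can be sketched in userspace roughly as follows (GCC/Clang only). IS_CONSTEXPR() copies the well-known kernel __is_constexpr() trick; FAIL_BUILD_IF() is a hypothetical replacement for BUILD_BUG_ON_ZERO(), and the signedness check is left out, so this is an approximation of the patch rather than the patch itself.

```c
#include <assert.h>

/*
 * Userspace copy of the kernel's __is_constexpr() trick: the conditional
 * has type 'int *' only when (x) is an integer constant expression,
 * otherwise 'void *', and sizeof(void) is 1 as a GNU extension.
 */
#define IS_CONSTEXPR(x) \
    (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

/* Evaluates to 0, or breaks the build (negative array size) when e is true. */
#define FAIL_BUILD_IF(e) ((int)sizeof(char[(e) ? -1 : 1]) - 1)

#define CMP_MIN(x, y) ((x) < (y) ? (x) : (y))
#define CMP_MAX(x, y) ((x) > (y) ? (x) : (y))

/* Sketches of min_const()/max_const() without the signedness check. */
#define MIN_CONST(x, y) \
    (FAIL_BUILD_IF(!IS_CONSTEXPR((x) - (y))) + CMP_MIN((x) + 0, (y) + 0))
#define MAX_CONST(x, y) \
    (FAIL_BUILD_IF(!IS_CONSTEXPR((x) - (y))) + CMP_MAX((x) + 0, (y) + 0))

/* Plain expressions, so they can be used as static initialisers. */
static const int smallest = MIN_CONST(16, 64);
static const int largest = MAX_CONST(16, 64);

static int get_smallest(void) { return smallest; }
static int get_largest(void) { return largest; }
```

Passing a variable to either macro makes the array size negative, so misuse is caught at compile time instead of silently producing a non-constant value.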

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 278a390b8a4c..c08916588425 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -60,19 +60,34 @@
#op "(" #x ", " #y ") signedness error, fix types or consider u" #op "() before " #op "_t()"); \
__cmp_once(op, x, y, uniq); }))
 
+#define __careful_cmp_const(op, x, y)  \
+   (BUILD_BUG_ON_ZERO(!__is_constexpr((x) - (y))) +\
+   BUILD_BUG_ON_ZERO(!__types_ok(x, y)) +  \
+   __cmp(op, (x) + 0, (y) + 0))
+
 /**
  * min - return minimum of two values of the same or compatible types
  * @x: first value
  * @y: second value
+ *
+ * If @x and @y are constants the return value is constant, but not 'constant
+ * enough' for things like static initialisers.
+ * min_const(@x, @y) is a constant expression for constant inputs.
  */
 #define min(x, y)  __careful_cmp(min, x, y, __COUNTER__)
+#define min_const(x, y)__careful_cmp_const(min, x, y)
 
 /**
  * max - return maximum of two values of the same or compatible types
  * @x: first value
  * @y: second value
+ *
+ * If @x and @y are constants the return value is constant, but not 'constant
+ * enough' for things like static initialisers.
+ * max_const(@x, @y) is a constant expression for constant inputs.
  */
 #define max(x, y)  __careful_cmp(max, x, y, __COUNTER__)
+#define max_const(x, y)__careful_cmp_const(max, x, y)
 
 /**
  * umin - return minimum of two non-negative values
-- 
2.17.1




RE: [PATCH next 02/11] minmax: Use _Static_assert() instead of static_assert()

2024-01-28 Thread David Laight
The wrapper just adds two more lines of error output when the test fails.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 63c45865b48a..900eec7a28e5 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -48,7 +48,7 @@
 #define __cmp_once(op, x, y, unique_x, unique_y) ({\
typeof(x) unique_x = (x);   \
typeof(y) unique_y = (y);   \
-   static_assert(__types_ok(x, y), \
+   _Static_assert(__types_ok(x, y),\
#op "(" #x ", " #y ") signedness error, fix types or consider u" #op "() before " #op "_t()"); \
__cmp(op, unique_x, unique_y); })
 
@@ -137,11 +137,11 @@
typeof(val) unique_val = (val); 
\
typeof(lo) unique_lo = (lo);
\
typeof(hi) unique_hi = (hi);
\
-   static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),
\
+   _Static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),   
\
(lo) <= (hi), true),
\
"clamp() low limit " #lo " greater than high limit " #hi);  
\
-   static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");
\
-   static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");
\
+   _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
+   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
__clamp(unique_val, unique_lo, unique_hi); })
 
 #define __careful_clamp(val, lo, hi) ({
\
-- 
2.17.1




RE: [PATCH next 01/11] minmax: Put all the clamp() definitions together

2024-01-28 Thread David Laight
The defines for clamp() have become separated; move them together for readability.
Update description of signedness check.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 120 +++--
 1 file changed, 56 insertions(+), 64 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 2ec559284a9f..63c45865b48a 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -57,26 +57,6 @@
__cmp(op, x, y),\
__cmp_once(op, x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y)))
 
-#define __clamp(val, lo, hi)   \
-   ((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
-
-#define __clamp_once(val, lo, hi, unique_val, unique_lo, unique_hi) ({ 
\
-   typeof(val) unique_val = (val); 
\
-   typeof(lo) unique_lo = (lo);
\
-   typeof(hi) unique_hi = (hi);
\
-   static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),
\
-   (lo) <= (hi), true),
\
-   "clamp() low limit " #lo " greater than high limit " #hi);  
\
-   static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");
\
-   static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");
\
-   __clamp(unique_val, unique_lo, unique_hi); })
-
-#define __careful_clamp(val, lo, hi) ({
\
-   __builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),  \
-   __clamp(val, lo, hi),   \
-   __clamp_once(val, lo, hi, __UNIQUE_ID(__val),   \
-__UNIQUE_ID(__lo), __UNIQUE_ID(__hi))); })
-
 /**
  * min - return minimum of two values of the same or compatible types
  * @x: first value
@@ -124,6 +104,22 @@
  */
 #define max3(x, y, z) max((typeof(x))max(x, y), z)
 
+/**
+ * min_t - return minimum of two values, using the specified type
+ * @type: data type to use
+ * @x: first value
+ * @y: second value
+ */
+#define min_t(type, x, y)  __careful_cmp(min, (type)(x), (type)(y))
+
+/**
+ * max_t - return maximum of two values, using the specified type
+ * @type: data type to use
+ * @x: first value
+ * @y: second value
+ */
+#define max_t(type, x, y)  __careful_cmp(max, (type)(x), (type)(y))
+
 /**
  * min_not_zero - return the minimum that is _not_ zero, unless both are zero
  * @x: value1
@@ -134,39 +130,60 @@
typeof(y) __y = (y);\
__x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); })
 
+#define __clamp(val, lo, hi)   \
+   ((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
+
+#define __clamp_once(val, lo, hi, unique_val, unique_lo, unique_hi) ({ 
\
+   typeof(val) unique_val = (val); 
\
+   typeof(lo) unique_lo = (lo);
\
+   typeof(hi) unique_hi = (hi);
\
+   static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),
\
+   (lo) <= (hi), true),
\
+   "clamp() low limit " #lo " greater than high limit " #hi);  
\
+   static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");
\
+   static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");
\
+   __clamp(unique_val, unique_lo, unique_hi); })
+
+#define __careful_clamp(val, lo, hi) ({
\
+   __builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),  \
+   __clamp(val, lo, hi),   \
+   __clamp_once(val, lo, hi, __UNIQUE_ID(__val),   \
+__UNIQUE_ID(__lo), __UNIQUE_ID(__hi))); })
+
 /**
  * clamp - return a value clamped to a given range with strict typechecking
  * @val: current value
  * @lo: lowest allowable value
  * @hi: highest allowable value
  *
- * This macro does strict typechecking of @lo/@hi to make sure they are of the
- * same type as @val.  See the unnecessary pointer comparisons.
+ * This macro checks that @val, @lo and @hi have the same signedness.
  */
 #define clamp(val, lo, hi) __careful_clamp(val, lo, hi)
 
-/*
- * ..and if you can't take the strict
- * types, you can specify one yourself.
- *
- * Or not use min/max/clamp at all, of course.
- */
-
 /**
- * min_t - return minimum of two values, using the specified type
- * @type: data type to use
- * @x: first value
- * @y: second value
+ * clamp_t - return a value clamped to a given range using a given type
+ * @type: the type of variable to use
+ * @val: current value
+ * @lo: minimum allowable value
+ * @hi: maximum allowable value
+ *
+ * This macro does no typechecking and 

[PATCH next 00/11] minmax: Optimise to reduce .i line length.

2024-01-28 Thread David Laight
The changes to minmax.h that changed the type check to a signedness
check significantly increased the length of the expansion.
In some cases it has also significantly increased compile time.
This is particularly noticeable for nested expansions.

The fact that _Static_assert() only requires a compile time constant
not a constant expression allows a lot of simplification.

The other thing that complicates the expansion is the necessity of
returning a constant expression from constant arguments (for VLA).
I can only find a handful of places this is done.
Penalising most of the code for these few cases seems 'suboptimal'.
Instead I've added min_const() and max_const() for VLA and static
initialisers; these check that the arguments are constant to avoid misuse.

Patch [9] is dependent on the earlier patches.
Patch [10] isn't dependent on them.
Patch [11] depends on both 9 and 10.

David Laight (11):
  [1] minmax: Put all the clamp() definitions together
  [2] minmax: Use _Static_assert() instead of static_assert()
  [3] minmax: Simplify signedness check
  [4] minmax: Replace multiple __UNIQUE_ID() by directly using __COUNTER__
  [5] minmax: Move the signedness check out of __cmp_once() and
__clamp_once()
  [6] minmax: Remove 'constexpr' check from __careful_clamp()
  [7] minmax: minmax: Add __types_ok3() and optimise defines with 3
arguments
  [8] minmax: Add min_const() and max_const()
  [9] tree-wide: minmax: Replace all the uses of max() for array sizes with
max_const().
  [10] block: Use a boolean expression instead of max() on booleans
  [11] minmax: min() and max() don't need to return constant expressions

 block/blk-settings.c  |   2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c|   2 +-
 drivers/gpu/drm/drm_color_mgmt.c  |   4 +-
 drivers/input/touchscreen/cyttsp4_core.c  |   2 +-
 .../net/can/usb/etas_es58x/es58x_devlink.c|   2 +-
 fs/btrfs/tree-checker.c   |   2 +-
 include/linux/minmax.h| 211 ++
 lib/vsprintf.c|   4 +-
 net/ipv4/proc.c   |   2 +-
 net/ipv6/proc.c   |   2 +-
 10 files changed, 127 insertions(+), 106 deletions(-)

-- 
2.17.1




RE: [PATCH next 06/11] minmax: Remove 'constexpr' check from __careful_clamp()

2024-01-28 Thread David Laight
Nothing requires that clamp() return a constant expression.
The logic to do so significantly increases the .i file.
Remove the check and directly expand __clamp_once() from clamp_t()
since the type check can't fail.
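
A rough userspace sketch of the resulting clamp_t() shape (GCC/Clang, since it uses statement expressions and __typeof__). All macro names here are hypothetical stand-ins for the kernel ones. Every argument is cast to the requested type first, so no signedness check is needed before the single-evaluation body is expanded.

```c
#include <assert.h>

#define CLAMP_EXPR(val, lo, hi) \
    ((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))

/* Evaluate each argument once into a uniquely named temporary. */
#define CLAMP_ONCE(val, lo, hi, uniq) ({                    \
    __typeof__(val) __val_##uniq = (val);                   \
    __typeof__(lo) __lo_##uniq = (lo);                      \
    __typeof__(hi) __hi_##uniq = (hi);                      \
    CLAMP_EXPR(__val_##uniq, __lo_##uniq, __hi_##uniq); })

/* Cast everything to 'type', then expand the single-evaluation body. */
#define CLAMP_T_UNIQ(type, val, lo, hi, uniq) \
    CLAMP_ONCE((type)(val), (type)(lo), (type)(hi), uniq)
#define MY_CLAMP_T(type, val, lo, hi) \
    CLAMP_T_UNIQ(type, val, lo, hi, __COUNTER__)

/* Hypothetical caller: clamp a raw value into the range 0..100. */
static int clamp_percent(long raw)
{
    return MY_CLAMP_T(int, raw, 0, 100);
}
```

The result is an ordinary run-time value, not a constant expression, which matches the commit message's point that nothing requires clamp() to return one.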

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 111c52a14fe5..5c7fce76abe5 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -141,12 +141,10 @@
"clamp() low limit " #lo " greater than high limit " #hi);  
\
__clamp(__val_##uniq, __lo_##uniq, __hi_##uniq); })
 
-#define __careful_clamp(val, lo, hi, uniq) \
-   __builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),  \
-   __clamp(val, lo, hi),   \
-   ({ _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");\
-   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   \
-   __clamp_once(val, lo, hi, uniq); }))
+#define __careful_clamp(val, lo, hi, uniq) ({  
\
+   _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
+   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
+   __clamp_once(val, lo, hi, uniq); })
 
 /**
  * clamp - return a value clamped to a given range with strict typechecking
@@ -168,7 +166,9 @@
  * This macro does no typechecking and uses temporary variables of type
  * @type to make all the comparisons.
  */
-#define clamp_t(type, val, lo, hi) clamp((type)(val), (type)(lo), (type)(hi))
+#define __clamp_t(type, val, lo, hi, uniq) \
+   __clamp_once((type)(val), (type)(lo), (type)(hi), uniq)
+#define clamp_t(type, val, lo, hi) __clamp_t(type, val, lo, hi, __COUNTER__)
 
 /**
  * clamp_val - return a value clamped to a given range using val's type
-- 
2.17.1




RE: [PATCH next 07/11] minmax: minmax: Add __types_ok3() and optimise defines with 3 arguments

2024-01-28 Thread David Laight
min3() and max3() were added to optimise nested min(x, min(y, z))
sequences, but only moved where the expansion was requested.

Add a separate implementation for 3 argument calls.
These are never required to generate constant expressions so
remove that logic.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 5c7fce76abe5..278a390b8a4c 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -38,6 +38,11 @@
((__is_ok_signed(x) && __is_ok_signed(y)) ||\
 (__is_ok_unsigned(x) && __is_ok_unsigned(y)))
 
+/* Check three values for min3(), max3() and clamp() */
+#define __types_ok3(x, y, z)   
\
+   ((__is_ok_signed(x) && __is_ok_signed(y) && __is_ok_signed(z)) ||   
\
+(__is_ok_unsigned(x) && __is_ok_unsigned(y) && __is_ok_unsigned(z)))
+
 #define __cmp_op_min <
 #define __cmp_op_max >
 
@@ -87,13 +92,24 @@
 #define umax(x, y) \
__careful_cmp(max, __zero_extend(x), __zero_extend(y), __COUNTER__)
 
+#define __cmp_once3(op, x, y, z, uniq) ({  \
+   typeof(x) __x_##uniq = (x); \
+   typeof(x) __y_##uniq = (y); \
+   typeof(x) __z_##uniq = (z); \
+   __cmp(op, __cmp(op, __x_##uniq, __y_##uniq), __z_##uniq); })
+
+#define __careful_cmp3(op, x, y, z, uniq) ({   \
+   static_assert(__types_ok3(x, y, z), \
+   #op "3(" #x ", " #y ", " #z ") signedness error");  \
+   __cmp_once3(op, x, y, z, uniq); })
+
 /**
  * min3 - return minimum of three values
  * @x: first value
  * @y: second value
  * @z: third value
  */
-#define min3(x, y, z) min((typeof(x))min(x, y), z)
+#define min3(x, y, z) __careful_cmp3(min, x, y, z, __COUNTER__)
 
 /**
  * max3 - return maximum of three values
@@ -101,7 +117,7 @@
  * @y: second value
  * @z: third value
  */
-#define max3(x, y, z) max((typeof(x))max(x, y), z)
+#define max3(x, y, z) __careful_cmp3(max, x, y, z, __COUNTER__)
 
 /**
  * min_t - return minimum of two values, using the specified type
@@ -142,8 +158,7 @@
__clamp(__val_##uniq, __lo_##uniq, __hi_##uniq); })
 
 #define __careful_clamp(val, lo, hi, uniq) ({  
\
-   _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
-   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
+   _Static_assert(__types_ok3(val, lo, hi), "clamp() signedness error");   
\
__clamp_once(val, lo, hi, uniq); })
 
 /**
-- 
2.17.1




RE: [PATCH next 04/11] minmax: Replace multiple __UNIQUE_ID() by directly using __COUNTER__

2024-01-28 Thread David Laight
Provided __COUNTER__ is passed through an extra #define it can be pasted
onto multiple local variables to give unique names.
This saves having 3 __UNIQUE_ID() for #defines with three locals and
looks less messy in general.

Stop the umin()/umax() lines being overlong by factoring out the
zero-extension logic.
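
The pasting trick can be sketched in userspace like this (GCC/Clang, which provide __COUNTER__, __typeof__ and statement expressions). The macro names are hypothetical; only the layering follows the patch. The extra macro level matters because an argument used with '##' is not expanded, so __COUNTER__ has to be turned into a number one level earlier.

```c
#include <assert.h>

/*
 * 'uniq' is already a number when it reaches this macro, so pasting it
 * gives names such as __x_0 and __y_0.
 */
#define CMP_ONCE_MIN(x, y, uniq) ({                         \
    __typeof__(x) __x_##uniq = (x);                         \
    __typeof__(y) __y_##uniq = (y);                         \
    __x_##uniq < __y_##uniq ? __x_##uniq : __y_##uniq; })

/* The extra level: here 'uniq' is not pasted, so __COUNTER__ gets expanded. */
#define CAREFUL_MIN(x, y, uniq) CMP_ONCE_MIN(x, y, uniq)
#define MY_MIN(x, y) CAREFUL_MIN(x, y, __COUNTER__)

/* Nested use: inner and outer expansions get different counter values. */
static int min_of_three(int a, int b, int c)
{
    return MY_MIN(a, MY_MIN(b, c));
}

static int calls;

/* Helper with a side effect, to show the argument is evaluated only once. */
static int next_value(void)
{
    return ++calls;
}

static int min_with_side_effect(void)
{
    calls = 0;
    return MY_MIN(next_value(), 10);
}
```

If MY_MIN() invoked CMP_ONCE_MIN() directly, every expansion would declare the same two names (the literal token __COUNTER__ pasted on), and the nested call would initialise a temporary from itself.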

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 48 +-
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index c32b4b40ce01..8ee003d8abaf 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -8,7 +8,7 @@
 #include 
 
 /*
- * min()/max()/clamp() macros must accomplish several things:
+ * min()/max()/clamp() macros must accomplish three things:
  *
  * - Avoid multiple evaluations of the arguments (so side-effects like
  *   "x++" happen only once) when non-constant.
@@ -43,31 +43,31 @@
 
 #define __cmp(op, x, y)((x) __cmp_op_##op (y) ? (x) : (y))
 
-#define __cmp_once(op, x, y, unique_x, unique_y) ({\
-   typeof(x) unique_x = (x);   \
-   typeof(y) unique_y = (y);   \
-   _Static_assert(__types_ok(x, y),\
+#define __cmp_once(op, x, y, uniq) ({  \
+   typeof(x) __x_##uniq = (x); \
+   typeof(y) __y_##uniq = (y); \
+   _Static_assert(__types_ok(x, y),\
#op "(" #x ", " #y ") signedness error, fix types or consider u" #op "() before " #op "_t()"); \
-   __cmp(op, unique_x, unique_y); })
+   __cmp(op, __x_##uniq, __y_##uniq); })
 
-#define __careful_cmp(op, x, y)\
+#define __careful_cmp(op, x, y, uniq)  \
__builtin_choose_expr(__is_constexpr((x) - (y)),\
__cmp(op, x, y),\
-   __cmp_once(op, x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y)))
+   __cmp_once(op, x, y, uniq))
 
 /**
  * min - return minimum of two values of the same or compatible types
  * @x: first value
  * @y: second value
  */
-#define min(x, y)  __careful_cmp(min, x, y)
+#define min(x, y)  __careful_cmp(min, x, y, __COUNTER__)
 
 /**
  * max - return maximum of two values of the same or compatible types
  * @x: first value
  * @y: second value
  */
-#define max(x, y)  __careful_cmp(max, x, y)
+#define max(x, y)  __careful_cmp(max, x, y, __COUNTER__)
 
 /**
  * umin - return minimum of two non-negative values
@@ -75,8 +75,9 @@
  * @x: first value
  * @y: second value
  */
+#define __zero_extend(x) ((x) + 0u + 0ul + 0ull)
 #define umin(x, y) \
-   __careful_cmp(min, (x) + 0u + 0ul + 0ull, (y) + 0u + 0ul + 0ull)
+   __careful_cmp(min, __zero_extend(x), __zero_extend(y), __COUNTER__)
 
 /**
  * umax - return maximum of two non-negative values
@@ -84,7 +85,7 @@
  * @y: second value
  */
 #define umax(x, y) \
-   __careful_cmp(max, (x) + 0u + 0ul + 0ull, (y) + 0u + 0ul + 0ull)
+   __careful_cmp(max, __zero_extend(x), __zero_extend(y), __COUNTER__)
 
 /**
  * min3 - return minimum of three values
@@ -108,7 +109,7 @@
  * @x: first value
  * @y: second value
  */
-#define min_t(type, x, y)  __careful_cmp(min, (type)(x), (type)(y))
+#define min_t(type, x, y)  __careful_cmp(min, (type)(x), (type)(y), __COUNTER__)
 
 /**
  * max_t - return maximum of two values, using the specified type
@@ -116,7 +117,7 @@
  * @x: first value
  * @y: second value
  */
-#define max_t(type, x, y)  __careful_cmp(max, (type)(x), (type)(y))
+#define max_t(type, x, y)  __careful_cmp(max, (type)(x), (type)(y), __COUNTER__)
 
 /**
  * min_not_zero - return the minimum that is _not_ zero, unless both are zero
@@ -131,22 +132,21 @@
 #define __clamp(val, lo, hi)   \
((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
 
-#define __clamp_once(val, lo, hi, unique_val, unique_lo, unique_hi) ({ 
\
-   typeof(val) unique_val = (val); 
\
-   typeof(lo) unique_lo = (lo);
\
-   typeof(hi) unique_hi = (hi);
\
+#define __clamp_once(val, lo, hi, uniq) ({ 
\
+   typeof(val) __val_##uniq = (val);   
\
+   typeof(lo) __lo_##uniq = (lo);  
\
+   typeof(hi) __hi_##uniq = (hi);  
\
_Static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),   
\
(lo) <= (hi), true),
\
"clamp() low limit " #lo " greater than high limit " #hi);  
\
_Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
_Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
- 

RE: [PATCH next 05/11] minmax: Move the signedness check out of __cmp_once() and __clamp_once()

2024-01-28 Thread David Laight
There is no need to do the signedness/type check when the arguments
are being cast to a fixed type.
So move the check out of __xxx_once() into __careful_xxx().

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 8ee003d8abaf..111c52a14fe5 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -46,14 +46,14 @@
 #define __cmp_once(op, x, y, uniq) ({  \
typeof(x) __x_##uniq = (x); \
typeof(y) __y_##uniq = (y); \
-   _Static_assert(__types_ok(x, y),\
-   #op "(" #x ", " #y ") signedness error, fix types or consider u" #op "() before " #op "_t()"); \
__cmp(op, __x_##uniq, __y_##uniq); })
 
 #define __careful_cmp(op, x, y, uniq)  \
__builtin_choose_expr(__is_constexpr((x) - (y)),\
__cmp(op, x, y),\
-   __cmp_once(op, x, y, uniq))
+   ({ _Static_assert(__types_ok(x, y), \
+   #op "(" #x ", " #y ") signedness error, fix types or consider u" #op "() before " #op "_t()"); \
+   __cmp_once(op, x, y, uniq); }))
 
 /**
  * min - return minimum of two values of the same or compatible types
@@ -139,14 +139,14 @@
_Static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),   
\
(lo) <= (hi), true),
\
"clamp() low limit " #lo " greater than high limit " #hi);  
\
-   _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
-   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
__clamp(__val_##uniq, __lo_##uniq, __hi_##uniq); })
 
-#define __careful_clamp(val, lo, hi, uniq) ({  \
+#define __careful_clamp(val, lo, hi, uniq) \
__builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),  \
__clamp(val, lo, hi),   \
-   __clamp_once(val, lo, hi, uniq)); })
+   ({ _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");\
+   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   \
+   __clamp_once(val, lo, hi, uniq); }))
 
 /**
  * clamp - return a value clamped to a given range with strict typechecking
-- 
2.17.1




RE: [PATCH next 03/11] minmax: Simplify signedness check

2024-01-28 Thread David Laight
It is enough to check that both 'x' and 'y' are valid for either
a signed compare or an unsigned compare.
For unsigned they must be an unsigned type or a positive constant.
For signed they must be signed after unsigned char/short are promoted.

The predicate for _Static_assert() only needs to be a compile-time
constant, not a constant integer expression.
In particular the short-circuit evaluation of || && ?: can be used
to avoid the non-constantness of (pointer_type)1 in is_signed_type().

The '+ 0' in '(x) + 0 >= 0' is there to convert 'bool' to 'int'
and avoid a compiler warning because max() gets used for 'bool'
in one place (a very expensive 'or').
(The code is optimised away by two earlier checks - but the compiler
still bleats.)
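
The rule described above can be tried out in userspace with the sketch below (GCC/Clang only). The upper-case macros are hypothetical copies of the patch's __is_ok_unsigned(), __is_ok_signed() and __types_ok(), and the result is returned at run time rather than fed to _Static_assert(), so the sketch does not depend on how a given compiler treats non-constant sub-expressions in an assertion.

```c
#include <assert.h>

/* Userspace copy of the kernel's __is_constexpr() trick (GNU C). */
#define IS_CONSTEXPR(x) \
    (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

#define IS_SIGNED_TYPE(type)   (((type)(-1)) < (type)1)
#define IS_UNSIGNED_TYPE(type) (!IS_SIGNED_TYPE(type))

/* Unsigned type, or a non-negative constant of any integer type. */
#define IS_OK_UNSIGNED(x) \
    (IS_UNSIGNED_TYPE(__typeof__(x)) || (IS_CONSTEXPR(x) ? (x) + 0 >= 0 : 0))

/* Signed once unsigned char/short have been promoted to int. */
#define IS_OK_SIGNED(x) IS_SIGNED_TYPE(__typeof__((x) + 0))

#define TYPES_OK(x, y)                              \
    ((IS_OK_SIGNED(x) && IS_OK_SIGNED(y)) ||        \
     (IS_OK_UNSIGNED(x) && IS_OK_UNSIGNED(y)))

/* Each helper reports whether the compare would be accepted. */
static int ok_signed_vs_unsigned(int s, unsigned int u)
{
    return TYPES_OK(s, u);
}

static int ok_unsigned_vs_const(unsigned int u)
{
    return TYPES_OK(u, 5);
}

static int ok_unsigned_vs_negative_const(unsigned int u)
{
    return TYPES_OK(u, -5);
}

static int ok_signed_vs_uchar(int s, unsigned char uc)
{
    return TYPES_OK(s, uc);
}
```

A signed variable against an unsigned one is rejected, an unsigned variable against the constant 5 is accepted, against -5 it is rejected, and a signed int against an unsigned char is accepted because the char is promoted to int.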

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 900eec7a28e5..c32b4b40ce01 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -8,7 +8,7 @@
 #include 
 
 /*
- * min()/max()/clamp() macros must accomplish three things:
+ * min()/max()/clamp() macros must accomplish several things:
  *
  * - Avoid multiple evaluations of the arguments (so side-effects like
  *   "x++" happen only once) when non-constant.
@@ -26,19 +26,17 @@
 #define __typecheck(x, y) \
(!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
 
-/* is_signed_type() isn't a constexpr for pointer types */
-#define __is_signed(x) 
\
-   __builtin_choose_expr(__is_constexpr(is_signed_type(typeof(x))),
\
-   is_signed_type(typeof(x)), 0)
+/* Allow unsigned compares against non-negative signed constants. */
+#define __is_ok_unsigned(x) \
+   (is_unsigned_type(typeof(x)) || (__is_constexpr(x) ? (x) + 0 >= 0 : 0))
 
-/* True for a non-negative signed int constant */
-#define __is_noneg_int(x)  \
-   (__builtin_choose_expr(__is_constexpr(x) && __is_signed(x), x, -1) >= 0)
+/* Check for signed after promoting unsigned char/short to int */
+#define __is_ok_signed(x) is_signed_type(typeof((x) + 0))
 
-#define __types_ok(x, y)   \
-   (__is_signed(x) == __is_signed(y) ||\
-   __is_signed((x) + 0) == __is_signed((y) + 0) || \
-   __is_noneg_int(x) || __is_noneg_int(y))
+/* Allow if both x and y are valid for either signed or unsigned compares. */
+#define __types_ok(x, y)   \
+   ((__is_ok_signed(x) && __is_ok_signed(y)) ||\
+(__is_ok_unsigned(x) && __is_ok_unsigned(y)))
 
 #define __cmp_op_min <
 #define __cmp_op_max >
-- 
2.17.1




Re: [PATCH] backlight: mp3309c: Use pwm_apply_might_sleep()

2024-01-28 Thread Uwe Kleine-König
Hello Sean,

On Sun, Jan 28, 2024 at 03:49:04PM +, Sean Young wrote:
> pwm_apply_state() is deprecated since commit c748a6d77c06a ("pwm: Rename
> pwm_apply_state() to pwm_apply_might_sleep()"). This is the final user
> in the tree.
> 
> Signed-off-by: Sean Young 

The "problem" here is that the mp3309c driver didn't exist yet in commit
c748a6d77c06a, so it relies on the pwm_apply_state compatibility stub.

I would mention that in the commit log.

Otherwise the change looks fine.

thanks for catching and addressing this issue
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |




[PATCH AUTOSEL 4.19 8/8] drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit 8a44fdd3cf91debbd09b43bd2519ad2b2486ccf4 ]

In function 'amdgpu_device_need_post(struct amdgpu_device *adev)' -
'adev->pm.fw' may not be released before return.

Using the function release_firmware() to release adev->pm.fw.

Thus fixing the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1571 amdgpu_device_need_post() warn: 
'adev->pm.fw' from request_firmware() not released on lines: 1554.

Cc: Monk Liu 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Suggested-by: Lijo Lazar 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c84f475d4f13..ae28f72c73ef 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -823,6 +823,7 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev)
return true;
 
fw_ver = *((uint32_t *)adev->pm.fw->data + 69);
+   release_firmware(adev->pm.fw);
if (fw_ver < 0x00160e00)
return true;
}
-- 
2.43.0



[PATCH AUTOSEL 5.10 12/13] drm/amd/powerplay: Fix kzalloc parameter 'ATOM_Tonga_PPM_Table' in 'get_platform_power_management_table()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit 6616b5e1999146b1304abe78232af810080c67e3 ]

The kzalloc() call that allocates 'struct phm_ppm_table *ptr' passes the
wrong structure type to sizeof(), so a larger structure type was used for
the allocation. Using the correct type, 'struct phm_ppm_table', fixes the
below:

drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/process_pptables_v1_0.c:203 
get_platform_power_management_table() warn: struct type mismatch 'phm_ppm_table 
vs _ATOM_Tonga_PPM_Table'
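
The sizeof(*ptr) idiom used by the fix can be shown in userspace with calloc() standing in for kzalloc(). The two structures below are hypothetical placeholders, not the real table layouts.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the firmware-side and driver-side tables. */
struct firmware_table {
    char raw[64];
};

struct driver_table {
    int design_limit;
    int thermal_limit;
};

/*
 * sizeof(*ptr) follows the type of the pointer being assigned, so the
 * allocation size cannot drift away from the pointer's type.
 */
static struct driver_table *alloc_driver_table(void)
{
    struct driver_table *ptr = calloc(1, sizeof(*ptr));

    return ptr;
}

/* Size actually requested by alloc_driver_table(). */
static size_t driver_table_alloc_size(void)
{
    return sizeof(struct driver_table);
}
```

Writing sizeof(struct firmware_table) in the calloc() call would still compile, which is the kind of mismatch the static checker warning points at.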

Cc: Eric Huang 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
index b760f95e7fa7..5998c78ad536 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
@@ -204,7 +204,7 @@ static int get_platform_power_management_table(
struct pp_hwmgr *hwmgr,
ATOM_Tonga_PPM_Table *atom_ppm_table)
 {
-   struct phm_ppm_table *ptr = kzalloc(sizeof(ATOM_Tonga_PPM_Table), GFP_KERNEL);
+   struct phm_ppm_table *ptr = kzalloc(sizeof(*ptr), GFP_KERNEL);
struct phm_ppt_v1_information *pp_table_information =
(struct phm_ppt_v1_information *)(hwmgr->pptable);
 
-- 
2.43.0



[PATCH AUTOSEL 5.4 11/11] drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit 8a44fdd3cf91debbd09b43bd2519ad2b2486ccf4 ]

In function 'amdgpu_device_need_post(struct amdgpu_device *adev)' -
'adev->pm.fw' may not be released before return.

Using the function release_firmware() to release adev->pm.fw.

Thus fixing the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1571 amdgpu_device_need_post() warn: 
'adev->pm.fw' from request_firmware() not released on lines: 1554.

Cc: Monk Liu 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Suggested-by: Lijo Lazar 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index e5032eb9ae29..9dcb38bab0e1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -847,6 +847,7 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev)
return true;
 
fw_ver = *((uint32_t *)adev->pm.fw->data + 69);
+   release_firmware(adev->pm.fw);
if (fw_ver < 0x00160e00)
return true;
}
-- 
2.43.0
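
The pattern in this patch (copy the one value you need, then release the image before any early return) can be sketched in userspace as follows. struct blob, blob_load(), blob_release() and version_too_old() are hypothetical helpers invented for this example; they only imitate the request/release pairing and are not the kernel firmware API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a loaded firmware image. */
struct blob {
	size_t size;
	uint8_t *data;
};

/* Copy `nwords` 32-bit words into a freshly allocated blob. */
static struct blob *blob_load(const uint32_t *words, size_t nwords)
{
	struct blob *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->size = nwords * sizeof(*words);
	b->data = malloc(b->size);
	if (!b->data) {
		free(b);
		return NULL;
	}
	memcpy(b->data, words, b->size);
	return b;
}

/* Release a blob; accepts NULL like free(). */
static void blob_release(struct blob *b)
{
	if (!b)
		return;
	free(b->data);
	free(b);
}

/* True when the version word at index `idx` is below `min_ver`. */
static bool version_too_old(const uint32_t *words, size_t nwords,
			    size_t idx, uint32_t min_ver)
{
	struct blob *b;
	uint32_t ver;

	assert(words != NULL);
	b = blob_load(words, nwords);
	if (!b || idx >= nwords) {
		blob_release(b);
		return true; /* this sketch treats a failed load as "too old" */
	}
	memcpy(&ver, b->data + idx * sizeof(ver), sizeof(ver));
	blob_release(b); /* `ver` is a copy, so every later return is leak-free */
	return ver < min_ver;
}
```

Releasing right after the read means no later branch has to remember the cleanup.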



[PATCH AUTOSEL 5.15 19/19] drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit d7a254fad873775ce6c32b77796c81e81e6b7f2e ]

Range interval [start, last] is ordered by rb_tree, rb_prev, rb_next
return value still needs NULL check, thus modified from "node" to "rb_node".

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2691 
svm_range_get_range_boundaries() warn: can 'node' even be NULL?

Suggested-by: Philip Yang 
Cc: Felix Kuehling 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Reviewed-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index e2d4e2b42a7c..7f55decc5f37 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2325,6 +2325,7 @@ svm_range_get_range_boundaries(struct kfd_process *p, 
int64_t addr,
 {
struct vm_area_struct *vma;
struct interval_tree_node *node;
+   struct rb_node *rb_node;
unsigned long start_limit, end_limit;
 
vma = find_vma(p->mm, addr << PAGE_SHIFT);
@@ -2341,16 +2342,15 @@ svm_range_get_range_boundaries(struct kfd_process *p, 
int64_t addr,
if (node) {
end_limit = min(end_limit, node->start);
/* Last range that ends before the fault address */
-   node = container_of(rb_prev(&node->rb),
-   struct interval_tree_node, rb);
+   rb_node = rb_prev(&node->rb);
} else {
/* Last range must end before addr because
 * there was no range after addr
 */
-   node = container_of(rb_last(&p->svms.objects.rb_root),
-   struct interval_tree_node, rb);
+   rb_node = rb_last(&p->svms.objects.rb_root);
}
-   if (node) {
+   if (rb_node) {
+   node = container_of(rb_node, struct interval_tree_node, rb);
if (node->last >= addr) {
WARN(1, "Overlap with prev node and page fault addr\n");
return -EFAULT;
-- 
2.43.0
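
The reason the NULL test has to happen on the raw rb_node can be shown with a small userspace sketch. The container_of() macro below is a simplified re-creation, and struct link, struct range and range_prev() are hypothetical names chosen for the example. With an embedded member at a non-zero offset, converting a NULL link would not yield a NULL container, so the check must come first.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace re-creation of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical intrusive link, standing in for struct rb_node. */
struct link {
	struct link *prev;
};

/* Hypothetical record whose embedded link is not the first member. */
struct range {
	unsigned long start;
	unsigned long last;
	struct link node;
};

/* Return the range before `r`, or NULL when `r` has no predecessor. */
static struct range *range_prev(struct range *r)
{
	struct link *l;

	assert(r != NULL);
	l = r->node.prev;
	if (!l) /* test the raw link first ... */
		return NULL;
	return container_of(l, struct range, node); /* ... then convert */
}
```

Checking the converted pointer instead would only work by accident when the embedded member sits at offset zero.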



[PATCH AUTOSEL 5.10 13/13] drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit 8a44fdd3cf91debbd09b43bd2519ad2b2486ccf4 ]

In function 'amdgpu_device_need_post(struct amdgpu_device *adev)' -
'adev->pm.fw' may not be released before return.

Using the function release_firmware() to release adev->pm.fw.

Thus fixing the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1571 amdgpu_device_need_post() warn: 
'adev->pm.fw' from request_firmware() not released on lines: 1554.

Cc: Monk Liu 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Suggested-by: Lijo Lazar 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a093f1b27724..e833c02fabff 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1184,6 +1184,7 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev)
return true;
 
fw_ver = *((uint32_t *)adev->pm.fw->data + 69);
+   release_firmware(adev->pm.fw);
if (fw_ver < 0x00160e00)
return true;
}
-- 
2.43.0



[PATCH AUTOSEL 5.15 18/19] drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit 8a44fdd3cf91debbd09b43bd2519ad2b2486ccf4 ]

In function 'amdgpu_device_need_post(struct amdgpu_device *adev)' -
'adev->pm.fw' may not be released before return.

Using the function release_firmware() to release adev->pm.fw.

Thus fixing the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1571 amdgpu_device_need_post() warn: 
'adev->pm.fw' from request_firmware() not released on lines: 1554.

Cc: Monk Liu 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Suggested-by: Lijo Lazar 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 19e32f38a4c4..816dd59212c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1292,6 +1292,7 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev)
return true;
 
fw_ver = *((uint32_t *)adev->pm.fw->data + 69);
+   release_firmware(adev->pm.fw);
if (fw_ver < 0x00160e00)
return true;
}
-- 
2.43.0



[PATCH AUTOSEL 5.15 17/19] drm/amd/powerplay: Fix kzalloc parameter 'ATOM_Tonga_PPM_Table' in 'get_platform_power_management_table()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit 6616b5e1999146b1304abe78232af810080c67e3 ]

In 'struct phm_ppm_table *ptr' allocation using kzalloc, an incorrect
structure type is passed to sizeof() in kzalloc, larger structure types
were used, thus using correct type 'struct phm_ppm_table' fixes the
below:

drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/process_pptables_v1_0.c:203 
get_platform_power_management_table() warn: struct type mismatch 'phm_ppm_table 
vs _ATOM_Tonga_PPM_Table'

Cc: Eric Huang 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
index f2a55c1413f5..17882f8dfdd3 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
@@ -200,7 +200,7 @@ static int get_platform_power_management_table(
struct pp_hwmgr *hwmgr,
ATOM_Tonga_PPM_Table *atom_ppm_table)
 {
-   struct phm_ppm_table *ptr = kzalloc(sizeof(ATOM_Tonga_PPM_Table), 
GFP_KERNEL);
+   struct phm_ppm_table *ptr = kzalloc(sizeof(*ptr), GFP_KERNEL);
struct phm_ppt_v1_information *pp_table_information =
(struct phm_ppt_v1_information *)(hwmgr->pptable);
 
-- 
2.43.0



[PATCH AUTOSEL 5.15 13/19] drm/amdkfd: Fix lock dependency warning

2024-01-28 Thread Sasha Levin
From: Felix Kuehling 

[ Upstream commit 47bf0f83fc86df1bf42b385a91aadb910137c5c9 ]

======================================================
WARNING: possible circular locking dependency detected
6.5.0-kfd-fkuehlin #276 Not tainted
------------------------------------------------------
kworker/8:2/2676 is trying to acquire lock:
9435aae95c88 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}, at: 
__flush_work+0x52/0x550

but task is already holding lock:
9435cd8e1720 (&svms->lock){+.+.}-{3:3}, at: 
svm_range_deferred_list_work+0xe8/0x340 [amdgpu]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&svms->lock){+.+.}-{3:3}:
   __mutex_lock+0x97/0xd30
   kfd_ioctl_alloc_memory_of_gpu+0x6d/0x3c0 [amdgpu]
   kfd_ioctl+0x1b2/0x5d0 [amdgpu]
   __x64_sys_ioctl+0x86/0xc0
   do_syscall_64+0x39/0x80
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (&mm->mmap_lock){++++}-{3:3}:
   down_read+0x42/0x160
   svm_range_evict_svm_bo_worker+0x8b/0x340 [amdgpu]
   process_one_work+0x27a/0x540
   worker_thread+0x53/0x3e0
   kthread+0xeb/0x120
   ret_from_fork+0x31/0x50
   ret_from_fork_asm+0x11/0x20

-> #0 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}:
   __lock_acquire+0x1426/0x2200
   lock_acquire+0xc1/0x2b0
   __flush_work+0x80/0x550
   __cancel_work_timer+0x109/0x190
   svm_range_bo_release+0xdc/0x1c0 [amdgpu]
   svm_range_free+0x175/0x180 [amdgpu]
   svm_range_deferred_list_work+0x15d/0x340 [amdgpu]
   process_one_work+0x27a/0x540
   worker_thread+0x53/0x3e0
   kthread+0xeb/0x120
   ret_from_fork+0x31/0x50
   ret_from_fork_asm+0x11/0x20

other info that might help us debug this:

Chain exists of:
  (work_completion)(&svm_bo->eviction_work) --> &mm->mmap_lock --> &svms->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&svms->lock);
                               lock(&mm->mmap_lock);
                               lock(&svms->lock);
  lock((work_completion)(&svm_bo->eviction_work));

I believe this cannot really lead to a deadlock in practice, because
svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
refcount is non-0. That means it's impossible that svm_range_bo_release
is running concurrently. However, there is no good way to annotate this.

To avoid the problem, take a BO reference in
svm_range_schedule_evict_svm_bo instead of in the worker. That way it's
impossible for a BO to get freed while eviction work is pending and the
cancel_work_sync call in svm_range_bo_release can be eliminated.

v2: Use svm_bo_ref_unless_zero and explained why that's safe. Also
removed redundant checks that are already done in
amdkfd_fence_enable_signaling.

Signed-off-by: Felix Kuehling 
Reviewed-by: Philip Yang 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 2cbe8ea16f24..e2d4e2b42a7c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -347,14 +347,9 @@ static void svm_range_bo_release(struct kref *kref)
spin_lock(&svm_bo->list_lock);
}
spin_unlock(&svm_bo->list_lock);
-   if (!dma_fence_is_signaled(&svm_bo->eviction_fence->base)) {
-   /* We're not in the eviction worker.
-* Signal the fence and synchronize with any
-* pending eviction work.
-*/
+   if (!dma_fence_is_signaled(&svm_bo->eviction_fence->base))
+   /* We're not in the eviction worker. Signal the fence. */
dma_fence_signal(&svm_bo->eviction_fence->base);
-   cancel_work_sync(&svm_bo->eviction_work);
-   }
dma_fence_put(&svm_bo->eviction_fence->base);
amdgpu_bo_unref(&svm_bo->bo);
kfree(svm_bo);
@@ -2872,13 +2867,14 @@ svm_range_trigger_migration(struct mm_struct *mm, 
struct svm_range *prange,
 
 int svm_range_schedule_evict_svm_bo(struct amdgpu_amdkfd_fence *fence)
 {
-   if (!fence)
-   return -EINVAL;
-
-   if (dma_fence_is_signaled(&fence->base))
-   return 0;
-
-   if (fence->svm_bo) {
+   /* Dereferencing fence->svm_bo is safe here because the fence hasn't
+* signaled yet and we're under the protection of the fence->lock.
+* After the fence is signaled in svm_range_bo_release, we cannot get
+* here any more.
+*
+* Reference is dropped in svm_range_evict_svm_bo_worker.
+*/
+   if (svm_bo_ref_unless_zero(fence->svm_bo)) {
WRITE_ONCE(fence->svm_bo->evicting, 1);
schedule_work(&fence->svm_bo->eviction_work);
}
@@ -2893,8 +2889,6 @@ static void svm_range_evict_svm_bo_worker(struct 
work_struct *work)
struct mm_struct *mm;
 

[PATCH AUTOSEL 6.1 26/27] drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit d7a254fad873775ce6c32b77796c81e81e6b7f2e ]

Range interval [start, last] is ordered by rb_tree, rb_prev, rb_next
return value still needs NULL check, thus modified from "node" to "rb_node".

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2691 
svm_range_get_range_boundaries() warn: can 'node' even be NULL?

Suggested-by: Philip Yang 
Cc: Felix Kuehling 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Reviewed-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 5188c4d2e7c0..7fa5e70f1aac 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2553,6 +2553,7 @@ svm_range_get_range_boundaries(struct kfd_process *p, 
int64_t addr,
 {
struct vm_area_struct *vma;
struct interval_tree_node *node;
+   struct rb_node *rb_node;
unsigned long start_limit, end_limit;
 
vma = find_vma(p->mm, addr << PAGE_SHIFT);
@@ -2575,16 +2576,15 @@ svm_range_get_range_boundaries(struct kfd_process *p, 
int64_t addr,
if (node) {
end_limit = min(end_limit, node->start);
/* Last range that ends before the fault address */
-   node = container_of(rb_prev(&node->rb),
-   struct interval_tree_node, rb);
+   rb_node = rb_prev(&node->rb);
} else {
/* Last range must end before addr because
 * there was no range after addr
 */
-   node = container_of(rb_last(&p->svms.objects.rb_root),
-   struct interval_tree_node, rb);
+   rb_node = rb_last(&p->svms.objects.rb_root);
}
-   if (node) {
+   if (rb_node) {
+   node = container_of(rb_node, struct interval_tree_node, rb);
if (node->last >= addr) {
WARN(1, "Overlap with prev node and page fault addr\n");
return -EFAULT;
-- 
2.43.0



[PATCH AUTOSEL 6.1 25/27] drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit 8a44fdd3cf91debbd09b43bd2519ad2b2486ccf4 ]

In function 'amdgpu_device_need_post(struct amdgpu_device *adev)' -
'adev->pm.fw' may not be released before return.

Using the function release_firmware() to release adev->pm.fw.

Thus fixing the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1571 amdgpu_device_need_post() warn: 
'adev->pm.fw' from request_firmware() not released on lines: 1554.

Cc: Monk Liu 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Suggested-by: Lijo Lazar 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a5352e5e2bd4..4b91f95066ec 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1310,6 +1310,7 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev)
return true;
 
fw_ver = *((uint32_t *)adev->pm.fw->data + 69);
+   release_firmware(adev->pm.fw);
if (fw_ver < 0x00160e00)
return true;
}
-- 
2.43.0



[PATCH AUTOSEL 6.1 24/27] drm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit fac4ebd79fed60e79cccafdad45a2bb8d3795044 ]

The amdgpu_gmc_vram_checking() function in emulation checks whether
all of the memory range of shared system memory could be accessed by
GPU, from this aspect, -EIO is returned for error scenarios.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:919 gmc_v6_0_hw_init() warn: missing 
error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:1103 gmc_v7_0_hw_init() warn: missing 
error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:1223 gmc_v8_0_hw_init() warn: missing 
error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:2344 gmc_v9_0_hw_init() warn: missing 
error code? 'r'

Cc: Xiaojian Du 
Cc: Lijo Lazar 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Suggested-by: Christian König 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 2bc791ed8830..ea0fb079f942 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -808,19 +808,26 @@ int amdgpu_gmc_vram_checking(struct amdgpu_device *adev)
 * seconds, so here, we just pick up three parts for emulation.
 */
ret = memcmp(vram_ptr, cptr, 10);
-   if (ret)
-   return ret;
+   if (ret) {
+   ret = -EIO;
+   goto release_buffer;
+   }
 
ret = memcmp(vram_ptr + (size / 2), cptr, 10);
-   if (ret)
-   return ret;
+   if (ret) {
+   ret = -EIO;
+   goto release_buffer;
+   }
 
ret = memcmp(vram_ptr + size - 10, cptr, 10);
-   if (ret)
-   return ret;
+   if (ret) {
+   ret = -EIO;
+   goto release_buffer;
+   }
 
+release_buffer:
amdgpu_bo_free_kernel(&vram_bo, &vram_gpu,
&vram_ptr);
 
-   return 0;
+   return ret;
 }
-- 
2.43.0
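
The control flow this patch introduces (map any mismatch to -EIO and leave through a single cleanup label) looks roughly like the userspace sketch below. check_pattern(), PATTERN_LEN and the scratch buffer are hypothetical; the scratch copy only exists so that there is a resource to free, in the way the driver has a buffer object to release.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define PATTERN_LEN 10

/*
 * Compare the start, middle and end of `buf` with a fixed pattern.
 * Returns 0 on a full match and -EIO on any mismatch; the scratch
 * copy is freed on every path that allocated it.
 */
static int check_pattern(const unsigned char *buf, size_t size)
{
	static const unsigned char pattern[PATTERN_LEN] = "123456789";
	unsigned char *scratch;
	int ret = 0;

	assert(buf != NULL);
	if (size < 2 * PATTERN_LEN)
		return -EINVAL;

	scratch = malloc(size);
	if (!scratch)
		return -ENOMEM;
	memcpy(scratch, buf, size);

	if (memcmp(scratch, pattern, PATTERN_LEN)) {
		ret = -EIO;
		goto release_buffer;
	}
	if (memcmp(scratch + size / 2, pattern, PATTERN_LEN)) {
		ret = -EIO;
		goto release_buffer;
	}
	if (memcmp(scratch + size - PATTERN_LEN, pattern, PATTERN_LEN))
		ret = -EIO;

release_buffer:
	free(scratch);
	return ret;
}
```

Returning the raw memcmp() result would hand callers an arbitrary non-zero value and skip the cleanup, which is what the single exit label avoids.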



[PATCH AUTOSEL 6.1 23/27] drm/amd/powerplay: Fix kzalloc parameter 'ATOM_Tonga_PPM_Table' in 'get_platform_power_management_table()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit 6616b5e1999146b1304abe78232af810080c67e3 ]

In 'struct phm_ppm_table *ptr' allocation using kzalloc, an incorrect
structure type is passed to sizeof() in kzalloc, larger structure types
were used, thus using correct type 'struct phm_ppm_table' fixes the
below:

drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/process_pptables_v1_0.c:203 
get_platform_power_management_table() warn: struct type mismatch 'phm_ppm_table 
vs _ATOM_Tonga_PPM_Table'

Cc: Eric Huang 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
index f2a55c1413f5..17882f8dfdd3 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
@@ -200,7 +200,7 @@ static int get_platform_power_management_table(
struct pp_hwmgr *hwmgr,
ATOM_Tonga_PPM_Table *atom_ppm_table)
 {
-   struct phm_ppm_table *ptr = kzalloc(sizeof(ATOM_Tonga_PPM_Table), 
GFP_KERNEL);
+   struct phm_ppm_table *ptr = kzalloc(sizeof(*ptr), GFP_KERNEL);
struct phm_ppt_v1_information *pp_table_information =
(struct phm_ppt_v1_information *)(hwmgr->pptable);
 
-- 
2.43.0



[PATCH AUTOSEL 6.1 16/27] drm/amdkfd: Fix lock dependency warning

2024-01-28 Thread Sasha Levin
From: Felix Kuehling 

[ Upstream commit 47bf0f83fc86df1bf42b385a91aadb910137c5c9 ]

======================================================
WARNING: possible circular locking dependency detected
6.5.0-kfd-fkuehlin #276 Not tainted
------------------------------------------------------
kworker/8:2/2676 is trying to acquire lock:
9435aae95c88 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}, at: 
__flush_work+0x52/0x550

but task is already holding lock:
9435cd8e1720 (&svms->lock){+.+.}-{3:3}, at: 
svm_range_deferred_list_work+0xe8/0x340 [amdgpu]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&svms->lock){+.+.}-{3:3}:
   __mutex_lock+0x97/0xd30
   kfd_ioctl_alloc_memory_of_gpu+0x6d/0x3c0 [amdgpu]
   kfd_ioctl+0x1b2/0x5d0 [amdgpu]
   __x64_sys_ioctl+0x86/0xc0
   do_syscall_64+0x39/0x80
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (&mm->mmap_lock){++++}-{3:3}:
   down_read+0x42/0x160
   svm_range_evict_svm_bo_worker+0x8b/0x340 [amdgpu]
   process_one_work+0x27a/0x540
   worker_thread+0x53/0x3e0
   kthread+0xeb/0x120
   ret_from_fork+0x31/0x50
   ret_from_fork_asm+0x11/0x20

-> #0 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}:
   __lock_acquire+0x1426/0x2200
   lock_acquire+0xc1/0x2b0
   __flush_work+0x80/0x550
   __cancel_work_timer+0x109/0x190
   svm_range_bo_release+0xdc/0x1c0 [amdgpu]
   svm_range_free+0x175/0x180 [amdgpu]
   svm_range_deferred_list_work+0x15d/0x340 [amdgpu]
   process_one_work+0x27a/0x540
   worker_thread+0x53/0x3e0
   kthread+0xeb/0x120
   ret_from_fork+0x31/0x50
   ret_from_fork_asm+0x11/0x20

other info that might help us debug this:

Chain exists of:
  (work_completion)(&svm_bo->eviction_work) --> &mm->mmap_lock --> &svms->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&svms->lock);
                               lock(&mm->mmap_lock);
                               lock(&svms->lock);
  lock((work_completion)(&svm_bo->eviction_work));

I believe this cannot really lead to a deadlock in practice, because
svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
refcount is non-0. That means it's impossible that svm_range_bo_release
is running concurrently. However, there is no good way to annotate this.

To avoid the problem, take a BO reference in
svm_range_schedule_evict_svm_bo instead of in the worker. That way it's
impossible for a BO to get freed while eviction work is pending and the
cancel_work_sync call in svm_range_bo_release can be eliminated.

v2: Use svm_bo_ref_unless_zero and explained why that's safe. Also
removed redundant checks that are already done in
amdkfd_fence_enable_signaling.

Signed-off-by: Felix Kuehling 
Reviewed-by: Philip Yang 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 208812512d8a..4ecc4be1a910 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -380,14 +380,9 @@ static void svm_range_bo_release(struct kref *kref)
spin_lock(&svm_bo->list_lock);
}
spin_unlock(&svm_bo->list_lock);
-   if (!dma_fence_is_signaled(&svm_bo->eviction_fence->base)) {
-   /* We're not in the eviction worker.
-* Signal the fence and synchronize with any
-* pending eviction work.
-*/
+   if (!dma_fence_is_signaled(&svm_bo->eviction_fence->base))
+   /* We're not in the eviction worker. Signal the fence. */
dma_fence_signal(&svm_bo->eviction_fence->base);
-   cancel_work_sync(&svm_bo->eviction_work);
-   }
dma_fence_put(&svm_bo->eviction_fence->base);
amdgpu_bo_unref(&svm_bo->bo);
kfree(svm_bo);
@@ -3310,13 +3305,14 @@ svm_range_trigger_migration(struct mm_struct *mm, 
struct svm_range *prange,
 
 int svm_range_schedule_evict_svm_bo(struct amdgpu_amdkfd_fence *fence)
 {
-   if (!fence)
-   return -EINVAL;
-
-   if (dma_fence_is_signaled(&fence->base))
-   return 0;
-
-   if (fence->svm_bo) {
+   /* Dereferencing fence->svm_bo is safe here because the fence hasn't
+* signaled yet and we're under the protection of the fence->lock.
+* After the fence is signaled in svm_range_bo_release, we cannot get
+* here any more.
+*
+* Reference is dropped in svm_range_evict_svm_bo_worker.
+*/
+   if (svm_bo_ref_unless_zero(fence->svm_bo)) {
WRITE_ONCE(fence->svm_bo->evicting, 1);
schedule_work(&fence->svm_bo->eviction_work);
}
@@ -3331,8 +3327,6 @@ static void svm_range_evict_svm_bo_worker(struct 
work_struct *work)
int r = 0;
 
svm_bo = 

[PATCH AUTOSEL 6.1 17/27] drm/amdkfd: Fix lock dependency warning with srcu

2024-01-28 Thread Sasha Levin
From: Philip Yang 

[ Upstream commit 2a9de42e8d3c82c6990d226198602be44f43f340 ]

==
WARNING: possible circular locking dependency detected
6.5.0-kfd-yangp #2289 Not tainted
--
kworker/0:2/996 is trying to acquire lock:
(srcu){.+.+}-{0:0}, at: __synchronize_srcu+0x5/0x1a0

but task is already holding lock:
((work_completion)(>deferred_list_work)){+.+.}-{0:0}, at:
process_one_work+0x211/0x560

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 ((work_completion)(>deferred_list_work)){+.+.}-{0:0}:
__flush_work+0x88/0x4f0
svm_range_list_lock_and_flush_work+0x3d/0x110 [amdgpu]
svm_range_set_attr+0xd6/0x14c0 [amdgpu]
kfd_ioctl+0x1d1/0x630 [amdgpu]
__x64_sys_ioctl+0x88/0xc0

-> #2 (>lock#2){+.+.}-{3:3}:
__mutex_lock+0x99/0xc70
amdgpu_amdkfd_gpuvm_restore_process_bos+0x54/0x740 [amdgpu]
restore_process_helper+0x22/0x80 [amdgpu]
restore_process_worker+0x2d/0xa0 [amdgpu]
process_one_work+0x29b/0x560
worker_thread+0x3d/0x3d0

-> #1 ((work_completion)(&(>restore_work)->work)){+.+.}-{0:0}:
__flush_work+0x88/0x4f0
__cancel_work_timer+0x12c/0x1c0
kfd_process_notifier_release_internal+0x37/0x1f0 [amdgpu]
__mmu_notifier_release+0xad/0x240
exit_mmap+0x6a/0x3a0
mmput+0x6a/0x120
do_exit+0x322/0xb90
do_group_exit+0x37/0xa0
__x64_sys_exit_group+0x18/0x20
do_syscall_64+0x38/0x80

-> #0 (srcu){.+.+}-{0:0}:
__lock_acquire+0x1521/0x2510
lock_sync+0x5f/0x90
__synchronize_srcu+0x4f/0x1a0
__mmu_notifier_release+0x128/0x240
exit_mmap+0x6a/0x3a0
mmput+0x6a/0x120
svm_range_deferred_list_work+0x19f/0x350 [amdgpu]
process_one_work+0x29b/0x560
worker_thread+0x3d/0x3d0

other info that might help us debug this:
Chain exists of:
  srcu --> >lock#2 --> (work_completion)(>deferred_list_work)

Possible unsafe locking scenario:

CPU0CPU1

lock((work_completion)(>deferred_list_work));
lock(>lock#2);
lock((work_completion)(>deferred_list_work));
sync(srcu);

Signed-off-by: Philip Yang 
Reviewed-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 4ecc4be1a910..5188c4d2e7c0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2241,8 +2241,10 @@ static void svm_range_deferred_list_work(struct 
work_struct *work)
mutex_unlock(&svms->lock);
mmap_write_unlock(mm);
 
-   /* Pairs with mmget in svm_range_add_list_work */
-   mmput(mm);
+   /* Pairs with mmget in svm_range_add_list_work. If dropping the
+* last mm refcount, schedule release work to avoid circular locking
+*/
+   mmput_async(mm);
 
spin_lock(&svms->deferred_list_lock);
}
-- 
2.43.0



[PATCH AUTOSEL 6.6 30/31] drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit d7a254fad873775ce6c32b77796c81e81e6b7f2e ]

Range interval [start, last] is ordered by rb_tree, rb_prev, rb_next
return value still needs NULL check, thus modified from "node" to "rb_node".

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2691 
svm_range_get_range_boundaries() warn: can 'node' even be NULL?

Suggested-by: Philip Yang 
Cc: Felix Kuehling 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Reviewed-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index b51224a85a38..87e9ca65e58e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2657,6 +2657,7 @@ svm_range_get_range_boundaries(struct kfd_process *p, 
int64_t addr,
 {
struct vm_area_struct *vma;
struct interval_tree_node *node;
+   struct rb_node *rb_node;
unsigned long start_limit, end_limit;
 
vma = vma_lookup(p->mm, addr << PAGE_SHIFT);
@@ -2676,16 +2677,15 @@ svm_range_get_range_boundaries(struct kfd_process *p, 
int64_t addr,
if (node) {
end_limit = min(end_limit, node->start);
/* Last range that ends before the fault address */
-   node = container_of(rb_prev(&node->rb),
-   struct interval_tree_node, rb);
+   rb_node = rb_prev(&node->rb);
} else {
/* Last range must end before addr because
 * there was no range after addr
 */
-   node = container_of(rb_last(&p->svms.objects.rb_root),
-   struct interval_tree_node, rb);
+   rb_node = rb_last(&p->svms.objects.rb_root);
}
-   if (node) {
+   if (rb_node) {
+   node = container_of(rb_node, struct interval_tree_node, rb);
if (node->last >= addr) {
WARN(1, "Overlap with prev node and page fault addr\n");
return -EFAULT;
-- 
2.43.0



[PATCH AUTOSEL 6.6 29/31] drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit 8a44fdd3cf91debbd09b43bd2519ad2b2486ccf4 ]

In function 'amdgpu_device_need_post(struct amdgpu_device *adev)' -
'adev->pm.fw' may not be released before return.

Using the function release_firmware() to release adev->pm.fw.

Thus fixing the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1571 amdgpu_device_need_post() warn: 
'adev->pm.fw' from request_firmware() not released on lines: 1554.

Cc: Monk Liu 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Suggested-by: Lijo Lazar 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 56d99ffbba2e..7791367e7c02 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1218,6 +1218,7 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev)
return true;
 
fw_ver = *((uint32_t *)adev->pm.fw->data + 69);
+   release_firmware(adev->pm.fw);
if (fw_ver < 0x00160e00)
return true;
}
-- 
2.43.0



[PATCH AUTOSEL 6.6 26/31] drm/amdgpu: fix avg vs input power reporting on smu7

2024-01-28 Thread Sasha Levin
From: Alex Deucher 

[ Upstream commit 25852d4b97572ff62ffee574cb8bb4bc551af23a ]

Hawaii, Bonaire, Fiji, and Tonga support average power, the others
support current power.

Reviewed-by: Yang Wang 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
index 11372fcc59c8..a2c7b2e111fa 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
@@ -3995,6 +3995,7 @@ static int smu7_read_sensor(struct pp_hwmgr *hwmgr, int 
idx,
uint32_t sclk, mclk, activity_percent;
uint32_t offset, val_vid;
struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
+   struct amdgpu_device *adev = hwmgr->adev;
 
/* size must be at least 4 bytes for all sensors */
if (*size < 4)
@@ -4038,7 +4039,21 @@ static int smu7_read_sensor(struct pp_hwmgr *hwmgr, int 
idx,
*size = 4;
return 0;
case AMDGPU_PP_SENSOR_GPU_INPUT_POWER:
-   return smu7_get_gpu_power(hwmgr, (uint32_t *)value);
+   if ((adev->asic_type != CHIP_HAWAII) &&
+   (adev->asic_type != CHIP_BONAIRE) &&
+   (adev->asic_type != CHIP_FIJI) &&
+   (adev->asic_type != CHIP_TONGA))
+   return smu7_get_gpu_power(hwmgr, (uint32_t *)value);
+   else
+   return -EOPNOTSUPP;
+   case AMDGPU_PP_SENSOR_GPU_AVG_POWER:
+   if ((adev->asic_type != CHIP_HAWAII) &&
+   (adev->asic_type != CHIP_BONAIRE) &&
+   (adev->asic_type != CHIP_FIJI) &&
+   (adev->asic_type != CHIP_TONGA))
+   return -EOPNOTSUPP;
+   else
+   return smu7_get_gpu_power(hwmgr, (uint32_t *)value);
case AMDGPU_PP_SENSOR_VDDGFX:
if ((data->vr_config & VRCONF_VDDGFX_MASK) ==
(VR_SVI2_PLANE_2 << VRCONF_VDDGFX_SHIFT))
-- 
2.43.0



[PATCH AUTOSEL 6.6 28/31] drm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit fac4ebd79fed60e79cccafdad45a2bb8d3795044 ]

The amdgpu_gmc_vram_checking() function in emulation checks whether
all of the memory range of shared system memory could be accessed by
GPU, from this aspect, -EIO is returned for error scenarios.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:919 gmc_v6_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:1103 gmc_v7_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:1223 gmc_v8_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:2344 gmc_v9_0_hw_init() warn: missing error code? 'r'
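
For readers unfamiliar with the pattern, the error path described above can be sketched in userspace roughly as follows. This is not the kernel code: the helper name check_three_windows(), the PROBE_LEN constant and the use of malloc()/free() in place of the kernel buffer object are assumptions made for the example. Only the three memcmp() probes, the -EIO result and the single release label mirror the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define PROBE_LEN 10	/* hypothetical name for the 10-byte probe size */

/*
 * Hypothetical userspace stand-in: compare the start, middle and end of
 * buf against pattern. Returns 0 on match and -EIO on any mismatch, and
 * always releases the scratch copy through one exit label.
 */
static int check_three_windows(const unsigned char *buf, size_t size,
			       const unsigned char *pattern)
{
	unsigned char *scratch;
	int ret = 0;

	assert(buf && pattern);

	/* The middle probe needs room for PROBE_LEN bytes after size / 2. */
	if (size < 2 * PROBE_LEN)
		return -EINVAL;

	scratch = malloc(PROBE_LEN);
	if (!scratch)
		return -ENOMEM;
	memcpy(scratch, pattern, PROBE_LEN);

	if (memcmp(buf, scratch, PROBE_LEN)) {
		ret = -EIO;
		goto release_buffer;
	}
	if (memcmp(buf + size / 2, scratch, PROBE_LEN)) {
		ret = -EIO;
		goto release_buffer;
	}
	if (memcmp(buf + size - PROBE_LEN, scratch, PROBE_LEN))
		ret = -EIO;

release_buffer:
	free(scratch);
	return ret;
}
```

Returning the raw memcmp() result would leak an arbitrary nonzero value to callers that expect an errno, and returning early would skip the release step; the single label addresses both.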

Cc: Xiaojian Du 
Cc: Lijo Lazar 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Suggested-by: Christian König 
Reviewed-by: Christian König 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index d78bd9732543..bc0eda1a729c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -876,21 +876,28 @@ int amdgpu_gmc_vram_checking(struct amdgpu_device *adev)
 * seconds, so here, we just pick up three parts for emulation.
 */
ret = memcmp(vram_ptr, cptr, 10);
-   if (ret)
-   return ret;
+   if (ret) {
+   ret = -EIO;
+   goto release_buffer;
+   }
 
ret = memcmp(vram_ptr + (size / 2), cptr, 10);
-   if (ret)
-   return ret;
+   if (ret) {
+   ret = -EIO;
+   goto release_buffer;
+   }
 
ret = memcmp(vram_ptr + size - 10, cptr, 10);
-   if (ret)
-   return ret;
+   if (ret) {
+   ret = -EIO;
+   goto release_buffer;
+   }
 
+release_buffer:
amdgpu_bo_free_kernel(&vram_bo, &vram_gpu,
&vram_ptr);
 
-   return 0;
+   return ret;
 }
 
 static ssize_t current_memory_partition_show(
-- 
2.43.0



[PATCH AUTOSEL 6.6 27/31] drm/amd/powerplay: Fix kzalloc parameter 'ATOM_Tonga_PPM_Table' in 'get_platform_power_management_table()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit 6616b5e1999146b1304abe78232af810080c67e3 ]

The kzalloc() call that allocates 'struct phm_ppm_table *ptr' passes
the wrong, larger structure type to sizeof(). Using sizeof(*ptr), i.e.
the correct type 'struct phm_ppm_table', fixes the below:

drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/process_pptables_v1_0.c:203 get_platform_power_management_table() warn: struct type mismatch 'phm_ppm_table vs _ATOM_Tonga_PPM_Table'
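
The sizeof(*ptr) idiom the fix switches to can be illustrated with a small userspace sketch. The two struct layouts and the helper small_table_alloc() are hypothetical, and calloc() stands in for kzalloc(); the point is only that the size expression follows the pointer's own type.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layouts standing in for the two table types. */
struct small_table {
	int entries;
};

struct big_table {
	int entries;
	char firmware_only[60];
};

static_assert(sizeof(struct big_table) > sizeof(struct small_table),
	      "the example relies on the two types differing in size");

/*
 * Allocate a zeroed small_table. sizeof(*t) is tied to the pointer's
 * own type, so it cannot silently pick up sizeof(struct big_table).
 */
static struct small_table *small_table_alloc(void)
{
	struct small_table *t = calloc(1, sizeof(*t));

	return t;
}
```

If the type of the pointer changes later, the allocation size changes with it, which is why the type-mismatch class of warning does not arise with this form.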

Cc: Eric Huang 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Acked-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
index f2a55c1413f5..17882f8dfdd3 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
@@ -200,7 +200,7 @@ static int get_platform_power_management_table(
struct pp_hwmgr *hwmgr,
ATOM_Tonga_PPM_Table *atom_ppm_table)
 {
-   struct phm_ppm_table *ptr = kzalloc(sizeof(ATOM_Tonga_PPM_Table), GFP_KERNEL);
+   struct phm_ppm_table *ptr = kzalloc(sizeof(*ptr), GFP_KERNEL);
struct phm_ppt_v1_information *pp_table_information =
(struct phm_ppt_v1_information *)(hwmgr->pptable);
 
-- 
2.43.0



[PATCH AUTOSEL 6.6 20/31] drm/amdkfd: Fix lock dependency warning with srcu

2024-01-28 Thread Sasha Levin
From: Philip Yang 

[ Upstream commit 2a9de42e8d3c82c6990d226198602be44f43f340 ]

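======================================================
WARNING: possible circular locking dependency detected
6.5.0-kfd-yangp #2289 Not tainted
------------------------------------------------------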
kworker/0:2/996 is trying to acquire lock:
(srcu){.+.+}-{0:0}, at: __synchronize_srcu+0x5/0x1a0

but task is already holding lock:
((work_completion)(>deferred_list_work)){+.+.}-{0:0}, at:
process_one_work+0x211/0x560

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 ((work_completion)(>deferred_list_work)){+.+.}-{0:0}:
__flush_work+0x88/0x4f0
svm_range_list_lock_and_flush_work+0x3d/0x110 [amdgpu]
svm_range_set_attr+0xd6/0x14c0 [amdgpu]
kfd_ioctl+0x1d1/0x630 [amdgpu]
__x64_sys_ioctl+0x88/0xc0

-> #2 (>lock#2){+.+.}-{3:3}:
__mutex_lock+0x99/0xc70
amdgpu_amdkfd_gpuvm_restore_process_bos+0x54/0x740 [amdgpu]
restore_process_helper+0x22/0x80 [amdgpu]
restore_process_worker+0x2d/0xa0 [amdgpu]
process_one_work+0x29b/0x560
worker_thread+0x3d/0x3d0

-> #1 ((work_completion)(&(>restore_work)->work)){+.+.}-{0:0}:
__flush_work+0x88/0x4f0
__cancel_work_timer+0x12c/0x1c0
kfd_process_notifier_release_internal+0x37/0x1f0 [amdgpu]
__mmu_notifier_release+0xad/0x240
exit_mmap+0x6a/0x3a0
mmput+0x6a/0x120
do_exit+0x322/0xb90
do_group_exit+0x37/0xa0
__x64_sys_exit_group+0x18/0x20
do_syscall_64+0x38/0x80

-> #0 (srcu){.+.+}-{0:0}:
__lock_acquire+0x1521/0x2510
lock_sync+0x5f/0x90
__synchronize_srcu+0x4f/0x1a0
__mmu_notifier_release+0x128/0x240
exit_mmap+0x6a/0x3a0
mmput+0x6a/0x120
svm_range_deferred_list_work+0x19f/0x350 [amdgpu]
process_one_work+0x29b/0x560
worker_thread+0x3d/0x3d0

other info that might help us debug this:
Chain exists of:
  srcu --> >lock#2 --> (work_completion)(>deferred_list_work)

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((work_completion)(>deferred_list_work));
                               lock(>lock#2);
                               lock((work_completion)(>deferred_list_work));
  sync(srcu);

Signed-off-by: Philip Yang 
Reviewed-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index a4c911fa1675..b51224a85a38 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2343,8 +2343,10 @@ static void svm_range_deferred_list_work(struct work_struct *work)
mutex_unlock(>lock);
mmap_write_unlock(mm);
 
-   /* Pairs with mmget in svm_range_add_list_work */
-   mmput(mm);
+   /* Pairs with mmget in svm_range_add_list_work. If dropping the
+* last mm refcount, schedule release work to avoid circular locking
+*/
+   mmput_async(mm);
 
spin_lock(>deferred_list_lock);
}
-- 
2.43.0



[PATCH AUTOSEL 6.6 19/31] drm/amdkfd: Fix lock dependency warning

2024-01-28 Thread Sasha Levin
From: Felix Kuehling 

[ Upstream commit 47bf0f83fc86df1bf42b385a91aadb910137c5c9 ]

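======================================================
WARNING: possible circular locking dependency detected
6.5.0-kfd-fkuehlin #276 Not tainted
------------------------------------------------------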
kworker/8:2/2676 is trying to acquire lock:
9435aae95c88 ((work_completion)(_bo->eviction_work)){+.+.}-{0:0}, at: __flush_work+0x52/0x550

but task is already holding lock:
9435cd8e1720 (>lock){+.+.}-{3:3}, at: svm_range_deferred_list_work+0xe8/0x340 [amdgpu]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (>lock){+.+.}-{3:3}:
   __mutex_lock+0x97/0xd30
   kfd_ioctl_alloc_memory_of_gpu+0x6d/0x3c0 [amdgpu]
   kfd_ioctl+0x1b2/0x5d0 [amdgpu]
   __x64_sys_ioctl+0x86/0xc0
   do_syscall_64+0x39/0x80
   entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (>mmap_lock){}-{3:3}:
   down_read+0x42/0x160
   svm_range_evict_svm_bo_worker+0x8b/0x340 [amdgpu]
   process_one_work+0x27a/0x540
   worker_thread+0x53/0x3e0
   kthread+0xeb/0x120
   ret_from_fork+0x31/0x50
   ret_from_fork_asm+0x11/0x20

-> #0 ((work_completion)(_bo->eviction_work)){+.+.}-{0:0}:
   __lock_acquire+0x1426/0x2200
   lock_acquire+0xc1/0x2b0
   __flush_work+0x80/0x550
   __cancel_work_timer+0x109/0x190
   svm_range_bo_release+0xdc/0x1c0 [amdgpu]
   svm_range_free+0x175/0x180 [amdgpu]
   svm_range_deferred_list_work+0x15d/0x340 [amdgpu]
   process_one_work+0x27a/0x540
   worker_thread+0x53/0x3e0
   kthread+0xeb/0x120
   ret_from_fork+0x31/0x50
   ret_from_fork_asm+0x11/0x20

other info that might help us debug this:

Chain exists of:
  (work_completion)(_bo->eviction_work) --> >mmap_lock --> >lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(>lock);
                               lock(>mmap_lock);
                               lock(>lock);
  lock((work_completion)(_bo->eviction_work));

I believe this cannot really lead to a deadlock in practice, because
svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
refcount is non-0. That means it's impossible that svm_range_bo_release
is running concurrently. However, there is no good way to annotate this.

To avoid the problem, take a BO reference in
svm_range_schedule_evict_svm_bo instead of in the worker. That way it's
impossible for a BO to get freed while eviction work is pending and the
cancel_work_sync call in svm_range_bo_release can be eliminated.

v2: Use svm_bo_ref_unless_zero and explain why that's safe. Also
remove redundant checks that are already done in
amdkfd_fence_enable_signaling.
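
The "take a reference unless the count is already zero" step that the description relies on can be sketched in userspace with C11 atomics. This is an illustration only: struct obj, obj_ref_unless_zero() and obj_unref() are hypothetical names, and the kernel helper is built on kref rather than on the loop shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object; the kernel code uses a kref instead. */
struct obj {
	atomic_int refcount;
};

/* Take a reference only if the object is still alive (count != 0). */
static bool obj_ref_unless_zero(struct obj *o)
{
	int old;

	if (!o)
		return false;

	old = atomic_load(&o->refcount);
	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool obj_unref(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcount, 1);

	assert(old > 0);
	return old == 1;
}
```

A scheduler that only queues work when obj_ref_unless_zero() succeeds, and a worker that calls obj_unref() when it finishes, never sees the object freed while work is pending, so the release path has nothing left to cancel or wait for.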

Signed-off-by: Felix Kuehling 
Reviewed-by: Philip Yang 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 26 ++
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 8e368e4659fd..a4c911fa1675 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -391,14 +391,9 @@ static void svm_range_bo_release(struct kref *kref)
spin_lock(&svm_bo->list_lock);
}
spin_unlock(&svm_bo->list_lock);
-   if (!dma_fence_is_signaled(&svm_bo->eviction_fence->base)) {
-   /* We're not in the eviction worker.
-* Signal the fence and synchronize with any
-* pending eviction work.
-*/
+   if (!dma_fence_is_signaled(&svm_bo->eviction_fence->base))
+   /* We're not in the eviction worker. Signal the fence. */
dma_fence_signal(&svm_bo->eviction_fence->base);
-   cancel_work_sync(&svm_bo->eviction_work);
-   }
dma_fence_put(&svm_bo->eviction_fence->base);
amdgpu_bo_unref(&svm_bo->bo);
kfree(svm_bo);
@@ -3424,13 +3419,14 @@ svm_range_trigger_migration(struct mm_struct *mm, struct svm_range *prange,
 
 int svm_range_schedule_evict_svm_bo(struct amdgpu_amdkfd_fence *fence)
 {
-   if (!fence)
-   return -EINVAL;
-
-   if (dma_fence_is_signaled(&fence->base))
-   return 0;
-
-   if (fence->svm_bo) {
+   /* Dereferencing fence->svm_bo is safe here because the fence hasn't
+* signaled yet and we're under the protection of the fence->lock.
+* After the fence is signaled in svm_range_bo_release, we cannot get
+* here any more.
+*
+* Reference is dropped in svm_range_evict_svm_bo_worker.
+*/
+   if (svm_bo_ref_unless_zero(fence->svm_bo)) {
WRITE_ONCE(fence->svm_bo->evicting, 1);
schedule_work(&fence->svm_bo->eviction_work);
}
@@ -3445,8 +3441,6 @@ static void svm_range_evict_svm_bo_worker(struct work_struct *work)
int r = 0;
 
svm_bo = 

[PATCH AUTOSEL 6.7 37/39] drm/amdgpu: Release 'adev->pm.fw' before return in 'amdgpu_device_need_post()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit 8a44fdd3cf91debbd09b43bd2519ad2b2486ccf4 ]

In amdgpu_device_need_post(), 'adev->pm.fw' may not be released before
the function returns.

Call release_firmware() to release adev->pm.fw once the firmware
version has been read from it.

Thus fixing the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1571 amdgpu_device_need_post() warn: 'adev->pm.fw' from request_firmware() not released on lines: 1554.

Cc: Monk Liu 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Suggested-by: Lijo Lazar 
Reviewed-by: Alex Deucher 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 93cf73d6fa11..16601d039dfa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1485,6 +1485,7 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev)
return true;
 
fw_ver = *((uint32_t *)adev->pm.fw->data + 69);
+   release_firmware(adev->pm.fw);
if (fw_ver < 0x00160e00)
return true;
}
-- 
2.43.0



[PATCH AUTOSEL 6.7 38/39] drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'

2024-01-28 Thread Sasha Levin
From: Srinivasan Shanmugam 

[ Upstream commit d7a254fad873775ce6c32b77796c81e81e6b7f2e ]

The range interval [start, last] is ordered by an rb_tree. The return
values of rb_prev() and rb_last() still need a NULL check, so the check
is moved from "node" to "rb_node", before container_of() is applied.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2691 svm_range_get_range_boundaries() warn: can 'node' even be NULL?
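
Why the NULL check belongs on the raw link pointer rather than on the container_of() result can be shown with a small userspace sketch. The types struct link and struct item, the helper item_prev() and the simplified container_of() macro are hypothetical and do not reproduce the kernel's rb_node or interval_tree_node layout. In this layout the embedded member is deliberately not the first field, so container_of() applied to NULL would yield a bogus non-NULL pointer and a later "if (node)" test would pass.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of; the kernel macro adds type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical types, not the kernel's rb_node/interval_tree_node. */
struct link {
	struct link *prev;
};

struct item {
	unsigned long start;
	unsigned long last;
	struct link link;
};

static_assert(offsetof(struct item, link) != 0,
	      "link is deliberately not the first member");

/* Return the item before cur, or NULL when cur is the first one. */
static struct item *item_prev(struct item *cur)
{
	struct link *l = cur->link.prev;

	/* Check the raw link first; only then convert to the container. */
	if (!l)
		return NULL;
	return container_of(l, struct item, link);
}
```

Checking the link before the conversion keeps the test meaningful regardless of where the member sits in the containing structure.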

Suggested-by: Philip Yang 
Cc: Felix Kuehling 
Cc: Christian König 
Cc: Alex Deucher 
Signed-off-by: Srinivasan Shanmugam 
Reviewed-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index f66f88d2b643..9af1d094385a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2680,6 +2680,7 @@ svm_range_get_range_boundaries(struct kfd_process *p, int64_t addr,
 {
struct vm_area_struct *vma;
struct interval_tree_node *node;
+   struct rb_node *rb_node;
unsigned long start_limit, end_limit;
 
vma = vma_lookup(p->mm, addr << PAGE_SHIFT);
@@ -2699,16 +2700,15 @@ svm_range_get_range_boundaries(struct kfd_process *p, int64_t addr,
if (node) {
end_limit = min(end_limit, node->start);
/* Last range that ends before the fault address */
-   node = container_of(rb_prev(&node->rb),
-   struct interval_tree_node, rb);
+   rb_node = rb_prev(&node->rb);
} else {
/* Last range must end before addr because
 * there was no range after addr
 */
-   node = container_of(rb_last(&p->svms.objects.rb_root),
-   struct interval_tree_node, rb);
+   rb_node = rb_last(&p->svms.objects.rb_root);
}
-   if (node) {
+   if (rb_node) {
+   node = container_of(rb_node, struct interval_tree_node, rb);
if (node->last >= addr) {
WARN(1, "Overlap with prev node and page fault addr\n");
return -EFAULT;
-- 
2.43.0


