Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Luben Tuikov
On 2023-10-27 12:41, Boris Brezillon wrote:
> On Fri, 27 Oct 2023 10:32:52 -0400
> Luben Tuikov  wrote:
> 
>> On 2023-10-27 04:25, Boris Brezillon wrote:
>>> Hi Danilo,
>>>
>>> On Thu, 26 Oct 2023 18:13:00 +0200
>>> Danilo Krummrich  wrote:
>>>   
 Currently, job flow control is implemented simply by limiting the number
 of jobs in flight. Therefore, a scheduler is initialized with a credit
 limit that corresponds to the number of jobs which can be sent to the
 hardware.

 This implies that for each job, drivers need to account for the maximum
 job size possible in order to not overflow the ring buffer.

 However, there are drivers, such as Nouveau, where the job size has a
 rather large range. For such drivers it can easily happen that job
 submissions not even filling the ring by 1% can block subsequent
 submissions, which, in the worst case, can lead to the ring run dry.

 In order to overcome this issue, allow for tracking the actual job size
 instead of the number of jobs. Therefore, add a field to track a job's
 credit count, which represents the number of credits a job contributes
 to the scheduler's credit limit.

 Signed-off-by: Danilo Krummrich 
 ---
 Changes in V2:
 ==
   - fixed up influence on scheduling fairness due to consideration of a 
 job's
 size
 - If we reach a ready entity in drm_sched_select_entity() but can't 
 actually
   queue a job from it due to size limitations, just give up and go to 
 sleep
   until woken up due to a pending job finishing, rather than continue 
 to try
   other entities.
   - added a callback to dynamically update a job's credits (Boris)  
>>>
>>> This callback seems controversial. I'd suggest dropping it, so the
>>> patch can be merged.  
>>
>> Sorry, why is it controversial? (I did read back-and-forth above, but it 
>> wasn't clear
>> why it is /controversial/.)
> 
> That's a question for Christian, I guess. I didn't quite get what he
> was worried about, other than this hook introducing a new way for
> drivers to screw things up by returning funky/invalid credits (which we

It's up to the driver--they can test, test, test, and fix their code and so on.
We can only do so much and shouldn't be baby-sitting drivers ad nauseam. Drivers
can also not define this callback. :-)

> can report with WARN_ON()s). But let's be honest, there's probably a
> hundred different ways (if not more) drivers can shoot themselves in the
> foot with drm_sched already...

Yes, that's true. So there's no worries with this hook.

> 
>>
>> I believe only drivers are privy to changes in the credit availability as 
>> their
>> firmware and hardware executes new jobs and finishes others, and so this 
>> "update"
>> here is essential--leaving it only to prepare_job() wouldn't quite fulfill 
>> the vision
>> of why the credit mechanism introduced by this patch in the first place.
> 
> I kinda agree with you, even if I wouldn't so pessimistic as to how
> useful this patch would be without the ->update_job_credits() hook
> (it already makes the situation a lot better for Nouveau and probably
> other drivers too).

Sure, and that's a good thing.

The heart of the dynamic credit scheme this patch is introducing *is* 
update_job_credits()
callback. Without it, it's just about like the current job flow-control scheme 
we have with
varied job weights (credits). Remember, it is an optional callback and driver 
can choose NOT
to define it--simple. :-)

So, I'm very excited about this, and see a wide range of applications and 
tricks drivers
can do with the credit scheme (albeit had it been an "int" bwha-ha-ha-ha ]:-> ).

Have a good weekend everyone!
-- 
Regards,
Luben


OpenPGP_0x4C15479431A334AF.asc
Description: OpenPGP public key


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Luben Tuikov
On 2023-10-27 12:31, Boris Brezillon wrote:
> On Fri, 27 Oct 2023 16:23:24 +0200
> Danilo Krummrich  wrote:
> 
>> On 10/27/23 10:25, Boris Brezillon wrote:
>>> Hi Danilo,
>>>
>>> On Thu, 26 Oct 2023 18:13:00 +0200
>>> Danilo Krummrich  wrote:
>>>   
 Currently, job flow control is implemented simply by limiting the number
 of jobs in flight. Therefore, a scheduler is initialized with a credit
 limit that corresponds to the number of jobs which can be sent to the
 hardware.

 This implies that for each job, drivers need to account for the maximum
 job size possible in order to not overflow the ring buffer.

 However, there are drivers, such as Nouveau, where the job size has a
 rather large range. For such drivers it can easily happen that job
 submissions not even filling the ring by 1% can block subsequent
 submissions, which, in the worst case, can lead to the ring run dry.

 In order to overcome this issue, allow for tracking the actual job size
 instead of the number of jobs. Therefore, add a field to track a job's
 credit count, which represents the number of credits a job contributes
 to the scheduler's credit limit.

 Signed-off-by: Danilo Krummrich 
 ---
 Changes in V2:
 ==
- fixed up influence on scheduling fairness due to consideration of a 
 job's
  size
  - If we reach a ready entity in drm_sched_select_entity() but can't 
 actually
queue a job from it due to size limitations, just give up and go to 
 sleep
until woken up due to a pending job finishing, rather than continue 
 to try
other entities.
- added a callback to dynamically update a job's credits (Boris)  
>>>
>>> This callback seems controversial. I'd suggest dropping it, so the
>>> patch can be merged.  
>>
>> I don't think we should drop it just for the sake of moving forward. If 
>> there are objections
>> we'll discuss them. I've seen good reasons why the drivers you are working 
>> on require this,
>> while, following the discussion, I have *not* seen any reasons to drop it. 
>> It helps simplifying
>> driver code and doesn't introduce any complexity or overhead to existing 
>> drivers.
> 
> Up to you. I'm just saying, moving one step in the right direction is
> better than being stuck, and it's not like adding this callback in a
> follow-up patch is super complicated either. If you're confident that
> we can get all parties to agree on keeping this hook, fine by me.

I'd rather have it in now, as it is really *the vision* of this patch. There's 
no point
in pushing in something half-baked.
-- 
Regards,
Luben

OpenPGP_0x4C15479431A334AF.asc
Description: OpenPGP public key


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Luben Tuikov
Hi,

On 2023-10-27 12:26, Boris Brezillon wrote:
> On Fri, 27 Oct 2023 16:34:26 +0200
> Danilo Krummrich  wrote:
> 
>> On 10/27/23 09:17, Boris Brezillon wrote:
>>> Hi Danilo,
>>>
>>> On Thu, 26 Oct 2023 18:13:00 +0200
>>> Danilo Krummrich  wrote:
>>>   
 +
 +  /**
 +   * @update_job_credits: Called once the scheduler is considering this
 +   * job for execution.
 +   *
 +   * Drivers may use this to update the job's submission credits, which is
 +   * useful to e.g. deduct the number of native fences which have been
 +   * signaled meanwhile.
 +   *
 +   * The callback must either return the new number of submission credits
 +   * for the given job, or zero if no update is required.
 +   *
 +   * This callback is optional.
 +   */
 +  u32 (*update_job_credits)(struct drm_sched_job *sched_job);  
>>>
>>> I'm copying my late reply to v2 here so it doesn't get lost:
>>>
>>> I keep thinking it'd be simpler to make this a void function that
>>> updates s_job->submission_credits directly. I also don't see the
>>> problem with doing a sanity check on job->submission_credits. I mean,
>>> if the driver is doing something silly, you can't do much to prevent it
>>> anyway, except warn the user that something wrong has happened. If you
>>> want to
>>>
>>> WARN_ON(job->submission_credits == 0 ||
>>> job->submission_credits > job_old_submission_credits);
>>>
>>> that's fine. But none of this sanity checking has to do with the
>>> function prototype/semantics, and I'm still not comfortable with this 0  
>>> => no-change. If there's no change, we should just leave  
>>> job->submission_credits unchanged (or return job->submission_credits)
>>> instead of inventing a new special case.  
>>
>> If we can avoid letting drivers change fields of generic structures directly
>> without any drawbacks I think we should avoid it. Currently, drivers 
>> shouldn't
>> have the need to mess with job->credits directly. The initial value is set
>> through drm_sched_job_init() and is updated through the return value of
>> update_job_credits().
> 
> Fair enough. I do agree that keeping internal fields out of driver
> hands is a good thing in general, it's just that it's already
> free-for-all in so many places in drm_sched (like the fact drivers

"Free-for-all" doesn't mean we need to follow suit. We should keep
good programming practices, as this patch strives to.

> iterate the pending list in their stop-queue handling) that I didn't
> really see it as an issue. Note that's there's always the option of
> providing drm_sched_job_{update,get}_credits() helpers, with the update
> helper making sure the new credits value is consistent (smaller or
> equal to the old one, and not zero).
> 
>>
>> I'm fine getting rid of the 0 => no-change semantics though. Instead we can 
>> just
>> WARN() on 0.
> 
> Yeah, I think that's preferable. It's pretty easy to return the old
> value if the driver has a way to detect when nothing changed (with a
> get helper if you don't want drivers to touch the credits field).
> 
>> However, if we do that I'd also want to change it for
>> drm_sched_job_init() (where 0 currently defaults to 1) such that we accept 
>> 0, but
>> WARN() accordingly.
> 
> Sure. You update all drivers anyway, so passing 1 instead of 0 is not a
> big deal, I would say.

At this point in time, we should consider 1 as normal, 0 out of spec and
WARN on it but carry on and (perhaps) reset it to 1. Drivers in the future, may
see a need (i.e. do tricks) to return 0, at which point they'll submit a patch 
which
does two things, 1) removes the WARN, 2) removes the reset from 0 to 1, and
explain why they need to return 0 to allow (one more) job, but we're nowhere 
near then yet,
so status quo for now.

I don't see how it makes sense to call drm_sched_job_init(credits:0), and I 
believe
the code is correct to default to 1 in that case--which defaults to the current
flow control we have, which we want.

> 
>>
>> I think it's consequent to either consistently give 0 a different meaning or 
>> just
>> accept it but WARN() on it.
> 
> Using default as a default value makes sense when you're passing

I suppose you meant "using zero as a default value".

> zero-initialized objects that are later extended with new fields, but
> here you update the function prototype and all the call sites, so we're
> better off considering 0 as an invalid value, IMHO.

Yes, absolutely.

You never want to give 0 a meaning, since as you pointed out, it is zero-ed
memory, and as such, can have any meaning you'd like. So yes: WARN on 0;
1 is good and normal.

Regards,
Luben


OpenPGP_0x4C15479431A334AF.asc
Description: OpenPGP public key


OpenPGP_signature.asc
Description: OpenPGP digital signature


[PATCH] drm/panel-edp: Add timings for BOE NV133WUM-N63

2023-10-27 Thread Clayton Craft
This panel is found on laptops e.g., variants of the Thinkpad X13s.
Configuration was collected from the panel's EDID.

Signed-off-by: Clayton Craft 
---
 drivers/gpu/drm/panel/panel-edp.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/panel/panel-edp.c 
b/drivers/gpu/drm/panel/panel-edp.c
index 95c8472d878a..5db283f014f3 100644
--- a/drivers/gpu/drm/panel/panel-edp.c
+++ b/drivers/gpu/drm/panel/panel-edp.c
@@ -1204,6 +1204,29 @@ static const struct panel_desc boe_nv133fhm_n61 = {
},
 };
 
+static const struct drm_display_mode boe_nv133wum_n63_modes = {
+   .clock = 157760,
+   .hdisplay = 1920,
+   .hsync_start = 1920 + 48,
+   .hsync_end = 1920 + 48 + 32,
+   .htotal = 1920 + 48 + 32 + 80,
+   .vdisplay = 1200,
+   .vsync_start = 1200 + 3,
+   .vsync_end = 1200 + 3 + 6,
+   .vtotal = 1200 + 3 + 6 + 31,
+   .flags = DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_NVSYNC,
+};
+
+static const struct panel_desc boe_nv133wum_n63 = {
+   .modes = _nv133wum_n63_modes,
+   .num_modes = 1,
+   .bpc = 8,
+   .size = {
+   .width = 286,
+   .height = 179,
+   },
+};
+
 static const struct drm_display_mode boe_nv140fhmn49_modes[] = {
{
.clock = 148500,
@@ -1723,6 +1746,9 @@ static const struct of_device_id platform_of_match[] = {
}, {
.compatible = "boe,nv133fhm-n62",
.data = _nv133fhm_n61,
+   }, {
+   .compatible = "boe,nv133wum-n63",
+   .data = _nv133wum_n63,
}, {
.compatible = "boe,nv140fhmn49",
.data = _nv140fhmn49,
@@ -1852,6 +1878,7 @@ static const struct edp_panel_entry edp_panels[] = {
EDP_PANEL_ENTRY('B', 'O', 'E', 0x095f, _200_500_e50, 
"NE135FBM-N41 v8.1"),
EDP_PANEL_ENTRY('B', 'O', 'E', 0x098d, _nv110wtm_n61.delay, 
"NV110WTM-N61"),
EDP_PANEL_ENTRY('B', 'O', 'E', 0x09dd, _200_500_e50, 
"NT116WHM-N21"),
+   EDP_PANEL_ENTRY('B', 'O', 'E', 0x0a1b, _200_500_e50, 
"NV133WUM-N63"),
EDP_PANEL_ENTRY('B', 'O', 'E', 0x0a5d, _200_500_e50, 
"NV116WHM-N45"),
EDP_PANEL_ENTRY('B', 'O', 'E', 0x0ac5, _200_500_e50, 
"NV116WHM-N4C"),
 
-- 
2.40.1



Re: [PATCH] drm/msm/gem: Add metadata

2023-10-27 Thread Dmitry Baryshkov
On Fri, 27 Oct 2023 at 22:45, Rob Clark  wrote:
>
> From: Rob Clark 
>
> The EXT_external_objects extension is a bit awkward as it doesn't pass
> explicit modifiers, leaving the importer to guess with incomplete
> information.  In the case of vk (turnip) exporting and gl (freedreno)
> importing, the "OPTIMAL_TILING_EXT" layout depends on VkImageCreateInfo
> flags (among other things), which the importer does not know.  Which
> unfortunately leaves us with the need for a metadata back-channel.
>
> The contents of the metadata are defined by userspace.  The
> EXT_external_objects extension is only required to work between
> compatible versions of gl and vk drivers, as defined by device and
> driver UUIDs.
>
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/msm_drv.c | 59 +--
>  drivers/gpu/drm/msm/msm_gem.h |  4 +++
>  include/uapi/drm/msm_drm.h|  2 ++
>  3 files changed, 63 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index b61ccea05327..8fe2677ea37a 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -37,9 +37,10 @@
>   * - 1.9.0 - Add MSM_SUBMIT_FENCE_SN_IN
>   * - 1.10.0 - Add MSM_SUBMIT_BO_NO_IMPLICIT
>   * - 1.11.0 - Add wait boost (MSM_WAIT_FENCE_BOOST, MSM_PREP_BOOST)
> + * - 1.12.0 - Add MSM_INFO_SET_METADATA and MSM_INFO_GET_METADATA
>   */
>  #define MSM_VERSION_MAJOR  1
> -#define MSM_VERSION_MINOR  10
> +#define MSM_VERSION_MINOR  12
>  #define MSM_VERSION_PATCHLEVEL 0
>
>  static void msm_deinit_vram(struct drm_device *ddev);
> @@ -566,6 +567,8 @@ static int msm_ioctl_gem_info(struct drm_device *dev, 
> void *data,
> break;
> case MSM_INFO_SET_NAME:
> case MSM_INFO_GET_NAME:
> +   case MSM_INFO_SET_METADATA:
> +   case MSM_INFO_GET_METADATA:
> break;
> default:
> return -EINVAL;
> @@ -618,7 +621,7 @@ static int msm_ioctl_gem_info(struct drm_device *dev, 
> void *data,
> break;
> case MSM_INFO_GET_NAME:
> if (args->value && (args->len < strlen(msm_obj->name))) {
> -   ret = -EINVAL;
> +   ret = -ETOOSMALL;

This is unrelated and it also slightly changes user interface, so it
IMO should come as a separate commit/

> break;
> }
> args->len = strlen(msm_obj->name);
> @@ -627,6 +630,58 @@ static int msm_ioctl_gem_info(struct drm_device *dev, 
> void *data,
>  msm_obj->name, args->len))
> ret = -EFAULT;
> }
> +   break;
> +   case MSM_INFO_SET_METADATA:
> +   /* Impose a moderate upper bound on metadata size: */
> +   if (args->len > 128) {
> +   ret = -EOVERFLOW;
> +   break;
> +   }
> +
> +   ret = msm_gem_lock_interruptible(obj);
> +   if (ret)
> +   break;
> +
> +   msm_obj->metadata =
> +   krealloc(msm_obj->metadata, args->len, GFP_KERNEL);
> +   msm_obj->metadata_size = args->len;
> +
> +   if (copy_from_user(msm_obj->metadata, 
> u64_to_user_ptr(args->value),
> +  args->len)) {
> +   msm_obj->metadata_size = 0;
> +   ret = -EFAULT;
> +   }
> +
> +   msm_gem_unlock(obj);
> +
> +   break;
> +   case MSM_INFO_GET_METADATA:
> +   if (!args->value) {
> +   /*
> +* Querying the size is inherently racey, but
> +* EXT_external_objects expects the app to confirm
> +* via device and driver UUIDs that the exporter and
> +* importer versions match.  All we can do from the
> +* kernel side is check the length under obj lock
> +* when userspace tries to retrieve the metadata
> +*/
> +   args->len = msm_obj->metadata_size;
> +   break;
> +   }
> +
> +   ret = msm_gem_lock_interruptible(obj);
> +   if (ret)
> +   break;
> +
> +   if (args->len < msm_obj->metadata_size) {
> +   ret = -ETOOSMALL;
> +   } else if (copy_to_user(u64_to_user_ptr(args->value),
> +   msm_obj->metadata, args->len)) {
> +   ret = -EFAULT;
> +   }

Doesn't checkpwatch warn here about extra curly brackets?

> +
> +   msm_gem_unlock(obj);
> +
> break;
> }
>
> diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
> index 

Re: [PATCH 3/7] drm/msm/gem: Don't queue job to sched in error cases

2023-10-27 Thread Dmitry Baryshkov
On Fri, 27 Oct 2023 at 19:59, Rob Clark  wrote:
>
> From: Rob Clark 
>
> We shouldn't be running the job in error cases.  This also avoids having
> to think too hard about where the objs get unpinned (and if necessary,
> the resv takes over tracking that the obj is busy).. ie. error cases it
> always happens synchronously, and normal cases it happens from scheduler
> job_run() callback.
>
> Signed-off-by: Rob Clark 

Reviewed-by: Dmitry Baryshkov 

> ---
>  drivers/gpu/drm/msm/msm_gem_submit.c | 3 +++
>  1 file changed, 3 insertions(+)

-- 
With best wishes
Dmitry


Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Yuran Pereira
Hello,
On Fri, Oct 27, 2023 at 11:57:45AM -0400, Hamza Mahfooz wrote:
> On 10/27/23 11:55, Lakha, Bhawanpreet wrote:
> > [AMD Official Use Only - General]
> > 
> > 
> > 
> > There was a consensus to use memset instead of {0}. I remember making
> > changes related to that previously.
> 
> Hm, seems like it's used rather consistently in the DM and in DC
> though.
> 
Have you decided which one should be used?

Should I submit a v2 patch using {0} instead of memset?


Yuran Pereira
> > 
> > Bhawan
> > 
> > 
> > *From:* Mahfooz, Hamza 
> > *Sent:* October 27, 2023 11:53 AM
> > *To:* Yuran Pereira ; airl...@gmail.com
> > 
> > *Cc:* Li, Sun peng (Leo) ; Lakha, Bhawanpreet
> > ; Pan, Xinhui ; Siqueira,
> > Rodrigo ; linux-ker...@vger.kernel.org
> > ; amd-...@lists.freedesktop.org
> > ; dri-devel@lists.freedesktop.org
> > ; Deucher, Alexander
> > ; Koenig, Christian
> > ;
> > linux-kernel-ment...@lists.linuxfoundation.org
> > 
> > *Subject:* Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in
> > amdgpu_dm_setup_replay
> > On 10/26/23 17:25, Yuran Pereira wrote:
> > > Since `pr_config` is not initialized after its declaration, the
> > > following operations with `replay_enable_option` may be performed
> > > when `replay_enable_option` is holding junk values which could
> > > possibly lead to undefined behaviour
> > > 
> > > ```
> > >   ...
> > >   pr_config.replay_enable_option |= pr_enable_option_static_screen;
> > >   ...
> > > 
> > >   if (!pr_config.replay_timing_sync_supported)
> > >   pr_config.replay_enable_option &= ~pr_enable_option_general_ui;
> > >   ...
> > > ```
> > > 
> > > This patch initializes `pr_config` after its declaration to ensure that
> > > it doesn't contain junk data, and prevent any undefined behaviour
> > > 
> > > Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable")
> > > Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code")
> > > Signed-off-by: Yuran Pereira 
> > > ---
> > >    drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++
> > >    1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c 
> > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
> > > index 32d3086c4cb7..40526507f50b 100644
> > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
> > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
> > > @@ -23,6 +23,7 @@
> > >     *
> > >     */
> > > +#include 
> > >    #include "amdgpu_dm_replay.h"
> > >    #include "dc.h"
> > >    #include "dm_helpers.h"
> > > @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, 
> > > struct amdgpu_dm_connector *ac
> > >     struct replay_config pr_config;
> > 
> > I would prefer setting pr_config = {0};
> > 
> > >     union replay_debug_flags *debug_flags = NULL;
> > > + memset(_config, 0, sizeof(pr_config));
> > > +
> > >     // For eDP, if Replay is supported, return true to skip checks
> > >     if (link->replay_settings.config.replay_supported)
> > >     return true;
> > -- 
> > Hamza
> > 
> -- 
> Hamza
> 


Re: [PATCH 2/7] drm/msm/gem: Remove submit_unlock_unpin_bo()

2023-10-27 Thread Dmitry Baryshkov
On Fri, 27 Oct 2023 at 19:59, Rob Clark  wrote:
>
> From: Rob Clark 
>
> The only point it is called is before pinning objects, so the "unpin"
> part of the name is fiction.  Just remove call submit_cleanup_bo()

Nit: 'remove it and call'

Other than that:

Reviewed-by: Dmitry Baryshkov 

> directly.
>
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/msm_gem_submit.c | 10 ++
>  1 file changed, 2 insertions(+), 8 deletions(-)


-- 
With best wishes
Dmitry


Re: [PATCH v2 1/2] drm/msm/dp: don't touch DP subconnector property in eDP case

2023-10-27 Thread Dmitry Baryshkov
On Sat, 28 Oct 2023 at 00:02, Abhinav Kumar  wrote:
>
>
>
> On 10/25/2023 2:23 AM, Dmitry Baryshkov wrote:
> > From: Abel Vesa 
> >
> > In case of the eDP connection there is no subconnetor and as such no
> > subconnector property. Put drm_dp_set_subconnector_property() calls
> > under the !is_edp condition.
> >
> > Fixes: bfcc3d8f94f4 ("drm/msm/dp: support setting the DP subconnector type")
> > Signed-off-by: Abel Vesa 
> > Signed-off-by: Dmitry Baryshkov 
> > ---
> >   drivers/gpu/drm/msm/dp/dp_display.c | 15 ++-
> >   1 file changed, 10 insertions(+), 5 deletions(-)
> >
>
> We still need to unify the calls to drm_dp_set_subconnector_property()
> for the hpd connect/disconnect places preferably in
> dp_display_send_hpd_notification().
>
> That way, we would have had to make this change only in one location.

Good point, I'd like to take another look at the HPD handling in the
DP driver after we land the pending pm_runtime changes. As a part of
that I'll check the drm_dp_set_subconnector_property() calls.

> If you want to pursue that as a separate patch, I am fine as well.
>
> Hence,
>
> Reviewed-by: Abhinav Kumar 



-- 
With best wishes
Dmitry


[PATCH] drm/radeon: replace 1-element arrays with flexible-array members

2023-10-27 Thread José Pekkarinen
Reported by coccinelle, the following patch will move the
following 1 element arrays to flexible arrays.

drivers/gpu/drm/radeon/atombios.h:5523:32-48: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:5545:32-48: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:5461:34-44: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4447:30-40: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4236:30-41: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7044:24-37: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7054:24-37: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7095:28-45: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7553:8-17: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7559:8-17: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:3896:27-37: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:5443:16-25: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:5454:34-43: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4603:21-32: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:6299:32-44: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4628:32-46: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:6285:29-39: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4296:30-36: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4756:28-36: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:4064:22-35: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7327:9-24: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7332:32-53: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:6030:8-17: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7362:26-41: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7369:29-44: WARNING use flexible-array member 
instead 
(https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
drivers/gpu/drm/radeon/atombios.h:7349:24-32: WARNING use flexible-array member 
instead 

Re: [PATCH 2/5] drm/panel: nv3051d: Add Powkiddy RK2023 Panel Support

2023-10-27 Thread Jessica Zhang




On 10/20/2023 8:02 AM, Chris Morgan wrote:

On Thu, Oct 19, 2023 at 10:22:24AM -0700, Jessica Zhang wrote:



On 10/18/2023 9:18 AM, Chris Morgan wrote:

From: Chris Morgan 

Refactor the driver to add support for the powkiddy,rk2023-panel
panel. This panel is extremely similar to the rg353p-panel but
requires a smaller vertical back porch and isn't as tolerant of
higher speeds.

Tested on my RG351V, RG353P, RG353V, and RK2023.

Signed-off-by: Chris Morgan 


Hi Chris,

Thanks for the patch. Just have a minor question below.


---
   .../gpu/drm/panel/panel-newvision-nv3051d.c   | 56 +++
   1 file changed, 45 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-newvision-nv3051d.c 
b/drivers/gpu/drm/panel/panel-newvision-nv3051d.c
index 79de6c886292..d24c51503d68 100644
--- a/drivers/gpu/drm/panel/panel-newvision-nv3051d.c
+++ b/drivers/gpu/drm/panel/panel-newvision-nv3051d.c
@@ -28,6 +28,7 @@ struct nv3051d_panel_info {
unsigned int num_modes;
u16 width_mm, height_mm;
u32 bus_flags;
+   u32 mode_flags;
   };
   struct panel_nv3051d {
@@ -385,15 +386,7 @@ static int panel_nv3051d_probe(struct mipi_dsi_device *dsi)
dsi->lanes = 4;
dsi->format = MIPI_DSI_FMT_RGB888;
-   dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST |
- MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET;
-
-   /*
-* The panel in the RG351V is identical to the 353P, except it
-* requires MIPI_DSI_CLOCK_NON_CONTINUOUS to operate correctly.
-*/
-   if (of_device_is_compatible(dev->of_node, "anbernic,rg351v-panel"))
-   dsi->mode_flags |= MIPI_DSI_CLOCK_NON_CONTINUOUS;
+   dsi->mode_flags = ctx->panel_info->mode_flags;
drm_panel_init(>panel, >dev, _nv3051d_funcs,
   DRM_MODE_CONNECTOR_DSI);
@@ -481,18 +474,59 @@ static const struct drm_display_mode 
nv3051d_rgxx3_modes[] = {
},
   };
-static const struct nv3051d_panel_info nv3051d_rgxx3_info = {
+static const struct drm_display_mode nv3051d_rk2023_modes[] = {
+   {
+   .hdisplay   = 640,
+   .hsync_start= 640 + 40,
+   .hsync_end  = 640 + 40 + 2,
+   .htotal = 640 + 40 + 2 + 80,
+   .vdisplay   = 480,
+   .vsync_start= 480 + 18,
+   .vsync_end  = 480 + 18 + 2,
+   .vtotal = 480 + 18 + 2 + 4,
+   .clock  = 24150,
+   .flags  = DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_NVSYNC,
+   },
+};
+
+static const struct nv3051d_panel_info nv3051d_rg351v_info = {
.display_modes = nv3051d_rgxx3_modes,
.num_modes = ARRAY_SIZE(nv3051d_rgxx3_modes),
.width_mm = 70,
.height_mm = 57,
.bus_flags = DRM_BUS_FLAG_DE_LOW | DRM_BUS_FLAG_PIXDATA_DRIVE_NEGEDGE,
+   .mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST |
+ MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET |
+ MIPI_DSI_CLOCK_NON_CONTINUOUS,
+};
+
+static const struct nv3051d_panel_info nv3051d_rg353p_info = {
+   .display_modes = nv3051d_rgxx3_modes,
+   .num_modes = ARRAY_SIZE(nv3051d_rgxx3_modes),
+   .width_mm = 70,
+   .height_mm = 57,


Will all the panels for this driver be 70x57? If so, would it be better to
set display_info.[width_mm|height_mm] directly?


They are all so far the same size, but I can't guarantee that going forward.
To my knowledge this is the last of the nv3051d devices I'll be working on
in the foreseeable future though, and so far they're all identical in size.


Got it, if it's not guaranteed might be better to leave it as it then. 
Thanks for clarifying.


BR,

Jessica Zhang






+   .bus_flags = DRM_BUS_FLAG_DE_LOW | DRM_BUS_FLAG_PIXDATA_DRIVE_NEGEDGE,
+   .mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST |
+ MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET,
+};
+
+static const struct nv3051d_panel_info nv3051d_rk2023_info = {
+   .display_modes = nv3051d_rk2023_modes,
+   .num_modes = ARRAY_SIZE(nv3051d_rk2023_modes),
+   .width_mm = 70,
+   .height_mm = 57,
+   .bus_flags = DRM_BUS_FLAG_DE_LOW | DRM_BUS_FLAG_PIXDATA_DRIVE_NEGEDGE,
+   .mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST |
+ MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET,
   };
   static const struct of_device_id newvision_nv3051d_of_match[] = {
-   { .compatible = "newvision,nv3051d", .data = _rgxx3_info },
+   { .compatible = "anbernic,rg351v-panel", .data = _rg351v_info },
+   { .compatible = "anbernic,rg353p-panel", .data = _rg353p_info },
+   { .compatible = "powkiddy,rk2023-panel", .data = _rk2023_info },
{ /* sentinel */ }
   };
+


Sorry, will fix that in a V2. Thank you.



I think you can drop this stray newline.

Thanks,

Jessica Zhang


   

[PATCH RFC v7 09/10] drm/msm/dpu: Use DRM solid_fill property

2023-10-27 Thread Jessica Zhang
Drop DPU_PLANE_COLOR_FILL_FLAG and check the DRM solid_fill property to
determine if the plane is solid fill. In addition drop the DPU plane
color_fill field as we can now use drm_plane_state.solid_fill instead,
and pass in drm_plane_state.alpha to _dpu_plane_color_fill_pipe() to
allow userspace to configure the alpha value for the solid fill color.

Reviewed-by: Dmitry Baryshkov 
Signed-off-by: Jessica Zhang 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 38 ---
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index 9615653db787..832747080daf 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -42,7 +42,6 @@
 #define SHARP_SMOOTH_THR_DEFAULT   8
 #define SHARP_NOISE_THR_DEFAULT2
 
-#define DPU_PLANE_COLOR_FILL_FLAG  BIT(31)
 #define DPU_ZPOS_MAX 255
 
 /*
@@ -84,7 +83,6 @@ struct dpu_plane {
 
enum dpu_sspp pipe;
 
-   uint32_t color_fill;
bool is_error;
bool is_rt_pipe;
const struct dpu_mdss_cfg *catalog;
@@ -640,19 +638,34 @@ static void _dpu_plane_color_fill_pipe(struct 
dpu_plane_state *pstate,
_dpu_plane_setup_scaler(pipe, fmt, true, _cfg, pstate->rotation);
 }
 
+static uint32_t _dpu_plane_get_abgr_fill_color(struct drm_plane_state *state)
+{
+   struct drm_solid_fill solid_fill = state->solid_fill;
+
+   uint32_t ret = 0;
+   uint8_t a = state->alpha & 0xFF;
+   uint8_t b = solid_fill.b >> 24;
+   uint8_t g = solid_fill.g >> 24;
+   uint8_t r = solid_fill.r >> 24;
+
+   ret |= a << 24;
+   ret |= b << 16;
+   ret |= g << 8;
+   ret |= r;
+
+   return ret;
+}
+
 /**
  * _dpu_plane_color_fill - enables color fill on plane
  * @pdpu:   Pointer to DPU plane object
  * @color:  RGB fill color value, [23..16] Blue, [15..8] Green, [7..0] Red
- * @alpha:  8-bit fill alpha value, 255 selects 100% alpha
  */
-static void _dpu_plane_color_fill(struct dpu_plane *pdpu,
-   uint32_t color, uint32_t alpha)
+static void _dpu_plane_color_fill(struct dpu_plane *pdpu, uint32_t color)
 {
const struct dpu_format *fmt;
const struct drm_plane *plane = >base;
struct dpu_plane_state *pstate = to_dpu_plane_state(plane->state);
-   u32 fill_color = (color & 0xFF) | ((alpha & 0xFF) << 24);
 
DPU_DEBUG_PLANE(pdpu, "\n");
 
@@ -667,11 +680,11 @@ static void _dpu_plane_color_fill(struct dpu_plane *pdpu,
 
/* update sspp */
_dpu_plane_color_fill_pipe(pstate, >pipe, 
>pipe_cfg.dst_rect,
-  fill_color, fmt);
+  color, fmt);
 
if (pstate->r_pipe.sspp)
_dpu_plane_color_fill_pipe(pstate, >r_pipe, 
>r_pipe_cfg.dst_rect,
-  fill_color, fmt);
+  color, fmt);
 }
 
 static int dpu_plane_prepare_fb(struct drm_plane *plane,
@@ -1019,10 +1032,9 @@ void dpu_plane_flush(struct drm_plane *plane)
 */
if (pdpu->is_error)
/* force white frame with 100% alpha pipe output on error */
-   _dpu_plane_color_fill(pdpu, 0xFF, 0xFF);
-   else if (pdpu->color_fill & DPU_PLANE_COLOR_FILL_FLAG)
-   /* force 100% alpha */
-   _dpu_plane_color_fill(pdpu, pdpu->color_fill, 0xFF);
+   _dpu_plane_color_fill(pdpu, 0x);
+   else if (drm_plane_solid_fill_enabled(plane->state))
+   _dpu_plane_color_fill(pdpu, 
_dpu_plane_get_abgr_fill_color(plane->state));
else {
dpu_plane_flush_csc(pdpu, >pipe);
dpu_plane_flush_csc(pdpu, >r_pipe);
@@ -1067,7 +1079,7 @@ static void dpu_plane_sspp_update_pipe(struct drm_plane 
*plane,
}
 
/* override for color fill */
-   if (pdpu->color_fill & DPU_PLANE_COLOR_FILL_FLAG) {
+   if (drm_plane_solid_fill_enabled(plane->state)) {
_dpu_plane_set_qos_ctrl(plane, pipe, false);
 
/* skip remaining processing on color fill */

-- 
2.42.0



[PATCH RFC v7 01/10] drm: Introduce pixel_source DRM plane property

2023-10-27 Thread Jessica Zhang
Add support for pixel_source property to drm_plane and related
documentation. In addition, force pixel_source to
DRM_PLANE_PIXEL_SOURCE_FB in DRM_IOCTL_MODE_SETPLANE as to not break
legacy userspace.

This enum property will allow user to specify a pixel source for the
plane. Possible pixel sources will be defined in the
drm_plane_pixel_source enum.

Currently, the only pixel sources are DRM_PLANE_PIXEL_SOURCE_FB (the
default value) and DRM_PLANE_PIXEL_SOURCE_NONE.

Acked-by: Dmitry Baryshkov 
Acked-by: Pekka Paalanen 
Acked-by: Harry Wentland 
Acked-by: Sebastian Wick 
Acked-by: Simon Ser 
Signed-off-by: Jessica Zhang 
---
 drivers/gpu/drm/drm_atomic_state_helper.c |  1 +
 drivers/gpu/drm/drm_atomic_uapi.c |  4 ++
 drivers/gpu/drm/drm_blend.c   | 94 +++
 drivers/gpu/drm/drm_plane.c   | 19 +--
 include/drm/drm_blend.h   |  2 +
 include/drm/drm_plane.h   | 21 +++
 6 files changed, 137 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c 
b/drivers/gpu/drm/drm_atomic_state_helper.c
index 784e63d70a42..01638c51ce0a 100644
--- a/drivers/gpu/drm/drm_atomic_state_helper.c
+++ b/drivers/gpu/drm/drm_atomic_state_helper.c
@@ -252,6 +252,7 @@ void __drm_atomic_helper_plane_state_reset(struct 
drm_plane_state *plane_state,
 
plane_state->alpha = DRM_BLEND_ALPHA_OPAQUE;
plane_state->pixel_blend_mode = DRM_MODE_BLEND_PREMULTI;
+   plane_state->pixel_source = DRM_PLANE_PIXEL_SOURCE_FB;
 
if (plane->color_encoding_property) {
if (!drm_object_property_get_default_value(>base,
diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
b/drivers/gpu/drm/drm_atomic_uapi.c
index 98d3b10c08ae..46c78b87803d 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -562,6 +562,8 @@ static int drm_atomic_plane_set_property(struct drm_plane 
*plane,
state->src_w = val;
} else if (property == config->prop_src_h) {
state->src_h = val;
+   } else if (property == plane->pixel_source_property) {
+   state->pixel_source = val;
} else if (property == plane->alpha_property) {
state->alpha = val;
} else if (property == plane->blend_mode_property) {
@@ -634,6 +636,8 @@ drm_atomic_plane_get_property(struct drm_plane *plane,
*val = state->src_w;
} else if (property == config->prop_src_h) {
*val = state->src_h;
+   } else if (property == plane->pixel_source_property) {
+   *val = state->pixel_source;
} else if (property == plane->alpha_property) {
*val = state->alpha;
} else if (property == plane->blend_mode_property) {
diff --git a/drivers/gpu/drm/drm_blend.c b/drivers/gpu/drm/drm_blend.c
index 6e74de833466..fce734cdb85b 100644
--- a/drivers/gpu/drm/drm_blend.c
+++ b/drivers/gpu/drm/drm_blend.c
@@ -185,6 +185,25 @@
  *  plane does not expose the "alpha" property, then this is
  *  assumed to be 1.0
  *
+ * pixel_source:
+ * pixel_source is set up with drm_plane_create_pixel_source_property().
+ * It is used to toggle the active source of pixel data for the plane.
+ * The plane will only display data from the set pixel_source -- any
+ * data from other sources will be ignored.
+ *
+ * For non-framebuffer sources, if pixel_source is set to a non-framebuffer
+ * and non-NONE source, and the corresponding source property is NULL, then
+ * the atomic commit should return an error.
+ *
+ * Possible values:
+ *
+ * "NONE":
+ * No active pixel source.
+ * Committing with a NONE pixel source will disable the plane.
+ *
+ * "FB":
+ * Framebuffer source set by the "FB_ID" property.
+ *
  * Note that all the property extensions described here apply either to the
  * plane or the CRTC (e.g. for the background color, which currently is not
  * exposed and assumed to be black).
@@ -615,3 +634,78 @@ int drm_plane_create_blend_mode_property(struct drm_plane 
*plane,
return 0;
 }
 EXPORT_SYMBOL(drm_plane_create_blend_mode_property);
+
+static const struct drm_prop_enum_list drm_pixel_source_enum_list[] = {
+   { DRM_PLANE_PIXEL_SOURCE_NONE, "NONE" },
+   { DRM_PLANE_PIXEL_SOURCE_FB, "FB" },
+};
+
+/**
+ * drm_plane_create_pixel_source_property - create a new pixel source property
+ * @plane: DRM plane
+ * @extra_sources: Bitmask of additional supported pixel_sources for the 
driver.
+ *DRM_PLANE_PIXEL_SOURCE_FB and DRM_PLANE_PIXEL_SOURCE_NONE 
will
+ *always be enabled as supported sources.
+ *
+ * This creates a new property describing the current source of pixel data for 
the
+ * plane. The pixel_source will be initialized as DRM_PLANE_PIXEL_SOURCE_FB by 
default.
+ *
+ * Drivers can set a custom default source by overriding the pixel_source 

[PATCH RFC v7 06/10] drm/atomic: Move framebuffer checks to helper

2023-10-27 Thread Jessica Zhang
Currently framebuffer checks happen directly in
drm_atomic_plane_check(). Move these checks into their own helper
method.

Reviewed-by: Dmitry Baryshkov 
Acked-by: Harry Wentland 
Acked-by: Sebastian Wick 
Signed-off-by: Jessica Zhang 
---
 drivers/gpu/drm/drm_atomic.c | 130 ---
 1 file changed, 73 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index af778d32785b..48a2df4e3d27 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -589,6 +589,76 @@ plane_switching_crtc(const struct drm_plane_state 
*old_plane_state,
return true;
 }
 
+static int drm_atomic_plane_check_fb(const struct drm_plane_state *state)
+{
+   struct drm_plane *plane = state->plane;
+   const struct drm_framebuffer *fb = state->fb;
+   struct drm_mode_rect *clips;
+
+   uint32_t num_clips;
+   unsigned int fb_width, fb_height;
+   int ret;
+
+   /* Check whether this plane supports the fb pixel format. */
+   ret = drm_plane_check_pixel_format(plane, fb->format->format,
+  fb->modifier);
+
+   if (ret) {
+   drm_dbg_atomic(plane->dev,
+  "[PLANE:%d:%s] invalid pixel format %p4cc, 
modifier 0x%llx\n",
+  plane->base.id, plane->name,
+  >format->format, fb->modifier);
+   return ret;
+   }
+
+   fb_width = fb->width << 16;
+   fb_height = fb->height << 16;
+
+   /* Make sure source coordinates are inside the fb. */
+   if (state->src_w > fb_width ||
+   state->src_x > fb_width - state->src_w ||
+   state->src_h > fb_height ||
+   state->src_y > fb_height - state->src_h) {
+   drm_dbg_atomic(plane->dev,
+  "[PLANE:%d:%s] invalid source coordinates "
+  "%u.%06ux%u.%06u+%u.%06u+%u.%06u (fb %ux%u)\n",
+  plane->base.id, plane->name,
+  state->src_w >> 16,
+  ((state->src_w & 0x) * 15625) >> 10,
+  state->src_h >> 16,
+  ((state->src_h & 0x) * 15625) >> 10,
+  state->src_x >> 16,
+  ((state->src_x & 0x) * 15625) >> 10,
+  state->src_y >> 16,
+  ((state->src_y & 0x) * 15625) >> 10,
+  fb->width, fb->height);
+   return -ENOSPC;
+   }
+
+   clips = __drm_plane_get_damage_clips(state);
+   num_clips = drm_plane_get_damage_clips_count(state);
+
+   /* Make sure damage clips are valid and inside the fb. */
+   while (num_clips > 0) {
+   if (clips->x1 >= clips->x2 ||
+   clips->y1 >= clips->y2 ||
+   clips->x1 < 0 ||
+   clips->y1 < 0 ||
+   clips->x2 > fb_width ||
+   clips->y2 > fb_height) {
+   drm_dbg_atomic(plane->dev,
+  "[PLANE:%d:%s] invalid damage clip %d %d 
%d %d\n",
+  plane->base.id, plane->name, clips->x1,
+  clips->y1, clips->x2, clips->y2);
+   return -EINVAL;
+   }
+   clips++;
+   num_clips--;
+   }
+
+   return 0;
+}
+
 /**
  * drm_atomic_plane_check - check plane state
  * @old_plane_state: old plane state to check
@@ -605,9 +675,6 @@ static int drm_atomic_plane_check(const struct 
drm_plane_state *old_plane_state,
struct drm_plane *plane = new_plane_state->plane;
struct drm_crtc *crtc = new_plane_state->crtc;
const struct drm_framebuffer *fb = new_plane_state->fb;
-   unsigned int fb_width, fb_height;
-   struct drm_mode_rect *clips;
-   uint32_t num_clips;
int ret;
 
/* either *both* CRTC and FB must be set, or neither */
@@ -634,17 +701,6 @@ static int drm_atomic_plane_check(const struct 
drm_plane_state *old_plane_state,
return -EINVAL;
}
 
-   /* Check whether this plane supports the fb pixel format. */
-   ret = drm_plane_check_pixel_format(plane, fb->format->format,
-  fb->modifier);
-   if (ret) {
-   drm_dbg_atomic(plane->dev,
-  "[PLANE:%d:%s] invalid pixel format %p4cc, 
modifier 0x%llx\n",
-  plane->base.id, plane->name,
-  >format->format, fb->modifier);
-   return ret;
-   }
-
/* Give drivers some help against integer overflows */
if (new_plane_state->crtc_w > INT_MAX ||
new_plane_state->crtc_x > INT_MAX - (int32_t) 

[PATCH RFC v7 00/10] Support for Solid Fill Planes

2023-10-27 Thread Jessica Zhang
Some drivers support hardware that have optimizations for solid fill
planes. This series aims to expose these capabilities to userspace as
some compositors have a solid fill flag (ex. SOLID_COLOR in the Android
hardware composer HAL) that can be set by apps like the Android Gears
test app.

In order to expose this capability to userspace, this series will:

- Introduce solid_fill and pixel_source properties to allow userspace to
  toggle between FB and solid fill sources
- Loosen NULL FB checks within the DRM atomic commit callstack to allow
  for NULL FB when solid fill is enabled.
- Add NULL FB checks in methods where FB was previously assumed to be
  non-NULL
- Have MSM DPU driver use drm_plane_state.solid_fill instead of
  dpu_plane_state.color_fill

Note: The solid fill planes feature depends on both the solid_fill *and*
pixel_source properties.

To use this feature, userspace can set the solid_fill property to a blob
containing the solid fill color (in RGB323232 format) and and setting the
pixel_source property to DRM_PLANE_PIXEL_SOURCE_SOLID_FILL. This will
disable memory fetch and the resulting plane will display the color
specified by the solid_fill blob.

In order to disable a solid fill plane, the user must set the pixel_source
to NONE. A plane with a pixel_source of "solid_fill" and a NULL solid_fill
blob will cause an error on commit.

Currently, there's only one version of the solid_fill blob property.
However if other drivers want to support a similar feature, but require
more than just the solid fill color, they can extend this feature by
extending the pixel_source enum and adding another solid fill blob property.

This 2 property approach was chosen because passing in a special 1x1 FB
with the necessary color information would have unecessary overhead that
does not reflect the behavior of the solid fill feature. In addition,
assigning the solid fill blob to FB_ID would require loosening some core
drm_property checks that might cause unwanted side effects elsewhere.

---
Changes in v7:
- Updated cover letter (Sebastian)
- Updated the uAPI documentation (Sebastian, Pekka)
- Specify that padding must be set to zero in drm_mode_solid_fill
  documentation (Pekka)
- Use %08x format when printing solid fill info in plane state dump (Pekka)
- Use new_plane_state->fb directly in drm_atomic_plane_check() (Dmitry)
- Dropped documentation for alpha for _dpu_plane_color_fill() (Dmitry)
- Defined DPU_SOLID_FILL_FORMAT macro (Dmitry)
- Fixed some necessary checks being skipped in the DPU atomic
  commit path (Dmitry)
- Rebased to tip of msm-next
- Picked up Acked-by and Reviewed-by tags

Changes in v6:
- Have _dpu_plane_color_fill() take in a single ABGR color instead
  of having separate alpha and BGR color parameters (Dmitry)
- Drop plane->state->pixel_source != DRM_PLANE_PIXEL_SOURCE_FB check
  in SetPlane ioctl (Dmitry)
- Add DRM_PLANE_PIXEL_SOURCE_NONE as a default pixel source (Sebastian)
- Dropped versioning from solid fill property blob (Dmitry)
- Use DRM_ENUM_NAME_FN (Dmitry)
- Use drm_atomic_replace_property_blob_from_id() (Dmitry)
- drm_atomic_check_fb -> drm_atomic_plane_check_fb (Dmitry)
- Group redundant NULL FB checks (Dmitry)
- Squashed drm_plane_needs_disable() implementation with 
  DRM_PLANE_PIXEL_SOURCE_NONE declaration (Sebastian)
- Add comment to support RGBA solid fill color in the future (Dmitry)
- Link to v5: 
https://lore.kernel.org/r/20230728-solid-fill-v5-0-053dbefa9...@quicinc.com

Changes in v5:
- Added support for PIXEL_SOURCE_NONE (Sebastian)
- Added WARN_ON() in drm_plane_has_visible_data() if pixel_source isn't
  set (Dmitry)
- Added debugfs support for both properties (Dmitry)
- Corrected u32 to u8 conversion (Pekka)
- Moved drm_solid_fill_info struct and related documentation to
  include/uapi (Pekka)
- Changed drm_solid_fill_info.version to __u32 for data alignment (Pekka)
- Added more detailed UAPI and kernel documentation (Pekka)
- Reordered patch series so that the pixel_source property is introduced
  before solid_fill (Dmitry)
- Fixed inconsistent ABGR/RGBA format declaration (Pekka)
- Reset pixel_source to FB in drm_mode_setplane() (Dmitry)
- Rename supported_sources to extra_sources (Dmitry)
- Only destroy old solid_fill blob state if new state is valid (Pekka)
- Link to v4: 
https://lore.kernel.org/r/20230404-solid-fill-v4-0-f4ec0caa7...@quicinc.com

Changes in v4:
- Rebased onto latest kernel
- Reworded cover letter for clarity (Dmitry)
- Reworded commit messages for clarity
- Split existing changes into smaller commits
- Added pixel_source enum property (Dmitry, Pekka, Ville)
- Updated drm-kms comment docs with pixel_source and solid_fill
  properties (Dmitry)
- Inlined drm_atomic_convert_solid_fill_info() (Dmitry)
- Passed in plane state alpha value to _dpu_plane_color_fill_pipe()
- Link to v3: 
https://lore.kernel.org/r/20230104234036.636-1-quic_jessz...@quicinc.com

Changes in v3:
- Fixed some logic errors in atomic checks (Dmitry)
- 

[PATCH RFC v7 05/10] drm/atomic: Add solid fill data to plane state dump

2023-10-27 Thread Jessica Zhang
Add solid_fill property data to the atomic plane state dump.

Reviewed-by: Dmitry Baryshkov 
Acked-by: Harry Wentland 
Acked-by: Sebastian Wick 
Signed-off-by: Jessica Zhang 
---
 drivers/gpu/drm/drm_atomic.c | 4 
 drivers/gpu/drm/drm_plane.c  | 8 
 include/drm/drm_plane.h  | 3 +++
 3 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 9f9abbe76369..af778d32785b 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -726,6 +726,10 @@ static void drm_atomic_plane_print_state(struct 
drm_printer *p,
drm_printf(p, "\tfb=%u\n", state->fb ? state->fb->base.id : 0);
if (state->fb)
drm_framebuffer_print_info(p, 2, state->fb);
+   drm_printf(p, "\tsolid_fill=%u\n",
+   state->solid_fill_blob ? 
state->solid_fill_blob->base.id : 0);
+   if (state->solid_fill_blob)
+   drm_plane_solid_fill_print_info(p, 2, state);
drm_printf(p, "\tcrtc-pos=" DRM_RECT_FMT "\n", DRM_RECT_ARG());
drm_printf(p, "\tsrc-pos=" DRM_RECT_FP_FMT "\n", DRM_RECT_FP_ARG());
drm_printf(p, "\trotation=%x\n", state->rotation);
diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index 559d101162ba..289b3be86d52 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -1495,6 +1495,14 @@ __drm_plane_get_damage_clips(const struct 
drm_plane_state *state)
state->fb_damage_clips->data : NULL);
 }
 
+void drm_plane_solid_fill_print_info(struct drm_printer *p, unsigned int 
indent,
+const struct drm_plane_state *state)
+{
+   drm_printf_indent(p, indent, "r=0x%08x\n", state->solid_fill.r);
+   drm_printf_indent(p, indent, "g=0x%08x\n", state->solid_fill.g);
+   drm_printf_indent(p, indent, "b=0x%08x\n", state->solid_fill.b);
+}
+
 /**
  * drm_plane_get_damage_clips - Returns damage clips.
  * @state: Plane state.
diff --git a/include/drm/drm_plane.h b/include/drm/drm_plane.h
index baaf737392bc..6171fb1a0b47 100644
--- a/include/drm/drm_plane.h
+++ b/include/drm/drm_plane.h
@@ -1001,6 +1001,9 @@ drm_plane_get_damage_clips_count(const struct 
drm_plane_state *state);
 struct drm_mode_rect *
 drm_plane_get_damage_clips(const struct drm_plane_state *state);
 
+void drm_plane_solid_fill_print_info(struct drm_printer *p, unsigned int 
indent,
+const struct drm_plane_state *state);
+
 int drm_plane_create_scaling_filter_property(struct drm_plane *plane,
 unsigned int supported_filters);
 

-- 
2.42.0



[PATCH RFC v7 07/10] drm/atomic: Loosen FB atomic checks

2023-10-27 Thread Jessica Zhang
Loosen the requirements for atomic and legacy commit so that, in cases
where pixel_source != FB, the commit can still go through.

This includes adding framebuffer NULL checks in other areas to account for
FB being NULL when non-FB pixel sources are enabled.

To disable a plane, the pixel_source must be NONE or the FB must be NULL
if pixel_source == FB.

Signed-off-by: Jessica Zhang 
---
 drivers/gpu/drm/drm_atomic.c| 21 ++--
 drivers/gpu/drm/drm_atomic_helper.c | 39 +
 include/drm/drm_atomic_helper.h |  4 ++--
 include/drm/drm_plane.h | 29 +++
 4 files changed, 64 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 48a2df4e3d27..66e49c06aced 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -674,17 +674,16 @@ static int drm_atomic_plane_check(const struct 
drm_plane_state *old_plane_state,
 {
struct drm_plane *plane = new_plane_state->plane;
struct drm_crtc *crtc = new_plane_state->crtc;
-   const struct drm_framebuffer *fb = new_plane_state->fb;
int ret;
 
-   /* either *both* CRTC and FB must be set, or neither */
-   if (crtc && !fb) {
-   drm_dbg_atomic(plane->dev, "[PLANE:%d:%s] CRTC set but no FB\n",
+   /* either *both* CRTC and pixel source must be set, or neither */
+   if (crtc && !drm_plane_has_visible_data(new_plane_state)) {
+   drm_dbg_atomic(plane->dev, "[PLANE:%d:%s] CRTC set but no 
visible data\n",
   plane->base.id, plane->name);
return -EINVAL;
-   } else if (fb && !crtc) {
-   drm_dbg_atomic(plane->dev, "[PLANE:%d:%s] FB set but no CRTC\n",
-  plane->base.id, plane->name);
+   } else if (drm_plane_has_visible_data(new_plane_state) && !crtc) {
+   drm_dbg_atomic(plane->dev, "[PLANE:%d:%s] Source %d has visible 
data but no CRTC\n",
+  plane->base.id, plane->name, 
new_plane_state->pixel_source);
return -EINVAL;
}
 
@@ -715,9 +714,11 @@ static int drm_atomic_plane_check(const struct 
drm_plane_state *old_plane_state,
}
 
 
-   ret = drm_atomic_plane_check_fb(new_plane_state);
-   if (ret)
-   return ret;
+   if (new_plane_state->pixel_source == DRM_PLANE_PIXEL_SOURCE_FB && 
new_plane_state->fb) {
+   ret = drm_atomic_plane_check_fb(new_plane_state);
+   if (ret)
+   return ret;
+   }
 
if (plane_switching_crtc(old_plane_state, new_plane_state)) {
drm_dbg_atomic(plane->dev,
diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c
index 292e38eb6218..27e8c8dd9324 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -852,7 +852,6 @@ int drm_atomic_helper_check_plane_state(struct 
drm_plane_state *plane_state,
bool can_position,
bool can_update_disabled)
 {
-   struct drm_framebuffer *fb = plane_state->fb;
struct drm_rect *src = _state->src;
struct drm_rect *dst = _state->dst;
unsigned int rotation = plane_state->rotation;
@@ -864,7 +863,7 @@ int drm_atomic_helper_check_plane_state(struct 
drm_plane_state *plane_state,
*src = drm_plane_state_src(plane_state);
*dst = drm_plane_state_dest(plane_state);
 
-   if (!fb) {
+   if (!drm_plane_has_visible_data(plane_state)) {
plane_state->visible = false;
return 0;
}
@@ -881,25 +880,31 @@ int drm_atomic_helper_check_plane_state(struct 
drm_plane_state *plane_state,
return -EINVAL;
}
 
-   drm_rect_rotate(src, fb->width << 16, fb->height << 16, rotation);
+   /* Check that selected pixel source is valid */
+   if (plane_state->pixel_source == DRM_PLANE_PIXEL_SOURCE_FB && 
plane_state->fb) {
+   struct drm_framebuffer *fb = plane_state->fb;
+   drm_rect_rotate(src, fb->width << 16, fb->height << 16, 
rotation);
 
-   /* Check scaling */
-   hscale = drm_rect_calc_hscale(src, dst, min_scale, max_scale);
-   vscale = drm_rect_calc_vscale(src, dst, min_scale, max_scale);
-   if (hscale < 0 || vscale < 0) {
-   drm_dbg_kms(plane_state->plane->dev,
-   "Invalid scaling of plane\n");
-   drm_rect_debug_print("src: ", _state->src, true);
-   drm_rect_debug_print("dst: ", _state->dst, false);
-   return -ERANGE;
-   }
+   /* Check scaling */
+   hscale = drm_rect_calc_hscale(src, dst, min_scale, max_scale);
+   vscale = drm_rect_calc_vscale(src, dst, min_scale, max_scale);
 
-   if (crtc_state->enable)
-   

[PATCH RFC v7 10/10] drm/msm/dpu: Add solid fill and pixel source properties

2023-10-27 Thread Jessica Zhang
Add solid_fill and pixel_source properties to DPU plane

Reviewed-by: Dmitry Baryshkov 
Signed-off-by: Jessica Zhang 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index 832747080daf..8e9fda541211 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -1493,6 +1493,8 @@ struct drm_plane *dpu_plane_init(struct drm_device *dev,
DPU_ERROR("failed to install zpos property, rc = %d\n", ret);
 
drm_plane_create_alpha_property(plane);
+   drm_plane_create_solid_fill_property(plane);
+   drm_plane_create_pixel_source_property(plane, 
BIT(DRM_PLANE_PIXEL_SOURCE_SOLID_FILL));
drm_plane_create_blend_mode_property(plane,
BIT(DRM_MODE_BLEND_PIXEL_NONE) |
BIT(DRM_MODE_BLEND_PREMULTI) |

-- 
2.42.0



[PATCH RFC v7 03/10] drm: Add solid fill pixel source

2023-10-27 Thread Jessica Zhang
Add "SOLID_FILL" as a valid pixel source. If the pixel_source property is
set to "SOLID_FILL", it will display data from the drm_plane "solid_fill"
blob property.

Reviewed-by: Dmitry Baryshkov 
Acked-by: Pekka Paalanen 
Acked-by: Harry Wentland 
Acked-by: Sebastian Wick 
Signed-off-by: Jessica Zhang 
---
 drivers/gpu/drm/drm_blend.c | 10 +-
 include/drm/drm_plane.h |  1 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_blend.c b/drivers/gpu/drm/drm_blend.c
index 665c5d9b056f..37b31b7e5ce5 100644
--- a/drivers/gpu/drm/drm_blend.c
+++ b/drivers/gpu/drm/drm_blend.c
@@ -204,6 +204,9 @@
  * "FB":
  * Framebuffer source set by the "FB_ID" property.
  *
+ * "SOLID_FILL":
+ * Solid fill color source set by the "solid_fill" property.
+ *
  * solid_fill:
  * solid_fill is set up with drm_plane_create_solid_fill_property(). It
  * contains pixel data that drivers can use to fill a plane.
@@ -642,6 +645,7 @@ EXPORT_SYMBOL(drm_plane_create_blend_mode_property);
 static const struct drm_prop_enum_list drm_pixel_source_enum_list[] = {
{ DRM_PLANE_PIXEL_SOURCE_NONE, "NONE" },
{ DRM_PLANE_PIXEL_SOURCE_FB, "FB" },
+   { DRM_PLANE_PIXEL_SOURCE_SOLID_FILL, "SOLID_FILL" },
 };
 
 /**
@@ -666,6 +670,9 @@ static const struct drm_prop_enum_list 
drm_pixel_source_enum_list[] = {
  * "FB":
  * Framebuffer pixel source
  *
+ * "SOLID_FILL":
+ * Solid fill color pixel source
+ *
  * Returns:
  * Zero on success, negative errno on failure.
  */
@@ -675,7 +682,8 @@ int drm_plane_create_pixel_source_property(struct drm_plane 
*plane,
struct drm_device *dev = plane->dev;
struct drm_property *prop;
static const unsigned int valid_source_mask = 
BIT(DRM_PLANE_PIXEL_SOURCE_FB) |
- 
BIT(DRM_PLANE_PIXEL_SOURCE_NONE);
+ 
BIT(DRM_PLANE_PIXEL_SOURCE_NONE) |
+ 
BIT(DRM_PLANE_PIXEL_SOURCE_SOLID_FILL);
int i;
 
/* FB is supported by default */
diff --git a/include/drm/drm_plane.h b/include/drm/drm_plane.h
index e2ae7c26cc57..baaf737392bc 100644
--- a/include/drm/drm_plane.h
+++ b/include/drm/drm_plane.h
@@ -43,6 +43,7 @@ enum drm_scaling_filter {
 enum drm_plane_pixel_source {
DRM_PLANE_PIXEL_SOURCE_NONE,
DRM_PLANE_PIXEL_SOURCE_FB,
+   DRM_PLANE_PIXEL_SOURCE_SOLID_FILL,
DRM_PLANE_PIXEL_SOURCE_MAX
 };
 

-- 
2.42.0



[PATCH RFC v7 04/10] drm/atomic: Add pixel source to plane state dump

2023-10-27 Thread Jessica Zhang
Add pixel source to the atomic plane state dump

Reviewed-by: Dmitry Baryshkov 
Acked-by: Harry Wentland 
Acked-by: Sebastian Wick 
Signed-off-by: Jessica Zhang 
---
 drivers/gpu/drm/drm_atomic.c| 1 +
 drivers/gpu/drm/drm_blend.c | 1 +
 drivers/gpu/drm/drm_crtc_internal.h | 1 +
 3 files changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index c277b198fa3f..9f9abbe76369 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -722,6 +722,7 @@ static void drm_atomic_plane_print_state(struct drm_printer 
*p,
 
drm_printf(p, "plane[%u]: %s\n", plane->base.id, plane->name);
drm_printf(p, "\tcrtc=%s\n", state->crtc ? state->crtc->name : 
"(null)");
+   drm_printf(p, "\tpixel-source=%s\n", 
drm_get_pixel_source_name(state->pixel_source));
drm_printf(p, "\tfb=%u\n", state->fb ? state->fb->base.id : 0);
if (state->fb)
drm_framebuffer_print_info(p, 2, state->fb);
diff --git a/drivers/gpu/drm/drm_blend.c b/drivers/gpu/drm/drm_blend.c
index 37b31b7e5ce5..9c1608f7c1df 100644
--- a/drivers/gpu/drm/drm_blend.c
+++ b/drivers/gpu/drm/drm_blend.c
@@ -647,6 +647,7 @@ static const struct drm_prop_enum_list 
drm_pixel_source_enum_list[] = {
{ DRM_PLANE_PIXEL_SOURCE_FB, "FB" },
{ DRM_PLANE_PIXEL_SOURCE_SOLID_FILL, "SOLID_FILL" },
 };
+DRM_ENUM_NAME_FN(drm_get_pixel_source_name, drm_pixel_source_enum_list);
 
 /**
  * drm_plane_create_pixel_source_property - create a new pixel source property
diff --git a/drivers/gpu/drm/drm_crtc_internal.h 
b/drivers/gpu/drm/drm_crtc_internal.h
index 501a10edd0e1..7bc93ba449d5 100644
--- a/drivers/gpu/drm/drm_crtc_internal.h
+++ b/drivers/gpu/drm/drm_crtc_internal.h
@@ -267,6 +267,7 @@ int drm_plane_check_pixel_format(struct drm_plane *plane,
 u32 format, u64 modifier);
 struct drm_mode_rect *
 __drm_plane_get_damage_clips(const struct drm_plane_state *state);
+const char *drm_get_pixel_source_name(int val);
 
 /* drm_bridge.c */
 void drm_bridge_detach(struct drm_bridge *bridge);

-- 
2.42.0



[PATCH RFC v7 08/10] drm/msm/dpu: Allow NULL FBs in atomic commit

2023-10-27 Thread Jessica Zhang
Since solid fill planes allow for a NULL framebuffer in a valid commit,
add NULL framebuffer checks to atomic commit calls within DPU.

Reviewed-by: Dmitry Baryshkov 
Signed-off-by: Jessica Zhang 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c  |  9 +++-
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 35 ---
 2 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
index 3c475f8042b0..3b9648c679ad 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
@@ -451,6 +451,7 @@ static void _dpu_crtc_blend_setup_mixer(struct drm_crtc 
*crtc,
struct drm_plane_state *state;
struct dpu_crtc_state *cstate = to_dpu_crtc_state(crtc->state);
struct dpu_plane_state *pstate = NULL;
+   const struct msm_format *fmt;
struct dpu_format *format;
struct dpu_hw_ctl *ctl = mixer->lm_ctl;
 
@@ -470,7 +471,13 @@ static void _dpu_crtc_blend_setup_mixer(struct drm_crtc 
*crtc,
pstate = to_dpu_plane_state(state);
fb = state->fb;
 
-   format = to_dpu_format(msm_framebuffer_format(pstate->base.fb));
+   if (drm_plane_solid_fill_enabled(state))
+   fmt = dpu_get_msm_format(&_dpu_crtc_get_kms(crtc)->base,
+   DRM_FORMAT_ABGR, 0);
+   else
+   fmt = msm_framebuffer_format(pstate->base.fb);
+
+   format = to_dpu_format(fmt);
 
if (pstate->stage == DPU_STAGE_BASE && format->alpha_enable)
bg_alpha_enable = true;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index 3eef5e025e12..9615653db787 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -55,6 +55,8 @@
 
 #define DEFAULT_REFRESH_RATE   60
 
+#define DPU_SOLID_FILL_FORMAT  DRM_FORMAT_ABGR
+
 static const uint32_t qcom_compressed_supported_formats[] = {
DRM_FORMAT_ABGR,
DRM_FORMAT_ARGB,
@@ -658,7 +660,7 @@ static void _dpu_plane_color_fill(struct dpu_plane *pdpu,
 * select fill format to match user property expectation,
 * h/w only supports RGB variants
 */
-   fmt = dpu_get_dpu_format(DRM_FORMAT_ABGR);
+   fmt = dpu_get_dpu_format(DPU_SOLID_FILL_FORMAT);
/* should not happen ever */
if (!fmt)
return;
@@ -877,18 +879,23 @@ static int dpu_plane_atomic_check(struct drm_plane *plane,
 
pipe_cfg->dst_rect = new_plane_state->dst;
 
-   fb_rect.x2 = new_plane_state->fb->width;
-   fb_rect.y2 = new_plane_state->fb->height;
+   if (new_plane_state->pixel_source == DRM_PLANE_PIXEL_SOURCE_FB && 
new_plane_state->fb) {
+   fb_rect.x2 = new_plane_state->fb->width;
+   fb_rect.y2 = new_plane_state->fb->height;
 
-   /* Ensure fb size is supported */
-   if (drm_rect_width(_rect) > MAX_IMG_WIDTH ||
-   drm_rect_height(_rect) > MAX_IMG_HEIGHT) {
-   DPU_DEBUG_PLANE(pdpu, "invalid framebuffer " DRM_RECT_FMT "\n",
-   DRM_RECT_ARG(_rect));
-   return -E2BIG;
+   /* Ensure fb size is supported */
+   if (drm_rect_width(_rect) > MAX_IMG_WIDTH ||
+   drm_rect_height(_rect) > MAX_IMG_HEIGHT) {
+   DPU_DEBUG_PLANE(pdpu, "invalid framebuffer " 
DRM_RECT_FMT "\n",
+   DRM_RECT_ARG(_rect));
+   return -E2BIG;
+   }
}
 
-   fmt = to_dpu_format(msm_framebuffer_format(new_plane_state->fb));
+   if (drm_plane_solid_fill_enabled(new_plane_state))
+   fmt = dpu_get_dpu_format(DPU_SOLID_FILL_FORMAT);
+   else
+   fmt = 
to_dpu_format(msm_framebuffer_format(new_plane_state->fb));
 
max_linewidth = pdpu->catalog->caps->max_linewidth;
 
@@ -1123,8 +1130,7 @@ static void dpu_plane_sspp_atomic_update(struct drm_plane 
*plane)
struct drm_crtc *crtc = state->crtc;
struct drm_framebuffer *fb = state->fb;
bool is_rt_pipe;
-   const struct dpu_format *fmt =
-   to_dpu_format(msm_framebuffer_format(fb));
+   const struct dpu_format *fmt;
struct dpu_sw_pipe_cfg *pipe_cfg = >pipe_cfg;
struct dpu_sw_pipe_cfg *r_pipe_cfg = >r_pipe_cfg;
struct dpu_kms *kms = _dpu_plane_get_kms(>base);
@@ -1133,6 +1139,11 @@ static void dpu_plane_sspp_atomic_update(struct 
drm_plane *plane)
bool layout_valid = false;
int ret;
 
+   if (drm_plane_solid_fill_enabled(state))
+   return;
+
+   fmt = to_dpu_format(msm_framebuffer_format(fb));
+
ret = dpu_format_populate_layout(aspace, fb, );
if (ret)
DPU_ERROR_PLANE(pdpu, "failed to get 

[PATCH RFC v7 02/10] drm: Introduce solid fill DRM plane property

2023-10-27 Thread Jessica Zhang
Document and add support for solid_fill property to drm_plane. In
addition, add support for setting and getting the values for solid_fill.

To enable solid fill planes, userspace must assign a property blob to
the "solid_fill" plane property containing the following information:

struct drm_mode_solid_fill {
u32 r, g, b, pad;
};

Acked-by: Harry Wentland 
Acked-by: Sebastian Wick 
Signed-off-by: Jessica Zhang 
---
 drivers/gpu/drm/drm_atomic_state_helper.c |  9 
 drivers/gpu/drm/drm_atomic_uapi.c | 26 ++
 drivers/gpu/drm/drm_blend.c   | 30 ++
 include/drm/drm_blend.h   |  1 +
 include/drm/drm_plane.h   | 36 +++
 include/uapi/drm/drm_mode.h   | 24 +
 6 files changed, 126 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c 
b/drivers/gpu/drm/drm_atomic_state_helper.c
index 01638c51ce0a..86fb876efbe6 100644
--- a/drivers/gpu/drm/drm_atomic_state_helper.c
+++ b/drivers/gpu/drm/drm_atomic_state_helper.c
@@ -254,6 +254,11 @@ void __drm_atomic_helper_plane_state_reset(struct 
drm_plane_state *plane_state,
plane_state->pixel_blend_mode = DRM_MODE_BLEND_PREMULTI;
plane_state->pixel_source = DRM_PLANE_PIXEL_SOURCE_FB;
 
+   if (plane_state->solid_fill_blob) {
+   drm_property_blob_put(plane_state->solid_fill_blob);
+   plane_state->solid_fill_blob = NULL;
+   }
+
if (plane->color_encoding_property) {
if (!drm_object_property_get_default_value(>base,
   
plane->color_encoding_property,
@@ -336,6 +341,9 @@ void __drm_atomic_helper_plane_duplicate_state(struct 
drm_plane *plane,
if (state->fb)
drm_framebuffer_get(state->fb);
 
+   if (state->solid_fill_blob)
+   drm_property_blob_get(state->solid_fill_blob);
+
state->fence = NULL;
state->commit = NULL;
state->fb_damage_clips = NULL;
@@ -385,6 +393,7 @@ void __drm_atomic_helper_plane_destroy_state(struct 
drm_plane_state *state)
drm_crtc_commit_put(state->commit);
 
drm_property_blob_put(state->fb_damage_clips);
+   drm_property_blob_put(state->solid_fill_blob);
 }
 EXPORT_SYMBOL(__drm_atomic_helper_plane_destroy_state);
 
diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
b/drivers/gpu/drm/drm_atomic_uapi.c
index 46c78b87803d..576acb34195f 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -316,6 +316,20 @@ drm_atomic_set_crtc_for_connector(struct 
drm_connector_state *conn_state,
 }
 EXPORT_SYMBOL(drm_atomic_set_crtc_for_connector);
 
+static void drm_atomic_set_solid_fill_prop(struct drm_plane_state *state)
+{
+   struct drm_mode_solid_fill *user_info;
+
+   if (!state->solid_fill_blob)
+   return;
+
+   user_info = (struct drm_mode_solid_fill *)state->solid_fill_blob->data;
+
+   state->solid_fill.r = user_info->r;
+   state->solid_fill.g = user_info->g;
+   state->solid_fill.b = user_info->b;
+}
+
 static void set_out_fence_for_crtc(struct drm_atomic_state *state,
   struct drm_crtc *crtc, s32 __user *fence_ptr)
 {
@@ -564,6 +578,15 @@ static int drm_atomic_plane_set_property(struct drm_plane 
*plane,
state->src_h = val;
} else if (property == plane->pixel_source_property) {
state->pixel_source = val;
+   } else if (property == plane->solid_fill_property) {
+   ret = drm_atomic_replace_property_blob_from_id(dev,
+   >solid_fill_blob,
+   val, sizeof(struct drm_mode_solid_fill),
+   -1, );
+   if (ret)
+   return ret;
+
+   drm_atomic_set_solid_fill_prop(state);
} else if (property == plane->alpha_property) {
state->alpha = val;
} else if (property == plane->blend_mode_property) {
@@ -638,6 +661,9 @@ drm_atomic_plane_get_property(struct drm_plane *plane,
*val = state->src_h;
} else if (property == plane->pixel_source_property) {
*val = state->pixel_source;
+   } else if (property == plane->solid_fill_property) {
+   *val = state->solid_fill_blob ?
+   state->solid_fill_blob->base.id : 0;
} else if (property == plane->alpha_property) {
*val = state->alpha;
} else if (property == plane->blend_mode_property) {
diff --git a/drivers/gpu/drm/drm_blend.c b/drivers/gpu/drm/drm_blend.c
index fce734cdb85b..665c5d9b056f 100644
--- a/drivers/gpu/drm/drm_blend.c
+++ b/drivers/gpu/drm/drm_blend.c
@@ -204,6 +204,10 @@
  * "FB":
  * Framebuffer source set by the "FB_ID" property.
  *
+ * solid_fill:
+ * solid_fill is set up with 

[PATCH v2 4/4] drm/i915/mtl: Add module parameter override for Wa_16019325821/Wa_14019159160

2023-10-27 Thread John . C . Harrison
From: John Harrison 

These w/a's can have signficant performance implications for any
workload which uses both RCS and CCS. On the other hand, the hang
itself is only seen in one or two very specific workloads. So add a
module parameter to control whether the w/a's are enabled or not and
default to not.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.c| 3 ++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 3 ++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 3 ++-
 drivers/gpu/drm/i915/i915_params.c| 3 +++
 drivers/gpu/drm/i915/i915_params.h| 1 +
 5 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 6252f32d67011..4c89983b1e907 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -296,7 +296,8 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
 
/* Wa_16019325821 */
/* Wa_14019159160 */
-   if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71)))
+   if (gt->i915->params.enable_mtl_rcs_ccs_wa &&
+   IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71)))
flags |= GUC_WA_RCS_CCS_SWITCHOUT;
 
/*
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 8f7298cbbc322..78757e78bce88 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -845,7 +845,8 @@ static void guc_waklv_init(struct intel_guc *guc)
remain = guc_ads_waklv_size(guc);
 
/* Wa_14019159160 */
-   if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))) {
+   if (gt->i915->params.enable_mtl_rcs_ccs_wa &&
+   IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))) {
size = guc_waklv_ra_mode(guc, offset, remain);
offset += size;
remain -= size;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 225812b299524..4de54a100c451 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -4384,7 +4384,8 @@ static void guc_default_vfuncs(struct intel_engine_cs 
*engine)
 
/* Wa_16019325821 */
/* Wa_14019159160 */
-   if ((engine->class == COMPUTE_CLASS || engine->class == RENDER_CLASS) &&
+   if (engine->i915->params.enable_mtl_rcs_ccs_wa &&
+   (engine->class == COMPUTE_CLASS || engine->class == RENDER_CLASS) &&
IS_GFX_GT_IP_RANGE(engine->gt, IP_VER(12, 70), IP_VER(12, 71)))
engine->flags |= I915_ENGINE_USES_WA_HOLD_SWITCHOUT;
 
diff --git a/drivers/gpu/drm/i915/i915_params.c 
b/drivers/gpu/drm/i915/i915_params.c
index de43048543e8b..1004171d99943 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -134,6 +134,9 @@ i915_param_named_unsafe(lmem_size, uint, 0400,
 i915_param_named_unsafe(lmem_bar_size, uint, 0400,
"Set the lmem bar size(in MiB).");
 
+i915_param_named(enable_mtl_rcs_ccs_wa, bool, 0400,
+   "Enable the RCS/CCS switchout hold workaround for MTL (only some 
workloads are affected by issue and w/a has a performance penalty) 
(default:false)");
+
 static void _param_print_bool(struct drm_printer *p, const char *name,
  bool val)
 {
diff --git a/drivers/gpu/drm/i915/i915_params.h 
b/drivers/gpu/drm/i915/i915_params.h
index 1315d7fac850f..971a765d74f56 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -62,6 +62,7 @@ struct drm_printer;
param(unsigned int, lmem_size, 0, 0400) \
param(unsigned int, lmem_bar_size, 0, 0400) \
/* leave bools at the end to not create holes */ \
+   param(bool, enable_mtl_rcs_ccs_wa, false, 0x400) \
param(bool, enable_hangcheck, true, 0600) \
param(bool, error_capture, true, 
IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR) ? 0600 : 0) \
param(bool, enable_gvt, false, IS_ENABLED(CONFIG_DRM_I915_GVT) ? 0400 : 
0)
-- 
2.41.0



[PATCH v2 3/4] drm/i915/guc: Enable Wa_14019159160

2023-10-27 Thread John . C . Harrison
From: John Harrison 

Use the new w/a KLV support to enable a MTL w/a. Note, this w/a is a
super-set of Wa_16019325821, so requires turning that one as well as
setting the new flag for Wa_14019159160 itself.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c  |  3 ++
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  1 +
 drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h |  7 
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|  1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 34 ++-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  1 +
 6 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 9cccd60a5c41d..359b21fb02ab2 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -744,6 +744,7 @@ static u32 *gen12_emit_preempt_busywait(struct i915_request 
*rq, u32 *cs)
 
 /* Wa_14014475959:dg2 */
 /* Wa_16019325821 */
+/* Wa_14019159160 */
 #define HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET 0x540
 static u32 hold_switchout_semaphore_offset(struct i915_request *rq)
 {
@@ -753,6 +754,7 @@ static u32 hold_switchout_semaphore_offset(struct 
i915_request *rq)
 
 /* Wa_14014475959:dg2 */
 /* Wa_16019325821 */
+/* Wa_14019159160 */
 static u32 *hold_switchout_emit_wa_busywait(struct i915_request *rq, u32 *cs)
 {
int i;
@@ -793,6 +795,7 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *rq, 
u32 *cs)
 
/* Wa_14014475959:dg2 */
/* Wa_16019325821 */
+   /* Wa_14019159160 */
if (intel_engine_uses_wa_hold_switchout(rq->engine))
cs = hold_switchout_emit_wa_busywait(rq, cs);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index f08739d020332..3b4993955a4b6 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -695,6 +695,7 @@ intel_engine_has_relative_mmio(const struct intel_engine_cs 
* const engine)
 
 /* Wa_14014475959:dg2 */
 /* Wa_16019325821 */
+/* Wa_14019159160 */
 static inline bool
 intel_engine_uses_wa_hold_switchout(struct intel_engine_cs *engine)
 {
diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h
index 58012edd4eb0e..bebf28e3c4794 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h
@@ -101,4 +101,11 @@ enum {
GUC_CONTEXT_POLICIES_KLV_NUM_IDS = 5,
 };
 
+/*
+ * Workaround keys:
+ */
+enum {
+   GUC_WORKAROUND_KLV_SERIALIZED_RA_MODE   = 
0x9001,
+};
+
 #endif /* _ABI_GUC_KLVS_ABI_H */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 0e6c160de3315..6252f32d67011 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -295,6 +295,7 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
flags |= GUC_WA_HOLD_CCS_SWITCHOUT;
 
/* Wa_16019325821 */
+   /* Wa_14019159160 */
if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71)))
flags |= GUC_WA_RCS_CCS_SWITCHOUT;
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 251e7a7a05cb8..8f7298cbbc322 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -810,6 +810,25 @@ guc_capture_prep_lists(struct intel_guc *guc)
return PAGE_ALIGN(total_size);
 }
 
+/* Wa_14019159160 */
+static u32 guc_waklv_ra_mode(struct intel_guc *guc, u32 offset, u32 remain)
+{
+   u32 size;
+   u32 klv_entry[] = {
+   /* 16:16 key/length */
+   FIELD_PREP(GUC_KLV_0_KEY, 
GUC_WORKAROUND_KLV_SERIALIZED_RA_MODE) |
+   FIELD_PREP(GUC_KLV_0_LEN, 0),
+   /* 0 dwords data */
+   };
+
+   size = sizeof(klv_entry);
+   GEM_BUG_ON(remain < size);
+
+   iosys_map_memcpy_to(>ads_map, offset, klv_entry, size);
+
+   return size;
+}
+
 static void guc_waklv_init(struct intel_guc *guc)
 {
struct intel_gt *gt = guc_to_gt(guc);
@@ -825,15 +844,12 @@ static void guc_waklv_init(struct intel_guc *guc)
offset = guc_ads_waklv_offset(guc);
remain = guc_ads_waklv_size(guc);
 
-   /*
-* Add workarounds here:
-*
-* if (want_wa_) {
-*  size = guc_waklv_(guc, offset, remain);
-*  offset += size;
-*  remain -= size;
-* }
-*/
+   /* Wa_14019159160 */
+   if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))) {
+   size = guc_waklv_ra_mode(guc, offset, remain);
+   offset += size;
+   remain -= size;
+   }
 
size = guc_ads_waklv_size(guc) - remain;
if (!size)
diff --git 

[PATCH v2 1/4] drm/i915: Enable Wa_16019325821

2023-10-27 Thread John . C . Harrison
From: John Harrison 

Some platforms require holding RCS context switches until CCS is idle
(the reverse w/a of Wa_14014475959). Some platforms require both
versions.

Signed-off-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c  | 19 +++
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  7 ---
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|  4 
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  3 ++-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  7 ++-
 5 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 86a04afff64b3..9cccd60a5c41d 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -743,21 +743,23 @@ static u32 *gen12_emit_preempt_busywait(struct 
i915_request *rq, u32 *cs)
 }
 
 /* Wa_14014475959:dg2 */
-#define CCS_SEMAPHORE_PPHWSP_OFFSET0x540
-static u32 ccs_semaphore_offset(struct i915_request *rq)
+/* Wa_16019325821 */
+#define HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET 0x540
+static u32 hold_switchout_semaphore_offset(struct i915_request *rq)
 {
return i915_ggtt_offset(rq->context->state) +
-   (LRC_PPHWSP_PN * PAGE_SIZE) + CCS_SEMAPHORE_PPHWSP_OFFSET;
+   (LRC_PPHWSP_PN * PAGE_SIZE) + 
HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET;
 }
 
 /* Wa_14014475959:dg2 */
-static u32 *ccs_emit_wa_busywait(struct i915_request *rq, u32 *cs)
+/* Wa_16019325821 */
+static u32 *hold_switchout_emit_wa_busywait(struct i915_request *rq, u32 *cs)
 {
int i;
 
*cs++ = MI_ATOMIC_INLINE | MI_ATOMIC_GLOBAL_GTT | MI_ATOMIC_CS_STALL |
MI_ATOMIC_MOVE;
-   *cs++ = ccs_semaphore_offset(rq);
+   *cs++ = hold_switchout_semaphore_offset(rq);
*cs++ = 0;
*cs++ = 1;
 
@@ -773,7 +775,7 @@ static u32 *ccs_emit_wa_busywait(struct i915_request *rq, 
u32 *cs)
MI_SEMAPHORE_POLL |
MI_SEMAPHORE_SAD_EQ_SDD;
*cs++ = 0;
-   *cs++ = ccs_semaphore_offset(rq);
+   *cs++ = hold_switchout_semaphore_offset(rq);
*cs++ = 0;
 
return cs;
@@ -790,8 +792,9 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *rq, 
u32 *cs)
cs = gen12_emit_preempt_busywait(rq, cs);
 
/* Wa_14014475959:dg2 */
-   if (intel_engine_uses_wa_hold_ccs_switchout(rq->engine))
-   cs = ccs_emit_wa_busywait(rq, cs);
+   /* Wa_16019325821 */
+   if (intel_engine_uses_wa_hold_switchout(rq->engine))
+   cs = hold_switchout_emit_wa_busywait(rq, cs);
 
rq->tail = intel_ring_offset(rq, cs);
assert_ring_tail_valid(rq->ring, rq->tail);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 8769760257fd9..f08739d020332 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -584,7 +584,7 @@ struct intel_engine_cs {
 #define I915_ENGINE_HAS_RCS_REG_STATE  BIT(9)
 #define I915_ENGINE_HAS_EU_PRIORITYBIT(10)
 #define I915_ENGINE_FIRST_RENDER_COMPUTE BIT(11)
-#define I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT BIT(12)
+#define I915_ENGINE_USES_WA_HOLD_SWITCHOUT BIT(12)
unsigned int flags;
 
/*
@@ -694,10 +694,11 @@ intel_engine_has_relative_mmio(const struct 
intel_engine_cs * const engine)
 }
 
 /* Wa_14014475959:dg2 */
+/* Wa_16019325821 */
 static inline bool
-intel_engine_uses_wa_hold_ccs_switchout(struct intel_engine_cs *engine)
+intel_engine_uses_wa_hold_switchout(struct intel_engine_cs *engine)
 {
-   return engine->flags & I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT;
+   return engine->flags & I915_ENGINE_USES_WA_HOLD_SWITCHOUT;
 }
 
 #endif /* __INTEL_ENGINE_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 3f3df1166b860..0e6c160de3315 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -294,6 +294,10 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
IS_DG2(gt->i915))
flags |= GUC_WA_HOLD_CCS_SWITCHOUT;
 
+   /* Wa_16019325821 */
+   if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71)))
+   flags |= GUC_WA_RCS_CCS_SWITCHOUT;
+
/*
 * Wa_14012197797
 * Wa_22011391025
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
index 8ae1846431da7..48863188a130e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -96,8 +96,9 @@
 #define   GUC_WA_GAM_CREDITS   BIT(10)
 #define   GUC_WA_DUAL_QUEUEBIT(11)
 #define   GUC_WA_RCS_RESET_BEFORE_RC6  BIT(13)
-#define   GUC_WA_CONTEXT_ISOLATION BIT(15)
 #define   GUC_WA_PRE_PARSERBIT(14)
+#define   GUC_WA_CONTEXT_ISOLATION BIT(15)
+#define   GUC_WA_RCS_CCS_SWITCHOUT 

[PATCH v2 2/4] drm/i915/guc: Add support for w/a KLVs

2023-10-27 Thread John . C . Harrison
From: John Harrison 

To prevent running out of bits, new w/a enable flags are being added
via a KLV system instead of a 32 bit flags word.

Signed-off-by: John Harrison 
---
 .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h   |  1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  2 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 73 ++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c |  6 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  5 +-
 5 files changed, 85 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
index dabeaf4f245f3..00d6402333f8e 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
@@ -36,6 +36,7 @@ enum intel_guc_load_status {
INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_START,
INTEL_GUC_LOAD_STATUS_MPU_DATA_INVALID = 0x73,
INTEL_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID   = 0x74,
+   INTEL_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR= 0x75,
INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_END,
 
INTEL_GUC_LOAD_STATUS_READY= 0xF0,
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 2b6dfe62c8f2a..4113776ff3e19 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -198,6 +198,8 @@ struct intel_guc {
struct guc_mmio_reg *ads_regset;
/** @ads_golden_ctxt_size: size of the golden contexts in the ADS */
u32 ads_golden_ctxt_size;
+   /** @ads_waklv_size: size of workaround KLVs */
+   u32 ads_waklv_size;
/** @ads_capture_size: size of register lists in the ADS used for error 
capture */
u32 ads_capture_size;
/** @ads_engine_usage_size: size of engine usage in the ADS */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 63724e17829a7..251e7a7a05cb8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -46,6 +46,10 @@
  *  +---+
  *  | padding   |
  *  +---+ <== 4K aligned
+ *  | w/a KLVs  |
+ *  +---+
+ *  | padding   |
+ *  +---+ <== 4K aligned
  *  | capture lists |
  *  +---+
  *  | padding   |
@@ -88,6 +92,11 @@ static u32 guc_ads_golden_ctxt_size(struct intel_guc *guc)
return PAGE_ALIGN(guc->ads_golden_ctxt_size);
 }
 
+static u32 guc_ads_waklv_size(struct intel_guc *guc)
+{
+   return PAGE_ALIGN(guc->ads_waklv_size);
+}
+
 static u32 guc_ads_capture_size(struct intel_guc *guc)
 {
return PAGE_ALIGN(guc->ads_capture_size);
@@ -113,7 +122,7 @@ static u32 guc_ads_golden_ctxt_offset(struct intel_guc *guc)
return PAGE_ALIGN(offset);
 }
 
-static u32 guc_ads_capture_offset(struct intel_guc *guc)
+static u32 guc_ads_waklv_offset(struct intel_guc *guc)
 {
u32 offset;
 
@@ -123,6 +132,16 @@ static u32 guc_ads_capture_offset(struct intel_guc *guc)
return PAGE_ALIGN(offset);
 }
 
+static u32 guc_ads_capture_offset(struct intel_guc *guc)
+{
+   u32 offset;
+
+   offset = guc_ads_waklv_offset(guc) +
+guc_ads_waklv_size(guc);
+
+   return PAGE_ALIGN(offset);
+}
+
 static u32 guc_ads_private_data_offset(struct intel_guc *guc)
 {
u32 offset;
@@ -791,6 +810,49 @@ guc_capture_prep_lists(struct intel_guc *guc)
return PAGE_ALIGN(total_size);
 }
 
+static void guc_waklv_init(struct intel_guc *guc)
+{
+   struct intel_gt *gt = guc_to_gt(guc);
+   u32 offset, addr_ggtt, remain, size;
+
+   if (!intel_uc_uses_guc_submission(>uc))
+   return;
+
+   if (GUC_FIRMWARE_VER(guc) < MAKE_GUC_VER(70, 10, 0))
+   return;
+
+   GEM_BUG_ON(iosys_map_is_null(>ads_map));
+   offset = guc_ads_waklv_offset(guc);
+   remain = guc_ads_waklv_size(guc);
+
+   /*
+* Add workarounds here:
+*
+* if (want_wa_) {
+*  size = guc_waklv_(guc, offset, remain);
+*  offset += size;
+*  remain -= size;
+* }
+*/
+
+   size = guc_ads_waklv_size(guc) - remain;
+   if (!size)
+   return;
+
+   offset = guc_ads_waklv_offset(guc);
+   addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
+
+   ads_blob_write(guc, ads.wa_klv_addr_lo, addr_ggtt);
+   ads_blob_write(guc, ads.wa_klv_addr_hi, 0);
+   ads_blob_write(guc, ads.wa_klv_size, size);
+}
+
+static int guc_prep_waklv(struct intel_guc *guc)
+{
+   /* Fudge 

[PATCH v2 0/4] Enable Wa_14019159160 and Wa_16019325821 for MTL

2023-10-27 Thread John . C . Harrison
From: John Harrison 

Enable Wa_14019159160 and  Wa_16019325821 for MTL

RCS/CCS workarounds for MTL.

v2: Fix bug in WA KLV implementation (offset not being reset to start
of list). Add better comment to prep patch about how KLVs can be added.
Add a module parameter override and disable the w/a by default as it
causes performance regressions and is only required by very specific
workloads.

Signed-off-by: John Harrison 


John Harrison (4):
  drm/i915: Enable Wa_16019325821
  drm/i915/guc: Add support for w/a KLVs
  drm/i915/guc: Enable Wa_14019159160
  drm/i915/mtl: Add module parameter override for
Wa_16019325821/Wa_14019159160

 drivers/gpu/drm/i915/gt/gen8_engine_cs.c  | 22 +++--
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  8 +-
 .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h   |  1 +
 drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h |  7 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc.c|  6 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  2 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 90 ++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c |  6 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  8 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  9 +-
 drivers/gpu/drm/i915/i915_params.c|  3 +
 drivers/gpu/drm/i915/i915_params.h|  1 +
 12 files changed, 148 insertions(+), 15 deletions(-)

-- 
2.41.0



Re: [Intel-gfx] [PATCH 3/4] drm/i915/guc: Add support for w/a KLVs

2023-10-27 Thread John Harrison

On 10/6/2023 17:38, Belgaumkar, Vinay wrote:

On 9/15/2023 2:55 PM, john.c.harri...@intel.com wrote:

From: John Harrison 

To prevent running out of bits, new w/a enable flags are being added
via a KLV system instead of a 32 bit flags word.

Signed-off-by: John Harrison 
---
  .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h   |  1 +
  drivers/gpu/drm/i915/gt/uc/intel_guc.h    |  3 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    | 64 ++-
  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c |  6 ++
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  5 +-
  5 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h

index dabeaf4f245f3..00d6402333f8e 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h
@@ -36,6 +36,7 @@ enum intel_guc_load_status {
  INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_START,
  INTEL_GUC_LOAD_STATUS_MPU_DATA_INVALID = 0x73,
  INTEL_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID   = 0x74,
+    INTEL_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR    = 0x75,
  INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_END,
    INTEL_GUC_LOAD_STATUS_READY    = 0xF0,
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h

index 6c392bad29c19..3b1fc5f96306b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -186,6 +186,8 @@ struct intel_guc {
  struct guc_mmio_reg *ads_regset;
  /** @ads_golden_ctxt_size: size of the golden contexts in the 
ADS */

  u32 ads_golden_ctxt_size;
+    /** @ads_waklv_size: size of workaround KLVs */
+    u32 ads_waklv_size;
  /** @ads_capture_size: size of register lists in the ADS used 
for error capture */

  u32 ads_capture_size;
  /** @ads_engine_usage_size: size of engine usage in the ADS */
@@ -295,6 +297,7 @@ struct intel_guc {
  #define MAKE_GUC_VER(maj, min, pat)    (((maj) << 16) | ((min) << 
8) | (pat))
  #define MAKE_GUC_VER_STRUCT(ver)    MAKE_GUC_VER((ver).major, 
(ver).minor, (ver).patch)
  #define GUC_SUBMIT_VER(guc) 
MAKE_GUC_VER_STRUCT((guc)->submission_version)
+#define GUC_FIRMWARE_VER(guc) 
MAKE_GUC_VER_STRUCT((guc)->fw.file_selected.ver)
    static inline struct intel_guc *log_to_guc(struct intel_guc_log 
*log)

  {
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c

index 63724e17829a7..792910af3a481 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -46,6 +46,10 @@
   *  +---+
   *  | padding   |
   *  +---+ <== 4K aligned
+ *  | w/a KLVs  |
+ *  +---+
+ *  | padding   |
+ *  +---+ <== 4K aligned
   *  | capture lists |
   *  +---+
   *  | padding   |
@@ -88,6 +92,11 @@ static u32 guc_ads_golden_ctxt_size(struct 
intel_guc *guc)

  return PAGE_ALIGN(guc->ads_golden_ctxt_size);
  }
  +static u32 guc_ads_waklv_size(struct intel_guc *guc)
+{
+    return PAGE_ALIGN(guc->ads_waklv_size);
+}
+
  static u32 guc_ads_capture_size(struct intel_guc *guc)
  {
  return PAGE_ALIGN(guc->ads_capture_size);
@@ -113,7 +122,7 @@ static u32 guc_ads_golden_ctxt_offset(struct 
intel_guc *guc)

  return PAGE_ALIGN(offset);
  }
  -static u32 guc_ads_capture_offset(struct intel_guc *guc)
+static u32 guc_ads_waklv_offset(struct intel_guc *guc)
  {
  u32 offset;
  @@ -123,6 +132,16 @@ static u32 guc_ads_capture_offset(struct 
intel_guc *guc)

  return PAGE_ALIGN(offset);
  }
  +static u32 guc_ads_capture_offset(struct intel_guc *guc)
+{
+    u32 offset;
+
+    offset = guc_ads_waklv_offset(guc) +
+ guc_ads_waklv_size(guc);
+
+    return PAGE_ALIGN(offset);
+}
+
  static u32 guc_ads_private_data_offset(struct intel_guc *guc)
  {
  u32 offset;
@@ -791,6 +810,40 @@ guc_capture_prep_lists(struct intel_guc *guc)
  return PAGE_ALIGN(total_size);
  }
  +static void guc_waklv_init(struct intel_guc *guc)
+{
+    struct intel_gt *gt = guc_to_gt(guc);
+    u32 offset, addr_ggtt, remain, size;
+
+    if (!intel_uc_uses_guc_submission(>uc))
+    return;
+
+    if (GUC_FIRMWARE_VER(guc) < MAKE_GUC_VER(70, 10, 0))
+    return;

should this be <= ?
No. GuC 70.10.0 is when w/a KLVs were introduced. So we want to skip on 
any version that is prior to 70.10.0.



+
+    GEM_BUG_ON(iosys_map_is_null(>ads_map));
+    offset = guc_ads_waklv_offset(guc);
+    remain = guc_ads_waklv_size(guc);
+
+    /* Add workarounds here */
+

extra blank line?
The 

Re: [Intel-gfx] [PATCH 2/4] drm/i915: Enable Wa_16019325821

2023-10-27 Thread John Harrison

On 10/6/2023 17:10, Belgaumkar, Vinay wrote:

On 9/15/2023 2:55 PM, john.c.harri...@intel.com wrote:

From: John Harrison 

Some platforms require holding RCS context switches until CCS is idle
(the reverse w/a of Wa_14014475959). Some platforms require both
versions.

Signed-off-by: John Harrison 
---
  drivers/gpu/drm/i915/gt/gen8_engine_cs.c  | 19 +++
  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  7 ---
  drivers/gpu/drm/i915/gt/uc/intel_guc.c    |  4 
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  3 ++-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  8 +++-
  5 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c

index 0143445dba830..8b494825c55f2 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -733,21 +733,23 @@ static u32 *gen12_emit_preempt_busywait(struct 
i915_request *rq, u32 *cs)

  }
    /* Wa_14014475959:dg2 */
-#define CCS_SEMAPHORE_PPHWSP_OFFSET    0x540
-static u32 ccs_semaphore_offset(struct i915_request *rq)
+/* Wa_16019325821 */
+#define HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET    0x540
+static u32 hold_switchout_semaphore_offset(struct i915_request *rq)
  {
  return i915_ggtt_offset(rq->context->state) +
-    (LRC_PPHWSP_PN * PAGE_SIZE) + CCS_SEMAPHORE_PPHWSP_OFFSET;
+    (LRC_PPHWSP_PN * PAGE_SIZE) + 
HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET;

  }
    /* Wa_14014475959:dg2 */
-static u32 *ccs_emit_wa_busywait(struct i915_request *rq, u32 *cs)
+/* Wa_16019325821 */
+static u32 *hold_switchout_emit_wa_busywait(struct i915_request *rq, 
u32 *cs)

  {
  int i;
    *cs++ = MI_ATOMIC_INLINE | MI_ATOMIC_GLOBAL_GTT | 
MI_ATOMIC_CS_STALL |

  MI_ATOMIC_MOVE;
-    *cs++ = ccs_semaphore_offset(rq);
+    *cs++ = hold_switchout_semaphore_offset(rq);
  *cs++ = 0;
  *cs++ = 1;
  @@ -763,7 +765,7 @@ static u32 *ccs_emit_wa_busywait(struct 
i915_request *rq, u32 *cs)

  MI_SEMAPHORE_POLL |
  MI_SEMAPHORE_SAD_EQ_SDD;
  *cs++ = 0;
-    *cs++ = ccs_semaphore_offset(rq);
+    *cs++ = hold_switchout_semaphore_offset(rq);
  *cs++ = 0;
    return cs;
@@ -780,8 +782,9 @@ gen12_emit_fini_breadcrumb_tail(struct 
i915_request *rq, u32 *cs)

  cs = gen12_emit_preempt_busywait(rq, cs);
    /* Wa_14014475959:dg2 */
-    if (intel_engine_uses_wa_hold_ccs_switchout(rq->engine))
-    cs = ccs_emit_wa_busywait(rq, cs);
+    /* Wa_16019325821 */
+    if (intel_engine_uses_wa_hold_switchout(rq->engine))
+    cs = hold_switchout_emit_wa_busywait(rq, cs);
    rq->tail = intel_ring_offset(rq, cs);
  assert_ring_tail_valid(rq->ring, rq->tail);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h

index a7e6775980043..68fe1cef9cd94 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -573,7 +573,7 @@ struct intel_engine_cs {
  #define I915_ENGINE_HAS_RCS_REG_STATE  BIT(9)
  #define I915_ENGINE_HAS_EU_PRIORITY    BIT(10)
  #define I915_ENGINE_FIRST_RENDER_COMPUTE BIT(11)
-#define I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT BIT(12)
+#define I915_ENGINE_USES_WA_HOLD_SWITCHOUT BIT(12)
  unsigned int flags;
    /*
@@ -683,10 +683,11 @@ intel_engine_has_relative_mmio(const struct 
intel_engine_cs * const engine)

  }
    /* Wa_14014475959:dg2 */
+/* Wa_16019325821 */
  static inline bool
-intel_engine_uses_wa_hold_ccs_switchout(struct intel_engine_cs *engine)
+intel_engine_uses_wa_hold_switchout(struct intel_engine_cs *engine)
  {
-    return engine->flags & I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT;
+    return engine->flags & I915_ENGINE_USES_WA_HOLD_SWITCHOUT;
  }
    #endif /* __INTEL_ENGINE_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c

index 27df41c53b890..4001679ba0793 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -294,6 +294,10 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
  IS_DG2(gt->i915))
  flags |= GUC_WA_HOLD_CCS_SWITCHOUT;
  +    /* Wa_16019325821 */
+    if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71)))
+    flags |= GUC_WA_RCS_CCS_SWITCHOUT;
+
  /*
   * Wa_14012197797
   * Wa_22011391025
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h

index b4d56eccfb1f0..f97af0168a66b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h
@@ -95,8 +95,9 @@
  #define   GUC_WA_GAM_CREDITS    BIT(10)
  #define   GUC_WA_DUAL_QUEUE    BIT(11)
  #define   GUC_WA_RCS_RESET_BEFORE_RC6    BIT(13)
-#define   GUC_WA_CONTEXT_ISOLATION    BIT(15)
  #define   GUC_WA_PRE_PARSER    BIT(14)
+#define   GUC_WA_CONTEXT_ISOLATION    BIT(15)
+#define   GUC_WA_RCS_CCS_SWITCHOUT    

Re: [PATCH v2 1/2] drm/msm/dp: don't touch DP subconnector property in eDP case

2023-10-27 Thread Abhinav Kumar




On 10/25/2023 2:23 AM, Dmitry Baryshkov wrote:

From: Abel Vesa 

In case of the eDP connection there is no subconnetor and as such no
subconnector property. Put drm_dp_set_subconnector_property() calls
under the !is_edp condition.

Fixes: bfcc3d8f94f4 ("drm/msm/dp: support setting the DP subconnector type")
Signed-off-by: Abel Vesa 
Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/dp/dp_display.c | 15 ++-
  1 file changed, 10 insertions(+), 5 deletions(-)



We still need to unify the calls to drm_dp_set_subconnector_property() 
for the hpd connect/disconnect places preferably in 
dp_display_send_hpd_notification().


That way, we would have had to make this change only in one location.

If you want to pursue that as a separate patch, I am fine as well.

Hence,

Reviewed-by: Abhinav Kumar 


Re: [PATCH] dt-bindings: display: ssd132x: Remove '-' before compatible enum

2023-10-27 Thread Javier Martinez Canillas
Javier Martinez Canillas  writes:

[...]

>>> Pushed to drm-misc (drm-misc-next). Thanks!
>>
>> Given what introduced this is before the drm-misc-next-2023-10-19 tag, 
>> isn't it going into 6.7 and needs to be in the fixes branch? Though that 
>> doesn't exist yet for 6.7 fixes. I don't understand why that's not done 
>> as part of the last tag for a cycle. But drm-misc is special.
>>
>
> I pushed to drm-misc-next because I thought that there will be a last PR
> of drm-misc-next for 6.7, but it seems I missed it for a couple of hours
> (that is drm-misc-next-2023-10-27) :(
>
> https://lists.freedesktop.org/archives/dri-devel/2023-October/425698.html
>
> The solution now is to cherry-pick the fixes already in drm-misc-next to
> drm-misc-next-fixes, to make sure that land in 6.7. I can do that once
> drm-next is back merged to that branch, if you think the schema warning
> fix must land in 6.7 and can't wait for the next release.
>

Or wait for the drm-misc-fixes branch to be back merged and land it as a
part of the 6.7-rc cycle. I'll do that since it would cause less trouble
to the drm-misc maintainers.

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH] dt-bindings: display: ssd132x: Remove '-' before compatible enum

2023-10-27 Thread Javier Martinez Canillas
Rob Herring  writes:

Hello Rob,

> On Fri, Oct 27, 2023 at 11:30:56AM +0200, Javier Martinez Canillas wrote:
>> Rob Herring  writes:
>> 
>> > On Sat, 21 Oct 2023 00:30:17 +0200, Javier Martinez Canillas wrote:
>> >> This is a leftover from when the binding schema had the compatible string
>> >> property enum as a 'oneOf' child and the '-' was not removed when 'oneOf'
>> >> got dropped during the binding review process.
>> >> 
>> >> Reported-by: Rob Herring 
>> >> Closes: 
>> >> https://lore.kernel.org/dri-devel/cal_jsq+h8dcnpkqhokqoodcc8+qi3m0prxrfkz_y4v37ymj...@mail.gmail.com/
>> >> Signed-off-by: Javier Martinez Canillas 
>> >> ---
>> >> 
>> >>  .../devicetree/bindings/display/solomon,ssd132x.yaml  | 8 
>> >>  1 file changed, 4 insertions(+), 4 deletions(-)
>> >> 
>> >
>> > Reviewed-by: Rob Herring 
>> >
>> 
>> Pushed to drm-misc (drm-misc-next). Thanks!
>
> Given what introduced this is before the drm-misc-next-2023-10-19 tag, 
> isn't it going into 6.7 and needs to be in the fixes branch? Though that 
> doesn't exist yet for 6.7 fixes. I don't understand why that's not done 
> as part of the last tag for a cycle. But drm-misc is special.
>

I pushed to drm-misc-next because I thought that there will be a last PR
of drm-misc-next for 6.7, but it seems I missed it for a couple of hours
(that is drm-misc-next-2023-10-27) :(

https://lists.freedesktop.org/archives/dri-devel/2023-October/425698.html

The solution now is to cherry-pick the fixes already in drm-misc-next to
drm-misc-next-fixes, to make sure that land in 6.7. I can do that once
drm-next is back merged to that branch, if you think the schema warning
fix must land in 6.7 and can't wait for the next release.

> Rob
>

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



[pull] amdgpu, amdkfd drm-next-6.7

2023-10-27 Thread Alex Deucher
Hi Dave, Sima,

Fixes for 6.7.

The following changes since commit 5258dfd4a6adb5f45f046b0dd2e73c680f880d9d:

  usb: typec: altmodes/displayport: fixup drm internal api change vs new user. 
(2023-10-27 07:55:41 +1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-next-6.7-2023-10-27

for you to fetch changes up to dd3dd9829bf9a4ecd55482050745efdd9f7f97fc:

  drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo (2023-10-27 
14:23:01 -0400)


amd-drm-next-6.7-2023-10-27:

amdgpu:
- RAS fixes
- Seamless boot fixes
- NBIO 7.7 fix
- SMU 14.0 fixes
- GC 11.5 fixes
- DML2 fixes
- ASPM fixes
- VPE fixes
- Misc code cleanups
- SRIOV fixes
- Add some missing copyright notices
- DCN 3.5 fixes
- FAMS fixes
- Backlight fix
- S/G display fix
- fdinfo cleanups
- EXT_COHERENT fixes for APU and NUMA systems

amdkfd:
- Misc fixes
- Misc code cleanups
- SVM fixes


Agustin Gutierrez (1):
  drm/amd/display: Remove power sequencing check

Alex Hung (2):
  drm/amd/display: Revert "drm/amd/display: allow edp updates for virtual 
signal"
  drm/amd/display: Set emulated sink type to HDMI accordingly.

Alvin Lee (1):
  drm/amd/display: Update FAMS sequence for DCN30 & DCN32

Aric Cyr (1):
  drm/amd/display: 3.2.256

Aurabindo Pillai (1):
  drm/amd/display: add interface to query SubVP status

Candice Li (2):
  drm/amdgpu: Identify data parity error corrected in replay mode
  drm/amdgpu: Retrieve CE count from ce_count_lo_chip in EccInfo table

David Francis (1):
  drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems

Fangzhi Zuo (1):
  drm/amd/display: Fix MST Multi-Stream Not Lighting Up on dcn35

George Shen (1):
  drm/amd/display: Update SDP VSC colorimetry from DP test automation 
request

Hamza Mahfooz (1):
  drm/amd/display: fix S/G display enablement

Hugo Hu (1):
  drm/amd/display: reprogram det size while seamless boot

Ilya Bakoulin (1):
  drm/amd/display: Fix shaper using bad LUT params

Iswara Nagulendran (1):
  drm/amd/display: Read before writing Backlight Mode Set Register

James Zhu (1):
  drm/amdxcp: fix amdxcp unloads incompletely

Jesse Zhang (1):
  drm/amdkfd: Fix shift out-of-bounds issue

Jiadong Zhu (2):
  drm/amd/pm: drop unneeded dpm features disablement for SMU 14.0.0
  drm/amdgpu: add tmz support for GC IP v11.5.0

Kenneth Feng (1):
  drm/amd/amdgpu: avoid to disable gfxhub interrupt when driver is unloaded

Lang Yu (1):
  drm/amdgpu/vpe: correct queue stop programing

Li Ma (2):
  drm/amdgpu: modify if condition in nbio_v7_7.c
  drm/amd/amdgpu: fix the GPU power print error in pm info

Lijo Lazar (4):
  drm/amdgpu: Add API to get full IP version
  drm/amdgpu: Use discovery table's subrevision
  drm/amdgpu: Add a read to GFX v9.4.3 ring test
  drm/amdgpu: Use pcie domain of xcc acpi objects

Lin.Cao (2):
  drm/amdgpu remove restriction of sriov max_pfn on Vega10
  drm/amd: check num of link levels when update pcie param

Ma Jun (1):
  drm/amd/pm: Fix the return value in default case

Mario Limonciello (4):
  drm/amd: Disable ASPM for VI w/ all Intel systems
  drm/amd: Disable PP_PCIE_DPM_MASK when dynamic speed switching not 
supported
  drm/amd: Move AMD_IS_APU check for ASPM into top level function
  drm/amd: Explicitly disable ASPM when dynamic switching disabled

Michael Strauss (1):
  drm/amd/display: Disable SYMCLK32_SE RCO on DCN314

Mukul Joshi (1):
  drm/amdgpu: Fix typo in IP discovery parsing

Nicholas Kazlauskas (2):
  drm/amd/display: Revert "Improve x86 and dmub ips handshake"
  drm/amd/display: Fix IPS handshake for idle optimizations

Philip Yang (2):
  Revert "drm/amdkfd:remove unused code"
  Revert "drm/amdkfd: Use partial migrations in GPU page faults"

Qu Huang (1):
  drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL

Rob Clark (1):
  drm/amdgpu: Remove duplicate fdinfo fields

Rodrigo Siqueira (5):
  drm/amd/display: Set the DML2 attribute to false in all DCNs older than 
version 3.5
  drm/amd/display: Fix DMUB errors introduced by DML2
  drm/amd/display: Correct enum typo
  drm/amd/display: Add prefix to amdgpu crtc functions
  drm/amd/display: Add prefix for plane functions

Samson Tam (2):
  drm/amd/display: fix num_ways overflow error
  drm/amd/display: add null check for invalid opps

Srinivasan Shanmugam (1):
  drm/amdkfd: Address 'remap_list' not described in 'svm_range_add'

Stylon Wang (3):
  drm/amd/display: Add missing copyright notice in DMUB
  drm/amd/display: Fix copyright notice in DML2 code
  drm/amd/display: Fix copyright notice in DC code

Sung Joon Kim (2):
  drm/amd/display: Add a check for idle power optimization
  

[PATCH] drm/msm/gem: Add metadata

2023-10-27 Thread Rob Clark
From: Rob Clark 

The EXT_external_objects extension is a bit awkward as it doesn't pass
explicit modifiers, leaving the importer to guess with incomplete
information.  In the case of vk (turnip) exporting and gl (freedreno)
importing, the "OPTIMAL_TILING_EXT" layout depends on VkImageCreateInfo
flags (among other things), which the importer does not know.  Which
unfortunately leaves us with the need for a metadata back-channel.

The contents of the metadata are defined by userspace.  The
EXT_external_objects extension is only required to work between
compatible versions of gl and vk drivers, as defined by device and
driver UUIDs.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_drv.c | 59 +--
 drivers/gpu/drm/msm/msm_gem.h |  4 +++
 include/uapi/drm/msm_drm.h|  2 ++
 3 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index b61ccea05327..8fe2677ea37a 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -37,9 +37,10 @@
  * - 1.9.0 - Add MSM_SUBMIT_FENCE_SN_IN
  * - 1.10.0 - Add MSM_SUBMIT_BO_NO_IMPLICIT
  * - 1.11.0 - Add wait boost (MSM_WAIT_FENCE_BOOST, MSM_PREP_BOOST)
+ * - 1.12.0 - Add MSM_INFO_SET_METADATA and MSM_INFO_GET_METADATA
  */
 #define MSM_VERSION_MAJOR  1
-#define MSM_VERSION_MINOR  10
+#define MSM_VERSION_MINOR  12
 #define MSM_VERSION_PATCHLEVEL 0
 
 static void msm_deinit_vram(struct drm_device *ddev);
@@ -566,6 +567,8 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void 
*data,
break;
case MSM_INFO_SET_NAME:
case MSM_INFO_GET_NAME:
+   case MSM_INFO_SET_METADATA:
+   case MSM_INFO_GET_METADATA:
break;
default:
return -EINVAL;
@@ -618,7 +621,7 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void 
*data,
break;
case MSM_INFO_GET_NAME:
if (args->value && (args->len < strlen(msm_obj->name))) {
-   ret = -EINVAL;
+   ret = -ETOOSMALL;
break;
}
args->len = strlen(msm_obj->name);
@@ -627,6 +630,58 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void 
*data,
 msm_obj->name, args->len))
ret = -EFAULT;
}
+   break;
+   case MSM_INFO_SET_METADATA:
+   /* Impose a moderate upper bound on metadata size: */
+   if (args->len > 128) {
+   ret = -EOVERFLOW;
+   break;
+   }
+
+   ret = msm_gem_lock_interruptible(obj);
+   if (ret)
+   break;
+
+   msm_obj->metadata =
+   krealloc(msm_obj->metadata, args->len, GFP_KERNEL);
+   msm_obj->metadata_size = args->len;
+
+   if (copy_from_user(msm_obj->metadata, 
u64_to_user_ptr(args->value),
+  args->len)) {
+   msm_obj->metadata_size = 0;
+   ret = -EFAULT;
+   }
+
+   msm_gem_unlock(obj);
+
+   break;
+   case MSM_INFO_GET_METADATA:
+   if (!args->value) {
+   /*
+* Querying the size is inherently racey, but
+* EXT_external_objects expects the app to confirm
+* via device and driver UUIDs that the exporter and
+* importer versions match.  All we can do from the
+* kernel side is check the length under obj lock
+* when userspace tries to retrieve the metadata
+*/
+   args->len = msm_obj->metadata_size;
+   break;
+   }
+
+   ret = msm_gem_lock_interruptible(obj);
+   if (ret)
+   break;
+
+   if (args->len < msm_obj->metadata_size) {
+   ret = -ETOOSMALL;
+   } else if (copy_to_user(u64_to_user_ptr(args->value),
+   msm_obj->metadata, args->len)) {
+   ret = -EFAULT;
+   }
+
+   msm_gem_unlock(obj);
+
break;
}
 
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 7f34263048a3..8d414b072c29 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -109,6 +109,10 @@ struct msm_gem_object {
 
char name[32]; /* Identifier to print for the debugfs files */
 
+   /* userspace metadata backchannel */
+   void *metadata;
+   u32 metadata_size;
+
/**
 * pin_count: Number of times the pages are pinned
 *
diff --git a/include/uapi/drm/msm_drm.h 

[PATCH] accel/qaic: Quiet array bounds check on DMA abort message

2023-10-27 Thread Jeffrey Hugo
From: Carl Vanderlip 

Current wrapper is right-sized to the message being transferred;
however, this is smaller than the structure defining message wrappers
since the trailing element is a union of message/transfer headers of
various sizes (8 and 32 bytes on 32-bit system where issue was
reported). Using the smaller header with a small message
(wire_trans_dma_xfer is 24 bytes including header) ends up being smaller
than a wrapper with the larger header. There are no accesses outside of
the defined size, however they are possible if the larger union member
is referenced.

Abort messages are outside of hot-path and changing the wrapper struct
would require a larger rewrite, so having the memory allocated to the
message be 8 bytes too big is acceptable.

Reported-by: kernel test robot 
Closes: 
https://lore.kernel.org/oe-kbuild-all/202310182253.bcb9jcyj-...@intel.com/
Signed-off-by: Carl Vanderlip 
Reviewed-by: Pranjal Ramajor Asha Kanojiya 
Reviewed-by: Jeffrey Hugo 
Signed-off-by: Jeffrey Hugo 
---
 drivers/accel/qaic/qaic_control.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/accel/qaic/qaic_control.c 
b/drivers/accel/qaic/qaic_control.c
index 388abd40024b..84915824be54 100644
--- a/drivers/accel/qaic/qaic_control.c
+++ b/drivers/accel/qaic/qaic_control.c
@@ -1138,7 +1138,7 @@ static int abort_dma_cont(struct qaic_device *qdev, 
struct wrapper_list *wrapper
if (!list_is_first(>list, >list))
kref_put(>ref_count, free_wrapper);
 
-   wrapper = add_wrapper(wrappers, offsetof(struct wrapper_msg, trans) + 
sizeof(*out_trans));
+   wrapper = add_wrapper(wrappers, sizeof(*wrapper));
 
if (!wrapper)
return -ENOMEM;
-- 
2.40.1



RE: [PATCH] drm/radeon: replace 1-element arrays with flexible-array members

2023-10-27 Thread Deucher, Alexander
[Public]

> -Original Message-
> From: José Pekkarinen 
> Sent: Friday, October 27, 2023 12:59 PM
> To: Deucher, Alexander ; Koenig, Christian
> ; Pan, Xinhui ;
> sk...@linuxfoundation.org
> Cc: José Pekkarinen ; airl...@gmail.com;
> dan...@ffwll.ch; amd-...@lists.freedesktop.org; dri-
> de...@lists.freedesktop.org; linux-ker...@vger.kernel.org; linux-kernel-
> ment...@lists.linuxfoundation.org
> Subject: [PATCH] drm/radeon: replace 1-element arrays with flexible-array
> members
>
> Reported by coccinelle, the following patch will move the following 1 element
> arrays to flexible arrays.
>
> drivers/gpu/drm/radeon/atombios.h:5523:32-48: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:5545:32-48: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:5461:34-44: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4447:30-40: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4236:30-41: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7044:24-37: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7054:24-37: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7095:28-45: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7553:8-17: WARNING use flexible-array
> member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7559:8-17: WARNING use flexible-array
> member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:3896:27-37: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:5443:16-25: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:5454:34-43: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4603:21-32: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:6299:32-44: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4628:32-46: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:6285:29-39: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4296:30-36: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4756:28-36: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:4064:22-35: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7327:9-24: WARNING use flexible-array
> member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> drivers/gpu/drm/radeon/atombios.h:7332:32-53: WARNING use flexible-
> array member instead
> (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-
> length-and-one-element-arrays)
> 

Re: [PATCH v2] i915/perf: Fix NULL deref bugs with drm_dbg() calls

2023-10-27 Thread Tvrtko Ursulin



On 27/10/2023 18:28, Harshit Mogalapalli wrote:

When i915 perf interface is not available dereferencing it will lead to
NULL dereferences.

As returning -ENOTSUPP is pretty clear return when perf interface is not
available.

Fixes: 2fec539112e8 ("i915/perf: Replace DRM_DEBUG with driver specific drm_dbg 
call")
Suggested-by: Tvrtko Ursulin 
Signed-off-by: Harshit Mogalapalli 
---
v1 --> v2: Remove the debug calls as they don't add much value and
-ENOTSUPP is a good enough return value.
---
  drivers/gpu/drm/i915/i915_perf.c | 15 +++
  1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 2f3ecd7d4804..7b1c8de2f9cb 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -4227,11 +4227,8 @@ int i915_perf_open_ioctl(struct drm_device *dev, void 
*data,
u32 known_open_flags;
int ret;
  
-	if (!perf->i915) {

-   drm_dbg(>i915->drm,
-   "i915 perf interface not available for this system\n");
+   if (!perf->i915)
return -ENOTSUPP;
-   }
  
  	known_open_flags = I915_PERF_FLAG_FD_CLOEXEC |

   I915_PERF_FLAG_FD_NONBLOCK |
@@ -4607,11 +4604,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, 
void *data,
struct i915_oa_reg *regs;
int err, id;
  
-	if (!perf->i915) {

-   drm_dbg(>i915->drm,
-   "i915 perf interface not available for this system\n");
+   if (!perf->i915)
return -ENOTSUPP;
-   }
  
  	if (!perf->metrics_kobj) {

drm_dbg(>i915->drm,
@@ -4773,11 +4767,8 @@ int i915_perf_remove_config_ioctl(struct drm_device 
*dev, void *data,
struct i915_oa_config *oa_config;
int ret;
  
-	if (!perf->i915) {

-   drm_dbg(>i915->drm,
-   "i915 perf interface not available for this system\n");
+   if (!perf->i915)
return -ENOTSUPP;
-   }
  
  	if (i915_perf_stream_paranoid && !perfmon_capable()) {

drm_dbg(>i915->drm,


Thanks for re-spinning it so quickly! LGTM.

Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko


Re: (subset) [PATCH v2] drm/doc: ci: Require more context for flaky tests

2023-10-27 Thread Helen Koike




On 26/10/2023 12:31, Maxime Ripard wrote:

Hi,

On Thu, Oct 26, 2023 at 11:27:18AM -0300, Helen Koike wrote:

On 26/10/2023 10:27, Maxime Ripard wrote:

On Thu, Oct 26, 2023 at 09:08:03AM -0300, Helen Koike wrote:



On 26/10/2023 09:01, Helen Koike wrote:



On 26/10/2023 07:58, Maxime Ripard wrote:

On Wed, 25 Oct 2023 16:24:41 +0200, Maxime Ripard wrote:

Flaky tests can be very difficult to reproduce after the facts, which
will make it even harder to ever fix.

Let's document the metadata we agreed on to provide more context to
anyone trying to address these fixes.


[...]


Applied to drm/drm-misc (drm-misc-next).


Thanks!

Could you also apply 
https://patchwork.kernel.org/project/dri-devel/cover/20231024004525.169002-1-helen.ko...@collabora.com/
(and the dependencies listed on it).


For some reason, commit message 7/10 (drm/ci: increase i915 job timeout to
1h30m) looks incomplete in patchwork, but it looks fine in my branch:

https://gitlab.freedesktop.org/helen.fornazier/linux/-/commits/for-drm-misc-wip/

Let me know if you prefer that I send it again or if you could pull from the
branch.


It was fine on lore.kernel.org and that's where I'm pulling from, so it all 
worked out :)

Everything you asked for should be applied now

Maxime


Awesome, thank you!

Sorry, just another request, could you please pull this other one updating
MAINTAINERS?

https://patchwork.kernel.org/project/linux-arm-msm/patch/20230919182249.153499-1-helen.ko...@collabora.com/


I don't mind, but the expectation (the one I had at least) was that you would 
do it :)

If you don't have drm-misc access, please create an account, you have
done way more than expected to get one already


Nice! I just created this request 
https://gitlab.freedesktop.org/freedesktop/freedesktop/-/issues/983


Would you mind approving it?

Thanks!
Helen



Maxime


[PATCH v2] i915/perf: Fix NULL deref bugs with drm_dbg() calls

2023-10-27 Thread Harshit Mogalapalli
When i915 perf interface is not available dereferencing it will lead to
NULL dereferences.

As returning -ENOTSUPP is pretty clear return when perf interface is not
available.

Fixes: 2fec539112e8 ("i915/perf: Replace DRM_DEBUG with driver specific drm_dbg 
call")
Suggested-by: Tvrtko Ursulin 
Signed-off-by: Harshit Mogalapalli 
---
v1 --> v2: Remove the debug calls as they don't add much value and
-ENOTSUPP is a good enough return value.
---
 drivers/gpu/drm/i915/i915_perf.c | 15 +++
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 2f3ecd7d4804..7b1c8de2f9cb 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -4227,11 +4227,8 @@ int i915_perf_open_ioctl(struct drm_device *dev, void 
*data,
u32 known_open_flags;
int ret;
 
-   if (!perf->i915) {
-   drm_dbg(>i915->drm,
-   "i915 perf interface not available for this system\n");
+   if (!perf->i915)
return -ENOTSUPP;
-   }
 
known_open_flags = I915_PERF_FLAG_FD_CLOEXEC |
   I915_PERF_FLAG_FD_NONBLOCK |
@@ -4607,11 +4604,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, 
void *data,
struct i915_oa_reg *regs;
int err, id;
 
-   if (!perf->i915) {
-   drm_dbg(>i915->drm,
-   "i915 perf interface not available for this system\n");
+   if (!perf->i915)
return -ENOTSUPP;
-   }
 
if (!perf->metrics_kobj) {
drm_dbg(>i915->drm,
@@ -4773,11 +4767,8 @@ int i915_perf_remove_config_ioctl(struct drm_device 
*dev, void *data,
struct i915_oa_config *oa_config;
int ret;
 
-   if (!perf->i915) {
-   drm_dbg(>i915->drm,
-   "i915 perf interface not available for this system\n");
+   if (!perf->i915)
return -ENOTSUPP;
-   }
 
if (i915_perf_stream_paranoid && !perfmon_capable()) {
drm_dbg(>i915->drm,
-- 
2.39.3



Re: [PATCH] dt-bindings: display: ssd132x: Remove '-' before compatible enum

2023-10-27 Thread Rob Herring
On Fri, Oct 27, 2023 at 11:30:56AM +0200, Javier Martinez Canillas wrote:
> Rob Herring  writes:
> 
> > On Sat, 21 Oct 2023 00:30:17 +0200, Javier Martinez Canillas wrote:
> >> This is a leftover from when the binding schema had the compatible string
> >> property enum as a 'oneOf' child and the '-' was not removed when 'oneOf'
> >> got dropped during the binding review process.
> >> 
> >> Reported-by: Rob Herring 
> >> Closes: 
> >> https://lore.kernel.org/dri-devel/cal_jsq+h8dcnpkqhokqoodcc8+qi3m0prxrfkz_y4v37ymj...@mail.gmail.com/
> >> Signed-off-by: Javier Martinez Canillas 
> >> ---
> >> 
> >>  .../devicetree/bindings/display/solomon,ssd132x.yaml  | 8 
> >>  1 file changed, 4 insertions(+), 4 deletions(-)
> >> 
> >
> > Reviewed-by: Rob Herring 
> >
> 
> Pushed to drm-misc (drm-misc-next). Thanks!

Given what introduced this is before the drm-misc-next-2023-10-19 tag, 
isn't it going into 6.7 and needs to be in the fixes branch? Though that 
doesn't exist yet for 6.7 fixes. I don't understand why that's not done 
as part of the last tag for a cycle. But drm-misc is special.

Rob


Re: [PATCH] i915/perf: Fix NULL deref bugs with drm_dbg() calls

2023-10-27 Thread Harshit Mogalapalli

Hi Tvrtko,

On 27/10/23 8:17 pm, Tvrtko Ursulin wrote:


On 27/10/2023 15:11, Andrzej Hajda wrote:

On 27.10.2023 16:07, Harshit Mogalapalli wrote:

When i915 perf interface is not available dereferencing it will lead to
NULL dereferences.

Fix this by using DRM_DEBUG() which the scenario before the commit in
the Fixes tag.

Fixes: 2fec539112e8 ("i915/perf: Replace DRM_DEBUG with driver 
specific drm_dbg call")

Signed-off-by: Harshit Mogalapalli 



Reviewed-by: Andrzej Hajda 


Please hold off merging.


---
This is found using smatch(static analysis tool), only compile tested.
---
  drivers/gpu/drm/i915/i915_perf.c | 9 +++--
  1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c 
b/drivers/gpu/drm/i915/i915_perf.c

index 2f3ecd7d4804..bb48c96b7950 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -4228,8 +4228,7 @@ int i915_perf_open_ioctl(struct drm_device 
*dev, void *data,

  int ret;
  if (!perf->i915) {
-    drm_dbg(>i915->drm,
-    "i915 perf interface not available for this system\n");
+    DRM_DEBUG("i915 perf interface not available for this 
system\n");


What's that struct drm_device *dev function argument a few lines up? :)

Although TBH all these these could just be removed since I doubt they 
are adding any value and ENOTSUPP is pretty clear.




Thanks for checking. I will remove the dbg() calls and send a V2.

Regards,
Harshit


Regards,

Tvrtko


  return -ENOTSUPP;
  }
@@ -4608,8 +4607,7 @@ int i915_perf_add_config_ioctl(struct 
drm_device *dev, void *data,

  int err, id;
  if (!perf->i915) {
-    drm_dbg(>i915->drm,
-    "i915 perf interface not available for this system\n");
+    DRM_DEBUG("i915 perf interface not available for this 
system\n");

  return -ENOTSUPP;
  }
@@ -4774,8 +4772,7 @@ int i915_perf_remove_config_ioctl(struct 
drm_device *dev, void *data,

  int ret;
  if (!perf->i915) {
-    drm_dbg(>i915->drm,
-    "i915 perf interface not available for this system\n");
+    DRM_DEBUG("i915 perf interface not available for this 
system\n");

  return -ENOTSUPP;
  }






[PATCH 6/7] drm/exec: Pass in initial # of objects

2023-10-27 Thread Rob Clark
From: Rob Clark 

In cases where the # is known ahead of time, it is silly to do the table
resize dance.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c |  4 ++--
 drivers/gpu/drm/drm_exec.c  | 15 ---
 drivers/gpu/drm/nouveau/nouveau_exec.c  |  2 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c  |  2 +-
 include/drm/drm_exec.h  |  2 +-
 8 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index efdb1c48f431..d27ca8f61929 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -65,7 +65,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p,
}
 
amdgpu_sync_create(>sync);
-   drm_exec_init(>exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(>exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
index 720011019741..796fa6f1420b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c
@@ -70,7 +70,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
struct drm_exec exec;
int r;
 
-   drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
drm_exec_until_all_locked() {
r = amdgpu_vm_lock_pd(vm, , 0);
if (likely(!r))
@@ -110,7 +110,7 @@ int amdgpu_unmap_static_csa(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
struct drm_exec exec;
int r;
 
-   drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT);
+   drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
drm_exec_until_all_locked() {
r = amdgpu_vm_lock_pd(vm, , 0);
if (likely(!r))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index ca4d2d430e28..16f1715148ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -203,7 +203,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object 
*obj,
struct drm_exec exec;
long r;
 
-   drm_exec_init(, DRM_EXEC_IGNORE_DUPLICATES);
+   drm_exec_init(, DRM_EXEC_IGNORE_DUPLICATES, 0);
drm_exec_until_all_locked() {
r = drm_exec_prepare_obj(, >tbo.base, 1);
drm_exec_retry_on_contention();
@@ -739,7 +739,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
}
 
drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT |
- DRM_EXEC_IGNORE_DUPLICATES);
+ DRM_EXEC_IGNORE_DUPLICATES, 0);
drm_exec_until_all_locked() {
if (gobj) {
r = drm_exec_lock_obj(, gobj);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index b6015157763a..3c351941701e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -1105,7 +1105,7 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device 
*adev,
 
amdgpu_sync_create();
 
-   drm_exec_init(, 0);
+   drm_exec_init(, 0, 0);
drm_exec_until_all_locked() {
r = drm_exec_lock_obj(,
  _data->meta_data_obj->tbo.base);
@@ -1176,7 +1176,7 @@ int amdgpu_mes_ctx_unmap_meta_data(struct amdgpu_device 
*adev,
struct drm_exec exec;
long r;
 
-   drm_exec_init(, 0);
+   drm_exec_init(, 0, 0);
drm_exec_until_all_locked() {
r = drm_exec_lock_obj(,
  _data->meta_data_obj->tbo.base);
diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c
index 5d2809de4517..27d11c20d148 100644
--- a/drivers/gpu/drm/drm_exec.c
+++ b/drivers/gpu/drm/drm_exec.c
@@ -69,16 +69,25 @@ static void drm_exec_unlock_all(struct drm_exec *exec)
  * drm_exec_init - initialize a drm_exec object
  * @exec: the drm_exec object to initialize
  * @flags: controls locking behavior, see DRM_EXEC_* defines
+ * @nr: the initial # of objects
  *
  * Initialize the object and make sure that we can track locked objects.
+ *
+ * If nr is non-zero then it is used as the initial objects table size.
+ * In either case, the table will grow (be re-allocated) on demand.
  */
-void drm_exec_init(struct drm_exec *exec, uint32_t flags)
+void drm_exec_init(struct drm_exec *exec, uint32_t flags, unsigned nr)
 {
+   size_t sz = PAGE_SIZE;
+
+   if (nr)
+   sz = (size_t)nr * sizeof(void *);
+
exec->flags = flags;
-   exec->objects = kmalloc(PAGE_SIZE, GFP_KERNEL);
+   exec->objects = kmalloc(sz, GFP_KERNEL);
 

[PATCH 7/7] drm/msm/gem: Convert to drm_exec

2023-10-27 Thread Rob Clark
From: Rob Clark 

Replace the ww_mutex locking dance with the drm_exec helper.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/Kconfig  |   1 +
 drivers/gpu/drm/msm/msm_gem.h|   5 +-
 drivers/gpu/drm/msm/msm_gem_submit.c | 117 +--
 3 files changed, 24 insertions(+), 99 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 6309a857ca31..f91d87afc0d3 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -16,6 +16,7 @@ config DRM_MSM
select DRM_DP_AUX_BUS
select DRM_DISPLAY_DP_HELPER
select DRM_DISPLAY_HELPER
+   select DRM_EXEC
select DRM_KMS_HELPER
select DRM_PANEL
select DRM_BRIDGE
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index af884ced7a0d..7f34263048a3 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -9,6 +9,7 @@
 
 #include 
 #include 
+#include "drm/drm_exec.h"
 #include "drm/gpu_scheduler.h"
 #include "msm_drv.h"
 
@@ -254,7 +255,7 @@ struct msm_gem_submit {
struct msm_gpu *gpu;
struct msm_gem_address_space *aspace;
struct list_head node;   /* node in ring submit list */
-   struct ww_acquire_ctx ticket;
+   struct drm_exec exec;
uint32_t seqno; /* Sequence number of the submit on the ring */
 
/* Hw fence, which is created when the scheduler executes the job, and
@@ -287,8 +288,6 @@ struct msm_gem_submit {
struct drm_msm_gem_submit_reloc *relocs;
} *cmd;  /* array of size nr_cmds */
struct {
-/* make sure these don't conflict w/ MSM_SUBMIT_BO_x */
-#define BO_LOCKED  0x4000  /* obj lock is held */
uint32_t flags;
union {
struct drm_gem_object *obj;
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 603f04d851d9..f8d14d4ccfef 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -248,85 +248,31 @@ static int submit_lookup_cmds(struct msm_gem_submit 
*submit,
return ret;
 }
 
-static void submit_unlock_bo(struct msm_gem_submit *submit, int i)
-{
-   struct drm_gem_object *obj = submit->bos[i].obj;
-   unsigned cleanup_flags = BO_LOCKED;
-   unsigned flags = submit->bos[i].flags & cleanup_flags;
-
-   /*
-* Clear flags bit before dropping lock, so that the msm_job_run()
-* path isn't racing with submit_cleanup() (ie. the read/modify/
-* write is protected by the obj lock in all paths)
-*/
-   submit->bos[i].flags &= ~cleanup_flags;
-
-   if (flags & BO_LOCKED)
-   dma_resv_unlock(obj->resv);
-}
-
 /* This is where we make sure all the bo's are reserved and pin'd: */
 static int submit_lock_objects(struct msm_gem_submit *submit)
 {
-   int contended, slow_locked = -1, i, ret = 0;
-
-retry:
-   for (i = 0; i < submit->nr_bos; i++) {
-   struct drm_gem_object *obj = submit->bos[i].obj;
-
-   if (slow_locked == i)
-   slow_locked = -1;
+   int ret;
 
-   contended = i;
+   drm_exec_init(>exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 
submit->nr_bos);
 
-   if (!(submit->bos[i].flags & BO_LOCKED)) {
-   ret = dma_resv_lock_interruptible(obj->resv,
- >ticket);
+   drm_exec_until_all_locked (>exec) {
+   for (unsigned i = 0; i < submit->nr_bos; i++) {
+   struct drm_gem_object *obj = submit->bos[i].obj;
+   ret = drm_exec_prepare_obj(>exec, obj, 1);
+   drm_exec_retry_on_contention(>exec);
if (ret)
-   goto fail;
-   submit->bos[i].flags |= BO_LOCKED;
+   goto error;
}
}
 
-   ww_acquire_done(>ticket);
-
return 0;
 
-fail:
-   if (ret == -EALREADY) {
-   SUBMIT_ERROR(submit, "handle %u at index %u already on submit 
list\n",
-submit->bos[i].handle, i);
-   ret = -EINVAL;
-   }
-
-   for (; i >= 0; i--)
-   submit_unlock_bo(submit, i);
-
-   if (slow_locked > 0)
-   submit_unlock_bo(submit, slow_locked);
-
-   if (ret == -EDEADLK) {
-   struct drm_gem_object *obj = submit->bos[contended].obj;
-   /* we lost out in a seqno race, lock and retry.. */
-   ret = dma_resv_lock_slow_interruptible(obj->resv,
-  >ticket);
-   if (!ret) {
-   submit->bos[contended].flags |= BO_LOCKED;
-   slow_locked = contended;
-   goto retry;
-   }
-
-   /* Not expecting -EALREADY 

[PATCH 5/7] drm/msm/gem: Cleanup submit_cleanup_bo()

2023-10-27 Thread Rob Clark
From: Rob Clark 

Now that it only handles unlock duty, drop the superfluous arg and
rename it.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index d001bf286606..603f04d851d9 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -248,14 +248,10 @@ static int submit_lookup_cmds(struct msm_gem_submit 
*submit,
return ret;
 }
 
-/* Unwind bo state, according to cleanup_flags.  In the success case, only
- * the lock is dropped at the end of the submit (and active/pin ref is dropped
- * later when the submit is retired).
- */
-static void submit_cleanup_bo(struct msm_gem_submit *submit, int i,
-   unsigned cleanup_flags)
+static void submit_unlock_bo(struct msm_gem_submit *submit, int i)
 {
struct drm_gem_object *obj = submit->bos[i].obj;
+   unsigned cleanup_flags = BO_LOCKED;
unsigned flags = submit->bos[i].flags & cleanup_flags;
 
/*
@@ -304,10 +300,10 @@ static int submit_lock_objects(struct msm_gem_submit 
*submit)
}
 
for (; i >= 0; i--)
-   submit_cleanup_bo(submit, i, BO_LOCKED);
+   submit_unlock_bo(submit, i);
 
if (slow_locked > 0)
-   submit_cleanup_bo(submit, slow_locked, BO_LOCKED);
+   submit_unlock_bo(submit, slow_locked);
 
if (ret == -EDEADLK) {
struct drm_gem_object *obj = submit->bos[contended].obj;
@@ -533,7 +529,6 @@ static int submit_reloc(struct msm_gem_submit *submit, 
struct drm_gem_object *ob
  */
 static void submit_cleanup(struct msm_gem_submit *submit, bool error)
 {
-   unsigned cleanup_flags = BO_LOCKED;
unsigned i;
 
if (error)
@@ -541,7 +536,7 @@ static void submit_cleanup(struct msm_gem_submit *submit, 
bool error)
 
for (i = 0; i < submit->nr_bos; i++) {
struct drm_gem_object *obj = submit->bos[i].obj;
-   submit_cleanup_bo(submit, i, cleanup_flags);
+   submit_unlock_bo(submit, i);
if (error)
drm_gem_object_put(obj);
}
-- 
2.41.0



[PATCH 4/7] drm/msm/gem: Split out submit_unpin_objects() helper

2023-10-27 Thread Rob Clark
From: Rob Clark 

Untangle unpinning from unlock/unref loop.  The unpin only happens in
error paths so it is easier to decouple from the normal unlock path.

Since we never have an intermediate state where a subset of buffers
are pinned (ie. we never bail out of the pin or unpin loops) we can
replace the bo state flag bit with a global flag in the submit.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem.h|  6 +++---
 drivers/gpu/drm/msm/msm_gem_submit.c | 22 +-
 drivers/gpu/drm/msm/msm_ringbuffer.c |  3 ++-
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index c36c1c1fa222..af884ced7a0d 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -270,8 +270,9 @@ struct msm_gem_submit {
int fence_id;   /* key into queue->fence_idr */
struct msm_gpu_submitqueue *queue;
struct pid *pid;/* submitting process */
-   bool fault_dumped;  /* Limit devcoredump dumping to one per submit */
-   bool in_rb; /* "sudo" mode, copy cmds into RB */
+   bool bos_pinned : 1;
+   bool fault_dumped:1;/* Limit devcoredump dumping to one per submit */
+   bool in_rb : 1; /* "sudo" mode, copy cmds into RB */
struct msm_ringbuffer *ring;
unsigned int nr_cmds;
unsigned int nr_bos;
@@ -288,7 +289,6 @@ struct msm_gem_submit {
struct {
 /* make sure these don't conflict w/ MSM_SUBMIT_BO_x */
 #define BO_LOCKED  0x4000  /* obj lock is held */
-#define BO_PINNED  0x2000  /* obj (pages) is pinned and on active list */
uint32_t flags;
union {
struct drm_gem_object *obj;
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 786b48a55309..d001bf286606 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -265,9 +265,6 @@ static void submit_cleanup_bo(struct msm_gem_submit 
*submit, int i,
 */
submit->bos[i].flags &= ~cleanup_flags;
 
-   if (flags & BO_PINNED)
-   msm_gem_unpin_locked(obj);
-
if (flags & BO_LOCKED)
dma_resv_unlock(obj->resv);
 }
@@ -407,13 +404,28 @@ static int submit_pin_objects(struct msm_gem_submit 
*submit)
mutex_lock(>lru.lock);
for (i = 0; i < submit->nr_bos; i++) {
msm_gem_pin_obj_locked(submit->bos[i].obj);
-   submit->bos[i].flags |= BO_PINNED;
}
mutex_unlock(>lru.lock);
 
+   submit->bos_pinned = true;
+
return ret;
 }
 
+static void submit_unpin_objects(struct msm_gem_submit *submit)
+{
+   if (!submit->bos_pinned)
+   return;
+
+   for (int i = 0; i < submit->nr_bos; i++) {
+   struct drm_gem_object *obj = submit->bos[i].obj;
+
+   msm_gem_unpin_locked(obj);
+   }
+
+   submit->bos_pinned = false;
+}
+
 static void submit_attach_object_fences(struct msm_gem_submit *submit)
 {
int i;
@@ -525,7 +537,7 @@ static void submit_cleanup(struct msm_gem_submit *submit, 
bool error)
unsigned i;
 
if (error)
-   cleanup_flags |= BO_PINNED;
+   submit_unpin_objects(submit);
 
for (i = 0; i < submit->nr_bos; i++) {
struct drm_gem_object *obj = submit->bos[i].obj;
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c 
b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 9d6e2e10d25a..7ea5eca118eb 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -29,9 +29,10 @@ static struct dma_fence *msm_job_run(struct drm_sched_job 
*job)
struct drm_gem_object *obj = submit->bos[i].obj;
 
msm_gem_unpin_active(obj);
-   submit->bos[i].flags &= ~BO_PINNED;
}
 
+   submit->bos_pinned = false;
+
mutex_unlock(>lru.lock);
 
msm_gpu_submit(gpu, submit);
-- 
2.41.0



[PATCH 3/7] drm/msm/gem: Don't queue job to sched in error cases

2023-10-27 Thread Rob Clark
From: Rob Clark 

We shouldn't be running the job in error cases.  This also avoids having
to think too hard about where the objs get unpinned (and if necessary,
the resv takes over tracking that the obj is busy).. ie. error cases it
always happens synchronously, and normal cases it happens from scheduler
job_run() callback.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 2d5527dc3e1a..786b48a55309 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -946,6 +946,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
}
}
 
+   if (ret)
+   goto out;
+
submit_attach_object_fences(submit);
 
/* The scheduler owns a ref now: */
-- 
2.41.0



[PATCH 2/7] drm/msm/gem: Remove submit_unlock_unpin_bo()

2023-10-27 Thread Rob Clark
From: Rob Clark 

The only point it is called is before pinning objects, so the "unpin"
part of the name is fiction.  Just remove call submit_cleanup_bo()
directly.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 996274ef32a6..2d5527dc3e1a 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -272,12 +272,6 @@ static void submit_cleanup_bo(struct msm_gem_submit 
*submit, int i,
dma_resv_unlock(obj->resv);
 }
 
-static void submit_unlock_unpin_bo(struct msm_gem_submit *submit, int i)
-{
-   unsigned cleanup_flags = BO_PINNED | BO_LOCKED;
-   submit_cleanup_bo(submit, i, cleanup_flags);
-}
-
 /* This is where we make sure all the bo's are reserved and pin'd: */
 static int submit_lock_objects(struct msm_gem_submit *submit)
 {
@@ -313,10 +307,10 @@ static int submit_lock_objects(struct msm_gem_submit 
*submit)
}
 
for (; i >= 0; i--)
-   submit_unlock_unpin_bo(submit, i);
+   submit_cleanup_bo(submit, i, BO_LOCKED);
 
if (slow_locked > 0)
-   submit_unlock_unpin_bo(submit, slow_locked);
+   submit_cleanup_bo(submit, slow_locked, BO_LOCKED);
 
if (ret == -EDEADLK) {
struct drm_gem_object *obj = submit->bos[contended].obj;
-- 
2.41.0



[PATCH 1/7] drm/msm/gem: Remove "valid" tracking

2023-10-27 Thread Rob Clark
From: Rob Clark 

This was a small optimization for pre-soft-pin userspace.  But mesa
switched to soft-pin nearly 5yrs ago.  So lets drop the optimization
and simplify the code.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/msm_gem.h|  2 --
 drivers/gpu/drm/msm/msm_gem_submit.c | 44 +---
 2 files changed, 8 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 8ddef5443140..c36c1c1fa222 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -271,7 +271,6 @@ struct msm_gem_submit {
struct msm_gpu_submitqueue *queue;
struct pid *pid;/* submitting process */
bool fault_dumped;  /* Limit devcoredump dumping to one per submit */
-   bool valid; /* true if no cmdstream patching needed */
bool in_rb; /* "sudo" mode, copy cmds into RB */
struct msm_ringbuffer *ring;
unsigned int nr_cmds;
@@ -288,7 +287,6 @@ struct msm_gem_submit {
} *cmd;  /* array of size nr_cmds */
struct {
 /* make sure these don't conflict w/ MSM_SUBMIT_BO_x */
-#define BO_VALID   0x8000  /* is current addr in cmdstream correct/valid? 
*/
 #define BO_LOCKED  0x4000  /* obj lock is held */
 #define BO_PINNED  0x2000  /* obj (pages) is pinned and on active list */
uint32_t flags;
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 6d8ec1337e8b..996274ef32a6 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -150,8 +150,6 @@ static int submit_lookup_objects(struct msm_gem_submit 
*submit,
 
submit->bos[i].handle = submit_bo.handle;
submit->bos[i].flags = submit_bo.flags;
-   /* in validate_objects() we figure out if this is true: */
-   submit->bos[i].iova  = submit_bo.presumed;
}
 
spin_lock(>table_lock);
@@ -278,9 +276,6 @@ static void submit_unlock_unpin_bo(struct msm_gem_submit 
*submit, int i)
 {
unsigned cleanup_flags = BO_PINNED | BO_LOCKED;
submit_cleanup_bo(submit, i, cleanup_flags);
-
-   if (!(submit->bos[i].flags & BO_VALID))
-   submit->bos[i].iova = 0;
 }
 
 /* This is where we make sure all the bo's are reserved and pin'd: */
@@ -390,8 +385,6 @@ static int submit_pin_objects(struct msm_gem_submit *submit)
struct msm_drm_private *priv = submit->dev->dev_private;
int i, ret = 0;
 
-   submit->valid = true;
-
for (i = 0; i < submit->nr_bos; i++) {
struct drm_gem_object *obj = submit->bos[i].obj;
struct msm_gem_vma *vma;
@@ -407,14 +400,7 @@ static int submit_pin_objects(struct msm_gem_submit 
*submit)
if (ret)
break;
 
-   if (vma->iova == submit->bos[i].iova) {
-   submit->bos[i].flags |= BO_VALID;
-   } else {
-   submit->bos[i].iova = vma->iova;
-   /* iova changed, so address in cmdstream is not valid: 
*/
-   submit->bos[i].flags &= ~BO_VALID;
-   submit->valid = false;
-   }
+   submit->bos[i].iova = vma->iova;
}
 
/*
@@ -451,7 +437,7 @@ static void submit_attach_object_fences(struct 
msm_gem_submit *submit)
 }
 
 static int submit_bo(struct msm_gem_submit *submit, uint32_t idx,
-   struct drm_gem_object **obj, uint64_t *iova, bool *valid)
+   struct drm_gem_object **obj, uint64_t *iova)
 {
if (idx >= submit->nr_bos) {
SUBMIT_ERROR(submit, "invalid buffer index: %u (out of %u)\n",
@@ -463,8 +449,6 @@ static int submit_bo(struct msm_gem_submit *submit, 
uint32_t idx,
*obj = submit->bos[idx].obj;
if (iova)
*iova = submit->bos[idx].iova;
-   if (valid)
-   *valid = !!(submit->bos[idx].flags & BO_VALID);
 
return 0;
 }
@@ -477,9 +461,6 @@ static int submit_reloc(struct msm_gem_submit *submit, 
struct drm_gem_object *ob
uint32_t *ptr;
int ret = 0;
 
-   if (!nr_relocs)
-   return 0;
-
if (offset % 4) {
SUBMIT_ERROR(submit, "non-aligned cmdstream buffer: %u\n", 
offset);
return -EINVAL;
@@ -500,7 +481,6 @@ static int submit_reloc(struct msm_gem_submit *submit, 
struct drm_gem_object *ob
struct drm_msm_gem_submit_reloc submit_reloc = relocs[i];
uint32_t off;
uint64_t iova;
-   bool valid;
 
if (submit_reloc.submit_offset % 4) {
SUBMIT_ERROR(submit, "non-aligned reloc offset: %u\n",
@@ -519,13 +499,10 @@ static int submit_reloc(struct msm_gem_submit *submit, 
struct drm_gem_object *ob
goto out;
}
 
-   ret = 

[PATCH 0/7] drm/msm/gem: drm_exec conversion

2023-10-27 Thread Rob Clark
From: Rob Clark 

Simplify the exec path (removing a legacy optimization) and convert to
drm_exec.  One drm_exec patch to allow passing in the expected # of GEM
objects to avoid re-allocation.

I'd be a bit happier if I could avoid the extra objects table allocation
in drm_exec in the first place, but wasn't really happy with any of the
things I tried to get rid of that.

Rob Clark (7):
  drm/msm/gem: Remove "valid" tracking
  drm/msm/gem: Remove submit_unlock_unpin_bo()
  drm/msm/gem: Don't queue job to sched in error cases
  drm/msm/gem: Split out submit_unpin_objects() helper
  drm/msm/gem: Cleanup submit_cleanup_bo()
  drm/exec: Pass in initial # of objects
  drm/msm/gem: Convert to drm_exec

 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c |   4 +-
 drivers/gpu/drm/drm_exec.c  |  15 +-
 drivers/gpu/drm/msm/Kconfig |   1 +
 drivers/gpu/drm/msm/msm_gem.h   |  13 +-
 drivers/gpu/drm/msm/msm_gem_submit.c| 197 ++--
 drivers/gpu/drm/msm/msm_ringbuffer.c|   3 +-
 drivers/gpu/drm/nouveau/nouveau_exec.c  |   2 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c  |   2 +-
 include/drm/drm_exec.h  |   2 +-
 12 files changed, 79 insertions(+), 170 deletions(-)

-- 
2.41.0



Re: [PATCH] drm/panel-edp: Add panel entry for AUO B116XTN02 and BOE NT116WHM-N21,836X2 and NV116WHM-N49 V8.0

2023-10-27 Thread Doug Anderson
Hi,

On Thu, Oct 26, 2023 at 8:05 PM Sheng-Liang Pan
 wrote:
>
> Add panel identification entry for
> - AUO B116XTN02 family (product ID:0x235c)
> - BOE NT116WHM-N21,836X2 (product ID:0x09c3)
> - BOE NV116WHM-N49 V8.0 (product ID:0x0979)
>
> Signed-off-by: Sheng-Liang Pan 
> 
> ---
>
>  drivers/gpu/drm/panel/panel-edp.c | 3 +++
>  1 file changed, 3 insertions(+)

nit: the ${SUBJECT} of this patch is too long. It's OK to go a little
over suggested lengths, but this one is probably too long. Other than
that:

Reviewed-by: Douglas Anderson 

...as per my usual policy, I don't let simple changes to this table
sit on the list. I fixed the subject myself and pushed to
drm-misc-next:

3db2420422a5 drm/panel-edp: Add AUO B116XTN02, BOE NT116WHM-N21,836X2,
NV116WHM-N49 V8.0


[PATCH] accel/qaic: Support for 0 resize slice execution in BO

2023-10-27 Thread Jeffrey Hugo
From: Pranjal Ramajor Asha Kanojiya 

Add support to partially execute a slice which is resized to zero.
Executing a zero size slice in a BO should mean that there is no DMA
transfers involved but you should still configure doorbell and semaphores.

For example consider a BO of size 18K and it is sliced into 3 6K slices
and user calls partial execute ioctl with resize as 10K.
slice 0 - size is 6k and offset is 0, so resize of 10K will not cut short
  this slice hence we send the entire slice for execution.
slice 1 - size is 6k and offset is 6k, so resize of 10K will cut short this
  slice and only the first 4k should be DMA along with configuring
  doorbell and semaphores.
slice 2 - size is 6k and offset is 12k, so resize of 10k will cut short
  this slice and no DMA transfer would be involved but we should
  would configure doorbell and semaphores.

This change begs to change the behavior of 0 resize. Currently, 0 resize
partial execute ioctl behaves exactly like execute ioctl i.e. no resize.
After this patch all the slice in BO should behave exactly like slice 2 in
above example.

Refactor copy_partial_exec_reqs() to make it more readable and less
complex.

Signed-off-by: Pranjal Ramajor Asha Kanojiya 
Reviewed-by: Jeffrey Hugo 
Signed-off-by: Jeffrey Hugo 
---

Userspace specifically requested this change in uAPI.

 drivers/accel/qaic/qaic_data.c | 104 ++---
 include/uapi/drm/qaic_accel.h  |   5 +-
 2 files changed, 46 insertions(+), 63 deletions(-)

diff --git a/drivers/accel/qaic/qaic_data.c b/drivers/accel/qaic/qaic_data.c
index ebc3cca1b094..8da81768f2ab 100644
--- a/drivers/accel/qaic/qaic_data.c
+++ b/drivers/accel/qaic/qaic_data.c
@@ -51,6 +51,7 @@
})
 #define NUM_EVENTS 128
 #define NUM_DELAYS 10
+#define fifo_at(base, offset) ((base) + (offset) * get_dbc_req_elem_size())
 
 static unsigned int wait_exec_default_timeout_ms = 5000; /* 5 sec default */
 module_param(wait_exec_default_timeout_ms, uint, 0600);
@@ -1058,6 +1059,16 @@ int qaic_attach_slice_bo_ioctl(struct drm_device *dev, 
void *data, struct drm_fi
return ret;
 }
 
+static inline u32 fifo_space_avail(u32 head, u32 tail, u32 q_size)
+{
+   u32 avail = head - tail - 1;
+
+   if (head <= tail)
+   avail += q_size;
+
+   return avail;
+}
+
 static inline int copy_exec_reqs(struct qaic_device *qdev, struct bo_slice 
*slice, u32 dbc_id,
 u32 head, u32 *ptail)
 {
@@ -1066,27 +1077,20 @@ static inline int copy_exec_reqs(struct qaic_device 
*qdev, struct bo_slice *slic
u32 tail = *ptail;
u32 avail;
 
-   avail = head - tail;
-   if (head <= tail)
-   avail += dbc->nelem;
-
-   --avail;
-
+   avail = fifo_space_avail(head, tail, dbc->nelem);
if (avail < slice->nents)
return -EAGAIN;
 
if (tail + slice->nents > dbc->nelem) {
avail = dbc->nelem - tail;
avail = min_t(u32, avail, slice->nents);
-   memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs,
-  sizeof(*reqs) * avail);
+   memcpy(fifo_at(dbc->req_q_base, tail), reqs, sizeof(*reqs) * 
avail);
reqs += avail;
avail = slice->nents - avail;
if (avail)
memcpy(dbc->req_q_base, reqs, sizeof(*reqs) * avail);
} else {
-   memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs,
-  sizeof(*reqs) * slice->nents);
+   memcpy(fifo_at(dbc->req_q_base, tail), reqs, sizeof(*reqs) * 
slice->nents);
}
 
*ptail = (tail + slice->nents) % dbc->nelem;
@@ -1094,46 +1098,31 @@ static inline int copy_exec_reqs(struct qaic_device 
*qdev, struct bo_slice *slic
return 0;
 }
 
-/*
- * Based on the value of resize we may only need to transmit first_n
- * entries and the last entry, with last_bytes to send from the last entry.
- * Note that first_n could be 0.
- */
 static inline int copy_partial_exec_reqs(struct qaic_device *qdev, struct 
bo_slice *slice,
-u64 resize, u32 dbc_id, u32 head, u32 
*ptail)
+u64 resize, struct dma_bridge_chan 
*dbc, u32 head,
+u32 *ptail)
 {
-   struct dma_bridge_chan *dbc = >dbc[dbc_id];
struct dbc_req *reqs = slice->reqs;
struct dbc_req *last_req;
u32 tail = *ptail;
-   u64 total_bytes;
u64 last_bytes;
u32 first_n;
u32 avail;
-   int ret;
-   int i;
-
-   avail = head - tail;
-   if (head <= tail)
-   avail += dbc->nelem;
 
-   --avail;
+   avail = fifo_space_avail(head, tail, dbc->nelem);
 
-   total_bytes = 0;
-   for (i = 0; i < slice->nents; i++) {
-   total_bytes += le32_to_cpu(reqs[i].len);
-  

Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Boris Brezillon
On Fri, 27 Oct 2023 10:32:52 -0400
Luben Tuikov  wrote:

> On 2023-10-27 04:25, Boris Brezillon wrote:
> > Hi Danilo,
> > 
> > On Thu, 26 Oct 2023 18:13:00 +0200
> > Danilo Krummrich  wrote:
> >   
> >> Currently, job flow control is implemented simply by limiting the number
> >> of jobs in flight. Therefore, a scheduler is initialized with a credit
> >> limit that corresponds to the number of jobs which can be sent to the
> >> hardware.
> >>
> >> This implies that for each job, drivers need to account for the maximum
> >> job size possible in order to not overflow the ring buffer.
> >>
> >> However, there are drivers, such as Nouveau, where the job size has a
> >> rather large range. For such drivers it can easily happen that job
> >> submissions not even filling the ring by 1% can block subsequent
> >> submissions, which, in the worst case, can lead to the ring run dry.
> >>
> >> In order to overcome this issue, allow for tracking the actual job size
> >> instead of the number of jobs. Therefore, add a field to track a job's
> >> credit count, which represents the number of credits a job contributes
> >> to the scheduler's credit limit.
> >>
> >> Signed-off-by: Danilo Krummrich 
> >> ---
> >> Changes in V2:
> >> ==
> >>   - fixed up influence on scheduling fairness due to consideration of a 
> >> job's
> >> size
> >> - If we reach a ready entity in drm_sched_select_entity() but can't 
> >> actually
> >>   queue a job from it due to size limitations, just give up and go to 
> >> sleep
> >>   until woken up due to a pending job finishing, rather than continue 
> >> to try
> >>   other entities.
> >>   - added a callback to dynamically update a job's credits (Boris)  
> > 
> > This callback seems controversial. I'd suggest dropping it, so the
> > patch can be merged.  
> 
> Sorry, why is it controversial? (I did read back-and-forth above, but it 
> wasn't clear
> why it is /controversial/.)

That's a question for Christian, I guess. I didn't quite get what he
was worried about, other than this hook introducing a new way for
drivers to screw things up by returning funky/invalid credits (which we
can report with WARN_ON()s). But let's be honest, there's probably a
hundred different ways (if not more) drivers can shoot themselves in the
foot with drm_sched already...

> 
> I believe only drivers are privy to changes in the credit availability as 
> their
> firmware and hardware executes new jobs and finishes others, and so this 
> "update"
> here is essential--leaving it only to prepare_job() wouldn't quite fulfill 
> the vision
> of why the credit mechanism introduced by this patch in the first place.

I kinda agree with you, even if I wouldn't so pessimistic as to how
useful this patch would be without the ->update_job_credits() hook
(it already makes the situation a lot better for Nouveau and probably
other drivers too).


Re: [PATCH v2 08/29] drm/dp: Add helpers to calculate the link BW overhead

2023-10-27 Thread Imre Deak
On Fri, Oct 27, 2023 at 06:48:44PM +0300, Ville Syrjälä wrote:
> On Tue, Oct 24, 2023 at 01:22:17PM +0300, Imre Deak wrote:
> > Add helpers drivers can use to calculate the BW allocation overhead -
> > due to SSC, FEC, DSC and data alignment on symbol cycles - and the
> > channel coding efficiency - due to the 8b/10b, 128b/132b encoding. On
> > 128b/132b links the FEC overhead is part of the coding efficiency, so
> > not accounted for in the BW allocation overhead.
> > 
> > The drivers can use these functions to calculate a ratio, controlling
> > the stream symbol insertion rate of the source device in each SST TU
> > or MST MTP frame. Drivers can calculate this
> > 
> > m/n = (pixel_data_rate * drm_dp_bw_overhead()) /
> >   (link_data_rate * drm_dp_bw_channel_coding_efficiency())
> > 
> > ratio for a given link and pixel stream and with that the
> > 
> > mtp_count = CEIL(64 * m / n)
> > 
> > allocated MTPs for the stream in a link frame and
> > 
> > pbn = CEIL(64 * dm_mst_get_pbn_divider() * m / n)
> > 
> > allocated PBNs for the stream on the MST link path.
> > 
> > Take drm_dp_bw_overhead() into use in drm_dp_calc_pbn_mode(), for
> > drivers calculating the PBN value directly.
> > 
> > v2:
> > - Add dockbook description to drm_dp_bw_channel_coding_efficiency().
> >   (LKP).
> > - Clarify the way m/n ratio is calculated in the commit log.
> > 
> > Cc: Lyude Paul 
> > Cc: kernel test robot 
> > Cc: dri-devel@lists.freedesktop.org
> > Signed-off-by: Imre Deak 
> > ---
> >  drivers/gpu/drm/display/drm_dp_helper.c   | 124 ++
> >  drivers/gpu/drm/display/drm_dp_mst_topology.c |  23 +++-
> >  include/drm/display/drm_dp_helper.h   |  11 ++
> >  3 files changed, 152 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/display/drm_dp_helper.c 
> > b/drivers/gpu/drm/display/drm_dp_helper.c
> > index e5d7970a9ddd0..79629bf7547bf 100644
> > --- a/drivers/gpu/drm/display/drm_dp_helper.c
> > +++ b/drivers/gpu/drm/display/drm_dp_helper.c
> > @@ -3899,4 +3899,128 @@ int drm_panel_dp_aux_backlight(struct drm_panel 
> > *panel, struct drm_dp_aux *aux)
> >  }
> >  EXPORT_SYMBOL(drm_panel_dp_aux_backlight);
> >  
> > +/* See DP Standard v2.1 2.6.4.4.1.1, 2.8.4.4, 2.8.7 */
> > +static int drm_dp_link_symbol_cycles(int lane_count, int pixels, int 
> > bpp_x16,
> > +int symbol_size, bool is_mst)
> > +{
> > +   int cycles = DIV_ROUND_UP(pixels * bpp_x16, 16 * symbol_size * 
> > lane_count);
> > +   int align = is_mst ? 4 / lane_count : 1;
> > +
> > +   return ALIGN(cycles, align);
> > +}
> > +
> > +static int drm_dp_link_dsc_symbol_cycles(int lane_count, int pixels, int 
> > slice_count,
> > +int bpp_x16, int symbol_size, bool 
> > is_mst)
> > +{
> > +   int slice_pixels = DIV_ROUND_UP(pixels, slice_count);
> > +   int slice_data_cycles = drm_dp_link_symbol_cycles(lane_count, 
> > slice_pixels,
> > + bpp_x16, symbol_size, 
> > is_mst);
> > +   int slice_eoc_cycles = is_mst ? 4 / lane_count : 1;
> > +
> > +   return slice_count * (slice_data_cycles + slice_eoc_cycles);
> > +}
> > +
> > +/**
> > + * drm_dp_bw_overhead - Calculate the BW overhead of a DP link stream
> > + * @lane_count: DP link lane count
> > + * @hactive: pixel count of the active period in one scanline of the stream
> > + * @dsc_slice_count: DSC slice count if @flags/DRM_DP_LINK_BW_OVERHEAD_DSC 
> > is set
> > + * @bpp_x16: bits per pixel in .4 binary fixed point
> > + * @flags: DRM_DP_OVERHEAD_x flags
> > + *
> > + * Calculate the BW allocation overhead of a DP link stream, depending
> > + * on the link's
> > + * - @lane_count
> > + * - SST/MST mode (@flags / %DRM_DP_OVERHEAD_MST)
> > + * - symbol size (@flags / %DRM_DP_OVERHEAD_UHBR)
> > + * - FEC mode (@flags / %DRM_DP_OVERHEAD_FEC)
> > + * - SSC mode (@flags / %DRM_DP_OVERHEAD_SSC)
> > + * as well as the stream's
> > + * - @hactive timing
> > + * - @bpp_x16 color depth
> > + * - compression mode (@flags / %DRM_DP_OVERHEAD_DSC).
> > + * Note that this overhead doesn't account for the 8b/10b, 128b/132b
> > + * channel coding efficiency, for that see
> > + * @drm_dp_link_bw_channel_coding_efficiency().
> > + *
> > + * Returns the overhead as 100% + overhead% in 1ppm units.
> > + */
> > +int drm_dp_bw_overhead(int lane_count, int hactive,
> > +  int dsc_slice_count,
> > +  int bpp_x16, unsigned long flags)
> > +{
> > +   int symbol_size = flags & DRM_DP_BW_OVERHEAD_UHBR ? 32 : 8;
> > +   bool is_mst = flags & DRM_DP_BW_OVERHEAD_MST;
> > +   u32 overhead = 100;
> > +   int symbol_cycles;
> > +
> > +   /*
> > +* DP Standard v2.1 2.6.4.1
> > +* SSC downspread and ref clock variation margin:
> > +*   5300ppm + 300ppm ~ 0.6%
> > +*/
> > +   if (flags & DRM_DP_BW_OVERHEAD_SSC)
> 
> Maybe the flag name should reflect that it accounts for both 
> downspread and ref clk difference.

Ok, will rename it.

> 

[RFC PATCH] drm/imx: ipuv3-plane: Allow preventing sequential DMA bursts

2023-10-27 Thread Miquel Raynal
Sequential DMA bursts improve NIC/RAM usage thanks to the basic NIC
hardware optimizations available when performing in-order sequential
accesses. This can be further enforced with the IPU DMA locking
mechanism which basically prevents any other IP to access the
interconnect for a longer time while performing up to 8 sequential DMA
bursts. The drawback is a lower availability for short time periods and
delayed accesses which may cause problem with latency-sensible systems
(typically, the network might suffer from high drop rates). This is even
more visible with larger displays requiring even more RAM bandwidth.

Issues have been observed on IMX6Q. The setup featured a 60Hz 1024x768
LVDS display just showing a static picture (thus no CPU usage, only
background DMA bringing the picture to the display engine). When
performing full speed iperf3 uplink tests with the FEC, almost no drop
was observed, whereas the drop would raise above 50% when limiting the
bandwidth to 1Mb/s (on a 100Mb/s link). The exact same test with the
display pipeline disabled would show little to no drop. The LP-DDR3 chip
on the module would allow up to ~53MiB each 1/60th of a second, and the
display pipeline consume approximately ~10MiB of this bandwidth, and
thus be active 20% of the time on each time slot.

One particular feature of the IPU DMA controller (IDMAC) is the ability
to serialize DMA bursts and to lock further interconnect accesses when
doing so. Experimentally, disabling the locking lead to a drop rate from
50% down to 10%. A few more % could be earned by setting the burst
number to 1. It seems this huge difference could be explained by a
possible hardware conflict between the locking feature and some QoS
logic. Indeed, on IMX6Q, the NIC-301 manages priorities and by default
will elect ENET's requests (priority 2) above IPU's requests (priority
0). But the QoS seems to only be valid above a certain threshold, which
is: 4 consequent DMA bursts in the case of the IPU. It was indeed
observed that tweaking the number of bursts to be lowered from 8 to 4
would lead to a significant increase in the Ethernet transfers
stability. IOW, it looks like when the display pipeline performs DMA
transfers, incoming DMA requests from other master devices on the
interconnect are delayed too much (or canceled).

I have no clue to explain why on the Ethernet MAC side some uDMA
transfers would never reach completion, especially without notification
nor any error. All uplink transfers are properly queued at the FEC level
and more importantly, the corresponding interrupts are fired upon
"proper transmission" and report no error whatsoever (note: there is no
actual way to know the uDMA internal controller could not fetch the
data, only MAC errors could be reported at this stage).

As a solution, we might want to prevent these DMA bursts from being
queued together. Maybe the IMX6Q is primarily used for its graphics
capabilities, but when the network (and other RAM consuming subsystem)
also matter, it may be relevant to apply this workaround in order to
help them fetching from RAM more reliably.

Signed-off-by: Miquel Raynal 
---

Hello,

This really is an RFC as the bug was also observed on v6.5 but the fix
proposed here was written and tested on a v4.14 kernel. I want to
discuss the approach and ideally get some feedback from imx6 experts who
know the SoC internals before publishing a clean series. There is a lot
of guessing in this workaround, besides the experimental measures I
managed to do. I would be glad if someone could sched any light or
involve knowledgeable people in this conversation.

The initial report was there and mainly focused on the network
subsystem:
https://lore.kernel.org/netdev/18b72fdb-d24a-a416-ffab-3a15b281a...@katalix.com/T/#md265d6da81b8fb6b85e3adbb399bcda79dfc761c
In this thread I made wrong observations because for speeding up my test
cycles, I dropped the support for: DRM, SND, USB as these subsystems
seemed totally irrelevant. It actually had a strong impact.

In the end, I really think there is something wrong with the locking of
IPU DMA bursts when mixed with the QoS of the NIC.

Any comments welcome!
[I know this deserves a DT binding entry, but we are not there yet]

Thanks,
Miquèl

 drivers/gpu/drm/imx/ipuv3-plane.c | 27 ---
 drivers/gpu/drm/imx/ipuv3-plane.h |  1 +
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c 
b/drivers/gpu/drm/imx/ipuv3-plane.c
index cf98596c7ce1..0492df96ac3e 100644
--- a/drivers/gpu/drm/imx/ipuv3-plane.c
+++ b/drivers/gpu/drm/imx/ipuv3-plane.c
@@ -496,8 +496,8 @@ static int ipu_chan_assign_axi_id(int ipu_chan)
}
 }
 
-static void ipu_calculate_bursts(u32 width, u32 cpp, u32 stride,
-u8 *burstsize, u8 *num_bursts)
+static void ipu_calculate_bursts(struct ipu_plane *ipu_plane, u32 width, u32 
cpp,
+u32 stride, u8 *burstsize, u8 *num_bursts)
 {
   

Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Boris Brezillon
On Fri, 27 Oct 2023 16:23:24 +0200
Danilo Krummrich  wrote:

> On 10/27/23 10:25, Boris Brezillon wrote:
> > Hi Danilo,
> > 
> > On Thu, 26 Oct 2023 18:13:00 +0200
> > Danilo Krummrich  wrote:
> >   
> >> Currently, job flow control is implemented simply by limiting the number
> >> of jobs in flight. Therefore, a scheduler is initialized with a credit
> >> limit that corresponds to the number of jobs which can be sent to the
> >> hardware.
> >>
> >> This implies that for each job, drivers need to account for the maximum
> >> job size possible in order to not overflow the ring buffer.
> >>
> >> However, there are drivers, such as Nouveau, where the job size has a
> >> rather large range. For such drivers it can easily happen that job
> >> submissions not even filling the ring by 1% can block subsequent
> >> submissions, which, in the worst case, can lead to the ring run dry.
> >>
> >> In order to overcome this issue, allow for tracking the actual job size
> >> instead of the number of jobs. Therefore, add a field to track a job's
> >> credit count, which represents the number of credits a job contributes
> >> to the scheduler's credit limit.
> >>
> >> Signed-off-by: Danilo Krummrich 
> >> ---
> >> Changes in V2:
> >> ==
> >>- fixed up influence on scheduling fairness due to consideration of a 
> >> job's
> >>  size
> >>  - If we reach a ready entity in drm_sched_select_entity() but can't 
> >> actually
> >>queue a job from it due to size limitations, just give up and go to 
> >> sleep
> >>until woken up due to a pending job finishing, rather than continue 
> >> to try
> >>other entities.
> >>- added a callback to dynamically update a job's credits (Boris)  
> > 
> > This callback seems controversial. I'd suggest dropping it, so the
> > patch can be merged.  
> 
> I don't think we should drop it just for the sake of moving forward. If there 
> are objections
> we'll discuss them. I've seen good reasons why the drivers you are working on 
> require this,
> while, following the discussion, I have *not* seen any reasons to drop it. It 
> helps simplifying
> driver code and doesn't introduce any complexity or overhead to existing 
> drivers.

Up to you. I'm just saying, moving one step in the right direction is
better than being stuck, and it's not like adding this callback in a
follow-up patch is super complicated either. If you're confident that
we can get all parties to agree on keeping this hook, fine by me.


Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Boris Brezillon
On Fri, 27 Oct 2023 16:34:26 +0200
Danilo Krummrich  wrote:

> On 10/27/23 09:17, Boris Brezillon wrote:
> > Hi Danilo,
> > 
> > On Thu, 26 Oct 2023 18:13:00 +0200
> > Danilo Krummrich  wrote:
> >   
> >> +
> >> +  /**
> >> +   * @update_job_credits: Called once the scheduler is considering this
> >> +   * job for execution.
> >> +   *
> >> +   * Drivers may use this to update the job's submission credits, which is
> >> +   * useful to e.g. deduct the number of native fences which have been
> >> +   * signaled meanwhile.
> >> +   *
> >> +   * The callback must either return the new number of submission credits
> >> +   * for the given job, or zero if no update is required.
> >> +   *
> >> +   * This callback is optional.
> >> +   */
> >> +  u32 (*update_job_credits)(struct drm_sched_job *sched_job);  
> > 
> > I'm copying my late reply to v2 here so it doesn't get lost:
> > 
> > I keep thinking it'd be simpler to make this a void function that
> > updates s_job->submission_credits directly. I also don't see the
> > problem with doing a sanity check on job->submission_credits. I mean,
> > if the driver is doing something silly, you can't do much to prevent it
> > anyway, except warn the user that something wrong has happened. If you
> > want to
> > 
> > WARN_ON(job->submission_credits == 0 ||
> > job->submission_credits > job_old_submission_credits);
> > 
> > that's fine. But none of this sanity checking has to do with the
> > function prototype/semantics, and I'm still not comfortable with this 0  
> > => no-change. If there's no change, we should just leave  
> > job->submission_credits unchanged (or return job->submission_credits)
> > instead of inventing a new special case.  
> 
> If we can avoid letting drivers change fields of generic structures directly
> without any drawbacks I think we should avoid it. Currently, drivers shouldn't
> have the need to mess with job->credits directly. The initial value is set
> through drm_sched_job_init() and is updated through the return value of
> update_job_credits().

Fair enough. I do agree that keeping internal fields out of driver
hands is a good thing in general, it's just that it's already
free-for-all in so many places in drm_sched (like the fact drivers
iterate the pending list in their stop-queue handling) that I didn't
really see it as an issue. Note that's there's always the option of
providing drm_sched_job_{update,get}_credits() helpers, with the update
helper making sure the new credits value is consistent (smaller or
equal to the old one, and not zero).

> 
> I'm fine getting rid of the 0 => no-change semantics though. Instead we can 
> just
> WARN() on 0.

Yeah, I think that's preferable. It's pretty easy to return the old
value if the driver has a way to detect when nothing changed (with a
get helper if you don't want drivers to touch the credits field).

> However, if we do that I'd also want to change it for
> drm_sched_job_init() (where 0 currently defaults to 1) such that we accept 0, 
> but
> WARN() accordingly.

Sure. You update all drivers anyway, so passing 1 instead of 0 is not a
big deal, I would say.

> 
> I think it's consequent to either consistently give 0 a different meaning or 
> just
> accept it but WARN() on it.

Using default as a default value makes sense when you're passing
zero-initialized objects that are later extended with new fields, but
here you update the function prototype and all the call sites, so we're
better off considering 0 as an invalid value, IMHO.


Re: [PATCH v2 0/2] accel/qaic: Add support for host/device timesync

2023-10-27 Thread Jeffrey Hugo

On 10/16/2023 11:01 AM, Jeffrey Hugo wrote:

AIC100 supports a timesync mechanism that allows AIC100 to timestamp
device logs with a host based time. This becomes useful for putting host
logs in a unified timeline with device logs for debugging and performance
profiling. The mechanism consists of a boot-time initialization and a
runtime perodic resync to counteract the effects of time source drift
over time between the host and device.

v2:
- Fix readq usage on 32-bit powerpc
- Fix doc warning due to trailing "_"

Ajit Pal Singh (1):
   accel/qaic: Add support for periodic timesync

Pranjal Ramajor Asha Kanojiya (1):
   accel/qaic: Support MHI QAIC_TIMESYNC channel

  Documentation/accel/qaic/aic100.rst |   6 +-
  Documentation/accel/qaic/qaic.rst   |   5 +
  drivers/accel/qaic/Makefile |   3 +-
  drivers/accel/qaic/mhi_controller.c |  36 ++-
  drivers/accel/qaic/qaic.h   |   4 +
  drivers/accel/qaic/qaic_drv.c   |  17 ++
  drivers/accel/qaic/qaic_timesync.c  | 395 
  drivers/accel/qaic/qaic_timesync.h  |  11 +
  8 files changed, 473 insertions(+), 4 deletions(-)
  create mode 100644 drivers/accel/qaic/qaic_timesync.c
  create mode 100644 drivers/accel/qaic/qaic_timesync.h



Pushed to drm-misc-next


Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Hamza Mahfooz

On 10/27/23 11:55, Lakha, Bhawanpreet wrote:

[AMD Official Use Only - General]



There was a consensus to use memset instead of {0}. I remember making 
changes related to that previously.


Hm, seems like it's used rather consistently in the DM and in DC
though.



Bhawan


*From:* Mahfooz, Hamza 
*Sent:* October 27, 2023 11:53 AM
*To:* Yuran Pereira ; airl...@gmail.com 

*Cc:* Li, Sun peng (Leo) ; Lakha, Bhawanpreet 
; Pan, Xinhui ; Siqueira, 
Rodrigo ; linux-ker...@vger.kernel.org 
; amd-...@lists.freedesktop.org 
; dri-devel@lists.freedesktop.org 
; Deucher, Alexander 
; Koenig, Christian 
; 
linux-kernel-ment...@lists.linuxfoundation.org 

*Subject:* Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in 
amdgpu_dm_setup_replay

On 10/26/23 17:25, Yuran Pereira wrote:

Since `pr_config` is not initialized after its declaration, the
following operations with `replay_enable_option` may be performed
when `replay_enable_option` is holding junk values which could
possibly lead to undefined behaviour

```
  ...
  pr_config.replay_enable_option |= pr_enable_option_static_screen;
  ...

  if (!pr_config.replay_timing_sync_supported)
  pr_config.replay_enable_option &= ~pr_enable_option_general_ui;
  ...
```

This patch initializes `pr_config` after its declaration to ensure that
it doesn't contain junk data, and prevent any undefined behaviour

Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable")
Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code")
Signed-off-by: Yuran Pereira 
---
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++
   1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
index 32d3086c4cb7..40526507f50b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
@@ -23,6 +23,7 @@
    *
    */
  
+#include 

   #include "amdgpu_dm_replay.h"
   #include "dc.h"
   #include "dm_helpers.h"
@@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct 
amdgpu_dm_connector *ac
    struct replay_config pr_config;


I would prefer setting pr_config = {0};


    union replay_debug_flags *debug_flags = NULL;
  
+ memset(_config, 0, sizeof(pr_config));

+
    // For eDP, if Replay is supported, return true to skip checks
    if (link->replay_settings.config.replay_supported)
    return true;

--
Hamza


--
Hamza



Re: [PATCH v2 1/2] accel/qaic: Add support for periodic timesync

2023-10-27 Thread Jeffrey Hugo

On 10/22/2023 5:06 AM, Stanislaw Gruszka wrote:

On Mon, Oct 16, 2023 at 11:01:13AM -0600, Jeffrey Hugo wrote:

From: Ajit Pal Singh 

Device and Host have a time synchronization mechanism that happens once
during boot when device is in SBL mode. After that, in mission-mode there
is no timesync. In an experiment after continuous operation, device time
drifted w.r.t. host by approximately 3 seconds per day. This drift leads
to mismatch in timestamp of device and Host logs. To correct this
implement periodic timesync in driver. This timesync is carried out via
QAIC_TIMESYNC_PERIODIC MHI channel.

Signed-off-by: Ajit Pal Singh 
Signed-off-by: Pranjal Ramajor Asha Kanojiya 
Reviewed-by: Jeffrey Hugo 
Reviewed-by: Carl Vanderlip 
Reviewed-by: Pranjal Ramajor Asha Kanojiya 
Signed-off-by: Jeffrey Hugo 

Reviewed-by: Stanislaw Gruszka 


@@ -586,8 +587,16 @@ static int __init qaic_init(void)
goto free_pci;
}
  
+	ret = qaic_timesync_init();

+   if (ret) {
+   pr_debug("qaic: qaic_timesync_init failed %d\n", ret);
+   goto free_mhi;

I would print at error level here. Or if timesync is optional do not error exit.


Good point.  timesync is optional so will not do error exit.




+#ifdef readq
+static u64 read_qtimer(const volatile void __iomem *addr)
+{
+   return readq(addr);
+}
+#else
+static u64 read_qtimer(const volatile void __iomem *addr)
+{
+   u64 low, high;
+
+   low = readl(addr);
+   high = readl(addr + sizeof(u32));
+   return low | (high << 32);
+}

If that's only for compile on 32-bit PowerPC, I think would be better
to limit supported architectures on Kconfig.


The issue was flagged on 32-bit PowerPC, but I concluded from a code 
review that the issue exists on any 32-bit architecture.  Given that 
this is an add-on card I'd prefer to support as many architectures as 
possible.


-Jeff



Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Lakha, Bhawanpreet
[AMD Official Use Only - General]


There was a consensus to use memset instead of {0}. I remember making changes 
related to that previously.

Bhawan


From: Mahfooz, Hamza 
Sent: October 27, 2023 11:53 AM
To: Yuran Pereira ; airl...@gmail.com 

Cc: Li, Sun peng (Leo) ; Lakha, Bhawanpreet 
; Pan, Xinhui ; Siqueira, 
Rodrigo ; linux-ker...@vger.kernel.org 
; amd-...@lists.freedesktop.org 
; dri-devel@lists.freedesktop.org 
; Deucher, Alexander 
; Koenig, Christian ; 
linux-kernel-ment...@lists.linuxfoundation.org 

Subject: Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in 
amdgpu_dm_setup_replay

On 10/26/23 17:25, Yuran Pereira wrote:
> Since `pr_config` is not initialized after its declaration, the
> following operations with `replay_enable_option` may be performed
> when `replay_enable_option` is holding junk values which could
> possibly lead to undefined behaviour
>
> ```
>  ...
>  pr_config.replay_enable_option |= pr_enable_option_static_screen;
>  ...
>
>  if (!pr_config.replay_timing_sync_supported)
>  pr_config.replay_enable_option &= ~pr_enable_option_general_ui;
>  ...
> ```
>
> This patch initializes `pr_config` after its declaration to ensure that
> it doesn't contain junk data, and prevent any undefined behaviour
>
> Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable")
> Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code")
> Signed-off-by: Yuran Pereira 
> ---
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
> index 32d3086c4cb7..40526507f50b 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
> @@ -23,6 +23,7 @@
>*
>*/
>
> +#include 
>   #include "amdgpu_dm_replay.h"
>   #include "dc.h"
>   #include "dm_helpers.h"
> @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct 
> amdgpu_dm_connector *ac
>struct replay_config pr_config;

I would prefer setting pr_config = {0};

>union replay_debug_flags *debug_flags = NULL;
>
> + memset(_config, 0, sizeof(pr_config));
> +
>// For eDP, if Replay is supported, return true to skip checks
>if (link->replay_settings.config.replay_supported)
>return true;
--
Hamza



Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Hamza Mahfooz

Also, please write the tagline in present tense.
On 10/27/23 11:53, Hamza Mahfooz wrote:

On 10/26/23 17:25, Yuran Pereira wrote:

Since `pr_config` is not initialized after its declaration, the
following operations with `replay_enable_option` may be performed
when `replay_enable_option` is holding junk values which could
possibly lead to undefined behaviour

```
 ...
 pr_config.replay_enable_option |= pr_enable_option_static_screen;
 ...

 if (!pr_config.replay_timing_sync_supported)
 pr_config.replay_enable_option &= ~pr_enable_option_general_ui;
 ...
```

This patch initializes `pr_config` after its declaration to ensure that
it doesn't contain junk data, and prevent any undefined behaviour

Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable")
Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code")
Signed-off-by: Yuran Pereira 
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c

index 32d3086c4cb7..40526507f50b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
@@ -23,6 +23,7 @@
   *
   */
+#include 
  #include "amdgpu_dm_replay.h"
  #include "dc.h"
  #include "dm_helpers.h"
@@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, 
struct amdgpu_dm_connector *ac

  struct replay_config pr_config;


I would prefer setting pr_config = {0};


  union replay_debug_flags *debug_flags = NULL;
+    memset(_config, 0, sizeof(pr_config));
+
  // For eDP, if Replay is supported, return true to skip checks
  if (link->replay_settings.config.replay_supported)
  return true;

--
Hamza



Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Hamza Mahfooz

On 10/26/23 17:25, Yuran Pereira wrote:

Since `pr_config` is not initialized after its declaration, the
following operations with `replay_enable_option` may be performed
when `replay_enable_option` is holding junk values which could
possibly lead to undefined behaviour

```
 ...
 pr_config.replay_enable_option |= pr_enable_option_static_screen;
 ...

 if (!pr_config.replay_timing_sync_supported)
 pr_config.replay_enable_option &= ~pr_enable_option_general_ui;
 ...
```

This patch initializes `pr_config` after its declaration to ensure that
it doesn't contain junk data, and prevent any undefined behaviour

Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable")
Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code")
Signed-off-by: Yuran Pereira 
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
index 32d3086c4cb7..40526507f50b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
@@ -23,6 +23,7 @@
   *
   */
  
+#include 

  #include "amdgpu_dm_replay.h"
  #include "dc.h"
  #include "dm_helpers.h"
@@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct 
amdgpu_dm_connector *ac
struct replay_config pr_config;


I would prefer setting pr_config = {0};


union replay_debug_flags *debug_flags = NULL;
  
+	memset(_config, 0, sizeof(pr_config));

+
// For eDP, if Replay is supported, return true to skip checks
if (link->replay_settings.config.replay_supported)
return true;

--
Hamza



Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay

2023-10-27 Thread Lakha, Bhawanpreet
[AMD Official Use Only - General]

Thanks,

Reviewed-by: Bhawanpreet Lakha 


From: Yuran Pereira 
Sent: October 26, 2023 5:25 PM
To: airl...@gmail.com 
Cc: Yuran Pereira ; Wentland, Harry 
; Li, Sun peng (Leo) ; Siqueira, 
Rodrigo ; Deucher, Alexander 
; Koenig, Christian ; Pan, 
Xinhui ; dan...@ffwll.ch ; Lakha, 
Bhawanpreet ; amd-...@lists.freedesktop.org 
; dri-devel@lists.freedesktop.org 
; linux-ker...@vger.kernel.org 
; linux-kernel-ment...@lists.linuxfoundation.org 

Subject: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in 
amdgpu_dm_setup_replay

Since `pr_config` is not initialized after its declaration, the
following operations with `replay_enable_option` may be performed
when `replay_enable_option` is holding junk values which could
possibly lead to undefined behaviour

```
...
pr_config.replay_enable_option |= pr_enable_option_static_screen;
...

if (!pr_config.replay_timing_sync_supported)
pr_config.replay_enable_option &= ~pr_enable_option_general_ui;
...
```

This patch initializes `pr_config` after its declaration to ensure that
it doesn't contain junk data, and prevent any undefined behaviour

Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable")
Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code")
Signed-off-by: Yuran Pereira 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
index 32d3086c4cb7..40526507f50b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c
@@ -23,6 +23,7 @@
  *
  */

+#include 
 #include "amdgpu_dm_replay.h"
 #include "dc.h"
 #include "dm_helpers.h"
@@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct 
amdgpu_dm_connector *ac
 struct replay_config pr_config;
 union replay_debug_flags *debug_flags = NULL;

+   memset(_config, 0, sizeof(pr_config));
+
 // For eDP, if Replay is supported, return true to skip checks
 if (link->replay_settings.config.replay_supported)
 return true;
--
2.25.1



Re: [PATCH v2 08/29] drm/dp: Add helpers to calculate the link BW overhead

2023-10-27 Thread Ville Syrjälä
On Tue, Oct 24, 2023 at 01:22:17PM +0300, Imre Deak wrote:
> Add helpers drivers can use to calculate the BW allocation overhead -
> due to SSC, FEC, DSC and data alignment on symbol cycles - and the
> channel coding efficiency - due to the 8b/10b, 128b/132b encoding. On
> 128b/132b links the FEC overhead is part of the coding efficiency, so
> not accounted for in the BW allocation overhead.
> 
> The drivers can use these functions to calculate a ratio, controlling
> the stream symbol insertion rate of the source device in each SST TU
> or MST MTP frame. Drivers can calculate this
> 
> m/n = (pixel_data_rate * drm_dp_bw_overhead()) /
>   (link_data_rate * drm_dp_bw_channel_coding_efficiency())
> 
> ratio for a given link and pixel stream and with that the
> 
> mtp_count = CEIL(64 * m / n)
> 
> allocated MTPs for the stream in a link frame and
> 
> pbn = CEIL(64 * dm_mst_get_pbn_divider() * m / n)
> 
> allocated PBNs for the stream on the MST link path.
> 
> Take drm_dp_bw_overhead() into use in drm_dp_calc_pbn_mode(), for
> drivers calculating the PBN value directly.
> 
> v2:
> - Add dockbook description to drm_dp_bw_channel_coding_efficiency().
>   (LKP).
> - Clarify the way m/n ratio is calculated in the commit log.
> 
> Cc: Lyude Paul 
> Cc: kernel test robot 
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Imre Deak 
> ---
>  drivers/gpu/drm/display/drm_dp_helper.c   | 124 ++
>  drivers/gpu/drm/display/drm_dp_mst_topology.c |  23 +++-
>  include/drm/display/drm_dp_helper.h   |  11 ++
>  3 files changed, 152 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/display/drm_dp_helper.c 
> b/drivers/gpu/drm/display/drm_dp_helper.c
> index e5d7970a9ddd0..79629bf7547bf 100644
> --- a/drivers/gpu/drm/display/drm_dp_helper.c
> +++ b/drivers/gpu/drm/display/drm_dp_helper.c
> @@ -3899,4 +3899,128 @@ int drm_panel_dp_aux_backlight(struct drm_panel 
> *panel, struct drm_dp_aux *aux)
>  }
>  EXPORT_SYMBOL(drm_panel_dp_aux_backlight);
>  
> +/* See DP Standard v2.1 2.6.4.4.1.1, 2.8.4.4, 2.8.7 */
> +static int drm_dp_link_symbol_cycles(int lane_count, int pixels, int bpp_x16,
> +  int symbol_size, bool is_mst)
> +{
> + int cycles = DIV_ROUND_UP(pixels * bpp_x16, 16 * symbol_size * 
> lane_count);
> + int align = is_mst ? 4 / lane_count : 1;
> +
> + return ALIGN(cycles, align);
> +}
> +
> +static int drm_dp_link_dsc_symbol_cycles(int lane_count, int pixels, int 
> slice_count,
> +  int bpp_x16, int symbol_size, bool 
> is_mst)
> +{
> + int slice_pixels = DIV_ROUND_UP(pixels, slice_count);
> + int slice_data_cycles = drm_dp_link_symbol_cycles(lane_count, 
> slice_pixels,
> +   bpp_x16, symbol_size, 
> is_mst);
> + int slice_eoc_cycles = is_mst ? 4 / lane_count : 1;
> +
> + return slice_count * (slice_data_cycles + slice_eoc_cycles);
> +}
> +
> +/**
> + * drm_dp_bw_overhead - Calculate the BW overhead of a DP link stream
> + * @lane_count: DP link lane count
> + * @hactive: pixel count of the active period in one scanline of the stream
> + * @dsc_slice_count: DSC slice count if @flags/DRM_DP_LINK_BW_OVERHEAD_DSC 
> is set
> + * @bpp_x16: bits per pixel in .4 binary fixed point
> + * @flags: DRM_DP_OVERHEAD_x flags
> + *
> + * Calculate the BW allocation overhead of a DP link stream, depending
> + * on the link's
> + * - @lane_count
> + * - SST/MST mode (@flags / %DRM_DP_OVERHEAD_MST)
> + * - symbol size (@flags / %DRM_DP_OVERHEAD_UHBR)
> + * - FEC mode (@flags / %DRM_DP_OVERHEAD_FEC)
> + * - SSC mode (@flags / %DRM_DP_OVERHEAD_SSC)
> + * as well as the stream's
> + * - @hactive timing
> + * - @bpp_x16 color depth
> + * - compression mode (@flags / %DRM_DP_OVERHEAD_DSC).
> + * Note that this overhead doesn't account for the 8b/10b, 128b/132b
> + * channel coding efficiency, for that see
> + * @drm_dp_link_bw_channel_coding_efficiency().
> + *
> + * Returns the overhead as 100% + overhead% in 1ppm units.
> + */
> +int drm_dp_bw_overhead(int lane_count, int hactive,
> +int dsc_slice_count,
> +int bpp_x16, unsigned long flags)
> +{
> + int symbol_size = flags & DRM_DP_BW_OVERHEAD_UHBR ? 32 : 8;
> + bool is_mst = flags & DRM_DP_BW_OVERHEAD_MST;
> + u32 overhead = 100;
> + int symbol_cycles;
> +
> + /*
> +  * DP Standard v2.1 2.6.4.1
> +  * SSC downspread and ref clock variation margin:
> +  *   5300ppm + 300ppm ~ 0.6%
> +  */
> + if (flags & DRM_DP_BW_OVERHEAD_SSC)

Maybe the flag name should reflect that it accounts for both 
downspread and ref clk difference.

> + overhead += 6000;
> +
> + /*
> +  * DP Standard v2.1 2.6.4.1.1:
> +  * FEC symbol insertions for 8b/10b channel coding:
> +  *   2.4%
> +  */
> + if (flags & DRM_DP_BW_OVERHEAD_FEC)
> + overhead += 24000;
> +
> + /*
> +  * DP 

[PATCH libdrm v4 1/9] intel: determine target endianness using meson

2023-10-27 Thread Geert Uytterhoeven
The endianness of the target is currently determined based on
preprocessor symbols.  Unfortunately some symbols checked are wrong
(sparc64-linux-gnu-gcc does not define __BIG_ENDIAN__ or SPARC), and
several checks for big-endian architectures are missing.

Fix this by introducing a new preprocessor symbol HAVE_BIG_ENDIAN, which
is set based on meson's knowledge of the target endianness.

Android.common.mk does not need an update, as Android is always
little-endian (https://developer.android.com/ndk/guides/abis.html).

Signed-off-by: Geert Uytterhoeven 
---
v4:
  - Replace explicit #ifdef checks by a define set by meson,

v3:
  - No changes,

v2:
  - Add arm, aarch64, microblaze, s390, and sh.
---
 intel/uthash.h | 4 ++--
 meson.build| 5 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/intel/uthash.h b/intel/uthash.h
index 45d1f9fc12a1d6f9..62e1650804841638 100644
--- a/intel/uthash.h
+++ b/intel/uthash.h
@@ -648,11 +648,11 @@ do {
 #define MUR_PLUS2_ALIGNED(p) (((unsigned long)p & 3UL) == 2UL)
 #define MUR_PLUS3_ALIGNED(p) (((unsigned long)p & 3UL) == 3UL)
 #define WP(p) ((uint32_t*)((unsigned long)(p) & ~3UL))
-#if (defined(__BIG_ENDIAN__) || defined(SPARC) || defined(__ppc__) || 
defined(__ppc64__))
+#ifdef HAVE_BIG_ENDIAN
 #define MUR_THREE_ONE(p) *WP(p))&0x00ff) << 8) | 
(((*(WP(p)+1))&0xff00) >> 24))
 #define MUR_TWO_TWO(p)   *WP(p))&0x) <<16) | 
(((*(WP(p)+1))&0x) >> 16))
 #define MUR_ONE_THREE(p) *WP(p))&0x00ff) <<24) | 
(((*(WP(p)+1))&0xff00) >>  8))
-#else /* assume little endian non-intel */
+#else /* little endian non-intel */
 #define MUR_THREE_ONE(p) *WP(p))&0xff00) >> 8) | 
(((*(WP(p)+1))&0x00ff) << 24))
 #define MUR_TWO_TWO(p)   *WP(p))&0x) >>16) | 
(((*(WP(p)+1))&0x) << 16))
 #define MUR_ONE_THREE(p) *WP(p))&0xff00) >>24) | 
(((*(WP(p)+1))&0x00ff) <<  8))
diff --git a/meson.build b/meson.build
index e203965d82eb620b..d2fdf7929f363f84 100644
--- a/meson.build
+++ b/meson.build
@@ -226,6 +226,11 @@ if with_freedreno_kgsl and not with_freedreno
   error('cannot enable freedreno-kgsl without freedreno support')
 endif
 config.set10('_GNU_SOURCE', true)
+
+if target_machine.endian() == 'big'
+  config.set('HAVE_BIG_ENDIAN', 1)
+endif
+
 config_file = configure_file(
   configuration : config,
   output : 'config.h',
-- 
2.34.1



[PATCH libdrm v4 4/9] util: add missing big-endian RGB16 frame buffer formats

2023-10-27 Thread Geert Uytterhoeven
Signed-off-by: Geert Uytterhoeven 
---
v4:
  - No changes,

v3:
  - Update for suffix change from "be" to "_BE", cfr. commit
ffb9375a505700ad ("xf86drm: handle DRM_FORMAT_BIG_ENDIAN in
drmGetFormatName()"),

v2:
  - New.
---
 tests/util/format.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/util/format.c b/tests/util/format.c
index b99cc9c3599d9237..376a32fe4b485238 100644
--- a/tests/util/format.c
+++ b/tests/util/format.c
@@ -78,6 +78,9 @@ static const struct util_format_info format_info[] = {
{ DRM_FORMAT_BGRX5551, "BX15", MAKE_RGB_INFO(5, 1, 5, 6, 5, 11, 0, 0) },
{ DRM_FORMAT_RGB565, "RG16", MAKE_RGB_INFO(5, 11, 6, 5, 5, 0, 0, 0) },
{ DRM_FORMAT_BGR565, "BG16", MAKE_RGB_INFO(5, 0, 6, 5, 5, 11, 0, 0) },
+   /* Big-endian RGB16 */
+   { DRM_FORMAT_XRGB1555 | DRM_FORMAT_BIG_ENDIAN, "XR15_BE", 
MAKE_RGB_INFO(5, 10, 5, 5, 5, 0, 0, 0) },
+   { DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN, "RG16_BE", 
MAKE_RGB_INFO(5, 11, 6, 5, 5, 0, 0, 0) },
/* RGB24 */
{ DRM_FORMAT_BGR888, "BG24", MAKE_RGB_INFO(8, 0, 8, 8, 8, 16, 0, 0) },
{ DRM_FORMAT_RGB888, "RG24", MAKE_RGB_INFO(8, 16, 8, 8, 8, 0, 0, 0) },
-- 
2.34.1



[PATCH libdrm v4 0/9] Big-endian fixes

2023-10-27 Thread Geert Uytterhoeven
Hi all,

This patch series fixes some endianness issues in libdrm.
It has been tested on ARAnyM using a work-in-progress Atari DRM driver.
After this, the smpte and tiles modetest patterns and the pwetty markers
are rendered correctly using the XR24, RG16, and RG16BE formats on
big-endian systems.

Changes compared to v3[1]:
  - Replace explicit #ifdef checks by a define set by meson,
  - Use new HAVE_BIG_ENDIAN symbol,

Changes compared to v2[2]:
  - Increase indentation after definition of cpu_to_*() macros,
  - Update for suffix change from "be" to "_BE", cfr. commit
ffb9375a505700ad ("xf86drm: handle DRM_FORMAT_BIG_ENDIAN in
drmGetFormatName()"),
  - Replace hardcoded numbers in code by sizeof(),
  - Wrap byteswap_buffer{16,32}() implementation inside #if HAVE_CAIRO
to avoid defined-but-not-used compiler warnings,
  - Drop "modetest: Fix printing of big-endian fourcc values", as it is
no longer needed since commit ffb9375a505700ad ("xf86drm: handle
DRM_FORMAT_BIG_ENDIAN in drmGetFormatName()").

Changes compared to v1[3]:
  - Consider arm, aarch64, microblaze, s390, and sh in endianness
checks,
  - Add Acked-by,
  - Add swap32() intermediate helper,
  - Fix 16 bpp formats on big-endian,
  - Add support for big-endian XRGB1555 and RGB565,
  - Fix printing of big-endian fourcc values,
  - Fix pwetty on big-endian.

I have also updated the merge request at [4].

Thanks for your comments!

[1] "[PATCH libdrm v3 0/9] Big-endian fixes
https://lore.kernel.org/r/cover.1698217235.git.ge...@linux-m68k.org
[2] "[PATCH libdrm v2 00/10] Big-endian fixes"
https://lore.kernel.org/r/cover.1657302103.git.ge...@linux-m68k.org/#t
[3] "[PATCH RFC libdrm 0/2] Big-endian fixes"
https://lore.kernel.org/r/cover.1646684158.git.ge...@linux-m68k.org
[4] https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/331

Geert Uytterhoeven (9):
  intel: determine target endianness using meson
  util: fix 32 bpp patterns on big-endian
  util: fix 16 bpp patterns on big-endian
  util: add missing big-endian RGB16 frame buffer formats
  modetest: add support for parsing big-endian formats
  util: add test pattern support for big-endian XRGB1555/RGB565
  util: fix pwetty on big-endian
  util: add pwetty support for big-endian RGB565
  modetest: add support for big-endian XRGB1555/RGB565

 intel/uthash.h|   4 +-
 meson.build   |   5 ++
 tests/modetest/buffers.c  |   4 ++
 tests/modetest/modetest.c |  15 +++--
 tests/util/format.c   |   3 +
 tests/util/pattern.c  | 117 +++---
 6 files changed, 120 insertions(+), 28 deletions(-)

-- 
2.34.1



[PATCH libdrm v4 3/9] util: fix 16 bpp patterns on big-endian

2023-10-27 Thread Geert Uytterhoeven
DRM formats are defined to be little-endian, unless the
DRM_FORMAT_BIG_ENDIAN flag is set.  Hence writes of multi-byte pixel
values need to take endianness into account.

Introduce a swap16() helper to byteswap 16-bit values, and a
cpu_to_le16() helper to convert 16-bit values from CPU-endian to
little-endian, and use the latter in the various pattern fill functions
for 16-bit formats.

Signed-off-by: Geert Uytterhoeven 
---
v4:
  - No changes,

v3:
  - Increase indentation after definition of cpu_to_le16(),

v2:
  - New.
---
 tests/util/pattern.c | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/tests/util/pattern.c b/tests/util/pattern.c
index c79cad2c6a23993f..1927377d0027c6fd 100644
--- a/tests/util/pattern.c
+++ b/tests/util/pattern.c
@@ -61,6 +61,11 @@ struct color_yuv {
  .u = MAKE_YUV_601_U(r, g, b), \
  .v = MAKE_YUV_601_V(r, g, b) }
 
+static inline uint16_t swap16(uint16_t x)
+{
+   return ((x & 0x00ffU) << 8) | ((x & 0xff00U) >> 8);
+}
+
 static inline uint32_t swap32(uint32_t x)
 {
return ((x & 0x00ffU) << 24) |
@@ -70,8 +75,10 @@ static inline uint32_t swap32(uint32_t x)
 }
 
 #ifdef HAVE_BIG_ENDIAN
+#define cpu_to_le16(x) swap16(x)
 #define cpu_to_le32(x) swap32(x)
 #else
+#define cpu_to_le16(x) (x)
 #define cpu_to_le32(x) (x)
 #endif
 
@@ -410,26 +417,26 @@ static void fill_smpte_rgb16(const struct util_rgb_info 
*rgb, void *mem,
 
for (y = 0; y < height * 6 / 9; ++y) {
for (x = 0; x < width; ++x)
-   ((uint16_t *)mem)[x] = colors_top[x * 7 / width];
+   ((uint16_t *)mem)[x] = cpu_to_le16(colors_top[x * 7 / 
width]);
mem += stride;
}
 
for (; y < height * 7 / 9; ++y) {
for (x = 0; x < width; ++x)
-   ((uint16_t *)mem)[x] = colors_middle[x * 7 / width];
+   ((uint16_t *)mem)[x] = cpu_to_le16(colors_middle[x * 7 
/ width]);
mem += stride;
}
 
for (; y < height; ++y) {
for (x = 0; x < width * 5 / 7; ++x)
((uint16_t *)mem)[x] =
-   colors_bottom[x * 4 / (width * 5 / 7)];
+   cpu_to_le16(colors_bottom[x * 4 / (width * 5 / 
7)]);
for (; x < width * 6 / 7; ++x)
((uint16_t *)mem)[x] =
-   colors_bottom[(x - width * 5 / 7) * 3
- / (width / 7) + 4];
+   cpu_to_le16(colors_bottom[(x - width * 5 / 7) * 
3
+ / (width / 7) + 4]);
for (; x < width; ++x)
-   ((uint16_t *)mem)[x] = colors_bottom[7];
+   ((uint16_t *)mem)[x] = cpu_to_le16(colors_bottom[7]);
mem += stride;
}
 }
@@ -1280,7 +1287,7 @@ static void fill_tiles_rgb16(const struct 
util_format_info *info, void *mem,
  (rgb32 >> 8) & 0xff, rgb32 & 0xff,
  255);
 
-   ((uint16_t *)mem)[x] = color;
+   ((uint16_t *)mem)[x] = cpu_to_le16(color);
}
mem += stride;
}
-- 
2.34.1



[PATCH libdrm v4 8/9] util: add pwetty support for big-endian RGB565

2023-10-27 Thread Geert Uytterhoeven
Add support for rendering the crosshairs in a buffer using the
big-endian RGB565 format.

Signed-off-by: Geert Uytterhoeven 
---
v4:
  - No changes,

v3:
  - No changes,

v2:
  - New.
---
 tests/util/pattern.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/util/pattern.c b/tests/util/pattern.c
index 769da5814d1f689e..34da1c4d4f8d5ff0 100644
--- a/tests/util/pattern.c
+++ b/tests/util/pattern.c
@@ -1188,6 +1188,7 @@ static void make_pwetty(void *data, unsigned int width, 
unsigned int height,
cairo_format = CAIRO_FORMAT_ARGB32;
break;
case DRM_FORMAT_RGB565:
+   case DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN:
case DRM_FORMAT_BGR565:
cairo_format = CAIRO_FORMAT_RGB16_565;
swap16 = fb_foreign_endian(format);
-- 
2.34.1



[PATCH libdrm v4 7/9] util: fix pwetty on big-endian

2023-10-27 Thread Geert Uytterhoeven
Cairo always uses native byte order for rendering.

Hence if the byte order of the frame buffer differs from the byte order
of the CPU, the frame buffer contents need to be byteswapped twice: once
before rendering, to convert to native byte order, and a second time
after rendering, to restore the frame buffer format's byte order.

Note that byte swapping is not done for ARGB32 formats, as for these
formats, byte order only affects the order of the red, green, and blue
channels, which we do not care about here.

Signed-off-by: Geert Uytterhoeven 
---
v4:
  - No changes,

v3:
  - Wrap byteswap_buffer{16,32}() implementation inside #if HAVE_CAIRO
to avoid defined-but-not-used compiler warnings,

v2:
  - RGB30 is untested.
---
 tests/util/pattern.c | 44 
 1 file changed, 44 insertions(+)

diff --git a/tests/util/pattern.c b/tests/util/pattern.c
index ef589124f278e837..769da5814d1f689e 100644
--- a/tests/util/pattern.c
+++ b/tests/util/pattern.c
@@ -79,10 +79,12 @@ static inline uint32_t swap32(uint32_t x)
 #define cpu_to_be16(x) (x)
 #define cpu_to_le16(x) swap16(x)
 #define cpu_to_le32(x) swap32(x)
+#define fb_foreign_endian(format)  (!((format) & DRM_FORMAT_BIG_ENDIAN))
 #else
 #define cpu_to_be16(x) swap16(x)
 #define cpu_to_le16(x) (x)
 #define cpu_to_le32(x) (x)
+#define fb_foreign_endian(format)  ((format) & DRM_FORMAT_BIG_ENDIAN)
 #endif
 
 #define cpu_to_fb16(x) (fb_be ? cpu_to_be16(x) : cpu_to_le16(x))
@@ -1141,6 +1143,32 @@ static void fill_smpte(const struct util_format_info 
*info, void *planes[3],
}
 }
 
+#if HAVE_CAIRO
+static void byteswap_buffer16(void *mem, unsigned int width, unsigned int 
height,
+ unsigned int stride)
+{
+   unsigned int x, y;
+
+   for (y = 0; y < height; ++y) {
+   for (x = 0; x < width; ++x)
+   ((uint16_t *)mem)[x] = swap16(((uint16_t *)mem)[x]);
+   mem += stride;
+   }
+}
+
+static void byteswap_buffer32(void *mem, unsigned int width, unsigned int 
height,
+ unsigned int stride)
+{
+   unsigned int x, y;
+
+   for (y = 0; y < height; ++y) {
+   for (x = 0; x < width; ++x)
+   ((uint32_t *)mem)[x] = swap32(((uint32_t *)mem)[x]);
+   mem += stride;
+   }
+}
+#endif
+
 static void make_pwetty(void *data, unsigned int width, unsigned int height,
unsigned int stride, uint32_t format)
 {
@@ -1148,6 +1176,8 @@ static void make_pwetty(void *data, unsigned int width, 
unsigned int height,
cairo_surface_t *surface;
cairo_t *cr;
cairo_format_t cairo_format;
+   bool swap16 = false;
+   bool swap32 = false;
 
/* we can ignore the order of R,G,B channels */
switch (format) {
@@ -1160,6 +1190,7 @@ static void make_pwetty(void *data, unsigned int width, 
unsigned int height,
case DRM_FORMAT_RGB565:
case DRM_FORMAT_BGR565:
cairo_format = CAIRO_FORMAT_RGB16_565;
+   swap16 = fb_foreign_endian(format);
break;
 #if CAIRO_VERSION_MAJOR > 1 || (CAIRO_VERSION_MAJOR == 1 && 
CAIRO_VERSION_MINOR >= 12)
case DRM_FORMAT_ARGB2101010:
@@ -1167,12 +1198,19 @@ static void make_pwetty(void *data, unsigned int width, 
unsigned int height,
case DRM_FORMAT_ABGR2101010:
case DRM_FORMAT_XBGR2101010:
cairo_format = CAIRO_FORMAT_RGB30;
+   swap32 = fb_foreign_endian(format);
break;
 #endif
default:
return;
}
 
+   /* Cairo uses native byte order, so we may have to byteswap before... */
+   if (swap16)
+   byteswap_buffer16(data, width, height, stride);
+   if (swap32)
+   byteswap_buffer32(data, width, height, stride);
+
surface = cairo_image_surface_create_for_data(data,
  cairo_format,
  width, height,
@@ -1208,6 +1246,12 @@ static void make_pwetty(void *data, unsigned int width, 
unsigned int height,
}
 
cairo_destroy(cr);
+
+   /* ... and after */
+   if (swap16)
+   byteswap_buffer16(data, width, height, stride);
+   if (swap32)
+   byteswap_buffer32(data, width, height, stride);
 #endif
 }
 
-- 
2.34.1



[PATCH libdrm v4 9/9] modetest: add support for big-endian XRGB1555/RGB565

2023-10-27 Thread Geert Uytterhoeven
Add support for creating buffers using big-endian formats.

For now this is limited to XRGB1555 and RGB565, which are the most
common big-endian formats.

Signed-off-by: Geert Uytterhoeven 
---
v4:
  - No changes,

v3:
  - No changes,

v2:
  - New.
---
 tests/modetest/buffers.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/tests/modetest/buffers.c b/tests/modetest/buffers.c
index 65f1cfb32ab9eeae..c8aafadd5145b64a 100644
--- a/tests/modetest/buffers.c
+++ b/tests/modetest/buffers.c
@@ -158,6 +158,7 @@ bo_create(int fd, unsigned int format,
case DRM_FORMAT_BGRX:
case DRM_FORMAT_ARGB1555:
case DRM_FORMAT_XRGB1555:
+   case DRM_FORMAT_XRGB1555 | DRM_FORMAT_BIG_ENDIAN:
case DRM_FORMAT_ABGR1555:
case DRM_FORMAT_XBGR1555:
case DRM_FORMAT_RGBA5551:
@@ -165,6 +166,7 @@ bo_create(int fd, unsigned int format,
case DRM_FORMAT_BGRA5551:
case DRM_FORMAT_BGRX5551:
case DRM_FORMAT_RGB565:
+   case DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN:
case DRM_FORMAT_BGR565:
case DRM_FORMAT_UYVY:
case DRM_FORMAT_VYUY:
@@ -318,6 +320,7 @@ bo_create(int fd, unsigned int format,
case DRM_FORMAT_BGRX:
case DRM_FORMAT_ARGB1555:
case DRM_FORMAT_XRGB1555:
+   case DRM_FORMAT_XRGB1555 | DRM_FORMAT_BIG_ENDIAN:
case DRM_FORMAT_ABGR1555:
case DRM_FORMAT_XBGR1555:
case DRM_FORMAT_RGBA5551:
@@ -325,6 +328,7 @@ bo_create(int fd, unsigned int format,
case DRM_FORMAT_BGRA5551:
case DRM_FORMAT_BGRX5551:
case DRM_FORMAT_RGB565:
+   case DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN:
case DRM_FORMAT_BGR565:
case DRM_FORMAT_BGR888:
case DRM_FORMAT_RGB888:
-- 
2.34.1



[PATCH libdrm v4 5/9] modetest: add support for parsing big-endian formats

2023-10-27 Thread Geert Uytterhoeven
When specifying a frame buffer format like "RG16_BE" (big-endian RG16),
modetest still uses the little-endian variant, as the format string is
truncated to four characters.

Fix this by increasing the format string size to 8 bytes (7 characters +
NUL terminator).

Signed-off-by: Geert Uytterhoeven 
---
v4:
  - No changes,

v3:
  - Update for suffix change from "be" to "_BE", cfr. commit
ffb9375a505700ad ("xf86drm: handle DRM_FORMAT_BIG_ENDIAN in
drmGetFormatName()"),
  - Replace hardcoded numbers in code by sizeof(),

v2:
  - New.
---
 tests/modetest/modetest.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/tests/modetest/modetest.c b/tests/modetest/modetest.c
index 9b1aa537be8716cf..cc96015f4a555dd3 100644
--- a/tests/modetest/modetest.c
+++ b/tests/modetest/modetest.c
@@ -817,7 +817,7 @@ struct pipe_arg {
unsigned int num_cons;
uint32_t crtc_id;
char mode_str[64];
-   char format_str[5];
+   char format_str[8]; /* need to leave room for "_BE" and terminating \0 
*/
float vrefresh;
unsigned int fourcc;
drmModeModeInfo *mode;
@@ -841,7 +841,7 @@ struct plane_arg {
unsigned int old_fb_id;
struct bo *bo;
struct bo *old_bo;
-   char format_str[5]; /* need to leave room for terminating \0 */
+   char format_str[8]; /* need to leave room for "_BE" and terminating \0 
*/
unsigned int fourcc;
 };
 
@@ -2032,8 +2032,9 @@ static int parse_connector(struct pipe_arg *pipe, const 
char *arg)
}
 
if (*p == '@') {
-   strncpy(pipe->format_str, p + 1, 4);
-   pipe->format_str[4] = '\0';
+   len = sizeof(pipe->format_str) - 1;
+   strncpy(pipe->format_str, p + 1, len);
+   pipe->format_str[len] = '\0';
}
 
pipe->fourcc = util_format_fourcc(pipe->format_str);
@@ -2047,6 +2048,7 @@ static int parse_connector(struct pipe_arg *pipe, const 
char *arg)
 
 static int parse_plane(struct plane_arg *plane, const char *p)
 {
+   unsigned int len;
char *end;
 
plane->plane_id = strtoul(p, , 10);
@@ -2085,8 +2087,9 @@ static int parse_plane(struct plane_arg *plane, const 
char *p)
}
 
if (*end == '@') {
-   strncpy(plane->format_str, end + 1, 4);
-   plane->format_str[4] = '\0';
+   len = sizeof(plane->format_str) - 1;
+   strncpy(plane->format_str, end + 1, len);
+   plane->format_str[len] = '\0';
} else {
strcpy(plane->format_str, "XR24");
}
-- 
2.34.1



[PATCH libdrm v4 2/9] util: fix 32 bpp patterns on big-endian

2023-10-27 Thread Geert Uytterhoeven
DRM formats are defined to be little-endian, unless the
DRM_FORMAT_BIG_ENDIAN flag is set.  Hence writes of multi-byte pixel
values need to take endianness into account.

Introduce a swap32() helper to byteswap 32-bit values, and a
cpu_to_le32() helper to convert 32-bit values from CPU-endian to
little-endian, and use the latter in the various pattern fill functions
for 32-bit formats.

Signed-off-by: Geert Uytterhoeven 
Acked-by: Pekka Paalanen 
---
v4:
  - Use new HAVE_BIG_ENDIAN symbol,

v3:
  - Increase indentation after definition of cpu_to_le32(),

v2:
  - Add Acked-by,
  - Add swap32() intermediate helper,
  - Add __ARM_BIG_ENDIAN and __s390__.
---
 tests/util/pattern.c | 32 +++-
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/tests/util/pattern.c b/tests/util/pattern.c
index f69c5206d96eff02..c79cad2c6a23993f 100644
--- a/tests/util/pattern.c
+++ b/tests/util/pattern.c
@@ -61,6 +61,20 @@ struct color_yuv {
  .u = MAKE_YUV_601_U(r, g, b), \
  .v = MAKE_YUV_601_V(r, g, b) }
 
+static inline uint32_t swap32(uint32_t x)
+{
+   return ((x & 0x00ffU) << 24) |
+  ((x & 0xff00U) <<  8) |
+  ((x & 0x00ffU) >>  8) |
+  ((x & 0xff00U) >> 24);
+}
+
+#ifdef HAVE_BIG_ENDIAN
+#define cpu_to_le32(x) swap32(x)
+#else
+#define cpu_to_le32(x) (x)
+#endif
+
 /* This function takes 8-bit color values */
 static inline uint32_t shiftcolor8(const struct util_color_component *comp,
  uint32_t value)
@@ -520,26 +534,26 @@ static void fill_smpte_rgb32(const struct util_rgb_info 
*rgb, void *mem,
 
for (y = 0; y < height * 6 / 9; ++y) {
for (x = 0; x < width; ++x)
-   ((uint32_t *)mem)[x] = colors_top[x * 7 / width];
+   ((uint32_t *)mem)[x] = cpu_to_le32(colors_top[x * 7 / 
width]);
mem += stride;
}
 
for (; y < height * 7 / 9; ++y) {
for (x = 0; x < width; ++x)
-   ((uint32_t *)mem)[x] = colors_middle[x * 7 / width];
+   ((uint32_t *)mem)[x] = cpu_to_le32(colors_middle[x * 7 
/ width]);
mem += stride;
}
 
for (; y < height; ++y) {
for (x = 0; x < width * 5 / 7; ++x)
((uint32_t *)mem)[x] =
-   colors_bottom[x * 4 / (width * 5 / 7)];
+   cpu_to_le32(colors_bottom[x * 4 / (width * 5 / 
7)]);
for (; x < width * 6 / 7; ++x)
((uint32_t *)mem)[x] =
-   colors_bottom[(x - width * 5 / 7) * 3
- / (width / 7) + 4];
+   cpu_to_le32(colors_bottom[(x - width * 5 / 7) * 
3
+ / (width / 7) + 4]);
for (; x < width; ++x)
-   ((uint32_t *)mem)[x] = colors_bottom[7];
+   ((uint32_t *)mem)[x] = cpu_to_le32(colors_bottom[7]);
mem += stride;
}
 }
@@ -1315,7 +1329,7 @@ static void fill_tiles_rgb32(const struct 
util_format_info *info, void *mem,
  (rgb32 >> 8) & 0xff, rgb32 & 0xff,
  alpha);
 
-   ((uint32_t *)mem)[x] = color;
+   ((uint32_t *)mem)[x] = cpu_to_le32(color);
}
mem += stride;
}
@@ -1464,7 +1478,7 @@ static void fill_gradient_rgb32(const struct 
util_rgb_info *rgb,
 
for (j = 0; j < width / 2; j++) {
uint32_t value = MAKE_RGBA10(rgb, j & 0x3ff, j & 0x3ff, 
j & 0x3ff, 0);
-   row[2*j] = row[2*j+1] = value;
+   row[2*j] = row[2*j+1] = cpu_to_le32(value);
}
mem += stride;
}
@@ -1474,7 +1488,7 @@ static void fill_gradient_rgb32(const struct 
util_rgb_info *rgb,
 
for (j = 0; j < width / 2; j++) {
uint32_t value = MAKE_RGBA10(rgb, j & 0x3fc, j & 0x3fc, 
j & 0x3fc, 0);
-   row[2*j] = row[2*j+1] = value;
+   row[2*j] = row[2*j+1] = cpu_to_le32(value);
}
mem += stride;
}
-- 
2.34.1



[PATCH libdrm v4 6/9] util: add test pattern support for big-endian XRGB1555/RGB565

2023-10-27 Thread Geert Uytterhoeven
Add support for drawing the SMPTE and tiles test patterns in buffers
using big-endian formats.

For now this is limited to XRGB1555 and RGB565, which are the most
common big-endian formats.

Signed-off-by: Geert Uytterhoeven 
---
v4:
  - No changes,

v3:
  - Increase indentation after definition of cpu_to_be16(),

v2:
  - New.
---
 tests/util/pattern.c | 31 +--
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/tests/util/pattern.c b/tests/util/pattern.c
index 1927377d0027c6fd..ef589124f278e837 100644
--- a/tests/util/pattern.c
+++ b/tests/util/pattern.c
@@ -23,6 +23,7 @@
  * IN THE SOFTWARE.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -75,13 +76,17 @@ static inline uint32_t swap32(uint32_t x)
 }
 
 #ifdef HAVE_BIG_ENDIAN
+#define cpu_to_be16(x) (x)
 #define cpu_to_le16(x) swap16(x)
 #define cpu_to_le32(x) swap32(x)
 #else
+#define cpu_to_be16(x) swap16(x)
 #define cpu_to_le16(x) (x)
 #define cpu_to_le32(x) (x)
 #endif
 
+#define cpu_to_fb16(x) (fb_be ? cpu_to_be16(x) : cpu_to_le16(x))
+
 /* This function takes 8-bit color values */
 static inline uint32_t shiftcolor8(const struct util_color_component *comp,
  uint32_t value)
@@ -382,7 +387,7 @@ static void fill_smpte_yuv_packed(const struct 
util_yuv_info *yuv, void *mem,
 
 static void fill_smpte_rgb16(const struct util_rgb_info *rgb, void *mem,
 unsigned int width, unsigned int height,
-unsigned int stride)
+unsigned int stride, bool fb_be)
 {
const uint16_t colors_top[] = {
MAKE_RGBA(rgb, 192, 192, 192, 255), /* grey */
@@ -417,26 +422,26 @@ static void fill_smpte_rgb16(const struct util_rgb_info 
*rgb, void *mem,
 
for (y = 0; y < height * 6 / 9; ++y) {
for (x = 0; x < width; ++x)
-   ((uint16_t *)mem)[x] = cpu_to_le16(colors_top[x * 7 / 
width]);
+   ((uint16_t *)mem)[x] = cpu_to_fb16(colors_top[x * 7 / 
width]);
mem += stride;
}
 
for (; y < height * 7 / 9; ++y) {
for (x = 0; x < width; ++x)
-   ((uint16_t *)mem)[x] = cpu_to_le16(colors_middle[x * 7 
/ width]);
+   ((uint16_t *)mem)[x] = cpu_to_fb16(colors_middle[x * 7 
/ width]);
mem += stride;
}
 
for (; y < height; ++y) {
for (x = 0; x < width * 5 / 7; ++x)
((uint16_t *)mem)[x] =
-   cpu_to_le16(colors_bottom[x * 4 / (width * 5 / 
7)]);
+   cpu_to_fb16(colors_bottom[x * 4 / (width * 5 / 
7)]);
for (; x < width * 6 / 7; ++x)
((uint16_t *)mem)[x] =
-   cpu_to_le16(colors_bottom[(x - width * 5 / 7) * 
3
+   cpu_to_fb16(colors_bottom[(x - width * 5 / 7) * 
3
  / (width / 7) + 4]);
for (; x < width; ++x)
-   ((uint16_t *)mem)[x] = cpu_to_le16(colors_bottom[7]);
+   ((uint16_t *)mem)[x] = cpu_to_fb16(colors_bottom[7]);
mem += stride;
}
 }
@@ -1089,9 +1094,11 @@ static void fill_smpte(const struct util_format_info 
*info, void *planes[3],
case DRM_FORMAT_BGRA:
case DRM_FORMAT_BGRX:
case DRM_FORMAT_RGB565:
+   case DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN:
case DRM_FORMAT_BGR565:
case DRM_FORMAT_ARGB1555:
case DRM_FORMAT_XRGB1555:
+   case DRM_FORMAT_XRGB1555 | DRM_FORMAT_BIG_ENDIAN:
case DRM_FORMAT_ABGR1555:
case DRM_FORMAT_XBGR1555:
case DRM_FORMAT_RGBA5551:
@@ -1099,7 +1106,8 @@ static void fill_smpte(const struct util_format_info 
*info, void *planes[3],
case DRM_FORMAT_BGRA5551:
case DRM_FORMAT_BGRX5551:
return fill_smpte_rgb16(>rgb, planes[0],
-   width, height, stride);
+   width, height, stride,
+   info->format & DRM_FORMAT_BIG_ENDIAN);
 
case DRM_FORMAT_BGR888:
case DRM_FORMAT_RGB888:
@@ -1271,7 +1279,7 @@ static void fill_tiles_yuv_packed(const struct 
util_format_info *info,
 
 static void fill_tiles_rgb16(const struct util_format_info *info, void *mem,
 unsigned int width, unsigned int height,
-unsigned int stride)
+unsigned int stride, bool fb_be)
 {
const struct util_rgb_info *rgb = >rgb;
void *mem_base = mem;
@@ -1287,7 +1295,7 @@ static void fill_tiles_rgb16(const struct 
util_format_info *info, void *mem,
  (rgb32 >> 8) 

Re: [PATCH v2] accel/qaic: Enable 1 MSI fallback mode

2023-10-27 Thread Jeffrey Hugo

On 10/16/2023 11:00 AM, Jeffrey Hugo wrote:

From: Carl Vanderlip 

Several virtualization use-cases either don't support 32 MultiMSIs
(Xen/VMware) or have significant drawbacks to their use (KVM's vIOMMU,
which is required to support 32 MSI, needs to allocate an alternate
system memory space for each device using vIOMMU (e.g. 8GB VM mem and
2 cards => 8 + 2 * 8 = 24GB host memory required)). Support these
cases by enabling a 1 MSI fallback mode.

Whenever all 32 MSIs requested are not available, a second request for
a single MSI is made. Its success is the initiator of single MSI mode.
This mode causes all interrupts generated by the device to be directed
to the 0th MSI (firmware >=v1.10 will do this as a response to the PCIe
MSI capability configuration). Likewise, all interrupt handlers for the
device are registered to the 0th MSI.

Since the DBC interrupt handler checks if the DBC is in use or if
there is any pending changes, the 'spurious' interrupts are
disregarded. If there is work to be done, the standard threaded IRQ
handler is dispatched.

On every interrupt, the MHI handler wakes up its threaded interrupt
handler, and attempts to wake any waiters for MHI state events.

Performance is within +-0.6% for test cases that typify real world
use. Larger differences ([-4,+132]%, avg +47%) exist for very simple
tasks (e.g. addition) compiled for single NSPs. It is assumed that the
small work and many interrupts typically cause contention (e.g. 16 NSPs
vs 4 CPUs), as evidenced by the standard deviation between runs also
decreasing (r=-0.48 between delta(Performace_test) and
delta(StdDev_test/Avg_test))

Signed-off-by: Carl Vanderlip 
Reviewed-by: Pranjal Ramajor Asha Kanojiya 
Reviewed-by: Jeffrey Hugo 
Signed-off-by: Jeffrey Hugo 


Pushed to drm-misc-next


[PATCH] accel/ivpu: avoid build failure with CONFIG_PM=n

2023-10-27 Thread Arnd Bergmann
From: Arnd Bergmann 

The usage count of struct dev_pm_info is an implementation detail that
is only available if CONFIG_PM is enabled, so printing it in a debug message
causes a build failure in configurations without PM:

In file included from include/linux/device.h:15,
 from include/linux/pci.h:37,
 from drivers/accel/ivpu/ivpu_pm.c:8:
drivers/accel/ivpu/ivpu_pm.c: In function 'ivpu_rpm_get_if_active':
drivers/accel/ivpu/ivpu_pm.c:254:51: error: 'struct dev_pm_info' has no member 
named 'usage_count'
  254 |  atomic_read(>drm.dev->power.usage_count));
  |   ^
include/linux/dev_printk.h:129:48: note: in definition of macro 'dev_printk'
  129 | _dev_printk(level, dev, fmt, ##__VA_ARGS__);
\
  |^~~
drivers/accel/ivpu/ivpu_drv.h:75:17: note: in expansion of macro 'dev_dbg'
   75 | dev_dbg((vdev)->drm.dev, "[%s] " fmt, #type, ##args);   
   \
  | ^~~
drivers/accel/ivpu/ivpu_pm.c:253:9: note: in expansion of macro 'ivpu_dbg'
  253 | ivpu_dbg(vdev, RPM, "rpm_get_if_active count %d\n",
  | ^~~~

The print message does not seem essential, so the easiest workaround is
to just remove it.

Fixes: c39dc15191c4 ("accel/ivpu: Read clock rate only if device is up")
Signed-off-by: Arnd Bergmann 
---
 drivers/accel/ivpu/ivpu_pm.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index 0ace218783c8..e9b16cbc26f4 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -250,9 +250,6 @@ int ivpu_rpm_get_if_active(struct ivpu_device *vdev)
 {
int ret;
 
-   ivpu_dbg(vdev, RPM, "rpm_get_if_active count %d\n",
-atomic_read(>drm.dev->power.usage_count));
-
ret = pm_runtime_get_if_active(vdev->drm.dev, false);
drm_WARN_ON(>drm, ret < 0);
 
-- 
2.39.2



Re: [PATCH 11/11] accel/ivpu: Wait for VPU to enter idle state after D0i3 entry message

2023-10-27 Thread Jeffrey Hugo

On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote:

From: Andrzej Kacprowski 

The VPU needs non zero time to enter IDLE state after responding to
D0i3 entry message. If the driver does not wait for the VPU to enter
IDLE state it could cause warm boot failures.

Signed-off-by: Andrzej Kacprowski 
Reviewed-by: Stanislaw Gruszka 
Signed-off-by: Stanislaw Gruszka 


Should this be squashed into the previous change?  Feels like this is a 
bug fix to an issue that the previous change introduced.


Re: [PATCH 10/11] accel/ivpu: Add support for delayed D0i3 entry message

2023-10-27 Thread Jeffrey Hugo

On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote:

From: Andrzej Kacprowski 

Currently the VPU firmware prepares for D0i3 every time the VPU
is entering D0i2 Idle state. This is not optimal as we might not
enter D0i3 every time we enter D0i2 Idle and this preparation
is quite costly.

This optimization moves D0i3 preparation to a dedicated
message sent from the host driver only when the driver is about
to enter D0i3 - this reduces power consumption and latency for
certain workloads, for example audio workloads that submit
inference every 10 ms.

Signed-off-by: Andrzej Kacprowski 
Reviewed-by: Stanislaw Gruszka 
Signed-off-by: Stanislaw Gruszka 


Reviewed-by: Jeffrey Hugo 

Some nits below.


---
  drivers/accel/ivpu/ivpu_drv.h | 10 +--
  drivers/accel/ivpu/ivpu_fw.c  | 48 +--
  drivers/accel/ivpu/ivpu_hw_37xx.c |  8 --
  drivers/accel/ivpu/ivpu_hw_40xx.c |  3 ++
  drivers/accel/ivpu/ivpu_jsm_msg.c | 15 ++
  drivers/accel/ivpu/ivpu_jsm_msg.h |  1 +
  drivers/accel/ivpu/ivpu_pm.c  | 11 ++-
  7 files changed, 87 insertions(+), 9 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 5432b5ee90df..84b170a3c323 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -88,6 +88,7 @@ struct ivpu_wa_table {
bool d3hot_after_power_off;
bool interrupt_clear_with_0;
bool disable_clock_relinquish;
+   bool disable_d0i3_msg;
  };
  
  struct ivpu_hw_info;

@@ -126,6 +127,7 @@ struct ivpu_device {
int tdr;
int reschedule_suspend;
int autosuspend;
+   int d0i3_entry_msg;
} timeout;
  };
  
@@ -148,9 +150,11 @@ extern u8 ivpu_pll_min_ratio;

  extern u8 ivpu_pll_max_ratio;
  extern bool ivpu_disable_mmu_cont_pages;
  
-#define IVPU_TEST_MODE_FW_TEST BIT(0)

-#define IVPU_TEST_MODE_NULL_HW BIT(1)
-#define IVPU_TEST_MODE_NULL_SUBMISSION BIT(2)
+#define IVPU_TEST_MODE_FW_TESTBIT(0)
+#define IVPU_TEST_MODE_NULL_HWBIT(1)
+#define IVPU_TEST_MODE_NULL_SUBMISSIONBIT(2)
+#define IVPU_TEST_MODE_D0I3_MSG_DISABLE   BIT(4)
+#define IVPU_TEST_MODE_D0I3_MSG_ENABLEBIT(5)
  extern int ivpu_test_mode;
  
  struct ivpu_file_priv *ivpu_file_priv_get(struct ivpu_file_priv *file_priv);

diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index 51d976ba5276..3fd94c2d06c7 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -33,12 +33,17 @@
  
  #define ADDR_TO_L2_CACHE_CFG(addr) ((addr) >> 31)
  
-#define IVPU_FW_CHECK_API(vdev, fw_hdr, name, min_major) \

+/* Check if FW API is compatible with the driver */
+#define IVPU_FW_CHECK_API_COMPAT(vdev, fw_hdr, name, min_major) \
ivpu_fw_check_api(vdev, fw_hdr, #name, \
  VPU_##name##_API_VER_INDEX, \
  VPU_##name##_API_VER_MAJOR, \
  VPU_##name##_API_VER_MINOR, min_major)
  
+/* Check if API version is lover that the given version */


lower than?


+#define IVPU_FW_CHECK_API_VER_LT(vdev, fw_hdr, name, major, minor) \
+   ivpu_fw_check_api_ver_lt(vdev, fw_hdr, #name, 
VPU_##name##_API_VER_INDEX, major, minor)
+
  static char *ivpu_firmware;
  module_param_named_unsafe(firmware, ivpu_firmware, charp, 0644);
  MODULE_PARM_DESC(firmware, "VPU firmware binary in /lib/firmware/..");
@@ -105,6 +110,19 @@ ivpu_fw_check_api(struct ivpu_device *vdev, const struct 
vpu_firmware_header *fw
return 0;
  }
  
+static bool

+ivpu_fw_check_api_ver_lt(struct ivpu_device *vdev, const struct 
vpu_firmware_header *fw_hdr,
+const char *str, int index, u16 major, u16 minor)
+{
+   u16 fw_major = (u16)(fw_hdr->api_version[index] >> 16);
+   u16 fw_minor = (u16)(fw_hdr->api_version[index]);
+
+   if (fw_major < major || (fw_major == major && fw_minor < minor))
+   return true;
+
+   return false;
+}
+
  static int ivpu_fw_parse(struct ivpu_device *vdev)
  {
struct ivpu_fw_info *fw = vdev->fw;
@@ -164,9 +182,9 @@ static int ivpu_fw_parse(struct ivpu_device *vdev)
ivpu_info(vdev, "Firmware: %s, version: %s", fw->name,
  (const char *)fw_hdr + VPU_FW_HEADER_SIZE);
  
-	if (IVPU_FW_CHECK_API(vdev, fw_hdr, BOOT, 3))

+   if (IVPU_FW_CHECK_API_COMPAT(vdev, fw_hdr, BOOT, 3))
return -EINVAL;
-   if (IVPU_FW_CHECK_API(vdev, fw_hdr, JSM, 3))
+   if (IVPU_FW_CHECK_API_COMPAT(vdev, fw_hdr, JSM, 3))
return -EINVAL;
  
  	fw->runtime_addr = runtime_addr;

@@ -197,6 +215,24 @@ static void ivpu_fw_release(struct ivpu_device *vdev)
release_firmware(vdev->fw->file);
  }
  
+/* Initialize workarounds that depend on FW version */

+static void
+ivpu_fw_init_wa(struct ivpu_device *vdev)
+{
+   const struct vpu_firmware_header *fw_hdr = (const void 
*)vdev->fw->file->data;
+
+   if 

Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Danilo Krummrich

On 10/27/23 16:59, Luben Tuikov wrote:

Hi Danilo,

On 2023-10-27 10:45, Danilo Krummrich wrote:

Hi Luben,

On 10/26/23 23:13, Luben Tuikov wrote:

On 2023-10-26 12:13, Danilo Krummrich wrote:

Currently, job flow control is implemented simply by limiting the number
of jobs in flight. Therefore, a scheduler is initialized with a credit
limit that corresponds to the number of jobs which can be sent to the
hardware.

This implies that for each job, drivers need to account for the maximum
job size possible in order to not overflow the ring buffer.

However, there are drivers, such as Nouveau, where the job size has a
rather large range. For such drivers it can easily happen that job
submissions not even filling the ring by 1% can block subsequent
submissions, which, in the worst case, can lead to the ring run dry.

In order to overcome this issue, allow for tracking the actual job size
instead of the number of jobs. Therefore, add a field to track a job's
credit count, which represents the number of credits a job contributes
to the scheduler's credit limit.

Signed-off-by: Danilo Krummrich 
---
Changes in V2:
==
- fixed up influence on scheduling fairness due to consideration of a job's
  size
  - If we reach a ready entity in drm_sched_select_entity() but can't 
actually
queue a job from it due to size limitations, just give up and go to 
sleep
until woken up due to a pending job finishing, rather than continue to 
try
other entities.
- added a callback to dynamically update a job's credits (Boris)
- renamed 'units' to 'credits'
- fixed commit message and comments as requested by Luben

Changes in V3:
==
- rebased onto V7 of the "DRM scheduler changes for Xe" series by Matt
- move up drm_sched_can_queue() instead of adding a forward declaration
  (Boris)
- add a drm_sched_available_credits() helper (Boris)
- adjust control flow in drm_sched_rq_select_entity_fifo() to Luben's 
proposal
- re-phrase a few comments and fix a typo (Luben)
- change naming of all structures credit fields and function parameters to 
the
  following scheme
  - drm_sched_job::credits
  - drm_gpu_scheduler::credit_limit
  - drm_gpu_scheduler::credit_count
  - drm_sched_init(..., u32 credit_limit, ...)
  - drm_sched_job_init(..., u32 credits, ...)
- add proper documentation for the scheduler's job-flow control mechanism

This patch is based on V7 of the "DRM scheduler changes for Xe" series. [1]
provides a branch based on drm-misc-next, with the named series and this patch
on top of it.

[1] https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/sched-credits/
---
   Documentation/gpu/drm-mm.rst  |   6 +
   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |   2 +-
   drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |   2 +-
   drivers/gpu/drm/lima/lima_sched.c |   2 +-
   drivers/gpu/drm/msm/msm_gem_submit.c  |   2 +-
   drivers/gpu/drm/nouveau/nouveau_sched.c   |   2 +-
   drivers/gpu/drm/panfrost/panfrost_drv.c   |   2 +-
   .../gpu/drm/scheduler/gpu_scheduler_trace.h   |   2 +-
   drivers/gpu/drm/scheduler/sched_entity.c  |   4 +-
   drivers/gpu/drm/scheduler/sched_main.c| 142 ++
   drivers/gpu/drm/v3d/v3d_gem.c |   2 +-
   include/drm/gpu_scheduler.h   |  33 +++-
   12 files changed, 152 insertions(+), 49 deletions(-)

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index 602010cb6894..acc5901ac840 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -552,6 +552,12 @@ Overview
   .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
  :doc: Overview
   
+Flow Control

+
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
+   :doc: Flow Control
+
   Scheduler Function References
   -
   
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c

index 1f357198533f..62bb7fc7448a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -115,7 +115,7 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
if (!entity)
return 0;
   
-	return drm_sched_job_init(&(*job)->base, entity, owner);

+   return drm_sched_job_init(&(*job)->base, entity, 1, owner);
   }
   
   int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 2416c526f9b0..3d0f8d182506 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -535,7 +535,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void 
*data,
   
   	ret = drm_sched_job_init(>sched_job,

 >sched_entity[args->pipe],
-

Re: [PATCH 09/11] accel/ivpu/40xx: Capture D0i3 entry host and device timestamps

2023-10-27 Thread Jeffrey Hugo

On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote:

From: Andrzej Kacprowski 

The driver needs to capture the D0i3 entry timestamp to
calculate D0i3 residency time.

The D0i3 residency time and the VPU timestamp are passed
to the firmware at D0i3 exit (warm boot).

Signed-off-by: Andrzej Kacprowski 
Reviewed-by: Stanislaw Gruszka 
Signed-off-by: Stanislaw Gruszka 


Feels like this should be applied before patch 8, otherwise this fixes a 
bug introduced by patch 8.


Reviewed-by: Jeffrey Hugo 


Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Luben Tuikov
Hi Danilo,

On 2023-10-27 10:45, Danilo Krummrich wrote:
> Hi Luben,
> 
> On 10/26/23 23:13, Luben Tuikov wrote:
>> On 2023-10-26 12:13, Danilo Krummrich wrote:
>>> Currently, job flow control is implemented simply by limiting the number
>>> of jobs in flight. Therefore, a scheduler is initialized with a credit
>>> limit that corresponds to the number of jobs which can be sent to the
>>> hardware.
>>>
>>> This implies that for each job, drivers need to account for the maximum
>>> job size possible in order to not overflow the ring buffer.
>>>
>>> However, there are drivers, such as Nouveau, where the job size has a
>>> rather large range. For such drivers it can easily happen that job
>>> submissions not even filling the ring by 1% can block subsequent
>>> submissions, which, in the worst case, can lead to the ring run dry.
>>>
>>> In order to overcome this issue, allow for tracking the actual job size
>>> instead of the number of jobs. Therefore, add a field to track a job's
>>> credit count, which represents the number of credits a job contributes
>>> to the scheduler's credit limit.
>>>
>>> Signed-off-by: Danilo Krummrich 
>>> ---
>>> Changes in V2:
>>> ==
>>>- fixed up influence on scheduling fairness due to consideration of a 
>>> job's
>>>  size
>>>  - If we reach a ready entity in drm_sched_select_entity() but can't 
>>> actually
>>>queue a job from it due to size limitations, just give up and go to 
>>> sleep
>>>until woken up due to a pending job finishing, rather than continue 
>>> to try
>>>other entities.
>>>- added a callback to dynamically update a job's credits (Boris)
>>>- renamed 'units' to 'credits'
>>>- fixed commit message and comments as requested by Luben
>>>
>>> Changes in V3:
>>> ==
>>>- rebased onto V7 of the "DRM scheduler changes for Xe" series by Matt
>>>- move up drm_sched_can_queue() instead of adding a forward declaration
>>>  (Boris)
>>>- add a drm_sched_available_credits() helper (Boris)
>>>- adjust control flow in drm_sched_rq_select_entity_fifo() to Luben's 
>>> proposal
>>>- re-phrase a few comments and fix a typo (Luben)
>>>- change naming of all structures credit fields and function parameters 
>>> to the
>>>  following scheme
>>>  - drm_sched_job::credits
>>>  - drm_gpu_scheduler::credit_limit
>>>  - drm_gpu_scheduler::credit_count
>>>  - drm_sched_init(..., u32 credit_limit, ...)
>>>  - drm_sched_job_init(..., u32 credits, ...)
>>>- add proper documentation for the scheduler's job-flow control mechanism
>>>
>>> This patch is based on V7 of the "DRM scheduler changes for Xe" series. [1]
>>> provides a branch based on drm-misc-next, with the named series and this 
>>> patch
>>> on top of it.
>>>
>>> [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/sched-credits/
>>> ---
>>>   Documentation/gpu/drm-mm.rst  |   6 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |   2 +-
>>>   drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |   2 +-
>>>   drivers/gpu/drm/lima/lima_sched.c |   2 +-
>>>   drivers/gpu/drm/msm/msm_gem_submit.c  |   2 +-
>>>   drivers/gpu/drm/nouveau/nouveau_sched.c   |   2 +-
>>>   drivers/gpu/drm/panfrost/panfrost_drv.c   |   2 +-
>>>   .../gpu/drm/scheduler/gpu_scheduler_trace.h   |   2 +-
>>>   drivers/gpu/drm/scheduler/sched_entity.c  |   4 +-
>>>   drivers/gpu/drm/scheduler/sched_main.c| 142 ++
>>>   drivers/gpu/drm/v3d/v3d_gem.c |   2 +-
>>>   include/drm/gpu_scheduler.h   |  33 +++-
>>>   12 files changed, 152 insertions(+), 49 deletions(-)
>>>
>>> diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
>>> index 602010cb6894..acc5901ac840 100644
>>> --- a/Documentation/gpu/drm-mm.rst
>>> +++ b/Documentation/gpu/drm-mm.rst
>>> @@ -552,6 +552,12 @@ Overview
>>>   .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
>>>  :doc: Overview
>>>   
>>> +Flow Control
>>> +
>>> +
>>> +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
>>> +   :doc: Flow Control
>>> +
>>>   Scheduler Function References
>>>   -
>>>   
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> index 1f357198533f..62bb7fc7448a 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> @@ -115,7 +115,7 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct 
>>> amdgpu_vm *vm,
>>> if (!entity)
>>> return 0;
>>>   
>>> -   return drm_sched_job_init(&(*job)->base, entity, owner);
>>> +   return drm_sched_job_init(&(*job)->base, entity, 1, owner);
>>>   }
>>>   
>>>   int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
>>> b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>>> index 

Re: [PATCH 08/11] accel/ivpu: Pass D0i3 residency time to the VPU firmware

2023-10-27 Thread Jeffrey Hugo

On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote:

From: Andrzej Kacprowski 

The firmware needs to know the time spend in D0i3/D3 to


spent?


calculate telemetry data. The D0i3/D3 residency time is
calculated by the driver and passed to the firmware
in the boot parameters.

The driver also passes VPU perf counter value captured
right before entering D0i3 - this allows the VPU firmware
to generate monotonic timestamps for the logs.

Signed-off-by: Andrzej Kacprowski 
Reviewed-by: Stanislaw Gruszka 
Signed-off-by: Stanislaw Gruszka 


Reviewed-by: Jeffrey Hugo 


Re: [PATCH 07/11] accel/ivpu: Introduce ivpu_ipc_send_receive_active()

2023-10-27 Thread Jeffrey Hugo

On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote:

From: Karol Wachowski 

Split ivpu_ipc_send_receive() implementation to have a version
that does not call pm_runtime_resume_and_get(). That implementation
can be invoked when device is up and runtime resume is prohibited
(for example at the end of boot sequence).


There doesn't seem to be a user for this, which would make the new 
function dead code.  Assuming that this new function gets used later in 
the series, it would be clearer to combine this change with that one.


-Jeff


Re: [PATCH] i915/perf: Fix NULL deref bugs with drm_dbg() calls

2023-10-27 Thread Tvrtko Ursulin



On 27/10/2023 15:11, Andrzej Hajda wrote:

On 27.10.2023 16:07, Harshit Mogalapalli wrote:

When i915 perf interface is not available dereferencing it will lead to
NULL dereferences.

Fix this by using DRM_DEBUG() which the scenario before the commit in
the Fixes tag.

Fixes: 2fec539112e8 ("i915/perf: Replace DRM_DEBUG with driver 
specific drm_dbg call")

Signed-off-by: Harshit Mogalapalli 



Reviewed-by: Andrzej Hajda 


Please hold off merging.


---
This is found using smatch(static analysis tool), only compile tested.
---
  drivers/gpu/drm/i915/i915_perf.c | 9 +++--
  1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c 
b/drivers/gpu/drm/i915/i915_perf.c

index 2f3ecd7d4804..bb48c96b7950 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -4228,8 +4228,7 @@ int i915_perf_open_ioctl(struct drm_device *dev, 
void *data,

  int ret;
  if (!perf->i915) {
-    drm_dbg(>i915->drm,
-    "i915 perf interface not available for this system\n");
+    DRM_DEBUG("i915 perf interface not available for this 
system\n");


What's that struct drm_device *dev function argument a few lines up? :)

Although TBH all these these could just be removed since I doubt they 
are adding any value and ENOTSUPP is pretty clear.


Regards,

Tvrtko


  return -ENOTSUPP;
  }
@@ -4608,8 +4607,7 @@ int i915_perf_add_config_ioctl(struct drm_device 
*dev, void *data,

  int err, id;
  if (!perf->i915) {
-    drm_dbg(>i915->drm,
-    "i915 perf interface not available for this system\n");
+    DRM_DEBUG("i915 perf interface not available for this 
system\n");

  return -ENOTSUPP;
  }
@@ -4774,8 +4772,7 @@ int i915_perf_remove_config_ioctl(struct 
drm_device *dev, void *data,

  int ret;
  if (!perf->i915) {
-    drm_dbg(>i915->drm,
-    "i915 perf interface not available for this system\n");
+    DRM_DEBUG("i915 perf interface not available for this 
system\n");

  return -ENOTSUPP;
  }




Re: [PATCH 06/11] accel/ivpu: Change test_mode module param to bitmask

2023-10-27 Thread Jeffrey Hugo

On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote:

From: Karol Wachowski 

Change meaning of test_mode module parameter from integer value
to bitmask allowing setting different test features with corresponding
bits.

Signed-off-by: Karol Wachowski 
Reviewed-by: Stanislaw Gruszka 
Signed-off-by: Stanislaw Gruszka 


Seems like this changes the uAPI.  You still haven't made a release of 
the userspace, correct?  If so -


Reviewed-by: Jeffrey Hugo 


Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Danilo Krummrich

On 10/27/23 03:03, Luben Tuikov wrote:

On 2023-10-26 17:13, Luben Tuikov wrote:

On 2023-10-26 12:13, Danilo Krummrich wrote:

Currently, job flow control is implemented simply by limiting the number
of jobs in flight. Therefore, a scheduler is initialized with a credit
limit that corresponds to the number of jobs which can be sent to the
hardware.

This implies that for each job, drivers need to account for the maximum
job size possible in order to not overflow the ring buffer.

However, there are drivers, such as Nouveau, where the job size has a
rather large range. For such drivers it can easily happen that job
submissions not even filling the ring by 1% can block subsequent
submissions, which, in the worst case, can lead to the ring run dry.

In order to overcome this issue, allow for tracking the actual job size
instead of the number of jobs. Therefore, add a field to track a job's
credit count, which represents the number of credits a job contributes
to the scheduler's credit limit.

Signed-off-by: Danilo Krummrich 
---
Changes in V2:
==
   - fixed up influence on scheduling fairness due to consideration of a job's
 size
 - If we reach a ready entity in drm_sched_select_entity() but can't 
actually
   queue a job from it due to size limitations, just give up and go to sleep
   until woken up due to a pending job finishing, rather than continue to 
try
   other entities.
   - added a callback to dynamically update a job's credits (Boris)
   - renamed 'units' to 'credits'
   - fixed commit message and comments as requested by Luben

Changes in V3:
==
   - rebased onto V7 of the "DRM scheduler changes for Xe" series by Matt
   - move up drm_sched_can_queue() instead of adding a forward declaration
 (Boris)
   - add a drm_sched_available_credits() helper (Boris)
   - adjust control flow in drm_sched_rq_select_entity_fifo() to Luben's 
proposal
   - re-phrase a few comments and fix a typo (Luben)
   - change naming of all structures credit fields and function parameters to 
the
 following scheme
 - drm_sched_job::credits
 - drm_gpu_scheduler::credit_limit
 - drm_gpu_scheduler::credit_count
 - drm_sched_init(..., u32 credit_limit, ...)
 - drm_sched_job_init(..., u32 credits, ...)
   - add proper documentation for the scheduler's job-flow control mechanism

This patch is based on V7 of the "DRM scheduler changes for Xe" series. [1]
provides a branch based on drm-misc-next, with the named series and this patch
on top of it.

[1] https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/sched-credits/
---
  Documentation/gpu/drm-mm.rst  |   6 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |   2 +-
  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |   2 +-
  drivers/gpu/drm/lima/lima_sched.c |   2 +-
  drivers/gpu/drm/msm/msm_gem_submit.c  |   2 +-
  drivers/gpu/drm/nouveau/nouveau_sched.c   |   2 +-
  drivers/gpu/drm/panfrost/panfrost_drv.c   |   2 +-
  .../gpu/drm/scheduler/gpu_scheduler_trace.h   |   2 +-
  drivers/gpu/drm/scheduler/sched_entity.c  |   4 +-
  drivers/gpu/drm/scheduler/sched_main.c| 142 ++
  drivers/gpu/drm/v3d/v3d_gem.c |   2 +-
  include/drm/gpu_scheduler.h   |  33 +++-
  12 files changed, 152 insertions(+), 49 deletions(-)

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index 602010cb6894..acc5901ac840 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -552,6 +552,12 @@ Overview
  .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
 :doc: Overview
  
+Flow Control

+
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
+   :doc: Flow Control
+
  Scheduler Function References
  -
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c

index 1f357198533f..62bb7fc7448a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -115,7 +115,7 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
if (!entity)
return 0;
  
-	return drm_sched_job_init(&(*job)->base, entity, owner);

+   return drm_sched_job_init(&(*job)->base, entity, 1, owner);
  }
  
  int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 2416c526f9b0..3d0f8d182506 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -535,7 +535,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void 
*data,
  
  	ret = drm_sched_job_init(>sched_job,

 >sched_entity[args->pipe],
-submit->ctx);
+1, submit->ctx);
if (ret)
goto 

Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Danilo Krummrich

Hi Luben,

On 10/26/23 23:13, Luben Tuikov wrote:

On 2023-10-26 12:13, Danilo Krummrich wrote:

Currently, job flow control is implemented simply by limiting the number
of jobs in flight. Therefore, a scheduler is initialized with a credit
limit that corresponds to the number of jobs which can be sent to the
hardware.

This implies that for each job, drivers need to account for the maximum
job size possible in order to not overflow the ring buffer.

However, there are drivers, such as Nouveau, where the job size has a
rather large range. For such drivers it can easily happen that job
submissions not even filling the ring by 1% can block subsequent
submissions, which, in the worst case, can lead to the ring run dry.

In order to overcome this issue, allow for tracking the actual job size
instead of the number of jobs. Therefore, add a field to track a job's
credit count, which represents the number of credits a job contributes
to the scheduler's credit limit.

Signed-off-by: Danilo Krummrich 
---
Changes in V2:
==
   - fixed up influence on scheduling fairness due to consideration of a job's
 size
 - If we reach a ready entity in drm_sched_select_entity() but can't 
actually
   queue a job from it due to size limitations, just give up and go to sleep
   until woken up due to a pending job finishing, rather than continue to 
try
   other entities.
   - added a callback to dynamically update a job's credits (Boris)
   - renamed 'units' to 'credits'
   - fixed commit message and comments as requested by Luben

Changes in V3:
==
   - rebased onto V7 of the "DRM scheduler changes for Xe" series by Matt
   - move up drm_sched_can_queue() instead of adding a forward declaration
 (Boris)
   - add a drm_sched_available_credits() helper (Boris)
   - adjust control flow in drm_sched_rq_select_entity_fifo() to Luben's 
proposal
   - re-phrase a few comments and fix a typo (Luben)
   - change naming of all structures credit fields and function parameters to 
the
 following scheme
 - drm_sched_job::credits
 - drm_gpu_scheduler::credit_limit
 - drm_gpu_scheduler::credit_count
 - drm_sched_init(..., u32 credit_limit, ...)
 - drm_sched_job_init(..., u32 credits, ...)
   - add proper documentation for the scheduler's job-flow control mechanism

This patch is based on V7 of the "DRM scheduler changes for Xe" series. [1]
provides a branch based on drm-misc-next, with the named series and this patch
on top of it.

[1] https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/sched-credits/
---
  Documentation/gpu/drm-mm.rst  |   6 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |   2 +-
  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |   2 +-
  drivers/gpu/drm/lima/lima_sched.c |   2 +-
  drivers/gpu/drm/msm/msm_gem_submit.c  |   2 +-
  drivers/gpu/drm/nouveau/nouveau_sched.c   |   2 +-
  drivers/gpu/drm/panfrost/panfrost_drv.c   |   2 +-
  .../gpu/drm/scheduler/gpu_scheduler_trace.h   |   2 +-
  drivers/gpu/drm/scheduler/sched_entity.c  |   4 +-
  drivers/gpu/drm/scheduler/sched_main.c| 142 ++
  drivers/gpu/drm/v3d/v3d_gem.c |   2 +-
  include/drm/gpu_scheduler.h   |  33 +++-
  12 files changed, 152 insertions(+), 49 deletions(-)

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index 602010cb6894..acc5901ac840 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -552,6 +552,12 @@ Overview
  .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
 :doc: Overview
  
+Flow Control

+
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
+   :doc: Flow Control
+
  Scheduler Function References
  -
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c

index 1f357198533f..62bb7fc7448a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -115,7 +115,7 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
if (!entity)
return 0;
  
-	return drm_sched_job_init(&(*job)->base, entity, owner);

+   return drm_sched_job_init(&(*job)->base, entity, 1, owner);
  }
  
  int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 2416c526f9b0..3d0f8d182506 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -535,7 +535,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void 
*data,
  
  	ret = drm_sched_job_init(>sched_job,

 >sched_entity[args->pipe],
-submit->ctx);
+1, submit->ctx);
if (ret)
goto err_submit_put;
  
diff --git 

Re: [PATCH 05/11] accel/ivpu: Add support for VPU_JOB_FLAGS_NULL_SUBMISSION_MASK

2023-10-27 Thread Jeffrey Hugo

On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote:

From: Andrzej Kacprowski 

Add test_mode = 3 that add VPU_JOB_FLAGS_NULL_SUBMISSION_MASK
flag to the job send to the VPU device. Then the VPU will process
the job but won't execute commands (except the command to signal
the fence).

This can b used to estimate job processing overhead in the host


be?


software and VPU firmware.

Unlike the null hardware mode, the null submission mode will
still work even if UMD uses VPU fences to track job completion.

Signed-off-by: Andrzej Kacprowski 
Reviewed-by: Stanislaw Gruszka 
Signed-off-by: Stanislaw Gruszka 


Reviewed-by: Jeffrey Hugo 


Re: [PATCH 04/11] accel/ivpu: Remove reset from power up sequence

2023-10-27 Thread Jeffrey Hugo

On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote:

From: Karol Wachowski 

Setting a non-zero work point resets the IP hence IP_RESET
trigger is redundant.

Signed-off-by: Karol Wachowski 
Reviewed-by: Stanislaw Gruszka 
Signed-off-by: Stanislaw Gruszka 


Reviewed-by: Jeffrey Hugo 


Re: [PATCH 03/11] accel/ivpu: Add dvfs_mode file to debugfs

2023-10-27 Thread Jeffrey Hugo

On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote:

From: Tomasz Rusinowicz 

Add new debugfs file to set dvfs_mode FW boot parameter and restart
the FW to allow experimenting with DVFS (dynamic voltage & frequency
scaling).

Signed-off-by: Tomasz Rusinowicz 
Signed-off-by: Stanislaw Gruszka 


Reviewed-by: Jeffrey Hugo 


Re: [PATCH 02/11] accel/ivpu: Remove unneeded drm_driver dectaration

2023-10-27 Thread Jeffrey Hugo

$SUBJECT has a spelling error of "declaration"

On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote:

Cleanup drm_driver declaration leftover.

Reviewed-by: Krystian Pradzynski 
Signed-off-by: Stanislaw Gruszka 


Reviewed-by: Jeffrey Hugo 


Re: [PATCH 01/11] accel/ivpu: Update FW API

2023-10-27 Thread Jeffrey Hugo

On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote:

From: Krystian Pradzynski 

Bump boot API to 4.20
Bump JSM API to 3.15

Signed-off-by: Krystian Pradzynski 
Reviewed-by: Stanislaw Gruszka 
Signed-off-by: Stanislaw Gruszka 
---
  drivers/accel/ivpu/ivpu_jsm_msg.c |  17 ++
  drivers/accel/ivpu/vpu_boot_api.h |  90 -
  drivers/accel/ivpu/vpu_jsm_api.h  | 309 --
  3 files changed, 392 insertions(+), 24 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_jsm_msg.c 
b/drivers/accel/ivpu/ivpu_jsm_msg.c
index 0c2fe7142024..35a689475c68 100644
--- a/drivers/accel/ivpu/ivpu_jsm_msg.c
+++ b/drivers/accel/ivpu/ivpu_jsm_msg.c
@@ -36,6 +36,17 @@ const char *ivpu_jsm_msg_type_to_str(enum vpu_ipc_msg_type 
type)
IVPU_CASE_TO_STR(VPU_JSM_MSG_DESTROY_CMD_QUEUE);
IVPU_CASE_TO_STR(VPU_JSM_MSG_SET_CONTEXT_SCHED_PROPERTIES);
IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_REGISTER_DB);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_RESUME_CMDQ);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SUSPEND_CMDQ);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_RESUME_CMDQ_RSP);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SUSPEND_CMDQ_DONE);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SET_SCHEDULING_LOG);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SET_SCHEDULING_LOG_RSP);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SCHEDULING_LOG_NOTIFICATION);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_ENGINE_RESUME);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_RESUME_ENGINE_DONE);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_STATE_DUMP);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_STATE_DUMP_RSP);
IVPU_CASE_TO_STR(VPU_JSM_MSG_BLOB_DEINIT);
IVPU_CASE_TO_STR(VPU_JSM_MSG_DYNDBG_CONTROL);
IVPU_CASE_TO_STR(VPU_JSM_MSG_JOB_DONE);
@@ -65,6 +76,12 @@ const char *ivpu_jsm_msg_type_to_str(enum vpu_ipc_msg_type 
type)
IVPU_CASE_TO_STR(VPU_JSM_MSG_SET_CONTEXT_SCHED_PROPERTIES_RSP);
IVPU_CASE_TO_STR(VPU_JSM_MSG_BLOB_DEINIT_DONE);
IVPU_CASE_TO_STR(VPU_JSM_MSG_DYNDBG_CONTROL_RSP);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_PWR_D0I3_ENTER);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_PWR_D0I3_ENTER_DONE);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_DCT_ENABLE);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_DCT_ENABLE_DONE);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_DCT_DISABLE);
+   IVPU_CASE_TO_STR(VPU_JSM_MSG_DCT_DISABLE_DONE);
}
#undef IVPU_CASE_TO_STR
  
diff --git a/drivers/accel/ivpu/vpu_boot_api.h b/drivers/accel/ivpu/vpu_boot_api.h

index 6b71be92ba65..04c954258563 100644
--- a/drivers/accel/ivpu/vpu_boot_api.h
+++ b/drivers/accel/ivpu/vpu_boot_api.h
@@ -11,7 +11,10 @@
   *  The bellow values will be used to construct the version info this way:
   *  fw_bin_header->api_version[VPU_BOOT_API_VER_ID] = (VPU_BOOT_API_VER_MAJOR 
<< 16) |
   *  VPU_BOOT_API_VER_MINOR;
- *  VPU_BOOT_API_VER_PATCH will be ignored. KMD and compatibility is not 
affected if this changes.
+ *  VPU_BOOT_API_VER_PATCH will be ignored. KMD and compatibility is not 
affected if this changes
+ *  This information is collected by using 
vpuip_2/application/vpuFirmware/make_std_fw_image.py
+ *  If a header is missing this info we ignore the header, if a header is 
missing or contains
+ *  partial info a build error will be generated.
   */
  
  /*

@@ -24,12 +27,12 @@
   * Minor version changes when API backward compatibility is preserved.
   * Resets to 0 if Major version is incremented.
   */
-#define VPU_BOOT_API_VER_MINOR 12
+#define VPU_BOOT_API_VER_MINOR 20
  
  /*

   * API header changed (field names, documentation, formatting) but API itself 
has not been changed
   */
-#define VPU_BOOT_API_VER_PATCH 2
+#define VPU_BOOT_API_VER_PATCH 4
  
  /*

   * Index in the API version table
@@ -63,6 +66,12 @@ struct vpu_firmware_header {
/* Size of memory require for firmware execution */
u32 runtime_size;
u32 shave_nn_fw_size;
+   /* Size of primary preemption buffer. */
+   u32 preemption_buffer_1_size;
+   /* Size of secondary preemption buffer. */
+   u32 preemption_buffer_2_size;
+   /* Space reserved for future preemption-related fields. */
+   u32 preemption_reserved[6];
  };
  
  /*

@@ -89,6 +98,14 @@ enum VPU_BOOT_L2_CACHE_CFG_TYPE {
VPU_BOOT_L2_CACHE_CFG_NUM = 2
  };
  
+/** VPU MCA ECC signalling mode. By default, no signalling is used */


This does not look like valid kernel-doc.  Maybe you wanted "/*" instead?


+enum VPU_BOOT_MCA_ECC_SIGNAL_TYPE {
+   VPU_BOOT_MCA_ECC_NONE = 0,
+   VPU_BOOT_MCA_ECC_CORR = 1,
+   VPU_BOOT_MCA_ECC_FATAL = 2,
+   VPU_BOOT_MCA_ECC_BOTH = 3


Personal preference, but having the "=" and the interget values all line 
up vetrically would make this a bit more plesant to look at.



+};
+
  /**
   * Logging destinations.
   *
@@ -131,9 +148,11 @@ enum vpu_trace_destination {
  #define VPU_TRACE_PROC_BIT_ACT_SHV_3 22
  #define VPU_TRACE_PROC_NO_OF_HW_DEVS 23
  
-/* KMB HW component IDs are sequential, so define first and last IDs. */

-#define 

Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Danilo Krummrich

On 10/27/23 09:17, Boris Brezillon wrote:

Hi Danilo,

On Thu, 26 Oct 2023 18:13:00 +0200
Danilo Krummrich  wrote:


+
+   /**
+* @update_job_credits: Called once the scheduler is considering this
+* job for execution.
+*
+* Drivers may use this to update the job's submission credits, which is
+* useful to e.g. deduct the number of native fences which have been
+* signaled meanwhile.
+*
+* The callback must either return the new number of submission credits
+* for the given job, or zero if no update is required.
+*
+* This callback is optional.
+*/
+   u32 (*update_job_credits)(struct drm_sched_job *sched_job);


I'm copying my late reply to v2 here so it doesn't get lost:

I keep thinking it'd be simpler to make this a void function that
updates s_job->submission_credits directly. I also don't see the
problem with doing a sanity check on job->submission_credits. I mean,
if the driver is doing something silly, you can't do much to prevent it
anyway, except warn the user that something wrong has happened. If you
want to

WARN_ON(job->submission_credits == 0 ||
job->submission_credits > job_old_submission_credits);

that's fine. But none of this sanity checking has to do with the
function prototype/semantics, and I'm still not comfortable with this 0
=> no-change. If there's no change, we should just leave
job->submission_credits unchanged (or return job->submission_credits)
instead of inventing a new special case.


If we can avoid letting drivers change fields of generic structures directly
without any drawbacks I think we should avoid it. Currently, drivers shouldn't
have the need to mess with job->credits directly. The initial value is set
through drm_sched_job_init() and is updated through the return value of
update_job_credits().

I'm fine getting rid of the 0 => no-change semantics though. Instead we can just
WARN() on 0. However, if we do that I'd also want to change it for
drm_sched_job_init() (where 0 currently defaults to 1) such that we accept 0, 
but
WARN() accordingly.

I think it's consequent to either consistently give 0 a different meaning or 
just
accept it but WARN() on it.



Regards,

Boris





Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Luben Tuikov
On 2023-10-27 04:25, Boris Brezillon wrote:
> Hi Danilo,
> 
> On Thu, 26 Oct 2023 18:13:00 +0200
> Danilo Krummrich  wrote:
> 
>> Currently, job flow control is implemented simply by limiting the number
>> of jobs in flight. Therefore, a scheduler is initialized with a credit
>> limit that corresponds to the number of jobs which can be sent to the
>> hardware.
>>
>> This implies that for each job, drivers need to account for the maximum
>> job size possible in order to not overflow the ring buffer.
>>
>> However, there are drivers, such as Nouveau, where the job size has a
>> rather large range. For such drivers it can easily happen that job
>> submissions not even filling the ring by 1% can block subsequent
>> submissions, which, in the worst case, can lead to the ring run dry.
>>
>> In order to overcome this issue, allow for tracking the actual job size
>> instead of the number of jobs. Therefore, add a field to track a job's
>> credit count, which represents the number of credits a job contributes
>> to the scheduler's credit limit.
>>
>> Signed-off-by: Danilo Krummrich 
>> ---
>> Changes in V2:
>> ==
>>   - fixed up influence on scheduling fairness due to consideration of a job's
>> size
>> - If we reach a ready entity in drm_sched_select_entity() but can't 
>> actually
>>   queue a job from it due to size limitations, just give up and go to 
>> sleep
>>   until woken up due to a pending job finishing, rather than continue to 
>> try
>>   other entities.
>>   - added a callback to dynamically update a job's credits (Boris)
> 
> This callback seems controversial. I'd suggest dropping it, so the
> patch can be merged.

Sorry, why is it controversial? (I did read back-and-forth above, but it wasn't 
clear
why it is /controversial/.)

I believe only drivers are privy to changes in the credit availability as their
firmware and hardware executes new jobs and finishes others, and so this 
"update"
here is essential--leaving it only to prepare_job() wouldn't quite fulfill the 
vision
of why the credit mechanism introduced by this patch in the first place.
-- 
Regards,
Luben


OpenPGP_0x4C15479431A334AF.asc
Description: OpenPGP public key


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: drm/panel: panel-simple power-off sequencing

2023-10-27 Thread Doug Anderson
Hi,

On Fri, Oct 27, 2023 at 5:30 AM Jonas Mark (BT-FS/ENG1-GRB)
 wrote:
>
> > I think I've looked at this exact case before and then realized that
> > there's a better solution. At least in all cases I looked at the
> > "enable-gpio" you're talking about was actually better modeled as a
> > _backlight_ enable GPIO. The "backlight" is turned off before panel-
> > simple's disable() function is called (see drm_panel_disable().
> > So if you move the GPIO to the backlight and add a "disable" delay
> > then you're all set.
> >
> > Does that work for you? Does it make sense for this GPIO to be modeled
> > as a backlight GPIO?
>
> In combination with setting the "disable" delay this works *. Yet, it
> feels wrong.
>
> *: backlight-pwm only accepts one GPIO but that can be easily resolved.
>
> It feels wrong that the backlight driver takes over part of the panel
> control. On top, it still needs cooperation of the panel driver for the
> proper timing.

I would first ask you to take another look before saying that it's
wrong to put the enable pin in the backlight driver. At least on most
displays I've seen (though I've spent most time looking at eDP), the
backlight and panel are really not separable entities.

Take a look at the ASCII art timing diagram in
Documentation/devicetree/bindings/display/panel/panel-edp.yaml, for
instance. This is typical of eDP panel diagrams and it includes _both_
the backlight and the display timings. In general, it's always made
sense for me to think of the "LED_EN" line in that diagram as the
backlight enable.


> Lastly, it relies on the current behavior of drm-panel
> that the panel driver is prepared/ enabled first and then the backlight
> is switched on; and the other way around at power off.

That current behavior is not random and I don't think it would be
possible for it to change. Too many things rely on the current order.


> We think that actually more panels in products are affected: We have
> three panels from three different vendors (Sharp, Powertip, and Tianma)
> and only one is visually affected. Yet, all of them specify a number of
> vsyncs after de-asserting the enable GPIO.

If you're certain that your enable GPIO really shouldn't be modeled
with the backlight, then IMO you should submit a patch to panel-simple
that allows an "enable" GPIO to be controlled in the "enable"
function. We'd have to have a discussion about how to best do this.
The fact that the existing "enable" GPIO is considered a "power
enable" predates my involvement in the driver. In other words I think
it's always been in the "prepare/unprepare" functions. It always felt
wrong to me, too. ;-)

I guess the "easiest" (though a bit ugly) solution is to either add a
per-panel boolean flag that says whether that panel wants the enable
GPIO controlled in enable/disable or prepare/enable.

Another solution might be to introduce a 2nd GPIO, though you'd have
to think about what to call it since the existing one is kinda stuck
as "enable" given the DT bindings.

I guess a 3rd solution would be to audit users and see if anyone
actually needs the current "enable" GPIO as it is and whether those
cases would be better modeled as a GPIO-controlled regulator. Of
course, if you have to change how boards model this, then you start
getting into the argument about DT backward compatibility.


I guess, in summary, I'm hoping you'll look again and find that this
really is a backlight enable. If not, I'd probably advocate for a
per-panel boolean.


FWIW, looking a bit at the history and going back to 2014 in commit
f673c37ec453 ("drm/panel: simple: Support delays in panel functions"):

* The backlight calls used to be made directly from panel-simple

* The "unprepare" delay was documented as "the time (in milliseconds)
that it takes for the panel to power itself down completely" and makes
me believe that, even originally, it was about not turning the panel
back on before it fully turns off (T12 in the eDP timing diagram I
pointed at earlier).

* The "enable" GPIO has been controlled from prepare/unprepare since
those functions were introduced in commit 613a633e7a56 ("drm/panel:
simple: Add proper definition for prepare and unprepare"). It kinda
feels like the problem originated here...


-Doug


Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control

2023-10-27 Thread Danilo Krummrich

On 10/27/23 10:25, Boris Brezillon wrote:

Hi Danilo,

On Thu, 26 Oct 2023 18:13:00 +0200
Danilo Krummrich  wrote:


Currently, job flow control is implemented simply by limiting the number
of jobs in flight. Therefore, a scheduler is initialized with a credit
limit that corresponds to the number of jobs which can be sent to the
hardware.

This implies that for each job, drivers need to account for the maximum
job size possible in order to not overflow the ring buffer.

However, there are drivers, such as Nouveau, where the job size has a
rather large range. For such drivers it can easily happen that job
submissions not even filling the ring by 1% can block subsequent
submissions, which, in the worst case, can lead to the ring run dry.

In order to overcome this issue, allow for tracking the actual job size
instead of the number of jobs. Therefore, add a field to track a job's
credit count, which represents the number of credits a job contributes
to the scheduler's credit limit.

Signed-off-by: Danilo Krummrich 
---
Changes in V2:
==
   - fixed up influence on scheduling fairness due to consideration of a job's
 size
 - If we reach a ready entity in drm_sched_select_entity() but can't 
actually
   queue a job from it due to size limitations, just give up and go to sleep
   until woken up due to a pending job finishing, rather than continue to 
try
   other entities.
   - added a callback to dynamically update a job's credits (Boris)


This callback seems controversial. I'd suggest dropping it, so the
patch can be merged.


I don't think we should drop it just for the sake of moving forward. If there 
are objections
we'll discuss them. I've seen good reasons why the drivers you are working on 
require this,
while, following the discussion, I have *not* seen any reasons to drop it. It 
helps simplifying
driver code and doesn't introduce any complexity or overhead to existing 
drivers.



Regards,

Boris


   - renamed 'units' to 'credits'
   - fixed commit message and comments as requested by Luben

Changes in V3:
==
   - rebased onto V7 of the "DRM scheduler changes for Xe" series by Matt
   - move up drm_sched_can_queue() instead of adding a forward declaration
 (Boris)
   - add a drm_sched_available_credits() helper (Boris)
   - adjust control flow in drm_sched_rq_select_entity_fifo() to Luben's 
proposal
   - re-phrase a few comments and fix a typo (Luben)
   - change naming of all structures credit fields and function parameters to 
the
 following scheme
 - drm_sched_job::credits
 - drm_gpu_scheduler::credit_limit
 - drm_gpu_scheduler::credit_count
 - drm_sched_init(..., u32 credit_limit, ...)
 - drm_sched_job_init(..., u32 credits, ...)
   - add proper documentation for the scheduler's job-flow control mechanism






Re: [PATCH v2] accel/ivpu: Disable PLL after VPU IP reset during FLR

2023-10-27 Thread Jeffrey Hugo

On 10/24/2023 10:53 AM, Stanislaw Gruszka wrote:

From: Jacek Lawrynowicz 

IP reset has to followed by ivpu_pll_disable() to properly enter
reset state.

Fixes: 828d63042aec ("accel/ivpu: Don't enter d0i3 during FLR")
Cc: sta...@vger.kernel.org
Signed-off-by: Jacek Lawrynowicz 
Reviewed-by: Stanislaw Gruszka 
Signed-off-by: Stanislaw Gruszka 


Reviewed-by: Jeffrey Hugo 


Re: [RFC PATCH v2 03/17] drm/vkms: Create separate Kconfig file for VKMS

2023-10-27 Thread Simon Ser
On Thursday, October 19th, 2023 at 23:21, Harry Wentland 
 wrote:

> +++ b/drivers/gpu/drm/vkms/Kconfig
> @@ -0,0 +1,15 @@
> +# SPDX-License-Identifier: GPL-2.0+

It seems like the original Kconfig uses GPL-2.0-only. I think it'd be
safer to just re-use the exact same license here?

With that fixed:
Reviewed-by: Simon Ser 


Re: [RFC PATCH v2 01/17] drm/atomic: Allow get_value for immutable properties on atomic drivers

2023-10-27 Thread Simon Ser
Have you seen the comment on top?

 * Atomic drivers should never call this function directly, the core will read
 * out property values through the various ->atomic_get_property callbacks.

It seems like atomic drivers shouldn't call drm_object_property_get_value()
at all?


Re: [RFC PATCH v2 02/17] drm: Don't treat 0 as -1 in drm_fixp2int_ceil

2023-10-27 Thread Simon Ser
Reviewed-by: Simon Ser 


  1   2   >