Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
On 2023-10-27 12:41, Boris Brezillon wrote: > On Fri, 27 Oct 2023 10:32:52 -0400 > Luben Tuikov wrote: > >> On 2023-10-27 04:25, Boris Brezillon wrote: >>> Hi Danilo, >>> >>> On Thu, 26 Oct 2023 18:13:00 +0200 >>> Danilo Krummrich wrote: >>> Currently, job flow control is implemented simply by limiting the number of jobs in flight. Therefore, a scheduler is initialized with a credit limit that corresponds to the number of jobs which can be sent to the hardware. This implies that for each job, drivers need to account for the maximum job size possible in order to not overflow the ring buffer. However, there are drivers, such as Nouveau, where the job size has a rather large range. For such drivers it can easily happen that job submissions not even filling the ring by 1% can block subsequent submissions, which, in the worst case, can lead to the ring run dry. In order to overcome this issue, allow for tracking the actual job size instead of the number of jobs. Therefore, add a field to track a job's credit count, which represents the number of credits a job contributes to the scheduler's credit limit. Signed-off-by: Danilo Krummrich --- Changes in V2: == - fixed up influence on scheduling fairness due to consideration of a job's size - If we reach a ready entity in drm_sched_select_entity() but can't actually queue a job from it due to size limitations, just give up and go to sleep until woken up due to a pending job finishing, rather than continue to try other entities. - added a callback to dynamically update a job's credits (Boris) >>> >>> This callback seems controversial. I'd suggest dropping it, so the >>> patch can be merged. >> >> Sorry, why is it controversial? (I did read back-and-forth above, but it >> wasn't clear >> why it is /controversial/.) > > That's a question for Christian, I guess. I didn't quite get what he > was worried about, other than this hook introducing a new way for > drivers to screw things up by returning funky/invalid credits (which we It's up to the driver--they can test, test, test, and fix their code and so on. We can only do so much and shouldn't be baby-sitting drivers ad nauseam. Drivers can also not define this callback. :-) > can report with WARN_ON()s). But let's be honest, there's probably a > hundred different ways (if not more) drivers can shoot themselves in the > foot with drm_sched already... Yes, that's true. So there's no worries with this hook. > >> >> I believe only drivers are privy to changes in the credit availability as >> their >> firmware and hardware executes new jobs and finishes others, and so this >> "update" >> here is essential--leaving it only to prepare_job() wouldn't quite fulfill >> the vision >> of why the credit mechanism introduced by this patch in the first place. > > I kinda agree with you, even if I wouldn't so pessimistic as to how > useful this patch would be without the ->update_job_credits() hook > (it already makes the situation a lot better for Nouveau and probably > other drivers too). Sure, and that's a good thing. The heart of the dynamic credit scheme this patch is introducing *is* update_job_credits() callback. Without it, it's just about like the current job flow-control scheme we have with varied job weights (credits). Remember, it is an optional callback and driver can choose NOT to define it--simple. :-) So, I'm very excited about this, and see a wide range of applications and tricks drivers can do with the credit scheme (albeit had it been an "int" bwha-ha-ha-ha ]:-> ). Have a good weekend everyone! -- Regards, Luben OpenPGP_0x4C15479431A334AF.asc Description: OpenPGP public key OpenPGP_signature.asc Description: OpenPGP digital signature
Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
On 2023-10-27 12:31, Boris Brezillon wrote: > On Fri, 27 Oct 2023 16:23:24 +0200 > Danilo Krummrich wrote: > >> On 10/27/23 10:25, Boris Brezillon wrote: >>> Hi Danilo, >>> >>> On Thu, 26 Oct 2023 18:13:00 +0200 >>> Danilo Krummrich wrote: >>> Currently, job flow control is implemented simply by limiting the number of jobs in flight. Therefore, a scheduler is initialized with a credit limit that corresponds to the number of jobs which can be sent to the hardware. This implies that for each job, drivers need to account for the maximum job size possible in order to not overflow the ring buffer. However, there are drivers, such as Nouveau, where the job size has a rather large range. For such drivers it can easily happen that job submissions not even filling the ring by 1% can block subsequent submissions, which, in the worst case, can lead to the ring run dry. In order to overcome this issue, allow for tracking the actual job size instead of the number of jobs. Therefore, add a field to track a job's credit count, which represents the number of credits a job contributes to the scheduler's credit limit. Signed-off-by: Danilo Krummrich --- Changes in V2: == - fixed up influence on scheduling fairness due to consideration of a job's size - If we reach a ready entity in drm_sched_select_entity() but can't actually queue a job from it due to size limitations, just give up and go to sleep until woken up due to a pending job finishing, rather than continue to try other entities. - added a callback to dynamically update a job's credits (Boris) >>> >>> This callback seems controversial. I'd suggest dropping it, so the >>> patch can be merged. >> >> I don't think we should drop it just for the sake of moving forward. If >> there are objections >> we'll discuss them. I've seen good reasons why the drivers you are working >> on require this, >> while, following the discussion, I have *not* seen any reasons to drop it. >> It helps simplifying >> driver code and doesn't introduce any complexity or overhead to existing >> drivers. > > Up to you. I'm just saying, moving one step in the right direction is > better than being stuck, and it's not like adding this callback in a > follow-up patch is super complicated either. If you're confident that > we can get all parties to agree on keeping this hook, fine by me. I'd rather have it in now, as it is really *the vision* of this patch. There's no point in pushing in something half-baked. -- Regards, Luben OpenPGP_0x4C15479431A334AF.asc Description: OpenPGP public key OpenPGP_signature.asc Description: OpenPGP digital signature
Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
Hi, On 2023-10-27 12:26, Boris Brezillon wrote: > On Fri, 27 Oct 2023 16:34:26 +0200 > Danilo Krummrich wrote: > >> On 10/27/23 09:17, Boris Brezillon wrote: >>> Hi Danilo, >>> >>> On Thu, 26 Oct 2023 18:13:00 +0200 >>> Danilo Krummrich wrote: >>> + + /** + * @update_job_credits: Called once the scheduler is considering this + * job for execution. + * + * Drivers may use this to update the job's submission credits, which is + * useful to e.g. deduct the number of native fences which have been + * signaled meanwhile. + * + * The callback must either return the new number of submission credits + * for the given job, or zero if no update is required. + * + * This callback is optional. + */ + u32 (*update_job_credits)(struct drm_sched_job *sched_job); >>> >>> I'm copying my late reply to v2 here so it doesn't get lost: >>> >>> I keep thinking it'd be simpler to make this a void function that >>> updates s_job->submission_credits directly. I also don't see the >>> problem with doing a sanity check on job->submission_credits. I mean, >>> if the driver is doing something silly, you can't do much to prevent it >>> anyway, except warn the user that something wrong has happened. If you >>> want to >>> >>> WARN_ON(job->submission_credits == 0 || >>> job->submission_credits > job_old_submission_credits); >>> >>> that's fine. But none of this sanity checking has to do with the >>> function prototype/semantics, and I'm still not comfortable with this 0 >>> => no-change. If there's no change, we should just leave >>> job->submission_credits unchanged (or return job->submission_credits) >>> instead of inventing a new special case. >> >> If we can avoid letting drivers change fields of generic structures directly >> without any drawbacks I think we should avoid it. Currently, drivers >> shouldn't >> have the need to mess with job->credits directly. The initial value is set >> through drm_sched_job_init() and is updated through the return value of >> update_job_credits(). > > Fair enough. I do agree that keeping internal fields out of driver > hands is a good thing in general, it's just that it's already > free-for-all in so many places in drm_sched (like the fact drivers "Free-for-all" doesn't mean we need to follow suit. We should keep good programming practices, as this patch strives to. > iterate the pending list in their stop-queue handling) that I didn't > really see it as an issue. Note that's there's always the option of > providing drm_sched_job_{update,get}_credits() helpers, with the update > helper making sure the new credits value is consistent (smaller or > equal to the old one, and not zero). > >> >> I'm fine getting rid of the 0 => no-change semantics though. Instead we can >> just >> WARN() on 0. > > Yeah, I think that's preferable. It's pretty easy to return the old > value if the driver has a way to detect when nothing changed (with a > get helper if you don't want drivers to touch the credits field). > >> However, if we do that I'd also want to change it for >> drm_sched_job_init() (where 0 currently defaults to 1) such that we accept >> 0, but >> WARN() accordingly. > > Sure. You update all drivers anyway, so passing 1 instead of 0 is not a > big deal, I would say. At this point in time, we should consider 1 as normal, 0 out of spec and WARN on it but carry on and (perhaps) reset it to 1. Drivers in the future, may see a need (i.e. do tricks) to return 0, at which point they'll submit a patch which does two things, 1) removes the WARN, 2) removes the reset from 0 to 1, and explain why they need to return 0 to allow (one more) job, but we're nowhere near then yet, so status quo for now. I don't see how it makes sense to call drm_sched_job_init(credits:0), and I believe the code is correct to default to 1 in that case--which defaults to the current flow control we have, which we want. > >> >> I think it's consequent to either consistently give 0 a different meaning or >> just >> accept it but WARN() on it. > > Using default as a default value makes sense when you're passing I suppose you meant "using zero as a default value". > zero-initialized objects that are later extended with new fields, but > here you update the function prototype and all the call sites, so we're > better off considering 0 as an invalid value, IMHO. Yes, absolutely. You never want to give 0 a meaning, since as you pointed out, it is zero-ed memory, and as such, can have any meaning you'd like. So yes: WARN on 0; 1 is good and normal. Regards, Luben OpenPGP_0x4C15479431A334AF.asc Description: OpenPGP public key OpenPGP_signature.asc Description: OpenPGP digital signature
[PATCH] drm/panel-edp: Add timings for BOE NV133WUM-N63
This panel is found on laptops e.g., variants of the Thinkpad X13s. Configuration was collected from the panel's EDID. Signed-off-by: Clayton Craft --- drivers/gpu/drm/panel/panel-edp.c | 27 +++ 1 file changed, 27 insertions(+) diff --git a/drivers/gpu/drm/panel/panel-edp.c b/drivers/gpu/drm/panel/panel-edp.c index 95c8472d878a..5db283f014f3 100644 --- a/drivers/gpu/drm/panel/panel-edp.c +++ b/drivers/gpu/drm/panel/panel-edp.c @@ -1204,6 +1204,29 @@ static const struct panel_desc boe_nv133fhm_n61 = { }, }; +static const struct drm_display_mode boe_nv133wum_n63_modes = { + .clock = 157760, + .hdisplay = 1920, + .hsync_start = 1920 + 48, + .hsync_end = 1920 + 48 + 32, + .htotal = 1920 + 48 + 32 + 80, + .vdisplay = 1200, + .vsync_start = 1200 + 3, + .vsync_end = 1200 + 3 + 6, + .vtotal = 1200 + 3 + 6 + 31, + .flags = DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_NVSYNC, +}; + +static const struct panel_desc boe_nv133wum_n63 = { + .modes = _nv133wum_n63_modes, + .num_modes = 1, + .bpc = 8, + .size = { + .width = 286, + .height = 179, + }, +}; + static const struct drm_display_mode boe_nv140fhmn49_modes[] = { { .clock = 148500, @@ -1723,6 +1746,9 @@ static const struct of_device_id platform_of_match[] = { }, { .compatible = "boe,nv133fhm-n62", .data = _nv133fhm_n61, + }, { + .compatible = "boe,nv133wum-n63", + .data = _nv133wum_n63, }, { .compatible = "boe,nv140fhmn49", .data = _nv140fhmn49, @@ -1852,6 +1878,7 @@ static const struct edp_panel_entry edp_panels[] = { EDP_PANEL_ENTRY('B', 'O', 'E', 0x095f, _200_500_e50, "NE135FBM-N41 v8.1"), EDP_PANEL_ENTRY('B', 'O', 'E', 0x098d, _nv110wtm_n61.delay, "NV110WTM-N61"), EDP_PANEL_ENTRY('B', 'O', 'E', 0x09dd, _200_500_e50, "NT116WHM-N21"), + EDP_PANEL_ENTRY('B', 'O', 'E', 0x0a1b, _200_500_e50, "NV133WUM-N63"), EDP_PANEL_ENTRY('B', 'O', 'E', 0x0a5d, _200_500_e50, "NV116WHM-N45"), EDP_PANEL_ENTRY('B', 'O', 'E', 0x0ac5, _200_500_e50, "NV116WHM-N4C"), -- 2.40.1
Re: [PATCH] drm/msm/gem: Add metadata
On Fri, 27 Oct 2023 at 22:45, Rob Clark wrote: > > From: Rob Clark > > The EXT_external_objects extension is a bit awkward as it doesn't pass > explicit modifiers, leaving the importer to guess with incomplete > information. In the case of vk (turnip) exporting and gl (freedreno) > importing, the "OPTIMAL_TILING_EXT" layout depends on VkImageCreateInfo > flags (among other things), which the importer does not know. Which > unfortunately leaves us with the need for a metadata back-channel. > > The contents of the metadata are defined by userspace. The > EXT_external_objects extension is only required to work between > compatible versions of gl and vk drivers, as defined by device and > driver UUIDs. > > Signed-off-by: Rob Clark > --- > drivers/gpu/drm/msm/msm_drv.c | 59 +-- > drivers/gpu/drm/msm/msm_gem.h | 4 +++ > include/uapi/drm/msm_drm.h| 2 ++ > 3 files changed, 63 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c > index b61ccea05327..8fe2677ea37a 100644 > --- a/drivers/gpu/drm/msm/msm_drv.c > +++ b/drivers/gpu/drm/msm/msm_drv.c > @@ -37,9 +37,10 @@ > * - 1.9.0 - Add MSM_SUBMIT_FENCE_SN_IN > * - 1.10.0 - Add MSM_SUBMIT_BO_NO_IMPLICIT > * - 1.11.0 - Add wait boost (MSM_WAIT_FENCE_BOOST, MSM_PREP_BOOST) > + * - 1.12.0 - Add MSM_INFO_SET_METADATA and MSM_INFO_GET_METADATA > */ > #define MSM_VERSION_MAJOR 1 > -#define MSM_VERSION_MINOR 10 > +#define MSM_VERSION_MINOR 12 > #define MSM_VERSION_PATCHLEVEL 0 > > static void msm_deinit_vram(struct drm_device *ddev); > @@ -566,6 +567,8 @@ static int msm_ioctl_gem_info(struct drm_device *dev, > void *data, > break; > case MSM_INFO_SET_NAME: > case MSM_INFO_GET_NAME: > + case MSM_INFO_SET_METADATA: > + case MSM_INFO_GET_METADATA: > break; > default: > return -EINVAL; > @@ -618,7 +621,7 @@ static int msm_ioctl_gem_info(struct drm_device *dev, > void *data, > break; > case MSM_INFO_GET_NAME: > if (args->value && (args->len < strlen(msm_obj->name))) { > - ret = -EINVAL; > + ret = -ETOOSMALL; This is unrelated and it also slightly changes user interface, so it IMO should come as a separate commit/ > break; > } > args->len = strlen(msm_obj->name); > @@ -627,6 +630,58 @@ static int msm_ioctl_gem_info(struct drm_device *dev, > void *data, > msm_obj->name, args->len)) > ret = -EFAULT; > } > + break; > + case MSM_INFO_SET_METADATA: > + /* Impose a moderate upper bound on metadata size: */ > + if (args->len > 128) { > + ret = -EOVERFLOW; > + break; > + } > + > + ret = msm_gem_lock_interruptible(obj); > + if (ret) > + break; > + > + msm_obj->metadata = > + krealloc(msm_obj->metadata, args->len, GFP_KERNEL); > + msm_obj->metadata_size = args->len; > + > + if (copy_from_user(msm_obj->metadata, > u64_to_user_ptr(args->value), > + args->len)) { > + msm_obj->metadata_size = 0; > + ret = -EFAULT; > + } > + > + msm_gem_unlock(obj); > + > + break; > + case MSM_INFO_GET_METADATA: > + if (!args->value) { > + /* > +* Querying the size is inherently racey, but > +* EXT_external_objects expects the app to confirm > +* via device and driver UUIDs that the exporter and > +* importer versions match. All we can do from the > +* kernel side is check the length under obj lock > +* when userspace tries to retrieve the metadata > +*/ > + args->len = msm_obj->metadata_size; > + break; > + } > + > + ret = msm_gem_lock_interruptible(obj); > + if (ret) > + break; > + > + if (args->len < msm_obj->metadata_size) { > + ret = -ETOOSMALL; > + } else if (copy_to_user(u64_to_user_ptr(args->value), > + msm_obj->metadata, args->len)) { > + ret = -EFAULT; > + } Doesn't checkpwatch warn here about extra curly brackets? > + > + msm_gem_unlock(obj); > + > break; > } > > diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h > index
Re: [PATCH 3/7] drm/msm/gem: Don't queue job to sched in error cases
On Fri, 27 Oct 2023 at 19:59, Rob Clark wrote: > > From: Rob Clark > > We shouldn't be running the job in error cases. This also avoids having > to think too hard about where the objs get unpinned (and if necessary, > the resv takes over tracking that the obj is busy).. ie. error cases it > always happens synchronously, and normal cases it happens from scheduler > job_run() callback. > > Signed-off-by: Rob Clark Reviewed-by: Dmitry Baryshkov > --- > drivers/gpu/drm/msm/msm_gem_submit.c | 3 +++ > 1 file changed, 3 insertions(+) -- With best wishes Dmitry
Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay
Hello, On Fri, Oct 27, 2023 at 11:57:45AM -0400, Hamza Mahfooz wrote: > On 10/27/23 11:55, Lakha, Bhawanpreet wrote: > > [AMD Official Use Only - General] > > > > > > > > There was a consensus to use memset instead of {0}. I remember making > > changes related to that previously. > > Hm, seems like it's used rather consistently in the DM and in DC > though. > Have you decided which one should be used? Should I submit a v2 patch using {0} instead of memset? Yuran Pereira > > > > Bhawan > > > > > > *From:* Mahfooz, Hamza > > *Sent:* October 27, 2023 11:53 AM > > *To:* Yuran Pereira ; airl...@gmail.com > > > > *Cc:* Li, Sun peng (Leo) ; Lakha, Bhawanpreet > > ; Pan, Xinhui ; Siqueira, > > Rodrigo ; linux-ker...@vger.kernel.org > > ; amd-...@lists.freedesktop.org > > ; dri-devel@lists.freedesktop.org > > ; Deucher, Alexander > > ; Koenig, Christian > > ; > > linux-kernel-ment...@lists.linuxfoundation.org > > > > *Subject:* Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in > > amdgpu_dm_setup_replay > > On 10/26/23 17:25, Yuran Pereira wrote: > > > Since `pr_config` is not initialized after its declaration, the > > > following operations with `replay_enable_option` may be performed > > > when `replay_enable_option` is holding junk values which could > > > possibly lead to undefined behaviour > > > > > > ``` > > > ... > > > pr_config.replay_enable_option |= pr_enable_option_static_screen; > > > ... > > > > > > if (!pr_config.replay_timing_sync_supported) > > > pr_config.replay_enable_option &= ~pr_enable_option_general_ui; > > > ... > > > ``` > > > > > > This patch initializes `pr_config` after its declaration to ensure that > > > it doesn't contain junk data, and prevent any undefined behaviour > > > > > > Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable") > > > Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code") > > > Signed-off-by: Yuran Pereira > > > --- > > > drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c > > > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c > > > index 32d3086c4cb7..40526507f50b 100644 > > > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c > > > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c > > > @@ -23,6 +23,7 @@ > > > * > > > */ > > > +#include > > > #include "amdgpu_dm_replay.h" > > > #include "dc.h" > > > #include "dm_helpers.h" > > > @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, > > > struct amdgpu_dm_connector *ac > > > struct replay_config pr_config; > > > > I would prefer setting pr_config = {0}; > > > > > union replay_debug_flags *debug_flags = NULL; > > > + memset(_config, 0, sizeof(pr_config)); > > > + > > > // For eDP, if Replay is supported, return true to skip checks > > > if (link->replay_settings.config.replay_supported) > > > return true; > > -- > > Hamza > > > -- > Hamza >
Re: [PATCH 2/7] drm/msm/gem: Remove submit_unlock_unpin_bo()
On Fri, 27 Oct 2023 at 19:59, Rob Clark wrote: > > From: Rob Clark > > The only point it is called is before pinning objects, so the "unpin" > part of the name is fiction. Just remove call submit_cleanup_bo() Nit: 'remove it and call' Other than that: Reviewed-by: Dmitry Baryshkov > directly. > > Signed-off-by: Rob Clark > --- > drivers/gpu/drm/msm/msm_gem_submit.c | 10 ++ > 1 file changed, 2 insertions(+), 8 deletions(-) -- With best wishes Dmitry
Re: [PATCH v2 1/2] drm/msm/dp: don't touch DP subconnector property in eDP case
On Sat, 28 Oct 2023 at 00:02, Abhinav Kumar wrote: > > > > On 10/25/2023 2:23 AM, Dmitry Baryshkov wrote: > > From: Abel Vesa > > > > In case of the eDP connection there is no subconnetor and as such no > > subconnector property. Put drm_dp_set_subconnector_property() calls > > under the !is_edp condition. > > > > Fixes: bfcc3d8f94f4 ("drm/msm/dp: support setting the DP subconnector type") > > Signed-off-by: Abel Vesa > > Signed-off-by: Dmitry Baryshkov > > --- > > drivers/gpu/drm/msm/dp/dp_display.c | 15 ++- > > 1 file changed, 10 insertions(+), 5 deletions(-) > > > > We still need to unify the calls to drm_dp_set_subconnector_property() > for the hpd connect/disconnect places preferably in > dp_display_send_hpd_notification(). > > That way, we would have had to make this change only in one location. Good point, I'd like to take another look at the HPD handling in the DP driver after we land the pending pm_runtime changes. As a part of that I'll check the drm_dp_set_subconnector_property() calls. > If you want to pursue that as a separate patch, I am fine as well. > > Hence, > > Reviewed-by: Abhinav Kumar -- With best wishes Dmitry
[PATCH] drm/radeon: replace 1-element arrays with flexible-array members
Reported by coccinelle, the following patch will move the following 1 element arrays to flexible arrays. drivers/gpu/drm/radeon/atombios.h:5523:32-48: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:5545:32-48: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:5461:34-44: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4447:30-40: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4236:30-41: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7044:24-37: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7054:24-37: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7095:28-45: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7553:8-17: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7559:8-17: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:3896:27-37: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:5443:16-25: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:5454:34-43: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4603:21-32: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:6299:32-44: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4628:32-46: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:6285:29-39: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4296:30-36: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4756:28-36: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4064:22-35: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7327:9-24: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7332:32-53: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:6030:8-17: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7362:26-41: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7369:29-44: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7349:24-32: WARNING use flexible-array member instead
Re: [PATCH 2/5] drm/panel: nv3051d: Add Powkiddy RK2023 Panel Support
On 10/20/2023 8:02 AM, Chris Morgan wrote: On Thu, Oct 19, 2023 at 10:22:24AM -0700, Jessica Zhang wrote: On 10/18/2023 9:18 AM, Chris Morgan wrote: From: Chris Morgan Refactor the driver to add support for the powkiddy,rk2023-panel panel. This panel is extremely similar to the rg353p-panel but requires a smaller vertical back porch and isn't as tolerant of higher speeds. Tested on my RG351V, RG353P, RG353V, and RK2023. Signed-off-by: Chris Morgan Hi Chris, Thanks for the patch. Just have a minor question below. --- .../gpu/drm/panel/panel-newvision-nv3051d.c | 56 +++ 1 file changed, 45 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/panel/panel-newvision-nv3051d.c b/drivers/gpu/drm/panel/panel-newvision-nv3051d.c index 79de6c886292..d24c51503d68 100644 --- a/drivers/gpu/drm/panel/panel-newvision-nv3051d.c +++ b/drivers/gpu/drm/panel/panel-newvision-nv3051d.c @@ -28,6 +28,7 @@ struct nv3051d_panel_info { unsigned int num_modes; u16 width_mm, height_mm; u32 bus_flags; + u32 mode_flags; }; struct panel_nv3051d { @@ -385,15 +386,7 @@ static int panel_nv3051d_probe(struct mipi_dsi_device *dsi) dsi->lanes = 4; dsi->format = MIPI_DSI_FMT_RGB888; - dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST | - MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET; - - /* -* The panel in the RG351V is identical to the 353P, except it -* requires MIPI_DSI_CLOCK_NON_CONTINUOUS to operate correctly. -*/ - if (of_device_is_compatible(dev->of_node, "anbernic,rg351v-panel")) - dsi->mode_flags |= MIPI_DSI_CLOCK_NON_CONTINUOUS; + dsi->mode_flags = ctx->panel_info->mode_flags; drm_panel_init(>panel, >dev, _nv3051d_funcs, DRM_MODE_CONNECTOR_DSI); @@ -481,18 +474,59 @@ static const struct drm_display_mode nv3051d_rgxx3_modes[] = { }, }; -static const struct nv3051d_panel_info nv3051d_rgxx3_info = { +static const struct drm_display_mode nv3051d_rk2023_modes[] = { + { + .hdisplay = 640, + .hsync_start= 640 + 40, + .hsync_end = 640 + 40 + 2, + .htotal = 640 + 40 + 2 + 80, + .vdisplay = 480, + .vsync_start= 480 + 18, + .vsync_end = 480 + 18 + 2, + .vtotal = 480 + 18 + 2 + 4, + .clock = 24150, + .flags = DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_NVSYNC, + }, +}; + +static const struct nv3051d_panel_info nv3051d_rg351v_info = { .display_modes = nv3051d_rgxx3_modes, .num_modes = ARRAY_SIZE(nv3051d_rgxx3_modes), .width_mm = 70, .height_mm = 57, .bus_flags = DRM_BUS_FLAG_DE_LOW | DRM_BUS_FLAG_PIXDATA_DRIVE_NEGEDGE, + .mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST | + MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET | + MIPI_DSI_CLOCK_NON_CONTINUOUS, +}; + +static const struct nv3051d_panel_info nv3051d_rg353p_info = { + .display_modes = nv3051d_rgxx3_modes, + .num_modes = ARRAY_SIZE(nv3051d_rgxx3_modes), + .width_mm = 70, + .height_mm = 57, Will all the panels for this driver be 70x57? If so, would it be better to set display_info.[width_mm|height_mm] directly? They are all so far the same size, but I can't guarantee that going forward. To my knowledge this is the last of the nv3051d devices I'll be working on in the foreseeable future though, and so far they're all identical in size. Got it, if it's not guaranteed might be better to leave it as it then. Thanks for clarifying. BR, Jessica Zhang + .bus_flags = DRM_BUS_FLAG_DE_LOW | DRM_BUS_FLAG_PIXDATA_DRIVE_NEGEDGE, + .mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST | + MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET, +}; + +static const struct nv3051d_panel_info nv3051d_rk2023_info = { + .display_modes = nv3051d_rk2023_modes, + .num_modes = ARRAY_SIZE(nv3051d_rk2023_modes), + .width_mm = 70, + .height_mm = 57, + .bus_flags = DRM_BUS_FLAG_DE_LOW | DRM_BUS_FLAG_PIXDATA_DRIVE_NEGEDGE, + .mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_BURST | + MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_NO_EOT_PACKET, }; static const struct of_device_id newvision_nv3051d_of_match[] = { - { .compatible = "newvision,nv3051d", .data = _rgxx3_info }, + { .compatible = "anbernic,rg351v-panel", .data = _rg351v_info }, + { .compatible = "anbernic,rg353p-panel", .data = _rg353p_info }, + { .compatible = "powkiddy,rk2023-panel", .data = _rk2023_info }, { /* sentinel */ } }; + Sorry, will fix that in a V2. Thank you. I think you can drop this stray newline. Thanks, Jessica Zhang
[PATCH RFC v7 09/10] drm/msm/dpu: Use DRM solid_fill property
Drop DPU_PLANE_COLOR_FILL_FLAG and check the DRM solid_fill property to determine if the plane is solid fill. In addition drop the DPU plane color_fill field as we can now use drm_plane_state.solid_fill instead, and pass in drm_plane_state.alpha to _dpu_plane_color_fill_pipe() to allow userspace to configure the alpha value for the solid fill color. Reviewed-by: Dmitry Baryshkov Signed-off-by: Jessica Zhang --- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 38 --- 1 file changed, 25 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c index 9615653db787..832747080daf 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c @@ -42,7 +42,6 @@ #define SHARP_SMOOTH_THR_DEFAULT 8 #define SHARP_NOISE_THR_DEFAULT2 -#define DPU_PLANE_COLOR_FILL_FLAG BIT(31) #define DPU_ZPOS_MAX 255 /* @@ -84,7 +83,6 @@ struct dpu_plane { enum dpu_sspp pipe; - uint32_t color_fill; bool is_error; bool is_rt_pipe; const struct dpu_mdss_cfg *catalog; @@ -640,19 +638,34 @@ static void _dpu_plane_color_fill_pipe(struct dpu_plane_state *pstate, _dpu_plane_setup_scaler(pipe, fmt, true, _cfg, pstate->rotation); } +static uint32_t _dpu_plane_get_abgr_fill_color(struct drm_plane_state *state) +{ + struct drm_solid_fill solid_fill = state->solid_fill; + + uint32_t ret = 0; + uint8_t a = state->alpha & 0xFF; + uint8_t b = solid_fill.b >> 24; + uint8_t g = solid_fill.g >> 24; + uint8_t r = solid_fill.r >> 24; + + ret |= a << 24; + ret |= b << 16; + ret |= g << 8; + ret |= r; + + return ret; +} + /** * _dpu_plane_color_fill - enables color fill on plane * @pdpu: Pointer to DPU plane object * @color: RGB fill color value, [23..16] Blue, [15..8] Green, [7..0] Red - * @alpha: 8-bit fill alpha value, 255 selects 100% alpha */ -static void _dpu_plane_color_fill(struct dpu_plane *pdpu, - uint32_t color, uint32_t alpha) +static void _dpu_plane_color_fill(struct dpu_plane *pdpu, uint32_t color) { const struct dpu_format *fmt; const struct drm_plane *plane = >base; struct dpu_plane_state *pstate = to_dpu_plane_state(plane->state); - u32 fill_color = (color & 0xFF) | ((alpha & 0xFF) << 24); DPU_DEBUG_PLANE(pdpu, "\n"); @@ -667,11 +680,11 @@ static void _dpu_plane_color_fill(struct dpu_plane *pdpu, /* update sspp */ _dpu_plane_color_fill_pipe(pstate, >pipe, >pipe_cfg.dst_rect, - fill_color, fmt); + color, fmt); if (pstate->r_pipe.sspp) _dpu_plane_color_fill_pipe(pstate, >r_pipe, >r_pipe_cfg.dst_rect, - fill_color, fmt); + color, fmt); } static int dpu_plane_prepare_fb(struct drm_plane *plane, @@ -1019,10 +1032,9 @@ void dpu_plane_flush(struct drm_plane *plane) */ if (pdpu->is_error) /* force white frame with 100% alpha pipe output on error */ - _dpu_plane_color_fill(pdpu, 0xFF, 0xFF); - else if (pdpu->color_fill & DPU_PLANE_COLOR_FILL_FLAG) - /* force 100% alpha */ - _dpu_plane_color_fill(pdpu, pdpu->color_fill, 0xFF); + _dpu_plane_color_fill(pdpu, 0x); + else if (drm_plane_solid_fill_enabled(plane->state)) + _dpu_plane_color_fill(pdpu, _dpu_plane_get_abgr_fill_color(plane->state)); else { dpu_plane_flush_csc(pdpu, >pipe); dpu_plane_flush_csc(pdpu, >r_pipe); @@ -1067,7 +1079,7 @@ static void dpu_plane_sspp_update_pipe(struct drm_plane *plane, } /* override for color fill */ - if (pdpu->color_fill & DPU_PLANE_COLOR_FILL_FLAG) { + if (drm_plane_solid_fill_enabled(plane->state)) { _dpu_plane_set_qos_ctrl(plane, pipe, false); /* skip remaining processing on color fill */ -- 2.42.0
[PATCH RFC v7 01/10] drm: Introduce pixel_source DRM plane property
Add support for pixel_source property to drm_plane and related documentation. In addition, force pixel_source to DRM_PLANE_PIXEL_SOURCE_FB in DRM_IOCTL_MODE_SETPLANE as to not break legacy userspace. This enum property will allow user to specify a pixel source for the plane. Possible pixel sources will be defined in the drm_plane_pixel_source enum. Currently, the only pixel sources are DRM_PLANE_PIXEL_SOURCE_FB (the default value) and DRM_PLANE_PIXEL_SOURCE_NONE. Acked-by: Dmitry Baryshkov Acked-by: Pekka Paalanen Acked-by: Harry Wentland Acked-by: Sebastian Wick Acked-by: Simon Ser Signed-off-by: Jessica Zhang --- drivers/gpu/drm/drm_atomic_state_helper.c | 1 + drivers/gpu/drm/drm_atomic_uapi.c | 4 ++ drivers/gpu/drm/drm_blend.c | 94 +++ drivers/gpu/drm/drm_plane.c | 19 +-- include/drm/drm_blend.h | 2 + include/drm/drm_plane.h | 21 +++ 6 files changed, 137 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c b/drivers/gpu/drm/drm_atomic_state_helper.c index 784e63d70a42..01638c51ce0a 100644 --- a/drivers/gpu/drm/drm_atomic_state_helper.c +++ b/drivers/gpu/drm/drm_atomic_state_helper.c @@ -252,6 +252,7 @@ void __drm_atomic_helper_plane_state_reset(struct drm_plane_state *plane_state, plane_state->alpha = DRM_BLEND_ALPHA_OPAQUE; plane_state->pixel_blend_mode = DRM_MODE_BLEND_PREMULTI; + plane_state->pixel_source = DRM_PLANE_PIXEL_SOURCE_FB; if (plane->color_encoding_property) { if (!drm_object_property_get_default_value(>base, diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c index 98d3b10c08ae..46c78b87803d 100644 --- a/drivers/gpu/drm/drm_atomic_uapi.c +++ b/drivers/gpu/drm/drm_atomic_uapi.c @@ -562,6 +562,8 @@ static int drm_atomic_plane_set_property(struct drm_plane *plane, state->src_w = val; } else if (property == config->prop_src_h) { state->src_h = val; + } else if (property == plane->pixel_source_property) { + state->pixel_source = val; } else if (property == plane->alpha_property) { state->alpha = val; } else if (property == plane->blend_mode_property) { @@ -634,6 +636,8 @@ drm_atomic_plane_get_property(struct drm_plane *plane, *val = state->src_w; } else if (property == config->prop_src_h) { *val = state->src_h; + } else if (property == plane->pixel_source_property) { + *val = state->pixel_source; } else if (property == plane->alpha_property) { *val = state->alpha; } else if (property == plane->blend_mode_property) { diff --git a/drivers/gpu/drm/drm_blend.c b/drivers/gpu/drm/drm_blend.c index 6e74de833466..fce734cdb85b 100644 --- a/drivers/gpu/drm/drm_blend.c +++ b/drivers/gpu/drm/drm_blend.c @@ -185,6 +185,25 @@ * plane does not expose the "alpha" property, then this is * assumed to be 1.0 * + * pixel_source: + * pixel_source is set up with drm_plane_create_pixel_source_property(). + * It is used to toggle the active source of pixel data for the plane. + * The plane will only display data from the set pixel_source -- any + * data from other sources will be ignored. + * + * For non-framebuffer sources, if pixel_source is set to a non-framebuffer + * and non-NONE source, and the corresponding source property is NULL, then + * the atomic commit should return an error. + * + * Possible values: + * + * "NONE": + * No active pixel source. + * Committing with a NONE pixel source will disable the plane. + * + * "FB": + * Framebuffer source set by the "FB_ID" property. + * * Note that all the property extensions described here apply either to the * plane or the CRTC (e.g. for the background color, which currently is not * exposed and assumed to be black). @@ -615,3 +634,78 @@ int drm_plane_create_blend_mode_property(struct drm_plane *plane, return 0; } EXPORT_SYMBOL(drm_plane_create_blend_mode_property); + +static const struct drm_prop_enum_list drm_pixel_source_enum_list[] = { + { DRM_PLANE_PIXEL_SOURCE_NONE, "NONE" }, + { DRM_PLANE_PIXEL_SOURCE_FB, "FB" }, +}; + +/** + * drm_plane_create_pixel_source_property - create a new pixel source property + * @plane: DRM plane + * @extra_sources: Bitmask of additional supported pixel_sources for the driver. + *DRM_PLANE_PIXEL_SOURCE_FB and DRM_PLANE_PIXEL_SOURCE_NONE will + *always be enabled as supported sources. + * + * This creates a new property describing the current source of pixel data for the + * plane. The pixel_source will be initialized as DRM_PLANE_PIXEL_SOURCE_FB by default. + * + * Drivers can set a custom default source by overriding the pixel_source
[PATCH RFC v7 06/10] drm/atomic: Move framebuffer checks to helper
Currently framebuffer checks happen directly in drm_atomic_plane_check(). Move these checks into their own helper method. Reviewed-by: Dmitry Baryshkov Acked-by: Harry Wentland Acked-by: Sebastian Wick Signed-off-by: Jessica Zhang --- drivers/gpu/drm/drm_atomic.c | 130 --- 1 file changed, 73 insertions(+), 57 deletions(-) diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index af778d32785b..48a2df4e3d27 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -589,6 +589,76 @@ plane_switching_crtc(const struct drm_plane_state *old_plane_state, return true; } +static int drm_atomic_plane_check_fb(const struct drm_plane_state *state) +{ + struct drm_plane *plane = state->plane; + const struct drm_framebuffer *fb = state->fb; + struct drm_mode_rect *clips; + + uint32_t num_clips; + unsigned int fb_width, fb_height; + int ret; + + /* Check whether this plane supports the fb pixel format. */ + ret = drm_plane_check_pixel_format(plane, fb->format->format, + fb->modifier); + + if (ret) { + drm_dbg_atomic(plane->dev, + "[PLANE:%d:%s] invalid pixel format %p4cc, modifier 0x%llx\n", + plane->base.id, plane->name, + >format->format, fb->modifier); + return ret; + } + + fb_width = fb->width << 16; + fb_height = fb->height << 16; + + /* Make sure source coordinates are inside the fb. */ + if (state->src_w > fb_width || + state->src_x > fb_width - state->src_w || + state->src_h > fb_height || + state->src_y > fb_height - state->src_h) { + drm_dbg_atomic(plane->dev, + "[PLANE:%d:%s] invalid source coordinates " + "%u.%06ux%u.%06u+%u.%06u+%u.%06u (fb %ux%u)\n", + plane->base.id, plane->name, + state->src_w >> 16, + ((state->src_w & 0x) * 15625) >> 10, + state->src_h >> 16, + ((state->src_h & 0x) * 15625) >> 10, + state->src_x >> 16, + ((state->src_x & 0x) * 15625) >> 10, + state->src_y >> 16, + ((state->src_y & 0x) * 15625) >> 10, + fb->width, fb->height); + return -ENOSPC; + } + + clips = __drm_plane_get_damage_clips(state); + num_clips = drm_plane_get_damage_clips_count(state); + + /* Make sure damage clips are valid and inside the fb. */ + while (num_clips > 0) { + if (clips->x1 >= clips->x2 || + clips->y1 >= clips->y2 || + clips->x1 < 0 || + clips->y1 < 0 || + clips->x2 > fb_width || + clips->y2 > fb_height) { + drm_dbg_atomic(plane->dev, + "[PLANE:%d:%s] invalid damage clip %d %d %d %d\n", + plane->base.id, plane->name, clips->x1, + clips->y1, clips->x2, clips->y2); + return -EINVAL; + } + clips++; + num_clips--; + } + + return 0; +} + /** * drm_atomic_plane_check - check plane state * @old_plane_state: old plane state to check @@ -605,9 +675,6 @@ static int drm_atomic_plane_check(const struct drm_plane_state *old_plane_state, struct drm_plane *plane = new_plane_state->plane; struct drm_crtc *crtc = new_plane_state->crtc; const struct drm_framebuffer *fb = new_plane_state->fb; - unsigned int fb_width, fb_height; - struct drm_mode_rect *clips; - uint32_t num_clips; int ret; /* either *both* CRTC and FB must be set, or neither */ @@ -634,17 +701,6 @@ static int drm_atomic_plane_check(const struct drm_plane_state *old_plane_state, return -EINVAL; } - /* Check whether this plane supports the fb pixel format. */ - ret = drm_plane_check_pixel_format(plane, fb->format->format, - fb->modifier); - if (ret) { - drm_dbg_atomic(plane->dev, - "[PLANE:%d:%s] invalid pixel format %p4cc, modifier 0x%llx\n", - plane->base.id, plane->name, - >format->format, fb->modifier); - return ret; - } - /* Give drivers some help against integer overflows */ if (new_plane_state->crtc_w > INT_MAX || new_plane_state->crtc_x > INT_MAX - (int32_t)
[PATCH RFC v7 00/10] Support for Solid Fill Planes
Some drivers support hardware that have optimizations for solid fill planes. This series aims to expose these capabilities to userspace as some compositors have a solid fill flag (ex. SOLID_COLOR in the Android hardware composer HAL) that can be set by apps like the Android Gears test app. In order to expose this capability to userspace, this series will: - Introduce solid_fill and pixel_source properties to allow userspace to toggle between FB and solid fill sources - Loosen NULL FB checks within the DRM atomic commit callstack to allow for NULL FB when solid fill is enabled. - Add NULL FB checks in methods where FB was previously assumed to be non-NULL - Have MSM DPU driver use drm_plane_state.solid_fill instead of dpu_plane_state.color_fill Note: The solid fill planes feature depends on both the solid_fill *and* pixel_source properties. To use this feature, userspace can set the solid_fill property to a blob containing the solid fill color (in RGB323232 format) and and setting the pixel_source property to DRM_PLANE_PIXEL_SOURCE_SOLID_FILL. This will disable memory fetch and the resulting plane will display the color specified by the solid_fill blob. In order to disable a solid fill plane, the user must set the pixel_source to NONE. A plane with a pixel_source of "solid_fill" and a NULL solid_fill blob will cause an error on commit. Currently, there's only one version of the solid_fill blob property. However if other drivers want to support a similar feature, but require more than just the solid fill color, they can extend this feature by extending the pixel_source enum and adding another solid fill blob property. This 2 property approach was chosen because passing in a special 1x1 FB with the necessary color information would have unecessary overhead that does not reflect the behavior of the solid fill feature. In addition, assigning the solid fill blob to FB_ID would require loosening some core drm_property checks that might cause unwanted side effects elsewhere. --- Changes in v7: - Updated cover letter (Sebastian) - Updated the uAPI documentation (Sebastian, Pekka) - Specify that padding must be set to zero in drm_mode_solid_fill documentation (Pekka) - Use %08x format when printing solid fill info in plane state dump (Pekka) - Use new_plane_state->fb directly in drm_atomic_plane_check() (Dmitry) - Dropped documentation for alpha for _dpu_plane_color_fill() (Dmitry) - Defined DPU_SOLID_FILL_FORMAT macro (Dmitry) - Fixed some necessary checks being skipped in the DPU atomic commit path (Dmitry) - Rebased to tip of msm-next - Picked up Acked-by and Reviewed-by tags Changes in v6: - Have _dpu_plane_color_fill() take in a single ABGR color instead of having separate alpha and BGR color parameters (Dmitry) - Drop plane->state->pixel_source != DRM_PLANE_PIXEL_SOURCE_FB check in SetPlane ioctl (Dmitry) - Add DRM_PLANE_PIXEL_SOURCE_NONE as a default pixel source (Sebastian) - Dropped versioning from solid fill property blob (Dmitry) - Use DRM_ENUM_NAME_FN (Dmitry) - Use drm_atomic_replace_property_blob_from_id() (Dmitry) - drm_atomic_check_fb -> drm_atomic_plane_check_fb (Dmitry) - Group redundant NULL FB checks (Dmitry) - Squashed drm_plane_needs_disable() implementation with DRM_PLANE_PIXEL_SOURCE_NONE declaration (Sebastian) - Add comment to support RGBA solid fill color in the future (Dmitry) - Link to v5: https://lore.kernel.org/r/20230728-solid-fill-v5-0-053dbefa9...@quicinc.com Changes in v5: - Added support for PIXEL_SOURCE_NONE (Sebastian) - Added WARN_ON() in drm_plane_has_visible_data() if pixel_source isn't set (Dmitry) - Added debugfs support for both properties (Dmitry) - Corrected u32 to u8 conversion (Pekka) - Moved drm_solid_fill_info struct and related documentation to include/uapi (Pekka) - Changed drm_solid_fill_info.version to __u32 for data alignment (Pekka) - Added more detailed UAPI and kernel documentation (Pekka) - Reordered patch series so that the pixel_source property is introduced before solid_fill (Dmitry) - Fixed inconsistent ABGR/RGBA format declaration (Pekka) - Reset pixel_source to FB in drm_mode_setplane() (Dmitry) - Rename supported_sources to extra_sources (Dmitry) - Only destroy old solid_fill blob state if new state is valid (Pekka) - Link to v4: https://lore.kernel.org/r/20230404-solid-fill-v4-0-f4ec0caa7...@quicinc.com Changes in v4: - Rebased onto latest kernel - Reworded cover letter for clarity (Dmitry) - Reworded commit messages for clarity - Split existing changes into smaller commits - Added pixel_source enum property (Dmitry, Pekka, Ville) - Updated drm-kms comment docs with pixel_source and solid_fill properties (Dmitry) - Inlined drm_atomic_convert_solid_fill_info() (Dmitry) - Passed in plane state alpha value to _dpu_plane_color_fill_pipe() - Link to v3: https://lore.kernel.org/r/20230104234036.636-1-quic_jessz...@quicinc.com Changes in v3: - Fixed some logic errors in atomic checks (Dmitry) -
[PATCH RFC v7 05/10] drm/atomic: Add solid fill data to plane state dump
Add solid_fill property data to the atomic plane state dump. Reviewed-by: Dmitry Baryshkov Acked-by: Harry Wentland Acked-by: Sebastian Wick Signed-off-by: Jessica Zhang --- drivers/gpu/drm/drm_atomic.c | 4 drivers/gpu/drm/drm_plane.c | 8 include/drm/drm_plane.h | 3 +++ 3 files changed, 15 insertions(+) diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index 9f9abbe76369..af778d32785b 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -726,6 +726,10 @@ static void drm_atomic_plane_print_state(struct drm_printer *p, drm_printf(p, "\tfb=%u\n", state->fb ? state->fb->base.id : 0); if (state->fb) drm_framebuffer_print_info(p, 2, state->fb); + drm_printf(p, "\tsolid_fill=%u\n", + state->solid_fill_blob ? state->solid_fill_blob->base.id : 0); + if (state->solid_fill_blob) + drm_plane_solid_fill_print_info(p, 2, state); drm_printf(p, "\tcrtc-pos=" DRM_RECT_FMT "\n", DRM_RECT_ARG()); drm_printf(p, "\tsrc-pos=" DRM_RECT_FP_FMT "\n", DRM_RECT_FP_ARG()); drm_printf(p, "\trotation=%x\n", state->rotation); diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c index 559d101162ba..289b3be86d52 100644 --- a/drivers/gpu/drm/drm_plane.c +++ b/drivers/gpu/drm/drm_plane.c @@ -1495,6 +1495,14 @@ __drm_plane_get_damage_clips(const struct drm_plane_state *state) state->fb_damage_clips->data : NULL); } +void drm_plane_solid_fill_print_info(struct drm_printer *p, unsigned int indent, +const struct drm_plane_state *state) +{ + drm_printf_indent(p, indent, "r=0x%08x\n", state->solid_fill.r); + drm_printf_indent(p, indent, "g=0x%08x\n", state->solid_fill.g); + drm_printf_indent(p, indent, "b=0x%08x\n", state->solid_fill.b); +} + /** * drm_plane_get_damage_clips - Returns damage clips. * @state: Plane state. diff --git a/include/drm/drm_plane.h b/include/drm/drm_plane.h index baaf737392bc..6171fb1a0b47 100644 --- a/include/drm/drm_plane.h +++ b/include/drm/drm_plane.h @@ -1001,6 +1001,9 @@ drm_plane_get_damage_clips_count(const struct drm_plane_state *state); struct drm_mode_rect * drm_plane_get_damage_clips(const struct drm_plane_state *state); +void drm_plane_solid_fill_print_info(struct drm_printer *p, unsigned int indent, +const struct drm_plane_state *state); + int drm_plane_create_scaling_filter_property(struct drm_plane *plane, unsigned int supported_filters); -- 2.42.0
[PATCH RFC v7 07/10] drm/atomic: Loosen FB atomic checks
Loosen the requirements for atomic and legacy commit so that, in cases where pixel_source != FB, the commit can still go through. This includes adding framebuffer NULL checks in other areas to account for FB being NULL when non-FB pixel sources are enabled. To disable a plane, the pixel_source must be NONE or the FB must be NULL if pixel_source == FB. Signed-off-by: Jessica Zhang --- drivers/gpu/drm/drm_atomic.c| 21 ++-- drivers/gpu/drm/drm_atomic_helper.c | 39 + include/drm/drm_atomic_helper.h | 4 ++-- include/drm/drm_plane.h | 29 +++ 4 files changed, 64 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index 48a2df4e3d27..66e49c06aced 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -674,17 +674,16 @@ static int drm_atomic_plane_check(const struct drm_plane_state *old_plane_state, { struct drm_plane *plane = new_plane_state->plane; struct drm_crtc *crtc = new_plane_state->crtc; - const struct drm_framebuffer *fb = new_plane_state->fb; int ret; - /* either *both* CRTC and FB must be set, or neither */ - if (crtc && !fb) { - drm_dbg_atomic(plane->dev, "[PLANE:%d:%s] CRTC set but no FB\n", + /* either *both* CRTC and pixel source must be set, or neither */ + if (crtc && !drm_plane_has_visible_data(new_plane_state)) { + drm_dbg_atomic(plane->dev, "[PLANE:%d:%s] CRTC set but no visible data\n", plane->base.id, plane->name); return -EINVAL; - } else if (fb && !crtc) { - drm_dbg_atomic(plane->dev, "[PLANE:%d:%s] FB set but no CRTC\n", - plane->base.id, plane->name); + } else if (drm_plane_has_visible_data(new_plane_state) && !crtc) { + drm_dbg_atomic(plane->dev, "[PLANE:%d:%s] Source %d has visible data but no CRTC\n", + plane->base.id, plane->name, new_plane_state->pixel_source); return -EINVAL; } @@ -715,9 +714,11 @@ static int drm_atomic_plane_check(const struct drm_plane_state *old_plane_state, } - ret = drm_atomic_plane_check_fb(new_plane_state); - if (ret) - return ret; + if (new_plane_state->pixel_source == DRM_PLANE_PIXEL_SOURCE_FB && new_plane_state->fb) { + ret = drm_atomic_plane_check_fb(new_plane_state); + if (ret) + return ret; + } if (plane_switching_crtc(old_plane_state, new_plane_state)) { drm_dbg_atomic(plane->dev, diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index 292e38eb6218..27e8c8dd9324 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -852,7 +852,6 @@ int drm_atomic_helper_check_plane_state(struct drm_plane_state *plane_state, bool can_position, bool can_update_disabled) { - struct drm_framebuffer *fb = plane_state->fb; struct drm_rect *src = _state->src; struct drm_rect *dst = _state->dst; unsigned int rotation = plane_state->rotation; @@ -864,7 +863,7 @@ int drm_atomic_helper_check_plane_state(struct drm_plane_state *plane_state, *src = drm_plane_state_src(plane_state); *dst = drm_plane_state_dest(plane_state); - if (!fb) { + if (!drm_plane_has_visible_data(plane_state)) { plane_state->visible = false; return 0; } @@ -881,25 +880,31 @@ int drm_atomic_helper_check_plane_state(struct drm_plane_state *plane_state, return -EINVAL; } - drm_rect_rotate(src, fb->width << 16, fb->height << 16, rotation); + /* Check that selected pixel source is valid */ + if (plane_state->pixel_source == DRM_PLANE_PIXEL_SOURCE_FB && plane_state->fb) { + struct drm_framebuffer *fb = plane_state->fb; + drm_rect_rotate(src, fb->width << 16, fb->height << 16, rotation); - /* Check scaling */ - hscale = drm_rect_calc_hscale(src, dst, min_scale, max_scale); - vscale = drm_rect_calc_vscale(src, dst, min_scale, max_scale); - if (hscale < 0 || vscale < 0) { - drm_dbg_kms(plane_state->plane->dev, - "Invalid scaling of plane\n"); - drm_rect_debug_print("src: ", _state->src, true); - drm_rect_debug_print("dst: ", _state->dst, false); - return -ERANGE; - } + /* Check scaling */ + hscale = drm_rect_calc_hscale(src, dst, min_scale, max_scale); + vscale = drm_rect_calc_vscale(src, dst, min_scale, max_scale); - if (crtc_state->enable) -
[PATCH RFC v7 10/10] drm/msm/dpu: Add solid fill and pixel source properties
Add solid_fill and pixel_source properties to DPU plane Reviewed-by: Dmitry Baryshkov Signed-off-by: Jessica Zhang --- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c index 832747080daf..8e9fda541211 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c @@ -1493,6 +1493,8 @@ struct drm_plane *dpu_plane_init(struct drm_device *dev, DPU_ERROR("failed to install zpos property, rc = %d\n", ret); drm_plane_create_alpha_property(plane); + drm_plane_create_solid_fill_property(plane); + drm_plane_create_pixel_source_property(plane, BIT(DRM_PLANE_PIXEL_SOURCE_SOLID_FILL)); drm_plane_create_blend_mode_property(plane, BIT(DRM_MODE_BLEND_PIXEL_NONE) | BIT(DRM_MODE_BLEND_PREMULTI) | -- 2.42.0
[PATCH RFC v7 03/10] drm: Add solid fill pixel source
Add "SOLID_FILL" as a valid pixel source. If the pixel_source property is set to "SOLID_FILL", it will display data from the drm_plane "solid_fill" blob property. Reviewed-by: Dmitry Baryshkov Acked-by: Pekka Paalanen Acked-by: Harry Wentland Acked-by: Sebastian Wick Signed-off-by: Jessica Zhang --- drivers/gpu/drm/drm_blend.c | 10 +- include/drm/drm_plane.h | 1 + 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_blend.c b/drivers/gpu/drm/drm_blend.c index 665c5d9b056f..37b31b7e5ce5 100644 --- a/drivers/gpu/drm/drm_blend.c +++ b/drivers/gpu/drm/drm_blend.c @@ -204,6 +204,9 @@ * "FB": * Framebuffer source set by the "FB_ID" property. * + * "SOLID_FILL": + * Solid fill color source set by the "solid_fill" property. + * * solid_fill: * solid_fill is set up with drm_plane_create_solid_fill_property(). It * contains pixel data that drivers can use to fill a plane. @@ -642,6 +645,7 @@ EXPORT_SYMBOL(drm_plane_create_blend_mode_property); static const struct drm_prop_enum_list drm_pixel_source_enum_list[] = { { DRM_PLANE_PIXEL_SOURCE_NONE, "NONE" }, { DRM_PLANE_PIXEL_SOURCE_FB, "FB" }, + { DRM_PLANE_PIXEL_SOURCE_SOLID_FILL, "SOLID_FILL" }, }; /** @@ -666,6 +670,9 @@ static const struct drm_prop_enum_list drm_pixel_source_enum_list[] = { * "FB": * Framebuffer pixel source * + * "SOLID_FILL": + * Solid fill color pixel source + * * Returns: * Zero on success, negative errno on failure. */ @@ -675,7 +682,8 @@ int drm_plane_create_pixel_source_property(struct drm_plane *plane, struct drm_device *dev = plane->dev; struct drm_property *prop; static const unsigned int valid_source_mask = BIT(DRM_PLANE_PIXEL_SOURCE_FB) | - BIT(DRM_PLANE_PIXEL_SOURCE_NONE); + BIT(DRM_PLANE_PIXEL_SOURCE_NONE) | + BIT(DRM_PLANE_PIXEL_SOURCE_SOLID_FILL); int i; /* FB is supported by default */ diff --git a/include/drm/drm_plane.h b/include/drm/drm_plane.h index e2ae7c26cc57..baaf737392bc 100644 --- a/include/drm/drm_plane.h +++ b/include/drm/drm_plane.h @@ -43,6 +43,7 @@ enum drm_scaling_filter { enum drm_plane_pixel_source { DRM_PLANE_PIXEL_SOURCE_NONE, DRM_PLANE_PIXEL_SOURCE_FB, + DRM_PLANE_PIXEL_SOURCE_SOLID_FILL, DRM_PLANE_PIXEL_SOURCE_MAX }; -- 2.42.0
[PATCH RFC v7 04/10] drm/atomic: Add pixel source to plane state dump
Add pixel source to the atomic plane state dump Reviewed-by: Dmitry Baryshkov Acked-by: Harry Wentland Acked-by: Sebastian Wick Signed-off-by: Jessica Zhang --- drivers/gpu/drm/drm_atomic.c| 1 + drivers/gpu/drm/drm_blend.c | 1 + drivers/gpu/drm/drm_crtc_internal.h | 1 + 3 files changed, 3 insertions(+) diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c index c277b198fa3f..9f9abbe76369 100644 --- a/drivers/gpu/drm/drm_atomic.c +++ b/drivers/gpu/drm/drm_atomic.c @@ -722,6 +722,7 @@ static void drm_atomic_plane_print_state(struct drm_printer *p, drm_printf(p, "plane[%u]: %s\n", plane->base.id, plane->name); drm_printf(p, "\tcrtc=%s\n", state->crtc ? state->crtc->name : "(null)"); + drm_printf(p, "\tpixel-source=%s\n", drm_get_pixel_source_name(state->pixel_source)); drm_printf(p, "\tfb=%u\n", state->fb ? state->fb->base.id : 0); if (state->fb) drm_framebuffer_print_info(p, 2, state->fb); diff --git a/drivers/gpu/drm/drm_blend.c b/drivers/gpu/drm/drm_blend.c index 37b31b7e5ce5..9c1608f7c1df 100644 --- a/drivers/gpu/drm/drm_blend.c +++ b/drivers/gpu/drm/drm_blend.c @@ -647,6 +647,7 @@ static const struct drm_prop_enum_list drm_pixel_source_enum_list[] = { { DRM_PLANE_PIXEL_SOURCE_FB, "FB" }, { DRM_PLANE_PIXEL_SOURCE_SOLID_FILL, "SOLID_FILL" }, }; +DRM_ENUM_NAME_FN(drm_get_pixel_source_name, drm_pixel_source_enum_list); /** * drm_plane_create_pixel_source_property - create a new pixel source property diff --git a/drivers/gpu/drm/drm_crtc_internal.h b/drivers/gpu/drm/drm_crtc_internal.h index 501a10edd0e1..7bc93ba449d5 100644 --- a/drivers/gpu/drm/drm_crtc_internal.h +++ b/drivers/gpu/drm/drm_crtc_internal.h @@ -267,6 +267,7 @@ int drm_plane_check_pixel_format(struct drm_plane *plane, u32 format, u64 modifier); struct drm_mode_rect * __drm_plane_get_damage_clips(const struct drm_plane_state *state); +const char *drm_get_pixel_source_name(int val); /* drm_bridge.c */ void drm_bridge_detach(struct drm_bridge *bridge); -- 2.42.0
[PATCH RFC v7 08/10] drm/msm/dpu: Allow NULL FBs in atomic commit
Since solid fill planes allow for a NULL framebuffer in a valid commit, add NULL framebuffer checks to atomic commit calls within DPU. Reviewed-by: Dmitry Baryshkov Signed-off-by: Jessica Zhang --- drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 9 +++- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 35 --- 2 files changed, 31 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c index 3c475f8042b0..3b9648c679ad 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c @@ -451,6 +451,7 @@ static void _dpu_crtc_blend_setup_mixer(struct drm_crtc *crtc, struct drm_plane_state *state; struct dpu_crtc_state *cstate = to_dpu_crtc_state(crtc->state); struct dpu_plane_state *pstate = NULL; + const struct msm_format *fmt; struct dpu_format *format; struct dpu_hw_ctl *ctl = mixer->lm_ctl; @@ -470,7 +471,13 @@ static void _dpu_crtc_blend_setup_mixer(struct drm_crtc *crtc, pstate = to_dpu_plane_state(state); fb = state->fb; - format = to_dpu_format(msm_framebuffer_format(pstate->base.fb)); + if (drm_plane_solid_fill_enabled(state)) + fmt = dpu_get_msm_format(&_dpu_crtc_get_kms(crtc)->base, + DRM_FORMAT_ABGR, 0); + else + fmt = msm_framebuffer_format(pstate->base.fb); + + format = to_dpu_format(fmt); if (pstate->stage == DPU_STAGE_BASE && format->alpha_enable) bg_alpha_enable = true; diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c index 3eef5e025e12..9615653db787 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c @@ -55,6 +55,8 @@ #define DEFAULT_REFRESH_RATE 60 +#define DPU_SOLID_FILL_FORMAT DRM_FORMAT_ABGR + static const uint32_t qcom_compressed_supported_formats[] = { DRM_FORMAT_ABGR, DRM_FORMAT_ARGB, @@ -658,7 +660,7 @@ static void _dpu_plane_color_fill(struct dpu_plane *pdpu, * select fill format to match user property expectation, * h/w only supports RGB variants */ - fmt = dpu_get_dpu_format(DRM_FORMAT_ABGR); + fmt = dpu_get_dpu_format(DPU_SOLID_FILL_FORMAT); /* should not happen ever */ if (!fmt) return; @@ -877,18 +879,23 @@ static int dpu_plane_atomic_check(struct drm_plane *plane, pipe_cfg->dst_rect = new_plane_state->dst; - fb_rect.x2 = new_plane_state->fb->width; - fb_rect.y2 = new_plane_state->fb->height; + if (new_plane_state->pixel_source == DRM_PLANE_PIXEL_SOURCE_FB && new_plane_state->fb) { + fb_rect.x2 = new_plane_state->fb->width; + fb_rect.y2 = new_plane_state->fb->height; - /* Ensure fb size is supported */ - if (drm_rect_width(_rect) > MAX_IMG_WIDTH || - drm_rect_height(_rect) > MAX_IMG_HEIGHT) { - DPU_DEBUG_PLANE(pdpu, "invalid framebuffer " DRM_RECT_FMT "\n", - DRM_RECT_ARG(_rect)); - return -E2BIG; + /* Ensure fb size is supported */ + if (drm_rect_width(_rect) > MAX_IMG_WIDTH || + drm_rect_height(_rect) > MAX_IMG_HEIGHT) { + DPU_DEBUG_PLANE(pdpu, "invalid framebuffer " DRM_RECT_FMT "\n", + DRM_RECT_ARG(_rect)); + return -E2BIG; + } } - fmt = to_dpu_format(msm_framebuffer_format(new_plane_state->fb)); + if (drm_plane_solid_fill_enabled(new_plane_state)) + fmt = dpu_get_dpu_format(DPU_SOLID_FILL_FORMAT); + else + fmt = to_dpu_format(msm_framebuffer_format(new_plane_state->fb)); max_linewidth = pdpu->catalog->caps->max_linewidth; @@ -1123,8 +1130,7 @@ static void dpu_plane_sspp_atomic_update(struct drm_plane *plane) struct drm_crtc *crtc = state->crtc; struct drm_framebuffer *fb = state->fb; bool is_rt_pipe; - const struct dpu_format *fmt = - to_dpu_format(msm_framebuffer_format(fb)); + const struct dpu_format *fmt; struct dpu_sw_pipe_cfg *pipe_cfg = >pipe_cfg; struct dpu_sw_pipe_cfg *r_pipe_cfg = >r_pipe_cfg; struct dpu_kms *kms = _dpu_plane_get_kms(>base); @@ -1133,6 +1139,11 @@ static void dpu_plane_sspp_atomic_update(struct drm_plane *plane) bool layout_valid = false; int ret; + if (drm_plane_solid_fill_enabled(state)) + return; + + fmt = to_dpu_format(msm_framebuffer_format(fb)); + ret = dpu_format_populate_layout(aspace, fb, ); if (ret) DPU_ERROR_PLANE(pdpu, "failed to get
[PATCH RFC v7 02/10] drm: Introduce solid fill DRM plane property
Document and add support for solid_fill property to drm_plane. In addition, add support for setting and getting the values for solid_fill. To enable solid fill planes, userspace must assign a property blob to the "solid_fill" plane property containing the following information: struct drm_mode_solid_fill { u32 r, g, b, pad; }; Acked-by: Harry Wentland Acked-by: Sebastian Wick Signed-off-by: Jessica Zhang --- drivers/gpu/drm/drm_atomic_state_helper.c | 9 drivers/gpu/drm/drm_atomic_uapi.c | 26 ++ drivers/gpu/drm/drm_blend.c | 30 ++ include/drm/drm_blend.h | 1 + include/drm/drm_plane.h | 36 +++ include/uapi/drm/drm_mode.h | 24 + 6 files changed, 126 insertions(+) diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c b/drivers/gpu/drm/drm_atomic_state_helper.c index 01638c51ce0a..86fb876efbe6 100644 --- a/drivers/gpu/drm/drm_atomic_state_helper.c +++ b/drivers/gpu/drm/drm_atomic_state_helper.c @@ -254,6 +254,11 @@ void __drm_atomic_helper_plane_state_reset(struct drm_plane_state *plane_state, plane_state->pixel_blend_mode = DRM_MODE_BLEND_PREMULTI; plane_state->pixel_source = DRM_PLANE_PIXEL_SOURCE_FB; + if (plane_state->solid_fill_blob) { + drm_property_blob_put(plane_state->solid_fill_blob); + plane_state->solid_fill_blob = NULL; + } + if (plane->color_encoding_property) { if (!drm_object_property_get_default_value(>base, plane->color_encoding_property, @@ -336,6 +341,9 @@ void __drm_atomic_helper_plane_duplicate_state(struct drm_plane *plane, if (state->fb) drm_framebuffer_get(state->fb); + if (state->solid_fill_blob) + drm_property_blob_get(state->solid_fill_blob); + state->fence = NULL; state->commit = NULL; state->fb_damage_clips = NULL; @@ -385,6 +393,7 @@ void __drm_atomic_helper_plane_destroy_state(struct drm_plane_state *state) drm_crtc_commit_put(state->commit); drm_property_blob_put(state->fb_damage_clips); + drm_property_blob_put(state->solid_fill_blob); } EXPORT_SYMBOL(__drm_atomic_helper_plane_destroy_state); diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c index 46c78b87803d..576acb34195f 100644 --- a/drivers/gpu/drm/drm_atomic_uapi.c +++ b/drivers/gpu/drm/drm_atomic_uapi.c @@ -316,6 +316,20 @@ drm_atomic_set_crtc_for_connector(struct drm_connector_state *conn_state, } EXPORT_SYMBOL(drm_atomic_set_crtc_for_connector); +static void drm_atomic_set_solid_fill_prop(struct drm_plane_state *state) +{ + struct drm_mode_solid_fill *user_info; + + if (!state->solid_fill_blob) + return; + + user_info = (struct drm_mode_solid_fill *)state->solid_fill_blob->data; + + state->solid_fill.r = user_info->r; + state->solid_fill.g = user_info->g; + state->solid_fill.b = user_info->b; +} + static void set_out_fence_for_crtc(struct drm_atomic_state *state, struct drm_crtc *crtc, s32 __user *fence_ptr) { @@ -564,6 +578,15 @@ static int drm_atomic_plane_set_property(struct drm_plane *plane, state->src_h = val; } else if (property == plane->pixel_source_property) { state->pixel_source = val; + } else if (property == plane->solid_fill_property) { + ret = drm_atomic_replace_property_blob_from_id(dev, + >solid_fill_blob, + val, sizeof(struct drm_mode_solid_fill), + -1, ); + if (ret) + return ret; + + drm_atomic_set_solid_fill_prop(state); } else if (property == plane->alpha_property) { state->alpha = val; } else if (property == plane->blend_mode_property) { @@ -638,6 +661,9 @@ drm_atomic_plane_get_property(struct drm_plane *plane, *val = state->src_h; } else if (property == plane->pixel_source_property) { *val = state->pixel_source; + } else if (property == plane->solid_fill_property) { + *val = state->solid_fill_blob ? + state->solid_fill_blob->base.id : 0; } else if (property == plane->alpha_property) { *val = state->alpha; } else if (property == plane->blend_mode_property) { diff --git a/drivers/gpu/drm/drm_blend.c b/drivers/gpu/drm/drm_blend.c index fce734cdb85b..665c5d9b056f 100644 --- a/drivers/gpu/drm/drm_blend.c +++ b/drivers/gpu/drm/drm_blend.c @@ -204,6 +204,10 @@ * "FB": * Framebuffer source set by the "FB_ID" property. * + * solid_fill: + * solid_fill is set up with
[PATCH v2 4/4] drm/i915/mtl: Add module parameter override for Wa_16019325821/Wa_14019159160
From: John Harrison These w/a's can have signficant performance implications for any workload which uses both RCS and CCS. On the other hand, the hang itself is only seen in one or two very specific workloads. So add a module parameter to control whether the w/a's are enabled or not and default to not. Signed-off-by: John Harrison --- drivers/gpu/drm/i915/gt/uc/intel_guc.c| 3 ++- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 3 ++- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 3 ++- drivers/gpu/drm/i915/i915_params.c| 3 +++ drivers/gpu/drm/i915/i915_params.h| 1 + 5 files changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c index 6252f32d67011..4c89983b1e907 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c @@ -296,7 +296,8 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc) /* Wa_16019325821 */ /* Wa_14019159160 */ - if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))) + if (gt->i915->params.enable_mtl_rcs_ccs_wa && + IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))) flags |= GUC_WA_RCS_CCS_SWITCHOUT; /* diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 8f7298cbbc322..78757e78bce88 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -845,7 +845,8 @@ static void guc_waklv_init(struct intel_guc *guc) remain = guc_ads_waklv_size(guc); /* Wa_14019159160 */ - if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))) { + if (gt->i915->params.enable_mtl_rcs_ccs_wa && + IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))) { size = guc_waklv_ra_mode(guc, offset, remain); offset += size; remain -= size; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 225812b299524..4de54a100c451 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -4384,7 +4384,8 @@ static void guc_default_vfuncs(struct intel_engine_cs *engine) /* Wa_16019325821 */ /* Wa_14019159160 */ - if ((engine->class == COMPUTE_CLASS || engine->class == RENDER_CLASS) && + if (engine->i915->params.enable_mtl_rcs_ccs_wa && + (engine->class == COMPUTE_CLASS || engine->class == RENDER_CLASS) && IS_GFX_GT_IP_RANGE(engine->gt, IP_VER(12, 70), IP_VER(12, 71))) engine->flags |= I915_ENGINE_USES_WA_HOLD_SWITCHOUT; diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index de43048543e8b..1004171d99943 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -134,6 +134,9 @@ i915_param_named_unsafe(lmem_size, uint, 0400, i915_param_named_unsafe(lmem_bar_size, uint, 0400, "Set the lmem bar size(in MiB)."); +i915_param_named(enable_mtl_rcs_ccs_wa, bool, 0400, + "Enable the RCS/CCS switchout hold workaround for MTL (only some workloads are affected by issue and w/a has a performance penalty) (default:false)"); + static void _param_print_bool(struct drm_printer *p, const char *name, bool val) { diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h index 1315d7fac850f..971a765d74f56 100644 --- a/drivers/gpu/drm/i915/i915_params.h +++ b/drivers/gpu/drm/i915/i915_params.h @@ -62,6 +62,7 @@ struct drm_printer; param(unsigned int, lmem_size, 0, 0400) \ param(unsigned int, lmem_bar_size, 0, 0400) \ /* leave bools at the end to not create holes */ \ + param(bool, enable_mtl_rcs_ccs_wa, false, 0x400) \ param(bool, enable_hangcheck, true, 0600) \ param(bool, error_capture, true, IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR) ? 0600 : 0) \ param(bool, enable_gvt, false, IS_ENABLED(CONFIG_DRM_I915_GVT) ? 0400 : 0) -- 2.41.0
[PATCH v2 3/4] drm/i915/guc: Enable Wa_14019159160
From: John Harrison Use the new w/a KLV support to enable a MTL w/a. Note, this w/a is a super-set of Wa_16019325821, so requires turning that one as well as setting the new flag for Wa_14019159160 itself. Signed-off-by: John Harrison --- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 3 ++ drivers/gpu/drm/i915/gt/intel_engine_types.h | 1 + drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h | 7 drivers/gpu/drm/i915/gt/uc/intel_guc.c| 1 + drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 34 ++- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 1 + 6 files changed, 38 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c index 9cccd60a5c41d..359b21fb02ab2 100644 --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c @@ -744,6 +744,7 @@ static u32 *gen12_emit_preempt_busywait(struct i915_request *rq, u32 *cs) /* Wa_14014475959:dg2 */ /* Wa_16019325821 */ +/* Wa_14019159160 */ #define HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET 0x540 static u32 hold_switchout_semaphore_offset(struct i915_request *rq) { @@ -753,6 +754,7 @@ static u32 hold_switchout_semaphore_offset(struct i915_request *rq) /* Wa_14014475959:dg2 */ /* Wa_16019325821 */ +/* Wa_14019159160 */ static u32 *hold_switchout_emit_wa_busywait(struct i915_request *rq, u32 *cs) { int i; @@ -793,6 +795,7 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *rq, u32 *cs) /* Wa_14014475959:dg2 */ /* Wa_16019325821 */ + /* Wa_14019159160 */ if (intel_engine_uses_wa_hold_switchout(rq->engine)) cs = hold_switchout_emit_wa_busywait(rq, cs); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index f08739d020332..3b4993955a4b6 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -695,6 +695,7 @@ intel_engine_has_relative_mmio(const struct intel_engine_cs * const engine) /* Wa_14014475959:dg2 */ /* Wa_16019325821 */ +/* Wa_14019159160 */ static inline bool intel_engine_uses_wa_hold_switchout(struct intel_engine_cs *engine) { diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h index 58012edd4eb0e..bebf28e3c4794 100644 --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h @@ -101,4 +101,11 @@ enum { GUC_CONTEXT_POLICIES_KLV_NUM_IDS = 5, }; +/* + * Workaround keys: + */ +enum { + GUC_WORKAROUND_KLV_SERIALIZED_RA_MODE = 0x9001, +}; + #endif /* _ABI_GUC_KLVS_ABI_H */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c index 0e6c160de3315..6252f32d67011 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c @@ -295,6 +295,7 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc) flags |= GUC_WA_HOLD_CCS_SWITCHOUT; /* Wa_16019325821 */ + /* Wa_14019159160 */ if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))) flags |= GUC_WA_RCS_CCS_SWITCHOUT; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 251e7a7a05cb8..8f7298cbbc322 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -810,6 +810,25 @@ guc_capture_prep_lists(struct intel_guc *guc) return PAGE_ALIGN(total_size); } +/* Wa_14019159160 */ +static u32 guc_waklv_ra_mode(struct intel_guc *guc, u32 offset, u32 remain) +{ + u32 size; + u32 klv_entry[] = { + /* 16:16 key/length */ + FIELD_PREP(GUC_KLV_0_KEY, GUC_WORKAROUND_KLV_SERIALIZED_RA_MODE) | + FIELD_PREP(GUC_KLV_0_LEN, 0), + /* 0 dwords data */ + }; + + size = sizeof(klv_entry); + GEM_BUG_ON(remain < size); + + iosys_map_memcpy_to(>ads_map, offset, klv_entry, size); + + return size; +} + static void guc_waklv_init(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); @@ -825,15 +844,12 @@ static void guc_waklv_init(struct intel_guc *guc) offset = guc_ads_waklv_offset(guc); remain = guc_ads_waklv_size(guc); - /* -* Add workarounds here: -* -* if (want_wa_) { -* size = guc_waklv_(guc, offset, remain); -* offset += size; -* remain -= size; -* } -*/ + /* Wa_14019159160 */ + if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))) { + size = guc_waklv_ra_mode(guc, offset, remain); + offset += size; + remain -= size; + } size = guc_ads_waklv_size(guc) - remain; if (!size) diff --git
[PATCH v2 1/4] drm/i915: Enable Wa_16019325821
From: John Harrison Some platforms require holding RCS context switches until CCS is idle (the reverse w/a of Wa_14014475959). Some platforms require both versions. Signed-off-by: John Harrison --- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 19 +++ drivers/gpu/drm/i915/gt/intel_engine_types.h | 7 --- drivers/gpu/drm/i915/gt/uc/intel_guc.c| 4 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 3 ++- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 7 ++- 5 files changed, 27 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c index 86a04afff64b3..9cccd60a5c41d 100644 --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c @@ -743,21 +743,23 @@ static u32 *gen12_emit_preempt_busywait(struct i915_request *rq, u32 *cs) } /* Wa_14014475959:dg2 */ -#define CCS_SEMAPHORE_PPHWSP_OFFSET0x540 -static u32 ccs_semaphore_offset(struct i915_request *rq) +/* Wa_16019325821 */ +#define HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET 0x540 +static u32 hold_switchout_semaphore_offset(struct i915_request *rq) { return i915_ggtt_offset(rq->context->state) + - (LRC_PPHWSP_PN * PAGE_SIZE) + CCS_SEMAPHORE_PPHWSP_OFFSET; + (LRC_PPHWSP_PN * PAGE_SIZE) + HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET; } /* Wa_14014475959:dg2 */ -static u32 *ccs_emit_wa_busywait(struct i915_request *rq, u32 *cs) +/* Wa_16019325821 */ +static u32 *hold_switchout_emit_wa_busywait(struct i915_request *rq, u32 *cs) { int i; *cs++ = MI_ATOMIC_INLINE | MI_ATOMIC_GLOBAL_GTT | MI_ATOMIC_CS_STALL | MI_ATOMIC_MOVE; - *cs++ = ccs_semaphore_offset(rq); + *cs++ = hold_switchout_semaphore_offset(rq); *cs++ = 0; *cs++ = 1; @@ -773,7 +775,7 @@ static u32 *ccs_emit_wa_busywait(struct i915_request *rq, u32 *cs) MI_SEMAPHORE_POLL | MI_SEMAPHORE_SAD_EQ_SDD; *cs++ = 0; - *cs++ = ccs_semaphore_offset(rq); + *cs++ = hold_switchout_semaphore_offset(rq); *cs++ = 0; return cs; @@ -790,8 +792,9 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *rq, u32 *cs) cs = gen12_emit_preempt_busywait(rq, cs); /* Wa_14014475959:dg2 */ - if (intel_engine_uses_wa_hold_ccs_switchout(rq->engine)) - cs = ccs_emit_wa_busywait(rq, cs); + /* Wa_16019325821 */ + if (intel_engine_uses_wa_hold_switchout(rq->engine)) + cs = hold_switchout_emit_wa_busywait(rq, cs); rq->tail = intel_ring_offset(rq, cs); assert_ring_tail_valid(rq->ring, rq->tail); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 8769760257fd9..f08739d020332 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -584,7 +584,7 @@ struct intel_engine_cs { #define I915_ENGINE_HAS_RCS_REG_STATE BIT(9) #define I915_ENGINE_HAS_EU_PRIORITYBIT(10) #define I915_ENGINE_FIRST_RENDER_COMPUTE BIT(11) -#define I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT BIT(12) +#define I915_ENGINE_USES_WA_HOLD_SWITCHOUT BIT(12) unsigned int flags; /* @@ -694,10 +694,11 @@ intel_engine_has_relative_mmio(const struct intel_engine_cs * const engine) } /* Wa_14014475959:dg2 */ +/* Wa_16019325821 */ static inline bool -intel_engine_uses_wa_hold_ccs_switchout(struct intel_engine_cs *engine) +intel_engine_uses_wa_hold_switchout(struct intel_engine_cs *engine) { - return engine->flags & I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT; + return engine->flags & I915_ENGINE_USES_WA_HOLD_SWITCHOUT; } #endif /* __INTEL_ENGINE_TYPES_H__ */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c index 3f3df1166b860..0e6c160de3315 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c @@ -294,6 +294,10 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc) IS_DG2(gt->i915)) flags |= GUC_WA_HOLD_CCS_SWITCHOUT; + /* Wa_16019325821 */ + if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))) + flags |= GUC_WA_RCS_CCS_SWITCHOUT; + /* * Wa_14012197797 * Wa_22011391025 diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h index 8ae1846431da7..48863188a130e 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h @@ -96,8 +96,9 @@ #define GUC_WA_GAM_CREDITS BIT(10) #define GUC_WA_DUAL_QUEUEBIT(11) #define GUC_WA_RCS_RESET_BEFORE_RC6 BIT(13) -#define GUC_WA_CONTEXT_ISOLATION BIT(15) #define GUC_WA_PRE_PARSERBIT(14) +#define GUC_WA_CONTEXT_ISOLATION BIT(15) +#define GUC_WA_RCS_CCS_SWITCHOUT
[PATCH v2 2/4] drm/i915/guc: Add support for w/a KLVs
From: John Harrison To prevent running out of bits, new w/a enable flags are being added via a KLV system instead of a 32 bit flags word. Signed-off-by: John Harrison --- .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h | 1 + drivers/gpu/drm/i915/gt/uc/intel_guc.h| 2 + drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 73 ++- drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 6 ++ drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 5 +- 5 files changed, 85 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h index dabeaf4f245f3..00d6402333f8e 100644 --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h @@ -36,6 +36,7 @@ enum intel_guc_load_status { INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_START, INTEL_GUC_LOAD_STATUS_MPU_DATA_INVALID = 0x73, INTEL_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID = 0x74, + INTEL_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR= 0x75, INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_END, INTEL_GUC_LOAD_STATUS_READY= 0xF0, diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index 2b6dfe62c8f2a..4113776ff3e19 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -198,6 +198,8 @@ struct intel_guc { struct guc_mmio_reg *ads_regset; /** @ads_golden_ctxt_size: size of the golden contexts in the ADS */ u32 ads_golden_ctxt_size; + /** @ads_waklv_size: size of workaround KLVs */ + u32 ads_waklv_size; /** @ads_capture_size: size of register lists in the ADS used for error capture */ u32 ads_capture_size; /** @ads_engine_usage_size: size of engine usage in the ADS */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 63724e17829a7..251e7a7a05cb8 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -46,6 +46,10 @@ * +---+ * | padding | * +---+ <== 4K aligned + * | w/a KLVs | + * +---+ + * | padding | + * +---+ <== 4K aligned * | capture lists | * +---+ * | padding | @@ -88,6 +92,11 @@ static u32 guc_ads_golden_ctxt_size(struct intel_guc *guc) return PAGE_ALIGN(guc->ads_golden_ctxt_size); } +static u32 guc_ads_waklv_size(struct intel_guc *guc) +{ + return PAGE_ALIGN(guc->ads_waklv_size); +} + static u32 guc_ads_capture_size(struct intel_guc *guc) { return PAGE_ALIGN(guc->ads_capture_size); @@ -113,7 +122,7 @@ static u32 guc_ads_golden_ctxt_offset(struct intel_guc *guc) return PAGE_ALIGN(offset); } -static u32 guc_ads_capture_offset(struct intel_guc *guc) +static u32 guc_ads_waklv_offset(struct intel_guc *guc) { u32 offset; @@ -123,6 +132,16 @@ static u32 guc_ads_capture_offset(struct intel_guc *guc) return PAGE_ALIGN(offset); } +static u32 guc_ads_capture_offset(struct intel_guc *guc) +{ + u32 offset; + + offset = guc_ads_waklv_offset(guc) + +guc_ads_waklv_size(guc); + + return PAGE_ALIGN(offset); +} + static u32 guc_ads_private_data_offset(struct intel_guc *guc) { u32 offset; @@ -791,6 +810,49 @@ guc_capture_prep_lists(struct intel_guc *guc) return PAGE_ALIGN(total_size); } +static void guc_waklv_init(struct intel_guc *guc) +{ + struct intel_gt *gt = guc_to_gt(guc); + u32 offset, addr_ggtt, remain, size; + + if (!intel_uc_uses_guc_submission(>uc)) + return; + + if (GUC_FIRMWARE_VER(guc) < MAKE_GUC_VER(70, 10, 0)) + return; + + GEM_BUG_ON(iosys_map_is_null(>ads_map)); + offset = guc_ads_waklv_offset(guc); + remain = guc_ads_waklv_size(guc); + + /* +* Add workarounds here: +* +* if (want_wa_) { +* size = guc_waklv_(guc, offset, remain); +* offset += size; +* remain -= size; +* } +*/ + + size = guc_ads_waklv_size(guc) - remain; + if (!size) + return; + + offset = guc_ads_waklv_offset(guc); + addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset; + + ads_blob_write(guc, ads.wa_klv_addr_lo, addr_ggtt); + ads_blob_write(guc, ads.wa_klv_addr_hi, 0); + ads_blob_write(guc, ads.wa_klv_size, size); +} + +static int guc_prep_waklv(struct intel_guc *guc) +{ + /* Fudge
[PATCH v2 0/4] Enable Wa_14019159160 and Wa_16019325821 for MTL
From: John Harrison Enable Wa_14019159160 and Wa_16019325821 for MTL RCS/CCS workarounds for MTL. v2: Fix bug in WA KLV implementation (offset not being reset to start of list). Add better comment to prep patch about how KLVs can be added. Add a module parameter override and disable the w/a by default as it causes performance regressions and is only required by very specific workloads. Signed-off-by: John Harrison John Harrison (4): drm/i915: Enable Wa_16019325821 drm/i915/guc: Add support for w/a KLVs drm/i915/guc: Enable Wa_14019159160 drm/i915/mtl: Add module parameter override for Wa_16019325821/Wa_14019159160 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 22 +++-- drivers/gpu/drm/i915/gt/intel_engine_types.h | 8 +- .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h | 1 + drivers/gpu/drm/i915/gt/uc/abi/guc_klvs_abi.h | 7 ++ drivers/gpu/drm/i915/gt/uc/intel_guc.c| 6 ++ drivers/gpu/drm/i915/gt/uc/intel_guc.h| 2 + drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c| 90 ++- drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 6 ++ drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 8 +- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 9 +- drivers/gpu/drm/i915/i915_params.c| 3 + drivers/gpu/drm/i915/i915_params.h| 1 + 12 files changed, 148 insertions(+), 15 deletions(-) -- 2.41.0
Re: [Intel-gfx] [PATCH 3/4] drm/i915/guc: Add support for w/a KLVs
On 10/6/2023 17:38, Belgaumkar, Vinay wrote: On 9/15/2023 2:55 PM, john.c.harri...@intel.com wrote: From: John Harrison To prevent running out of bits, new w/a enable flags are being added via a KLV system instead of a 32 bit flags word. Signed-off-by: John Harrison --- .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h | 1 + drivers/gpu/drm/i915/gt/uc/intel_guc.h | 3 + drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 64 ++- drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 6 ++ drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 5 +- 5 files changed, 77 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h index dabeaf4f245f3..00d6402333f8e 100644 --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_errors_abi.h @@ -36,6 +36,7 @@ enum intel_guc_load_status { INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_START, INTEL_GUC_LOAD_STATUS_MPU_DATA_INVALID = 0x73, INTEL_GUC_LOAD_STATUS_INIT_MMIO_SAVE_RESTORE_INVALID = 0x74, + INTEL_GUC_LOAD_STATUS_KLV_WORKAROUND_INIT_ERROR = 0x75, INTEL_GUC_LOAD_STATUS_INVALID_INIT_DATA_RANGE_END, INTEL_GUC_LOAD_STATUS_READY = 0xF0, diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index 6c392bad29c19..3b1fc5f96306b 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -186,6 +186,8 @@ struct intel_guc { struct guc_mmio_reg *ads_regset; /** @ads_golden_ctxt_size: size of the golden contexts in the ADS */ u32 ads_golden_ctxt_size; + /** @ads_waklv_size: size of workaround KLVs */ + u32 ads_waklv_size; /** @ads_capture_size: size of register lists in the ADS used for error capture */ u32 ads_capture_size; /** @ads_engine_usage_size: size of engine usage in the ADS */ @@ -295,6 +297,7 @@ struct intel_guc { #define MAKE_GUC_VER(maj, min, pat) (((maj) << 16) | ((min) << 8) | (pat)) #define MAKE_GUC_VER_STRUCT(ver) MAKE_GUC_VER((ver).major, (ver).minor, (ver).patch) #define GUC_SUBMIT_VER(guc) MAKE_GUC_VER_STRUCT((guc)->submission_version) +#define GUC_FIRMWARE_VER(guc) MAKE_GUC_VER_STRUCT((guc)->fw.file_selected.ver) static inline struct intel_guc *log_to_guc(struct intel_guc_log *log) { diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 63724e17829a7..792910af3a481 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -46,6 +46,10 @@ * +---+ * | padding | * +---+ <== 4K aligned + * | w/a KLVs | + * +---+ + * | padding | + * +---+ <== 4K aligned * | capture lists | * +---+ * | padding | @@ -88,6 +92,11 @@ static u32 guc_ads_golden_ctxt_size(struct intel_guc *guc) return PAGE_ALIGN(guc->ads_golden_ctxt_size); } +static u32 guc_ads_waklv_size(struct intel_guc *guc) +{ + return PAGE_ALIGN(guc->ads_waklv_size); +} + static u32 guc_ads_capture_size(struct intel_guc *guc) { return PAGE_ALIGN(guc->ads_capture_size); @@ -113,7 +122,7 @@ static u32 guc_ads_golden_ctxt_offset(struct intel_guc *guc) return PAGE_ALIGN(offset); } -static u32 guc_ads_capture_offset(struct intel_guc *guc) +static u32 guc_ads_waklv_offset(struct intel_guc *guc) { u32 offset; @@ -123,6 +132,16 @@ static u32 guc_ads_capture_offset(struct intel_guc *guc) return PAGE_ALIGN(offset); } +static u32 guc_ads_capture_offset(struct intel_guc *guc) +{ + u32 offset; + + offset = guc_ads_waklv_offset(guc) + + guc_ads_waklv_size(guc); + + return PAGE_ALIGN(offset); +} + static u32 guc_ads_private_data_offset(struct intel_guc *guc) { u32 offset; @@ -791,6 +810,40 @@ guc_capture_prep_lists(struct intel_guc *guc) return PAGE_ALIGN(total_size); } +static void guc_waklv_init(struct intel_guc *guc) +{ + struct intel_gt *gt = guc_to_gt(guc); + u32 offset, addr_ggtt, remain, size; + + if (!intel_uc_uses_guc_submission(>uc)) + return; + + if (GUC_FIRMWARE_VER(guc) < MAKE_GUC_VER(70, 10, 0)) + return; should this be <= ? No. GuC 70.10.0 is when w/a KLVs were introduced. So we want to skip on any version that is prior to 70.10.0. + + GEM_BUG_ON(iosys_map_is_null(>ads_map)); + offset = guc_ads_waklv_offset(guc); + remain = guc_ads_waklv_size(guc); + + /* Add workarounds here */ + extra blank line? The
Re: [Intel-gfx] [PATCH 2/4] drm/i915: Enable Wa_16019325821
On 10/6/2023 17:10, Belgaumkar, Vinay wrote: On 9/15/2023 2:55 PM, john.c.harri...@intel.com wrote: From: John Harrison Some platforms require holding RCS context switches until CCS is idle (the reverse w/a of Wa_14014475959). Some platforms require both versions. Signed-off-by: John Harrison --- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 19 +++ drivers/gpu/drm/i915/gt/intel_engine_types.h | 7 --- drivers/gpu/drm/i915/gt/uc/intel_guc.c | 4 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 3 ++- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 8 +++- 5 files changed, 28 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c index 0143445dba830..8b494825c55f2 100644 --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c @@ -733,21 +733,23 @@ static u32 *gen12_emit_preempt_busywait(struct i915_request *rq, u32 *cs) } /* Wa_14014475959:dg2 */ -#define CCS_SEMAPHORE_PPHWSP_OFFSET 0x540 -static u32 ccs_semaphore_offset(struct i915_request *rq) +/* Wa_16019325821 */ +#define HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET 0x540 +static u32 hold_switchout_semaphore_offset(struct i915_request *rq) { return i915_ggtt_offset(rq->context->state) + - (LRC_PPHWSP_PN * PAGE_SIZE) + CCS_SEMAPHORE_PPHWSP_OFFSET; + (LRC_PPHWSP_PN * PAGE_SIZE) + HOLD_SWITCHOUT_SEMAPHORE_PPHWSP_OFFSET; } /* Wa_14014475959:dg2 */ -static u32 *ccs_emit_wa_busywait(struct i915_request *rq, u32 *cs) +/* Wa_16019325821 */ +static u32 *hold_switchout_emit_wa_busywait(struct i915_request *rq, u32 *cs) { int i; *cs++ = MI_ATOMIC_INLINE | MI_ATOMIC_GLOBAL_GTT | MI_ATOMIC_CS_STALL | MI_ATOMIC_MOVE; - *cs++ = ccs_semaphore_offset(rq); + *cs++ = hold_switchout_semaphore_offset(rq); *cs++ = 0; *cs++ = 1; @@ -763,7 +765,7 @@ static u32 *ccs_emit_wa_busywait(struct i915_request *rq, u32 *cs) MI_SEMAPHORE_POLL | MI_SEMAPHORE_SAD_EQ_SDD; *cs++ = 0; - *cs++ = ccs_semaphore_offset(rq); + *cs++ = hold_switchout_semaphore_offset(rq); *cs++ = 0; return cs; @@ -780,8 +782,9 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *rq, u32 *cs) cs = gen12_emit_preempt_busywait(rq, cs); /* Wa_14014475959:dg2 */ - if (intel_engine_uses_wa_hold_ccs_switchout(rq->engine)) - cs = ccs_emit_wa_busywait(rq, cs); + /* Wa_16019325821 */ + if (intel_engine_uses_wa_hold_switchout(rq->engine)) + cs = hold_switchout_emit_wa_busywait(rq, cs); rq->tail = intel_ring_offset(rq, cs); assert_ring_tail_valid(rq->ring, rq->tail); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index a7e6775980043..68fe1cef9cd94 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -573,7 +573,7 @@ struct intel_engine_cs { #define I915_ENGINE_HAS_RCS_REG_STATE BIT(9) #define I915_ENGINE_HAS_EU_PRIORITY BIT(10) #define I915_ENGINE_FIRST_RENDER_COMPUTE BIT(11) -#define I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT BIT(12) +#define I915_ENGINE_USES_WA_HOLD_SWITCHOUT BIT(12) unsigned int flags; /* @@ -683,10 +683,11 @@ intel_engine_has_relative_mmio(const struct intel_engine_cs * const engine) } /* Wa_14014475959:dg2 */ +/* Wa_16019325821 */ static inline bool -intel_engine_uses_wa_hold_ccs_switchout(struct intel_engine_cs *engine) +intel_engine_uses_wa_hold_switchout(struct intel_engine_cs *engine) { - return engine->flags & I915_ENGINE_USES_WA_HOLD_CCS_SWITCHOUT; + return engine->flags & I915_ENGINE_USES_WA_HOLD_SWITCHOUT; } #endif /* __INTEL_ENGINE_TYPES_H__ */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c index 27df41c53b890..4001679ba0793 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c @@ -294,6 +294,10 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc) IS_DG2(gt->i915)) flags |= GUC_WA_HOLD_CCS_SWITCHOUT; + /* Wa_16019325821 */ + if (IS_GFX_GT_IP_RANGE(gt, IP_VER(12, 70), IP_VER(12, 71))) + flags |= GUC_WA_RCS_CCS_SWITCHOUT; + /* * Wa_14012197797 * Wa_22011391025 diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h index b4d56eccfb1f0..f97af0168a66b 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h @@ -95,8 +95,9 @@ #define GUC_WA_GAM_CREDITS BIT(10) #define GUC_WA_DUAL_QUEUE BIT(11) #define GUC_WA_RCS_RESET_BEFORE_RC6 BIT(13) -#define GUC_WA_CONTEXT_ISOLATION BIT(15) #define GUC_WA_PRE_PARSER BIT(14) +#define GUC_WA_CONTEXT_ISOLATION BIT(15) +#define GUC_WA_RCS_CCS_SWITCHOUT
Re: [PATCH v2 1/2] drm/msm/dp: don't touch DP subconnector property in eDP case
On 10/25/2023 2:23 AM, Dmitry Baryshkov wrote: From: Abel Vesa In case of the eDP connection there is no subconnetor and as such no subconnector property. Put drm_dp_set_subconnector_property() calls under the !is_edp condition. Fixes: bfcc3d8f94f4 ("drm/msm/dp: support setting the DP subconnector type") Signed-off-by: Abel Vesa Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/dp/dp_display.c | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) We still need to unify the calls to drm_dp_set_subconnector_property() for the hpd connect/disconnect places preferably in dp_display_send_hpd_notification(). That way, we would have had to make this change only in one location. If you want to pursue that as a separate patch, I am fine as well. Hence, Reviewed-by: Abhinav Kumar
Re: [PATCH] dt-bindings: display: ssd132x: Remove '-' before compatible enum
Javier Martinez Canillas writes: [...] >>> Pushed to drm-misc (drm-misc-next). Thanks! >> >> Given what introduced this is before the drm-misc-next-2023-10-19 tag, >> isn't it going into 6.7 and needs to be in the fixes branch? Though that >> doesn't exist yet for 6.7 fixes. I don't understand why that's not done >> as part of the last tag for a cycle. But drm-misc is special. >> > > I pushed to drm-misc-next because I thought that there will be a last PR > of drm-misc-next for 6.7, but it seems I missed it for a couple of hours > (that is drm-misc-next-2023-10-27) :( > > https://lists.freedesktop.org/archives/dri-devel/2023-October/425698.html > > The solution now is to cherry-pick the fixes already in drm-misc-next to > drm-misc-next-fixes, to make sure that land in 6.7. I can do that once > drm-next is back merged to that branch, if you think the schema warning > fix must land in 6.7 and can't wait for the next release. > Or wait for the drm-misc-fixes branch to be back merged and land it as a part of the 6.7-rc cycle. I'll do that since it would cause less trouble to the drm-misc maintainers. -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
Re: [PATCH] dt-bindings: display: ssd132x: Remove '-' before compatible enum
Rob Herring writes: Hello Rob, > On Fri, Oct 27, 2023 at 11:30:56AM +0200, Javier Martinez Canillas wrote: >> Rob Herring writes: >> >> > On Sat, 21 Oct 2023 00:30:17 +0200, Javier Martinez Canillas wrote: >> >> This is a leftover from when the binding schema had the compatible string >> >> property enum as a 'oneOf' child and the '-' was not removed when 'oneOf' >> >> got dropped during the binding review process. >> >> >> >> Reported-by: Rob Herring >> >> Closes: >> >> https://lore.kernel.org/dri-devel/cal_jsq+h8dcnpkqhokqoodcc8+qi3m0prxrfkz_y4v37ymj...@mail.gmail.com/ >> >> Signed-off-by: Javier Martinez Canillas >> >> --- >> >> >> >> .../devicetree/bindings/display/solomon,ssd132x.yaml | 8 >> >> 1 file changed, 4 insertions(+), 4 deletions(-) >> >> >> > >> > Reviewed-by: Rob Herring >> > >> >> Pushed to drm-misc (drm-misc-next). Thanks! > > Given what introduced this is before the drm-misc-next-2023-10-19 tag, > isn't it going into 6.7 and needs to be in the fixes branch? Though that > doesn't exist yet for 6.7 fixes. I don't understand why that's not done > as part of the last tag for a cycle. But drm-misc is special. > I pushed to drm-misc-next because I thought that there will be a last PR of drm-misc-next for 6.7, but it seems I missed it for a couple of hours (that is drm-misc-next-2023-10-27) :( https://lists.freedesktop.org/archives/dri-devel/2023-October/425698.html The solution now is to cherry-pick the fixes already in drm-misc-next to drm-misc-next-fixes, to make sure that land in 6.7. I can do that once drm-next is back merged to that branch, if you think the schema warning fix must land in 6.7 and can't wait for the next release. > Rob > -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
[pull] amdgpu, amdkfd drm-next-6.7
Hi Dave, Sima, Fixes for 6.7. The following changes since commit 5258dfd4a6adb5f45f046b0dd2e73c680f880d9d: usb: typec: altmodes/displayport: fixup drm internal api change vs new user. (2023-10-27 07:55:41 +1000) are available in the Git repository at: https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-next-6.7-2023-10-27 for you to fetch changes up to dd3dd9829bf9a4ecd55482050745efdd9f7f97fc: drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo (2023-10-27 14:23:01 -0400) amd-drm-next-6.7-2023-10-27: amdgpu: - RAS fixes - Seamless boot fixes - NBIO 7.7 fix - SMU 14.0 fixes - GC 11.5 fixes - DML2 fixes - ASPM fixes - VPE fixes - Misc code cleanups - SRIOV fixes - Add some missing copyright notices - DCN 3.5 fixes - FAMS fixes - Backlight fix - S/G display fix - fdinfo cleanups - EXT_COHERENT fixes for APU and NUMA systems amdkfd: - Misc fixes - Misc code cleanups - SVM fixes Agustin Gutierrez (1): drm/amd/display: Remove power sequencing check Alex Hung (2): drm/amd/display: Revert "drm/amd/display: allow edp updates for virtual signal" drm/amd/display: Set emulated sink type to HDMI accordingly. Alvin Lee (1): drm/amd/display: Update FAMS sequence for DCN30 & DCN32 Aric Cyr (1): drm/amd/display: 3.2.256 Aurabindo Pillai (1): drm/amd/display: add interface to query SubVP status Candice Li (2): drm/amdgpu: Identify data parity error corrected in replay mode drm/amdgpu: Retrieve CE count from ce_count_lo_chip in EccInfo table David Francis (1): drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems Fangzhi Zuo (1): drm/amd/display: Fix MST Multi-Stream Not Lighting Up on dcn35 George Shen (1): drm/amd/display: Update SDP VSC colorimetry from DP test automation request Hamza Mahfooz (1): drm/amd/display: fix S/G display enablement Hugo Hu (1): drm/amd/display: reprogram det size while seamless boot Ilya Bakoulin (1): drm/amd/display: Fix shaper using bad LUT params Iswara Nagulendran (1): drm/amd/display: Read before writing Backlight Mode Set Register James Zhu (1): drm/amdxcp: fix amdxcp unloads incompletely Jesse Zhang (1): drm/amdkfd: Fix shift out-of-bounds issue Jiadong Zhu (2): drm/amd/pm: drop unneeded dpm features disablement for SMU 14.0.0 drm/amdgpu: add tmz support for GC IP v11.5.0 Kenneth Feng (1): drm/amd/amdgpu: avoid to disable gfxhub interrupt when driver is unloaded Lang Yu (1): drm/amdgpu/vpe: correct queue stop programing Li Ma (2): drm/amdgpu: modify if condition in nbio_v7_7.c drm/amd/amdgpu: fix the GPU power print error in pm info Lijo Lazar (4): drm/amdgpu: Add API to get full IP version drm/amdgpu: Use discovery table's subrevision drm/amdgpu: Add a read to GFX v9.4.3 ring test drm/amdgpu: Use pcie domain of xcc acpi objects Lin.Cao (2): drm/amdgpu remove restriction of sriov max_pfn on Vega10 drm/amd: check num of link levels when update pcie param Ma Jun (1): drm/amd/pm: Fix the return value in default case Mario Limonciello (4): drm/amd: Disable ASPM for VI w/ all Intel systems drm/amd: Disable PP_PCIE_DPM_MASK when dynamic speed switching not supported drm/amd: Move AMD_IS_APU check for ASPM into top level function drm/amd: Explicitly disable ASPM when dynamic switching disabled Michael Strauss (1): drm/amd/display: Disable SYMCLK32_SE RCO on DCN314 Mukul Joshi (1): drm/amdgpu: Fix typo in IP discovery parsing Nicholas Kazlauskas (2): drm/amd/display: Revert "Improve x86 and dmub ips handshake" drm/amd/display: Fix IPS handshake for idle optimizations Philip Yang (2): Revert "drm/amdkfd:remove unused code" Revert "drm/amdkfd: Use partial migrations in GPU page faults" Qu Huang (1): drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL Rob Clark (1): drm/amdgpu: Remove duplicate fdinfo fields Rodrigo Siqueira (5): drm/amd/display: Set the DML2 attribute to false in all DCNs older than version 3.5 drm/amd/display: Fix DMUB errors introduced by DML2 drm/amd/display: Correct enum typo drm/amd/display: Add prefix to amdgpu crtc functions drm/amd/display: Add prefix for plane functions Samson Tam (2): drm/amd/display: fix num_ways overflow error drm/amd/display: add null check for invalid opps Srinivasan Shanmugam (1): drm/amdkfd: Address 'remap_list' not described in 'svm_range_add' Stylon Wang (3): drm/amd/display: Add missing copyright notice in DMUB drm/amd/display: Fix copyright notice in DML2 code drm/amd/display: Fix copyright notice in DC code Sung Joon Kim (2): drm/amd/display: Add a check for idle power optimization
[PATCH] drm/msm/gem: Add metadata
From: Rob Clark The EXT_external_objects extension is a bit awkward as it doesn't pass explicit modifiers, leaving the importer to guess with incomplete information. In the case of vk (turnip) exporting and gl (freedreno) importing, the "OPTIMAL_TILING_EXT" layout depends on VkImageCreateInfo flags (among other things), which the importer does not know. Which unfortunately leaves us with the need for a metadata back-channel. The contents of the metadata are defined by userspace. The EXT_external_objects extension is only required to work between compatible versions of gl and vk drivers, as defined by device and driver UUIDs. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_drv.c | 59 +-- drivers/gpu/drm/msm/msm_gem.h | 4 +++ include/uapi/drm/msm_drm.h| 2 ++ 3 files changed, 63 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c index b61ccea05327..8fe2677ea37a 100644 --- a/drivers/gpu/drm/msm/msm_drv.c +++ b/drivers/gpu/drm/msm/msm_drv.c @@ -37,9 +37,10 @@ * - 1.9.0 - Add MSM_SUBMIT_FENCE_SN_IN * - 1.10.0 - Add MSM_SUBMIT_BO_NO_IMPLICIT * - 1.11.0 - Add wait boost (MSM_WAIT_FENCE_BOOST, MSM_PREP_BOOST) + * - 1.12.0 - Add MSM_INFO_SET_METADATA and MSM_INFO_GET_METADATA */ #define MSM_VERSION_MAJOR 1 -#define MSM_VERSION_MINOR 10 +#define MSM_VERSION_MINOR 12 #define MSM_VERSION_PATCHLEVEL 0 static void msm_deinit_vram(struct drm_device *ddev); @@ -566,6 +567,8 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void *data, break; case MSM_INFO_SET_NAME: case MSM_INFO_GET_NAME: + case MSM_INFO_SET_METADATA: + case MSM_INFO_GET_METADATA: break; default: return -EINVAL; @@ -618,7 +621,7 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void *data, break; case MSM_INFO_GET_NAME: if (args->value && (args->len < strlen(msm_obj->name))) { - ret = -EINVAL; + ret = -ETOOSMALL; break; } args->len = strlen(msm_obj->name); @@ -627,6 +630,58 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void *data, msm_obj->name, args->len)) ret = -EFAULT; } + break; + case MSM_INFO_SET_METADATA: + /* Impose a moderate upper bound on metadata size: */ + if (args->len > 128) { + ret = -EOVERFLOW; + break; + } + + ret = msm_gem_lock_interruptible(obj); + if (ret) + break; + + msm_obj->metadata = + krealloc(msm_obj->metadata, args->len, GFP_KERNEL); + msm_obj->metadata_size = args->len; + + if (copy_from_user(msm_obj->metadata, u64_to_user_ptr(args->value), + args->len)) { + msm_obj->metadata_size = 0; + ret = -EFAULT; + } + + msm_gem_unlock(obj); + + break; + case MSM_INFO_GET_METADATA: + if (!args->value) { + /* +* Querying the size is inherently racey, but +* EXT_external_objects expects the app to confirm +* via device and driver UUIDs that the exporter and +* importer versions match. All we can do from the +* kernel side is check the length under obj lock +* when userspace tries to retrieve the metadata +*/ + args->len = msm_obj->metadata_size; + break; + } + + ret = msm_gem_lock_interruptible(obj); + if (ret) + break; + + if (args->len < msm_obj->metadata_size) { + ret = -ETOOSMALL; + } else if (copy_to_user(u64_to_user_ptr(args->value), + msm_obj->metadata, args->len)) { + ret = -EFAULT; + } + + msm_gem_unlock(obj); + break; } diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index 7f34263048a3..8d414b072c29 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -109,6 +109,10 @@ struct msm_gem_object { char name[32]; /* Identifier to print for the debugfs files */ + /* userspace metadata backchannel */ + void *metadata; + u32 metadata_size; + /** * pin_count: Number of times the pages are pinned * diff --git a/include/uapi/drm/msm_drm.h
[PATCH] accel/qaic: Quiet array bounds check on DMA abort message
From: Carl Vanderlip Current wrapper is right-sized to the message being transferred; however, this is smaller than the structure defining message wrappers since the trailing element is a union of message/transfer headers of various sizes (8 and 32 bytes on 32-bit system where issue was reported). Using the smaller header with a small message (wire_trans_dma_xfer is 24 bytes including header) ends up being smaller than a wrapper with the larger header. There are no accesses outside of the defined size, however they are possible if the larger union member is referenced. Abort messages are outside of hot-path and changing the wrapper struct would require a larger rewrite, so having the memory allocated to the message be 8 bytes too big is acceptable. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202310182253.bcb9jcyj-...@intel.com/ Signed-off-by: Carl Vanderlip Reviewed-by: Pranjal Ramajor Asha Kanojiya Reviewed-by: Jeffrey Hugo Signed-off-by: Jeffrey Hugo --- drivers/accel/qaic/qaic_control.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/accel/qaic/qaic_control.c b/drivers/accel/qaic/qaic_control.c index 388abd40024b..84915824be54 100644 --- a/drivers/accel/qaic/qaic_control.c +++ b/drivers/accel/qaic/qaic_control.c @@ -1138,7 +1138,7 @@ static int abort_dma_cont(struct qaic_device *qdev, struct wrapper_list *wrapper if (!list_is_first(>list, >list)) kref_put(>ref_count, free_wrapper); - wrapper = add_wrapper(wrappers, offsetof(struct wrapper_msg, trans) + sizeof(*out_trans)); + wrapper = add_wrapper(wrappers, sizeof(*wrapper)); if (!wrapper) return -ENOMEM; -- 2.40.1
RE: [PATCH] drm/radeon: replace 1-element arrays with flexible-array members
[Public] > -Original Message- > From: José Pekkarinen > Sent: Friday, October 27, 2023 12:59 PM > To: Deucher, Alexander ; Koenig, Christian > ; Pan, Xinhui ; > sk...@linuxfoundation.org > Cc: José Pekkarinen ; airl...@gmail.com; > dan...@ffwll.ch; amd-...@lists.freedesktop.org; dri- > de...@lists.freedesktop.org; linux-ker...@vger.kernel.org; linux-kernel- > ment...@lists.linuxfoundation.org > Subject: [PATCH] drm/radeon: replace 1-element arrays with flexible-array > members > > Reported by coccinelle, the following patch will move the following 1 element > arrays to flexible arrays. > > drivers/gpu/drm/radeon/atombios.h:5523:32-48: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:5545:32-48: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:5461:34-44: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4447:30-40: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4236:30-41: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7044:24-37: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7054:24-37: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7095:28-45: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7553:8-17: WARNING use flexible-array > member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7559:8-17: WARNING use flexible-array > member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:3896:27-37: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:5443:16-25: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:5454:34-43: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4603:21-32: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:6299:32-44: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4628:32-46: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:6285:29-39: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4296:30-36: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4756:28-36: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:4064:22-35: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7327:9-24: WARNING use flexible-array > member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) > drivers/gpu/drm/radeon/atombios.h:7332:32-53: WARNING use flexible- > array member instead > (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero- > length-and-one-element-arrays) >
Re: [PATCH v2] i915/perf: Fix NULL deref bugs with drm_dbg() calls
On 27/10/2023 18:28, Harshit Mogalapalli wrote: When i915 perf interface is not available dereferencing it will lead to NULL dereferences. As returning -ENOTSUPP is pretty clear return when perf interface is not available. Fixes: 2fec539112e8 ("i915/perf: Replace DRM_DEBUG with driver specific drm_dbg call") Suggested-by: Tvrtko Ursulin Signed-off-by: Harshit Mogalapalli --- v1 --> v2: Remove the debug calls as they don't add much value and -ENOTSUPP is a good enough return value. --- drivers/gpu/drm/i915/i915_perf.c | 15 +++ 1 file changed, 3 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 2f3ecd7d4804..7b1c8de2f9cb 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -4227,11 +4227,8 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data, u32 known_open_flags; int ret; - if (!perf->i915) { - drm_dbg(>i915->drm, - "i915 perf interface not available for this system\n"); + if (!perf->i915) return -ENOTSUPP; - } known_open_flags = I915_PERF_FLAG_FD_CLOEXEC | I915_PERF_FLAG_FD_NONBLOCK | @@ -4607,11 +4604,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data, struct i915_oa_reg *regs; int err, id; - if (!perf->i915) { - drm_dbg(>i915->drm, - "i915 perf interface not available for this system\n"); + if (!perf->i915) return -ENOTSUPP; - } if (!perf->metrics_kobj) { drm_dbg(>i915->drm, @@ -4773,11 +4767,8 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data, struct i915_oa_config *oa_config; int ret; - if (!perf->i915) { - drm_dbg(>i915->drm, - "i915 perf interface not available for this system\n"); + if (!perf->i915) return -ENOTSUPP; - } if (i915_perf_stream_paranoid && !perfmon_capable()) { drm_dbg(>i915->drm, Thanks for re-spinning it so quickly! LGTM. Reviewed-by: Tvrtko Ursulin Regards, Tvrtko
Re: (subset) [PATCH v2] drm/doc: ci: Require more context for flaky tests
On 26/10/2023 12:31, Maxime Ripard wrote: Hi, On Thu, Oct 26, 2023 at 11:27:18AM -0300, Helen Koike wrote: On 26/10/2023 10:27, Maxime Ripard wrote: On Thu, Oct 26, 2023 at 09:08:03AM -0300, Helen Koike wrote: On 26/10/2023 09:01, Helen Koike wrote: On 26/10/2023 07:58, Maxime Ripard wrote: On Wed, 25 Oct 2023 16:24:41 +0200, Maxime Ripard wrote: Flaky tests can be very difficult to reproduce after the facts, which will make it even harder to ever fix. Let's document the metadata we agreed on to provide more context to anyone trying to address these fixes. [...] Applied to drm/drm-misc (drm-misc-next). Thanks! Could you also apply https://patchwork.kernel.org/project/dri-devel/cover/20231024004525.169002-1-helen.ko...@collabora.com/ (and the dependencies listed on it). For some reason, commit message 7/10 (drm/ci: increase i915 job timeout to 1h30m) looks incomplete in patchwork, but it looks fine in my branch: https://gitlab.freedesktop.org/helen.fornazier/linux/-/commits/for-drm-misc-wip/ Let me know if you prefer that I send it again or if you could pull from the branch. It was fine on lore.kernel.org and that's where I'm pulling from, so it all worked out :) Everything you asked for should be applied now Maxime Awesome, thank you! Sorry, just another request, could you please pull this other one updating MAINTAINERS? https://patchwork.kernel.org/project/linux-arm-msm/patch/20230919182249.153499-1-helen.ko...@collabora.com/ I don't mind, but the expectation (the one I had at least) was that you would do it :) If you don't have drm-misc access, please create an account, you have done way more than expected to get one already Nice! I just created this request https://gitlab.freedesktop.org/freedesktop/freedesktop/-/issues/983 Would you mind approving it? Thanks! Helen Maxime
[PATCH v2] i915/perf: Fix NULL deref bugs with drm_dbg() calls
When i915 perf interface is not available dereferencing it will lead to NULL dereferences. As returning -ENOTSUPP is pretty clear return when perf interface is not available. Fixes: 2fec539112e8 ("i915/perf: Replace DRM_DEBUG with driver specific drm_dbg call") Suggested-by: Tvrtko Ursulin Signed-off-by: Harshit Mogalapalli --- v1 --> v2: Remove the debug calls as they don't add much value and -ENOTSUPP is a good enough return value. --- drivers/gpu/drm/i915/i915_perf.c | 15 +++ 1 file changed, 3 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 2f3ecd7d4804..7b1c8de2f9cb 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -4227,11 +4227,8 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data, u32 known_open_flags; int ret; - if (!perf->i915) { - drm_dbg(>i915->drm, - "i915 perf interface not available for this system\n"); + if (!perf->i915) return -ENOTSUPP; - } known_open_flags = I915_PERF_FLAG_FD_CLOEXEC | I915_PERF_FLAG_FD_NONBLOCK | @@ -4607,11 +4604,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data, struct i915_oa_reg *regs; int err, id; - if (!perf->i915) { - drm_dbg(>i915->drm, - "i915 perf interface not available for this system\n"); + if (!perf->i915) return -ENOTSUPP; - } if (!perf->metrics_kobj) { drm_dbg(>i915->drm, @@ -4773,11 +4767,8 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data, struct i915_oa_config *oa_config; int ret; - if (!perf->i915) { - drm_dbg(>i915->drm, - "i915 perf interface not available for this system\n"); + if (!perf->i915) return -ENOTSUPP; - } if (i915_perf_stream_paranoid && !perfmon_capable()) { drm_dbg(>i915->drm, -- 2.39.3
Re: [PATCH] dt-bindings: display: ssd132x: Remove '-' before compatible enum
On Fri, Oct 27, 2023 at 11:30:56AM +0200, Javier Martinez Canillas wrote: > Rob Herring writes: > > > On Sat, 21 Oct 2023 00:30:17 +0200, Javier Martinez Canillas wrote: > >> This is a leftover from when the binding schema had the compatible string > >> property enum as a 'oneOf' child and the '-' was not removed when 'oneOf' > >> got dropped during the binding review process. > >> > >> Reported-by: Rob Herring > >> Closes: > >> https://lore.kernel.org/dri-devel/cal_jsq+h8dcnpkqhokqoodcc8+qi3m0prxrfkz_y4v37ymj...@mail.gmail.com/ > >> Signed-off-by: Javier Martinez Canillas > >> --- > >> > >> .../devicetree/bindings/display/solomon,ssd132x.yaml | 8 > >> 1 file changed, 4 insertions(+), 4 deletions(-) > >> > > > > Reviewed-by: Rob Herring > > > > Pushed to drm-misc (drm-misc-next). Thanks! Given what introduced this is before the drm-misc-next-2023-10-19 tag, isn't it going into 6.7 and needs to be in the fixes branch? Though that doesn't exist yet for 6.7 fixes. I don't understand why that's not done as part of the last tag for a cycle. But drm-misc is special. Rob
Re: [PATCH] i915/perf: Fix NULL deref bugs with drm_dbg() calls
Hi Tvrtko, On 27/10/23 8:17 pm, Tvrtko Ursulin wrote: On 27/10/2023 15:11, Andrzej Hajda wrote: On 27.10.2023 16:07, Harshit Mogalapalli wrote: When i915 perf interface is not available dereferencing it will lead to NULL dereferences. Fix this by using DRM_DEBUG() which the scenario before the commit in the Fixes tag. Fixes: 2fec539112e8 ("i915/perf: Replace DRM_DEBUG with driver specific drm_dbg call") Signed-off-by: Harshit Mogalapalli Reviewed-by: Andrzej Hajda Please hold off merging. --- This is found using smatch(static analysis tool), only compile tested. --- drivers/gpu/drm/i915/i915_perf.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 2f3ecd7d4804..bb48c96b7950 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -4228,8 +4228,7 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data, int ret; if (!perf->i915) { - drm_dbg(>i915->drm, - "i915 perf interface not available for this system\n"); + DRM_DEBUG("i915 perf interface not available for this system\n"); What's that struct drm_device *dev function argument a few lines up? :) Although TBH all these these could just be removed since I doubt they are adding any value and ENOTSUPP is pretty clear. Thanks for checking. I will remove the dbg() calls and send a V2. Regards, Harshit Regards, Tvrtko return -ENOTSUPP; } @@ -4608,8 +4607,7 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data, int err, id; if (!perf->i915) { - drm_dbg(>i915->drm, - "i915 perf interface not available for this system\n"); + DRM_DEBUG("i915 perf interface not available for this system\n"); return -ENOTSUPP; } @@ -4774,8 +4772,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data, int ret; if (!perf->i915) { - drm_dbg(>i915->drm, - "i915 perf interface not available for this system\n"); + DRM_DEBUG("i915 perf interface not available for this system\n"); return -ENOTSUPP; }
[PATCH 6/7] drm/exec: Pass in initial # of objects
From: Rob Clark In cases where the # is known ahead of time, it is silly to do the table resize dance. Signed-off-by: Rob Clark --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 4 ++-- drivers/gpu/drm/drm_exec.c | 15 --- drivers/gpu/drm/nouveau/nouveau_exec.c | 2 +- drivers/gpu/drm/nouveau/nouveau_uvmm.c | 2 +- include/drm/drm_exec.h | 2 +- 8 files changed, 22 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index efdb1c48f431..d27ca8f61929 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -65,7 +65,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, } amdgpu_sync_create(>sync); - drm_exec_init(>exec, DRM_EXEC_INTERRUPTIBLE_WAIT); + drm_exec_init(>exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); return 0; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c index 720011019741..796fa6f1420b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c @@ -70,7 +70,7 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct drm_exec exec; int r; - drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT); + drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); drm_exec_until_all_locked() { r = amdgpu_vm_lock_pd(vm, , 0); if (likely(!r)) @@ -110,7 +110,7 @@ int amdgpu_unmap_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm, struct drm_exec exec; int r; - drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT); + drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT, 0); drm_exec_until_all_locked() { r = amdgpu_vm_lock_pd(vm, , 0); if (likely(!r)) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index ca4d2d430e28..16f1715148ad 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -203,7 +203,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object *obj, struct drm_exec exec; long r; - drm_exec_init(, DRM_EXEC_IGNORE_DUPLICATES); + drm_exec_init(, DRM_EXEC_IGNORE_DUPLICATES, 0); drm_exec_until_all_locked() { r = drm_exec_prepare_obj(, >tbo.base, 1); drm_exec_retry_on_contention(); @@ -739,7 +739,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data, } drm_exec_init(, DRM_EXEC_INTERRUPTIBLE_WAIT | - DRM_EXEC_IGNORE_DUPLICATES); + DRM_EXEC_IGNORE_DUPLICATES, 0); drm_exec_until_all_locked() { if (gobj) { r = drm_exec_lock_obj(, gobj); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c index b6015157763a..3c351941701e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c @@ -1105,7 +1105,7 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev, amdgpu_sync_create(); - drm_exec_init(, 0); + drm_exec_init(, 0, 0); drm_exec_until_all_locked() { r = drm_exec_lock_obj(, _data->meta_data_obj->tbo.base); @@ -1176,7 +1176,7 @@ int amdgpu_mes_ctx_unmap_meta_data(struct amdgpu_device *adev, struct drm_exec exec; long r; - drm_exec_init(, 0); + drm_exec_init(, 0, 0); drm_exec_until_all_locked() { r = drm_exec_lock_obj(, _data->meta_data_obj->tbo.base); diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c index 5d2809de4517..27d11c20d148 100644 --- a/drivers/gpu/drm/drm_exec.c +++ b/drivers/gpu/drm/drm_exec.c @@ -69,16 +69,25 @@ static void drm_exec_unlock_all(struct drm_exec *exec) * drm_exec_init - initialize a drm_exec object * @exec: the drm_exec object to initialize * @flags: controls locking behavior, see DRM_EXEC_* defines + * @nr: the initial # of objects * * Initialize the object and make sure that we can track locked objects. + * + * If nr is non-zero then it is used as the initial objects table size. + * In either case, the table will grow (be re-allocated) on demand. */ -void drm_exec_init(struct drm_exec *exec, uint32_t flags) +void drm_exec_init(struct drm_exec *exec, uint32_t flags, unsigned nr) { + size_t sz = PAGE_SIZE; + + if (nr) + sz = (size_t)nr * sizeof(void *); + exec->flags = flags; - exec->objects = kmalloc(PAGE_SIZE, GFP_KERNEL); + exec->objects = kmalloc(sz, GFP_KERNEL);
[PATCH 7/7] drm/msm/gem: Convert to drm_exec
From: Rob Clark Replace the ww_mutex locking dance with the drm_exec helper. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/Kconfig | 1 + drivers/gpu/drm/msm/msm_gem.h| 5 +- drivers/gpu/drm/msm/msm_gem_submit.c | 117 +-- 3 files changed, 24 insertions(+), 99 deletions(-) diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig index 6309a857ca31..f91d87afc0d3 100644 --- a/drivers/gpu/drm/msm/Kconfig +++ b/drivers/gpu/drm/msm/Kconfig @@ -16,6 +16,7 @@ config DRM_MSM select DRM_DP_AUX_BUS select DRM_DISPLAY_DP_HELPER select DRM_DISPLAY_HELPER + select DRM_EXEC select DRM_KMS_HELPER select DRM_PANEL select DRM_BRIDGE diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index af884ced7a0d..7f34263048a3 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -9,6 +9,7 @@ #include #include +#include "drm/drm_exec.h" #include "drm/gpu_scheduler.h" #include "msm_drv.h" @@ -254,7 +255,7 @@ struct msm_gem_submit { struct msm_gpu *gpu; struct msm_gem_address_space *aspace; struct list_head node; /* node in ring submit list */ - struct ww_acquire_ctx ticket; + struct drm_exec exec; uint32_t seqno; /* Sequence number of the submit on the ring */ /* Hw fence, which is created when the scheduler executes the job, and @@ -287,8 +288,6 @@ struct msm_gem_submit { struct drm_msm_gem_submit_reloc *relocs; } *cmd; /* array of size nr_cmds */ struct { -/* make sure these don't conflict w/ MSM_SUBMIT_BO_x */ -#define BO_LOCKED 0x4000 /* obj lock is held */ uint32_t flags; union { struct drm_gem_object *obj; diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 603f04d851d9..f8d14d4ccfef 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -248,85 +248,31 @@ static int submit_lookup_cmds(struct msm_gem_submit *submit, return ret; } -static void submit_unlock_bo(struct msm_gem_submit *submit, int i) -{ - struct drm_gem_object *obj = submit->bos[i].obj; - unsigned cleanup_flags = BO_LOCKED; - unsigned flags = submit->bos[i].flags & cleanup_flags; - - /* -* Clear flags bit before dropping lock, so that the msm_job_run() -* path isn't racing with submit_cleanup() (ie. the read/modify/ -* write is protected by the obj lock in all paths) -*/ - submit->bos[i].flags &= ~cleanup_flags; - - if (flags & BO_LOCKED) - dma_resv_unlock(obj->resv); -} - /* This is where we make sure all the bo's are reserved and pin'd: */ static int submit_lock_objects(struct msm_gem_submit *submit) { - int contended, slow_locked = -1, i, ret = 0; - -retry: - for (i = 0; i < submit->nr_bos; i++) { - struct drm_gem_object *obj = submit->bos[i].obj; - - if (slow_locked == i) - slow_locked = -1; + int ret; - contended = i; + drm_exec_init(>exec, DRM_EXEC_INTERRUPTIBLE_WAIT, submit->nr_bos); - if (!(submit->bos[i].flags & BO_LOCKED)) { - ret = dma_resv_lock_interruptible(obj->resv, - >ticket); + drm_exec_until_all_locked (>exec) { + for (unsigned i = 0; i < submit->nr_bos; i++) { + struct drm_gem_object *obj = submit->bos[i].obj; + ret = drm_exec_prepare_obj(>exec, obj, 1); + drm_exec_retry_on_contention(>exec); if (ret) - goto fail; - submit->bos[i].flags |= BO_LOCKED; + goto error; } } - ww_acquire_done(>ticket); - return 0; -fail: - if (ret == -EALREADY) { - SUBMIT_ERROR(submit, "handle %u at index %u already on submit list\n", -submit->bos[i].handle, i); - ret = -EINVAL; - } - - for (; i >= 0; i--) - submit_unlock_bo(submit, i); - - if (slow_locked > 0) - submit_unlock_bo(submit, slow_locked); - - if (ret == -EDEADLK) { - struct drm_gem_object *obj = submit->bos[contended].obj; - /* we lost out in a seqno race, lock and retry.. */ - ret = dma_resv_lock_slow_interruptible(obj->resv, - >ticket); - if (!ret) { - submit->bos[contended].flags |= BO_LOCKED; - slow_locked = contended; - goto retry; - } - - /* Not expecting -EALREADY
[PATCH 5/7] drm/msm/gem: Cleanup submit_cleanup_bo()
From: Rob Clark Now that it only handles unlock duty, drop the superfluous arg and rename it. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_gem_submit.c | 15 +-- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index d001bf286606..603f04d851d9 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -248,14 +248,10 @@ static int submit_lookup_cmds(struct msm_gem_submit *submit, return ret; } -/* Unwind bo state, according to cleanup_flags. In the success case, only - * the lock is dropped at the end of the submit (and active/pin ref is dropped - * later when the submit is retired). - */ -static void submit_cleanup_bo(struct msm_gem_submit *submit, int i, - unsigned cleanup_flags) +static void submit_unlock_bo(struct msm_gem_submit *submit, int i) { struct drm_gem_object *obj = submit->bos[i].obj; + unsigned cleanup_flags = BO_LOCKED; unsigned flags = submit->bos[i].flags & cleanup_flags; /* @@ -304,10 +300,10 @@ static int submit_lock_objects(struct msm_gem_submit *submit) } for (; i >= 0; i--) - submit_cleanup_bo(submit, i, BO_LOCKED); + submit_unlock_bo(submit, i); if (slow_locked > 0) - submit_cleanup_bo(submit, slow_locked, BO_LOCKED); + submit_unlock_bo(submit, slow_locked); if (ret == -EDEADLK) { struct drm_gem_object *obj = submit->bos[contended].obj; @@ -533,7 +529,6 @@ static int submit_reloc(struct msm_gem_submit *submit, struct drm_gem_object *ob */ static void submit_cleanup(struct msm_gem_submit *submit, bool error) { - unsigned cleanup_flags = BO_LOCKED; unsigned i; if (error) @@ -541,7 +536,7 @@ static void submit_cleanup(struct msm_gem_submit *submit, bool error) for (i = 0; i < submit->nr_bos; i++) { struct drm_gem_object *obj = submit->bos[i].obj; - submit_cleanup_bo(submit, i, cleanup_flags); + submit_unlock_bo(submit, i); if (error) drm_gem_object_put(obj); } -- 2.41.0
[PATCH 4/7] drm/msm/gem: Split out submit_unpin_objects() helper
From: Rob Clark Untangle unpinning from unlock/unref loop. The unpin only happens in error paths so it is easier to decouple from the normal unlock path. Since we never have an intermediate state where a subset of buffers are pinned (ie. we never bail out of the pin or unpin loops) we can replace the bo state flag bit with a global flag in the submit. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_gem.h| 6 +++--- drivers/gpu/drm/msm/msm_gem_submit.c | 22 +- drivers/gpu/drm/msm/msm_ringbuffer.c | 3 ++- 3 files changed, 22 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index c36c1c1fa222..af884ced7a0d 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -270,8 +270,9 @@ struct msm_gem_submit { int fence_id; /* key into queue->fence_idr */ struct msm_gpu_submitqueue *queue; struct pid *pid;/* submitting process */ - bool fault_dumped; /* Limit devcoredump dumping to one per submit */ - bool in_rb; /* "sudo" mode, copy cmds into RB */ + bool bos_pinned : 1; + bool fault_dumped:1;/* Limit devcoredump dumping to one per submit */ + bool in_rb : 1; /* "sudo" mode, copy cmds into RB */ struct msm_ringbuffer *ring; unsigned int nr_cmds; unsigned int nr_bos; @@ -288,7 +289,6 @@ struct msm_gem_submit { struct { /* make sure these don't conflict w/ MSM_SUBMIT_BO_x */ #define BO_LOCKED 0x4000 /* obj lock is held */ -#define BO_PINNED 0x2000 /* obj (pages) is pinned and on active list */ uint32_t flags; union { struct drm_gem_object *obj; diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 786b48a55309..d001bf286606 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -265,9 +265,6 @@ static void submit_cleanup_bo(struct msm_gem_submit *submit, int i, */ submit->bos[i].flags &= ~cleanup_flags; - if (flags & BO_PINNED) - msm_gem_unpin_locked(obj); - if (flags & BO_LOCKED) dma_resv_unlock(obj->resv); } @@ -407,13 +404,28 @@ static int submit_pin_objects(struct msm_gem_submit *submit) mutex_lock(>lru.lock); for (i = 0; i < submit->nr_bos; i++) { msm_gem_pin_obj_locked(submit->bos[i].obj); - submit->bos[i].flags |= BO_PINNED; } mutex_unlock(>lru.lock); + submit->bos_pinned = true; + return ret; } +static void submit_unpin_objects(struct msm_gem_submit *submit) +{ + if (!submit->bos_pinned) + return; + + for (int i = 0; i < submit->nr_bos; i++) { + struct drm_gem_object *obj = submit->bos[i].obj; + + msm_gem_unpin_locked(obj); + } + + submit->bos_pinned = false; +} + static void submit_attach_object_fences(struct msm_gem_submit *submit) { int i; @@ -525,7 +537,7 @@ static void submit_cleanup(struct msm_gem_submit *submit, bool error) unsigned i; if (error) - cleanup_flags |= BO_PINNED; + submit_unpin_objects(submit); for (i = 0; i < submit->nr_bos; i++) { struct drm_gem_object *obj = submit->bos[i].obj; diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c index 9d6e2e10d25a..7ea5eca118eb 100644 --- a/drivers/gpu/drm/msm/msm_ringbuffer.c +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c @@ -29,9 +29,10 @@ static struct dma_fence *msm_job_run(struct drm_sched_job *job) struct drm_gem_object *obj = submit->bos[i].obj; msm_gem_unpin_active(obj); - submit->bos[i].flags &= ~BO_PINNED; } + submit->bos_pinned = false; + mutex_unlock(>lru.lock); msm_gpu_submit(gpu, submit); -- 2.41.0
[PATCH 3/7] drm/msm/gem: Don't queue job to sched in error cases
From: Rob Clark We shouldn't be running the job in error cases. This also avoids having to think too hard about where the objs get unpinned (and if necessary, the resv takes over tracking that the obj is busy).. ie. error cases it always happens synchronously, and normal cases it happens from scheduler job_run() callback. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_gem_submit.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 2d5527dc3e1a..786b48a55309 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -946,6 +946,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data, } } + if (ret) + goto out; + submit_attach_object_fences(submit); /* The scheduler owns a ref now: */ -- 2.41.0
[PATCH 2/7] drm/msm/gem: Remove submit_unlock_unpin_bo()
From: Rob Clark The only point it is called is before pinning objects, so the "unpin" part of the name is fiction. Just remove call submit_cleanup_bo() directly. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_gem_submit.c | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 996274ef32a6..2d5527dc3e1a 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -272,12 +272,6 @@ static void submit_cleanup_bo(struct msm_gem_submit *submit, int i, dma_resv_unlock(obj->resv); } -static void submit_unlock_unpin_bo(struct msm_gem_submit *submit, int i) -{ - unsigned cleanup_flags = BO_PINNED | BO_LOCKED; - submit_cleanup_bo(submit, i, cleanup_flags); -} - /* This is where we make sure all the bo's are reserved and pin'd: */ static int submit_lock_objects(struct msm_gem_submit *submit) { @@ -313,10 +307,10 @@ static int submit_lock_objects(struct msm_gem_submit *submit) } for (; i >= 0; i--) - submit_unlock_unpin_bo(submit, i); + submit_cleanup_bo(submit, i, BO_LOCKED); if (slow_locked > 0) - submit_unlock_unpin_bo(submit, slow_locked); + submit_cleanup_bo(submit, slow_locked, BO_LOCKED); if (ret == -EDEADLK) { struct drm_gem_object *obj = submit->bos[contended].obj; -- 2.41.0
[PATCH 1/7] drm/msm/gem: Remove "valid" tracking
From: Rob Clark This was a small optimization for pre-soft-pin userspace. But mesa switched to soft-pin nearly 5yrs ago. So lets drop the optimization and simplify the code. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/msm_gem.h| 2 -- drivers/gpu/drm/msm/msm_gem_submit.c | 44 +--- 2 files changed, 8 insertions(+), 38 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h index 8ddef5443140..c36c1c1fa222 100644 --- a/drivers/gpu/drm/msm/msm_gem.h +++ b/drivers/gpu/drm/msm/msm_gem.h @@ -271,7 +271,6 @@ struct msm_gem_submit { struct msm_gpu_submitqueue *queue; struct pid *pid;/* submitting process */ bool fault_dumped; /* Limit devcoredump dumping to one per submit */ - bool valid; /* true if no cmdstream patching needed */ bool in_rb; /* "sudo" mode, copy cmds into RB */ struct msm_ringbuffer *ring; unsigned int nr_cmds; @@ -288,7 +287,6 @@ struct msm_gem_submit { } *cmd; /* array of size nr_cmds */ struct { /* make sure these don't conflict w/ MSM_SUBMIT_BO_x */ -#define BO_VALID 0x8000 /* is current addr in cmdstream correct/valid? */ #define BO_LOCKED 0x4000 /* obj lock is held */ #define BO_PINNED 0x2000 /* obj (pages) is pinned and on active list */ uint32_t flags; diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 6d8ec1337e8b..996274ef32a6 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -150,8 +150,6 @@ static int submit_lookup_objects(struct msm_gem_submit *submit, submit->bos[i].handle = submit_bo.handle; submit->bos[i].flags = submit_bo.flags; - /* in validate_objects() we figure out if this is true: */ - submit->bos[i].iova = submit_bo.presumed; } spin_lock(>table_lock); @@ -278,9 +276,6 @@ static void submit_unlock_unpin_bo(struct msm_gem_submit *submit, int i) { unsigned cleanup_flags = BO_PINNED | BO_LOCKED; submit_cleanup_bo(submit, i, cleanup_flags); - - if (!(submit->bos[i].flags & BO_VALID)) - submit->bos[i].iova = 0; } /* This is where we make sure all the bo's are reserved and pin'd: */ @@ -390,8 +385,6 @@ static int submit_pin_objects(struct msm_gem_submit *submit) struct msm_drm_private *priv = submit->dev->dev_private; int i, ret = 0; - submit->valid = true; - for (i = 0; i < submit->nr_bos; i++) { struct drm_gem_object *obj = submit->bos[i].obj; struct msm_gem_vma *vma; @@ -407,14 +400,7 @@ static int submit_pin_objects(struct msm_gem_submit *submit) if (ret) break; - if (vma->iova == submit->bos[i].iova) { - submit->bos[i].flags |= BO_VALID; - } else { - submit->bos[i].iova = vma->iova; - /* iova changed, so address in cmdstream is not valid: */ - submit->bos[i].flags &= ~BO_VALID; - submit->valid = false; - } + submit->bos[i].iova = vma->iova; } /* @@ -451,7 +437,7 @@ static void submit_attach_object_fences(struct msm_gem_submit *submit) } static int submit_bo(struct msm_gem_submit *submit, uint32_t idx, - struct drm_gem_object **obj, uint64_t *iova, bool *valid) + struct drm_gem_object **obj, uint64_t *iova) { if (idx >= submit->nr_bos) { SUBMIT_ERROR(submit, "invalid buffer index: %u (out of %u)\n", @@ -463,8 +449,6 @@ static int submit_bo(struct msm_gem_submit *submit, uint32_t idx, *obj = submit->bos[idx].obj; if (iova) *iova = submit->bos[idx].iova; - if (valid) - *valid = !!(submit->bos[idx].flags & BO_VALID); return 0; } @@ -477,9 +461,6 @@ static int submit_reloc(struct msm_gem_submit *submit, struct drm_gem_object *ob uint32_t *ptr; int ret = 0; - if (!nr_relocs) - return 0; - if (offset % 4) { SUBMIT_ERROR(submit, "non-aligned cmdstream buffer: %u\n", offset); return -EINVAL; @@ -500,7 +481,6 @@ static int submit_reloc(struct msm_gem_submit *submit, struct drm_gem_object *ob struct drm_msm_gem_submit_reloc submit_reloc = relocs[i]; uint32_t off; uint64_t iova; - bool valid; if (submit_reloc.submit_offset % 4) { SUBMIT_ERROR(submit, "non-aligned reloc offset: %u\n", @@ -519,13 +499,10 @@ static int submit_reloc(struct msm_gem_submit *submit, struct drm_gem_object *ob goto out; } - ret =
[PATCH 0/7] drm/msm/gem: drm_exec conversion
From: Rob Clark Simplify the exec path (removing a legacy optimization) and convert to drm_exec. One drm_exec patch to allow passing in the expected # of GEM objects to avoid re-allocation. I'd be a bit happier if I could avoid the extra objects table allocation in drm_exec in the first place, but wasn't really happy with any of the things I tried to get rid of that. Rob Clark (7): drm/msm/gem: Remove "valid" tracking drm/msm/gem: Remove submit_unlock_unpin_bo() drm/msm/gem: Don't queue job to sched in error cases drm/msm/gem: Split out submit_unpin_objects() helper drm/msm/gem: Cleanup submit_cleanup_bo() drm/exec: Pass in initial # of objects drm/msm/gem: Convert to drm_exec drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c | 4 +- drivers/gpu/drm/drm_exec.c | 15 +- drivers/gpu/drm/msm/Kconfig | 1 + drivers/gpu/drm/msm/msm_gem.h | 13 +- drivers/gpu/drm/msm/msm_gem_submit.c| 197 ++-- drivers/gpu/drm/msm/msm_ringbuffer.c| 3 +- drivers/gpu/drm/nouveau/nouveau_exec.c | 2 +- drivers/gpu/drm/nouveau/nouveau_uvmm.c | 2 +- include/drm/drm_exec.h | 2 +- 12 files changed, 79 insertions(+), 170 deletions(-) -- 2.41.0
Re: [PATCH] drm/panel-edp: Add panel entry for AUO B116XTN02 and BOE NT116WHM-N21,836X2 and NV116WHM-N49 V8.0
Hi, On Thu, Oct 26, 2023 at 8:05 PM Sheng-Liang Pan wrote: > > Add panel identification entry for > - AUO B116XTN02 family (product ID:0x235c) > - BOE NT116WHM-N21,836X2 (product ID:0x09c3) > - BOE NV116WHM-N49 V8.0 (product ID:0x0979) > > Signed-off-by: Sheng-Liang Pan > > --- > > drivers/gpu/drm/panel/panel-edp.c | 3 +++ > 1 file changed, 3 insertions(+) nit: the ${SUBJECT} of this patch is too long. It's OK to go a little over suggested lengths, but this one is probably too long. Other than that: Reviewed-by: Douglas Anderson ...as per my usual policy, I don't let simple changes to this table sit on the list. I fixed the subject myself and pushed to drm-misc-next: 3db2420422a5 drm/panel-edp: Add AUO B116XTN02, BOE NT116WHM-N21,836X2, NV116WHM-N49 V8.0
[PATCH] accel/qaic: Support for 0 resize slice execution in BO
From: Pranjal Ramajor Asha Kanojiya Add support to partially execute a slice which is resized to zero. Executing a zero size slice in a BO should mean that there is no DMA transfers involved but you should still configure doorbell and semaphores. For example consider a BO of size 18K and it is sliced into 3 6K slices and user calls partial execute ioctl with resize as 10K. slice 0 - size is 6k and offset is 0, so resize of 10K will not cut short this slice hence we send the entire slice for execution. slice 1 - size is 6k and offset is 6k, so resize of 10K will cut short this slice and only the first 4k should be DMA along with configuring doorbell and semaphores. slice 2 - size is 6k and offset is 12k, so resize of 10k will cut short this slice and no DMA transfer would be involved but we should would configure doorbell and semaphores. This change begs to change the behavior of 0 resize. Currently, 0 resize partial execute ioctl behaves exactly like execute ioctl i.e. no resize. After this patch all the slice in BO should behave exactly like slice 2 in above example. Refactor copy_partial_exec_reqs() to make it more readable and less complex. Signed-off-by: Pranjal Ramajor Asha Kanojiya Reviewed-by: Jeffrey Hugo Signed-off-by: Jeffrey Hugo --- Userspace specifically requested this change in uAPI. drivers/accel/qaic/qaic_data.c | 104 ++--- include/uapi/drm/qaic_accel.h | 5 +- 2 files changed, 46 insertions(+), 63 deletions(-) diff --git a/drivers/accel/qaic/qaic_data.c b/drivers/accel/qaic/qaic_data.c index ebc3cca1b094..8da81768f2ab 100644 --- a/drivers/accel/qaic/qaic_data.c +++ b/drivers/accel/qaic/qaic_data.c @@ -51,6 +51,7 @@ }) #define NUM_EVENTS 128 #define NUM_DELAYS 10 +#define fifo_at(base, offset) ((base) + (offset) * get_dbc_req_elem_size()) static unsigned int wait_exec_default_timeout_ms = 5000; /* 5 sec default */ module_param(wait_exec_default_timeout_ms, uint, 0600); @@ -1058,6 +1059,16 @@ int qaic_attach_slice_bo_ioctl(struct drm_device *dev, void *data, struct drm_fi return ret; } +static inline u32 fifo_space_avail(u32 head, u32 tail, u32 q_size) +{ + u32 avail = head - tail - 1; + + if (head <= tail) + avail += q_size; + + return avail; +} + static inline int copy_exec_reqs(struct qaic_device *qdev, struct bo_slice *slice, u32 dbc_id, u32 head, u32 *ptail) { @@ -1066,27 +1077,20 @@ static inline int copy_exec_reqs(struct qaic_device *qdev, struct bo_slice *slic u32 tail = *ptail; u32 avail; - avail = head - tail; - if (head <= tail) - avail += dbc->nelem; - - --avail; - + avail = fifo_space_avail(head, tail, dbc->nelem); if (avail < slice->nents) return -EAGAIN; if (tail + slice->nents > dbc->nelem) { avail = dbc->nelem - tail; avail = min_t(u32, avail, slice->nents); - memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, - sizeof(*reqs) * avail); + memcpy(fifo_at(dbc->req_q_base, tail), reqs, sizeof(*reqs) * avail); reqs += avail; avail = slice->nents - avail; if (avail) memcpy(dbc->req_q_base, reqs, sizeof(*reqs) * avail); } else { - memcpy(dbc->req_q_base + tail * get_dbc_req_elem_size(), reqs, - sizeof(*reqs) * slice->nents); + memcpy(fifo_at(dbc->req_q_base, tail), reqs, sizeof(*reqs) * slice->nents); } *ptail = (tail + slice->nents) % dbc->nelem; @@ -1094,46 +1098,31 @@ static inline int copy_exec_reqs(struct qaic_device *qdev, struct bo_slice *slic return 0; } -/* - * Based on the value of resize we may only need to transmit first_n - * entries and the last entry, with last_bytes to send from the last entry. - * Note that first_n could be 0. - */ static inline int copy_partial_exec_reqs(struct qaic_device *qdev, struct bo_slice *slice, -u64 resize, u32 dbc_id, u32 head, u32 *ptail) +u64 resize, struct dma_bridge_chan *dbc, u32 head, +u32 *ptail) { - struct dma_bridge_chan *dbc = >dbc[dbc_id]; struct dbc_req *reqs = slice->reqs; struct dbc_req *last_req; u32 tail = *ptail; - u64 total_bytes; u64 last_bytes; u32 first_n; u32 avail; - int ret; - int i; - - avail = head - tail; - if (head <= tail) - avail += dbc->nelem; - --avail; + avail = fifo_space_avail(head, tail, dbc->nelem); - total_bytes = 0; - for (i = 0; i < slice->nents; i++) { - total_bytes += le32_to_cpu(reqs[i].len); -
Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
On Fri, 27 Oct 2023 10:32:52 -0400 Luben Tuikov wrote: > On 2023-10-27 04:25, Boris Brezillon wrote: > > Hi Danilo, > > > > On Thu, 26 Oct 2023 18:13:00 +0200 > > Danilo Krummrich wrote: > > > >> Currently, job flow control is implemented simply by limiting the number > >> of jobs in flight. Therefore, a scheduler is initialized with a credit > >> limit that corresponds to the number of jobs which can be sent to the > >> hardware. > >> > >> This implies that for each job, drivers need to account for the maximum > >> job size possible in order to not overflow the ring buffer. > >> > >> However, there are drivers, such as Nouveau, where the job size has a > >> rather large range. For such drivers it can easily happen that job > >> submissions not even filling the ring by 1% can block subsequent > >> submissions, which, in the worst case, can lead to the ring run dry. > >> > >> In order to overcome this issue, allow for tracking the actual job size > >> instead of the number of jobs. Therefore, add a field to track a job's > >> credit count, which represents the number of credits a job contributes > >> to the scheduler's credit limit. > >> > >> Signed-off-by: Danilo Krummrich > >> --- > >> Changes in V2: > >> == > >> - fixed up influence on scheduling fairness due to consideration of a > >> job's > >> size > >> - If we reach a ready entity in drm_sched_select_entity() but can't > >> actually > >> queue a job from it due to size limitations, just give up and go to > >> sleep > >> until woken up due to a pending job finishing, rather than continue > >> to try > >> other entities. > >> - added a callback to dynamically update a job's credits (Boris) > > > > This callback seems controversial. I'd suggest dropping it, so the > > patch can be merged. > > Sorry, why is it controversial? (I did read back-and-forth above, but it > wasn't clear > why it is /controversial/.) That's a question for Christian, I guess. I didn't quite get what he was worried about, other than this hook introducing a new way for drivers to screw things up by returning funky/invalid credits (which we can report with WARN_ON()s). But let's be honest, there's probably a hundred different ways (if not more) drivers can shoot themselves in the foot with drm_sched already... > > I believe only drivers are privy to changes in the credit availability as > their > firmware and hardware executes new jobs and finishes others, and so this > "update" > here is essential--leaving it only to prepare_job() wouldn't quite fulfill > the vision > of why the credit mechanism introduced by this patch in the first place. I kinda agree with you, even if I wouldn't so pessimistic as to how useful this patch would be without the ->update_job_credits() hook (it already makes the situation a lot better for Nouveau and probably other drivers too).
Re: [PATCH v2 08/29] drm/dp: Add helpers to calculate the link BW overhead
On Fri, Oct 27, 2023 at 06:48:44PM +0300, Ville Syrjälä wrote: > On Tue, Oct 24, 2023 at 01:22:17PM +0300, Imre Deak wrote: > > Add helpers drivers can use to calculate the BW allocation overhead - > > due to SSC, FEC, DSC and data alignment on symbol cycles - and the > > channel coding efficiency - due to the 8b/10b, 128b/132b encoding. On > > 128b/132b links the FEC overhead is part of the coding efficiency, so > > not accounted for in the BW allocation overhead. > > > > The drivers can use these functions to calculate a ratio, controlling > > the stream symbol insertion rate of the source device in each SST TU > > or MST MTP frame. Drivers can calculate this > > > > m/n = (pixel_data_rate * drm_dp_bw_overhead()) / > > (link_data_rate * drm_dp_bw_channel_coding_efficiency()) > > > > ratio for a given link and pixel stream and with that the > > > > mtp_count = CEIL(64 * m / n) > > > > allocated MTPs for the stream in a link frame and > > > > pbn = CEIL(64 * dm_mst_get_pbn_divider() * m / n) > > > > allocated PBNs for the stream on the MST link path. > > > > Take drm_dp_bw_overhead() into use in drm_dp_calc_pbn_mode(), for > > drivers calculating the PBN value directly. > > > > v2: > > - Add dockbook description to drm_dp_bw_channel_coding_efficiency(). > > (LKP). > > - Clarify the way m/n ratio is calculated in the commit log. > > > > Cc: Lyude Paul > > Cc: kernel test robot > > Cc: dri-devel@lists.freedesktop.org > > Signed-off-by: Imre Deak > > --- > > drivers/gpu/drm/display/drm_dp_helper.c | 124 ++ > > drivers/gpu/drm/display/drm_dp_mst_topology.c | 23 +++- > > include/drm/display/drm_dp_helper.h | 11 ++ > > 3 files changed, 152 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/gpu/drm/display/drm_dp_helper.c > > b/drivers/gpu/drm/display/drm_dp_helper.c > > index e5d7970a9ddd0..79629bf7547bf 100644 > > --- a/drivers/gpu/drm/display/drm_dp_helper.c > > +++ b/drivers/gpu/drm/display/drm_dp_helper.c > > @@ -3899,4 +3899,128 @@ int drm_panel_dp_aux_backlight(struct drm_panel > > *panel, struct drm_dp_aux *aux) > > } > > EXPORT_SYMBOL(drm_panel_dp_aux_backlight); > > > > +/* See DP Standard v2.1 2.6.4.4.1.1, 2.8.4.4, 2.8.7 */ > > +static int drm_dp_link_symbol_cycles(int lane_count, int pixels, int > > bpp_x16, > > +int symbol_size, bool is_mst) > > +{ > > + int cycles = DIV_ROUND_UP(pixels * bpp_x16, 16 * symbol_size * > > lane_count); > > + int align = is_mst ? 4 / lane_count : 1; > > + > > + return ALIGN(cycles, align); > > +} > > + > > +static int drm_dp_link_dsc_symbol_cycles(int lane_count, int pixels, int > > slice_count, > > +int bpp_x16, int symbol_size, bool > > is_mst) > > +{ > > + int slice_pixels = DIV_ROUND_UP(pixels, slice_count); > > + int slice_data_cycles = drm_dp_link_symbol_cycles(lane_count, > > slice_pixels, > > + bpp_x16, symbol_size, > > is_mst); > > + int slice_eoc_cycles = is_mst ? 4 / lane_count : 1; > > + > > + return slice_count * (slice_data_cycles + slice_eoc_cycles); > > +} > > + > > +/** > > + * drm_dp_bw_overhead - Calculate the BW overhead of a DP link stream > > + * @lane_count: DP link lane count > > + * @hactive: pixel count of the active period in one scanline of the stream > > + * @dsc_slice_count: DSC slice count if @flags/DRM_DP_LINK_BW_OVERHEAD_DSC > > is set > > + * @bpp_x16: bits per pixel in .4 binary fixed point > > + * @flags: DRM_DP_OVERHEAD_x flags > > + * > > + * Calculate the BW allocation overhead of a DP link stream, depending > > + * on the link's > > + * - @lane_count > > + * - SST/MST mode (@flags / %DRM_DP_OVERHEAD_MST) > > + * - symbol size (@flags / %DRM_DP_OVERHEAD_UHBR) > > + * - FEC mode (@flags / %DRM_DP_OVERHEAD_FEC) > > + * - SSC mode (@flags / %DRM_DP_OVERHEAD_SSC) > > + * as well as the stream's > > + * - @hactive timing > > + * - @bpp_x16 color depth > > + * - compression mode (@flags / %DRM_DP_OVERHEAD_DSC). > > + * Note that this overhead doesn't account for the 8b/10b, 128b/132b > > + * channel coding efficiency, for that see > > + * @drm_dp_link_bw_channel_coding_efficiency(). > > + * > > + * Returns the overhead as 100% + overhead% in 1ppm units. > > + */ > > +int drm_dp_bw_overhead(int lane_count, int hactive, > > + int dsc_slice_count, > > + int bpp_x16, unsigned long flags) > > +{ > > + int symbol_size = flags & DRM_DP_BW_OVERHEAD_UHBR ? 32 : 8; > > + bool is_mst = flags & DRM_DP_BW_OVERHEAD_MST; > > + u32 overhead = 100; > > + int symbol_cycles; > > + > > + /* > > +* DP Standard v2.1 2.6.4.1 > > +* SSC downspread and ref clock variation margin: > > +* 5300ppm + 300ppm ~ 0.6% > > +*/ > > + if (flags & DRM_DP_BW_OVERHEAD_SSC) > > Maybe the flag name should reflect that it accounts for both > downspread and ref clk difference. Ok, will rename it. >
[RFC PATCH] drm/imx: ipuv3-plane: Allow preventing sequential DMA bursts
Sequential DMA bursts improve NIC/RAM usage thanks to the basic NIC hardware optimizations available when performing in-order sequential accesses. This can be further enforced with the IPU DMA locking mechanism which basically prevents any other IP to access the interconnect for a longer time while performing up to 8 sequential DMA bursts. The drawback is a lower availability for short time periods and delayed accesses which may cause problem with latency-sensible systems (typically, the network might suffer from high drop rates). This is even more visible with larger displays requiring even more RAM bandwidth. Issues have been observed on IMX6Q. The setup featured a 60Hz 1024x768 LVDS display just showing a static picture (thus no CPU usage, only background DMA bringing the picture to the display engine). When performing full speed iperf3 uplink tests with the FEC, almost no drop was observed, whereas the drop would raise above 50% when limiting the bandwidth to 1Mb/s (on a 100Mb/s link). The exact same test with the display pipeline disabled would show little to no drop. The LP-DDR3 chip on the module would allow up to ~53MiB each 1/60th of a second, and the display pipeline consume approximately ~10MiB of this bandwidth, and thus be active 20% of the time on each time slot. One particular feature of the IPU DMA controller (IDMAC) is the ability to serialize DMA bursts and to lock further interconnect accesses when doing so. Experimentally, disabling the locking lead to a drop rate from 50% down to 10%. A few more % could be earned by setting the burst number to 1. It seems this huge difference could be explained by a possible hardware conflict between the locking feature and some QoS logic. Indeed, on IMX6Q, the NIC-301 manages priorities and by default will elect ENET's requests (priority 2) above IPU's requests (priority 0). But the QoS seems to only be valid above a certain threshold, which is: 4 consequent DMA bursts in the case of the IPU. It was indeed observed that tweaking the number of bursts to be lowered from 8 to 4 would lead to a significant increase in the Ethernet transfers stability. IOW, it looks like when the display pipeline performs DMA transfers, incoming DMA requests from other master devices on the interconnect are delayed too much (or canceled). I have no clue to explain why on the Ethernet MAC side some uDMA transfers would never reach completion, especially without notification nor any error. All uplink transfers are properly queued at the FEC level and more importantly, the corresponding interrupts are fired upon "proper transmission" and report no error whatsoever (note: there is no actual way to know the uDMA internal controller could not fetch the data, only MAC errors could be reported at this stage). As a solution, we might want to prevent these DMA bursts from being queued together. Maybe the IMX6Q is primarily used for its graphics capabilities, but when the network (and other RAM consuming subsystem) also matter, it may be relevant to apply this workaround in order to help them fetching from RAM more reliably. Signed-off-by: Miquel Raynal --- Hello, This really is an RFC as the bug was also observed on v6.5 but the fix proposed here was written and tested on a v4.14 kernel. I want to discuss the approach and ideally get some feedback from imx6 experts who know the SoC internals before publishing a clean series. There is a lot of guessing in this workaround, besides the experimental measures I managed to do. I would be glad if someone could sched any light or involve knowledgeable people in this conversation. The initial report was there and mainly focused on the network subsystem: https://lore.kernel.org/netdev/18b72fdb-d24a-a416-ffab-3a15b281a...@katalix.com/T/#md265d6da81b8fb6b85e3adbb399bcda79dfc761c In this thread I made wrong observations because for speeding up my test cycles, I dropped the support for: DRM, SND, USB as these subsystems seemed totally irrelevant. It actually had a strong impact. In the end, I really think there is something wrong with the locking of IPU DMA bursts when mixed with the QoS of the NIC. Any comments welcome! [I know this deserves a DT binding entry, but we are not there yet] Thanks, Miquèl drivers/gpu/drm/imx/ipuv3-plane.c | 27 --- drivers/gpu/drm/imx/ipuv3-plane.h | 1 + 2 files changed, 25 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c index cf98596c7ce1..0492df96ac3e 100644 --- a/drivers/gpu/drm/imx/ipuv3-plane.c +++ b/drivers/gpu/drm/imx/ipuv3-plane.c @@ -496,8 +496,8 @@ static int ipu_chan_assign_axi_id(int ipu_chan) } } -static void ipu_calculate_bursts(u32 width, u32 cpp, u32 stride, -u8 *burstsize, u8 *num_bursts) +static void ipu_calculate_bursts(struct ipu_plane *ipu_plane, u32 width, u32 cpp, +u32 stride, u8 *burstsize, u8 *num_bursts) {
Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
On Fri, 27 Oct 2023 16:23:24 +0200 Danilo Krummrich wrote: > On 10/27/23 10:25, Boris Brezillon wrote: > > Hi Danilo, > > > > On Thu, 26 Oct 2023 18:13:00 +0200 > > Danilo Krummrich wrote: > > > >> Currently, job flow control is implemented simply by limiting the number > >> of jobs in flight. Therefore, a scheduler is initialized with a credit > >> limit that corresponds to the number of jobs which can be sent to the > >> hardware. > >> > >> This implies that for each job, drivers need to account for the maximum > >> job size possible in order to not overflow the ring buffer. > >> > >> However, there are drivers, such as Nouveau, where the job size has a > >> rather large range. For such drivers it can easily happen that job > >> submissions not even filling the ring by 1% can block subsequent > >> submissions, which, in the worst case, can lead to the ring run dry. > >> > >> In order to overcome this issue, allow for tracking the actual job size > >> instead of the number of jobs. Therefore, add a field to track a job's > >> credit count, which represents the number of credits a job contributes > >> to the scheduler's credit limit. > >> > >> Signed-off-by: Danilo Krummrich > >> --- > >> Changes in V2: > >> == > >>- fixed up influence on scheduling fairness due to consideration of a > >> job's > >> size > >> - If we reach a ready entity in drm_sched_select_entity() but can't > >> actually > >>queue a job from it due to size limitations, just give up and go to > >> sleep > >>until woken up due to a pending job finishing, rather than continue > >> to try > >>other entities. > >>- added a callback to dynamically update a job's credits (Boris) > > > > This callback seems controversial. I'd suggest dropping it, so the > > patch can be merged. > > I don't think we should drop it just for the sake of moving forward. If there > are objections > we'll discuss them. I've seen good reasons why the drivers you are working on > require this, > while, following the discussion, I have *not* seen any reasons to drop it. It > helps simplifying > driver code and doesn't introduce any complexity or overhead to existing > drivers. Up to you. I'm just saying, moving one step in the right direction is better than being stuck, and it's not like adding this callback in a follow-up patch is super complicated either. If you're confident that we can get all parties to agree on keeping this hook, fine by me.
Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
On Fri, 27 Oct 2023 16:34:26 +0200 Danilo Krummrich wrote: > On 10/27/23 09:17, Boris Brezillon wrote: > > Hi Danilo, > > > > On Thu, 26 Oct 2023 18:13:00 +0200 > > Danilo Krummrich wrote: > > > >> + > >> + /** > >> + * @update_job_credits: Called once the scheduler is considering this > >> + * job for execution. > >> + * > >> + * Drivers may use this to update the job's submission credits, which is > >> + * useful to e.g. deduct the number of native fences which have been > >> + * signaled meanwhile. > >> + * > >> + * The callback must either return the new number of submission credits > >> + * for the given job, or zero if no update is required. > >> + * > >> + * This callback is optional. > >> + */ > >> + u32 (*update_job_credits)(struct drm_sched_job *sched_job); > > > > I'm copying my late reply to v2 here so it doesn't get lost: > > > > I keep thinking it'd be simpler to make this a void function that > > updates s_job->submission_credits directly. I also don't see the > > problem with doing a sanity check on job->submission_credits. I mean, > > if the driver is doing something silly, you can't do much to prevent it > > anyway, except warn the user that something wrong has happened. If you > > want to > > > > WARN_ON(job->submission_credits == 0 || > > job->submission_credits > job_old_submission_credits); > > > > that's fine. But none of this sanity checking has to do with the > > function prototype/semantics, and I'm still not comfortable with this 0 > > => no-change. If there's no change, we should just leave > > job->submission_credits unchanged (or return job->submission_credits) > > instead of inventing a new special case. > > If we can avoid letting drivers change fields of generic structures directly > without any drawbacks I think we should avoid it. Currently, drivers shouldn't > have the need to mess with job->credits directly. The initial value is set > through drm_sched_job_init() and is updated through the return value of > update_job_credits(). Fair enough. I do agree that keeping internal fields out of driver hands is a good thing in general, it's just that it's already free-for-all in so many places in drm_sched (like the fact drivers iterate the pending list in their stop-queue handling) that I didn't really see it as an issue. Note that's there's always the option of providing drm_sched_job_{update,get}_credits() helpers, with the update helper making sure the new credits value is consistent (smaller or equal to the old one, and not zero). > > I'm fine getting rid of the 0 => no-change semantics though. Instead we can > just > WARN() on 0. Yeah, I think that's preferable. It's pretty easy to return the old value if the driver has a way to detect when nothing changed (with a get helper if you don't want drivers to touch the credits field). > However, if we do that I'd also want to change it for > drm_sched_job_init() (where 0 currently defaults to 1) such that we accept 0, > but > WARN() accordingly. Sure. You update all drivers anyway, so passing 1 instead of 0 is not a big deal, I would say. > > I think it's consequent to either consistently give 0 a different meaning or > just > accept it but WARN() on it. Using default as a default value makes sense when you're passing zero-initialized objects that are later extended with new fields, but here you update the function prototype and all the call sites, so we're better off considering 0 as an invalid value, IMHO.
Re: [PATCH v2 0/2] accel/qaic: Add support for host/device timesync
On 10/16/2023 11:01 AM, Jeffrey Hugo wrote: AIC100 supports a timesync mechanism that allows AIC100 to timestamp device logs with a host based time. This becomes useful for putting host logs in a unified timeline with device logs for debugging and performance profiling. The mechanism consists of a boot-time initialization and a runtime perodic resync to counteract the effects of time source drift over time between the host and device. v2: - Fix readq usage on 32-bit powerpc - Fix doc warning due to trailing "_" Ajit Pal Singh (1): accel/qaic: Add support for periodic timesync Pranjal Ramajor Asha Kanojiya (1): accel/qaic: Support MHI QAIC_TIMESYNC channel Documentation/accel/qaic/aic100.rst | 6 +- Documentation/accel/qaic/qaic.rst | 5 + drivers/accel/qaic/Makefile | 3 +- drivers/accel/qaic/mhi_controller.c | 36 ++- drivers/accel/qaic/qaic.h | 4 + drivers/accel/qaic/qaic_drv.c | 17 ++ drivers/accel/qaic/qaic_timesync.c | 395 drivers/accel/qaic/qaic_timesync.h | 11 + 8 files changed, 473 insertions(+), 4 deletions(-) create mode 100644 drivers/accel/qaic/qaic_timesync.c create mode 100644 drivers/accel/qaic/qaic_timesync.h Pushed to drm-misc-next
Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay
On 10/27/23 11:55, Lakha, Bhawanpreet wrote: [AMD Official Use Only - General] There was a consensus to use memset instead of {0}. I remember making changes related to that previously. Hm, seems like it's used rather consistently in the DM and in DC though. Bhawan *From:* Mahfooz, Hamza *Sent:* October 27, 2023 11:53 AM *To:* Yuran Pereira ; airl...@gmail.com *Cc:* Li, Sun peng (Leo) ; Lakha, Bhawanpreet ; Pan, Xinhui ; Siqueira, Rodrigo ; linux-ker...@vger.kernel.org ; amd-...@lists.freedesktop.org ; dri-devel@lists.freedesktop.org ; Deucher, Alexander ; Koenig, Christian ; linux-kernel-ment...@lists.linuxfoundation.org *Subject:* Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay On 10/26/23 17:25, Yuran Pereira wrote: Since `pr_config` is not initialized after its declaration, the following operations with `replay_enable_option` may be performed when `replay_enable_option` is holding junk values which could possibly lead to undefined behaviour ``` ... pr_config.replay_enable_option |= pr_enable_option_static_screen; ... if (!pr_config.replay_timing_sync_supported) pr_config.replay_enable_option &= ~pr_enable_option_general_ui; ... ``` This patch initializes `pr_config` after its declaration to ensure that it doesn't contain junk data, and prevent any undefined behaviour Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable") Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code") Signed-off-by: Yuran Pereira --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c index 32d3086c4cb7..40526507f50b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c @@ -23,6 +23,7 @@ * */ +#include #include "amdgpu_dm_replay.h" #include "dc.h" #include "dm_helpers.h" @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct amdgpu_dm_connector *ac struct replay_config pr_config; I would prefer setting pr_config = {0}; union replay_debug_flags *debug_flags = NULL; + memset(_config, 0, sizeof(pr_config)); + // For eDP, if Replay is supported, return true to skip checks if (link->replay_settings.config.replay_supported) return true; -- Hamza -- Hamza
Re: [PATCH v2 1/2] accel/qaic: Add support for periodic timesync
On 10/22/2023 5:06 AM, Stanislaw Gruszka wrote: On Mon, Oct 16, 2023 at 11:01:13AM -0600, Jeffrey Hugo wrote: From: Ajit Pal Singh Device and Host have a time synchronization mechanism that happens once during boot when device is in SBL mode. After that, in mission-mode there is no timesync. In an experiment after continuous operation, device time drifted w.r.t. host by approximately 3 seconds per day. This drift leads to mismatch in timestamp of device and Host logs. To correct this implement periodic timesync in driver. This timesync is carried out via QAIC_TIMESYNC_PERIODIC MHI channel. Signed-off-by: Ajit Pal Singh Signed-off-by: Pranjal Ramajor Asha Kanojiya Reviewed-by: Jeffrey Hugo Reviewed-by: Carl Vanderlip Reviewed-by: Pranjal Ramajor Asha Kanojiya Signed-off-by: Jeffrey Hugo Reviewed-by: Stanislaw Gruszka @@ -586,8 +587,16 @@ static int __init qaic_init(void) goto free_pci; } + ret = qaic_timesync_init(); + if (ret) { + pr_debug("qaic: qaic_timesync_init failed %d\n", ret); + goto free_mhi; I would print at error level here. Or if timesync is optional do not error exit. Good point. timesync is optional so will not do error exit. +#ifdef readq +static u64 read_qtimer(const volatile void __iomem *addr) +{ + return readq(addr); +} +#else +static u64 read_qtimer(const volatile void __iomem *addr) +{ + u64 low, high; + + low = readl(addr); + high = readl(addr + sizeof(u32)); + return low | (high << 32); +} If that's only for compile on 32-bit PowerPC, I think would be better to limit supported architectures on Kconfig. The issue was flagged on 32-bit PowerPC, but I concluded from a code review that the issue exists on any 32-bit architecture. Given that this is an add-on card I'd prefer to support as many architectures as possible. -Jeff
Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay
[AMD Official Use Only - General] There was a consensus to use memset instead of {0}. I remember making changes related to that previously. Bhawan From: Mahfooz, Hamza Sent: October 27, 2023 11:53 AM To: Yuran Pereira ; airl...@gmail.com Cc: Li, Sun peng (Leo) ; Lakha, Bhawanpreet ; Pan, Xinhui ; Siqueira, Rodrigo ; linux-ker...@vger.kernel.org ; amd-...@lists.freedesktop.org ; dri-devel@lists.freedesktop.org ; Deucher, Alexander ; Koenig, Christian ; linux-kernel-ment...@lists.linuxfoundation.org Subject: Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay On 10/26/23 17:25, Yuran Pereira wrote: > Since `pr_config` is not initialized after its declaration, the > following operations with `replay_enable_option` may be performed > when `replay_enable_option` is holding junk values which could > possibly lead to undefined behaviour > > ``` > ... > pr_config.replay_enable_option |= pr_enable_option_static_screen; > ... > > if (!pr_config.replay_timing_sync_supported) > pr_config.replay_enable_option &= ~pr_enable_option_general_ui; > ... > ``` > > This patch initializes `pr_config` after its declaration to ensure that > it doesn't contain junk data, and prevent any undefined behaviour > > Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable") > Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code") > Signed-off-by: Yuran Pereira > --- > drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c > index 32d3086c4cb7..40526507f50b 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c > @@ -23,6 +23,7 @@ >* >*/ > > +#include > #include "amdgpu_dm_replay.h" > #include "dc.h" > #include "dm_helpers.h" > @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct > amdgpu_dm_connector *ac >struct replay_config pr_config; I would prefer setting pr_config = {0}; >union replay_debug_flags *debug_flags = NULL; > > + memset(_config, 0, sizeof(pr_config)); > + >// For eDP, if Replay is supported, return true to skip checks >if (link->replay_settings.config.replay_supported) >return true; -- Hamza
Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay
Also, please write the tagline in present tense. On 10/27/23 11:53, Hamza Mahfooz wrote: On 10/26/23 17:25, Yuran Pereira wrote: Since `pr_config` is not initialized after its declaration, the following operations with `replay_enable_option` may be performed when `replay_enable_option` is holding junk values which could possibly lead to undefined behaviour ``` ... pr_config.replay_enable_option |= pr_enable_option_static_screen; ... if (!pr_config.replay_timing_sync_supported) pr_config.replay_enable_option &= ~pr_enable_option_general_ui; ... ``` This patch initializes `pr_config` after its declaration to ensure that it doesn't contain junk data, and prevent any undefined behaviour Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable") Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code") Signed-off-by: Yuran Pereira --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c index 32d3086c4cb7..40526507f50b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c @@ -23,6 +23,7 @@ * */ +#include #include "amdgpu_dm_replay.h" #include "dc.h" #include "dm_helpers.h" @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct amdgpu_dm_connector *ac struct replay_config pr_config; I would prefer setting pr_config = {0}; union replay_debug_flags *debug_flags = NULL; + memset(_config, 0, sizeof(pr_config)); + // For eDP, if Replay is supported, return true to skip checks if (link->replay_settings.config.replay_supported) return true; -- Hamza
Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay
On 10/26/23 17:25, Yuran Pereira wrote: Since `pr_config` is not initialized after its declaration, the following operations with `replay_enable_option` may be performed when `replay_enable_option` is holding junk values which could possibly lead to undefined behaviour ``` ... pr_config.replay_enable_option |= pr_enable_option_static_screen; ... if (!pr_config.replay_timing_sync_supported) pr_config.replay_enable_option &= ~pr_enable_option_general_ui; ... ``` This patch initializes `pr_config` after its declaration to ensure that it doesn't contain junk data, and prevent any undefined behaviour Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable") Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code") Signed-off-by: Yuran Pereira --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c index 32d3086c4cb7..40526507f50b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c @@ -23,6 +23,7 @@ * */ +#include #include "amdgpu_dm_replay.h" #include "dc.h" #include "dm_helpers.h" @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct amdgpu_dm_connector *ac struct replay_config pr_config; I would prefer setting pr_config = {0}; union replay_debug_flags *debug_flags = NULL; + memset(_config, 0, sizeof(pr_config)); + // For eDP, if Replay is supported, return true to skip checks if (link->replay_settings.config.replay_supported) return true; -- Hamza
Re: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay
[AMD Official Use Only - General] Thanks, Reviewed-by: Bhawanpreet Lakha From: Yuran Pereira Sent: October 26, 2023 5:25 PM To: airl...@gmail.com Cc: Yuran Pereira ; Wentland, Harry ; Li, Sun peng (Leo) ; Siqueira, Rodrigo ; Deucher, Alexander ; Koenig, Christian ; Pan, Xinhui ; dan...@ffwll.ch ; Lakha, Bhawanpreet ; amd-...@lists.freedesktop.org ; dri-devel@lists.freedesktop.org ; linux-ker...@vger.kernel.org ; linux-kernel-ment...@lists.linuxfoundation.org Subject: [PATCH] drm/amdgpu: Fixes uninitialized variable usage in amdgpu_dm_setup_replay Since `pr_config` is not initialized after its declaration, the following operations with `replay_enable_option` may be performed when `replay_enable_option` is holding junk values which could possibly lead to undefined behaviour ``` ... pr_config.replay_enable_option |= pr_enable_option_static_screen; ... if (!pr_config.replay_timing_sync_supported) pr_config.replay_enable_option &= ~pr_enable_option_general_ui; ... ``` This patch initializes `pr_config` after its declaration to ensure that it doesn't contain junk data, and prevent any undefined behaviour Addresses-Coverity-ID: 1544428 ("Uninitialized scalar variable") Fixes: dede1fea4460 ("drm/amd/display: Add Freesync Panel DM code") Signed-off-by: Yuran Pereira --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c index 32d3086c4cb7..40526507f50b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_replay.c @@ -23,6 +23,7 @@ * */ +#include #include "amdgpu_dm_replay.h" #include "dc.h" #include "dm_helpers.h" @@ -74,6 +75,8 @@ bool amdgpu_dm_setup_replay(struct dc_link *link, struct amdgpu_dm_connector *ac struct replay_config pr_config; union replay_debug_flags *debug_flags = NULL; + memset(_config, 0, sizeof(pr_config)); + // For eDP, if Replay is supported, return true to skip checks if (link->replay_settings.config.replay_supported) return true; -- 2.25.1
Re: [PATCH v2 08/29] drm/dp: Add helpers to calculate the link BW overhead
On Tue, Oct 24, 2023 at 01:22:17PM +0300, Imre Deak wrote: > Add helpers drivers can use to calculate the BW allocation overhead - > due to SSC, FEC, DSC and data alignment on symbol cycles - and the > channel coding efficiency - due to the 8b/10b, 128b/132b encoding. On > 128b/132b links the FEC overhead is part of the coding efficiency, so > not accounted for in the BW allocation overhead. > > The drivers can use these functions to calculate a ratio, controlling > the stream symbol insertion rate of the source device in each SST TU > or MST MTP frame. Drivers can calculate this > > m/n = (pixel_data_rate * drm_dp_bw_overhead()) / > (link_data_rate * drm_dp_bw_channel_coding_efficiency()) > > ratio for a given link and pixel stream and with that the > > mtp_count = CEIL(64 * m / n) > > allocated MTPs for the stream in a link frame and > > pbn = CEIL(64 * dm_mst_get_pbn_divider() * m / n) > > allocated PBNs for the stream on the MST link path. > > Take drm_dp_bw_overhead() into use in drm_dp_calc_pbn_mode(), for > drivers calculating the PBN value directly. > > v2: > - Add dockbook description to drm_dp_bw_channel_coding_efficiency(). > (LKP). > - Clarify the way m/n ratio is calculated in the commit log. > > Cc: Lyude Paul > Cc: kernel test robot > Cc: dri-devel@lists.freedesktop.org > Signed-off-by: Imre Deak > --- > drivers/gpu/drm/display/drm_dp_helper.c | 124 ++ > drivers/gpu/drm/display/drm_dp_mst_topology.c | 23 +++- > include/drm/display/drm_dp_helper.h | 11 ++ > 3 files changed, 152 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/display/drm_dp_helper.c > b/drivers/gpu/drm/display/drm_dp_helper.c > index e5d7970a9ddd0..79629bf7547bf 100644 > --- a/drivers/gpu/drm/display/drm_dp_helper.c > +++ b/drivers/gpu/drm/display/drm_dp_helper.c > @@ -3899,4 +3899,128 @@ int drm_panel_dp_aux_backlight(struct drm_panel > *panel, struct drm_dp_aux *aux) > } > EXPORT_SYMBOL(drm_panel_dp_aux_backlight); > > +/* See DP Standard v2.1 2.6.4.4.1.1, 2.8.4.4, 2.8.7 */ > +static int drm_dp_link_symbol_cycles(int lane_count, int pixels, int bpp_x16, > + int symbol_size, bool is_mst) > +{ > + int cycles = DIV_ROUND_UP(pixels * bpp_x16, 16 * symbol_size * > lane_count); > + int align = is_mst ? 4 / lane_count : 1; > + > + return ALIGN(cycles, align); > +} > + > +static int drm_dp_link_dsc_symbol_cycles(int lane_count, int pixels, int > slice_count, > + int bpp_x16, int symbol_size, bool > is_mst) > +{ > + int slice_pixels = DIV_ROUND_UP(pixels, slice_count); > + int slice_data_cycles = drm_dp_link_symbol_cycles(lane_count, > slice_pixels, > + bpp_x16, symbol_size, > is_mst); > + int slice_eoc_cycles = is_mst ? 4 / lane_count : 1; > + > + return slice_count * (slice_data_cycles + slice_eoc_cycles); > +} > + > +/** > + * drm_dp_bw_overhead - Calculate the BW overhead of a DP link stream > + * @lane_count: DP link lane count > + * @hactive: pixel count of the active period in one scanline of the stream > + * @dsc_slice_count: DSC slice count if @flags/DRM_DP_LINK_BW_OVERHEAD_DSC > is set > + * @bpp_x16: bits per pixel in .4 binary fixed point > + * @flags: DRM_DP_OVERHEAD_x flags > + * > + * Calculate the BW allocation overhead of a DP link stream, depending > + * on the link's > + * - @lane_count > + * - SST/MST mode (@flags / %DRM_DP_OVERHEAD_MST) > + * - symbol size (@flags / %DRM_DP_OVERHEAD_UHBR) > + * - FEC mode (@flags / %DRM_DP_OVERHEAD_FEC) > + * - SSC mode (@flags / %DRM_DP_OVERHEAD_SSC) > + * as well as the stream's > + * - @hactive timing > + * - @bpp_x16 color depth > + * - compression mode (@flags / %DRM_DP_OVERHEAD_DSC). > + * Note that this overhead doesn't account for the 8b/10b, 128b/132b > + * channel coding efficiency, for that see > + * @drm_dp_link_bw_channel_coding_efficiency(). > + * > + * Returns the overhead as 100% + overhead% in 1ppm units. > + */ > +int drm_dp_bw_overhead(int lane_count, int hactive, > +int dsc_slice_count, > +int bpp_x16, unsigned long flags) > +{ > + int symbol_size = flags & DRM_DP_BW_OVERHEAD_UHBR ? 32 : 8; > + bool is_mst = flags & DRM_DP_BW_OVERHEAD_MST; > + u32 overhead = 100; > + int symbol_cycles; > + > + /* > + * DP Standard v2.1 2.6.4.1 > + * SSC downspread and ref clock variation margin: > + * 5300ppm + 300ppm ~ 0.6% > + */ > + if (flags & DRM_DP_BW_OVERHEAD_SSC) Maybe the flag name should reflect that it accounts for both downspread and ref clk difference. > + overhead += 6000; > + > + /* > + * DP Standard v2.1 2.6.4.1.1: > + * FEC symbol insertions for 8b/10b channel coding: > + * 2.4% > + */ > + if (flags & DRM_DP_BW_OVERHEAD_FEC) > + overhead += 24000; > + > + /* > + * DP
[PATCH libdrm v4 1/9] intel: determine target endianness using meson
The endianness of the target is currently determined based on preprocessor symbols. Unfortunately some symbols checked are wrong (sparc64-linux-gnu-gcc does not define __BIG_ENDIAN__ or SPARC), and several checks for big-endian architectures are missing. Fix this by introducing a new preprocessor symbol HAVE_BIG_ENDIAN, which is set based on meson's knowledge of the target endianness. Android.common.mk does not need an update, as Android is always little-endian (https://developer.android.com/ndk/guides/abis.html). Signed-off-by: Geert Uytterhoeven --- v4: - Replace explicit #ifdef checks by a define set by meson, v3: - No changes, v2: - Add arm, aarch64, microblaze, s390, and sh. --- intel/uthash.h | 4 ++-- meson.build| 5 + 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/intel/uthash.h b/intel/uthash.h index 45d1f9fc12a1d6f9..62e1650804841638 100644 --- a/intel/uthash.h +++ b/intel/uthash.h @@ -648,11 +648,11 @@ do { #define MUR_PLUS2_ALIGNED(p) (((unsigned long)p & 3UL) == 2UL) #define MUR_PLUS3_ALIGNED(p) (((unsigned long)p & 3UL) == 3UL) #define WP(p) ((uint32_t*)((unsigned long)(p) & ~3UL)) -#if (defined(__BIG_ENDIAN__) || defined(SPARC) || defined(__ppc__) || defined(__ppc64__)) +#ifdef HAVE_BIG_ENDIAN #define MUR_THREE_ONE(p) *WP(p))&0x00ff) << 8) | (((*(WP(p)+1))&0xff00) >> 24)) #define MUR_TWO_TWO(p) *WP(p))&0x) <<16) | (((*(WP(p)+1))&0x) >> 16)) #define MUR_ONE_THREE(p) *WP(p))&0x00ff) <<24) | (((*(WP(p)+1))&0xff00) >> 8)) -#else /* assume little endian non-intel */ +#else /* little endian non-intel */ #define MUR_THREE_ONE(p) *WP(p))&0xff00) >> 8) | (((*(WP(p)+1))&0x00ff) << 24)) #define MUR_TWO_TWO(p) *WP(p))&0x) >>16) | (((*(WP(p)+1))&0x) << 16)) #define MUR_ONE_THREE(p) *WP(p))&0xff00) >>24) | (((*(WP(p)+1))&0x00ff) << 8)) diff --git a/meson.build b/meson.build index e203965d82eb620b..d2fdf7929f363f84 100644 --- a/meson.build +++ b/meson.build @@ -226,6 +226,11 @@ if with_freedreno_kgsl and not with_freedreno error('cannot enable freedreno-kgsl without freedreno support') endif config.set10('_GNU_SOURCE', true) + +if target_machine.endian() == 'big' + config.set('HAVE_BIG_ENDIAN', 1) +endif + config_file = configure_file( configuration : config, output : 'config.h', -- 2.34.1
[PATCH libdrm v4 4/9] util: add missing big-endian RGB16 frame buffer formats
Signed-off-by: Geert Uytterhoeven --- v4: - No changes, v3: - Update for suffix change from "be" to "_BE", cfr. commit ffb9375a505700ad ("xf86drm: handle DRM_FORMAT_BIG_ENDIAN in drmGetFormatName()"), v2: - New. --- tests/util/format.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tests/util/format.c b/tests/util/format.c index b99cc9c3599d9237..376a32fe4b485238 100644 --- a/tests/util/format.c +++ b/tests/util/format.c @@ -78,6 +78,9 @@ static const struct util_format_info format_info[] = { { DRM_FORMAT_BGRX5551, "BX15", MAKE_RGB_INFO(5, 1, 5, 6, 5, 11, 0, 0) }, { DRM_FORMAT_RGB565, "RG16", MAKE_RGB_INFO(5, 11, 6, 5, 5, 0, 0, 0) }, { DRM_FORMAT_BGR565, "BG16", MAKE_RGB_INFO(5, 0, 6, 5, 5, 11, 0, 0) }, + /* Big-endian RGB16 */ + { DRM_FORMAT_XRGB1555 | DRM_FORMAT_BIG_ENDIAN, "XR15_BE", MAKE_RGB_INFO(5, 10, 5, 5, 5, 0, 0, 0) }, + { DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN, "RG16_BE", MAKE_RGB_INFO(5, 11, 6, 5, 5, 0, 0, 0) }, /* RGB24 */ { DRM_FORMAT_BGR888, "BG24", MAKE_RGB_INFO(8, 0, 8, 8, 8, 16, 0, 0) }, { DRM_FORMAT_RGB888, "RG24", MAKE_RGB_INFO(8, 16, 8, 8, 8, 0, 0, 0) }, -- 2.34.1
[PATCH libdrm v4 0/9] Big-endian fixes
Hi all, This patch series fixes some endianness issues in libdrm. It has been tested on ARAnyM using a work-in-progress Atari DRM driver. After this, the smpte and tiles modetest patterns and the pwetty markers are rendered correctly using the XR24, RG16, and RG16BE formats on big-endian systems. Changes compared to v3[1]: - Replace explicit #ifdef checks by a define set by meson, - Use new HAVE_BIG_ENDIAN symbol, Changes compared to v2[2]: - Increase indentation after definition of cpu_to_*() macros, - Update for suffix change from "be" to "_BE", cfr. commit ffb9375a505700ad ("xf86drm: handle DRM_FORMAT_BIG_ENDIAN in drmGetFormatName()"), - Replace hardcoded numbers in code by sizeof(), - Wrap byteswap_buffer{16,32}() implementation inside #if HAVE_CAIRO to avoid defined-but-not-used compiler warnings, - Drop "modetest: Fix printing of big-endian fourcc values", as it is no longer needed since commit ffb9375a505700ad ("xf86drm: handle DRM_FORMAT_BIG_ENDIAN in drmGetFormatName()"). Changes compared to v1[3]: - Consider arm, aarch64, microblaze, s390, and sh in endianness checks, - Add Acked-by, - Add swap32() intermediate helper, - Fix 16 bpp formats on big-endian, - Add support for big-endian XRGB1555 and RGB565, - Fix printing of big-endian fourcc values, - Fix pwetty on big-endian. I have also updated the merge request at [4]. Thanks for your comments! [1] "[PATCH libdrm v3 0/9] Big-endian fixes https://lore.kernel.org/r/cover.1698217235.git.ge...@linux-m68k.org [2] "[PATCH libdrm v2 00/10] Big-endian fixes" https://lore.kernel.org/r/cover.1657302103.git.ge...@linux-m68k.org/#t [3] "[PATCH RFC libdrm 0/2] Big-endian fixes" https://lore.kernel.org/r/cover.1646684158.git.ge...@linux-m68k.org [4] https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/331 Geert Uytterhoeven (9): intel: determine target endianness using meson util: fix 32 bpp patterns on big-endian util: fix 16 bpp patterns on big-endian util: add missing big-endian RGB16 frame buffer formats modetest: add support for parsing big-endian formats util: add test pattern support for big-endian XRGB1555/RGB565 util: fix pwetty on big-endian util: add pwetty support for big-endian RGB565 modetest: add support for big-endian XRGB1555/RGB565 intel/uthash.h| 4 +- meson.build | 5 ++ tests/modetest/buffers.c | 4 ++ tests/modetest/modetest.c | 15 +++-- tests/util/format.c | 3 + tests/util/pattern.c | 117 +++--- 6 files changed, 120 insertions(+), 28 deletions(-) -- 2.34.1
[PATCH libdrm v4 3/9] util: fix 16 bpp patterns on big-endian
DRM formats are defined to be little-endian, unless the DRM_FORMAT_BIG_ENDIAN flag is set. Hence writes of multi-byte pixel values need to take endianness into account. Introduce a swap16() helper to byteswap 16-bit values, and a cpu_to_le16() helper to convert 16-bit values from CPU-endian to little-endian, and use the latter in the various pattern fill functions for 16-bit formats. Signed-off-by: Geert Uytterhoeven --- v4: - No changes, v3: - Increase indentation after definition of cpu_to_le16(), v2: - New. --- tests/util/pattern.c | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/tests/util/pattern.c b/tests/util/pattern.c index c79cad2c6a23993f..1927377d0027c6fd 100644 --- a/tests/util/pattern.c +++ b/tests/util/pattern.c @@ -61,6 +61,11 @@ struct color_yuv { .u = MAKE_YUV_601_U(r, g, b), \ .v = MAKE_YUV_601_V(r, g, b) } +static inline uint16_t swap16(uint16_t x) +{ + return ((x & 0x00ffU) << 8) | ((x & 0xff00U) >> 8); +} + static inline uint32_t swap32(uint32_t x) { return ((x & 0x00ffU) << 24) | @@ -70,8 +75,10 @@ static inline uint32_t swap32(uint32_t x) } #ifdef HAVE_BIG_ENDIAN +#define cpu_to_le16(x) swap16(x) #define cpu_to_le32(x) swap32(x) #else +#define cpu_to_le16(x) (x) #define cpu_to_le32(x) (x) #endif @@ -410,26 +417,26 @@ static void fill_smpte_rgb16(const struct util_rgb_info *rgb, void *mem, for (y = 0; y < height * 6 / 9; ++y) { for (x = 0; x < width; ++x) - ((uint16_t *)mem)[x] = colors_top[x * 7 / width]; + ((uint16_t *)mem)[x] = cpu_to_le16(colors_top[x * 7 / width]); mem += stride; } for (; y < height * 7 / 9; ++y) { for (x = 0; x < width; ++x) - ((uint16_t *)mem)[x] = colors_middle[x * 7 / width]; + ((uint16_t *)mem)[x] = cpu_to_le16(colors_middle[x * 7 / width]); mem += stride; } for (; y < height; ++y) { for (x = 0; x < width * 5 / 7; ++x) ((uint16_t *)mem)[x] = - colors_bottom[x * 4 / (width * 5 / 7)]; + cpu_to_le16(colors_bottom[x * 4 / (width * 5 / 7)]); for (; x < width * 6 / 7; ++x) ((uint16_t *)mem)[x] = - colors_bottom[(x - width * 5 / 7) * 3 - / (width / 7) + 4]; + cpu_to_le16(colors_bottom[(x - width * 5 / 7) * 3 + / (width / 7) + 4]); for (; x < width; ++x) - ((uint16_t *)mem)[x] = colors_bottom[7]; + ((uint16_t *)mem)[x] = cpu_to_le16(colors_bottom[7]); mem += stride; } } @@ -1280,7 +1287,7 @@ static void fill_tiles_rgb16(const struct util_format_info *info, void *mem, (rgb32 >> 8) & 0xff, rgb32 & 0xff, 255); - ((uint16_t *)mem)[x] = color; + ((uint16_t *)mem)[x] = cpu_to_le16(color); } mem += stride; } -- 2.34.1
[PATCH libdrm v4 8/9] util: add pwetty support for big-endian RGB565
Add support for rendering the crosshairs in a buffer using the big-endian RGB565 format. Signed-off-by: Geert Uytterhoeven --- v4: - No changes, v3: - No changes, v2: - New. --- tests/util/pattern.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/util/pattern.c b/tests/util/pattern.c index 769da5814d1f689e..34da1c4d4f8d5ff0 100644 --- a/tests/util/pattern.c +++ b/tests/util/pattern.c @@ -1188,6 +1188,7 @@ static void make_pwetty(void *data, unsigned int width, unsigned int height, cairo_format = CAIRO_FORMAT_ARGB32; break; case DRM_FORMAT_RGB565: + case DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN: case DRM_FORMAT_BGR565: cairo_format = CAIRO_FORMAT_RGB16_565; swap16 = fb_foreign_endian(format); -- 2.34.1
[PATCH libdrm v4 7/9] util: fix pwetty on big-endian
Cairo always uses native byte order for rendering. Hence if the byte order of the frame buffer differs from the byte order of the CPU, the frame buffer contents need to be byteswapped twice: once before rendering, to convert to native byte order, and a second time after rendering, to restore the frame buffer format's byte order. Note that byte swapping is not done for ARGB32 formats, as for these formats, byte order only affects the order of the red, green, and blue channels, which we do not care about here. Signed-off-by: Geert Uytterhoeven --- v4: - No changes, v3: - Wrap byteswap_buffer{16,32}() implementation inside #if HAVE_CAIRO to avoid defined-but-not-used compiler warnings, v2: - RGB30 is untested. --- tests/util/pattern.c | 44 1 file changed, 44 insertions(+) diff --git a/tests/util/pattern.c b/tests/util/pattern.c index ef589124f278e837..769da5814d1f689e 100644 --- a/tests/util/pattern.c +++ b/tests/util/pattern.c @@ -79,10 +79,12 @@ static inline uint32_t swap32(uint32_t x) #define cpu_to_be16(x) (x) #define cpu_to_le16(x) swap16(x) #define cpu_to_le32(x) swap32(x) +#define fb_foreign_endian(format) (!((format) & DRM_FORMAT_BIG_ENDIAN)) #else #define cpu_to_be16(x) swap16(x) #define cpu_to_le16(x) (x) #define cpu_to_le32(x) (x) +#define fb_foreign_endian(format) ((format) & DRM_FORMAT_BIG_ENDIAN) #endif #define cpu_to_fb16(x) (fb_be ? cpu_to_be16(x) : cpu_to_le16(x)) @@ -1141,6 +1143,32 @@ static void fill_smpte(const struct util_format_info *info, void *planes[3], } } +#if HAVE_CAIRO +static void byteswap_buffer16(void *mem, unsigned int width, unsigned int height, + unsigned int stride) +{ + unsigned int x, y; + + for (y = 0; y < height; ++y) { + for (x = 0; x < width; ++x) + ((uint16_t *)mem)[x] = swap16(((uint16_t *)mem)[x]); + mem += stride; + } +} + +static void byteswap_buffer32(void *mem, unsigned int width, unsigned int height, + unsigned int stride) +{ + unsigned int x, y; + + for (y = 0; y < height; ++y) { + for (x = 0; x < width; ++x) + ((uint32_t *)mem)[x] = swap32(((uint32_t *)mem)[x]); + mem += stride; + } +} +#endif + static void make_pwetty(void *data, unsigned int width, unsigned int height, unsigned int stride, uint32_t format) { @@ -1148,6 +1176,8 @@ static void make_pwetty(void *data, unsigned int width, unsigned int height, cairo_surface_t *surface; cairo_t *cr; cairo_format_t cairo_format; + bool swap16 = false; + bool swap32 = false; /* we can ignore the order of R,G,B channels */ switch (format) { @@ -1160,6 +1190,7 @@ static void make_pwetty(void *data, unsigned int width, unsigned int height, case DRM_FORMAT_RGB565: case DRM_FORMAT_BGR565: cairo_format = CAIRO_FORMAT_RGB16_565; + swap16 = fb_foreign_endian(format); break; #if CAIRO_VERSION_MAJOR > 1 || (CAIRO_VERSION_MAJOR == 1 && CAIRO_VERSION_MINOR >= 12) case DRM_FORMAT_ARGB2101010: @@ -1167,12 +1198,19 @@ static void make_pwetty(void *data, unsigned int width, unsigned int height, case DRM_FORMAT_ABGR2101010: case DRM_FORMAT_XBGR2101010: cairo_format = CAIRO_FORMAT_RGB30; + swap32 = fb_foreign_endian(format); break; #endif default: return; } + /* Cairo uses native byte order, so we may have to byteswap before... */ + if (swap16) + byteswap_buffer16(data, width, height, stride); + if (swap32) + byteswap_buffer32(data, width, height, stride); + surface = cairo_image_surface_create_for_data(data, cairo_format, width, height, @@ -1208,6 +1246,12 @@ static void make_pwetty(void *data, unsigned int width, unsigned int height, } cairo_destroy(cr); + + /* ... and after */ + if (swap16) + byteswap_buffer16(data, width, height, stride); + if (swap32) + byteswap_buffer32(data, width, height, stride); #endif } -- 2.34.1
[PATCH libdrm v4 9/9] modetest: add support for big-endian XRGB1555/RGB565
Add support for creating buffers using big-endian formats. For now this is limited to XRGB1555 and RGB565, which are the most common big-endian formats. Signed-off-by: Geert Uytterhoeven --- v4: - No changes, v3: - No changes, v2: - New. --- tests/modetest/buffers.c | 4 1 file changed, 4 insertions(+) diff --git a/tests/modetest/buffers.c b/tests/modetest/buffers.c index 65f1cfb32ab9eeae..c8aafadd5145b64a 100644 --- a/tests/modetest/buffers.c +++ b/tests/modetest/buffers.c @@ -158,6 +158,7 @@ bo_create(int fd, unsigned int format, case DRM_FORMAT_BGRX: case DRM_FORMAT_ARGB1555: case DRM_FORMAT_XRGB1555: + case DRM_FORMAT_XRGB1555 | DRM_FORMAT_BIG_ENDIAN: case DRM_FORMAT_ABGR1555: case DRM_FORMAT_XBGR1555: case DRM_FORMAT_RGBA5551: @@ -165,6 +166,7 @@ bo_create(int fd, unsigned int format, case DRM_FORMAT_BGRA5551: case DRM_FORMAT_BGRX5551: case DRM_FORMAT_RGB565: + case DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN: case DRM_FORMAT_BGR565: case DRM_FORMAT_UYVY: case DRM_FORMAT_VYUY: @@ -318,6 +320,7 @@ bo_create(int fd, unsigned int format, case DRM_FORMAT_BGRX: case DRM_FORMAT_ARGB1555: case DRM_FORMAT_XRGB1555: + case DRM_FORMAT_XRGB1555 | DRM_FORMAT_BIG_ENDIAN: case DRM_FORMAT_ABGR1555: case DRM_FORMAT_XBGR1555: case DRM_FORMAT_RGBA5551: @@ -325,6 +328,7 @@ bo_create(int fd, unsigned int format, case DRM_FORMAT_BGRA5551: case DRM_FORMAT_BGRX5551: case DRM_FORMAT_RGB565: + case DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN: case DRM_FORMAT_BGR565: case DRM_FORMAT_BGR888: case DRM_FORMAT_RGB888: -- 2.34.1
[PATCH libdrm v4 5/9] modetest: add support for parsing big-endian formats
When specifying a frame buffer format like "RG16_BE" (big-endian RG16), modetest still uses the little-endian variant, as the format string is truncated to four characters. Fix this by increasing the format string size to 8 bytes (7 characters + NUL terminator). Signed-off-by: Geert Uytterhoeven --- v4: - No changes, v3: - Update for suffix change from "be" to "_BE", cfr. commit ffb9375a505700ad ("xf86drm: handle DRM_FORMAT_BIG_ENDIAN in drmGetFormatName()"), - Replace hardcoded numbers in code by sizeof(), v2: - New. --- tests/modetest/modetest.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/tests/modetest/modetest.c b/tests/modetest/modetest.c index 9b1aa537be8716cf..cc96015f4a555dd3 100644 --- a/tests/modetest/modetest.c +++ b/tests/modetest/modetest.c @@ -817,7 +817,7 @@ struct pipe_arg { unsigned int num_cons; uint32_t crtc_id; char mode_str[64]; - char format_str[5]; + char format_str[8]; /* need to leave room for "_BE" and terminating \0 */ float vrefresh; unsigned int fourcc; drmModeModeInfo *mode; @@ -841,7 +841,7 @@ struct plane_arg { unsigned int old_fb_id; struct bo *bo; struct bo *old_bo; - char format_str[5]; /* need to leave room for terminating \0 */ + char format_str[8]; /* need to leave room for "_BE" and terminating \0 */ unsigned int fourcc; }; @@ -2032,8 +2032,9 @@ static int parse_connector(struct pipe_arg *pipe, const char *arg) } if (*p == '@') { - strncpy(pipe->format_str, p + 1, 4); - pipe->format_str[4] = '\0'; + len = sizeof(pipe->format_str) - 1; + strncpy(pipe->format_str, p + 1, len); + pipe->format_str[len] = '\0'; } pipe->fourcc = util_format_fourcc(pipe->format_str); @@ -2047,6 +2048,7 @@ static int parse_connector(struct pipe_arg *pipe, const char *arg) static int parse_plane(struct plane_arg *plane, const char *p) { + unsigned int len; char *end; plane->plane_id = strtoul(p, , 10); @@ -2085,8 +2087,9 @@ static int parse_plane(struct plane_arg *plane, const char *p) } if (*end == '@') { - strncpy(plane->format_str, end + 1, 4); - plane->format_str[4] = '\0'; + len = sizeof(plane->format_str) - 1; + strncpy(plane->format_str, end + 1, len); + plane->format_str[len] = '\0'; } else { strcpy(plane->format_str, "XR24"); } -- 2.34.1
[PATCH libdrm v4 2/9] util: fix 32 bpp patterns on big-endian
DRM formats are defined to be little-endian, unless the DRM_FORMAT_BIG_ENDIAN flag is set. Hence writes of multi-byte pixel values need to take endianness into account. Introduce a swap32() helper to byteswap 32-bit values, and a cpu_to_le32() helper to convert 32-bit values from CPU-endian to little-endian, and use the latter in the various pattern fill functions for 32-bit formats. Signed-off-by: Geert Uytterhoeven Acked-by: Pekka Paalanen --- v4: - Use new HAVE_BIG_ENDIAN symbol, v3: - Increase indentation after definition of cpu_to_le32(), v2: - Add Acked-by, - Add swap32() intermediate helper, - Add __ARM_BIG_ENDIAN and __s390__. --- tests/util/pattern.c | 32 +++- 1 file changed, 23 insertions(+), 9 deletions(-) diff --git a/tests/util/pattern.c b/tests/util/pattern.c index f69c5206d96eff02..c79cad2c6a23993f 100644 --- a/tests/util/pattern.c +++ b/tests/util/pattern.c @@ -61,6 +61,20 @@ struct color_yuv { .u = MAKE_YUV_601_U(r, g, b), \ .v = MAKE_YUV_601_V(r, g, b) } +static inline uint32_t swap32(uint32_t x) +{ + return ((x & 0x00ffU) << 24) | + ((x & 0xff00U) << 8) | + ((x & 0x00ffU) >> 8) | + ((x & 0xff00U) >> 24); +} + +#ifdef HAVE_BIG_ENDIAN +#define cpu_to_le32(x) swap32(x) +#else +#define cpu_to_le32(x) (x) +#endif + /* This function takes 8-bit color values */ static inline uint32_t shiftcolor8(const struct util_color_component *comp, uint32_t value) @@ -520,26 +534,26 @@ static void fill_smpte_rgb32(const struct util_rgb_info *rgb, void *mem, for (y = 0; y < height * 6 / 9; ++y) { for (x = 0; x < width; ++x) - ((uint32_t *)mem)[x] = colors_top[x * 7 / width]; + ((uint32_t *)mem)[x] = cpu_to_le32(colors_top[x * 7 / width]); mem += stride; } for (; y < height * 7 / 9; ++y) { for (x = 0; x < width; ++x) - ((uint32_t *)mem)[x] = colors_middle[x * 7 / width]; + ((uint32_t *)mem)[x] = cpu_to_le32(colors_middle[x * 7 / width]); mem += stride; } for (; y < height; ++y) { for (x = 0; x < width * 5 / 7; ++x) ((uint32_t *)mem)[x] = - colors_bottom[x * 4 / (width * 5 / 7)]; + cpu_to_le32(colors_bottom[x * 4 / (width * 5 / 7)]); for (; x < width * 6 / 7; ++x) ((uint32_t *)mem)[x] = - colors_bottom[(x - width * 5 / 7) * 3 - / (width / 7) + 4]; + cpu_to_le32(colors_bottom[(x - width * 5 / 7) * 3 + / (width / 7) + 4]); for (; x < width; ++x) - ((uint32_t *)mem)[x] = colors_bottom[7]; + ((uint32_t *)mem)[x] = cpu_to_le32(colors_bottom[7]); mem += stride; } } @@ -1315,7 +1329,7 @@ static void fill_tiles_rgb32(const struct util_format_info *info, void *mem, (rgb32 >> 8) & 0xff, rgb32 & 0xff, alpha); - ((uint32_t *)mem)[x] = color; + ((uint32_t *)mem)[x] = cpu_to_le32(color); } mem += stride; } @@ -1464,7 +1478,7 @@ static void fill_gradient_rgb32(const struct util_rgb_info *rgb, for (j = 0; j < width / 2; j++) { uint32_t value = MAKE_RGBA10(rgb, j & 0x3ff, j & 0x3ff, j & 0x3ff, 0); - row[2*j] = row[2*j+1] = value; + row[2*j] = row[2*j+1] = cpu_to_le32(value); } mem += stride; } @@ -1474,7 +1488,7 @@ static void fill_gradient_rgb32(const struct util_rgb_info *rgb, for (j = 0; j < width / 2; j++) { uint32_t value = MAKE_RGBA10(rgb, j & 0x3fc, j & 0x3fc, j & 0x3fc, 0); - row[2*j] = row[2*j+1] = value; + row[2*j] = row[2*j+1] = cpu_to_le32(value); } mem += stride; } -- 2.34.1
[PATCH libdrm v4 6/9] util: add test pattern support for big-endian XRGB1555/RGB565
Add support for drawing the SMPTE and tiles test patterns in buffers using big-endian formats. For now this is limited to XRGB1555 and RGB565, which are the most common big-endian formats. Signed-off-by: Geert Uytterhoeven --- v4: - No changes, v3: - Increase indentation after definition of cpu_to_be16(), v2: - New. --- tests/util/pattern.c | 31 +-- 1 file changed, 21 insertions(+), 10 deletions(-) diff --git a/tests/util/pattern.c b/tests/util/pattern.c index 1927377d0027c6fd..ef589124f278e837 100644 --- a/tests/util/pattern.c +++ b/tests/util/pattern.c @@ -23,6 +23,7 @@ * IN THE SOFTWARE. */ +#include #include #include #include @@ -75,13 +76,17 @@ static inline uint32_t swap32(uint32_t x) } #ifdef HAVE_BIG_ENDIAN +#define cpu_to_be16(x) (x) #define cpu_to_le16(x) swap16(x) #define cpu_to_le32(x) swap32(x) #else +#define cpu_to_be16(x) swap16(x) #define cpu_to_le16(x) (x) #define cpu_to_le32(x) (x) #endif +#define cpu_to_fb16(x) (fb_be ? cpu_to_be16(x) : cpu_to_le16(x)) + /* This function takes 8-bit color values */ static inline uint32_t shiftcolor8(const struct util_color_component *comp, uint32_t value) @@ -382,7 +387,7 @@ static void fill_smpte_yuv_packed(const struct util_yuv_info *yuv, void *mem, static void fill_smpte_rgb16(const struct util_rgb_info *rgb, void *mem, unsigned int width, unsigned int height, -unsigned int stride) +unsigned int stride, bool fb_be) { const uint16_t colors_top[] = { MAKE_RGBA(rgb, 192, 192, 192, 255), /* grey */ @@ -417,26 +422,26 @@ static void fill_smpte_rgb16(const struct util_rgb_info *rgb, void *mem, for (y = 0; y < height * 6 / 9; ++y) { for (x = 0; x < width; ++x) - ((uint16_t *)mem)[x] = cpu_to_le16(colors_top[x * 7 / width]); + ((uint16_t *)mem)[x] = cpu_to_fb16(colors_top[x * 7 / width]); mem += stride; } for (; y < height * 7 / 9; ++y) { for (x = 0; x < width; ++x) - ((uint16_t *)mem)[x] = cpu_to_le16(colors_middle[x * 7 / width]); + ((uint16_t *)mem)[x] = cpu_to_fb16(colors_middle[x * 7 / width]); mem += stride; } for (; y < height; ++y) { for (x = 0; x < width * 5 / 7; ++x) ((uint16_t *)mem)[x] = - cpu_to_le16(colors_bottom[x * 4 / (width * 5 / 7)]); + cpu_to_fb16(colors_bottom[x * 4 / (width * 5 / 7)]); for (; x < width * 6 / 7; ++x) ((uint16_t *)mem)[x] = - cpu_to_le16(colors_bottom[(x - width * 5 / 7) * 3 + cpu_to_fb16(colors_bottom[(x - width * 5 / 7) * 3 / (width / 7) + 4]); for (; x < width; ++x) - ((uint16_t *)mem)[x] = cpu_to_le16(colors_bottom[7]); + ((uint16_t *)mem)[x] = cpu_to_fb16(colors_bottom[7]); mem += stride; } } @@ -1089,9 +1094,11 @@ static void fill_smpte(const struct util_format_info *info, void *planes[3], case DRM_FORMAT_BGRA: case DRM_FORMAT_BGRX: case DRM_FORMAT_RGB565: + case DRM_FORMAT_RGB565 | DRM_FORMAT_BIG_ENDIAN: case DRM_FORMAT_BGR565: case DRM_FORMAT_ARGB1555: case DRM_FORMAT_XRGB1555: + case DRM_FORMAT_XRGB1555 | DRM_FORMAT_BIG_ENDIAN: case DRM_FORMAT_ABGR1555: case DRM_FORMAT_XBGR1555: case DRM_FORMAT_RGBA5551: @@ -1099,7 +1106,8 @@ static void fill_smpte(const struct util_format_info *info, void *planes[3], case DRM_FORMAT_BGRA5551: case DRM_FORMAT_BGRX5551: return fill_smpte_rgb16(>rgb, planes[0], - width, height, stride); + width, height, stride, + info->format & DRM_FORMAT_BIG_ENDIAN); case DRM_FORMAT_BGR888: case DRM_FORMAT_RGB888: @@ -1271,7 +1279,7 @@ static void fill_tiles_yuv_packed(const struct util_format_info *info, static void fill_tiles_rgb16(const struct util_format_info *info, void *mem, unsigned int width, unsigned int height, -unsigned int stride) +unsigned int stride, bool fb_be) { const struct util_rgb_info *rgb = >rgb; void *mem_base = mem; @@ -1287,7 +1295,7 @@ static void fill_tiles_rgb16(const struct util_format_info *info, void *mem, (rgb32 >> 8)
Re: [PATCH v2] accel/qaic: Enable 1 MSI fallback mode
On 10/16/2023 11:00 AM, Jeffrey Hugo wrote: From: Carl Vanderlip Several virtualization use-cases either don't support 32 MultiMSIs (Xen/VMware) or have significant drawbacks to their use (KVM's vIOMMU, which is required to support 32 MSI, needs to allocate an alternate system memory space for each device using vIOMMU (e.g. 8GB VM mem and 2 cards => 8 + 2 * 8 = 24GB host memory required)). Support these cases by enabling a 1 MSI fallback mode. Whenever all 32 MSIs requested are not available, a second request for a single MSI is made. Its success is the initiator of single MSI mode. This mode causes all interrupts generated by the device to be directed to the 0th MSI (firmware >=v1.10 will do this as a response to the PCIe MSI capability configuration). Likewise, all interrupt handlers for the device are registered to the 0th MSI. Since the DBC interrupt handler checks if the DBC is in use or if there is any pending changes, the 'spurious' interrupts are disregarded. If there is work to be done, the standard threaded IRQ handler is dispatched. On every interrupt, the MHI handler wakes up its threaded interrupt handler, and attempts to wake any waiters for MHI state events. Performance is within +-0.6% for test cases that typify real world use. Larger differences ([-4,+132]%, avg +47%) exist for very simple tasks (e.g. addition) compiled for single NSPs. It is assumed that the small work and many interrupts typically cause contention (e.g. 16 NSPs vs 4 CPUs), as evidenced by the standard deviation between runs also decreasing (r=-0.48 between delta(Performace_test) and delta(StdDev_test/Avg_test)) Signed-off-by: Carl Vanderlip Reviewed-by: Pranjal Ramajor Asha Kanojiya Reviewed-by: Jeffrey Hugo Signed-off-by: Jeffrey Hugo Pushed to drm-misc-next
[PATCH] accel/ivpu: avoid build failure with CONFIG_PM=n
From: Arnd Bergmann The usage count of struct dev_pm_info is an implementation detail that is only available if CONFIG_PM is enabled, so printing it in a debug message causes a build failure in configurations without PM: In file included from include/linux/device.h:15, from include/linux/pci.h:37, from drivers/accel/ivpu/ivpu_pm.c:8: drivers/accel/ivpu/ivpu_pm.c: In function 'ivpu_rpm_get_if_active': drivers/accel/ivpu/ivpu_pm.c:254:51: error: 'struct dev_pm_info' has no member named 'usage_count' 254 | atomic_read(>drm.dev->power.usage_count)); | ^ include/linux/dev_printk.h:129:48: note: in definition of macro 'dev_printk' 129 | _dev_printk(level, dev, fmt, ##__VA_ARGS__); \ |^~~ drivers/accel/ivpu/ivpu_drv.h:75:17: note: in expansion of macro 'dev_dbg' 75 | dev_dbg((vdev)->drm.dev, "[%s] " fmt, #type, ##args); \ | ^~~ drivers/accel/ivpu/ivpu_pm.c:253:9: note: in expansion of macro 'ivpu_dbg' 253 | ivpu_dbg(vdev, RPM, "rpm_get_if_active count %d\n", | ^~~~ The print message does not seem essential, so the easiest workaround is to just remove it. Fixes: c39dc15191c4 ("accel/ivpu: Read clock rate only if device is up") Signed-off-by: Arnd Bergmann --- drivers/accel/ivpu/ivpu_pm.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c index 0ace218783c8..e9b16cbc26f4 100644 --- a/drivers/accel/ivpu/ivpu_pm.c +++ b/drivers/accel/ivpu/ivpu_pm.c @@ -250,9 +250,6 @@ int ivpu_rpm_get_if_active(struct ivpu_device *vdev) { int ret; - ivpu_dbg(vdev, RPM, "rpm_get_if_active count %d\n", -atomic_read(>drm.dev->power.usage_count)); - ret = pm_runtime_get_if_active(vdev->drm.dev, false); drm_WARN_ON(>drm, ret < 0); -- 2.39.2
Re: [PATCH 11/11] accel/ivpu: Wait for VPU to enter idle state after D0i3 entry message
On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote: From: Andrzej Kacprowski The VPU needs non zero time to enter IDLE state after responding to D0i3 entry message. If the driver does not wait for the VPU to enter IDLE state it could cause warm boot failures. Signed-off-by: Andrzej Kacprowski Reviewed-by: Stanislaw Gruszka Signed-off-by: Stanislaw Gruszka Should this be squashed into the previous change? Feels like this is a bug fix to an issue that the previous change introduced.
Re: [PATCH 10/11] accel/ivpu: Add support for delayed D0i3 entry message
On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote: From: Andrzej Kacprowski Currently the VPU firmware prepares for D0i3 every time the VPU is entering D0i2 Idle state. This is not optimal as we might not enter D0i3 every time we enter D0i2 Idle and this preparation is quite costly. This optimization moves D0i3 preparation to a dedicated message sent from the host driver only when the driver is about to enter D0i3 - this reduces power consumption and latency for certain workloads, for example audio workloads that submit inference every 10 ms. Signed-off-by: Andrzej Kacprowski Reviewed-by: Stanislaw Gruszka Signed-off-by: Stanislaw Gruszka Reviewed-by: Jeffrey Hugo Some nits below. --- drivers/accel/ivpu/ivpu_drv.h | 10 +-- drivers/accel/ivpu/ivpu_fw.c | 48 +-- drivers/accel/ivpu/ivpu_hw_37xx.c | 8 -- drivers/accel/ivpu/ivpu_hw_40xx.c | 3 ++ drivers/accel/ivpu/ivpu_jsm_msg.c | 15 ++ drivers/accel/ivpu/ivpu_jsm_msg.h | 1 + drivers/accel/ivpu/ivpu_pm.c | 11 ++- 7 files changed, 87 insertions(+), 9 deletions(-) diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h index 5432b5ee90df..84b170a3c323 100644 --- a/drivers/accel/ivpu/ivpu_drv.h +++ b/drivers/accel/ivpu/ivpu_drv.h @@ -88,6 +88,7 @@ struct ivpu_wa_table { bool d3hot_after_power_off; bool interrupt_clear_with_0; bool disable_clock_relinquish; + bool disable_d0i3_msg; }; struct ivpu_hw_info; @@ -126,6 +127,7 @@ struct ivpu_device { int tdr; int reschedule_suspend; int autosuspend; + int d0i3_entry_msg; } timeout; }; @@ -148,9 +150,11 @@ extern u8 ivpu_pll_min_ratio; extern u8 ivpu_pll_max_ratio; extern bool ivpu_disable_mmu_cont_pages; -#define IVPU_TEST_MODE_FW_TEST BIT(0) -#define IVPU_TEST_MODE_NULL_HW BIT(1) -#define IVPU_TEST_MODE_NULL_SUBMISSION BIT(2) +#define IVPU_TEST_MODE_FW_TESTBIT(0) +#define IVPU_TEST_MODE_NULL_HWBIT(1) +#define IVPU_TEST_MODE_NULL_SUBMISSIONBIT(2) +#define IVPU_TEST_MODE_D0I3_MSG_DISABLE BIT(4) +#define IVPU_TEST_MODE_D0I3_MSG_ENABLEBIT(5) extern int ivpu_test_mode; struct ivpu_file_priv *ivpu_file_priv_get(struct ivpu_file_priv *file_priv); diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c index 51d976ba5276..3fd94c2d06c7 100644 --- a/drivers/accel/ivpu/ivpu_fw.c +++ b/drivers/accel/ivpu/ivpu_fw.c @@ -33,12 +33,17 @@ #define ADDR_TO_L2_CACHE_CFG(addr) ((addr) >> 31) -#define IVPU_FW_CHECK_API(vdev, fw_hdr, name, min_major) \ +/* Check if FW API is compatible with the driver */ +#define IVPU_FW_CHECK_API_COMPAT(vdev, fw_hdr, name, min_major) \ ivpu_fw_check_api(vdev, fw_hdr, #name, \ VPU_##name##_API_VER_INDEX, \ VPU_##name##_API_VER_MAJOR, \ VPU_##name##_API_VER_MINOR, min_major) +/* Check if API version is lover that the given version */ lower than? +#define IVPU_FW_CHECK_API_VER_LT(vdev, fw_hdr, name, major, minor) \ + ivpu_fw_check_api_ver_lt(vdev, fw_hdr, #name, VPU_##name##_API_VER_INDEX, major, minor) + static char *ivpu_firmware; module_param_named_unsafe(firmware, ivpu_firmware, charp, 0644); MODULE_PARM_DESC(firmware, "VPU firmware binary in /lib/firmware/.."); @@ -105,6 +110,19 @@ ivpu_fw_check_api(struct ivpu_device *vdev, const struct vpu_firmware_header *fw return 0; } +static bool +ivpu_fw_check_api_ver_lt(struct ivpu_device *vdev, const struct vpu_firmware_header *fw_hdr, +const char *str, int index, u16 major, u16 minor) +{ + u16 fw_major = (u16)(fw_hdr->api_version[index] >> 16); + u16 fw_minor = (u16)(fw_hdr->api_version[index]); + + if (fw_major < major || (fw_major == major && fw_minor < minor)) + return true; + + return false; +} + static int ivpu_fw_parse(struct ivpu_device *vdev) { struct ivpu_fw_info *fw = vdev->fw; @@ -164,9 +182,9 @@ static int ivpu_fw_parse(struct ivpu_device *vdev) ivpu_info(vdev, "Firmware: %s, version: %s", fw->name, (const char *)fw_hdr + VPU_FW_HEADER_SIZE); - if (IVPU_FW_CHECK_API(vdev, fw_hdr, BOOT, 3)) + if (IVPU_FW_CHECK_API_COMPAT(vdev, fw_hdr, BOOT, 3)) return -EINVAL; - if (IVPU_FW_CHECK_API(vdev, fw_hdr, JSM, 3)) + if (IVPU_FW_CHECK_API_COMPAT(vdev, fw_hdr, JSM, 3)) return -EINVAL; fw->runtime_addr = runtime_addr; @@ -197,6 +215,24 @@ static void ivpu_fw_release(struct ivpu_device *vdev) release_firmware(vdev->fw->file); } +/* Initialize workarounds that depend on FW version */ +static void +ivpu_fw_init_wa(struct ivpu_device *vdev) +{ + const struct vpu_firmware_header *fw_hdr = (const void *)vdev->fw->file->data; + + if
Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
On 10/27/23 16:59, Luben Tuikov wrote: Hi Danilo, On 2023-10-27 10:45, Danilo Krummrich wrote: Hi Luben, On 10/26/23 23:13, Luben Tuikov wrote: On 2023-10-26 12:13, Danilo Krummrich wrote: Currently, job flow control is implemented simply by limiting the number of jobs in flight. Therefore, a scheduler is initialized with a credit limit that corresponds to the number of jobs which can be sent to the hardware. This implies that for each job, drivers need to account for the maximum job size possible in order to not overflow the ring buffer. However, there are drivers, such as Nouveau, where the job size has a rather large range. For such drivers it can easily happen that job submissions not even filling the ring by 1% can block subsequent submissions, which, in the worst case, can lead to the ring run dry. In order to overcome this issue, allow for tracking the actual job size instead of the number of jobs. Therefore, add a field to track a job's credit count, which represents the number of credits a job contributes to the scheduler's credit limit. Signed-off-by: Danilo Krummrich --- Changes in V2: == - fixed up influence on scheduling fairness due to consideration of a job's size - If we reach a ready entity in drm_sched_select_entity() but can't actually queue a job from it due to size limitations, just give up and go to sleep until woken up due to a pending job finishing, rather than continue to try other entities. - added a callback to dynamically update a job's credits (Boris) - renamed 'units' to 'credits' - fixed commit message and comments as requested by Luben Changes in V3: == - rebased onto V7 of the "DRM scheduler changes for Xe" series by Matt - move up drm_sched_can_queue() instead of adding a forward declaration (Boris) - add a drm_sched_available_credits() helper (Boris) - adjust control flow in drm_sched_rq_select_entity_fifo() to Luben's proposal - re-phrase a few comments and fix a typo (Luben) - change naming of all structures credit fields and function parameters to the following scheme - drm_sched_job::credits - drm_gpu_scheduler::credit_limit - drm_gpu_scheduler::credit_count - drm_sched_init(..., u32 credit_limit, ...) - drm_sched_job_init(..., u32 credits, ...) - add proper documentation for the scheduler's job-flow control mechanism This patch is based on V7 of the "DRM scheduler changes for Xe" series. [1] provides a branch based on drm-misc-next, with the named series and this patch on top of it. [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/sched-credits/ --- Documentation/gpu/drm-mm.rst | 6 + drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 2 +- drivers/gpu/drm/lima/lima_sched.c | 2 +- drivers/gpu/drm/msm/msm_gem_submit.c | 2 +- drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +- drivers/gpu/drm/panfrost/panfrost_drv.c | 2 +- .../gpu/drm/scheduler/gpu_scheduler_trace.h | 2 +- drivers/gpu/drm/scheduler/sched_entity.c | 4 +- drivers/gpu/drm/scheduler/sched_main.c| 142 ++ drivers/gpu/drm/v3d/v3d_gem.c | 2 +- include/drm/gpu_scheduler.h | 33 +++- 12 files changed, 152 insertions(+), 49 deletions(-) diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst index 602010cb6894..acc5901ac840 100644 --- a/Documentation/gpu/drm-mm.rst +++ b/Documentation/gpu/drm-mm.rst @@ -552,6 +552,12 @@ Overview .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c :doc: Overview +Flow Control + + +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c + :doc: Flow Control + Scheduler Function References - diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 1f357198533f..62bb7fc7448a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -115,7 +115,7 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm, if (!entity) return 0; - return drm_sched_job_init(&(*job)->base, entity, owner); + return drm_sched_job_init(&(*job)->base, entity, 1, owner); } int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c index 2416c526f9b0..3d0f8d182506 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c @@ -535,7 +535,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, ret = drm_sched_job_init(>sched_job, >sched_entity[args->pipe], -
Re: [PATCH 09/11] accel/ivpu/40xx: Capture D0i3 entry host and device timestamps
On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote: From: Andrzej Kacprowski The driver needs to capture the D0i3 entry timestamp to calculate D0i3 residency time. The D0i3 residency time and the VPU timestamp are passed to the firmware at D0i3 exit (warm boot). Signed-off-by: Andrzej Kacprowski Reviewed-by: Stanislaw Gruszka Signed-off-by: Stanislaw Gruszka Feels like this should be applied before patch 8, otherwise this fixes a bug introduced by patch 8. Reviewed-by: Jeffrey Hugo
Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
Hi Danilo, On 2023-10-27 10:45, Danilo Krummrich wrote: > Hi Luben, > > On 10/26/23 23:13, Luben Tuikov wrote: >> On 2023-10-26 12:13, Danilo Krummrich wrote: >>> Currently, job flow control is implemented simply by limiting the number >>> of jobs in flight. Therefore, a scheduler is initialized with a credit >>> limit that corresponds to the number of jobs which can be sent to the >>> hardware. >>> >>> This implies that for each job, drivers need to account for the maximum >>> job size possible in order to not overflow the ring buffer. >>> >>> However, there are drivers, such as Nouveau, where the job size has a >>> rather large range. For such drivers it can easily happen that job >>> submissions not even filling the ring by 1% can block subsequent >>> submissions, which, in the worst case, can lead to the ring run dry. >>> >>> In order to overcome this issue, allow for tracking the actual job size >>> instead of the number of jobs. Therefore, add a field to track a job's >>> credit count, which represents the number of credits a job contributes >>> to the scheduler's credit limit. >>> >>> Signed-off-by: Danilo Krummrich >>> --- >>> Changes in V2: >>> == >>>- fixed up influence on scheduling fairness due to consideration of a >>> job's >>> size >>> - If we reach a ready entity in drm_sched_select_entity() but can't >>> actually >>>queue a job from it due to size limitations, just give up and go to >>> sleep >>>until woken up due to a pending job finishing, rather than continue >>> to try >>>other entities. >>>- added a callback to dynamically update a job's credits (Boris) >>>- renamed 'units' to 'credits' >>>- fixed commit message and comments as requested by Luben >>> >>> Changes in V3: >>> == >>>- rebased onto V7 of the "DRM scheduler changes for Xe" series by Matt >>>- move up drm_sched_can_queue() instead of adding a forward declaration >>> (Boris) >>>- add a drm_sched_available_credits() helper (Boris) >>>- adjust control flow in drm_sched_rq_select_entity_fifo() to Luben's >>> proposal >>>- re-phrase a few comments and fix a typo (Luben) >>>- change naming of all structures credit fields and function parameters >>> to the >>> following scheme >>> - drm_sched_job::credits >>> - drm_gpu_scheduler::credit_limit >>> - drm_gpu_scheduler::credit_count >>> - drm_sched_init(..., u32 credit_limit, ...) >>> - drm_sched_job_init(..., u32 credits, ...) >>>- add proper documentation for the scheduler's job-flow control mechanism >>> >>> This patch is based on V7 of the "DRM scheduler changes for Xe" series. [1] >>> provides a branch based on drm-misc-next, with the named series and this >>> patch >>> on top of it. >>> >>> [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/sched-credits/ >>> --- >>> Documentation/gpu/drm-mm.rst | 6 + >>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +- >>> drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 2 +- >>> drivers/gpu/drm/lima/lima_sched.c | 2 +- >>> drivers/gpu/drm/msm/msm_gem_submit.c | 2 +- >>> drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +- >>> drivers/gpu/drm/panfrost/panfrost_drv.c | 2 +- >>> .../gpu/drm/scheduler/gpu_scheduler_trace.h | 2 +- >>> drivers/gpu/drm/scheduler/sched_entity.c | 4 +- >>> drivers/gpu/drm/scheduler/sched_main.c| 142 ++ >>> drivers/gpu/drm/v3d/v3d_gem.c | 2 +- >>> include/drm/gpu_scheduler.h | 33 +++- >>> 12 files changed, 152 insertions(+), 49 deletions(-) >>> >>> diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst >>> index 602010cb6894..acc5901ac840 100644 >>> --- a/Documentation/gpu/drm-mm.rst >>> +++ b/Documentation/gpu/drm-mm.rst >>> @@ -552,6 +552,12 @@ Overview >>> .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c >>> :doc: Overview >>> >>> +Flow Control >>> + >>> + >>> +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c >>> + :doc: Flow Control >>> + >>> Scheduler Function References >>> - >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >>> index 1f357198533f..62bb7fc7448a 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >>> @@ -115,7 +115,7 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct >>> amdgpu_vm *vm, >>> if (!entity) >>> return 0; >>> >>> - return drm_sched_job_init(&(*job)->base, entity, owner); >>> + return drm_sched_job_init(&(*job)->base, entity, 1, owner); >>> } >>> >>> int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, >>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c >>> b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c >>> index
Re: [PATCH 08/11] accel/ivpu: Pass D0i3 residency time to the VPU firmware
On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote: From: Andrzej Kacprowski The firmware needs to know the time spend in D0i3/D3 to spent? calculate telemetry data. The D0i3/D3 residency time is calculated by the driver and passed to the firmware in the boot parameters. The driver also passes VPU perf counter value captured right before entering D0i3 - this allows the VPU firmware to generate monotonic timestamps for the logs. Signed-off-by: Andrzej Kacprowski Reviewed-by: Stanislaw Gruszka Signed-off-by: Stanislaw Gruszka Reviewed-by: Jeffrey Hugo
Re: [PATCH 07/11] accel/ivpu: Introduce ivpu_ipc_send_receive_active()
On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote: From: Karol Wachowski Split ivpu_ipc_send_receive() implementation to have a version that does not call pm_runtime_resume_and_get(). That implementation can be invoked when device is up and runtime resume is prohibited (for example at the end of boot sequence). There doesn't seem to be a user for this, which would make the new function dead code. Assuming that this new function gets used later in the series, it would be clearer to combine this change with that one. -Jeff
Re: [PATCH] i915/perf: Fix NULL deref bugs with drm_dbg() calls
On 27/10/2023 15:11, Andrzej Hajda wrote: On 27.10.2023 16:07, Harshit Mogalapalli wrote: When i915 perf interface is not available dereferencing it will lead to NULL dereferences. Fix this by using DRM_DEBUG() which the scenario before the commit in the Fixes tag. Fixes: 2fec539112e8 ("i915/perf: Replace DRM_DEBUG with driver specific drm_dbg call") Signed-off-by: Harshit Mogalapalli Reviewed-by: Andrzej Hajda Please hold off merging. --- This is found using smatch(static analysis tool), only compile tested. --- drivers/gpu/drm/i915/i915_perf.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 2f3ecd7d4804..bb48c96b7950 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -4228,8 +4228,7 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data, int ret; if (!perf->i915) { - drm_dbg(>i915->drm, - "i915 perf interface not available for this system\n"); + DRM_DEBUG("i915 perf interface not available for this system\n"); What's that struct drm_device *dev function argument a few lines up? :) Although TBH all these these could just be removed since I doubt they are adding any value and ENOTSUPP is pretty clear. Regards, Tvrtko return -ENOTSUPP; } @@ -4608,8 +4607,7 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data, int err, id; if (!perf->i915) { - drm_dbg(>i915->drm, - "i915 perf interface not available for this system\n"); + DRM_DEBUG("i915 perf interface not available for this system\n"); return -ENOTSUPP; } @@ -4774,8 +4772,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data, int ret; if (!perf->i915) { - drm_dbg(>i915->drm, - "i915 perf interface not available for this system\n"); + DRM_DEBUG("i915 perf interface not available for this system\n"); return -ENOTSUPP; }
Re: [PATCH 06/11] accel/ivpu: Change test_mode module param to bitmask
On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote: From: Karol Wachowski Change meaning of test_mode module parameter from integer value to bitmask allowing setting different test features with corresponding bits. Signed-off-by: Karol Wachowski Reviewed-by: Stanislaw Gruszka Signed-off-by: Stanislaw Gruszka Seems like this changes the uAPI. You still haven't made a release of the userspace, correct? If so - Reviewed-by: Jeffrey Hugo
Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
On 10/27/23 03:03, Luben Tuikov wrote: On 2023-10-26 17:13, Luben Tuikov wrote: On 2023-10-26 12:13, Danilo Krummrich wrote: Currently, job flow control is implemented simply by limiting the number of jobs in flight. Therefore, a scheduler is initialized with a credit limit that corresponds to the number of jobs which can be sent to the hardware. This implies that for each job, drivers need to account for the maximum job size possible in order to not overflow the ring buffer. However, there are drivers, such as Nouveau, where the job size has a rather large range. For such drivers it can easily happen that job submissions not even filling the ring by 1% can block subsequent submissions, which, in the worst case, can lead to the ring run dry. In order to overcome this issue, allow for tracking the actual job size instead of the number of jobs. Therefore, add a field to track a job's credit count, which represents the number of credits a job contributes to the scheduler's credit limit. Signed-off-by: Danilo Krummrich --- Changes in V2: == - fixed up influence on scheduling fairness due to consideration of a job's size - If we reach a ready entity in drm_sched_select_entity() but can't actually queue a job from it due to size limitations, just give up and go to sleep until woken up due to a pending job finishing, rather than continue to try other entities. - added a callback to dynamically update a job's credits (Boris) - renamed 'units' to 'credits' - fixed commit message and comments as requested by Luben Changes in V3: == - rebased onto V7 of the "DRM scheduler changes for Xe" series by Matt - move up drm_sched_can_queue() instead of adding a forward declaration (Boris) - add a drm_sched_available_credits() helper (Boris) - adjust control flow in drm_sched_rq_select_entity_fifo() to Luben's proposal - re-phrase a few comments and fix a typo (Luben) - change naming of all structures credit fields and function parameters to the following scheme - drm_sched_job::credits - drm_gpu_scheduler::credit_limit - drm_gpu_scheduler::credit_count - drm_sched_init(..., u32 credit_limit, ...) - drm_sched_job_init(..., u32 credits, ...) - add proper documentation for the scheduler's job-flow control mechanism This patch is based on V7 of the "DRM scheduler changes for Xe" series. [1] provides a branch based on drm-misc-next, with the named series and this patch on top of it. [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/sched-credits/ --- Documentation/gpu/drm-mm.rst | 6 + drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 2 +- drivers/gpu/drm/lima/lima_sched.c | 2 +- drivers/gpu/drm/msm/msm_gem_submit.c | 2 +- drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +- drivers/gpu/drm/panfrost/panfrost_drv.c | 2 +- .../gpu/drm/scheduler/gpu_scheduler_trace.h | 2 +- drivers/gpu/drm/scheduler/sched_entity.c | 4 +- drivers/gpu/drm/scheduler/sched_main.c| 142 ++ drivers/gpu/drm/v3d/v3d_gem.c | 2 +- include/drm/gpu_scheduler.h | 33 +++- 12 files changed, 152 insertions(+), 49 deletions(-) diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst index 602010cb6894..acc5901ac840 100644 --- a/Documentation/gpu/drm-mm.rst +++ b/Documentation/gpu/drm-mm.rst @@ -552,6 +552,12 @@ Overview .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c :doc: Overview +Flow Control + + +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c + :doc: Flow Control + Scheduler Function References - diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 1f357198533f..62bb7fc7448a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -115,7 +115,7 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm, if (!entity) return 0; - return drm_sched_job_init(&(*job)->base, entity, owner); + return drm_sched_job_init(&(*job)->base, entity, 1, owner); } int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c index 2416c526f9b0..3d0f8d182506 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c @@ -535,7 +535,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, ret = drm_sched_job_init(>sched_job, >sched_entity[args->pipe], -submit->ctx); +1, submit->ctx); if (ret) goto
Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
Hi Luben, On 10/26/23 23:13, Luben Tuikov wrote: On 2023-10-26 12:13, Danilo Krummrich wrote: Currently, job flow control is implemented simply by limiting the number of jobs in flight. Therefore, a scheduler is initialized with a credit limit that corresponds to the number of jobs which can be sent to the hardware. This implies that for each job, drivers need to account for the maximum job size possible in order to not overflow the ring buffer. However, there are drivers, such as Nouveau, where the job size has a rather large range. For such drivers it can easily happen that job submissions not even filling the ring by 1% can block subsequent submissions, which, in the worst case, can lead to the ring run dry. In order to overcome this issue, allow for tracking the actual job size instead of the number of jobs. Therefore, add a field to track a job's credit count, which represents the number of credits a job contributes to the scheduler's credit limit. Signed-off-by: Danilo Krummrich --- Changes in V2: == - fixed up influence on scheduling fairness due to consideration of a job's size - If we reach a ready entity in drm_sched_select_entity() but can't actually queue a job from it due to size limitations, just give up and go to sleep until woken up due to a pending job finishing, rather than continue to try other entities. - added a callback to dynamically update a job's credits (Boris) - renamed 'units' to 'credits' - fixed commit message and comments as requested by Luben Changes in V3: == - rebased onto V7 of the "DRM scheduler changes for Xe" series by Matt - move up drm_sched_can_queue() instead of adding a forward declaration (Boris) - add a drm_sched_available_credits() helper (Boris) - adjust control flow in drm_sched_rq_select_entity_fifo() to Luben's proposal - re-phrase a few comments and fix a typo (Luben) - change naming of all structures credit fields and function parameters to the following scheme - drm_sched_job::credits - drm_gpu_scheduler::credit_limit - drm_gpu_scheduler::credit_count - drm_sched_init(..., u32 credit_limit, ...) - drm_sched_job_init(..., u32 credits, ...) - add proper documentation for the scheduler's job-flow control mechanism This patch is based on V7 of the "DRM scheduler changes for Xe" series. [1] provides a branch based on drm-misc-next, with the named series and this patch on top of it. [1] https://gitlab.freedesktop.org/nouvelles/kernel/-/commits/sched-credits/ --- Documentation/gpu/drm-mm.rst | 6 + drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 2 +- drivers/gpu/drm/lima/lima_sched.c | 2 +- drivers/gpu/drm/msm/msm_gem_submit.c | 2 +- drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +- drivers/gpu/drm/panfrost/panfrost_drv.c | 2 +- .../gpu/drm/scheduler/gpu_scheduler_trace.h | 2 +- drivers/gpu/drm/scheduler/sched_entity.c | 4 +- drivers/gpu/drm/scheduler/sched_main.c| 142 ++ drivers/gpu/drm/v3d/v3d_gem.c | 2 +- include/drm/gpu_scheduler.h | 33 +++- 12 files changed, 152 insertions(+), 49 deletions(-) diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst index 602010cb6894..acc5901ac840 100644 --- a/Documentation/gpu/drm-mm.rst +++ b/Documentation/gpu/drm-mm.rst @@ -552,6 +552,12 @@ Overview .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c :doc: Overview +Flow Control + + +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c + :doc: Flow Control + Scheduler Function References - diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 1f357198533f..62bb7fc7448a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -115,7 +115,7 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm, if (!entity) return 0; - return drm_sched_job_init(&(*job)->base, entity, owner); + return drm_sched_job_init(&(*job)->base, entity, 1, owner); } int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c index 2416c526f9b0..3d0f8d182506 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c @@ -535,7 +535,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, ret = drm_sched_job_init(>sched_job, >sched_entity[args->pipe], -submit->ctx); +1, submit->ctx); if (ret) goto err_submit_put; diff --git
Re: [PATCH 05/11] accel/ivpu: Add support for VPU_JOB_FLAGS_NULL_SUBMISSION_MASK
On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote: From: Andrzej Kacprowski Add test_mode = 3 that add VPU_JOB_FLAGS_NULL_SUBMISSION_MASK flag to the job send to the VPU device. Then the VPU will process the job but won't execute commands (except the command to signal the fence). This can b used to estimate job processing overhead in the host be? software and VPU firmware. Unlike the null hardware mode, the null submission mode will still work even if UMD uses VPU fences to track job completion. Signed-off-by: Andrzej Kacprowski Reviewed-by: Stanislaw Gruszka Signed-off-by: Stanislaw Gruszka Reviewed-by: Jeffrey Hugo
Re: [PATCH 04/11] accel/ivpu: Remove reset from power up sequence
On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote: From: Karol Wachowski Setting a non-zero work point resets the IP hence IP_RESET trigger is redundant. Signed-off-by: Karol Wachowski Reviewed-by: Stanislaw Gruszka Signed-off-by: Stanislaw Gruszka Reviewed-by: Jeffrey Hugo
Re: [PATCH 03/11] accel/ivpu: Add dvfs_mode file to debugfs
On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote: From: Tomasz Rusinowicz Add new debugfs file to set dvfs_mode FW boot parameter and restart the FW to allow experimenting with DVFS (dynamic voltage & frequency scaling). Signed-off-by: Tomasz Rusinowicz Signed-off-by: Stanislaw Gruszka Reviewed-by: Jeffrey Hugo
Re: [PATCH 02/11] accel/ivpu: Remove unneeded drm_driver dectaration
$SUBJECT has a spelling error of "declaration" On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote: Cleanup drm_driver declaration leftover. Reviewed-by: Krystian Pradzynski Signed-off-by: Stanislaw Gruszka Reviewed-by: Jeffrey Hugo
Re: [PATCH 01/11] accel/ivpu: Update FW API
On 10/25/2023 3:43 AM, Stanislaw Gruszka wrote: From: Krystian Pradzynski Bump boot API to 4.20 Bump JSM API to 3.15 Signed-off-by: Krystian Pradzynski Reviewed-by: Stanislaw Gruszka Signed-off-by: Stanislaw Gruszka --- drivers/accel/ivpu/ivpu_jsm_msg.c | 17 ++ drivers/accel/ivpu/vpu_boot_api.h | 90 - drivers/accel/ivpu/vpu_jsm_api.h | 309 -- 3 files changed, 392 insertions(+), 24 deletions(-) diff --git a/drivers/accel/ivpu/ivpu_jsm_msg.c b/drivers/accel/ivpu/ivpu_jsm_msg.c index 0c2fe7142024..35a689475c68 100644 --- a/drivers/accel/ivpu/ivpu_jsm_msg.c +++ b/drivers/accel/ivpu/ivpu_jsm_msg.c @@ -36,6 +36,17 @@ const char *ivpu_jsm_msg_type_to_str(enum vpu_ipc_msg_type type) IVPU_CASE_TO_STR(VPU_JSM_MSG_DESTROY_CMD_QUEUE); IVPU_CASE_TO_STR(VPU_JSM_MSG_SET_CONTEXT_SCHED_PROPERTIES); IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_REGISTER_DB); + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_RESUME_CMDQ); + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SUSPEND_CMDQ); + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_RESUME_CMDQ_RSP); + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SUSPEND_CMDQ_DONE); + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SET_SCHEDULING_LOG); + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SET_SCHEDULING_LOG_RSP); + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_SCHEDULING_LOG_NOTIFICATION); + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_ENGINE_RESUME); + IVPU_CASE_TO_STR(VPU_JSM_MSG_HWS_RESUME_ENGINE_DONE); + IVPU_CASE_TO_STR(VPU_JSM_MSG_STATE_DUMP); + IVPU_CASE_TO_STR(VPU_JSM_MSG_STATE_DUMP_RSP); IVPU_CASE_TO_STR(VPU_JSM_MSG_BLOB_DEINIT); IVPU_CASE_TO_STR(VPU_JSM_MSG_DYNDBG_CONTROL); IVPU_CASE_TO_STR(VPU_JSM_MSG_JOB_DONE); @@ -65,6 +76,12 @@ const char *ivpu_jsm_msg_type_to_str(enum vpu_ipc_msg_type type) IVPU_CASE_TO_STR(VPU_JSM_MSG_SET_CONTEXT_SCHED_PROPERTIES_RSP); IVPU_CASE_TO_STR(VPU_JSM_MSG_BLOB_DEINIT_DONE); IVPU_CASE_TO_STR(VPU_JSM_MSG_DYNDBG_CONTROL_RSP); + IVPU_CASE_TO_STR(VPU_JSM_MSG_PWR_D0I3_ENTER); + IVPU_CASE_TO_STR(VPU_JSM_MSG_PWR_D0I3_ENTER_DONE); + IVPU_CASE_TO_STR(VPU_JSM_MSG_DCT_ENABLE); + IVPU_CASE_TO_STR(VPU_JSM_MSG_DCT_ENABLE_DONE); + IVPU_CASE_TO_STR(VPU_JSM_MSG_DCT_DISABLE); + IVPU_CASE_TO_STR(VPU_JSM_MSG_DCT_DISABLE_DONE); } #undef IVPU_CASE_TO_STR diff --git a/drivers/accel/ivpu/vpu_boot_api.h b/drivers/accel/ivpu/vpu_boot_api.h index 6b71be92ba65..04c954258563 100644 --- a/drivers/accel/ivpu/vpu_boot_api.h +++ b/drivers/accel/ivpu/vpu_boot_api.h @@ -11,7 +11,10 @@ * The bellow values will be used to construct the version info this way: * fw_bin_header->api_version[VPU_BOOT_API_VER_ID] = (VPU_BOOT_API_VER_MAJOR << 16) | * VPU_BOOT_API_VER_MINOR; - * VPU_BOOT_API_VER_PATCH will be ignored. KMD and compatibility is not affected if this changes. + * VPU_BOOT_API_VER_PATCH will be ignored. KMD and compatibility is not affected if this changes + * This information is collected by using vpuip_2/application/vpuFirmware/make_std_fw_image.py + * If a header is missing this info we ignore the header, if a header is missing or contains + * partial info a build error will be generated. */ /* @@ -24,12 +27,12 @@ * Minor version changes when API backward compatibility is preserved. * Resets to 0 if Major version is incremented. */ -#define VPU_BOOT_API_VER_MINOR 12 +#define VPU_BOOT_API_VER_MINOR 20 /* * API header changed (field names, documentation, formatting) but API itself has not been changed */ -#define VPU_BOOT_API_VER_PATCH 2 +#define VPU_BOOT_API_VER_PATCH 4 /* * Index in the API version table @@ -63,6 +66,12 @@ struct vpu_firmware_header { /* Size of memory require for firmware execution */ u32 runtime_size; u32 shave_nn_fw_size; + /* Size of primary preemption buffer. */ + u32 preemption_buffer_1_size; + /* Size of secondary preemption buffer. */ + u32 preemption_buffer_2_size; + /* Space reserved for future preemption-related fields. */ + u32 preemption_reserved[6]; }; /* @@ -89,6 +98,14 @@ enum VPU_BOOT_L2_CACHE_CFG_TYPE { VPU_BOOT_L2_CACHE_CFG_NUM = 2 }; +/** VPU MCA ECC signalling mode. By default, no signalling is used */ This does not look like valid kernel-doc. Maybe you wanted "/*" instead? +enum VPU_BOOT_MCA_ECC_SIGNAL_TYPE { + VPU_BOOT_MCA_ECC_NONE = 0, + VPU_BOOT_MCA_ECC_CORR = 1, + VPU_BOOT_MCA_ECC_FATAL = 2, + VPU_BOOT_MCA_ECC_BOTH = 3 Personal preference, but having the "=" and the interget values all line up vetrically would make this a bit more plesant to look at. +}; + /** * Logging destinations. * @@ -131,9 +148,11 @@ enum vpu_trace_destination { #define VPU_TRACE_PROC_BIT_ACT_SHV_3 22 #define VPU_TRACE_PROC_NO_OF_HW_DEVS 23 -/* KMB HW component IDs are sequential, so define first and last IDs. */ -#define
Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
On 10/27/23 09:17, Boris Brezillon wrote: Hi Danilo, On Thu, 26 Oct 2023 18:13:00 +0200 Danilo Krummrich wrote: + + /** +* @update_job_credits: Called once the scheduler is considering this +* job for execution. +* +* Drivers may use this to update the job's submission credits, which is +* useful to e.g. deduct the number of native fences which have been +* signaled meanwhile. +* +* The callback must either return the new number of submission credits +* for the given job, or zero if no update is required. +* +* This callback is optional. +*/ + u32 (*update_job_credits)(struct drm_sched_job *sched_job); I'm copying my late reply to v2 here so it doesn't get lost: I keep thinking it'd be simpler to make this a void function that updates s_job->submission_credits directly. I also don't see the problem with doing a sanity check on job->submission_credits. I mean, if the driver is doing something silly, you can't do much to prevent it anyway, except warn the user that something wrong has happened. If you want to WARN_ON(job->submission_credits == 0 || job->submission_credits > job_old_submission_credits); that's fine. But none of this sanity checking has to do with the function prototype/semantics, and I'm still not comfortable with this 0 => no-change. If there's no change, we should just leave job->submission_credits unchanged (or return job->submission_credits) instead of inventing a new special case. If we can avoid letting drivers change fields of generic structures directly without any drawbacks I think we should avoid it. Currently, drivers shouldn't have the need to mess with job->credits directly. The initial value is set through drm_sched_job_init() and is updated through the return value of update_job_credits(). I'm fine getting rid of the 0 => no-change semantics though. Instead we can just WARN() on 0. However, if we do that I'd also want to change it for drm_sched_job_init() (where 0 currently defaults to 1) such that we accept 0, but WARN() accordingly. I think it's consequent to either consistently give 0 a different meaning or just accept it but WARN() on it. Regards, Boris
Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
On 2023-10-27 04:25, Boris Brezillon wrote: > Hi Danilo, > > On Thu, 26 Oct 2023 18:13:00 +0200 > Danilo Krummrich wrote: > >> Currently, job flow control is implemented simply by limiting the number >> of jobs in flight. Therefore, a scheduler is initialized with a credit >> limit that corresponds to the number of jobs which can be sent to the >> hardware. >> >> This implies that for each job, drivers need to account for the maximum >> job size possible in order to not overflow the ring buffer. >> >> However, there are drivers, such as Nouveau, where the job size has a >> rather large range. For such drivers it can easily happen that job >> submissions not even filling the ring by 1% can block subsequent >> submissions, which, in the worst case, can lead to the ring run dry. >> >> In order to overcome this issue, allow for tracking the actual job size >> instead of the number of jobs. Therefore, add a field to track a job's >> credit count, which represents the number of credits a job contributes >> to the scheduler's credit limit. >> >> Signed-off-by: Danilo Krummrich >> --- >> Changes in V2: >> == >> - fixed up influence on scheduling fairness due to consideration of a job's >> size >> - If we reach a ready entity in drm_sched_select_entity() but can't >> actually >> queue a job from it due to size limitations, just give up and go to >> sleep >> until woken up due to a pending job finishing, rather than continue to >> try >> other entities. >> - added a callback to dynamically update a job's credits (Boris) > > This callback seems controversial. I'd suggest dropping it, so the > patch can be merged. Sorry, why is it controversial? (I did read back-and-forth above, but it wasn't clear why it is /controversial/.) I believe only drivers are privy to changes in the credit availability as their firmware and hardware executes new jobs and finishes others, and so this "update" here is essential--leaving it only to prepare_job() wouldn't quite fulfill the vision of why the credit mechanism introduced by this patch in the first place. -- Regards, Luben OpenPGP_0x4C15479431A334AF.asc Description: OpenPGP public key OpenPGP_signature.asc Description: OpenPGP digital signature
Re: drm/panel: panel-simple power-off sequencing
Hi, On Fri, Oct 27, 2023 at 5:30 AM Jonas Mark (BT-FS/ENG1-GRB) wrote: > > > I think I've looked at this exact case before and then realized that > > there's a better solution. At least in all cases I looked at the > > "enable-gpio" you're talking about was actually better modeled as a > > _backlight_ enable GPIO. The "backlight" is turned off before panel- > > simple's disable() function is called (see drm_panel_disable(). > > So if you move the GPIO to the backlight and add a "disable" delay > > then you're all set. > > > > Does that work for you? Does it make sense for this GPIO to be modeled > > as a backlight GPIO? > > In combination with setting the "disable" delay this works *. Yet, it > feels wrong. > > *: backlight-pwm only accepts one GPIO but that can be easily resolved. > > It feels wrong that the backlight driver takes over part of the panel > control. On top, it still needs cooperation of the panel driver for the > proper timing. I would first ask you to take another look before saying that it's wrong to put the enable pin in the backlight driver. At least on most displays I've seen (though I've spent most time looking at eDP), the backlight and panel are really not separable entities. Take a look at the ASCII art timing diagram in Documentation/devicetree/bindings/display/panel/panel-edp.yaml, for instance. This is typical of eDP panel diagrams and it includes _both_ the backlight and the display timings. In general, it's always made sense for me to think of the "LED_EN" line in that diagram as the backlight enable. > Lastly, it relies on the current behavior of drm-panel > that the panel driver is prepared/ enabled first and then the backlight > is switched on; and the other way around at power off. That current behavior is not random and I don't think it would be possible for it to change. Too many things rely on the current order. > We think that actually more panels in products are affected: We have > three panels from three different vendors (Sharp, Powertip, and Tianma) > and only one is visually affected. Yet, all of them specify a number of > vsyncs after de-asserting the enable GPIO. If you're certain that your enable GPIO really shouldn't be modeled with the backlight, then IMO you should submit a patch to panel-simple that allows an "enable" GPIO to be controlled in the "enable" function. We'd have to have a discussion about how to best do this. The fact that the existing "enable" GPIO is considered a "power enable" predates my involvement in the driver. In other words I think it's always been in the "prepare/unprepare" functions. It always felt wrong to me, too. ;-) I guess the "easiest" (though a bit ugly) solution is to either add a per-panel boolean flag that says whether that panel wants the enable GPIO controlled in enable/disable or prepare/enable. Another solution might be to introduce a 2nd GPIO, though you'd have to think about what to call it since the existing one is kinda stuck as "enable" given the DT bindings. I guess a 3rd solution would be to audit users and see if anyone actually needs the current "enable" GPIO as it is and whether those cases would be better modeled as a GPIO-controlled regulator. Of course, if you have to change how boards model this, then you start getting into the argument about DT backward compatibility. I guess, in summary, I'm hoping you'll look again and find that this really is a backlight enable. If not, I'd probably advocate for a per-panel boolean. FWIW, looking a bit at the history and going back to 2014 in commit f673c37ec453 ("drm/panel: simple: Support delays in panel functions"): * The backlight calls used to be made directly from panel-simple * The "unprepare" delay was documented as "the time (in milliseconds) that it takes for the panel to power itself down completely" and makes me believe that, even originally, it was about not turning the panel back on before it fully turns off (T12 in the eDP timing diagram I pointed at earlier). * The "enable" GPIO has been controlled from prepare/unprepare since those functions were introduced in commit 613a633e7a56 ("drm/panel: simple: Add proper definition for prepare and unprepare"). It kinda feels like the problem originated here... -Doug
Re: [PATCH drm-misc-next v3] drm/sched: implement dynamic job-flow control
On 10/27/23 10:25, Boris Brezillon wrote: Hi Danilo, On Thu, 26 Oct 2023 18:13:00 +0200 Danilo Krummrich wrote: Currently, job flow control is implemented simply by limiting the number of jobs in flight. Therefore, a scheduler is initialized with a credit limit that corresponds to the number of jobs which can be sent to the hardware. This implies that for each job, drivers need to account for the maximum job size possible in order to not overflow the ring buffer. However, there are drivers, such as Nouveau, where the job size has a rather large range. For such drivers it can easily happen that job submissions not even filling the ring by 1% can block subsequent submissions, which, in the worst case, can lead to the ring run dry. In order to overcome this issue, allow for tracking the actual job size instead of the number of jobs. Therefore, add a field to track a job's credit count, which represents the number of credits a job contributes to the scheduler's credit limit. Signed-off-by: Danilo Krummrich --- Changes in V2: == - fixed up influence on scheduling fairness due to consideration of a job's size - If we reach a ready entity in drm_sched_select_entity() but can't actually queue a job from it due to size limitations, just give up and go to sleep until woken up due to a pending job finishing, rather than continue to try other entities. - added a callback to dynamically update a job's credits (Boris) This callback seems controversial. I'd suggest dropping it, so the patch can be merged. I don't think we should drop it just for the sake of moving forward. If there are objections we'll discuss them. I've seen good reasons why the drivers you are working on require this, while, following the discussion, I have *not* seen any reasons to drop it. It helps simplifying driver code and doesn't introduce any complexity or overhead to existing drivers. Regards, Boris - renamed 'units' to 'credits' - fixed commit message and comments as requested by Luben Changes in V3: == - rebased onto V7 of the "DRM scheduler changes for Xe" series by Matt - move up drm_sched_can_queue() instead of adding a forward declaration (Boris) - add a drm_sched_available_credits() helper (Boris) - adjust control flow in drm_sched_rq_select_entity_fifo() to Luben's proposal - re-phrase a few comments and fix a typo (Luben) - change naming of all structures credit fields and function parameters to the following scheme - drm_sched_job::credits - drm_gpu_scheduler::credit_limit - drm_gpu_scheduler::credit_count - drm_sched_init(..., u32 credit_limit, ...) - drm_sched_job_init(..., u32 credits, ...) - add proper documentation for the scheduler's job-flow control mechanism
Re: [PATCH v2] accel/ivpu: Disable PLL after VPU IP reset during FLR
On 10/24/2023 10:53 AM, Stanislaw Gruszka wrote: From: Jacek Lawrynowicz IP reset has to followed by ivpu_pll_disable() to properly enter reset state. Fixes: 828d63042aec ("accel/ivpu: Don't enter d0i3 during FLR") Cc: sta...@vger.kernel.org Signed-off-by: Jacek Lawrynowicz Reviewed-by: Stanislaw Gruszka Signed-off-by: Stanislaw Gruszka Reviewed-by: Jeffrey Hugo
Re: [RFC PATCH v2 03/17] drm/vkms: Create separate Kconfig file for VKMS
On Thursday, October 19th, 2023 at 23:21, Harry Wentland wrote: > +++ b/drivers/gpu/drm/vkms/Kconfig > @@ -0,0 +1,15 @@ > +# SPDX-License-Identifier: GPL-2.0+ It seems like the original Kconfig uses GPL-2.0-only. I think it'd be safer to just re-use the exact same license here? With that fixed: Reviewed-by: Simon Ser
Re: [RFC PATCH v2 01/17] drm/atomic: Allow get_value for immutable properties on atomic drivers
Have you seen the comment on top? * Atomic drivers should never call this function directly, the core will read * out property values through the various ->atomic_get_property callbacks. It seems like atomic drivers shouldn't call drm_object_property_get_value() at all?
Re: [RFC PATCH v2 02/17] drm: Don't treat 0 as -1 in drm_fixp2int_ceil
Reviewed-by: Simon Ser