date:20231013

Re: [PATCH v3 09/16] platform/x86/amd/pmf: Add facility to dump TA inputs

2023-10-13 Thread kernel test robot

Hi Shyam,

kernel test robot noticed the following build warnings:

[auto build test WARNING on hid/for-next]
[also build test WARNING on drm-misc/drm-misc-next linus/master v6.6-rc5 
next-20231013]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Shyam-Sundar-S-K/platform-x86-amd-pmf-Add-PMF-TEE-interface/20231010-210347
base:   https://git.kernel.org/pub/scm/linux/kernel/git/hid/hid.git for-next
patch link:
https://lore.kernel.org/r/20231010125917.138225-10-Shyam-sundar.S-k%40amd.com
patch subject: [PATCH v3 09/16] platform/x86/amd/pmf: Add facility to dump TA 
inputs
config: x86_64-allyesconfig 
(https://download.01.org/0day-ci/archive/20231014/202310141247.22coajca-...@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20231014/202310141247.22coajca-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202310141247.22coajca-...@intel.com/

All warnings (new ones prefixed by >>):

   In file included from include/linux/printk.h:564,
from include/linux/kernel.h:30,
from arch/x86/include/asm/percpu.h:27,
from arch/x86/include/asm/preempt.h:6,
from include/linux/preempt.h:79,
from include/linux/spinlock.h:56,
from include/linux/mmzone.h:8,
from include/linux/gfp.h:7,
from include/linux/slab.h:16,
from include/linux/resource_ext.h:11,
from include/linux/acpi.h:13,
from drivers/platform/x86/amd/pmf/pmf.h:14,
from drivers/platform/x86/amd/pmf/sps.c:11:
   drivers/platform/x86/amd/pmf/sps.c: In function 'amd_pmf_dump_sps_defaults':
   drivers/platform/x86/amd/pmf/sps.c:50:65: error: implicit declaration of 
function 'source_as_str' [-Werror=implicit-function-declaration]
  50 | pr_debug("--- Source:%s Mode:%s ---\n", 
source_as_str(i), slider_as_str(j));
 | 
^
   include/linux/dynamic_debug.h:224:29: note: in definition of macro 
'__dynamic_func_call_cls'
 224 | func(, ##__VA_ARGS__);   \
 | ^~~
   include/linux/dynamic_debug.h:250:9: note: in expansion of macro 
'_dynamic_func_call_cls'
 250 | _dynamic_func_call_cls(_DPRINTK_CLASS_DFLT, fmt, func, 
##__VA_ARGS__)
 | ^~
   include/linux/dynamic_debug.h:269:9: note: in expansion of macro 
'_dynamic_func_call'
 269 | _dynamic_func_call(fmt, __dynamic_pr_debug, \
 | ^~
   include/linux/printk.h:579:9: note: in expansion of macro 'dynamic_pr_debug'
 579 | dynamic_pr_debug(fmt, ##__VA_ARGS__)
 | ^~~~
   drivers/platform/x86/amd/pmf/sps.c:50:25: note: in expansion of macro 
'pr_debug'
  50 | pr_debug("--- Source:%s Mode:%s ---\n", 
source_as_str(i), slider_as_str(j));
 | ^~~~
>> drivers/platform/x86/amd/pmf/sps.c:50:34: warning: format '%s' expects 
>> argument of type 'char *', but argument 3 has type 'int' [-Wformat=]
  50 | pr_debug("--- Source:%s Mode:%s ---\n", 
source_as_str(i), slider_as_str(j));
 |  ^
   include/linux/printk.h:345:21: note: in definition of macro 'pr_fmt'
 345 | #define pr_fmt(fmt) fmt
 | ^~~
   include/linux/dynamic_debug.h:248:9: note: in expansion of macro 
'__dynamic_func_call_cls'
 248 | __dynamic_func_call_cls(__UNIQUE_ID(ddebug), cls, fmt, func, 
##__VA_ARGS__)
 | ^~~
   include/linux/dynamic_debug.h:250:9: note: in expansion of macro 
'_dynamic_func_call_cls'
 250 | _dynamic_func_call_cls(_DPRINTK_CLASS_DFLT, fmt, func, 
##__VA_ARGS__)
 | ^~
   include/linux/dynamic_debug.h:269:9: note: in expansion of macro 
'_dynamic_func_call'
 269 | _dynamic_func_call(fmt, __dynamic_pr_debug, \
 | ^~
   include/linux/printk.h:579:9: note: in expansion of macro 'dynamic_pr_debug'
 579 | dynamic_pr_debug(fmt, ##__VA_ARGS__)
 | ^~~~
   drivers/platform/x86/amd/pmf/sps

Re: [PATCH v5 7/7] drm/sched: Add helper to queue TDR immediately for current and future jobs

2023-10-13 Thread Luben Tuikov

On 2023-10-11 19:58, Matthew Brost wrote:
> Add helper to queue TDR immediately for current and future jobs. This is
> used in Xe, a new Intel GPU driver, to trigger a TDR to cleanup a
> drm_scheduler that encounter errors.

I think the best (most optimal) thing to do is to remove the last sentence
mentioning Xe. It is irrelevant to this patch. This patch is functional
as is, and worth having it as is.

So it's best to have just:

Add a helper whereby a driver can invoke TDR immediately.

Also remove "for current and future jobs" from the title, as it is
implied by how TDR works. We want to say less.

drm/sched: Add a helper to queue TDR immediately

These are only GPU scheduler changes, worth having on their own. The fact
that a new (future as of this moment) driver (Xe) would use them is irrelevant
at the moment. Other drivers (new, current?) would most likely end up using the 
changes
of these patches, and these changes go in on their own merit.

Regards,
Luben

> 
> v2:
>  - Drop timeout args, rename function, use mod delayed work (Luben)
> v3:
>  - s/XE/Xe (Luben)
>  - present tense in commit message (Luben)
>  - Adjust comment for drm_sched_tdr_queue_imm (Luben)
> 
> Cc: Luben Tuikov 
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 18 +-
>  include/drm/gpu_scheduler.h|  1 +
>  2 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index c4d5c3d265a8..f2846745b067 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -431,7 +431,7 @@ static void drm_sched_start_timeout(struct 
> drm_gpu_scheduler *sched)
>  
>   if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>   !list_empty(>pending_list))
> - queue_delayed_work(sched->timeout_wq, >work_tdr, 
> sched->timeout);
> + mod_delayed_work(sched->timeout_wq, >work_tdr, 
> sched->timeout);
>  }
>  
>  static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
> @@ -441,6 +441,22 @@ static void drm_sched_start_timeout_unlocked(struct 
> drm_gpu_scheduler *sched)
>   spin_unlock(>job_list_lock);
>  }
>  
> +/**
> + * drm_sched_tdr_queue_imm: - immediately start job timeout handler
> + *
> + * @sched: scheduler for which the timeout handling should be started.
> + *
> + * Start timeout handling immediately for the named scheduler.
> + */
> +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched)
> +{
> + spin_lock(>job_list_lock);
> + sched->timeout = 0;
> + drm_sched_start_timeout(sched);
> + spin_unlock(>job_list_lock);
> +}
> +EXPORT_SYMBOL(drm_sched_tdr_queue_imm);
> +
>  /**
>   * drm_sched_fault - immediately start timeout handler
>   *
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 625ffe040de3..998b32b8d212 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -568,6 +568,7 @@ void drm_sched_entity_modify_sched(struct 
> drm_sched_entity *entity,
>   struct drm_gpu_scheduler **sched_list,
> unsigned int num_sched_list);
>  
> +void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched);
>  void drm_sched_job_cleanup(struct drm_sched_job *job);
>  void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched);
>  bool drm_sched_wqueue_ready(struct drm_gpu_scheduler *sched);

Re: [PATCH v5 6/7] drm/sched: Add drm_sched_start_timeout_unlocked helper

2023-10-13 Thread Luben Tuikov

On 2023-10-11 19:58, Matthew Brost wrote:
> Also add a lockdep assert to drm_sched_start_timeout.
> 
> Signed-off-by: Matthew Brost 
> Reviewed-by: Luben Tuikov 

I don't remember sending a Reviewed-by email to this patch.

I'll add the R-V to the commit when I apply and push this patch,
after replying with a R-V email.

Regards,
Luben

> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 23 +--
>  1 file changed, 13 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index cf4c23db7547..c4d5c3d265a8 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -427,11 +427,20 @@ static void drm_sched_job_done_cb(struct dma_fence *f, 
> struct dma_fence_cb *cb)
>   */
>  static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
>  {
> + lockdep_assert_held(>job_list_lock);
> +
>   if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
>   !list_empty(>pending_list))
>   queue_delayed_work(sched->timeout_wq, >work_tdr, 
> sched->timeout);
>  }
>  
> +static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
> +{
> + spin_lock(>job_list_lock);
> + drm_sched_start_timeout(sched);
> + spin_unlock(>job_list_lock);
> +}
> +
>  /**
>   * drm_sched_fault - immediately start timeout handler
>   *
> @@ -544,11 +553,8 @@ static void drm_sched_job_timedout(struct work_struct 
> *work)
>   spin_unlock(>job_list_lock);
>   }
>  
> - if (status != DRM_GPU_SCHED_STAT_ENODEV) {
> - spin_lock(>job_list_lock);
> - drm_sched_start_timeout(sched);
> - spin_unlock(>job_list_lock);
> - }
> + if (status != DRM_GPU_SCHED_STAT_ENODEV)
> + drm_sched_start_timeout_unlocked(sched);
>  }
>  
>  /**
> @@ -674,11 +680,8 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, 
> bool full_recovery)
>   drm_sched_job_done(s_job, -ECANCELED);
>   }
>  
> - if (full_recovery) {
> - spin_lock(>job_list_lock);
> - drm_sched_start_timeout(sched);
> - spin_unlock(>job_list_lock);
> - }
> + if (full_recovery)
> + drm_sched_start_timeout_unlocked(sched);
>  
>   drm_sched_wqueue_start(sched);
>  }

Re: [PATCH v5 5/7] drm/sched: Split free_job into own work item

2023-10-13 Thread Luben Tuikov

On 2023-10-11 19:58, Matthew Brost wrote:
> Rather than call free_job and run_job in same work item have a dedicated
> work item for each. This aligns with the design and intended use of work
> queues.
> 
> v2:
>- Test for DMA_FENCE_FLAG_TIMESTAMP_BIT before setting
>  timestamp in free_job() work item (Danilo)
> v3:
>   - Drop forward dec of drm_sched_select_entity (Boris)
>   - Return in drm_sched_run_job_work if entity NULL (Boris)
> v4:
>   - Replace dequeue with peek and invert logic (Luben)
>   - Wrap to 100 lines (Luben)
>   - Update comments for *_queue / *_queue_if_ready functions (Luben)
> 
> Signed-off-by: Matthew Brost 

I wasn't able to apply this patch on top of drm-misc-next at a48e2cc92835fa.

Create a branch off of a *clean* drm-misc-next and rebase/reapply/cherry-pick 
all
7 patches on top of that clean drm-misc-next branch. You should also run
scripts/checkpatch.pl --strict on all your patches, or integrate it into the 
precommit
hook, see githooks(5), so you don't have to run it manually.

Set format.useAutobase to whenAble somewhere in your Git configs (global/local),
or use --base=auto to git-format-patch when you format your patches before
git-send-email-ing them.

Repost your patches.

The base commit will be added to the bottom of the cover letter. It should
be an ancestor (or tip) of drm-misc-next (git branch --contains  --list 
etc.).
If it is not, your base tree wasn't clean, and you should redo this process.
-- 
Regards,
Luben

> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 287 +++--
>  include/drm/gpu_scheduler.h|   8 +-
>  2 files changed, 178 insertions(+), 117 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index 7dbb3392124d..cf4c23db7547 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -213,11 +213,12 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
>   * drm_sched_rq_select_entity_rr - Select an entity which could provide a 
> job to run
>   *
>   * @rq: scheduler run queue to check.
> + * @peek: Just find, don't set to current.
>   *
>   * Try to find a ready entity, returns NULL if none found.
>   */
>  static struct drm_sched_entity *
> -drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
> +drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq, bool peek)
>  {
>   struct drm_sched_entity *entity;
>  
> @@ -227,8 +228,10 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
>   if (entity) {
>   list_for_each_entry_continue(entity, >entities, list) {
>   if (drm_sched_entity_is_ready(entity)) {
> - rq->current_entity = entity;
> - reinit_completion(>entity_idle);
> + if (!peek) {
> + rq->current_entity = entity;
> + reinit_completion(>entity_idle);
> + }
>   spin_unlock(>lock);
>   return entity;
>   }
> @@ -238,8 +241,10 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
>   list_for_each_entry(entity, >entities, list) {
>  
>   if (drm_sched_entity_is_ready(entity)) {
> - rq->current_entity = entity;
> - reinit_completion(>entity_idle);
> + if (!peek) {
> + rq->current_entity = entity;
> + reinit_completion(>entity_idle);
> + }
>   spin_unlock(>lock);
>   return entity;
>   }
> @@ -257,11 +262,12 @@ drm_sched_rq_select_entity_rr(struct drm_sched_rq *rq)
>   * drm_sched_rq_select_entity_fifo - Select an entity which provides a job 
> to run
>   *
>   * @rq: scheduler run queue to check.
> + * @peek: Just find, don't set to current.
>   *
>   * Find oldest waiting ready entity, returns NULL if none found.
>   */
>  static struct drm_sched_entity *
> -drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
> +drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq, bool peek)
>  {
>   struct rb_node *rb;
>  
> @@ -271,8 +277,10 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>  
>   entity = rb_entry(rb, struct drm_sched_entity, rb_tree_node);
>   if (drm_sched_entity_is_ready(entity)) {
> - rq->current_entity = entity;
> - reinit_completion(>entity_idle);
> + if (!peek) {
> + rq->current_entity = entity;
> + reinit_completion(>entity_idle);
> + }
>   break;
>   }
>   }
> @@ -282,13 +290,98 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq)
>  }
>  
>  /**
> - *

Re: [PATCH v5 4/7] drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy

2023-10-13 Thread Luben Tuikov

On 2023-10-11 19:58, Matthew Brost wrote:
> DRM_SCHED_POLICY_SINGLE_ENTITY creates a 1 to 1 relationship between
> scheduler and entity. No priorities or run queue used in this mode.
> Intended for devices with firmware schedulers.
> 
> v2:
>   - Drop sched / rq union (Luben)
> v3:
>   - Don't pick entity if stopped in drm_sched_select_entity (Danilo)
> v4:
>   - Rework if statement in drm_sched_entity_init (Luben)
>   - Update comment for drm_sched_entity_to_scheduler (Luben)
>   - Reword a few things in DOC comment (Luben)
>   - Do not check sched policy in for statement (Luben)
> 
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 70 ++-
>  drivers/gpu/drm/scheduler/sched_fence.c  |  2 +-
>  drivers/gpu/drm/scheduler/sched_main.c   | 86 ++--
>  include/drm/gpu_scheduler.h  |  8 +++
>  4 files changed, 131 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
> b/drivers/gpu/drm/scheduler/sched_entity.c
> index cf42e2265d64..1ef3883764f9 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -83,6 +83,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>   memset(entity, 0, sizeof(struct drm_sched_entity));
>   INIT_LIST_HEAD(>list);
>   entity->rq = NULL;
> + entity->single_sched = NULL;
>   entity->guilty = guilty;
>   entity->num_sched_list = num_sched_list;
>   entity->priority = priority;
> @@ -90,8 +91,18 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>   RCU_INIT_POINTER(entity->last_scheduled, NULL);
>   RB_CLEAR_NODE(>rb_tree_node);
>  
> - if(num_sched_list)
> - entity->rq = _list[0]->sched_rq[entity->priority];
> +

^ This adds an extra blank line and scripts/checkpatch.pl rejects it.

> + if (num_sched_list) {
> + if (sched_list[0]->sched_policy !=
> + DRM_SCHED_POLICY_SINGLE_ENTITY) {
> + entity->rq = _list[0]->sched_rq[entity->priority];
> + } else if (num_sched_list == 1 && 
> !sched_list[0]->single_entity) {
> + sched_list[0]->single_entity = entity;
> + entity->single_sched = sched_list[0];
> + } else {
> + return -EINVAL;
> + }
> + }
>  
>   init_completion(>entity_idle);
>  
> @@ -124,7 +135,8 @@ void drm_sched_entity_modify_sched(struct 
> drm_sched_entity *entity,
>   struct drm_gpu_scheduler **sched_list,
>   unsigned int num_sched_list)
>  {
> - WARN_ON(!num_sched_list || !sched_list);
> + WARN_ON(!num_sched_list || !sched_list ||
> + !!entity->single_sched);
>  
>   entity->sched_list = sched_list;
>   entity->num_sched_list = num_sched_list;
> @@ -231,13 +243,15 @@ static void drm_sched_entity_kill(struct 
> drm_sched_entity *entity)
>  {
>   struct drm_sched_job *job;
>   struct dma_fence *prev;
> + bool single_entity = !!entity->single_sched;
>  
> - if (!entity->rq)
> + if (!entity->rq && !single_entity)
>   return;
>  
>   spin_lock(>rq_lock);
>   entity->stopped = true;
> - drm_sched_rq_remove_entity(entity->rq, entity);
> + if (!single_entity)
> + drm_sched_rq_remove_entity(entity->rq, entity);
>   spin_unlock(>rq_lock);
>  
>   /* Make sure this entity is not used by the scheduler at the moment */
> @@ -259,6 +273,20 @@ static void drm_sched_entity_kill(struct 
> drm_sched_entity *entity)
>   dma_fence_put(prev);
>  }
>  
> +/**
> + * drm_sched_entity_to_scheduler - Schedule entity to GPU scheduler
> + * @entity: scheduler entity
> + *
> + * Returns GPU scheduler for the entity
> + */
> +struct drm_gpu_scheduler *
> +drm_sched_entity_to_scheduler(struct drm_sched_entity *entity)
> +{
> + bool single_entity = !!entity->single_sched;
> +
> + return single_entity ? entity->single_sched : entity->rq->sched;
> +}
> +
>  /**
>   * drm_sched_entity_flush - Flush a context entity
>   *
> @@ -276,11 +304,12 @@ long drm_sched_entity_flush(struct drm_sched_entity 
> *entity, long timeout)
>   struct drm_gpu_scheduler *sched;
>   struct task_struct *last_user;
>   long ret = timeout;
> + bool single_entity = !!entity->single_sched;
>  
> - if (!entity->rq)
> + if (!entity->rq && !single_entity)
>   return 0;
>  
> - sched = entity->rq->sched;
> + sched = drm_sched_entity_to_scheduler(entity);
>   /**
>* The client will not queue more IBs during this fini, consume existing
>* queued IBs or discard them on SIGKILL
> @@ -373,7 +402,7 @@ static void drm_sched_entity_wakeup(struct dma_fence *f,
>   container_of(cb, struct drm_sched_entity, cb);
>  
>   drm_sched_entity_clear_dep(f, cb);
> -

Re: [PATCH v5 3/7] drm/sched: Move schedule policy to scheduler

2023-10-13 Thread Luben Tuikov

On 2023-10-11 19:58, Matthew Brost wrote:
> Rather than a global modparam for scheduling policy, move the scheduling
> policy to scheduler so user can control each scheduler policy.
> 
> v2:
>   - s/DRM_SCHED_POLICY_MAX/DRM_SCHED_POLICY_COUNT (Luben)
>   - Only include policy in scheduler (Luben)
> v3:
>   - use a ternary operator as opposed to an if-control (Luben)
>   - s/DRM_SCHED_POLICY_DEFAULT/DRM_SCHED_POLICY_UNSET/ (Luben)
>   - s/default_drm_sched_policy/drm_sched_policy_default/ (Luben)
>   - Update commit message (Boris)
>   - Fix v3d build (CI)
>   - s/bad_policies/drm_sched_policy_mismatch/ (Luben)
>   - Don't update modparam doc (Luben)
> v4:
>   - Fix alignment in msm_ringbuffer_new (Luben / checkpatch)
> 
> Reviewed-by: Luben Tuikov 
> Signed-off-by: Matthew Brost 

Reviewed-by: Luben Tuikov 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  1 +
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c|  3 ++-
>  drivers/gpu/drm/lima/lima_sched.c  |  3 ++-
>  drivers/gpu/drm/msm/msm_ringbuffer.c   |  2 +-
>  drivers/gpu/drm/nouveau/nouveau_sched.c|  3 ++-
>  drivers/gpu/drm/panfrost/panfrost_job.c|  3 ++-
>  drivers/gpu/drm/scheduler/sched_entity.c   | 24 ++
>  drivers/gpu/drm/scheduler/sched_main.c | 19 -
>  drivers/gpu/drm/v3d/v3d_sched.c| 15 +-
>  include/drm/gpu_scheduler.h| 20 --
>  10 files changed, 68 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index b54c4d771104..e4e6f91450a4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2283,6 +2283,7 @@ static int amdgpu_device_init_schedulers(struct 
> amdgpu_device *adev)
>  ring->num_hw_submission, 0,
>  timeout, adev->reset_domain->wq,
>  ring->sched_score, ring->name,
> +DRM_SCHED_POLICY_UNSET,
>  adev->dev);
>   if (r) {
>   DRM_ERROR("Failed to create scheduler on ring %s.\n",
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index 618a804ddc34..15b0e2f1abe5 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -137,7 +137,8 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
>   ret = drm_sched_init(>sched, _sched_ops, NULL,
>etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
>msecs_to_jiffies(500), NULL, NULL,
> -  dev_name(gpu->dev), gpu->dev);
> +  dev_name(gpu->dev), DRM_SCHED_POLICY_UNSET,
> +  gpu->dev);
>   if (ret)
>   return ret;
>  
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index 8d858aed0e56..50c2075228aa 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -491,7 +491,8 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, 
> const char *name)
>   return drm_sched_init(>base, _sched_ops, NULL, 1,
> lima_job_hang_limit,
> msecs_to_jiffies(timeout), NULL,
> -   NULL, name, pipe->ldev->dev);
> +   NULL, name, DRM_SCHED_POLICY_UNSET,
> +   pipe->ldev->dev);
>  }
>  
>  void lima_sched_pipe_fini(struct lima_sched_pipe *pipe)
> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c 
> b/drivers/gpu/drm/msm/msm_ringbuffer.c
> index 1097f8e93d6b..173ad2f17c50 100644
> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> @@ -97,7 +97,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu 
> *gpu, int id,
>   ret = drm_sched_init(>sched, _sched_ops, NULL,
>num_hw_submissions, 0, sched_timeout,
>NULL, NULL, to_msm_bo(ring->bo)->name,
> -  gpu->dev->dev);
> +  DRM_SCHED_POLICY_UNSET, gpu->dev->dev);
>   if (ret) {
>   goto fail;
>   }
> diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c 
> b/drivers/gpu/drm/nouveau/nouveau_sched.c
> index 4c959dec42b3..c4e09d2e77f9 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_sched.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
> @@ -437,7 +437,8 @@ int nouveau_sched_init(struct nouveau_drm *drm)
>  
>   return drm_sched_init(sched, _sched_ops, NULL,
> NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit,
> -   NULL, NULL, "nouveau_sched", drm->dev->dev);
> +   NULL, NULL, "nouveau_sched",
> +

Re: [PATCH v5 2/7] drm/sched: Convert drm scheduler to use a work queue rather than kthread

2023-10-13 Thread Luben Tuikov

On 2023-10-11 19:58, Matthew Brost wrote:
> In Xe, the new Intel GPU driver, a choice has made to have a 1 to 1
> mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
> seems a bit odd but let us explain the reasoning below.
> 
> 1. In Xe the submission order from multiple drm_sched_entity is not
> guaranteed to be the same completion even if targeting the same hardware
> engine. This is because in Xe we have a firmware scheduler, the GuC,
> which allowed to reorder, timeslice, and preempt submissions. If a using
> shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
> apart as the TDR expects submission order == completion order. Using a
> dedicated drm_gpu_scheduler per drm_sched_entity solve this problem.
> 
> 2. In Xe submissions are done via programming a ring buffer (circular
> buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the
> limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
> control on the ring for free.
> 
> A problem with this design is currently a drm_gpu_scheduler uses a
> kthread for submission / job cleanup. This doesn't scale if a large
> number of drm_gpu_scheduler are used. To work around the scaling issue,
> use a worker rather than kthread for submission / job cleanup.
> 
> v2:
>   - (Rob Clark) Fix msm build
>   - Pass in run work queue
> v3:
>   - (Boris) don't have loop in worker
> v4:
>   - (Tvrtko) break out submit ready, stop, start helpers into own patch
> v5:
>   - (Boris) default to ordered work queue
> v6:
>   - (Luben / checkpatch) fix alignment in msm_ringbuffer.c
>   - (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue
>   - (Luben) Update comment for drm_sched_wqueue_enqueue
>   - (Luben) Positive check for submit_wq in drm_sched_init
>   - (Luben) s/alloc_submit_wq/own_submit_wq
> 
> Signed-off-by: Matthew Brost 

Reviewed-by: Luben Tuikov 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   2 +-
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c|   2 +-
>  drivers/gpu/drm/lima/lima_sched.c  |   2 +-
>  drivers/gpu/drm/msm/msm_ringbuffer.c   |   7 +-
>  drivers/gpu/drm/nouveau/nouveau_sched.c|   2 +-
>  drivers/gpu/drm/panfrost/panfrost_job.c|   2 +-
>  drivers/gpu/drm/scheduler/sched_main.c | 118 ++---
>  drivers/gpu/drm/v3d/v3d_sched.c|  10 +-
>  include/drm/gpu_scheduler.h|  14 ++-
>  9 files changed, 82 insertions(+), 77 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index b1aafe815f28..b54c4d771104 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2279,7 +2279,7 @@ static int amdgpu_device_init_schedulers(struct 
> amdgpu_device *adev)
>   break;
>   }
>  
> - r = drm_sched_init(>sched, _sched_ops,
> + r = drm_sched_init(>sched, _sched_ops, NULL,
>  ring->num_hw_submission, 0,
>  timeout, adev->reset_domain->wq,
>  ring->sched_score, ring->name,
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index 345fec6cb1a4..618a804ddc34 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
>  {
>   int ret;
>  
> - ret = drm_sched_init(>sched, _sched_ops,
> + ret = drm_sched_init(>sched, _sched_ops, NULL,
>etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
>msecs_to_jiffies(500), NULL, NULL,
>dev_name(gpu->dev), gpu->dev);
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index ffd91a5ee299..8d858aed0e56 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, 
> const char *name)
>  
>   INIT_WORK(>recover_work, lima_sched_recover_work);
>  
> - return drm_sched_init(>base, _sched_ops, 1,
> + return drm_sched_init(>base, _sched_ops, NULL, 1,
> lima_job_hang_limit,
> msecs_to_jiffies(timeout), NULL,
> NULL, name, pipe->ldev->dev);
> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c 
> b/drivers/gpu/drm/msm/msm_ringbuffer.c
> index 40c0bc35a44c..1097f8e93d6b 100644
> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> @@ -94,9 +94,10 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu 
> *gpu, int id,
>/* currently managing hangcheck ourselves: */
>   sched_timeout = MAX_SCHEDULE_TIMEOUT;
>  
> - ret = drm_sched_init(>sched, _sched_ops,
> -

Re: [PATCH v5 1/7] drm/sched: Add drm_sched_wqueue_* helpers

2023-10-13 Thread Luben Tuikov

On 2023-10-11 19:58, Matthew Brost wrote:
> Add scheduler wqueue ready, stop, and start helpers to hide the
> implementation details of the scheduler from the drivers.
> 
> v2:
>   - s/sched_wqueue/sched_wqueue (Luben)
>   - Remove the extra white line after the return-statement (Luben)
>   - update drm_sched_wqueue_ready comment (Luben)
> 
> Signed-off-by: Matthew Brost 
> Cc: Luben Tuikov 

Cc comes before S-O-B, but I can fix this when applying it, so don't worry about
this patch anymore. I'll also add Link: and so on, but this is all automated for
me so don't worry about it.

Reviewed-by: Luben Tuikov 
-- 
Regards,
Luben

> ---
>  .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c   |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c   | 15 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 12 +++---
>  drivers/gpu/drm/msm/adreno/adreno_device.c|  6 ++-
>  drivers/gpu/drm/scheduler/sched_main.c| 39 ++-
>  include/drm/gpu_scheduler.h   |  3 ++
>  6 files changed, 59 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> index 625db444df1c..10d56979fe3b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
> @@ -290,7 +290,7 @@ static int suspend_resume_compute_scheduler(struct 
> amdgpu_device *adev, bool sus
>   for (i = 0; i < adev->gfx.num_compute_rings; i++) {
>   struct amdgpu_ring *ring = >gfx.compute_ring[i];
>  
> - if (!(ring && ring->sched.thread))
> + if (!(ring && drm_sched_wqueue_ready(>sched)))
>   continue;
>  
>   /* stop secheduler and drain ring. */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index a4faea4fa0b5..a4c0bb358db7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -1659,9 +1659,9 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file 
> *m, void *unused)
>   for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
>   struct amdgpu_ring *ring = adev->rings[i];
>  
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_wqueue_ready(>sched))
>   continue;
> - kthread_park(ring->sched.thread);
> + drm_sched_wqueue_stop(>sched);
>   }
>  
>   seq_puts(m, "run ib test:\n");
> @@ -1675,9 +1675,9 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file 
> *m, void *unused)
>   for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
>   struct amdgpu_ring *ring = adev->rings[i];
>  
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_wqueue_ready(>sched))
>   continue;
> - kthread_unpark(ring->sched.thread);
> + drm_sched_wqueue_start(>sched);
>   }
>  
>   up_write(>reset_domain->sem);
> @@ -1897,7 +1897,8 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 
> val)
>  
>   ring = adev->rings[val];
>  
> - if (!ring || !ring->funcs->preempt_ib || !ring->sched.thread)
> + if (!ring || !ring->funcs->preempt_ib ||
> + !drm_sched_wqueue_ready(>sched))
>   return -EINVAL;
>  
>   /* the last preemption failed */
> @@ -1915,7 +1916,7 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 
> val)
>   goto pro_end;
>  
>   /* stop the scheduler */
> - kthread_park(ring->sched.thread);
> + drm_sched_wqueue_stop(>sched);
>  
>   /* preempt the IB */
>   r = amdgpu_ring_preempt_ib(ring);
> @@ -1949,7 +1950,7 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 
> val)
>  
>  failure:
>   /* restart the scheduler */
> - kthread_unpark(ring->sched.thread);
> + drm_sched_wqueue_start(>sched);
>  
>   up_read(>reset_domain->sem);
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 2b8356699f23..b1aafe815f28 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4588,7 +4588,7 @@ bool amdgpu_device_has_job_running(struct amdgpu_device 
> *adev)
>   for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>   struct amdgpu_ring *ring = adev->rings[i];
>  
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_wqueue_ready(>sched))
>   continue;
>  
>   spin_lock(>sched.job_list_lock);
> @@ -4727,7 +4727,7 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device 
> *adev,
>   for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
>   struct amdgpu_ring *ring = adev->rings[i];
>  
> - if (!ring || !ring->sched.thread)
> + if (!ring || !drm_sched_wqueue_ready(>sched))
>

Re: [PATCH drm-misc-next v2] drm/sched: implement dynamic job-flow control

2023-10-13 Thread Luben Tuikov

On 2023-10-11 22:10, Danilo Krummrich wrote:
> 
> 
> On 10/12/23 03:52, Luben Tuikov wrote:
>> Hi,
>>
>> Thanks for fixing the title and submitting a v2 of this patch. Comments 
>> inlined below.
>>
>> On 2023-10-09 18:35, Danilo Krummrich wrote:
>>> Currently, job flow control is implemented simply by limiting the number
>>> of jobs in flight. Therefore, a scheduler is initialized with a
>>> submission limit that corresponds to the number of jobs which can be
>>> sent to the hardware.
>>>
>>> This implies that for each job, drivers need to account for the maximum
>>> job size possible in order to not overflow the ring buffer.
>>>
>>> However, there are drivers, such as Nouveau, where the job size has a
>>> rather large range. For such drivers it can easily happen that job
>>> submissions not even filling the ring by 1% can block subsequent
>>> submissions, which, in the worst case, can lead to the ring run dry.
>>>
>>> In order to overcome this issue, allow for tracking the actual job size
>>> instead of the number of jobs. Therefore, add a field to track a job's
>>> submission credits, which represents the number of credits a job
>>> contributes to the scheduler's submission limit.
>>>
>>> Signed-off-by: Danilo Krummrich 
>>> ---
>>> Changes in V2:
>>> ==
>>>- fixed up influence on scheduling fairness due to consideration of a 
>>> job's
>>>  size
>>>  - If we reach a ready entity in drm_sched_select_entity() but can't 
>>> actually
>>>queue a job from it due to size limitations, just give up and go to 
>>> sleep
>>>until woken up due to a pending job finishing, rather than continue 
>>> to try
>>>other entities.
>>>- added a callback to dynamically update a job's credits (Boris)
>>>- renamed 'units' to 'credits'
>>>- fixed commit message and comments as requested by Luben
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |   2 +-
>>>   drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |   2 +-
>>>   drivers/gpu/drm/lima/lima_sched.c |   2 +-
>>>   drivers/gpu/drm/msm/msm_gem_submit.c  |   2 +-
>>>   drivers/gpu/drm/nouveau/nouveau_sched.c   |   2 +-
>>>   drivers/gpu/drm/panfrost/panfrost_drv.c   |   2 +-
>>>   .../gpu/drm/scheduler/gpu_scheduler_trace.h   |   2 +-
>>>   drivers/gpu/drm/scheduler/sched_entity.c  |   5 +-
>>>   drivers/gpu/drm/scheduler/sched_main.c| 101 +-
>>>   drivers/gpu/drm/v3d/v3d_gem.c |   2 +-
>>>   include/drm/gpu_scheduler.h   |  33 --
>>>   11 files changed, 115 insertions(+), 40 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> index 78476bc75b4e..d54daaf64bf1 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> @@ -115,7 +115,7 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct 
>>> amdgpu_vm *vm,
>>> if (!entity)
>>> return 0;
>>>   
>>> -   return drm_sched_job_init(&(*job)->base, entity, owner);
>>> +   return drm_sched_job_init(&(*job)->base, entity, 1, owner);
>>>   }
>>>   
>>>   int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev,
>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
>>> b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>>> index 45403ea38906..74a446711207 100644
>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
>>> @@ -538,7 +538,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, 
>>> void *data,
>>>   
>>> ret = drm_sched_job_init(>sched_job,
>>>  >sched_entity[args->pipe],
>>> -submit->ctx);
>>> +1, submit->ctx);
>>> if (ret)
>>> goto err_submit_put;
>>>   
>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
>>> b/drivers/gpu/drm/lima/lima_sched.c
>>> index 50c2075228aa..5dc6678e1eb9 100644
>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>> @@ -123,7 +123,7 @@ int lima_sched_task_init(struct lima_sched_task *task,
>>> for (i = 0; i < num_bos; i++)
>>> drm_gem_object_get([i]->base.base);
>>>   
>>> -   err = drm_sched_job_init(>base, >base, vm);
>>> +   err = drm_sched_job_init(>base, >base, 1, vm);
>>> if (err) {
>>> kfree(task->bos);
>>> return err;
>>> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
>>> b/drivers/gpu/drm/msm/msm_gem_submit.c
>>> index 3f1aa4de3b87..6d230c38e4f5 100644
>>> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
>>> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
>>> @@ -48,7 +48,7 @@ static struct msm_gem_submit *submit_create(struct 
>>> drm_device *dev,
>>> return ERR_PTR(ret);
>>> }
>>>   
>>> -   ret = drm_sched_job_init(>base, queue->entity, queue);
>>> +   ret = drm_sched_job_init(>base, queue->entity, 1, queue);
>>> if (ret) {
>>>

[PATCH v5 3/3] drm/i915/gt: Timeout when waiting for idle in suspending

2023-10-13 Thread Alan Previn

When suspending, add a timeout when calling
intel_gt_pm_wait_for_idle else if we have a lost
G2H event that holds a wakeref (which would be
indicative of a bug elsewhere in the driver),
driver will at least complete the suspend-resume
cycle, (albeit not hitting all the targets for
low power hw counters), instead of hanging in the kernel.

Signed-off-by: Alan Previn 
Reviewed-by: Rodrigo Vivi 
Tested-by: Mousumi Jana 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |  7 ++-
 drivers/gpu/drm/i915/gt/intel_gt_pm.h |  7 ++-
 drivers/gpu/drm/i915/intel_wakeref.c  | 14 ++
 drivers/gpu/drm/i915/intel_wakeref.h  |  6 --
 5 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 179d9546865b..4b45a8f7fe5a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -686,7 +686,7 @@ void intel_engines_release(struct intel_gt *gt)
if (!engine->release)
continue;
 
-   intel_wakeref_wait_for_idle(>wakeref);
+   intel_wakeref_wait_for_idle(>wakeref, 0);
GEM_BUG_ON(intel_engine_pm_is_awake(engine));
 
engine->release(engine);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c 
b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
index f5899d503e23..25cb39ba9fdf 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
@@ -306,6 +306,8 @@ int intel_gt_resume(struct intel_gt *gt)
 
 static void wait_for_suspend(struct intel_gt *gt)
 {
+   int final_timeout_ms = (I915_GT_SUSPEND_IDLE_TIMEOUT * 10);
+
if (!intel_gt_pm_is_awake(gt))
return;
 
@@ -318,7 +320,10 @@ static void wait_for_suspend(struct intel_gt *gt)
intel_gt_retire_requests(gt);
}
 
-   intel_gt_pm_wait_for_idle(gt);
+   /* we are suspending, so we shouldn't be waiting forever */
+   if (intel_gt_pm_wait_timeout_for_idle(gt, final_timeout_ms) == 
-ETIMEDOUT)
+   gt_warn(gt, "bailing from %s after %d milisec timeout\n",
+   __func__, final_timeout_ms);
 }
 
 void intel_gt_suspend_prepare(struct intel_gt *gt)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.h 
b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
index b1eeb5b33918..1757ca4c3077 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.h
@@ -68,7 +68,12 @@ static inline void intel_gt_pm_might_put(struct intel_gt *gt)
 
 static inline int intel_gt_pm_wait_for_idle(struct intel_gt *gt)
 {
-   return intel_wakeref_wait_for_idle(>wakeref);
+   return intel_wakeref_wait_for_idle(>wakeref, 0);
+}
+
+static inline int intel_gt_pm_wait_timeout_for_idle(struct intel_gt *gt, int 
timeout_ms)
+{
+   return intel_wakeref_wait_for_idle(>wakeref, timeout_ms);
 }
 
 void intel_gt_pm_init_early(struct intel_gt *gt);
diff --git a/drivers/gpu/drm/i915/intel_wakeref.c 
b/drivers/gpu/drm/i915/intel_wakeref.c
index 623a69089386..f2611c65246b 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.c
+++ b/drivers/gpu/drm/i915/intel_wakeref.c
@@ -113,14 +113,20 @@ void __intel_wakeref_init(struct intel_wakeref *wf,
 "wakeref.work", >work, 0);
 }
 
-int intel_wakeref_wait_for_idle(struct intel_wakeref *wf)
+int intel_wakeref_wait_for_idle(struct intel_wakeref *wf, int timeout_ms)
 {
-   int err;
+   int err = 0;
 
might_sleep();
 
-   err = wait_var_event_killable(>wakeref,
- !intel_wakeref_is_active(wf));
+   if (!timeout_ms)
+   err = wait_var_event_killable(>wakeref,
+ !intel_wakeref_is_active(wf));
+   else if (wait_var_event_timeout(>wakeref,
+   !intel_wakeref_is_active(wf),
+   msecs_to_jiffies(timeout_ms)) < 1)
+   err = -ETIMEDOUT;
+
if (err)
return err;
 
diff --git a/drivers/gpu/drm/i915/intel_wakeref.h 
b/drivers/gpu/drm/i915/intel_wakeref.h
index ec881b097368..302694a780d2 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.h
+++ b/drivers/gpu/drm/i915/intel_wakeref.h
@@ -251,15 +251,17 @@ __intel_wakeref_defer_park(struct intel_wakeref *wf)
 /**
  * intel_wakeref_wait_for_idle: Wait until the wakeref is idle
  * @wf: the wakeref
+ * @timeout_ms: Timeout in ms, 0 means never timeout.
  *
  * Wait for the earlier asynchronous release of the wakeref. Note
  * this will wait for any third party as well, so make sure you only wait
  * when you have control over the wakeref and trust no one else is acquiring
  * it.
  *
- * Return: 0 on success, error code if killed.
+ * Returns 0 on success, -ETIMEDOUT upon a timeout, or the unlikely
+ * error propagation from wait_var_event_killable if timeout_ms is 0.

[PATCH v5 0/3] Resolve suspend-resume racing with GuC destroy-context-worker

2023-10-13 Thread Alan Previn

This series is the result of debugging issues root caused to
races between the GuC's destroyed_worker_func being triggered
vs repeating suspend-resume cycles with concurrent delayed
fence signals for engine-freeing.

The reproduction steps require that an app is launched right
before the start of the suspend cycle where it creates a
new gem context and submits a tiny workload that would
complete in the middle of the suspend cycle. However this
app uses dma-buffer sharing or dma-fence with non-GPU
objects or signals that eventually triggers a FENCE_FREE
via__i915_sw_fence_notify that connects to engines_notify ->
free_engines_rcu -> intel_context_put ->
kref_put(>ref..) that queues the worker after the GuCs
CTB has been disabled (i.e. after i915-gem's suspend-late).

This sequence is a corner-case and required repeating this
app->suspend->resume cycle ~1500 times across 4 identical
systems to see it once. That said, based on above callstack,
it is clear that merely flushing the context destruction worker,
which is obviously missing and needed, isn't sufficient.

Because of that, this series adds additional patches besides
the obvious (Patch #1) flushing of the worker during the
suspend flows. It also includes (Patch #2) closing a race
between sending the context-deregistration H2G vs the CTB
getting disabled in the midst of it (by detecing the failure
and unrolling the guc-lrc-unpin flow) and (Patch #32) not
infinitely waiting in intel_gt_pm_wait_timeout_for_idle
when in the suspend-flow.

NOTE: We are still observing one more wakeref leak from gt
but not necessarilty guc. We are still debugging this so
this series will very likely get rev'd up again.

Changes from prior revs:
   v4: - In Patch #2, change the position of the calls into
 rcu_barrier based on latest testing data. (Alan/Anshuman).
   - In Patch #3, fix the timeout value selection for the
 final gt-pm idle-wait that was incorrectly using a 'ns'
 #define as a milisec timeout.
   v3: - In Patch #3, when deregister_context fails, instead
 of calling intel_gt_pm_put(that might sleep), call
 __intel_wakeref_put (without ASYNC flag) (Rodrigo/Anshuman).
   - In wait_for_suspend add an rcu_barrier before we
 proceed to wait for idle. (Anshuman)
   v2: - Patch #2 Restructure code in guc_lrc_desc_unpin so
 it's more readible to differentiate (1)direct guc-id
 cleanup ..vs (2) sending the H2G ctx-destroy action ..
 vs (3) the unrolling steps if the H2G fails.
   - Patch #2 Add a check to close the race sooner by checking
 for intel_guc_is_ready from destroyed_worker_func.
   - Patch #2 When guc_submission_send_busy_loop gets a
 failure from intel_guc_send_busy_loop, we need to undo
 i.e. decrement the outstanding_submission_g2h.
   - Patch #3 In wait_for_suspend, fix checking of return from
 intel_gt_pm_wait_timeout_for_idle to now use -ETIMEDOUT
 and add documentation for intel_wakeref_wait_for_idle.
 (Rodrigo).

Alan Previn (3):
  drm/i915/guc: Flush context destruction worker at suspend
  drm/i915/guc: Close deregister-context race against CT-loss
  drm/i915/gt: Timeout when waiting for idle in suspending

 drivers/gpu/drm/i915/gem/i915_gem_pm.c| 10 +++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |  7 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.h |  7 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 86 ---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.h |  2 +
 drivers/gpu/drm/i915/gt/uc/intel_uc.c |  2 +
 drivers/gpu/drm/i915/intel_wakeref.c  | 14 ++-
 drivers/gpu/drm/i915/intel_wakeref.h  |  6 +-
 9 files changed, 116 insertions(+), 20 deletions(-)


base-commit: 13e6d61391ba3da0e26ce5c788d1fe26e036294c
-- 
2.39.0

[PATCH v5 1/3] drm/i915/guc: Flush context destruction worker at suspend

2023-10-13 Thread Alan Previn

When suspending, flush the context-guc-id
deregistration worker at the final stages of
intel_gt_suspend_late when we finally call gt_sanitize
that eventually leads down to __uc_sanitize so that
the deregistration worker doesn't fire off later as
we reset the GuC microcontroller.

Signed-off-by: Alan Previn 
Reviewed-by: Rodrigo Vivi 
Tested-by: Mousumi Jana 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h | 2 ++
 drivers/gpu/drm/i915/gt/uc/intel_uc.c | 2 ++
 3 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 2cce5ec1ff00..a5b68f77e494 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1610,6 +1610,11 @@ static void guc_flush_submissions(struct intel_guc *guc)
spin_unlock_irqrestore(_engine->lock, flags);
 }
 
+void intel_guc_submission_flush_work(struct intel_guc *guc)
+{
+   flush_work(>submission_state.destroyed_worker);
+}
+
 static void guc_flush_destroyed_contexts(struct intel_guc *guc);
 
 void intel_guc_submission_reset_prepare(struct intel_guc *guc)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
index c57b29cdb1a6..b6df75622d3b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
@@ -38,6 +38,8 @@ int intel_guc_wait_for_pending_msg(struct intel_guc *guc,
   bool interruptible,
   long timeout);
 
+void intel_guc_submission_flush_work(struct intel_guc *guc);
+
 static inline bool intel_guc_submission_is_supported(struct intel_guc *guc)
 {
return guc->submission_supported;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 98b103375b7a..eb3554cb5ea4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -693,6 +693,8 @@ void intel_uc_suspend(struct intel_uc *uc)
return;
}
 
+   intel_guc_submission_flush_work(guc);
+
with_intel_runtime_pm(_to_gt(uc)->i915->runtime_pm, wakeref) {
err = intel_guc_suspend(guc);
if (err)
-- 
2.39.0

[PATCH v5 2/3] drm/i915/guc: Close deregister-context race against CT-loss

2023-10-13 Thread Alan Previn

If we are at the end of suspend or very early in resume
its possible an async fence signal (via rcu_call) is triggered
to free_engines which could lead us to the execution of
the context destruction worker (after a prior worker flush).

Thus, when suspending, insert rcu_barriers at the start
of i915_gem_suspend (part of driver's suspend prepare) and
again in i915_gem_suspend_late so that all such cases have
completed and context destruction list isn't missing anything.

In destroyed_worker_func, close the race against CT-loss
by checking that CT is enabled before calling into
deregister_destroyed_contexts.

Based on testing, guc_lrc_desc_unpin may still race and fail
as we traverse the GuC's context-destroy list because the
CT could be disabled right before calling GuC's CT send function.

We've witnessed this race condition once every ~6000-8000
suspend-resume cycles while ensuring workloads that render
something onscreen is continuously started just before
we suspend (and the workload is small enough to complete
and trigger the queued engine/context free-up either very
late in suspend or very early in resume).

In such a case, we need to unroll the entire process because
guc-lrc-unpin takes a gt wakeref which only gets released in
the G2H IRQ reply that never comes through in this corner
case. Without the unroll, the taken wakeref is leaked and will
cascade into a kernel hang later at the tail end of suspend in
this function:

   intel_wakeref_wait_for_idle(>wakeref)
   (called by) - intel_gt_pm_wait_for_idle
   (called by) - wait_for_suspend

Thus, do an unroll in guc_lrc_desc_unpin and deregister_destroyed_-
contexts if guc_lrc_desc_unpin fails due to CT send falure.
When unrolling, keep the context in the GuC's destroy-list so
it can get picked up on the next destroy worker invocation
(if suspend aborted) or get fully purged as part of a GuC
sanitization (end of suspend) or a reset flow.

Signed-off-by: Alan Previn 
Signed-off-by: Anshuman Gupta 
Tested-by: Mousumi Jana 
---
 drivers/gpu/drm/i915/gem/i915_gem_pm.c| 10 +++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 81 ---
 2 files changed, 80 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c 
b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
index 0d812f4d787d..3b27218aabe2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
@@ -28,6 +28,13 @@ void i915_gem_suspend(struct drm_i915_private *i915)
GEM_TRACE("%s\n", dev_name(i915->drm.dev));
 
intel_wakeref_auto(>runtime_pm.userfault_wakeref, 0);
+   /*
+* On rare occasions, we've observed the fence completion triggers
+* free_engines asynchronously via rcu_call. Ensure those are done.
+* This path is only called on suspend, so it's an acceptable cost.
+*/
+   rcu_barrier();
+
flush_workqueue(i915->wq);
 
/*
@@ -160,6 +167,9 @@ void i915_gem_suspend_late(struct drm_i915_private *i915)
 * machine in an unusable condition.
 */
 
+   /* Like i915_gem_suspend, flush tasks staged from fence triggers */
+   rcu_barrier();
+
for_each_gt(gt, i915, i)
intel_gt_suspend_late(gt);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index a5b68f77e494..9806b33c8561 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -235,6 +235,13 @@ set_context_destroyed(struct intel_context *ce)
ce->guc_state.sched_state |= SCHED_STATE_DESTROYED;
 }
 
+static inline void
+clr_context_destroyed(struct intel_context *ce)
+{
+   lockdep_assert_held(>guc_state.lock);
+   ce->guc_state.sched_state &= ~SCHED_STATE_DESTROYED;
+}
+
 static inline bool context_pending_disable(struct intel_context *ce)
 {
return ce->guc_state.sched_state & SCHED_STATE_PENDING_DISABLE;
@@ -612,6 +619,8 @@ static int guc_submission_send_busy_loop(struct intel_guc 
*guc,
 u32 g2h_len_dw,
 bool loop)
 {
+   int ret;
+
/*
 * We always loop when a send requires a reply (i.e. g2h_len_dw > 0),
 * so we don't handle the case where we don't get a reply because we
@@ -622,7 +631,11 @@ static int guc_submission_send_busy_loop(struct intel_guc 
*guc,
if (g2h_len_dw)
atomic_inc(>outstanding_submission_g2h);
 
-   return intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);
+   ret = intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);
+   if (ret)
+   atomic_dec(>outstanding_submission_g2h);
+
+   return ret;
 }
 
 int intel_guc_wait_for_pending_msg(struct intel_guc *guc,
@@ -3205,12 +3218,13 @@ static void guc_context_close(struct intel_context *ce)
spin_unlock_irqrestore(>guc_state.lock, flags);
 }
 
-static inline

Re: [PATCH v2 1/2] drm/msm/dsi: use msm_gem_kernel_put to free TX buffer

2023-10-13 Thread Abhinav Kumar





On 10/11/2023 6:29 PM, Dmitry Baryshkov wrote:

Use exiting function to free the allocated GEM object instead of
open-coding it. This has a bonus of internally calling
msm_gem_put_vaddr() to compensate for msm_gem_get_vaddr() in
msm_get_kernel_new().

Fixes: 1e29dff00400 ("drm/msm: Add a common function to free kernel buffer 
objects")
Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/dsi/dsi_host.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)



Reviewed-by: Abhinav Kumar

[PATCH] drm/msm/a6xx: Fix up QMP handling

2023-10-13 Thread Konrad Dybcio

The commit referenced in the Fixes tag had a couple problems (as
pointed out by Dan):

- qmp_put was never called, resulting in refcnt leaks
- failling to acquire the QMP mailbox on A7xx would not undo the probe
  function properly
- the qmp_put call present in the code was unreachable

Fix all of these issues.

Reported-by: Dan Carpenter 
Fixes: 88a0997f2f94 ("drm/msm/a6xx: Send ACD state to QMP at GMU resume")
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 0555a0134fad..8c4900444b2c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1558,6 +1558,9 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
dev_pm_domain_detach(gmu->gxpd, false);
}
 
+   if (!IS_ERR_OR_NULL(gmu->qmp))
+   qmp_put(gmu->qmp);
+
iounmap(gmu->mmio);
if (platform_get_resource_byname(pdev, IORESOURCE_MEM, "rscc"))
iounmap(gmu->rscc);
@@ -1654,6 +1657,7 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
struct adreno_gpu *adreno_gpu = _gpu->base;
struct a6xx_gmu *gmu = _gpu->gmu;
struct platform_device *pdev = of_find_device_by_node(node);
+   struct device_link *link;
int ret;
 
if (!pdev)
@@ -1777,15 +1781,17 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
goto err_mmio;
}
 
-   if (!device_link_add(gmu->dev, gmu->cxpd,
-   DL_FLAG_PM_RUNTIME)) {
+   link = device_link_add(gmu->dev, gmu->cxpd, DL_FLAG_PM_RUNTIME);
+   if (!link) {
ret = -ENODEV;
goto detach_cxpd;
}
 
gmu->qmp = qmp_get(gmu->dev);
-   if (IS_ERR(gmu->qmp) && adreno_is_a7xx(adreno_gpu))
-   return PTR_ERR(gmu->qmp);
+   if (IS_ERR(gmu->qmp) && adreno_is_a7xx(adreno_gpu)) {
+   ret = PTR_ERR(gmu->qmp);
+   goto remove_device_link;
+   }
 
init_completion(>pd_gate);
complete_all(>pd_gate);
@@ -1810,8 +1816,8 @@ int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct 
device_node *node)
 
return 0;
 
-   if (!IS_ERR_OR_NULL(gmu->qmp))
-   qmp_put(gmu->qmp);
+remove_device_link:
+   device_link_del(link);
 
 detach_cxpd:
dev_pm_domain_detach(gmu->cxpd, false);

---
base-commit: e3b18f7200f45d66f7141136c25554ac1e82009b
change-id: 20231014-topic-a7xxv3-d0fa2d187215

Best regards,
-- 
Konrad Dybcio

[PATCH v7b 25/25] drm: restore CONFIG_DRM_USE_DYNAMIC_DEBUG un-BROKEN

2023-10-13 Thread Jim Cromie

Lots of burn-in testing needed before signing, upstreaming.

NOTE: I set default Y to maximize testing by default.
Is there a better way to do this ?

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/Kconfig | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 3caa020391c7..708f5e8cb205 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -55,8 +55,7 @@ config DRM_DEBUG_MM
 
 config DRM_USE_DYNAMIC_DEBUG
bool "use dynamic debug to implement drm.debug"
-   default n
-   depends on BROKEN
+   default y
depends on DRM
depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE
depends on JUMP_LABEL
-- 
2.41.0

[PATCH v7b 20/25] dyndbg: add _DPRINTK_FLAGS_INCL_LOOKUP

2023-10-13 Thread Jim Cromie

dyndbg's dynamic prefixing (by +tmfsl flags) is needlessly expensive.

When an enabled (with +p) pr_debug is called, _DPRINTK_FLAGS_INCL_ANY
prefix decorations are sprintf'd into stack-mem for every call.

This string (or part of it) could be cached once its 1st generated,
and retrieved thereafter, as long as its deleted any time the
callsite's flags are changed afterwards.

So consider the prefix/decoration flags: 'tmfsl', and what should be
in the cache:

-t  thread-id. not part of the "callsite" info, derived from current.
doesn't belong in the cache. it would be wrong.
can be done in outer: dynamic_emit_prefix()

-l  line number
this could be part of the prefix, but would bloat the cache
can also be done in outer: dynamic_emit_prefix()

-mfs  module, function, source-file
we cache these, composed into a sub-string.
they are "lookups", currently to descriptor fields,
could be accessor macros to "compressed" tables.
cache saves more access work.

All enabled together, they compose a prefix string like:

  # outer   -inner--   outer
  "[tid] module:function:sourcfile:line: "

So this patch extracts _DPRINTK_FLAGS_INCL_LOOKUP macro out of
_DPRINTK_FLAGS_INCL_ANY macro, then redefs latter.

Next re-refactor dynamic_emit_prefix inner/outer fns accordingly.

Signed-off-by: Jim Cromie 
---
 include/linux/dynamic_debug.h | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 927cb14f24e0..2237d454bc19 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -40,10 +40,12 @@ struct _ddebug {
 #define _DPRINTK_FLAGS_INCL_SOURCENAME (1<<5)
 #define _DPRINTK_FLAGS_PREFIX_CACHED   (1<<7)
 
-#define _DPRINTK_FLAGS_INCL_ANY\
-   (_DPRINTK_FLAGS_INCL_MODNAME | _DPRINTK_FLAGS_INCL_FUNCNAME |\
-_DPRINTK_FLAGS_INCL_LINENO  | _DPRINTK_FLAGS_INCL_TID |\
+#define _DPRINTK_FLAGS_INCL_LOOKUP \
+   (_DPRINTK_FLAGS_INCL_MODNAME | _DPRINTK_FLAGS_INCL_FUNCNAME |   \
 _DPRINTK_FLAGS_INCL_SOURCENAME)
+#define _DPRINTK_FLAGS_INCL_ANY
\
+   (_DPRINTK_FLAGS_INCL_LINENO | _DPRINTK_FLAGS_INCL_TID | \
+_DPRINTK_FLAGS_INCL_LOOKUP)
 
 #if defined DEBUG
 #define _DPRINTK_FLAGS_DEFAULT _DPRINTK_FLAGS_PRINT
-- 
2.41.0

[PATCH v7b 21/25] dyndbg: refactor *dynamic_emit_prefix

2023-10-13 Thread Jim Cromie

Refactor the split of duties between outer & inner fns.

The outer fn was previously just an inline unlikely forward to inner,
which did all the work.

Now, outer handles +t and +l flags itself, and calls inner only when
_DPRINTK_FLAGS_INCL_LOOKUP is needed.

No functional change.

But it does make the results of the inner-fn more cache-friendly
(fewer entries, reused more often):

1- no spurious [TID] or  noise
2- no LINE-number to bloat the cache (avg 9 pr_debugs/fn)
3- only LOOKUP stuff

Currently LOOKUPs are descriptor-field refs but could be replaced by
accessor functions.  This would allow the __dyndbg_sites section to be
de-duplicated and reclaimed; currently module, filename fields are
~90% repeated.  As the accessors get more expensive, the value of
caching part of the prefix goes up.

Also change inner-fn to return count of extra chars written to the
buffer, and drop "inline" from outer, let the compiler decide.  Maybe
also change name accordingly.

Signed-off-by: Jim Cromie 
---
fixup whitespace
---
 lib/dynamic_debug.c | 39 ++-
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 17eefb35ac96..5825b58043a6 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -777,11 +777,28 @@ static int remaining(int wrote)
return 0;
 }
 
-static char *__dynamic_emit_prefix(const struct _ddebug *desc, char *buf)
+static int __dynamic_emit_prefix(const struct _ddebug *desc, char *buf, int 
pos)
+{
+   if (desc->flags & _DPRINTK_FLAGS_INCL_MODNAME)
+   pos += snprintf(buf + pos, remaining(pos), "%s:",
+   desc->modname);
+   if (desc->flags & _DPRINTK_FLAGS_INCL_FUNCNAME)
+   pos += snprintf(buf + pos, remaining(pos), "%s:",
+   desc->function);
+   if (desc->flags & _DPRINTK_FLAGS_INCL_SOURCENAME)
+   pos += snprintf(buf + pos, remaining(pos), "%s:",
+   trim_prefix(desc->filename));
+   return pos;
+}
+
+static char *dynamic_emit_prefix(struct _ddebug *desc, char *buf)
 {
int pos_after_tid;
int pos = 0;
 
+   if (likely(!(desc->flags & _DPRINTK_FLAGS_INCL_ANY)))
+   return buf;
+
if (desc->flags & _DPRINTK_FLAGS_INCL_TID) {
if (in_interrupt())
pos += snprintf(buf + pos, remaining(pos), " ");
@@ -790,15 +807,10 @@ static char *__dynamic_emit_prefix(const struct _ddebug 
*desc, char *buf)
task_pid_vnr(current));
}
pos_after_tid = pos;
-   if (desc->flags & _DPRINTK_FLAGS_INCL_MODNAME)
-   pos += snprintf(buf + pos, remaining(pos), "%s:",
-   desc->modname);
-   if (desc->flags & _DPRINTK_FLAGS_INCL_FUNCNAME)
-   pos += snprintf(buf + pos, remaining(pos), "%s:",
-   desc->function);
-   if (desc->flags & _DPRINTK_FLAGS_INCL_SOURCENAME)
-   pos += snprintf(buf + pos, remaining(pos), "%s:",
-   trim_prefix(desc->filename));
+
+   if (unlikely(desc->flags & _DPRINTK_FLAGS_INCL_LOOKUP))
+   pos += __dynamic_emit_prefix(desc, buf, pos);
+
if (desc->flags & _DPRINTK_FLAGS_INCL_LINENO)
pos += snprintf(buf + pos, remaining(pos), "%d:",
desc->lineno);
@@ -810,13 +822,6 @@ static char *__dynamic_emit_prefix(const struct _ddebug 
*desc, char *buf)
return buf;
 }
 
-static inline char *dynamic_emit_prefix(struct _ddebug *desc, char *buf)
-{
-   if (unlikely(desc->flags & _DPRINTK_FLAGS_INCL_ANY))
-   return __dynamic_emit_prefix(desc, buf);
-   return buf;
-}
-
 void __dynamic_pr_debug(struct _ddebug *descriptor, const char *fmt, ...)
 {
va_list args;
-- 
2.41.0

[PATCH v7b 23/25] drm: use correct ccflags-y spelling

2023-10-13 Thread Jim Cromie

Incorrectly spelled CFLAGS- failed to add -DDYNAMIC_DEBUG_MODULE,
which broke builds with:

CONFIG_DRM_USE_DYNAMIC_DEBUG=y
CONFIG_DYNAMIC_DEBUG_CORE=y
CONFIG_DYNAMIC_DEBUG=n

Also add subdir-ccflags so that all drivers pick up the addition.

Fixes: 84ec67288c10 ("drm_print: wrap drm_*_dbg in dyndbg descriptor factory 
macro")
Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 215e78e79125..22b1984cc982 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -3,7 +3,8 @@
 # Makefile for the drm device driver.  This driver provides support for the
 # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
 
-CFLAGS-$(CONFIG_DRM_USE_DYNAMIC_DEBUG) += -DDYNAMIC_DEBUG_MODULE
+ccflags-$(CONFIG_DRM_USE_DYNAMIC_DEBUG)+= 
-DDYNAMIC_DEBUG_MODULE
+subdir-ccflags-$(CONFIG_DRM_USE_DYNAMIC_DEBUG) += -DDYNAMIC_DEBUG_MODULE
 
 drm-y := \
drm_aperture.o \
-- 
2.41.0

[PATCH v7b 24/25] drm-drivers: DRM_CLASSMAP_USE in 2nd batch of drivers, helpers

2023-10-13 Thread Jim Cromie

Add a DRM_CLASSMAP_USE declaration to 2nd batch of helpers and *_drv.c
files.  For drivers, add the decl just above the module's PARAMs,
since it identifies the "inherited" drm.debug param.

Note: with CONFIG_DRM_USE_DYNAMIC_DEBUG=y, a module not also declaring
DRM_CLASSMAP_USE will have its class'd prdbgs stuck in the initial
(disabled, but for DEBUG) state.

The stuck sites are evident in /proc/dynamic_debug/control as:

   class:_UNKNOWN_ _id:N# control's last column

rather than a proper "enumeration":

   class:DRM_UT_CORE

This set of updates was found by choosing M for all DRM-config items I
found (not allmodconfig), building & modprobing them, and grepping
"class unknown," control.  There may yet be others.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/drm_gem_shmem_helper.c | 2 ++
 drivers/gpu/drm/gud/gud_drv.c  | 2 ++
 drivers/gpu/drm/mgag200/mgag200_drv.c  | 2 ++
 drivers/gpu/drm/qxl/qxl_drv.c  | 2 ++
 drivers/gpu/drm/radeon/radeon_drv.c| 2 ++
 drivers/gpu/drm/udl/udl_main.c | 2 ++
 drivers/gpu/drm/vkms/vkms_drv.c| 2 ++
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c| 2 ++
 8 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
b/drivers/gpu/drm/drm_gem_shmem_helper.c
index e435f986cd13..066d906e3199 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -23,6 +23,8 @@
 #include 
 #include 
 
+DRM_CLASSMAP_USE(drm_debug_classes);
+
 MODULE_IMPORT_NS(DMA_BUF);
 
 /**
diff --git a/drivers/gpu/drm/gud/gud_drv.c b/drivers/gpu/drm/gud/gud_drv.c
index 9d7bf8ee45f1..5b555045fce4 100644
--- a/drivers/gpu/drm/gud/gud_drv.c
+++ b/drivers/gpu/drm/gud/gud_drv.c
@@ -31,6 +31,8 @@
 
 #include "gud_internal.h"
 
+DRM_CLASSMAP_USE(drm_debug_classes);
+
 /* Only used internally */
 static const struct drm_format_info gud_drm_format_r1 = {
.format = GUD_DRM_FORMAT_R1,
diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.c 
b/drivers/gpu/drm/mgag200/mgag200_drv.c
index abddf37f0ea1..d678eb8e028d 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.c
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.c
@@ -24,6 +24,8 @@ static int mgag200_modeset = -1;
 MODULE_PARM_DESC(modeset, "Disable/Enable modesetting");
 module_param_named(modeset, mgag200_modeset, int, 0400);
 
+DRM_CLASSMAP_USE(drm_debug_classes);
+
 int mgag200_init_pci_options(struct pci_dev *pdev, u32 option, u32 option2)
 {
struct device *dev = >dev;
diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
index b30ede1cf62d..91942ffcc2b4 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.c
+++ b/drivers/gpu/drm/qxl/qxl_drv.c
@@ -65,6 +65,8 @@ module_param_named(modeset, qxl_modeset, int, 0400);
 MODULE_PARM_DESC(num_heads, "Number of virtual crtcs to expose (default 4)");
 module_param_named(num_heads, qxl_num_crtc, int, 0400);
 
+DRM_CLASSMAP_USE(drm_debug_classes);
+
 static struct drm_driver qxl_driver;
 static struct pci_driver qxl_pci_driver;
 
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index fa531493b111..ab29945af657 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -247,6 +247,8 @@ int radeon_cik_support = 1;
 MODULE_PARM_DESC(cik_support, "CIK support (1 = enabled (default), 0 = 
disabled)");
 module_param_named(cik_support, radeon_cik_support, int, 0444);
 
+DRM_CLASSMAP_USE(drm_debug_classes);
+
 static struct pci_device_id pciidlist[] = {
radeon_PCI_IDS
 };
diff --git a/drivers/gpu/drm/udl/udl_main.c b/drivers/gpu/drm/udl/udl_main.c
index 3ebe2ce55dfd..ba57c14454e5 100644
--- a/drivers/gpu/drm/udl/udl_main.c
+++ b/drivers/gpu/drm/udl/udl_main.c
@@ -19,6 +19,8 @@
 
 #define NR_USB_REQUEST_CHANNEL 0x12
 
+DRM_CLASSMAP_USE(drm_debug_classes);
+
 #define MAX_TRANSFER (PAGE_SIZE*16 - BULK_SIZE)
 #define WRITES_IN_FLIGHT (20)
 #define MAX_VENDOR_DESCRIPTOR_SIZE 256
diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
index dd0af086e7fa..086797c4b82b 100644
--- a/drivers/gpu/drm/vkms/vkms_drv.c
+++ b/drivers/gpu/drm/vkms/vkms_drv.c
@@ -39,6 +39,8 @@
 
 static struct vkms_config *default_config;
 
+DRM_CLASSMAP_USE(drm_debug_classes);
+
 static bool enable_cursor = true;
 module_param_named(enable_cursor, enable_cursor, bool, 0444);
 MODULE_PARM_DESC(enable_cursor, "Enable/Disable cursor support");
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 8b24ecf60e3e..9cb6be422621 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -275,6 +275,8 @@ static int vmw_probe(struct pci_dev *, const struct 
pci_device_id *);
 static int vmwgfx_pm_notifier(struct notifier_block *nb, unsigned long val,
  void *ptr);
 
+DRM_CLASSMAP_USE(drm_debug_classes);
+
 MODULE_PARM_DESC(restrict_iommu, "Try to limit IOMMU usage for TTM pages");
 module_param_named(restrict_iommu, vmw_restrict_iommu, int, 0600);

[PATCH v7b 22/25] dyndbg: change WARN_ON to WARN_ON_ONCE

2023-10-13 Thread Jim Cromie

This shouldn't ever happen, and 1st 2 conditions never have.

The 3rd condition did happen, due to corrupt linkage due to a missing
align(8) in DYNDBG_CLASSMAP_USE, on the static struct allocation into
the __dyndbg_class_users section.

Not sure whether changing to _ONCE is appropriate - this is a
module-load activity, so it won't continuously spam syslog.

Signed-off-by: Jim Cromie 
---
undo BUG_ON addition
---
 lib/dynamic_debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 5825b58043a6..2a5eb64dbc27 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -1284,7 +1284,7 @@ static void ddebug_attach_user_module_classes(struct 
ddebug_table *dt,
 */
for_each_boxed_vector(di, class_users, num_class_users, i, cli) {
 
-   if (WARN_ON(!cli || !cli->map || !cli->user_mod_name))
+   if (WARN_ON_ONCE(!cli || !cli->map || !cli->user_mod_name))
continue;
 
if (!strcmp(cli->user_mod_name, dt->mod_name)) {
-- 
2.41.0

[PATCH v7b 19/25] dyndbg: reserve flag bit _DPRINTK_FLAGS_PREFIX_CACHED

2023-10-13 Thread Jim Cromie

Reserve bit 7 to remember that a pr-debug callsite is/was:
- enabled, with +p
- wants a dynamic-prefix, with one+ of module:function:sourcfile
- was previously called
- was thus saved in the cache. NOT YET.

Its unclear whether any cache fetch would be faster than 2-3 field
fetches, but theres another factor; the 3 columns in the __dyndbg
section are highly redundant and compressible, but to get the
compression, we need field accessors, which will rebalance the
tradeoff.

So, for now, its just the bit reservation.

Signed-off-by: Jim Cromie 
---
 include/linux/dynamic_debug.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index f182f95caabb..927cb14f24e0 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -38,6 +38,7 @@ struct _ddebug {
 #define _DPRINTK_FLAGS_INCL_LINENO (1<<3)
 #define _DPRINTK_FLAGS_INCL_TID(1<<4)
 #define _DPRINTK_FLAGS_INCL_SOURCENAME (1<<5)
+#define _DPRINTK_FLAGS_PREFIX_CACHED   (1<<7)
 
 #define _DPRINTK_FLAGS_INCL_ANY\
(_DPRINTK_FLAGS_INCL_MODNAME | _DPRINTK_FLAGS_INCL_FUNCNAME |\
-- 
2.41.0

[PATCH v7b 18/25] dyndbg-doc: add classmap info to howto

2023-10-13 Thread Jim Cromie

Add some basic info on classmap usage and api

cc: linux-...@vger.kernel.org
Signed-off-by: Jim Cromie 
---
v5- adjustments per Randy Dunlap, me
v7b- checkpatch fixes
---
 .../admin-guide/dynamic-debug-howto.rst   | 60 ++-
 1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/dynamic-debug-howto.rst 
b/Documentation/admin-guide/dynamic-debug-howto.rst
index 0b3d39c610d9..028c2cb5b4c5 100644
--- a/Documentation/admin-guide/dynamic-debug-howto.rst
+++ b/Documentation/admin-guide/dynamic-debug-howto.rst
@@ -225,7 +225,6 @@ the ``p`` flag has meaning, other flags are ignored.
 Note the regexp ``^[-+=][fslmpt_]+$`` matches a flags specification.
 To clear all flags at once, use ``=_`` or ``-fslmpt``.
 
-
 Debug messages during Boot Process
 ==
 
@@ -375,3 +374,62 @@ just a shortcut for ``print_hex_dump(KERN_DEBUG)``.
 For ``print_hex_dump_debug()``/``print_hex_dump_bytes()``, format string is
 its ``prefix_str`` argument, if it is constant string; or ``hexdump``
 in case ``prefix_str`` is built dynamically.
+
+Dynamic Debug classmaps
+===
+
+Dyndbg allows selection/grouping of *prdbg* callsites using structural
+info: module, file, function, line.  Classmaps allow authors to add
+their own domain-oriented groupings using class-names.  Classmaps are
+exported, so they referencable from other modules.
+
+  # enable classes individually
+  :#> ddcmd class DRM_UT_CORE +p
+  :#> ddcmd class DRM_UT_KMS +p
+  # or more selectively
+  :#> ddcmd class DRM_UT_CORE module drm +p
+
+The "class FOO" syntax protects class'd prdbgs from generic overwrite::
+
+  # IOW this doesn't wipe any DRM.debug settings
+  :#> ddcmd -p
+
+To support the DRM.debug parameter, DYNDBG_CLASSMAP_PARAM* updates all
+classes in a classmap, mapping param-bits 0..N onto the classes:
+DRM_UT_<*> for the DRM use-case.
+
+Dynamic Debug Classmap API
+==
+
+DYNDBG_CLASSMAP_DEFINE - modules use this to create classmaps, naming
+each of the classes (stringified enum-symbols: "DRM_UT_<*>"), and
+type, and mapping the class-names to consecutive _class_ids.
+
+By doing so, modules tell dyndbg that they are have prdbgs with those
+class_ids, and they authorize dyndbg to accept "class FOO" for the
+module defining that classname.
+
+There are 2 types of classmaps:
+
+ DD_CLASS_TYPE_DISJOINT_BITS: classes are independent, like DRM.debug
+ DD_CLASS_TYPE_LEVEL_NUM: classes are relative, ordered (V3 > V2)
+
+DYNDBG_CLASSMAP_PARAM - refers to a DEFINEd classmap, exposing the set
+of defined classes to manipulation as a group.  This interface
+enforces the relatedness of classes of DD_CLASS_TYPE_LEVEL_NUM typed
+classmaps; all classes are independent in the >control parser itself.
+
+DYNDBG_CLASSMAP_USE - drm drivers invoke this to ref the CLASSMAP that
+drm DEFINEs.  This shares the classmap definition, and authorizes
+dyndbg to apply changes to the user module's class'd pr_debugs.  It
+also tells dyndbg how to initialize the user's prdbgs at modprobe,
+based upon the current setting of the parent's controlling param.
+
+Modules or module-groups (drm & drivers) can define multiple
+classmaps, as long as they share the limited 0..62 per-module-group
+_class_id range, without overlap.
+
+``#define DEBUG`` will enable all pr_debugs in scope, including any
+class'd ones.  This won't be reflected in the PARAM readback value,
+but the pr_debug callsites can be toggled into agreement with the
+param.
-- 
2.41.0

[PATCH v7b 17/25] dyndbg-API: promote DYNDBG_CLASSMAP_PARAM to API

2023-10-13 Thread Jim Cromie

move the DYNDBG_CLASSMAP_PARAM macro from test-dynamic-debug.c into
the header, and refine it, by distinguishing the 2 use cases:

1.DYNDBG_CLASSMAP_PARAM_REF
for DRM, to pass in extern __drm_debug by name.
dyndbg keeps bits in it, so drm can still use it as before

2.DYNDBG_CLASSMAP_PARAM
new user (test_dynamic_debug) doesn't need to share state,
decls a static long unsigned int to store the bitvec.

__DYNDBG_CLASSMAP_PARAM
   bottom layer - allocate,init a ddebug-class-param, module-param-cb.

Also clean up and improve comments in test-code, and add
MODULE_DESCRIPTIONs.

Signed-off-by: Jim Cromie 
---
---
 drivers/gpu/drm/drm_print.c |  8 ++---
 include/drm/drm_print.h |  6 ++--
 include/linux/dynamic_debug.h   | 37 +-
 lib/test_dynamic_debug.c| 56 +++--
 lib/test_dynamic_debug_submod.c |  9 +-
 5 files changed, 69 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index dabcfa0dd279..8f4b609353a5 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -69,12 +69,8 @@ DRM_CLASSMAP_DEFINE(drm_debug_classes, 
DD_CLASS_TYPE_DISJOINT_BITS,
"DRM_UT_DP",
"DRM_UT_DRMRES");
 
-static struct ddebug_class_param drm_debug_bitmap = {
-   .bits = &__drm_debug,
-   .flags = "p",
-   .map = _debug_classes,
-};
-module_param_cb(debug, _ops_dyndbg_classes, _debug_bitmap, 0600);
+DRM_CLASSMAP_PARAM_REF(debug, __drm_debug, drm_debug_classes, p);
+
 #endif
 
 void __drm_puts_coredump(struct drm_printer *p, const char *str)
diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index 706afc97c79c..94d4f5500030 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -322,11 +322,13 @@ enum drm_debug_category {
 };
 
 #ifdef CONFIG_DRM_USE_DYNAMIC_DEBUG
-#define DRM_CLASSMAP_DEFINE(...) DYNDBG_CLASSMAP_DEFINE(__VA_ARGS__)
-#define DRM_CLASSMAP_USE(name)   DYNDBG_CLASSMAP_USE(name)
+#define DRM_CLASSMAP_DEFINE(...)   DYNDBG_CLASSMAP_DEFINE(__VA_ARGS__)
+#define DRM_CLASSMAP_USE(name) DYNDBG_CLASSMAP_USE(name)
+#define DRM_CLASSMAP_PARAM_REF(...)DYNDBG_CLASSMAP_PARAM_REF(__VA_ARGS__)
 #else
 #define DRM_CLASSMAP_DEFINE(...)
 #define DRM_CLASSMAP_USE(name)
+#define DRM_CLASSMAP_PARAM_REF(...)
 #endif
 
 static inline bool drm_debug_enabled_raw(enum drm_debug_category category)
diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 1b027be2cdc4..f182f95caabb 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -91,7 +91,7 @@ struct ddebug_class_map {
  * used to validate a "class FOO .." >control command on the module
  */
 #define DYNDBG_CLASSMAP_DEFINE(_var, _maptype, _base, ...) \
-   const char *_var##_classnames[] = { __VA_ARGS__ };  \
+   static const char *_var##_classnames[] = { __VA_ARGS__ };   \
struct ddebug_class_map __aligned(8) __used \
__section("__dyndbg_classes") _var = {  \
.mod = THIS_MODULE, \
@@ -145,6 +145,41 @@ struct ddebug_class_param {
const struct ddebug_class_map *map;
 };
 
+/**
+ * DYNDBG_CLASSMAP_PARAM - wrap a dyndbg-classmap with a controlling sys-param
+ * @_name  sysfs node name
+ * @_var   name of the struct classmap var defining the controlled classes
+ * @_flags flags to be toggled, typically just 'p'
+ *
+ * Creates a sysfs-param to control the classes defined by the
+ * classmap.  Keeps bits in a private/static
+ */
+#define DYNDBG_CLASSMAP_PARAM(_name, _var, _flags) \
+   static unsigned long _name##_bvec;  \
+   __DYNDBG_CLASSMAP_PARAM(_name, _name##_bvec, _var, _flags)
+
+/**
+ * DYNDBG_CLASSMAP_PARAM_REF - wrap a dyndbg-classmap with a controlling 
sys-param
+ * @_name  sysfs node name
+ * @_bits  name of the module's unsigned long bit-vector, ex: __drm_debug
+ * @_var   name of the struct classmap var defining the controlled classes
+ * @_flags flags to be toggled, typically just 'p'
+ *
+ * Creates a sysfs-param to control the classmap, keeping bitvec in user 
@_bits.
+ * This lets drm use __drm_debug elsewhere too.
+ */
+#define DYNDBG_CLASSMAP_PARAM_REF(_name, _bits, _var, _flags)  \
+   __DYNDBG_CLASSMAP_PARAM(_name, _bits, _var, _flags)
+
+#define __DYNDBG_CLASSMAP_PARAM(_name, _bits, _var, _flags)\
+   static struct ddebug_class_param _name##_##_flags = {   \
+   .bits = &(_bits),   \
+   .flags = #_flags,   \
+   .map = &(_var), \
+   };  \
+   module_param_cb(_name, _ops_dyndbg_classes,   \
+

[PATCH v7b 14/25] dyndbg-API: fix CONFIG_DRM_USE_DYNAMIC_DEBUG regression

2023-10-13 Thread Jim Cromie

DECLARE_DYNDBG_CLASSMAP() has a design error; it fails a basic K
rule: "define once, refer many times".

When DRM_USE_DYNAMIC_DEBUG=y, DECLARE_DYNDBG_CLASSMAP() is used across
DRM core & drivers; they all repeat the same classmap-defn args, which
must match for the modules to respond together when DRM.debug
categories are enabled.

Worse, it causes the CONFIG_DRM_USE_DYNAMIC_DEBUG=Y regression; 1st
drm.ko loads, and dyndbg initializes its DRM.debug callsites, then a
drm-driver loads, but too late - it missed the DRM.debug enablement.

So replace it with 2 macros:
  DYNDBG_CLASSMAP_DEFINE - invoked once from core - drm.ko
  DYNDBG_CLASSMAP_USE- from all drm drivers and helpers.

DYNDBG_CLASSMAP_DEFINE: based on DECLARE_DYNDBG_CLASSMAP, but now it
drops the static on the constructed classmap variable, and exports it
instead.

DYNDBG_CLASSMAP_USE: then refers to the exported var by name:
* used from drivers, helper-mods
* lets us drop the repetitive "classname" args
* fixes 2nd-defn problem
* creates a ddebug_class_user record in new __dyndbg_class_users section
  this allows ddebug_add_module(etal) to handle them per-module.

The distinction, and the usage record, allows dyndbg to initialize the
driver's DRM.debug callsites separately after it is modprobed.

Since DRM now needs updates to use the new macros, it also gets 2
wrappers: DRM_CLASSMAP_DEFINE, DRM_CLASSMAP_USE which declutter the
users by hiding the ifdef CONFIG_DRM_USE_DYNAMIC_DEBUG.

To review, dyndbg's existing __dyndbg_classes[] section does:

. catalogs the classmaps defined by the module (or builtin modules)
. authorizes dyndbg to >control those class'd prdbgs for the module.
. DYNDBG_CLASSMAP_DEFINE(and old one) creates classmaps in this section.

This patch adds __dyndbg_class_users[] section:

. catalogs uses/references to the classmap definitions.
. authorizes dyndbg to >control those class'd prdbgs in ref'g module.
. DYNDBG_CLASSMAP_USE() creates classmap-user records in this section.

Now ddebug_add_module(etal) can handle classmap-uses like (and after)
classmaps; when a dependent module is loaded, its parent's kernel
params are scanned to find the param wired to dyndbg-param-ops, whose
classmap matches the one ref'd by the client.

To support this, theres a few data/header changes:

. new struct ddebug_class_user
  contains: user-module-name, 
  it records drm-driver's use of a classmap in the section, allowing lookup

struct ddebug_info gets 2 new fields to encapsulate the new section:
  class_users, num_class_users.
  set by dynamic_debug_init() for builtins.
  or by kernel/module/main:load_info() for loadable modules.

vmlinux.lds.h: new BOUNDED_SECTION for __dyndbg_class_users

dynamic_debug.c has 2 changes in ddebug_add_module(), ddebug_change():

ddebug_add_module() already calls ddebug_attach_module_classes()
to handle classmaps DEFINEd by a module, now it also calls
ddebug_attach_user_module_classes() to handle USEd classmaps.  To
avoid this work when possible, 1st scan the module's descriptors and
count the number of class'd pr_debugs.

ddebug_attach_user_module_classes() scans the module's class_users
section, follows the refs to the parent's classmap, and calls
ddebug_apply_params() on each.  It also avoids work by checking the
module's class-ct.

ddebug_apply_params(new fn):

It scans module's/builtin kernel-params, calls ddebug_match_apply_kparam
for each to find the params/sysfs-nodes which may be wired to a classmap.

ddebug_match_apply_kparam(new fn):

1st, it tests the kernel-param.ops is dyndbg's; this guarantees that
the attached arg is a struct ddebug_class_param, which has a ref to
the param's state, and to the classmap defining the param's handling.

2nd, it requires that the classmap ref'd by the kparam is the one
we're called for; modules can use many separate classmaps (as
test_dynamic_debug does).

Then apply the "parent" kparam's setting to the dependent module,
using ddebug_apply_class_bitmap().

ddebug_change(and callees) also gets adjustments:

ddebug_find_valid_class(): This does a search over the module's
classmaps, looking for the class FOO echo'd to >control.  So now it
searches over __dyndbg_class_users[] after __dyndbg_classes[].

ddebug_class_name(): return class-names for defined AND used classes.

test_dynamic_debug.c, test_dynamic_debug_submod.c:

This (already) demonstrates the 2 types of classmaps & sysfs-params,
following the 4-part recipe:

1. define an enum for the classmap: DRM.debug has DRM_UT_{CORE,KMS,...}
   multiple classes must share 0-62 classid space.
2. DYNDBG_CLASSMAP_DEFINE(.. DRM_UT_{CORE,KMS,...})
3. DYNDBG_CLASSMAP_PARAM* (classmap)
4. DYNDBG_CLASSMAP_USE()
   by _submod only, skipping 2,3

Move all the enum declarations together, to better explain how they
share the 0..62 class-id space available to a module (non-overlapping
subranges).

reorg macros 2,3 by name.  This gives a tabular format, making it easy
to see the pattern of repetition, and the points of change.

And

[PATCH v7b 16/25] dyndbg: refactor ddebug_classparam_clamp_input

2023-10-13 Thread Jim Cromie

Extract input validation code, from param_set_dyndbg_module_classes()
(the sys-node >handler) to new: ddebug_classparam_clamp_input(kp),
call it from former.  It takes kernel-param arg, so it can complain
about "foo: bad input".

Reuse ddparam_clamp_input(kp) in ddebug_sync_classbits(),
to validate inputs from parent's params, just like our own.
To support that reuse, alter ddebug_sync_classbits() and caller to
pass kp instead of kp->arg.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 70 ++---
 1 file changed, 47 insertions(+), 23 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index c11feca70d6f..17eefb35ac96 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -656,6 +656,30 @@ static int ddebug_apply_class_bitmap(const struct 
ddebug_class_param *dcp,
 
 #define CLASSMAP_BITMASK(width) ((1UL << (width)) - 1)
 
+static void ddebug_class_param_clamp_input(unsigned long *inrep, const struct 
kernel_param *kp)
+{
+   const struct ddebug_class_param *dcp = kp->arg;
+   const struct ddebug_class_map *map = dcp->map;
+
+   switch (map->map_type) {
+   case DD_CLASS_TYPE_DISJOINT_BITS:
+   /* expect bits. mask and warn if too many */
+   if (*inrep & ~CLASSMAP_BITMASK(map->length)) {
+   pr_warn("%s: input: 0x%lx exceeds mask: 0x%lx, 
masking\n",
+   KP_NAME(kp), *inrep, 
CLASSMAP_BITMASK(map->length));
+   *inrep &= CLASSMAP_BITMASK(map->length);
+   }
+   break;
+   case DD_CLASS_TYPE_LEVEL_NUM:
+   /* input is bitpos, of highest verbosity to be enabled */
+   if (*inrep > map->length) {
+   pr_warn("%s: level:%ld exceeds max:%d, clamping\n",
+   KP_NAME(kp), *inrep, map->length);
+   *inrep = map->length;
+   }
+   break;
+   }
+}
 static int param_set_dyndbg_module_classes(const char *instr,
   const struct kernel_param *kp,
   const char *modnm)
@@ -674,26 +698,15 @@ static int param_set_dyndbg_module_classes(const char 
*instr,
pr_err("expecting numeric input, not: %s > %s\n", instr, 
KP_NAME(kp));
return -EINVAL;
}
+   ddebug_class_param_clamp_input(, kp);
 
switch (map->map_type) {
case DD_CLASS_TYPE_DISJOINT_BITS:
-   /* expect bits. mask and warn if too many */
-   if (inrep & ~CLASSMAP_BITMASK(map->length)) {
-   pr_warn("%s: input: 0x%lx exceeds mask: 0x%lx, 
masking\n",
-   KP_NAME(kp), inrep, 
CLASSMAP_BITMASK(map->length));
-   inrep &= CLASSMAP_BITMASK(map->length);
-   }
v2pr_info("bits:0x%lx > %s.%s\n", inrep, modnm ?: "*", 
KP_NAME(kp));
totct += ddebug_apply_class_bitmap(dcp, , *dcp->bits, 
modnm);
*dcp->bits = inrep;
break;
case DD_CLASS_TYPE_LEVEL_NUM:
-   /* input is bitpos, of highest verbosity to be enabled */
-   if (inrep > map->length) {
-   pr_warn("%s: level:%ld exceeds max:%d, clamping\n",
-   KP_NAME(kp), inrep, map->length);
-   inrep = map->length;
-   }
old_bits = CLASSMAP_BITMASK(*dcp->lvl);
new_bits = CLASSMAP_BITMASK(inrep);
v2pr_info("lvl:%ld bits:0x%lx > %s\n", inrep, new_bits, 
KP_NAME(kp));
@@ -1160,16 +1173,27 @@ static const char * const ddebug_classmap_typenames[] = 
{
  ddebug_classmap_typenames[_cm->map_type]);\
})
 
-static void ddebug_sync_classbits(const struct ddebug_class_param *dcp, const 
char *modname)
+static void ddebug_sync_classbits(const struct kernel_param *kp, const char 
*modname)
 {
-   /* clamp initial bitvec, mask off hi-bits */
-   if (*dcp->bits & ~CLASSMAP_BITMASK(dcp->map->length)) {
-   *dcp->bits &= CLASSMAP_BITMASK(dcp->map->length);
-   v2pr_info("preset classbits: %lx\n", *dcp->bits);
+   struct ddebug_class_param *dcp = kp->arg;
+   unsigned long new_bits;
+
+   ddebug_class_param_clamp_input(dcp->bits, kp);
+
+   switch (dcp->map->map_type) {
+   case DD_CLASS_TYPE_DISJOINT_BITS:
+   v2pr_info("  %s: classbits: 0x%lx\n", KP_NAME(kp), *dcp->bits);
+   ddebug_apply_class_bitmap(dcp, dcp->bits, 0UL, modname);
+   break;
+   case DD_CLASS_TYPE_LEVEL_NUM:
+   new_bits = CLASSMAP_BITMASK(*dcp->lvl);
+   v2pr_info("  %s: lvl:%ld bits:0x%lx\n", KP_NAME(kp), *dcp->lvl, 
new_bits);
+   ddebug_apply_class_bitmap(dcp, _bits, 0UL, modname);
+   break;
+   default:
+

[PATCH v7b 15/25] dyndbg: add for_each_boxed_vector

2023-10-13 Thread Jim Cromie

Add a for_each iterator to walk a counted vector member in a struct
(ie the box), and use it to replace 8 open-coded loops.

Signed-off-by: Jim Cromie 
---
v5- parens-on-box-force-precedence
---
 lib/dynamic_debug.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index be49e104ec76..c11feca70d6f 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -158,6 +158,9 @@ static void vpr_info_dq(const struct ddebug_query *query, 
const char *msg)
  _dt->num_class_users);\
})
 
+#define for_each_boxed_vector(_box, _vec, _len, _ct, _curs)\
+   for (_ct = 0, _curs = (_box)->_vec; _ct < (_box)->_len; _ct++, _curs++)
+
 #define __outvar /* filled by callee */
 static struct ddebug_class_map *ddebug_find_valid_class(struct ddebug_table 
const *dt,
const char 
*class_string,
@@ -167,7 +170,7 @@ static struct ddebug_class_map 
*ddebug_find_valid_class(struct ddebug_table cons
struct ddebug_class_user *cli;
int i, idx;
 
-   for (i = 0, map = dt->classes; i < dt->num_classes; i++, map++) {
+   for_each_boxed_vector(dt, classes, num_classes, i, map) {
idx = match_string(map->class_names, map->length, class_string);
if (idx >= 0) {
*class_id = idx + map->base;
@@ -175,7 +178,7 @@ static struct ddebug_class_map 
*ddebug_find_valid_class(struct ddebug_table cons
return map;
}
}
-   for (i = 0, cli = dt->class_users; i < dt->num_class_users; i++, cli++) 
{
+   for_each_boxed_vector(dt, class_users, num_class_users, i, cli) {
idx = match_string(cli->map->class_names, cli->map->length, 
class_string);
if (idx >= 0) {
*class_id = idx + cli->map->base;
@@ -1058,11 +1061,11 @@ static const char *ddebug_class_name(struct 
ddebug_table *dt, struct _ddebug *dp
struct ddebug_class_user *cli = dt->class_users;
int i;
 
-   for (i = 0; i < dt->num_classes; i++, map++)
+   for_each_boxed_vector(dt, classes, num_classes, i, map)
if (class_in_range(dp->class_id, map))
return map->class_names[dp->class_id - map->base];
 
-   for (i = 0; i < dt->num_class_users; i++, cli++)
+   for_each_boxed_vector(dt, class_users, num_class_users, i, cli)
if (class_in_range(dp->class_id, cli->map))
return cli->map->class_names[dp->class_id - 
cli->map->base];
 
@@ -1216,7 +1219,7 @@ static void ddebug_attach_module_classes(struct 
ddebug_table *dt, struct _ddebug
struct ddebug_class_map *cm;
int i, nc = 0;
 
-   for (i = 0, cm = di->classes; i < di->num_classes; i++, cm++) {
+   for_each_boxed_vector(di, classes, num_classes, i, cm) {
 
if (!strcmp(cm->mod_name, dt->mod_name)) {
vpr_cm_info(cm, "classes[%d]:", i);
@@ -1230,7 +1233,7 @@ static void ddebug_attach_module_classes(struct 
ddebug_table *dt, struct _ddebug
vpr_info("module:%s attached %d classes\n", dt->mod_name, nc);
dt->num_classes = nc;
 
-   for (i = 0, cm = dt->classes; i < dt->num_classes; i++, cm++)
+   for_each_boxed_vector(di, classes, num_classes, i, cm)
ddebug_apply_params(cm, cm->mod_name);
 }
 
@@ -1250,7 +1253,7 @@ static void ddebug_attach_user_module_classes(struct 
ddebug_table *dt,
 * module's refs, save to dt.  For loadables, this is the
 * whole array.
 */
-   for (i = 0, cli = di->class_users; i < di->num_class_users; i++, cli++) 
{
+   for_each_boxed_vector(di, class_users, num_class_users, i, cli) {
 
if (WARN_ON(!cli || !cli->map || !cli->user_mod_name))
continue;
@@ -1268,8 +1271,7 @@ static void ddebug_attach_user_module_classes(struct 
ddebug_table *dt,
 
dt->num_class_users = nc;
 
-   /* now iterate dt */
-   for (i = 0, cli = dt->class_users; i < dt->num_class_users; i++, cli++)
+   for_each_boxed_vector(di, class_users, num_class_users, i, cli)
ddebug_apply_params(cli->map, cli->user_mod_name);
 
vpr_dt_info(dt, "attach-client-module: ");
-- 
2.41.0

[PATCH v7b 11/25] dyndbg: tighten fn-sig of ddebug_apply_class_bitmap

2023-10-13 Thread Jim Cromie

old_bits arg is currently a pointer to the input bits, but this could
allow inadvertent changes to the input by the fn.  Disallow this.
And constify new_bits while here.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 8158943b350d..8beb98a831f5 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -593,7 +593,8 @@ static int ddebug_exec_queries(char *query, const char 
*modname)
 
 /* apply a new class-param setting */
 static int ddebug_apply_class_bitmap(const struct ddebug_class_param *dcp,
-unsigned long *new_bits, unsigned long 
*old_bits,
+const unsigned long *new_bits,
+const unsigned long old_bits,
 const char *query_modname)
 {
 #define QUERY_SIZE 128
@@ -602,12 +603,12 @@ static int ddebug_apply_class_bitmap(const struct 
ddebug_class_param *dcp,
int matches = 0;
int bi, ct;
 
-   if (*new_bits != *old_bits)
+   if (*new_bits != old_bits)
v2pr_info("apply bitmap: 0x%lx to: 0x%lx for %s\n", *new_bits,
- *old_bits, query_modname ?: "'*'");
+ old_bits, query_modname ?: "'*'");
 
for (bi = 0; bi < map->length; bi++) {
-   if (test_bit(bi, new_bits) == test_bit(bi, old_bits))
+   if (test_bit(bi, new_bits) == test_bit(bi, _bits))
continue;
 
snprintf(query, QUERY_SIZE, "class %s %c%s", 
map->class_names[bi],
@@ -619,9 +620,9 @@ static int ddebug_apply_class_bitmap(const struct 
ddebug_class_param *dcp,
v2pr_info("bit_%d: %d matches on class: %s -> 0x%lx\n", bi,
  ct, map->class_names[bi], *new_bits);
}
-   if (*new_bits != *old_bits)
+   if (*new_bits != old_bits)
v2pr_info("applied bitmap: 0x%lx to: 0x%lx for %s\n", *new_bits,
- *old_bits, query_modname ?: "'*'");
+ old_bits, query_modname ?: "'*'");
 
return matches;
 }
@@ -678,7 +679,7 @@ static int param_set_dyndbg_classnames(const char *instr, 
const struct kernel_pa
continue;
}
curr_bits ^= BIT(cls_id);
-   totct += ddebug_apply_class_bitmap(dcp, _bits, 
dcp->bits, NULL);
+   totct += ddebug_apply_class_bitmap(dcp, _bits, 
*dcp->bits, NULL);
*dcp->bits = curr_bits;
v2pr_info("%s: changed bit %d:%s\n", KP_NAME(kp), 
cls_id,
  map->class_names[cls_id]);
@@ -688,7 +689,7 @@ static int param_set_dyndbg_classnames(const char *instr, 
const struct kernel_pa
old_bits = CLASSMAP_BITMASK(*dcp->lvl);
curr_bits = CLASSMAP_BITMASK(cls_id + (wanted ? 1 : 0 
));
 
-   totct += ddebug_apply_class_bitmap(dcp, _bits, 
_bits, NULL);
+   totct += ddebug_apply_class_bitmap(dcp, _bits, 
old_bits, NULL);
*dcp->lvl = (cls_id + (wanted ? 1 : 0));
v2pr_info("%s: changed bit-%d: \"%s\" %lx->%lx\n", 
KP_NAME(kp), cls_id,
  map->class_names[cls_id], old_bits, 
curr_bits);
@@ -742,7 +743,7 @@ static int param_set_dyndbg_module_classes(const char 
*instr,
inrep &= CLASSMAP_BITMASK(map->length);
}
v2pr_info("bits:0x%lx > %s.%s\n", inrep, modnm ?: "*", 
KP_NAME(kp));
-   totct += ddebug_apply_class_bitmap(dcp, , dcp->bits, 
modnm);
+   totct += ddebug_apply_class_bitmap(dcp, , *dcp->bits, 
modnm);
*dcp->bits = inrep;
break;
case DD_CLASS_TYPE_LEVEL_NUM:
@@ -755,7 +756,7 @@ static int param_set_dyndbg_module_classes(const char 
*instr,
old_bits = CLASSMAP_BITMASK(*dcp->lvl);
new_bits = CLASSMAP_BITMASK(inrep);
v2pr_info("lvl:%ld bits:0x%lx > %s\n", inrep, new_bits, 
KP_NAME(kp));
-   totct += ddebug_apply_class_bitmap(dcp, _bits, _bits, 
modnm);
+   totct += ddebug_apply_class_bitmap(dcp, _bits, old_bits, 
modnm);
*dcp->lvl = inrep;
break;
default:
-- 
2.41.0

[PATCH v7b 13/25] dyndbg-API: remove DD_CLASS_TYPE_(DISJOINT|LEVEL)_NAMES and code

2023-10-13 Thread Jim Cromie

Remove the NAMED class types; these 2 classmap types accept class
names at the PARAM interface, for example:

  echo +DRM_UT_CORE,-DRM_UT_KMS > /sys/module/drm/parameters/debug_names

The code works, but its only used by test-dynamic-debug, and wasn't
asked for by anyone else, so simplify things for now.

Signed-off-by: Jim Cromie 
---
 include/linux/dynamic_debug.h |  19 ++-
 lib/dynamic_debug.c   | 103 +++---
 lib/test_dynamic_debug.c  |  12 
 3 files changed, 12 insertions(+), 122 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 8116d0a0d33a..8eaf8eabdc8d 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -61,24 +61,13 @@ struct _ddebug {
 enum class_map_type {
DD_CLASS_TYPE_DISJOINT_BITS,
/**
-* DD_CLASS_TYPE_DISJOINT_BITS: classes are independent, one per bit.
-* expecting hex input. Built for drm.debug, basis for other types.
+* DD_CLASS_TYPE_DISJOINT_BITS: classes are independent, mapped to 
bits[0..N].
+* Expects hex input. Built for drm.debug, basis for other types.
 */
DD_CLASS_TYPE_LEVEL_NUM,
/**
-* DD_CLASS_TYPE_LEVEL_NUM: input is numeric level, 0-N.
-* N turns on just bits N-1 .. 0, so N=0 turns all bits off.
-*/
-   DD_CLASS_TYPE_DISJOINT_NAMES,
-   /**
-* DD_CLASS_TYPE_DISJOINT_NAMES: input is a CSV of [+-]CLASS_NAMES,
-* classes are independent, like _DISJOINT_BITS.
-*/
-   DD_CLASS_TYPE_LEVEL_NAMES,
-   /**
-* DD_CLASS_TYPE_LEVEL_NAMES: input is a CSV of [+-]CLASS_NAMES,
-* intended for names like: INFO,DEBUG,TRACE, with a module prefix
-* avoid EMERG,ALERT,CRIT,ERR,WARNING: they're not debug
+* DD_CLASS_TYPE_LEVEL_NUM: input is numeric level, 0..N.
+* Input N turns on bits 0..N-1
 */
 };
 
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 45870a699507..91c8b67fd8f8 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -632,77 +632,6 @@ static int ddebug_apply_class_bitmap(const struct 
ddebug_class_param *dcp,
 
 #define CLASSMAP_BITMASK(width) ((1UL << (width)) - 1)
 
-/* accept comma-separated-list of [+-] classnames */
-static int param_set_dyndbg_classnames(const char *instr, const struct 
kernel_param *kp)
-{
-   const struct ddebug_class_param *dcp = kp->arg;
-   const struct ddebug_class_map *map = dcp->map;
-   unsigned long curr_bits, old_bits;
-   char *cl_str, *p, *tmp;
-   int cls_id, totct = 0;
-   bool wanted;
-
-   cl_str = tmp = kstrdup(instr, GFP_KERNEL);
-   p = strchr(cl_str, '\n');
-   if (p)
-   *p = '\0';
-
-   /* start with previously set state-bits, then modify */
-   curr_bits = old_bits = *dcp->bits;
-   vpr_info("\"%s\" > %s:0x%lx\n", cl_str, KP_NAME(kp), curr_bits);
-
-   for (; cl_str; cl_str = p) {
-   p = strchr(cl_str, ',');
-   if (p)
-   *p++ = '\0';
-
-   if (*cl_str == '-') {
-   wanted = false;
-   cl_str++;
-   } else {
-   wanted = true;
-   if (*cl_str == '+')
-   cl_str++;
-   }
-   cls_id = match_string(map->class_names, map->length, cl_str);
-   if (cls_id < 0) {
-   pr_err("%s unknown to %s\n", cl_str, KP_NAME(kp));
-   continue;
-   }
-
-   /* have one or more valid class_ids of one *_NAMES type */
-   switch (map->map_type) {
-   case DD_CLASS_TYPE_DISJOINT_NAMES:
-   /* the +/- pertains to a single bit */
-   if (test_bit(cls_id, _bits) == wanted) {
-   v3pr_info("no change on %s\n", cl_str);
-   continue;
-   }
-   curr_bits ^= BIT(cls_id);
-   totct += ddebug_apply_class_bitmap(dcp, _bits, 
*dcp->bits, NULL);
-   *dcp->bits = curr_bits;
-   v2pr_info("%s: changed bit %d:%s\n", KP_NAME(kp), 
cls_id,
- map->class_names[cls_id]);
-   break;
-   case DD_CLASS_TYPE_LEVEL_NAMES:
-   /* cls_id = N in 0..max. wanted +/- determines N or N-1 
*/
-   old_bits = CLASSMAP_BITMASK(*dcp->lvl);
-   curr_bits = CLASSMAP_BITMASK(cls_id + (wanted ? 1 : 0 
));
-
-   totct += ddebug_apply_class_bitmap(dcp, _bits, 
old_bits, NULL);
-   *dcp->lvl = (cls_id + (wanted ? 1 : 0));
-   v2pr_info("%s: changed bit-%d: \"%s\" %lx->%lx\n", 
KP_NAME(kp), cls_id,
-

[PATCH v7b 09/25] dyndbg: silence debugs with no-change updates

2023-10-13 Thread Jim Cromie

check for actual changes before announcing them, declutter logs.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index b0e11f6bfaa2..b07aab422604 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -591,7 +591,7 @@ static int ddebug_exec_queries(char *query, const char 
*modname)
return nfound;
 }
 
-/* apply a new bitmap to the sys-knob's current bit-state */
+/* apply a new class-param setting */
 static int ddebug_apply_class_bitmap(const struct ddebug_class_param *dcp,
 unsigned long *new_bits, unsigned long 
*old_bits,
 const char *query_modname)
@@ -602,8 +602,9 @@ static int ddebug_apply_class_bitmap(const struct 
ddebug_class_param *dcp,
int matches = 0;
int bi, ct;
 
-   v2pr_info("apply bitmap: 0x%lx to: 0x%lx for %s\n", *new_bits, 
*old_bits,
- query_modname ?: "");
+   if (*new_bits != *old_bits)
+   v2pr_info("apply bitmap: 0x%lx to: 0x%lx for %s\n", *new_bits,
+ *old_bits, query_modname ?: "'*'");
 
for (bi = 0; bi < map->length; bi++) {
if (test_bit(bi, new_bits) == test_bit(bi, old_bits))
@@ -618,8 +619,9 @@ static int ddebug_apply_class_bitmap(const struct 
ddebug_class_param *dcp,
v2pr_info("bit_%d: %d matches on class: %s -> 0x%lx\n", bi,
  ct, map->class_names[bi], *new_bits);
}
-   v2pr_info("applied bitmap: 0x%lx to: 0x%lx for %s\n", *new_bits, 
*old_bits,
- query_modname ?: "");
+   if (*new_bits != *old_bits)
+   v2pr_info("applied bitmap: 0x%lx to: 0x%lx for %s\n", *new_bits,
+ *old_bits, query_modname ?: "'*'");
 
return matches;
 }
-- 
2.41.0

[PATCH v7b 12/25] dyndbg: reduce verbose=3 messages in ddebug_add_module

2023-10-13 Thread Jim Cromie

The fn currently says "add-module", then "skipping" if the module has
no prdbgs.  Just check 1st and return quietly.

no functional change

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 8beb98a831f5..45870a699507 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -1242,11 +1242,10 @@ static int ddebug_add_module(struct _ddebug_info *di, 
const char *modname)
 {
struct ddebug_table *dt;
 
-   v3pr_info("add-module: %s.%d sites\n", modname, di->num_descs);
-   if (!di->num_descs) {
-   v3pr_info(" skip %s\n", modname);
+   if (!di->num_descs)
return 0;
-   }
+
+   v3pr_info("add-module: %s %d sites\n", modname, di->num_descs);
 
dt = kzalloc(sizeof(*dt), GFP_KERNEL);
if (dt == NULL) {
-- 
2.41.0

[PATCH v7b 08/25] dyndbg: reduce verbose/debug clutter

2023-10-13 Thread Jim Cromie

currently, for verbose=3, these are logged (blank lines for clarity):

 dyndbg: query 0: "class DRM_UT_CORE +p" mod:*
 dyndbg: split into words: "class" "DRM_UT_CORE" "+p"

 dyndbg: op='+'
 dyndbg: flags=0x1
 dyndbg: *flagsp=0x1 *maskp=0x

 dyndbg: parsed: func="" file="" module="" format="" lineno=0-0 class=...
 dyndbg: no matches for query
 dyndbg: no-match: func="" file="" module="" format="" lineno=0-0 class=...
 dyndbg: processed 1 queries, with 0 matches, 0 errs

That is excessive, so this patch:
 - shrinks 3 lines of 2nd stanza to single line
 - drops 1st 2 lines of 3rd stanza
   3rd is like 1st, with result, not procedure.
   2nd is just status, retold in 4th, with more info.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 14 +++---
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index b67c9b137447..b0e11f6bfaa2 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -266,9 +266,6 @@ static int ddebug_change(const struct ddebug_query *query,
}
mutex_unlock(_lock);
 
-   if (!nfound && verbose)
-   pr_info("no matches for query\n");
-
return nfound;
 }
 
@@ -497,7 +494,6 @@ static int ddebug_parse_flags(const char *str, struct 
flag_settings *modifiers)
pr_err("bad flag-op %c, at start of %s\n", *str, str);
return -EINVAL;
}
-   v3pr_info("op='%c'\n", op);
 
for (; *str ; ++str) {
for (i = ARRAY_SIZE(opt_array) - 1; i >= 0; i--) {
@@ -511,7 +507,6 @@ static int ddebug_parse_flags(const char *str, struct 
flag_settings *modifiers)
return -EINVAL;
}
}
-   v3pr_info("flags=0x%x\n", modifiers->flags);
 
/* calculate final flags, mask based upon op */
switch (op) {
@@ -527,7 +522,7 @@ static int ddebug_parse_flags(const char *str, struct 
flag_settings *modifiers)
modifiers->flags = 0;
break;
}
-   v3pr_info("*flagsp=0x%x *maskp=0x%x\n", modifiers->flags, 
modifiers->mask);
+   v3pr_info("op='%c' flags=0x%x maskp=0x%x\n", op, modifiers->flags, 
modifiers->mask);
 
return 0;
 }
@@ -537,7 +532,7 @@ static int ddebug_exec_query(char *query_string, const char 
*modname)
struct flag_settings modifiers = {};
struct ddebug_query query = {};
 #define MAXWORDS 9
-   int nwords, nfound;
+   int nwords;
char *words[MAXWORDS];
 
nwords = ddebug_tokenize(query_string, words, MAXWORDS);
@@ -555,10 +550,7 @@ static int ddebug_exec_query(char *query_string, const 
char *modname)
return -EINVAL;
}
/* actually go and implement the change */
-   nfound = ddebug_change(, );
-   vpr_info_dq(, nfound ? "applied" : "no-match");
-
-   return nfound;
+   return ddebug_change(, );
 }
 
 /* handle multiple queries in query string, continue on error, return
-- 
2.41.0

[PATCH v7b 10/25] dyndbg: tighten ddebug_class_name() 1st arg type

2023-10-13 Thread Jim Cromie

Change function's 1st arg-type, and deref in the caller.
The fn doesn't need any other fields in the struct.

no functional change.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index b07aab422604..8158943b350d 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -1117,12 +1117,12 @@ static void *ddebug_proc_next(struct seq_file *m, void 
*p, loff_t *pos)
 #define class_in_range(class_id, map)  \
(class_id >= map->base && class_id < map->base + map->length)
 
-static const char *ddebug_class_name(struct ddebug_iter *iter, struct _ddebug 
*dp)
+static const char *ddebug_class_name(struct ddebug_table *dt, struct _ddebug 
*dp)
 {
-   struct ddebug_class_map *map = iter->table->classes;
-   int i, nc = iter->table->num_classes;
+   struct ddebug_class_map *map = dt->classes;
+   int i;
 
-   for (i = 0; i < nc; i++, map++)
+   for (i = 0; i < dt->num_classes; i++, map++)
if (class_in_range(dp->class_id, map))
return map->class_names[dp->class_id - map->base];
 
@@ -1156,7 +1156,7 @@ static int ddebug_proc_show(struct seq_file *m, void *p)
seq_puts(m, "\"");
 
if (dp->class_id != _DPRINTK_CLASS_DFLT) {
-   class = ddebug_class_name(iter, dp);
+   class = ddebug_class_name(iter->table, dp);
if (class)
seq_printf(m, " class:%s", class);
else
-- 
2.41.0

[PATCH v7b 07/25] dyndbg: drop NUM_TYPE_ARRAY

2023-10-13 Thread Jim Cromie

ARRAY_SIZE works here, since array decl is complete.

no functional change

Signed-off-by: Jim Cromie 
---
 include/linux/dynamic_debug.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index b53217e4b711..8116d0a0d33a 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -106,11 +106,9 @@ struct ddebug_class_map {
.mod_name = KBUILD_MODNAME, \
.base = _base,  \
.map_type = _maptype,   \
-   .length = NUM_TYPE_ARGS(char*, __VA_ARGS__),\
+   .length = ARRAY_SIZE(_var##_classnames),\
.class_names = _var##_classnames,   \
}
-#define NUM_TYPE_ARGS(eltype, ...) \
-(sizeof((eltype[]){__VA_ARGS__}) / sizeof(eltype))
 
 /* encapsulate linker provided built-in (or module) dyndbg data */
 struct _ddebug_info {
-- 
2.41.0

[PATCH v7b 06/25] dyndbg: split param_set_dyndbg_classes to module/wrapper fns

2023-10-13 Thread Jim Cromie

rename param_set_dyndbg_classes: add _module_ name & arg, old name is
wrapper to new.  New arg allows caller to specify that only one module
is affected by a prdbgs update.

Outer fn preserves kernel_param interface, passing NULL to inner fn.
This selectivity will be used later to narrow the scope of changes
made.

no functional change.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 37 ++---
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index ba41fdeaaf98..b67c9b137447 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -708,18 +708,9 @@ static int param_set_dyndbg_classnames(const char *instr, 
const struct kernel_pa
return 0;
 }
 
-/**
- * param_set_dyndbg_classes - class FOO >control
- * @instr: string echo>d to sysfs, input depends on map_type
- * @kp:kp->arg has state: bits/lvl, map, map_type
- *
- * Enable/disable prdbgs by their class, as given in the arguments to
- * DECLARE_DYNDBG_CLASSMAP.  For LEVEL map-types, enforce relative
- * levels by bitpos.
- *
- * Returns: 0 or <0 if error.
- */
-int param_set_dyndbg_classes(const char *instr, const struct kernel_param *kp)
+static int param_set_dyndbg_module_classes(const char *instr,
+  const struct kernel_param *kp,
+  const char *modnm)
 {
const struct ddebug_class_param *dcp = kp->arg;
const struct ddebug_class_map *map = dcp->map;
@@ -756,8 +747,8 @@ int param_set_dyndbg_classes(const char *instr, const 
struct kernel_param *kp)
KP_NAME(kp), inrep, 
CLASSMAP_BITMASK(map->length));
inrep &= CLASSMAP_BITMASK(map->length);
}
-   v2pr_info("bits:%lx > %s\n", inrep, KP_NAME(kp));
-   totct += ddebug_apply_class_bitmap(dcp, , dcp->bits, 
NULL);
+   v2pr_info("bits:0x%lx > %s.%s\n", inrep, modnm ?: "*", 
KP_NAME(kp));
+   totct += ddebug_apply_class_bitmap(dcp, , dcp->bits, 
modnm);
*dcp->bits = inrep;
break;
case DD_CLASS_TYPE_LEVEL_NUM:
@@ -770,7 +761,7 @@ int param_set_dyndbg_classes(const char *instr, const 
struct kernel_param *kp)
old_bits = CLASSMAP_BITMASK(*dcp->lvl);
new_bits = CLASSMAP_BITMASK(inrep);
v2pr_info("lvl:%ld bits:0x%lx > %s\n", inrep, new_bits, 
KP_NAME(kp));
-   totct += ddebug_apply_class_bitmap(dcp, _bits, _bits, 
NULL);
+   totct += ddebug_apply_class_bitmap(dcp, _bits, _bits, 
modnm);
*dcp->lvl = inrep;
break;
default:
@@ -779,6 +770,22 @@ int param_set_dyndbg_classes(const char *instr, const 
struct kernel_param *kp)
vpr_info("%s: total matches: %d\n", KP_NAME(kp), totct);
return 0;
 }
+
+/**
+ * param_set_dyndbg_classes - class FOO >control
+ * @instr: string echo>d to sysfs, input depends on map_type
+ * @kp:kp->arg has state: bits/lvl, map, map_type
+ *
+ * Enable/disable prdbgs by their class, as given in the arguments to
+ * DECLARE_DYNDBG_CLASSMAP.  For LEVEL map-types, enforce relative
+ * levels by bitpos.
+ *
+ * Returns: 0 or <0 if error.
+ */
+int param_set_dyndbg_classes(const char *instr, const struct kernel_param *kp)
+{
+   return param_set_dyndbg_module_classes(instr, kp, NULL);
+}
 EXPORT_SYMBOL(param_set_dyndbg_classes);
 
 /**
-- 
2.41.0

[PATCH v7b 05/25] dyndbg: ddebug_apply_class_bitmap - add module arg, select on it

2023-10-13 Thread Jim Cromie

Add query_module param to ddebug_apply_class_bitmap().  This allows
its caller to update just one module, or all (as currently).  We'll
use this later to propagate drm.debug to each USEr as they're
modprobed.

No functional change.

Signed-off-by: Jim Cromie 
---

after `modprobe i915`, heres the module dependencies,
though not all on drm.debug.

bash-5.2# lsmod
Module  Size  Used by
i915 3133440  0
drm_buddy  20480  1 i915
ttm90112  1 i915
i2c_algo_bit   16384  1 i915
video  61440  1 i915
wmi32768  1 video
drm_display_helper200704  1 i915
drm_kms_helper208896  2 drm_display_helper,i915
drm   606208  5 
drm_kms_helper,drm_display_helper,drm_buddy,i915,ttm
cec57344  2 drm_display_helper,i915
---
 lib/dynamic_debug.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index a3be2e7c8c84..ba41fdeaaf98 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -601,7 +601,8 @@ static int ddebug_exec_queries(char *query, const char 
*modname)
 
 /* apply a new bitmap to the sys-knob's current bit-state */
 static int ddebug_apply_class_bitmap(const struct ddebug_class_param *dcp,
-unsigned long *new_bits, unsigned long 
*old_bits)
+unsigned long *new_bits, unsigned long 
*old_bits,
+const char *query_modname)
 {
 #define QUERY_SIZE 128
char query[QUERY_SIZE];
@@ -609,7 +610,8 @@ static int ddebug_apply_class_bitmap(const struct 
ddebug_class_param *dcp,
int matches = 0;
int bi, ct;
 
-   v2pr_info("apply: 0x%lx to: 0x%lx\n", *new_bits, *old_bits);
+   v2pr_info("apply bitmap: 0x%lx to: 0x%lx for %s\n", *new_bits, 
*old_bits,
+ query_modname ?: "");
 
for (bi = 0; bi < map->length; bi++) {
if (test_bit(bi, new_bits) == test_bit(bi, old_bits))
@@ -618,12 +620,15 @@ static int ddebug_apply_class_bitmap(const struct 
ddebug_class_param *dcp,
snprintf(query, QUERY_SIZE, "class %s %c%s", 
map->class_names[bi],
 test_bit(bi, new_bits) ? '+' : '-', dcp->flags);
 
-   ct = ddebug_exec_queries(query, NULL);
+   ct = ddebug_exec_queries(query, query_modname);
matches += ct;
 
v2pr_info("bit_%d: %d matches on class: %s -> 0x%lx\n", bi,
  ct, map->class_names[bi], *new_bits);
}
+   v2pr_info("applied bitmap: 0x%lx to: 0x%lx for %s\n", *new_bits, 
*old_bits,
+ query_modname ?: "");
+
return matches;
 }
 
@@ -679,7 +684,7 @@ static int param_set_dyndbg_classnames(const char *instr, 
const struct kernel_pa
continue;
}
curr_bits ^= BIT(cls_id);
-   totct += ddebug_apply_class_bitmap(dcp, _bits, 
dcp->bits);
+   totct += ddebug_apply_class_bitmap(dcp, _bits, 
dcp->bits, NULL);
*dcp->bits = curr_bits;
v2pr_info("%s: changed bit %d:%s\n", KP_NAME(kp), 
cls_id,
  map->class_names[cls_id]);
@@ -689,7 +694,7 @@ static int param_set_dyndbg_classnames(const char *instr, 
const struct kernel_pa
old_bits = CLASSMAP_BITMASK(*dcp->lvl);
curr_bits = CLASSMAP_BITMASK(cls_id + (wanted ? 1 : 0 
));
 
-   totct += ddebug_apply_class_bitmap(dcp, _bits, 
_bits);
+   totct += ddebug_apply_class_bitmap(dcp, _bits, 
_bits, NULL);
*dcp->lvl = (cls_id + (wanted ? 1 : 0));
v2pr_info("%s: changed bit-%d: \"%s\" %lx->%lx\n", 
KP_NAME(kp), cls_id,
  map->class_names[cls_id], old_bits, 
curr_bits);
@@ -752,7 +757,7 @@ int param_set_dyndbg_classes(const char *instr, const 
struct kernel_param *kp)
inrep &= CLASSMAP_BITMASK(map->length);
}
v2pr_info("bits:%lx > %s\n", inrep, KP_NAME(kp));
-   totct += ddebug_apply_class_bitmap(dcp, , dcp->bits);
+   totct += ddebug_apply_class_bitmap(dcp, , dcp->bits, 
NULL);
*dcp->bits = inrep;
break;
case DD_CLASS_TYPE_LEVEL_NUM:
@@ -765,7 +770,7 @@ int param_set_dyndbg_classes(const char *instr, const 
struct kernel_param *kp)
old_bits = CLASSMAP_BITMASK(*dcp->lvl);
new_bits = CLASSMAP_BITMASK(inrep);
v2pr_info("lvl:%ld bits:0x%lx > %s\n", inrep, new_bits, 
KP_NAME(kp));
-   totct += ddebug_apply_class_bitmap(dcp, _bits, _bits);
+   totct += ddebug_apply_class_bitmap(dcp, _bits, _bits, 
NULL);

[PATCH v7b 04/25] dyndbg: replace classmap list with a vector

2023-10-13 Thread Jim Cromie

Classmaps are stored/linked in a section/array, but are each added to
the module's ddebug_table.maps list-head.

This is unnecessary; even when ddebug_attach_classmap() is handling
the builtin section (with classmaps for multiple builtin modules), its
contents are ordered, so a module's possibly multiple classmaps will
be consecutive in the section, and could be treated as a vector/block,
since both start-addy and subrange length are in the ddebug_info arg.

So this changes:

struct ddebug_class_map drops list-head link.

struct ddebug_table drops the list-head maps, and gets: classes &
num_classes for the start-addy and num_classes, placed to improve
struct packing.

The loading: in ddebug_attach_module_classes(), replace the
for-the-modname list-add loop, with a forloop that finds the module's
subrange (start,length) of matching classmaps within the possibly
builtin classmaps vector, and saves those to the ddebug_table.

The reading/using: change list-foreach loops in ddebug_class_name() &
ddebug_find_valid_class() to walk the array from start to length.

Also:
Move #define __outvar up, above an added use in a fn-prototype.
Simplify ddebug_attach_module_classes args, ref has both addy,len.

no functional changes

Signed-off-by: Jim Cromie 
---
 include/linux/dynamic_debug.h |  1 -
 lib/dynamic_debug.c   | 61 ++-
 2 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 5231aaf361c4..b53217e4b711 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -83,7 +83,6 @@ enum class_map_type {
 };
 
 struct ddebug_class_map {
-   struct list_head link;
struct module *mod;
const char *mod_name;   /* needed for builtins */
const char **class_names;
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index b984ce338921..a3be2e7c8c84 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -45,10 +45,11 @@ extern struct ddebug_class_map __start___dyndbg_classes[];
 extern struct ddebug_class_map __stop___dyndbg_classes[];
 
 struct ddebug_table {
-   struct list_head link, maps;
+   struct list_head link;
const char *mod_name;
-   unsigned int num_ddebugs;
struct _ddebug *ddebugs;
+   struct ddebug_class_map *classes;
+   unsigned int num_ddebugs, num_classes;
 };
 
 struct ddebug_query {
@@ -147,13 +148,15 @@ static void vpr_info_dq(const struct ddebug_query *query, 
const char *msg)
  query->first_lineno, query->last_lineno, query->class_string);
 }
 
+#define __outvar /* filled by callee */
 static struct ddebug_class_map *ddebug_find_valid_class(struct ddebug_table 
const *dt,
- const char 
*class_string, int *class_id)
+   const char 
*class_string,
+   __outvar int *class_id)
 {
struct ddebug_class_map *map;
-   int idx;
+   int i, idx;
 
-   list_for_each_entry(map, >maps, link) {
+   for (map = dt->classes, i = 0; i < dt->num_classes; i++, map++) {
idx = match_string(map->class_names, map->length, class_string);
if (idx >= 0) {
*class_id = idx + map->base;
@@ -164,7 +167,6 @@ static struct ddebug_class_map 
*ddebug_find_valid_class(struct ddebug_table cons
return NULL;
 }
 
-#define __outvar /* filled by callee */
 /*
  * Search the tables for _ddebug's which match the given `query' and
  * apply the `flags' and `mask' to them.  Returns number of matching
@@ -,9 +1113,10 @@ static void *ddebug_proc_next(struct seq_file *m, void 
*p, loff_t *pos)
 
 static const char *ddebug_class_name(struct ddebug_iter *iter, struct _ddebug 
*dp)
 {
-   struct ddebug_class_map *map;
+   struct ddebug_class_map *map = iter->table->classes;
+   int i, nc = iter->table->num_classes;
 
-   list_for_each_entry(map, >table->maps, link)
+   for (i = 0; i < nc; i++, map++)
if (class_in_range(dp->class_id, map))
return map->class_names[dp->class_id - map->base];
 
@@ -1197,30 +1200,31 @@ static const struct proc_ops proc_fops = {
.proc_write = ddebug_proc_write
 };
 
-static void ddebug_attach_module_classes(struct ddebug_table *dt,
-struct ddebug_class_map *classes,
-int num_classes)
+static void ddebug_attach_module_classes(struct ddebug_table *dt, struct 
_ddebug_info *di)
 {
struct ddebug_class_map *cm;
-   int i, j, ct = 0;
+   int i, nc = 0;
 
-   for (cm = classes, i = 0; i < num_classes; i++, cm++) {
+   /*
+* Find this module's classmaps in a subrange/wholerange of
+* the builtin/modular classmap vector/section.  Save the start
+* and length of the subrange

[PATCH v7b 03/25] dyndbg: make ddebug_class_param union members same size

2023-10-13 Thread Jim Cromie

struct ddebug_class_param keeps a ref to the state-storage of the
param, make both flavors use the same unsigned long under-type.
ISTM this is simpler and safer.

Signed-off-by: Jim Cromie 
---
 include/linux/dynamic_debug.h | 2 +-
 lib/dynamic_debug.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index 4fcbf4d4fd0a..5231aaf361c4 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -124,7 +124,7 @@ struct _ddebug_info {
 struct ddebug_class_param {
union {
unsigned long *bits;
-   unsigned int *lvl;
+   unsigned long *lvl;
};
char flags[8];
const struct ddebug_class_map *map;
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index ceb3067a5c83..b984ce338921 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -796,7 +796,7 @@ int param_get_dyndbg_classes(char *buffer, const struct 
kernel_param *kp)
 
case DD_CLASS_TYPE_LEVEL_NAMES:
case DD_CLASS_TYPE_LEVEL_NUM:
-   return scnprintf(buffer, PAGE_SIZE, "%d\n", *dcp->lvl);
+   return scnprintf(buffer, PAGE_SIZE, "%ld\n", *dcp->lvl);
default:
return -1;
}
-- 
2.41.0

[PATCH v7b 01/25] test-dyndbg: fixup CLASSMAP usage error

2023-10-13 Thread Jim Cromie

more careful reading of test output reveals:

lib/test_dynamic_debug.c:103 [test_dynamic_debug]do_cats =pmf "doing 
categories\n"
lib/test_dynamic_debug.c:105 [test_dynamic_debug]do_cats =p "LOW msg\n" 
class:MID
lib/test_dynamic_debug.c:106 [test_dynamic_debug]do_cats =p "MID msg\n" class:HI
lib/test_dynamic_debug.c:107 [test_dynamic_debug]do_cats =_ "HI msg\n" class 
unknown, _id:13

That last line is wrong, the HI class is declared.

But the enum's 1st val (explicitly initialized) was wrong; it must be
_base, not _base+1 (a DECLARE_DYNDBG_CLASSMAP[1] param).  So the last
enumeration exceeded the range of mapped class-id's, which triggered
the "class unknown" report.  I intentionally coded in an error, but
forgot to verify its detection and remove it.

RFC:

This patch fixes a bad usage of DECLARE_DYNDBG_CLASSMAP(), showing
that it is too error-prone.  As noted in test-mod comments:

 * Using the CLASSMAP api:
 * - classmaps must have corresponding enum
 * - enum symbols must match/correlate with class-name strings in the map.
 * - base must equal enum's 1st value
 * - multiple maps must set their base to share the 0-62 class_id space !!
 *   (build-bug-on tips welcome)

Those shortcomings could largely be fixed with a __stringify_list
(which doesn't exist,) used in DECLARE_DYNDBG_CLASSMAP to stringify
__VA_ARGS__.  Then, API would accept DRM_UT_* values literally; all
the categories, in order, and not their stringifications, which
created all the usage complications above.

[1] name changes later to DYNDBG_CLASSMAP_DEFINE

Signed-off-by: Jim Cromie 
---
 lib/test_dynamic_debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/test_dynamic_debug.c b/lib/test_dynamic_debug.c
index 8dd250ad022b..a01f0193a419 100644
--- a/lib/test_dynamic_debug.c
+++ b/lib/test_dynamic_debug.c
@@ -75,7 +75,7 @@ DD_SYS_WRAP(disjoint_bits, p);
 DD_SYS_WRAP(disjoint_bits, T);
 
 /* symbolic input, independent bits */
-enum cat_disjoint_names { LOW = 11, MID, HI };
+enum cat_disjoint_names { LOW = 10, MID, HI };
 DECLARE_DYNDBG_CLASSMAP(map_disjoint_names, DD_CLASS_TYPE_DISJOINT_NAMES, 10,
"LOW", "MID", "HI");
 DD_SYS_WRAP(disjoint_names, p);
-- 
2.41.0

[PATCH v7b 02/25] dyndbg: reword "class unknown, " to "class:_UNKNOWN_"

2023-10-13 Thread Jim Cromie

This appears in the control-file to report an unknown class-name, which
indicates that the class_id is not authorized, and dyndbg will ignore
changes to it.  Generally, this means that a DYNDBG_CLASSMAP_DEFINE or
DYNDBG_CLASSMAP_USE is missing.

But the word "unknown" appears in quite a few prdbg formats, so thats
a suboptimal search term to find occurrences of the problem.  Thus
change it to "_UNKNOWN_" which properly shouts the condition.

Signed-off-by: Jim Cromie 
---
 lib/dynamic_debug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index 6fba6423cc10..ceb3067a5c83 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -1151,7 +1151,7 @@ static int ddebug_proc_show(struct seq_file *m, void *p)
if (class)
seq_printf(m, " class:%s", class);
else
-   seq_printf(m, " class unknown, _id:%d", dp->class_id);
+   seq_printf(m, " class:_UNKNOWN_ _id:%d", dp->class_id);
}
seq_puts(m, "\n");
 
-- 
2.41.0

[PATCH v7b 00/25] fix DRM_USE_DYNAMIC_DEBUG=y regression

2023-10-13 Thread Jim Cromie

hi Jason, DRM-folk

(now with checkpatch fixes)

This patchest fixes the chicken-egg initialization problem in the 1st
version of ddebug-class-maps, that DRM-CI uncovered.

The root-problem was DECLARE_DYNDBG_CLASSMAP, which broke the K rule:
"define once, refer many".  In patch 14 it is replaced by:

 DYNDBG_CLASSMAP_DEFINE - define and export a struct ddebug_class_map
 DYNDBG_CLASSMAP_USE - ref the exported struct

test-dynamic-debug is also extended with a -submod.ko, in order to
recapitulate the drm & drivers initialization scenario.

They're on v6.6-rc5 now, and apply cleanly to drm-tip/drm-tip.

Ive been running recent revs on rc3+, on my desktop and laptop.

The final blocker was a missing __align(8) on the ddebug_class_user
record inserted by DYNDBG_CLASSMAP_USE.  This caused DRM=y (builtin
only) to have a corrupt record for drm_kms_helper (builtin dependent).
Curiously, a clang build did not exhibit this problem.

Heres a part of dmesg, for a DRM=y kernel, booted with
 dynamic_debug.verbose=3 drm.debug=0x10

[0.466747] dyndbg: add-module: drm 406 sites
[0.467569] dyndbg: classes[0]: module:drm base:0 len:10 type:DISJOINT_BITS
[0.467743] dyndbg: module:drm attached 1 classes
[0.468557] dyndbg: builtin class: module:drm base:0 len:10 
type:DISJOINT_BITS
[0.468742] dyndbg:  found kp:drm.debug =0x10
[0.468743] dyndbg:   mapped to: module:drm base:0 len:10 type:DISJOINT_BITS
[0.469742] dyndbg:   drm.debug: classbits: 0x10
[0.470573] dyndbg: apply bitmap: 0x10 to: 0x0 for drm
[0.470743] dyndbg: query 0: "class DRM_UT_ATOMIC +p" mod:drm
[0.471743] dyndbg: split into words: "class" "DRM_UT_ATOMIC" "+p"
[0.472743] dyndbg: op='+' flags=0x1 maskp=0x
[0.473679] dyndbg: parsed: func="" file="" module="drm" format="" 
lineno=0-0 class=DRM_UT_ATOMIC
[0.473749] dyndbg: processed 1 queries, with 0 matches, 0 errs
[0.474742] dyndbg: bit_4: 0 matches on class: DRM_UT_ATOMIC -> 0x10
[0.475742] dyndbg: applied bitmap: 0x10 to: 0x0 for drm
[0.476686] dyndbg: 406 debug prints in module drm
[0.476743] dyndbg: add-module: drm_kms_helper 93 sites
[0.477727] dyndbg: class_ref[0] drm_kms_helper -> drm module:drm base:0 
len:10 type:DISJOINT_BITS
[0.477743] dyndbg: builtin class: module:drm base:0 len:10 
type:DISJOINT_BITS
[0.478742] dyndbg:  found kp:drm.debug =0x10
[0.478743] dyndbg:   mapped to: module:drm base:0 len:10 type:DISJOINT_BITS
[0.479743] dyndbg:   drm.debug: classbits: 0x10
[0.480592] dyndbg: apply bitmap: 0x10 to: 0x0 for drm_kms_helper
[0.480743] dyndbg: query 0: "class DRM_UT_ATOMIC +p" mod:drm_kms_helper
[0.481743] dyndbg: split into words: "class" "DRM_UT_ATOMIC" "+p"
[0.482743] dyndbg: op='+' flags=0x1 maskp=0x
[0.483743] dyndbg: parsed: func="" file="" module="drm_kms_helper" 
format="" lineno=0-0 class=DRM_UT_ATOMIC
[0.484750] dyndbg: class-ref: drm_kms_helper.DRM_UT_ATOMIC  
module:drm_kms_helper nd:93 nc:0 nu:1
[0.485809] dyndbg: processed 1 queries, with 44 matches, 0 errs
[0.486742] dyndbg: bit_4: 44 matches on class: DRM_UT_ATOMIC -> 0x10
[0.487742] dyndbg: applied bitmap: 0x10 to: 0x0 for drm_kms_helper
[0.488743] dyndbg: attach-client-module:  module:drm_kms_helper nd:93 nc:0 
nu:1
[0.489742] dyndbg:  93 debug prints in module drm_kms_helper

Widespread testing is appreciated.
I have scripts if anyone wants them.

I'll forward lkp-robot reports here when I get them.
Patches also at https://github.com/jimc/linux (dd-fix-7b)


Jim Cromie (25):
  test-dyndbg: fixup CLASSMAP usage error
  dyndbg: reword "class unknown," to "class:_UNKNOWN_"
  dyndbg: make ddebug_class_param union members same size
  dyndbg: replace classmap list with a vector
  dyndbg: ddebug_apply_class_bitmap - add module arg, select on it
  dyndbg: split param_set_dyndbg_classes to module/wrapper fns
  dyndbg: drop NUM_TYPE_ARRAY
  dyndbg: reduce verbose/debug clutter
  dyndbg: silence debugs with no-change updates
  dyndbg: tighten ddebug_class_name() 1st arg type
  dyndbg: tighten fn-sig of ddebug_apply_class_bitmap
  dyndbg: reduce verbose=3 messages in ddebug_add_module
  dyndbg-API: remove DD_CLASS_TYPE_(DISJOINT|LEVEL)_NAMES and code
  dyndbg-API: fix CONFIG_DRM_USE_DYNAMIC_DEBUG regression
  dyndbg: add for_each_boxed_vector
  dyndbg: refactor ddebug_classparam_clamp_input
  dyndbg-API: promote DYNDBG_CLASSMAP_PARAM to API
  dyndbg-doc: add classmap info to howto
  dyndbg: reserve flag bit _DPRINTK_FLAGS_PREFIX_CACHED
  dyndbg: add _DPRINTK_FLAGS_INCL_LOOKUP
  dyndbg: refactor *dynamic_emit_prefix
  dyndbg: change WARN_ON to WARN_ON_ONCE
  drm: use correct ccflags-y spelling
  drm-drivers: DRM_CLASSMAP_USE in 2nd batch of drivers, helpers
  drm: restore CONFIG_DRM_USE_DYNAMIC_DEBUG un-BROKEN

 .../admin-guide/dynamic-debug-howto.rst   |  60 ++-
 MAINTAINERS   |   2 +-
 drivers/gpu/drm/Kconfig   |   3 +-

Re: [v4 0/3] Break out as separate driver from boe-tv101wum-nl6 panel driver

2023-10-13 Thread Doug Anderson

Hi,

On Fri, Oct 13, 2023 at 2:18 AM Cong Yang
 wrote:
>
> Linus series proposed to break out ili9882t as separate driver,
> but he didn't have time for that extensive rework of the driver.
> As discussed by Linus and Doug [1], keep macro using the "struct 
> panel_init_cmd"
> until we get some resolution about the binary size issue.
>
> [1]: 
> https://lore.kernel.org/all/20230703-fix-boe-tv101wum-nl6-v3-0-bd6e9432c...@linaro.org
>
> In [v1 2/2], Doug suggested move the ili9882t_enter_sleep_mode function
> to ili9882t_unprepare. I tried this scheme and the test failed. I will
> continue to investigate the rootcause with ilitek, which may take a long time 
> .
> So if possible, apply this patch first. If there are new solutions later, I
> will continue upstream.
>
> [v1 2/2] 
> https://lore.kernel.org/all/CAD=FV=XtqPJ77dx8uRb0=tmvc3cvgh5x+7mujexgcg228kz...@mail.gmail.com/
>
> Changes in v4:
> - PATCH 2/3: Change usleep_range(5,51000) to msleep(50);.
> - Link to v3: 
> https://lore.kernel.org/all/20231012121004.2127918-1-yangco...@huaqin.corp-partner.google.com/
>
> Changes in v3:
> - PATCH 1/3: Remove "init_cmd_length" and "linux/of_device.h" .
> - PATCH 2/3: Change usleep_range(5,51000).
> - PATCH 3/3: Add a little background for commit.
> - Link to v2: 
> https://lore.kernel.org/all/20231010121402.3687948-1-yangco...@huaqin.corp-partner.google.com/
>
> Changes in v2:
> - PATCH 1/3: fix Doug comments,define "_INIT_SWITCH_PAGE_CMD" and remove the 
> "shutdown".
> - PATCH 2/3: Modify ili9882t_switch_page function instead of hardcoding.
> - PATCH 3/3: Enable new config in defconfig.
> - Link to v1: 
> https://lore.kernel.org/all/20231007060639.725350-1-yangco...@huaqin.corp-partner.google.com/
>
> Cong Yang (3):
>   drm/panel: ili9882t: Break out as separate driver
>   drm/panel: ili9882t: Avoid blurred screen from fast sleep
>   arm64: defconfig: Enable ILITEK_ILI9882T panel
>
>  arch/arm64/configs/defconfig  |   1 +
>  drivers/gpu/drm/panel/Kconfig |   9 +
>  drivers/gpu/drm/panel/Makefile|   1 +
>  .../gpu/drm/panel/panel-boe-tv101wum-nl6.c| 371 -
>  drivers/gpu/drm/panel/panel-ilitek-ili9882t.c | 779 ++
>  5 files changed, 790 insertions(+), 371 deletions(-)
>  create mode 100644 drivers/gpu/drm/panel/panel-ilitek-ili9882t.c

I think the series is well-reviewed now. I'll wait a week and target
applying next Thursday or Friday unless something comes up.

-Doug

Re: [v4 2/3] drm/panel: ili9882t: Avoid blurred screen from fast sleep

2023-10-13 Thread Doug Anderson

Hi,

On Fri, Oct 13, 2023 at 2:19 AM Cong Yang
 wrote:
>
> At present, we have found that there may be a problem of blurred
> screen during fast sleep/resume. The direct cause of the blurred
> screen is that the IC does not receive 0x28/0x10. Because of the
> particularity of the IC, before the panel enters sleep hid must
> stop scanning, as i2c_hid_core_suspend before ili9882t_disable.
> If move the ili9882t_enter_sleep_mode function to ili9882t_unprepare,
> touch reset will pull low before panel entersleep, which does not meet
> the timing requirements.. So in order to solve this problem, the IC
> can handle it through the exception mechanism when it cannot receive
> 0x28/0x10 command. Handling exceptions requires a reset 50ms delay.
> Refer to vendor detailed analysis [1].
>
> Ilitek vendor also suggested switching the page before entering sleep to
> avoid panel IC not receiving 0x28/0x10 command.
>
> Note: 0x28 is display off, 0x10 is sleep in.
>
> [1]: 
> https://github.com/ILITEK-LoganLin/Document/tree/main/ILITEK_Power_Sequence
>
> Signed-off-by: Cong Yang 
> ---
>  drivers/gpu/drm/panel/panel-ilitek-ili9882t.c | 22 ++-
>  1 file changed, 21 insertions(+), 1 deletion(-)

As talked about in response to the previous version [1], we can work
to see if we can improve the sequencing. However, for now this seems
fine.

Reviewed-by: Douglas Anderson 

[1] 
https://lore.kernel.org/r/CAD=FV=w_lt9mpykjakp3ovudenpszxkhvn9np_hq+es6fe3...@mail.gmail.com

Re: [Intel-gfx] [PATCH] drm/i915/gem: Allow users to disable waitboost

2023-10-13 Thread Rodrigo Vivi

On Thu, Sep 28, 2023 at 01:48:34PM +0100, Tvrtko Ursulin wrote:
> 
> On 27/09/2023 20:34, Belgaumkar, Vinay wrote:
> > 
> > On 9/21/2023 3:41 AM, Tvrtko Ursulin wrote:
> > > 
> > > On 20/09/2023 22:56, Vinay Belgaumkar wrote:
> > > > Provide a bit to disable waitboost while waiting on a gem object.
> > > > Waitboost results in increased power consumption by requesting RP0
> > > > while waiting for the request to complete. Add a bit in the gem_wait()
> > > > IOCTL where this can be disabled.
> > > > 
> > > > This is related to the libva API change here -
> > > > Link: 
> > > > https://github.com/XinfengZhang/libva/commit/3d90d18c67609a73121bb71b20ee4776b54b61a7
> > > 
> > > This link does not appear to lead to userspace code using this uapi?
> > We have asked Carl (cc'd) to post a patch for the same.
> 
> Ack.

I'm glad to see that we will have the end-to-end flow of the high-level API.

> 
> > > > Cc: Rodrigo Vivi 
> > > > Signed-off-by: Vinay Belgaumkar 
> > > > ---
> > > >   drivers/gpu/drm/i915/gem/i915_gem_wait.c | 9 ++---
> > > >   drivers/gpu/drm/i915/i915_request.c  | 3 ++-
> > > >   drivers/gpu/drm/i915/i915_request.h  | 1 +
> > > >   include/uapi/drm/i915_drm.h  | 1 +
> > > >   4 files changed, 10 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> > > > b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> > > > index d4b918fb11ce..955885ec859d 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> > > > @@ -72,7 +72,8 @@ i915_gem_object_wait_reservation(struct
> > > > dma_resv *resv,
> > > >   struct dma_fence *fence;
> > > >   long ret = timeout ?: 1;
> > > >   -    i915_gem_object_boost(resv, flags);
> > > > +    if (!(flags & I915_WAITBOOST_DISABLE))
> > > > +    i915_gem_object_boost(resv, flags);
> > > >     dma_resv_iter_begin(, resv,
> > > >   dma_resv_usage_rw(flags & I915_WAIT_ALL));
> > > > @@ -236,7 +237,7 @@ i915_gem_wait_ioctl(struct drm_device *dev,
> > > > void *data, struct drm_file *file)
> > > >   ktime_t start;
> > > >   long ret;
> > > >   -    if (args->flags != 0)
> > > > +    if (args->flags != 0 || args->flags != I915_GEM_WAITBOOST_DISABLE)
> > > >   return -EINVAL;
> > > >     obj = i915_gem_object_lookup(file, args->bo_handle);
> > > > @@ -248,7 +249,9 @@ i915_gem_wait_ioctl(struct drm_device *dev,
> > > > void *data, struct drm_file *file)
> > > >   ret = i915_gem_object_wait(obj,
> > > >  I915_WAIT_INTERRUPTIBLE |
> > > >  I915_WAIT_PRIORITY |
> > > > -   I915_WAIT_ALL,
> > > > +   I915_WAIT_ALL |
> > > > +   (args->flags & I915_GEM_WAITBOOST_DISABLE ?
> > > > +    I915_WAITBOOST_DISABLE : 0),
> > > >  to_wait_timeout(args->timeout_ns));
> > > >     if (args->timeout_ns > 0) {
> > > > diff --git a/drivers/gpu/drm/i915/i915_request.c
> > > > b/drivers/gpu/drm/i915/i915_request.c
> > > > index f59081066a19..2957409b4b2a 100644
> > > > --- a/drivers/gpu/drm/i915/i915_request.c
> > > > +++ b/drivers/gpu/drm/i915/i915_request.c
> > > > @@ -2044,7 +2044,8 @@ long i915_request_wait_timeout(struct
> > > > i915_request *rq,
> > > >    * but at a cost of spending more power processing the workload
> > > >    * (bad for battery).
> > > >    */
> > > > -    if (flags & I915_WAIT_PRIORITY && !i915_request_started(rq))
> > > > +    if (!(flags & I915_WAITBOOST_DISABLE) && (flags &
> > > > I915_WAIT_PRIORITY) &&
> > > > +    !i915_request_started(rq))
> > > >   intel_rps_boost(rq);
> > > >     wait.tsk = current;
> > > > diff --git a/drivers/gpu/drm/i915/i915_request.h
> > > > b/drivers/gpu/drm/i915/i915_request.h
> > > > index 0ac55b2e4223..3cc00e8254dc 100644
> > > > --- a/drivers/gpu/drm/i915/i915_request.h
> > > > +++ b/drivers/gpu/drm/i915/i915_request.h
> > > > @@ -445,6 +445,7 @@ long i915_request_wait(struct i915_request *rq,
> > > >   #define I915_WAIT_INTERRUPTIBLE    BIT(0)
> > > >   #define I915_WAIT_PRIORITY    BIT(1) /* small priority bump
> > > > for the request */
> > > >   #define I915_WAIT_ALL    BIT(2) /* used by
> > > > i915_gem_object_wait() */
> > > > +#define I915_WAITBOOST_DISABLE    BIT(3) /* used by

maybe name it I915_WAIT_NO_BOOST to align a bit better with the existent ones?

> > > > i915_gem_object_wait() */
> > > >     void i915_request_show(struct drm_printer *m,
> > > >  const struct i915_request *rq,
> > > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > > > index 7000e5910a1d..4adee70e39cf 100644
> > > > --- a/include/uapi/drm/i915_drm.h
> > > > +++ b/include/uapi/drm/i915_drm.h
> > > > @@ -1928,6 +1928,7 @@ struct drm_i915_gem_wait {
> > > >   /** Handle of BO we shall wait on */
> > > >   __u32 bo_handle;
> > > >   __u32 flags;
> > > > +#define

[PATCH v2 2/2] video: fbdev: core: syscopyarea: fix sloppy typing

2023-10-13 Thread Sergey Shtylyov

In sys_copyarea(), the local variable bits_per_line is needlessly typed as
*unsigned long* -- which is a 32-bit type on the 32-bit arches and a 64-bit
type on the 64-bit arches; that variable's value is derived from the __u32
typed fb_fix_screeninfo::line_length field (multiplied by 8u) and a 32-bit
*unsigned int* type should still be enough to store the # of bits per line.

Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool.

Signed-off-by: Sergey Shtylyov 
---
 drivers/video/fbdev/core/syscopyarea.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/core/syscopyarea.c 
b/drivers/video/fbdev/core/syscopyarea.c
index c1eda3190968..7b8bd3a2bedc 100644
--- a/drivers/video/fbdev/core/syscopyarea.c
+++ b/drivers/video/fbdev/core/syscopyarea.c
@@ -316,7 +316,7 @@ void sys_copyarea(struct fb_info *p, const struct 
fb_copyarea *area)
 {
u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
u32 height = area->height, width = area->width;
-   unsigned long const bits_per_line = p->fix.line_length*8u;
+   unsigned int const bits_per_line = p->fix.line_length * 8u;
unsigned long *base = NULL;
int bits = BITS_PER_LONG, bytes = bits >> 3;
unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
-- 
2.26.3

[PATCH v2 0/2] Fix sloppy typing in the FB area copying routines

2023-10-13 Thread Sergey Shtylyov

Here are 2 patches against the 'master' branch of Linus' 'linux.git' repo...

In {cfb|sys}_copyarea(), the local variable bits_per_line is needlessly typed
as *unsigned long* -- which is a 32-bit type on the 32-bit arches and a 64-bit
type on the 64-bit arches;; that variable's value is derived from the __u32
typed fb_fix_screeninfo::line_length field (multiplied by 8u) and a 32-bit
*unsigned int* type should still be enough to store the # of bits per line.

Sergey Shtylyov (2):
  video: fbdev: core: cfbcopyarea: fix sloppy typing
  video: fbdev: core: syscopyarea: fix sloppy typing

 drivers/video/fbdev/core/cfbcopyarea.c | 2 +-
 drivers/video/fbdev/core/syscopyarea.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

-- 
2.26.3

[PATCH v2 1/2] video: fbdev: core: cfbcopyarea: fix sloppy typing

2023-10-13 Thread Sergey Shtylyov

In cfb_copyarea(), the local variable bits_per_line is needlessly typed as
*unsigned long* -- which is a 32-bit type on the 32-bit arches and a 64-bit
type on the 64-bit arches; that variable's value is derived from the __u32
typed fb_fix_screeninfo::line_length field (multiplied by 8u) and a 32-bit
*unsigned int* type should still be enough to store the # of bits per line.

Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool.

Signed-off-by: Sergey Shtylyov 
---
 drivers/video/fbdev/core/cfbcopyarea.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/core/cfbcopyarea.c 
b/drivers/video/fbdev/core/cfbcopyarea.c
index 6d4bfeecee35..5b80bf3dae50 100644
--- a/drivers/video/fbdev/core/cfbcopyarea.c
+++ b/drivers/video/fbdev/core/cfbcopyarea.c
@@ -382,7 +382,7 @@ void cfb_copyarea(struct fb_info *p, const struct 
fb_copyarea *area)
 {
u32 dx = area->dx, dy = area->dy, sx = area->sx, sy = area->sy;
u32 height = area->height, width = area->width;
-   unsigned long const bits_per_line = p->fix.line_length*8u;
+   unsigned int const bits_per_line = p->fix.line_length * 8u;
unsigned long __iomem *base = NULL;
int bits = BITS_PER_LONG, bytes = bits >> 3;
unsigned dst_idx = 0, src_idx = 0, rev_copy = 0;
-- 
2.26.3

Re: Implement per-key keyboard backlight as auxdisplay?

2023-10-13 Thread Pavel Machek

Hi!

> coming from the leds mailing list I'm writing with Pavel how to best handle
> per-key RGB keyboards.
> 
> His suggestion was that it could be implemented as an aux display, but he
> also suggested that I ask first if this fits.

Thanks for doing this.

> The specific keyboard RGB controller I want to implement takes 6*21 rgb
> values. However not every one is actually mapped to a physical key. e.g. the
> bottom row needs less entries because of the space bar. Additionally the
> keys are ofc not in a straight line from top to bottom.

So... a bit of rationale. The keyboard does not really fit into the
LED subsystem; LEDs are expected to be independent ("hdd led") and not
a matrix of them.

We do see various strange displays these days -- they commonly have
rounded corners and holes in them. I'm not sure how that's currently
supported, but I believe it is reasonable to view keyboard as a
display with slightly weird placing of pixels.

Plus, I'd really like to play tetris on one of those :-).

So, would presenting them as auxdisplay be acceptable? Or are there
better options?

Best regards,
Pavel
-- 
People of Russia, stop Putin before his war on Ukraine escalates.


signature.asc
Description: PGP signature

Re: Implement per-key keyboard backlight as auxdisplay?

2023-10-13 Thread Pavel Machek

Hi!

> > The specific keyboard RGB controller I want to implement takes 6*21 rgb
> > values. However not every one is actually mapped to a physical key. e.g. the
> > bottom row needs less entries because of the space bar. Additionally the
> > keys are ofc not in a straight line from top to bottom.
> 
> So... a bit of rationale. The keyboard does not really fit into the
> LED subsystem; LEDs are expected to be independent ("hdd led") and not
> a matrix of them.
> 
> We do see various strange displays these days -- they commonly have
> rounded corners and holes in them. I'm not sure how that's currently
> supported, but I believe it is reasonable to view keyboard as a
> display with slightly weird placing of pixels.
> 
> Plus, I'd really like to play tetris on one of those :-).
> 
> So, would presenting them as auxdisplay be acceptable? Or are there
> better options?

Oh and... My existing keyboard membrane-based Chicony, and it is from
time when PS/2 was still in wide use. I am slowly looking for a new
keyboard. If you know of one with nice mechanical switches, RGB
backlight with known protocol, and hopefully easily available in Czech
republic, let me know.

Best regards,
Pavel
-- 
People of Russia, stop Putin before his war on Ukraine escalates.


signature.asc
Description: PGP signature

Re: [PATCH 5/7] drm/sun4i: dw-hdmi: Split driver registration

2023-10-13 Thread Jernej Škrabec

Dne četrtek, 05. oktober 2023 ob 10:43:14 CEST je Maxime Ripard napisal(a):
> On Mon, Sep 25, 2023 at 05:07:45PM +0200, Jernej Škrabec wrote:
> > Dne ponedeljek, 25. september 2023 ob 09:47:15 CEST je Maxime Ripard 
> > napisal(a):
> > > On Sun, Sep 24, 2023 at 09:26:02PM +0200, Jernej Skrabec wrote:
> > > > There is no reason to register two drivers in same place. Using macro
> > > > lowers amount of boilerplate code.
> > > 
> > > There's one actually: you can't have several module_init functions in
> > > the some module, and both files are compiled into the same module.
> > 
> > Yeah, I figured as much. However, I think code clean up is good enough 
> > reason
> > to add hidden option in Kconfig and extra entry in Makefile (in v2).
> > 
> > Do you agree?
> 
> Yeah, I don't know. Adding more modules makes it more difficult to
> handle (especially autoloading) without a clear argument why.
> Module_init is simple enough as it is currently, maybe we should just
> add a comment to make it clearer?

I'll just drop this patch then. While I think autoloading works pretty good
these days and cleaner code is nice, it can certainly cause some issues while
packaging.

Best regards,
Jernej

Re: [PATCH 5/9] dma-buf: heaps: mtk_sec_heap: Initialise tee session

2023-10-13 Thread Jeffrey Kardatzke

Sorry for the delayed reply, needed to get some more info. This really
wouldn't be possible due to the limitation on the number of
regions...for example only 32 regions can be defined on some SoCs, and
you'd run out of regions really fast trying to do this. That's why
this is creating heaps for those regions and then allocations are
performed within the defined region is the preferred strategy.


On Thu, Sep 28, 2023 at 11:54 PM Benjamin Gaignard
 wrote:
>
>
> Le 28/09/2023 à 19:48, Jeffrey Kardatzke a écrit :
> > On Thu, Sep 28, 2023 at 1:30 AM Benjamin Gaignard
> >  wrote:
> >>
> >> Le 27/09/2023 à 20:56, Jeffrey Kardatzke a écrit :
> >>> On Wed, Sep 27, 2023 at 8:18 AM Benjamin Gaignard
> >>>  wrote:
>  Le 27/09/2023 à 15:46, Joakim Bech a écrit :
> > On Mon, Sep 25, 2023 at 12:49:50PM +, Yong Wu (吴勇) wrote:
> >> On Tue, 2023-09-12 at 11:32 +0200, AngeloGioacchino Del Regno wrote:
> >>> Il 12/09/23 08:17, Yong Wu (吴勇) ha scritto:
>  On Mon, 2023-09-11 at 11:29 +0200, AngeloGioacchino Del Regno
>  wrote:
> > Il 11/09/23 04:30, Yong Wu ha scritto:
> >> The TEE probe later than dma-buf heap, and PROBE_DEDER doesn't
> >> work
> >> here since this is not a platform driver, therefore initialise
> >> the
> >> TEE
> >> context/session while we allocate the first secure buffer.
> >>
> >> Signed-off-by: Yong Wu 
> >> ---
> >>   drivers/dma-buf/heaps/mtk_secure_heap.c | 61
> >> +
> >>   1 file changed, 61 insertions(+)
> >>
> >> diff --git a/drivers/dma-buf/heaps/mtk_secure_heap.c
> >> b/drivers/dma-
> >> buf/heaps/mtk_secure_heap.c
> >> index bbf1c8dce23e..e3da33a3d083 100644
> >> --- a/drivers/dma-buf/heaps/mtk_secure_heap.c
> >> +++ b/drivers/dma-buf/heaps/mtk_secure_heap.c
> >> @@ -10,6 +10,12 @@
> >>   #include 
> >>   #include 
> >>   #include 
> >> +#include 
> >> +#include 
> >> +
> >> +#define TZ_TA_MEM_UUID  "4477588a-8476-11e2-ad15-
> >> e41f1390d676"
> >> +
> > Is this UUID the same for all SoCs and all TZ versions?
>  Yes. It is the same for all SoCs and all TZ versions currently.
> 
> >>> That's good news!
> >>>
> >>> Is this UUID used in any userspace component? (example: Android
> >>> HALs?)
> >> No. Userspace never use it. If userspace would like to allocate this
> >> secure buffer, it can achieve through the existing dmabuf IOCTL via
> >> /dev/dma_heap/mtk_svp node.
> >>
> > In general I think as mentioned elsewhere in comments, that there isn't
> > that much here that seems to be unique for MediaTek in this patch
> > series, so I think it worth to see whether this whole patch set can be
> > made more generic. Having said that, the UUID is always unique for a
> > certain Trusted Application. So, it's not entirely true saying that the
> > UUID is the same for all SoCs and all TrustZone versions. It might be
> > true for a family of MediaTek devices and the TEE in use, but not
> > generically.
> >
> > So, if we need to differentiate between different TA implementations,
> > then we need different UUIDs. If it would be possible to make this patch
> > set generic, then it sounds like a single UUID would be sufficient, but
> > that would imply that all TA's supporting such a generic UUID would be
> > implemented the same from an API point of view. Which also means that
> > for example Trusted Application function ID's needs to be the same etc.
> > Not impossible to achieve, but still not easy (different TEE follows
> > different specifications) and it's not typically something we've done in
> > the past.
> >
> > Unfortunately there is no standardized database of TA's describing what
> > they implement and support.
> >
> > As an alternative, we could implement a query call in the TEE answering,
> > "What UUID does your TA have that implements secure unmapped heap?".
> > I.e., something that reminds of a lookup table. Then we wouldn't have to
> > carry this in UAPI, DT or anywhere else.
>  Joakim does a TA could offer a generic API and hide the hardware specific
>  details (like kernel uAPI does for drivers) ?
> >>> It would have to go through another layer (like the tee driver) to be
> >>> a generic API. The main issue with TAs is that they have UUIDs you
> >>> need to connect to and specific codes for each function; so we should
> >>> abstract at a layer above where those exist in the dma-heap code.
>  Aside that question I wonder what are the needs to perform a 'secure' 
>  playback.
>  I have in mind 2 requirements:
>  - secure memory regions, which means configure the hardware to ensure 
>

Re: [PATCH 5/9] dma-buf: heaps: mtk_sec_heap: Initialise tee session

2023-10-13 Thread Jeffrey Kardatzke

Sorry for the delayed reply, needed to get some more info. This really
wouldn't be possible due to the limitation on the number of regions...for
example only 32 regions can be defined on some SoCs, and you'd run out of
regions really fast trying to do this. That's why this is creating heaps
for those regions and then allocations are performed within the defined
region is the preferred strategy.

On Thu, Sep 28, 2023 at 11:54 PM Benjamin Gaignard <
benjamin.gaign...@collabora.com> wrote:

>
> Le 28/09/2023 à 19:48, Jeffrey Kardatzke a écrit :
> > On Thu, Sep 28, 2023 at 1:30 AM Benjamin Gaignard
> >  wrote:
> >>
> >> Le 27/09/2023 à 20:56, Jeffrey Kardatzke a écrit :
> >>> On Wed, Sep 27, 2023 at 8:18 AM Benjamin Gaignard
> >>>  wrote:
>  Le 27/09/2023 à 15:46, Joakim Bech a écrit :
> > On Mon, Sep 25, 2023 at 12:49:50PM +, Yong Wu (吴勇) wrote:
> >> On Tue, 2023-09-12 at 11:32 +0200, AngeloGioacchino Del Regno wrote:
> >>> Il 12/09/23 08:17, Yong Wu (吴勇) ha scritto:
>  On Mon, 2023-09-11 at 11:29 +0200, AngeloGioacchino Del Regno
>  wrote:
> > Il 11/09/23 04:30, Yong Wu ha scritto:
> >> The TEE probe later than dma-buf heap, and PROBE_DEDER doesn't
> >> work
> >> here since this is not a platform driver, therefore initialise
> >> the
> >> TEE
> >> context/session while we allocate the first secure buffer.
> >>
> >> Signed-off-by: Yong Wu 
> >> ---
> >>   drivers/dma-buf/heaps/mtk_secure_heap.c | 61
> >> +
> >>   1 file changed, 61 insertions(+)
> >>
> >> diff --git a/drivers/dma-buf/heaps/mtk_secure_heap.c
> >> b/drivers/dma-
> >> buf/heaps/mtk_secure_heap.c
> >> index bbf1c8dce23e..e3da33a3d083 100644
> >> --- a/drivers/dma-buf/heaps/mtk_secure_heap.c
> >> +++ b/drivers/dma-buf/heaps/mtk_secure_heap.c
> >> @@ -10,6 +10,12 @@
> >>   #include 
> >>   #include 
> >>   #include 
> >> +#include 
> >> +#include 
> >> +
> >> +#define TZ_TA_MEM_UUID  "4477588a-8476-11e2-ad15-
> >> e41f1390d676"
> >> +
> > Is this UUID the same for all SoCs and all TZ versions?
>  Yes. It is the same for all SoCs and all TZ versions currently.
> 
> >>> That's good news!
> >>>
> >>> Is this UUID used in any userspace component? (example: Android
> >>> HALs?)
> >> No. Userspace never use it. If userspace would like to allocate this
> >> secure buffer, it can achieve through the existing dmabuf IOCTL via
> >> /dev/dma_heap/mtk_svp node.
> >>
> > In general I think as mentioned elsewhere in comments, that there
> isn't
> > that much here that seems to be unique for MediaTek in this patch
> > series, so I think it worth to see whether this whole patch set can
> be
> > made more generic. Having said that, the UUID is always unique for a
> > certain Trusted Application. So, it's not entirely true saying that
> the
> > UUID is the same for all SoCs and all TrustZone versions. It might be
> > true for a family of MediaTek devices and the TEE in use, but not
> > generically.
> >
> > So, if we need to differentiate between different TA implementations,
> > then we need different UUIDs. If it would be possible to make this
> patch
> > set generic, then it sounds like a single UUID would be sufficient,
> but
> > that would imply that all TA's supporting such a generic UUID would
> be
> > implemented the same from an API point of view. Which also means that
> > for example Trusted Application function ID's needs to be the same
> etc.
> > Not impossible to achieve, but still not easy (different TEE follows
> > different specifications) and it's not typically something we've
> done in
> > the past.
> >
> > Unfortunately there is no standardized database of TA's describing
> what
> > they implement and support.
> >
> > As an alternative, we could implement a query call in the TEE
> answering,
> > "What UUID does your TA have that implements secure unmapped heap?".
> > I.e., something that reminds of a lookup table. Then we wouldn't
> have to
> > carry this in UAPI, DT or anywhere else.
>  Joakim does a TA could offer a generic API and hide the hardware
> specific
>  details (like kernel uAPI does for drivers) ?
> >>> It would have to go through another layer (like the tee driver) to be
> >>> a generic API. The main issue with TAs is that they have UUIDs you
> >>> need to connect to and specific codes for each function; so we should
> >>> abstract at a layer above where those exist in the dma-heap code.
>  Aside that question I wonder what are the needs to perform a 'secure'
> playback.
>  I have in mind 2 requirements:
>  - secure memory regions,

[PATCH 3/5] drm/panel: st7703: Add Powkiddy RGB30 Panel Support

2023-10-13 Thread Chris Morgan

From: Chris Morgan 

The Powkiddy RGB30 4 inch panel is a 4 inch 720x720 DSI panel used in
the Powkiddy RGB30 handheld gaming device. Add support for it.

TODO: The panel seems to not resume properly from suspend. I've
confirmed on the other ST7703 based devices it works correctly.

Signed-off-by: Chris Morgan 
---
 drivers/gpu/drm/panel/panel-sitronix-st7703.c | 89 +++
 1 file changed, 89 insertions(+)

diff --git a/drivers/gpu/drm/panel/panel-sitronix-st7703.c 
b/drivers/gpu/drm/panel/panel-sitronix-st7703.c
index 6a3945639535..a9892a4447fb 100644
--- a/drivers/gpu/drm/panel/panel-sitronix-st7703.c
+++ b/drivers/gpu/drm/panel/panel-sitronix-st7703.c
@@ -433,6 +433,94 @@ static const struct st7703_panel_desc rg353v2_desc = {
.init_sequence = rg353v2_init_sequence,
 };
 
+static int rgb30panel_init_sequence(struct st7703 *ctx)
+{
+   struct mipi_dsi_device *dsi = to_mipi_dsi_device(ctx->dev);
+
+   /* Init sequence extracted from Powkiddy RGB30 BSP kernel. */
+
+   /*
+* For some reason this specific panel must be taken out of sleep
+* before the full init sequence, or else it will not display.
+*/
+   mipi_dsi_dcs_exit_sleep_mode(dsi);
+   msleep(250);
+
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETEXTC, 0xf1, 0x12, 0x83);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETMIPI, 0x33, 0x81, 0x05, 0xf9,
+  0x0e, 0x0e, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x44, 0x25, 0x00, 0x90, 0x0a, 0x00,
+  0x00, 0x01, 0x4f, 0x01, 0x00, 0x00, 0x37);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETPOWER_EXT, 0x25, 0x22, 0xf0,
+  0x63);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_UNKNOWN_BF, 0x02, 0x11, 0x00);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETRGBIF, 0x10, 0x10, 0x28,
+  0x28, 0x03, 0xff, 0x00, 0x00, 0x00, 0x00);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETSCR, 0x73, 0x73, 0x50, 0x50,
+  0x00, 0x00, 0x12, 0x70, 0x00);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETVDC, 0x46);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETPANEL, 0x0b);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETCYC, 0x80);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETDISP, 0x3c, 0x12, 0x30);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETEQ, 0x07, 0x07, 0x0b, 0x0b,
+  0x03, 0x0b, 0x00, 0x00, 0x00, 0x00, 0xff, 0x00,
+  0xc0, 0x10);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETPOWER, 0x36, 0x00, 0x32,
+  0x32, 0x77, 0xf1, 0xcc, 0xcc, 0x77, 0x77, 0x33,
+  0x33);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETBGP, 0x0a, 0x0a);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETVCOM, 0x88, 0x88);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETGIP1, 0xc8, 0x10, 0x0a, 0x10,
+  0x0f, 0xa1, 0x80, 0x12, 0x31, 0x23, 0x47, 0x86,
+  0xa1, 0x80, 0x47, 0x08, 0x00, 0x00, 0x0d, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x0d, 0x00, 0x00, 0x00,
+  0x48, 0x02, 0x8b, 0xaf, 0x46, 0x02, 0x88, 0x88,
+  0x88, 0x88, 0x88, 0x48, 0x13, 0x8b, 0xaf, 0x57,
+  0x13, 0x88, 0x88, 0x88, 0x88, 0x88, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETGIP2, 0x96, 0x12, 0x01, 0x01,
+  0x01, 0x78, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x4f, 0x31, 0x8b, 0xa8, 0x31, 0x75, 0x88, 0x88,
+  0x88, 0x88, 0x88, 0x4f, 0x20, 0x8b, 0xa8, 0x20,
+  0x64, 0x88, 0x88, 0x88, 0x88, 0x88, 0x23, 0x00,
+  0x00, 0x01, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+  0x00, 0x00, 0x40, 0xa1, 0x80, 0x00, 0x00, 0x00,
+  0x00);
+   mipi_dsi_dcs_write_seq(dsi, ST7703_CMD_SETGAMMA, 0x00, 0x0a, 0x0f,
+  0x29, 0x3b, 0x3f, 0x42, 0x39, 0x06, 0x0d, 0x10,
+  0x13, 0x15, 0x14, 0x15, 0x10, 0x17, 0x00, 0x0a,
+  0x0f, 0x29, 0x3b, 0x3f, 0x42, 0x39, 0x06, 0x0d,
+  0x10, 0x13, 0x15, 0x14, 0x15, 0x10, 0x17);
+
+   return 0;
+}
+
+static const struct drm_display_mode rgb30panel_mode = {
+   .hdisplay   = 720,
+   .hsync_start= 720 + 45,
+   .hsync_end  = 720 + 45 + 4,
+   .htotal = 720 + 45 + 4 + 45,
+   .vdisplay   = 720,
+   .vsync_start= 720 + 15,
+   .vsync_end  = 720 + 15 + 3,
+

[PATCH 5/5] arm64: dts: rockchip: add support for Powkiddy RGB30

2023-10-13 Thread Chris Morgan

From: Chris Morgan 

The Powkiddy RGB30 is a portable game device based on the Rockchip
RK3566 SoC. It has GPIO buttons on the face and sides for input, stereo
speakers, a 720x720 4 inch DSI display, a USB-C host port and a USB-C
peripheral port, dual SD card slots, WiFi, Bluetooth, and 1GB of RAM.

Working/Tested:
- SDMMC
- UART (for debugging)
- Buttons
- Charging/battery/PMIC
- Speaker/Headphones
- USB
- WiFi
- Bluetooth
- Display (at 59.04hz)

Not Working/TODO:
- Display does not resume from suspend properly. I'm working with the
  manufacturer to resolve this.

Signed-off-by: Chris Morgan 
---
 arch/arm64/boot/dts/rockchip/Makefile |   1 +
 .../dts/rockchip/rk3566-powkiddy-rgb30.dts| 152 ++
 2 files changed, 153 insertions(+)
 create mode 100644 arch/arm64/boot/dts/rockchip/rk3566-powkiddy-rgb30.dts

diff --git a/arch/arm64/boot/dts/rockchip/Makefile 
b/arch/arm64/boot/dts/rockchip/Makefile
index e7728007fd1b..3f01b429a3aa 100644
--- a/arch/arm64/boot/dts/rockchip/Makefile
+++ b/arch/arm64/boot/dts/rockchip/Makefile
@@ -77,6 +77,7 @@ dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3566-anbernic-rg353vs.dtb
 dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3566-anbernic-rg503.dtb
 dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3566-pinenote-v1.1.dtb
 dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3566-pinenote-v1.2.dtb
+dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3566-powkiddy-rgb30.dtb
 dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3566-quartz64-a.dtb
 dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3566-quartz64-b.dtb
 dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3566-radxa-cm3-io.dtb
diff --git a/arch/arm64/boot/dts/rockchip/rk3566-powkiddy-rgb30.dts 
b/arch/arm64/boot/dts/rockchip/rk3566-powkiddy-rgb30.dts
new file mode 100644
index ..c7828c99a1bb
--- /dev/null
+++ b/arch/arm64/boot/dts/rockchip/rk3566-powkiddy-rgb30.dts
@@ -0,0 +1,152 @@
+// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+
+/dts-v1/;
+
+#include 
+#include 
+#include 
+#include "rk3566-anbernic-rg353x.dtsi"
+
+/ {
+   model = "RGB30";
+   compatible = "powkiddy,rgb30", "rockchip,rk3566";
+
+   aliases {
+   mmc1 = 
+   mmc2 = 
+   mmc3 = 
+   };
+
+   battery: battery {
+   compatible = "simple-battery";
+   charge-full-design-microamp-hours = <3151000>;
+   charge-term-current-microamp = <30>;
+   constant-charge-current-max-microamp = <200>;
+   constant-charge-voltage-max-microvolt = <425>;
+   factory-internal-resistance-micro-ohms = <117000>;
+   voltage-max-design-microvolt = <4172000>;
+   voltage-min-design-microvolt = <340>;
+
+   ocv-capacity-celsius = <20>;
+   ocv-capacity-table-0 =  <4172000 100>, <4092000 95>, <4035000 
90>, <399 85>,
+   <3939000 80>, <3895000 75>, <3852000 
70>, <3807000 65>,
+   <3762000 60>, <3713000 55>, <3672000 
50>, <3647000 45>,
+   <3629000 40>, <3613000 35>, <3598000 
30>, <3578000 25>,
+   <355 20>, <3519000 15>, <3479000 
10>, <3438000 5>,
+   <340 0>;
+   };
+
+   /*
+* Channels reversed for speakers. Headphones automatically switch via 
hardware when
+* detected with no ability to control output in software. Headphones 
appear to be mono
+* (each output channel receives all audio). No microphone support on 
3.5mm jack.
+*/
+   sound {
+   compatible = "simple-audio-card";
+   simple-audio-card,name = "rk817_ext";
+   simple-audio-card,format = "i2s";
+   simple-audio-card,mclk-fs = <256>;
+   simple-audio-card,widgets =
+   "Headphone", "Headphones";
+   simple-audio-card,routing =
+   "Headphones", "HPOL",
+   "Headphones", "HPOR";
+
+   simple-audio-card,codec {
+   sound-dai = <>;
+   };
+
+   simple-audio-card,cpu {
+   sound-dai = <_8ch>;
+   };
+   };
+};
+
+/delete-node/ _keys;
+
+ {
+   assigned-clocks = < CLK_RTC_32K>, < PLL_GPLL>,
+ < PLL_PPLL>, < PLL_VPLL>;
+   assigned-clock-rates = <32768>, <12>,
+  <2>, <10800>;
+};
+
+_keys_control {
+   button-r1 {
+   gpios = < RK_PB3 GPIO_ACTIVE_LOW>;
+   label = "TR";
+   linux,code = ;
+   };
+
+   button-r2 {
+   gpios = < RK_PB4 GPIO_ACTIVE_LOW>;
+   label = "TR2";
+   linux,code = ;
+   };
+};
+
+/delete-node/ &{/i2c@fdd4/regulator@40};
+
+ {
+   vdd_cpu: regulator@1c {
+   compatible = "tcs,tcs4525";
+   reg = <0x1c>;
+   fcs,suspend-voltage-selector =

[PATCH 1/5] dt-bindings: vendor-prefixes: document Powkiddy

2023-10-13 Thread Chris Morgan

From: Chris Morgan 

Document Powkiddy (https://powkiddy.com/).

Signed-off-by: Chris Morgan 
---
 Documentation/devicetree/bindings/vendor-prefixes.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/vendor-prefixes.yaml 
b/Documentation/devicetree/bindings/vendor-prefixes.yaml
index 573578db9509..25fd2dc378f5 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.yaml
+++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml
@@ -1081,6 +1081,8 @@ patternProperties:
 description: Powertip Tech. Corp.
   "^powervr,.*":
 description: PowerVR (deprecated, use img)
+  "^powkiddy,.*":
+description: Powkiddy
   "^primux,.*":
 description: Primux Trading, S.L.
   "^probox2,.*":
-- 
2.34.1

[PATCH 4/5] dt-bindings: arm64: rockchip: add Powkiddy RGB30

2023-10-13 Thread Chris Morgan

From: Chris Morgan 

The Powkiddy RGB30 is a portable handheld console from Powkiddy which
uses the Rockchip RK3566 SoC.

Signed-off-by: Chris Morgan 
---
 Documentation/devicetree/bindings/arm/rockchip.yaml | 5 +
 1 file changed, 5 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/rockchip.yaml 
b/Documentation/devicetree/bindings/arm/rockchip.yaml
index ca5389862887..a349bf4da6bc 100644
--- a/Documentation/devicetree/bindings/arm/rockchip.yaml
+++ b/Documentation/devicetree/bindings/arm/rockchip.yaml
@@ -669,6 +669,11 @@ properties:
   - const: pine64,soquartz
   - const: rockchip,rk3566
 
+  - description: Powkiddy RGB30
+items:
+  - const: powkiddy,rgb30
+  - const: rockchip,rk3566
+
   - description: Radxa Compute Module 3(CM3)
 items:
   - enum:
-- 
2.34.1

[PATCH 2/5] dt-bindings: panel: Add Powkiddy RGB30 panel compatible

2023-10-13 Thread Chris Morgan

From: Chris Morgan 

The Powkiddy RGB30 panel is a 4 inch 720x720 MIPI-DSI LCD panel. It
appears to be based on the ST7703 LCD controller (this is assumed from
the init sequence similarity between this and other displays). Powkiddy
would not share the part number or name for the display from the bill
of materials and there were no obvious external markings, so name the
panel for the device (Powkiddy RGB30).

Signed-off-by: Chris Morgan 
---
 .../devicetree/bindings/display/panel/rocktech,jh057n00900.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git 
a/Documentation/devicetree/bindings/display/panel/rocktech,jh057n00900.yaml 
b/Documentation/devicetree/bindings/display/panel/rocktech,jh057n00900.yaml
index 5ea74426b1d5..97cccd8a8479 100644
--- a/Documentation/devicetree/bindings/display/panel/rocktech,jh057n00900.yaml
+++ b/Documentation/devicetree/bindings/display/panel/rocktech,jh057n00900.yaml
@@ -22,6 +22,8 @@ properties:
 enum:
   # Anberic RG353V-V2 5.0" 640x480 TFT LCD panel
   - anbernic,rg353v-panel-v2
+  # Powkiddy RGB30 3.0" 720x720 TFT LCD panel
+  - powkiddy,rgb30-panel
   # Rocktech JH057N00900 5.5" 720x1440 TFT LCD panel
   - rocktech,jh057n00900
   # Xingbangda XBD599 5.99" 720x1440 TFT LCD panel
-- 
2.34.1

[PATCH 0/5] Add Support for RK3566 Powkiddy RGB30

2023-10-13 Thread Chris Morgan

From: Chris Morgan 

Add support for the Powkiddy RGB30 handheld gaming console.

Chris Morgan (5):
  dt-bindings: vendor-prefixes: document Powkiddy
  dt-bindings: panel: Add Powkiddy RGB30 panel compatible
  drm/panel: st7703: Add Powkiddy RGB30 Panel Support
  dt-bindings: arm64: rockchip: add Powkiddy RGB30
  arm64: dts: rockchip: add support for Powkiddy RGB30

 .../devicetree/bindings/arm/rockchip.yaml |   5 +
 .../display/panel/rocktech,jh057n00900.yaml   |   2 +
 .../devicetree/bindings/vendor-prefixes.yaml  |   2 +
 arch/arm64/boot/dts/rockchip/Makefile |   1 +
 .../dts/rockchip/rk3566-powkiddy-rgb30.dts| 152 ++
 drivers/gpu/drm/panel/panel-sitronix-st7703.c |  89 ++
 6 files changed, 251 insertions(+)
 create mode 100644 arch/arm64/boot/dts/rockchip/rk3566-powkiddy-rgb30.dts

-- 
2.34.1

BUG: KASAN: slab-use-after-free in drm_connector_cleanup

2023-10-13 Thread Naresh Kamboju

Following kasan bug was noticed on arm64 bcm2711-rpi-4-b device running
Linux next 6.6.0-rc5-next-20231011 with given config.

This was first noticed on 6.6.0-rc2-next-20230918.
This is an intermittent issue.
Links to build and test logs provided.

Reported-by: Linux Kernel Functional Testing 

Boot log:
-

[  209.991842] Bluetooth: Core ver 2.22
[  209.996014] NET: Registered PF_BLUETOOTH protocol family
[  210.002564] Bluetooth: HCI device and connection manager initialized
[  210.009204] Bluetooth: HCI socket layer initialized
[  210.014443] Bluetooth: L2CAP socket layer initialized
[  210.019906] Bluetooth: SCO socket layer initialized
[  210.175889] KTAP version 1
[  210.178771] 1..3
[  210.184030] KTAP version 1
[  210.187234] # Subtest: vc4-pv-muxing-combinations
[  210.192491] # module: vc4
[  210.192572] 1..2
[  210.197937] KTAP version 1
[  210.201544] # Subtest: drm_vc4_test_pv_muxing
[  210.218416] Bluetooth: HCI UART driver ver 2.3
[  210.236869] [drm] Initialized vc4 0.0.0 20140616 for
drm-kunit-mock-device on minor 1
[  210.241063] Bluetooth: HCI UART protocol H4 registered
[  210.320009] Bluetooth: HCI UART protocol LL registered
[  210.464457] Bluetooth: HCI UART protocol Broadcom registered
[  210.470871] hci_uart_bcm serial0-0: supply vbat not found, using
dummy regulator
[  210.472120] Bluetooth: HCI UART protocol QCA registered
[  210.480123] hci_uart_bcm serial0-0: supply vddio not found, using
dummy regulator
[  210.490971] Bluetooth: HCI UART protocol Marvell registered

Debian GNU/Linux 12 runner-vwmj3eza-project-40964107-concurrent-0 ttyS0

runner-vwmj3eza-project-40964107-concurrent-0 login: [  2root
10.623188] cfg80211: Loading compiled-in X.509 certificates for
regulatory database
[  210.658616] 
==
[  210.666006] BUG: KASAN: slab-use-after-free in
drm_connector_cleanup+0x30/0x488 drm
[  210.675144] Read of size 8 at addr 000113a8e0a8 by task
kunit_try_catch/1750
[  210.682685]
[  210.684219] CPU: 1 PID: 1750 Comm: kunit_try_catch Tainted: GB
  N 6.6.0-rc5-next-20231011 #1
[  210.694056] Hardware name: Raspberry Pi 4 Model B (DT)
[  210.699323] Call trace:
r[ 210.701824] dump_backtrace (arch/arm64/kernel/stacktrace.c:235)
[  210.705757] show_stack (arch/arm64/kernel/stacktrace.c:242)
[  210.709160] dump_stack_lvl (lib/dump_stack.c:107)
[  210.712917] print_report (mm/kasan/report.c:365 mm/kasan/report.c:475)
[  210.716560] kasan_report (mm/kasan/report.c:590)
o[ 210.720201] __asan_load8 (mm/kasan/generic.c:260)
o[ 210.723827] drm_connector_cleanup+0x30/0x488 drm
[  210.729265] drm_connector_cleanup_action+0x1c/0x30 drm
t[ 210.735228] drm_managed_release+0x128/0x228 drm
[  210.740570] drm_dev_put.part.0+0xb4/0xf8 drm
[  210.742616] Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[  210.745649] devm_drm_dev_init_release+0x1c/0x38 drm
[  210.756667] devm_action_release (drivers/base/devres.c:722)
[  210.760840] release_nodes (drivers/base/devres.c:506)
[  210.764569] devres_release_all (drivers/base/devres.c:535)
[  210.765145] platform regulatory.0: Direct firmware load for
regulatory.db failed with error -2
[  210.768737] device_unbind_cleanup (drivers/base/dd.c:551)
[  210.768759] device_release_driver_internal (drivers/base/dd.c:1280
drivers/base/dd.c:1295)
[  210.768775] device_release_driver (drivers/base/dd.c:1319)
[  210.768788] bus_remove_device (include/linux/kobject.h:193
drivers/base/base.h:73 drivers/base/bus.c:581)
[  210.768801] device_del (drivers/base/core.c:3815)
[  210.768812] platform_device_del.part.0 (drivers/base/platform.c:753)
[  210.768824] platform_device_del (drivers/base/platform.c:764)

[  210.768834] kunit_action_platform_device_del+0x18/0x30 drm_kunit_helpers
[  210.810264] cfg80211: failed to load regulatory.db
[  210.815438] __kunit_action_free (lib/kunit/resource.c:92)
[  210.815460] kunit_remove_resource (include/kunit/resource.h:120
include/linux/kref.h:65 include/kunit/resource.h:138
lib/kunit/resource.c:59 lib/kunit/resource.c:48)
[  210.815474] kunit_cleanup (lib/kunit/test.c:815 (discriminator 1))
[  210.815486] kunit_try_run_case_cleanup (lib/kunit/test.c:412)
[  210.815499] kunit_generic_run_threadfn_adapter (lib/kunit/try-catch.c:30)
[  210.815514] kthread (kernel/kthread.c:388)
[  210.815526] ret_from_fork (arch/arm64/kernel/entry.S:858)
[  210.815540]
[  210.815544] Allocated by task 1745:
[  210.815552] kasan_save_stack (mm/kasan/common.c:46)
[  210.815567] kasan_set_track (mm/kasan/common.c:52 (discriminator 1))
[  210.815579] kasan_save_alloc_info (mm/kasan/generic.c:512)
[  210.815592] __kasan_kmalloc (mm/kasan/common.c:374 mm/kasan/common.c:383)
[  210.815603] __kmalloc (include/linux/kasan.h:198
mm/slab_common.c:1004 mm/slab_common.c:1017)
[  210.815613] kunit_kmalloc_array (include/linux/slab.h:637
lib/kunit/test.c:779)
[  210.815625] vc4_dummy_output+0xac/0x228 vc4
[  210.882798] __mock_device+0x24c/0x4b0 vc4
[

Re: [PATCH] drm/ci: Default to UART for logging

2023-10-13 Thread Rob Clark

On Fri, Oct 13, 2023 at 9:28 AM Daniel Stone  wrote:
>
> On Fri, 6 Oct 2023 at 18:32, Rob Clark  wrote:
> > ssh logging is the default for mesa, as it is generally more reliable.
> > But if there are kernel issues, especially at boot, UART logging is
> > infinitely more useful.
>
> Hmm, we should still be capturing the UART boot logs regardless. Those
> go into a collapsed 'LAVA boot' section but they don't just disappear
> ... ?

Hmm, I wasn't seeing anything in the raw log, which doesn't collapse sections..

That said, I still think uart is preferable to ssh for kernel CI.. we
aren't running jobs at the scale of mesa CI jobs so even if we get
UART flakes 1/1000 (or even 1/100) times, that is an acceptable
trade-off for the fact that uart can still work when things are too
fubar for ssh.

BR,
-R

[pull] amdgpu, amdkfd, radeon, UAPI drm-next-6.7

2023-10-13 Thread Alex Deucher

Hi Dave, Daniel,

New stuff for 6.7.

The following changes since commit 3698a75f5a98d0a6599e2878ab25d30a82dd836a:

  Merge tag 'drm-intel-next-fixes-2023-08-24' of 
git://anongit.freedesktop.org/drm/drm-intel into drm-next (2023-08-25 12:55:55 
+1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-next-6.7-2023-10-13

for you to fetch changes up to cd90511557fdfb394bb4ac4c3b539b007383914c:

  drm/amdgpu/vkms: fix a possible null pointer dereference (2023-10-13 11:36:25 
-0400)


amd-drm-next-6.7-2023-10-13:

amdgpu:
- DC replay fixes
- Misc code cleanups and spelling fixes
- Documentation updates
- RAS EEPROM Updates
- FRU EEPROM Updates
- IP discovery updates
- SR-IOV fixes
- RAS updates
- DC PQ fixes
- SMU 13.0.6 updates
- GC 11.5 Support
- NBIO 7.11 Support
- GMC 11 Updates
- Reset fixes
- SMU 11.5 Updates
- SMU 13.0 OD support
- Use flexible arrays for bo list handling
- W=1 Fixes
- SubVP fixes
- DPIA fixes
- DCN 3.5 Support
- Devcoredump fixes
- VPE 6.1 support
- VCN 4.0 Updates
- S/G display fixes
- DML fixes
- DML2 Support
- MST fixes
- VRR fixes
- Enable seamless boot in more cases
- Enable content type property for HDMI
- OLED fixes
- Rework and clean up GPUVM TLB flushing
- DC ODM fixes
- DP 2.x fixes
- AGP aperture fixes
- SDMA firmware loading cleanups
- Cyan Skillfish GPU clock counter fix
- GC 11 GART fix
- Cache GPU fault info for userspace queries
- DC cursor check fixes
- eDP fixes
- DC FP handling fixes
- Variable sized array fixes
- SMU 13.0.x fixes
- IB start and size alignment fixes for VCN
- SMU 14 Support
- Suspend and resume sequence rework
- vkms fix

amdkfd:
- GC 11 fixes
- GC 10 fixes
- Doorbell fixes
- CWSR fixes
- SVM fixes
- Clean up GC info enumeration
- Rework memory limit handling
- Coherent memory handling fixes
- Use partial migrations in GPU faults
- TLB flush fixes
- DMA unmap fixes
- GC 9.4.3 fixes
- SQ interrupt fix
- GTT mapping fix
- GC 11.5 Support

radeon:
- Misc code cleanups
- W=1 Fixes
- Fix possible buffer overflow
- Fix possible NULL pointer dereference

UAPI:
- Add EXT_COHERENT memory allocation flags.  These allow for system scope 
atomics.
  Proposed userspace: 
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88
- Add support for new VPE engine.  This is a memory to memory copy engine with 
advanced scaling, CSC, and color management features
  Proposed mesa MR: 
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25713
- Add INFO IOCTL interface to query GPU faults
  Proposed Mesa MR: 
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238
  Proposed libdrm MR: 
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/298


Aaron Liu (4):
  drm/amdgpu: add golden setting for gc_11_5_0
  drm/amdgpu: add imu firmware support for gc_11_5_0
  drm/amdgpu: add mes firmware support for gc_11_5_0
  drm/amdgpu/discovery: enable DCN 3.5.0 support

Agustin Gutierrez (1):
  drm/amd/display: Optimize OLED T7 delay

Alex Deucher (20):
  drm/amd/pm: fix debugfs pm_info output
  drm/amd/pm: fix error flow in sensor fetching
  drm/amdgpu/soc21: don't remap HDP registers for SR-IOV
  drm/amdgpu/nbio4.3: set proper rmmio_remap.reg_offset for SR-IOV
  drm/amdgpu: add vcn_doorbell_range callback for nbio 7.11
  drm/amdgpu: add remap_hdp_registers callback for nbio 7.11
  drm/amdgpu: add VPE IP discovery info to HW IP info query
  drm/amd/display: fix some style issues
  drm/amdgpu/gmc6-8: properly disable the AGP aperture
  drm/amdgpu/gmc: set a default disable value for AGP
  drm/amdgpu/gmc11: disable AGP on GC 11.5
  drm/amdkfd: reduce stack size in kfd_topology_add_device()
  drm/amdkfd: drop struct kfd_cu_info
  drm/amdgpu/gmc: add a way to force a particular placement for GART
  drm/amdgpu/gmc11: set gart placement GC11
  drm/amdgpu: add cached GPU fault structure to vm struct
  drm/amdgpu: cache gpuvm fault information for gmc7+
  drm/amdgpu: add new INFO ioctl query for the last GPU page fault
  drm/amdgpu: refine fault cache updates
  drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is active (v2)

Alex Hung (31):
  drm/amd/display: skip audio config for virtual signal
  drm/amd/display: Remove unwanted drm edid references
  drm/amd/display: Initialize writeback connector
  drm/amd/display: Hande writeback request from userspace
  drm/amd/display: Add writeback enable/disable in dc
  drm/amd/display: Fix writeback_info never got updated
  drm/amd/display: Validate hw_points_num before using it
  drm/amd/display: Fix writeback_info is not removed
  drm/amd/display: Add writeback enable field (wb_enabled)
  drm/amd/display: Setup for mmhubbub3_warmup_mcif with big buffer
  drm/amd/display: Add new

Re: [PATCH v5 3/7] drm/sched: Move schedule policy to scheduler

2023-10-13 Thread Luben Tuikov

On 2023-10-11 19:58, Matthew Brost wrote:
> Rather than a global modparam for scheduling policy, move the scheduling
> policy to scheduler so user can control each scheduler policy.
> 
> v2:
>   - s/DRM_SCHED_POLICY_MAX/DRM_SCHED_POLICY_COUNT (Luben)
>   - Only include policy in scheduler (Luben)
> v3:
>   - use a ternary operator as opposed to an if-control (Luben)
>   - s/DRM_SCHED_POLICY_DEFAULT/DRM_SCHED_POLICY_UNSET/ (Luben)
>   - s/default_drm_sched_policy/drm_sched_policy_default/ (Luben)
>   - Update commit message (Boris)
>   - Fix v3d build (CI)
>   - s/bad_policies/drm_sched_policy_mismatch/ (Luben)
>   - Don't update modparam doc (Luben)
> v4:
>   - Fix alignment in msm_ringbuffer_new (Luben / checkpatch)
> 
> Reviewed-by: Luben Tuikov 
> Signed-off-by: Matthew Brost 

Hi,

Forgot to mention this, but it is a very important process to note,
is that one should _never_ add someone else's R-V tag, _*UNLESS*_
a) there's an email from the person giving their review or ack, and
b) you're the one pushing the patch set into the tree.
If you're not the one pushing it into the tree, the maintainer will
add their R-V (after their reply-to follow-up email--see below),
including a Link: tag when they do "git am" after it's been all reviewed.

And there's a reason for this.

The reason is that when kernel maintainers (especially DRM via dim[1]) push
patches into the kernel, we want to add a Link: tag [2,3] pointing to
the thread where a) the patch was posted and b) where the reviewer gave
their Reviewed-by to the patch in a reply-all email, and at this moment
there is no such email for this patch.

When a maintainer says "Do X, Y, Z, for an immediate R-V", this means
do those things, post it, and get a reply from the maintainer with an
R-V. This records how it happened and is very helpful when doing
data mining on how and why the code changed, via what patches, etc.

I suspect there might be a v6, and we can do the R-V/Ack the right way
at that time. No big deal, but it's good to know in one place as it
is a bit scatter here and there in the kernel-doc.

[1] https://gitlab.freedesktop.org/drm/maintainer-tools/
[2] git am --message-id
[3] https://docs.kernel.org/maintainer/
-- 
Regards,
Luben

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  1 +
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c|  3 ++-
>  drivers/gpu/drm/lima/lima_sched.c  |  3 ++-
>  drivers/gpu/drm/msm/msm_ringbuffer.c   |  2 +-
>  drivers/gpu/drm/nouveau/nouveau_sched.c|  3 ++-
>  drivers/gpu/drm/panfrost/panfrost_job.c|  3 ++-
>  drivers/gpu/drm/scheduler/sched_entity.c   | 24 ++
>  drivers/gpu/drm/scheduler/sched_main.c | 19 -
>  drivers/gpu/drm/v3d/v3d_sched.c| 15 +-
>  include/drm/gpu_scheduler.h| 20 --
>  10 files changed, 68 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index b54c4d771104..e4e6f91450a4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2283,6 +2283,7 @@ static int amdgpu_device_init_schedulers(struct 
> amdgpu_device *adev)
>  ring->num_hw_submission, 0,
>  timeout, adev->reset_domain->wq,
>  ring->sched_score, ring->name,
> +DRM_SCHED_POLICY_UNSET,
>  adev->dev);
>   if (r) {
>   DRM_ERROR("Failed to create scheduler on ring %s.\n",
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index 618a804ddc34..15b0e2f1abe5 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -137,7 +137,8 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
>   ret = drm_sched_init(>sched, _sched_ops, NULL,
>etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
>msecs_to_jiffies(500), NULL, NULL,
> -  dev_name(gpu->dev), gpu->dev);
> +  dev_name(gpu->dev), DRM_SCHED_POLICY_UNSET,
> +  gpu->dev);
>   if (ret)
>   return ret;
>  
> diff --git a/drivers/gpu/drm/lima/lima_sched.c 
> b/drivers/gpu/drm/lima/lima_sched.c
> index 8d858aed0e56..50c2075228aa 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -491,7 +491,8 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, 
> const char *name)
>   return drm_sched_init(>base, _sched_ops, NULL, 1,
> lima_job_hang_limit,
> msecs_to_jiffies(timeout), NULL,
> -   NULL, name, pipe->ldev->dev);
> +   NULL, name, DRM_SCHED_POLICY_UNSET,
> +

Re: [PATCH v5 6/7] drm/ssd130x: Fix atomic_check for disabled planes

2023-10-13 Thread Javier Martinez Canillas

Thomas Zimmermann  writes:

Hello Thomas,

> The plane's atomic_check returns -EINVAL if the CRTC has not been
> set. This is the case for disabled planes, for which atomic_check
> should return 0. For disabled planes, it also omits the mandatory
> call to drm_atomic_helper_check_plane_state().
>
> Replace the test with the boiler-plate code that first invokes
> drm_atomic_helper_check_plane_state() and then tests for the plane
> to be visible. Return early for non-visible planes.
>
> Signed-off-by: Thomas Zimmermann 
> Fixes: d51f9fbd98b6 ("drm/ssd130x: Store the HW buffer in the driver-private 
> CRTC state")
> Reviewed-by: Javier Martinez Canillas 
> Tested-by: Javier Martinez Canillas 
> Cc: Geert Uytterhoeven 
> Cc: Javier Martinez Canillas 
> Cc: Maxime Ripard 
> ---

I've picked this since it's an unrelated fix to the rest of the series.

Pushed to drm-misc (drm-misc-next) because the offending commit is not
in mainline yet. Thanks!

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat

Re: [v3 2/3] drm/panel: ili9882t: Avoid blurred screen from fast sleep

2023-10-13 Thread Doug Anderson

Hi,

On Thu, Oct 12, 2023 at 8:56 PM cong yang
 wrote:
>
> Hi,
>
> On Fri, Oct 13, 2023 at 10:28 AM Doug Anderson  wrote:
> >
> > Hi,
> >
> > On Thu, Oct 12, 2023 at 6:12 PM cong yang
> >  wrote:
> > >
> > > Hi,
> > >
> > > On Thu, Oct 12, 2023 at 11:15 PM Doug Anderson  
> > > wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Thu, Oct 12, 2023 at 5:10 AM Cong Yang
> > > >  wrote:
> > > > >
> > > > > At present, we have found that there may be a problem of blurred
> > > > > screen during fast sleep/resume. The direct cause of the blurred
> > > > > screen is that the IC does not receive 0x28/0x10. Because of the
> > > > > particularity of the IC, before the panel enters sleep hid must
> > > > > stop scanning, as i2c_hid_core_suspend before ili9882t_disable.
> > > > > If move the ili9882t_enter_sleep_mode function to ili9882t_unprepare,
> > > > > touch reset will pull low before panel entersleep, which does not meet
> > > > > the timing requirements..
> > > >
> > > > The above makes me believe that the reset GPIO should be moved out of
> > > > the input driver and into the panel driver. I could just imagine that
> > > > the kernel might have some reason it wants to suspend the i2c hid
> > > > device. If that causes the panel to suddenly start failing then that
> > > > would be bad... I think we should fix this.
> > >
> > > Thanks, I will confirm with ilitek in further analysis and use "move
> > > the ili9882t_enter_sleep_mode
> > > function to ili9882t_unprepare".  Is the test failure really because
> > > the touch reset timing
> > > does not match? There is also a separate reset GPIO on the panel.
> > > Shouldn't touch reset not
> > > affect the panel?
> > >
> > > If we find a better solution I will continue upstream,。 So is it
> > > possible to apply this plan now?
> >
> > I wouldn't be too upset at applying the current code as long as you're
> > going to continue to investigate. We can always continue to iterate on
> > it and having something working reasonably well is better than nothing
> > at all. However, I probably would wait at least 1 week before applying
> > any patch from you just simply out of courtesy to give others on the
> > mailing list time to express their comments. ...presumably we could
> > get to the bottom of the problem in that 1 week time anyway...
> >
> > I'm not trying to be an obstinate pain here--I'm merely trying to make
> > sure that whatever we land will continue to work across kernel uprevs,
> > even if driver probe order / timing changes in the kernel. If the
> > panel is really so tied to the touchscreen device's reset GPIO timing
> > then it worries me. What happens, for instance, if you disable the
> > touchscreen CONFIG in the kernel? Does the panel still work, or is
> > that extra reset GPIO totally critical to the functioning of the
> > panel. If it's totally critical then it probably makes sense to move
> > to the panel driver given that the touchscreen is a panel follower
> > anyway...
>
> Thanks. It looks like the panel works fine after I disable the touch screen
> device. So the panel may not depend on touch screen reset.
> Need to continue investigating the root cause for current status.

Ah, OK. So I guess the issue is that the ideal case involves more
interleaving of things? Right now, I think this is what happens is at
power off:

1. We call the "disable" of the panel code which enters sleep mode.

2. As panel follower, the touchscreen gets called to power off
_before_ the panel's unprepare stage. This is when we assert the
touchscreen reset GPIO.

3. We call the "unprepare" of the panel code which deasserts the
"enable" pin of the panel and then disables regulators.


The proper sequence is:

1. Stop i2c hid scanning

2. Panel enter sleep

3. Assert touchscreen reset

4. Deassert the "enable" pin of the panel and disable regulators.


Ick. I guess the only way we'd be able to really make this work would
be to extend panel follower to notify followers before _both_ the
disable and the unprepare. I guess I can put that on my todo list and
we can see what folks think. Looking closely at it, I agree that I
don't think we want to move the "touchscreen reset" functionality into
the panel even if it would probably work. That feels ugly.

-Doug

Re: Warning notice "Memory manager not clean during takedown" on RB5 runs

2023-10-13 Thread Dmitry Baryshkov

Hello Aishwarya,

On Fri, 13 Oct 2023 at 19:05, Aishwarya TCV  wrote:
>
> Hi Dmitry
>
>
> On 17/08/2023 15:59, Dmitry Baryshkov wrote:
> > Add displayport altmode declaration to the Type-C controller node to
> > enable DP altmode negotiation.
> >
> > Signed-off-by: Dmitry Baryshkov 
> > ---
> >  arch/arm64/boot/dts/qcom/qrb5165-rb5.dts | 7 +++
> >  1 file changed, 7 insertions(+)
> >
>
> At present RB5 runs against next-master are consistently throwing
> warning messages (attached below with call trace) on "Memory manager not
> clean during takedown". I can send full logs if required. However, the
> runs are booting successfully so the logs are quite big. Tried running
> with older dtb built, where these
> warnings was not observed.

Thank you for the report. These patches should be fixed by the
patchset at https://patchwork.freedesktop.org/series/125010/

-- 
With best wishes
Dmitry

Re: [PATCH] drm/ci: Default to UART for logging

2023-10-13 Thread Daniel Stone

On Fri, 6 Oct 2023 at 18:32, Rob Clark  wrote:
> ssh logging is the default for mesa, as it is generally more reliable.
> But if there are kernel issues, especially at boot, UART logging is
> infinitely more useful.

Hmm, we should still be capturing the UART boot logs regardless. Those
go into a collapsed 'LAVA boot' section but they don't just disappear
... ?

Re: [PATCH v7 04/23] dt-bindings: display: mediatek: padding: Add MT8188

2023-10-13 Thread Daniel Stone

Hi Shawn,

On Fri, 6 Oct 2023 at 08:38, Hsiao Chien Sung  wrote:
> +  Padding provides ability to add pixels to width and height of a layer with
> +  specified colors. Due to hardware design, Mixer in VDOSYS1 requires
> +  width of a layer to be 2-pixel-align, or 4-pixel-align when ETHDR is 
> enabled,
> +  we need Padding to deal with odd width.
> +  Please notice that even if the Padding is in bypass mode, settings in
> +  register must be cleared to 0, or undefined behaviors could happen.

If I understand the driver correctly, padding is automatically applied
to compensate for unaligned dimensions. The first/last rows/columns of
the overlay area will be filled with a specified colour (black?) to
preserve the area. This is unfortunately not OK to do transparently.
Userspace must be aware of this policy decision and specifically
request it. If not, the atomic request check should fail and tell
userspace that the requested configuration is not possible to achieve.

Cheers,
Daniel

Re: [PATCH] drm/i915: Flush WC GGTT only on required platforms

2023-10-13 Thread Daniel Vetter

On Fri, Oct 13, 2023 at 02:28:21PM +0200, Nirmoy Das wrote:
> Hi Ville,
> 
> On 10/13/2023 12:50 PM, Ville Syrjälä wrote:
> > On Fri, Oct 13, 2023 at 12:31:40PM +0200, Nirmoy Das wrote:
> > > gen8_ggtt_invalidate() is only needed for limitted set of platforms
> > > where GGTT is mapped as WC
> > I know there is supposed to be some kind hw snooping of the ggtt
> > pte writes to invalidate the tlb, but are we sure GFX_FLSH_CNTL
> > has no other side effects we depend on?
> 
> I spent some time searching through the gfxspec. This GFX_FLSH_CNTL register
> only seems to be for
> 
> invalidating TLB for GUnit  and (from git log ) we started to do that to
> enable WC based GGTT updates.

Might be good to cite the relevant git commits in the commit message to
make this clear.
-Sima

> 
> 
> So if I am not missing anything obvious then this should be safe.
> 
> 
> Regards,
> 
> Nirmoy
> 
> > 
> > > otherwise this can cause unwanted
> > > side-effects on XE_HP platforms where GFX_FLSH_CNTL_GEN6 is not
> > > valid.
> > > 
> > > Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
> > > Cc: Rodrigo Vivi 
> > > Cc: Tvrtko Ursulin 
> > > Cc: Joonas Lahtinen 
> > > Cc: Jani Nikula 
> > > Cc: Jonathan Cavitt 
> > > Cc: John Harrison 
> > > Cc: Andi Shyti 
> > > Cc:  # v6.2+
> > > Suggested-by: Matt Roper 
> > > Signed-off-by: Nirmoy Das 
> > > ---
> > >   drivers/gpu/drm/i915/gt/intel_ggtt.c | 6 +-
> > >   1 file changed, 5 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
> > > b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > > index 4d7d88b92632..c2858d434bce 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > > @@ -197,13 +197,17 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt)
> > >   static void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)
> > >   {
> > > + struct drm_i915_private *i915 = ggtt->vm.i915;
> > >   struct intel_uncore *uncore = ggtt->vm.gt->uncore;
> > >   /*
> > >* Note that as an uncached mmio write, this will flush the
> > >* WCB of the writes into the GGTT before it triggers the 
> > > invalidate.
> > > +  *
> > > +  * Only perform this when GGTT is mapped as WC, see ggtt_probe_common().
> > >*/
> > > - intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
> > > + if (!IS_GEN9_LP(i915) && GRAPHICS_VER(i915) < 11)
> > > + intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6, 
> > > GFX_FLSH_CNTL_EN);
> > >   }
> > >   static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
> > > -- 
> > > 2.41.0

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [PATCH v2] drm/i915: Flush WC GGTT only on required platforms

2023-10-13 Thread Matt Roper

On Fri, Oct 13, 2023 at 03:44:39PM +0200, Nirmoy Das wrote:
> gen8_ggtt_invalidate() is only needed for limited set of platforms
> where GGTT is mapped as WC otherwise this can cause unwanted
> side-effects on XE_HP platforms where GFX_FLSH_CNTL_GEN6 is not
> valid.
> 
> v2: Add a func to detect wc ggtt detection (Ville)
> 
> Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
> Cc: Rodrigo Vivi 
> Cc: Tvrtko Ursulin 
> Cc: Joonas Lahtinen 
> Cc: Jani Nikula 
> Cc: Jonathan Cavitt 
> Cc: John Harrison 
> Cc: Andi Shyti 
> Cc: Ville Syrjälä 
> Cc:  # v6.2+
> Suggested-by: Matt Roper 
> Signed-off-by: Nirmoy Das 
> Acked-by: Andi Shyti 

Reviewed-by: Matt Roper 

Interestingly, bspec 151 indicates that we probably shouldn't have been
using a CPU:WC mapping for the GGTT on gen9bc platforms either (i.e.,
the GTT part of the GTTMMADR has the same "64-bits or less" restriction
listed as later platforms).  But we've been using WC without issue for
the last 8 years, so I guess it's not worth changing it now.


Matt

> ---
>  drivers/gpu/drm/i915/gt/intel_ggtt.c | 35 +++-
>  1 file changed, 24 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
> b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 4d7d88b92632..401667f83f96 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -195,6 +195,21 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt)
>   spin_unlock_irq(>lock);
>  }
>  
> +static bool needs_wc_ggtt_mapping(struct drm_i915_private *i915)
> +{
> + /*
> +  * On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range
> +  * will be dropped. For WC mappings in general we have 64 byte burst
> +  * writes when the WC buffer is flushed, so we can't use it, but have to
> +  * resort to an uncached mapping. The WC issue is easily caught by the
> +  * readback check when writing GTT PTE entries.
> +  */
> + if (!IS_GEN9_LP(i915) && GRAPHICS_VER(i915) < 11)
> + return true;
> +
> + return false;
> +}
> +
>  static void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)
>  {
>   struct intel_uncore *uncore = ggtt->vm.gt->uncore;
> @@ -202,8 +217,12 @@ static void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)
>   /*
>* Note that as an uncached mmio write, this will flush the
>* WCB of the writes into the GGTT before it triggers the invalidate.
> +  *
> +  * Only perform this when GGTT is mapped as WC, see ggtt_probe_common().
>*/
> - intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
> + if (needs_wc_ggtt_mapping(ggtt->vm.i915))
> + intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6,
> +   GFX_FLSH_CNTL_EN);
>  }
>  
>  static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
> @@ -1126,17 +1145,11 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, 
> u64 size)
>   GEM_WARN_ON(pci_resource_len(pdev, GEN4_GTTMMADR_BAR) != 
> gen6_gttmmadr_size(i915));
>   phys_addr = pci_resource_start(pdev, GEN4_GTTMMADR_BAR) + 
> gen6_gttadr_offset(i915);
>  
> - /*
> -  * On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range
> -  * will be dropped. For WC mappings in general we have 64 byte burst
> -  * writes when the WC buffer is flushed, so we can't use it, but have to
> -  * resort to an uncached mapping. The WC issue is easily caught by the
> -  * readback check when writing GTT PTE entries.
> -  */
> - if (IS_GEN9_LP(i915) || GRAPHICS_VER(i915) >= 11)
> - ggtt->gsm = ioremap(phys_addr, size);
> - else
> + if (needs_wc_ggtt_mapping(i915))
>   ggtt->gsm = ioremap_wc(phys_addr, size);
> + else
> + ggtt->gsm = ioremap(phys_addr, size);
> +
>   if (!ggtt->gsm) {
>   drm_err(>drm, "Failed to map the ggtt page table\n");
>   return -ENOMEM;
> -- 
> 2.41.0
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

[PATCH v2] drm/i915/gvt: Optimize mmio_offset_compare() for efficiency

2023-10-13 Thread Kuan-Wei Chiu

The original code used conditional branching in the mmio_offset_compare
function to compare two values and return -1, 1, or 0 based on the
result. However, the list_sort comparison function only needs results
<0, >0, or =0. This patch optimizes the code to make the comparison
branchless, improving efficiency and reducing code size. This change
reduces the number of comparison operations from 1-2 to a single
subtraction operation, thereby saving the number of instructions.

Signed-off-by: Kuan-Wei Chiu 
---
v1 -> v2:
- Add explicit type cast in case the sizes of u32 and int differ.

 drivers/gpu/drm/i915/gvt/debugfs.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/debugfs.c 
b/drivers/gpu/drm/i915/gvt/debugfs.c
index baccbf1761b7..d85d8a3b5ae5 100644
--- a/drivers/gpu/drm/i915/gvt/debugfs.c
+++ b/drivers/gpu/drm/i915/gvt/debugfs.c
@@ -48,11 +48,7 @@ static int mmio_offset_compare(void *priv,
 
ma = container_of(a, struct diff_mmio, node);
mb = container_of(b, struct diff_mmio, node);
-   if (ma->offset < mb->offset)
-   return -1;
-   else if (ma->offset > mb->offset)
-   return 1;
-   return 0;
+   return (int)ma->offset - (int)mb->offset;
 }
 
 static inline int mmio_diff_handler(struct intel_gvt *gvt,
-- 
2.25.1

[PATCH] drm/bridge: analogix: anx78xx: Check return value

2023-10-13 Thread Artem Chernyshev

In anx78xx_start() err value of anx78xx_clear_bits()
overwriting without check.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Artem Chernyshev 
---
 drivers/gpu/drm/bridge/analogix/analogix-anx78xx.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/bridge/analogix/analogix-anx78xx.c 
b/drivers/gpu/drm/bridge/analogix/analogix-anx78xx.c
index 800555aef97f..c966e661b9b9 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix-anx78xx.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix-anx78xx.c
@@ -537,6 +537,10 @@ static int anx78xx_start(struct anx78xx *anx78xx)
 SP_POWERDOWN_CTRL_REG,
 SP_HDCP_PD | SP_AUDIO_PD | SP_VIDEO_PD |
 SP_LINK_PD);
+   if (err) {
+   DRM_ERROR("Failed to clear bits: %d\n", err);
+   goto err_poweroff;
+   }
 
err = anx78xx_enable_interrupts(anx78xx);
if (err) {
-- 
2.37.3

Re: [PATCH v3 4/6] drm/ssd130x: Add support for the SSD132x OLED controller family

2023-10-13 Thread Javier Martinez Canillas

Thomas Zimmermann  writes:

Hello Thomas,

[...]
 
>> As you mention that these could be follow-up changes, I assume that you
>> agree with the current approach. Should I expect your review / ack for
>> this patch-set?
>
> Please take my ack for this patchset
>
> Acked-by: Thomas Zimmermann 
>

Perfect, thanks a lot!

> Best regards
> Thomas
>

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat

Re: [PATCH v3 4/6] drm/ssd130x: Add support for the SSD132x OLED controller family

2023-10-13 Thread Thomas Zimmermann


Hi

Am 13.10.23 um 16:57 schrieb Javier Martinez Canillas:

Thomas Zimmermann  writes:

Hello Thomas,

Thanks a lot for your feedback.


Hi Javier,

thanks for this patch.

Am 12.10.23 um 23:38 schrieb Javier Martinez Canillas:
[...]
   
+static int ssd132x_fb_blit_rect(struct drm_framebuffer *fb,

+   const struct iosys_map *vmap,
+   struct drm_rect *rect, u8 *buf,
+   u8 *data_array)
+{
+   struct ssd130x_device *ssd130x = drm_to_ssd130x(fb->dev);
+   unsigned int dst_pitch = drm_rect_width(rect);
+   struct iosys_map dst;
+   int ret = 0;
+
+   /* Align x to display segment boundaries */
+   rect->x1 = round_down(rect->x1, SSD132X_SEGMENT_WIDTH);
+   rect->x2 = min_t(unsigned int, round_up(rect->x2, 
SSD132X_SEGMENT_WIDTH),
+ssd130x->width);
+
+   ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
+   if (ret)
+   return ret;
+
+   iosys_map_set_vaddr(, buf);
+   drm_fb_xrgb_to_gray8(, _pitch, vmap, fb, rect);


Here's an idea for a follow-up patchset.

You could attempt to integrate the gray8 and mono conversions into
drm_fb_blit(). With some the right parameters, both, ssd130x and ssd132x
could use the same blitting code from BO to buffer.



Yeah, I considered that but as mentioned in the commit message want to see
what are the needs of the SSD133x controller family (I bought a SSD1331
display but haven't had time to play with it yet) before trying to factor
out the common bits in helper functions.

[...]


+
+   ssd130x_state->buffer = kcalloc(pitch, ssd130x->height, GFP_KERNEL);
+   if (!ssd130x_state->buffer)
+   return -ENOMEM;


It's unrelated to these patches and I know it's been discussed
endlessly, but I have a questions about buffer allocation. That memory
acts as another shadow buffer for the device's memory, such that format
conversion becomes easier.



Correct.


But then, why is ->buffer part of the plane_state? Shouldn't it be part
of the plane and never be re-allocated? The real size of that buffer is
 times  (not ). That size is static over the
lifetime of the device. That would represent the semantics much better.

This would allow for additional changes: blit_rect and update_rect would
be much easier to separate: no more segment adjustments for the blit
code; only for updates. If the update code has high latency (IDK), you
could push it into a worker thread to run besides the DRM logic. The gud
and repaper drivers do something to this effect.




The idea of making it part of the plane state is that this buffer could be
optional, for example in the case of user-space using the native display
format instead of the emulated XRGB.

In that case, an intermediate buffer won't be used because the shadow-plane
format will already be the native one (e.g: R1) and there won't be a need
to do any format conversion (only the conversion to the data format as is
expected by the controller).

Take a look to Geert's patch adding R1 support to ssd130x for an example:

https://lore.kernel.org/all/72746f6d9c47f09fc057ad7a4bbb3b7f423af803.1689252746.git.ge...@linux-m68k.org/

That's why it was decided that making it part of the plane state follows
better the KMS model, because when using R1 this buffer won't even be
allocated in the primary plane .atomic_check handler.

[...]


+   drm_atomic_helper_damage_iter_init(, old_plane_state, plane_state);
+   drm_atomic_for_each_plane_damage(, ) {
+   dst_clip = plane_state->dst;
+
+   if (!drm_rect_intersect(_clip, ))
+   continue;
+
+   ssd132x_fb_blit_rect(fb, _plane_state->data[0], 
_clip,
+ssd130x_plane_state->buffer,
+ssd130x_crtc_state->data_array);
+   }


Here's another idea for a another follow-up patchset:

You are allocating state->buffer to cover the whole display, right? It's
 times  IIRC.  Maybe it would make sense to split the
damage loop into two loops and inline the driver's blit_rect() function.
Something like that

begin_cpu_access()

for_each(damage) {
  drm_fb_blit( "from GEM BO to buffer" )
}

end_cpu_access()

for_each(damge) {
  update_rect( "from buffer to device" )
}

With the changes from the other comments, the first loop could become
entirely device-neutral AFAICT.



Regardless, splitting the blit and update rect might make sense and is an
intersesting idea. I need to explore this, thanks for the suggestion.

As you mention that these could be follow-up changes, I assume that you
agree with the current approach. Should I expect your review / ack for
this patch-set?


Please take my ack for this patchset

Acked-by: Thomas Zimmermann 

Best regards
Thomas




Best regards
Thomas





--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany

Implement per-key keyboard backlight as auxdisplay?

2023-10-13 Thread Werner Sembach


Hi,

coming from the leds mailing list I'm writing with Pavel how to best handle 
per-key RGB keyboards.


His suggestion was that it could be implemented as an aux display, but he also 
suggested that I ask first if this fits.


The specific keyboard RGB controller I want to implement takes 6*21 rgb values. 
However not every one is actually mapped to a physical key. e.g. the bottom row 
needs less entries because of the space bar. Additionally the keys are ofc not 
in a straight line from top to bottom.


Best regards,

Werner

Re: [PATCH v3 4/6] drm/ssd130x: Add support for the SSD132x OLED controller family

2023-10-13 Thread Javier Martinez Canillas

Thomas Zimmermann  writes:

Hello Thomas,

Thanks a lot for your feedback.

> Hi Javier,
>
> thanks for this patch.
>
> Am 12.10.23 um 23:38 schrieb Javier Martinez Canillas:
> [...]
>>   
>> +static int ssd132x_fb_blit_rect(struct drm_framebuffer *fb,
>> +const struct iosys_map *vmap,
>> +struct drm_rect *rect, u8 *buf,
>> +u8 *data_array)
>> +{
>> +struct ssd130x_device *ssd130x = drm_to_ssd130x(fb->dev);
>> +unsigned int dst_pitch = drm_rect_width(rect);
>> +struct iosys_map dst;
>> +int ret = 0;
>> +
>> +/* Align x to display segment boundaries */
>> +rect->x1 = round_down(rect->x1, SSD132X_SEGMENT_WIDTH);
>> +rect->x2 = min_t(unsigned int, round_up(rect->x2, 
>> SSD132X_SEGMENT_WIDTH),
>> + ssd130x->width);
>> +
>> +ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>> +if (ret)
>> +return ret;
>> +
>> +iosys_map_set_vaddr(, buf);
>> +drm_fb_xrgb_to_gray8(, _pitch, vmap, fb, rect);
>
> Here's an idea for a follow-up patchset.
>
> You could attempt to integrate the gray8 and mono conversions into 
> drm_fb_blit(). With some the right parameters, both, ssd130x and ssd132x 
> could use the same blitting code from BO to buffer.
>

Yeah, I considered that but as mentioned in the commit message want to see
what are the needs of the SSD133x controller family (I bought a SSD1331
display but haven't had time to play with it yet) before trying to factor
out the common bits in helper functions.

[...]

>> +
>> +ssd130x_state->buffer = kcalloc(pitch, ssd130x->height, GFP_KERNEL);
>> +if (!ssd130x_state->buffer)
>> +return -ENOMEM;
>
> It's unrelated to these patches and I know it's been discussed 
> endlessly, but I have a questions about buffer allocation. That memory 
> acts as another shadow buffer for the device's memory, such that format 
> conversion becomes easier.
>

Correct.

> But then, why is ->buffer part of the plane_state? Shouldn't it be part 
> of the plane and never be re-allocated? The real size of that buffer is 
>  times  (not ). That size is static over the 
> lifetime of the device. That would represent the semantics much better.
>
> This would allow for additional changes: blit_rect and update_rect would 
> be much easier to separate: no more segment adjustments for the blit 
> code; only for updates. If the update code has high latency (IDK), you 
> could push it into a worker thread to run besides the DRM logic. The gud 
> and repaper drivers do something to this effect.
>
>

The idea of making it part of the plane state is that this buffer could be
optional, for example in the case of user-space using the native display
format instead of the emulated XRGB.

In that case, an intermediate buffer won't be used because the shadow-plane
format will already be the native one (e.g: R1) and there won't be a need
to do any format conversion (only the conversion to the data format as is
expected by the controller).

Take a look to Geert's patch adding R1 support to ssd130x for an example:

https://lore.kernel.org/all/72746f6d9c47f09fc057ad7a4bbb3b7f423af803.1689252746.git.ge...@linux-m68k.org/

That's why it was decided that making it part of the plane state follows
better the KMS model, because when using R1 this buffer won't even be
allocated in the primary plane .atomic_check handler.

[...]

>> +drm_atomic_helper_damage_iter_init(, old_plane_state, plane_state);
>> +drm_atomic_for_each_plane_damage(, ) {
>> +dst_clip = plane_state->dst;
>> +
>> +if (!drm_rect_intersect(_clip, ))
>> +continue;
>> +
>> +ssd132x_fb_blit_rect(fb, _plane_state->data[0], 
>> _clip,
>> + ssd130x_plane_state->buffer,
>> + ssd130x_crtc_state->data_array);
>> +}
>
> Here's another idea for a another follow-up patchset:
>
> You are allocating state->buffer to cover the whole display, right? It's 
>  times  IIRC.  Maybe it would make sense to split the 
> damage loop into two loops and inline the driver's blit_rect() function. 
> Something like that
>
>begin_cpu_access()
>
>for_each(damage) {
>  drm_fb_blit( "from GEM BO to buffer" )
>}
>
>end_cpu_access()
>
>for_each(damge) {
>  update_rect( "from buffer to device" )
>}
>
> With the changes from the other comments, the first loop could become 
> entirely device-neutral AFAICT.
>

Regardless, splitting the blit and update rect might make sense and is an
intersesting idea. I need to explore this, thanks for the suggestion.

As you mention that these could be follow-up changes, I assume that you
agree with the current approach. Should I expect your review / ack for
this patch-set?

> Best regards
> Thomas
>

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat

[PATCH libdrm v4 1/9] util: improve SMPTE color LUT accuracy

2023-10-13 Thread Geert Uytterhoeven

Fill in the LSB when converting color components from 8-bit to 16-bit.

Signed-off-by: Geert Uytterhoeven 
Acked-by: Sam Ravnborg 
---
v4:
  - No changes,

v3:
  - Add Acked-by,

v2:
  - New.
---
 tests/util/pattern.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/util/pattern.c b/tests/util/pattern.c
index bd0989e6dbc6aa27..7d4f6610015e7464 100644
--- a/tests/util/pattern.c
+++ b/tests/util/pattern.c
@@ -646,9 +646,9 @@ void util_smpte_c8_gamma(unsigned size, struct 
drm_color_lut *lut)
memset(lut, 0, size * sizeof(struct drm_color_lut));
 
 #define FILL_COLOR(idx, r, g, b) \
-   lut[idx].red = (r) << 8; \
-   lut[idx].green = (g) << 8; \
-   lut[idx].blue = (b) << 8
+   lut[idx].red = (r) * 0x101; \
+   lut[idx].green = (g) * 0x101; \
+   lut[idx].blue = (b) * 0x101
 
FILL_COLOR( 0, 192, 192, 192);  /* grey */
FILL_COLOR( 1, 192, 192, 0  );  /* yellow */
-- 
2.34.1

[PATCH libdrm v4 9/9] modetest: add SMPTE pattern support for C[124] formats

2023-10-13 Thread Geert Uytterhoeven

Add support for drawing the SMPTE pattern in buffers using a
color-indexed frame buffer formats with two, four, or sixteen colors.

Note that this still uses 256 as the CLUT size, as
DRM_IOCTL_MODE_SETGAMMA enforces that the size matches against the
(fixed) gamma size, while the CLUT size depends on the format.

Move clearing the color LUT entries from util_smpte_index_gamma() to its
caller, as only the caller knows how many entries there really are
(currently DRM always assumes 256 entries).

Signed-off-by: Geert Uytterhoeven 
Acked-by: Sam Ravnborg 
---
v4:
  - Add missing C[12] to oneline-summary,
  - Do not remove memset() of full lut, else some entries may stay
uninitialized,

v3:
  - Add Acked-by,

v2:
  - Split off changes to tests/modetest/modetest.c,
  - Add C1 and C2 support.

The linuxdoc comments say userspace can query the gamma size:

 * drm_mode_gamma_set_ioctl - set the gamma table
 *
 * Set the gamma table of a CRTC to the one passed in by the user. Userspace can
 * inquire the required gamma table size through drm_mode_gamma_get_ioctl.

 * drm_mode_gamma_get_ioctl - get the gamma table
 *
 * Copy the current gamma table into the storage provided. This also provides
 * the gamma table size the driver expects, which can be used to size the
 * allocated storage.

but the code doesn't seem to support that in an easy way (like setting
red/green/blue to NULL on input, retrieving gamma_size on output), only
by providing big enough buffers for red/green/blue, and looping over
gamma_size until -EINVAL is no longer returned.
---
 tests/modetest/modetest.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tests/modetest/modetest.c b/tests/modetest/modetest.c
index 9504fbd8af59ff21..9b1aa537be8716cf 100644
--- a/tests/modetest/modetest.c
+++ b/tests/modetest/modetest.c
@@ -1149,13 +1149,16 @@ static bool add_property_optional(struct device *dev, 
uint32_t obj_id,
 static void set_gamma(struct device *dev, unsigned crtc_id, unsigned fourcc)
 {
unsigned blob_id = 0;
+   const struct util_format_info *info;
/* TODO: support 1024-sized LUTs, when the use-case arises */
struct drm_color_lut gamma_lut[256];
int i, ret;
 
-   if (fourcc == DRM_FORMAT_C8) {
-   /* TODO: Add C8 support for more patterns */
-   util_smpte_fill_lut(256, gamma_lut);
+   info = util_format_info_find(fourcc);
+   if (info->ncolors) {
+   memset(gamma_lut, 0, sizeof(gamma_lut));
+   /* TODO: Add index support for more patterns */
+   util_smpte_fill_lut(info->ncolors, gamma_lut);
drmModeCreatePropertyBlob(dev->fd, gamma_lut, 
sizeof(gamma_lut), _id);
} else {
/*
-- 
2.34.1

[PATCH libdrm v4 4/9] util: store number of colors for indexed formats

2023-10-13 Thread Geert Uytterhoeven

Store the number of available colors for color-indexed frame
buffer formats in the format_info[] array.  This avoids the need of test
code for having to use switch statements all the time to obtain the
number of colors, or to check if a mode is color-indexed or not.

Signed-off-by: Geert Uytterhoeven 
Acked-by: Sam Ravnborg 
---
v4:
  - No changes,

v3:
  - Add Acked-by,

v2:
  - New.
---
 tests/util/format.c | 8 
 tests/util/format.h | 1 +
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tests/util/format.c b/tests/util/format.c
index b3d2abdc8e67eed0..b99cc9c3599d9237 100644
--- a/tests/util/format.c
+++ b/tests/util/format.c
@@ -40,10 +40,10 @@
 
 static const struct util_format_info format_info[] = {
/* Indexed */
-   { DRM_FORMAT_C1, "C1" },
-   { DRM_FORMAT_C2, "C2" },
-   { DRM_FORMAT_C4, "C4" },
-   { DRM_FORMAT_C8, "C8" },
+   { DRM_FORMAT_C1, "C1", .ncolors = 2 },
+   { DRM_FORMAT_C2, "C2", .ncolors = 4 },
+   { DRM_FORMAT_C4, "C4", .ncolors = 16 },
+   { DRM_FORMAT_C8, "C8", .ncolors = 256 },
/* YUV packed */
{ DRM_FORMAT_UYVY, "UYVY", MAKE_YUV_INFO(YUV_YCbCr | YUV_CY, 2, 2, 2) },
{ DRM_FORMAT_VYUY, "VYUY", MAKE_YUV_INFO(YUV_YCrCb | YUV_CY, 2, 2, 2) },
diff --git a/tests/util/format.h b/tests/util/format.h
index 2ce1c021fd78d51d..b847c9f2933b3cde 100644
--- a/tests/util/format.h
+++ b/tests/util/format.h
@@ -55,6 +55,7 @@ struct util_yuv_info {
 struct util_format_info {
uint32_t format;
const char *name;
+   unsigned int ncolors;
const struct util_rgb_info rgb;
const struct util_yuv_info yuv;
 };
-- 
2.34.1

[PATCH libdrm v4 2/9] util: factor out and optimize C8 SMPTE color LUT

2023-10-13 Thread Geert Uytterhoeven

The color LUT for the SMPTE pattern in indexed mode contains 22 entries,
although only 13 are non-unique.

Reduce the size of the color LUT by dropping duplicate entries, so it
can be reused for formats supporting e.g. 16 colors.  Rename the
function util_smpte_c8_gamma() to util_smpte_fill_lut(), and its first
parameter size to ncolors, to match their actual use.

Signed-off-by: Geert Uytterhoeven 
Acked-by: Sam Ravnborg 
---
v4:
  - Rename util_smpte_index_gamma() to util_smpte_fill_lut(), and its
first parameter from size to ncolors,
  - Move smpte_color_lut[] down,
  - Kill FILL_COLOR() macro,
  - Add and use EXPAND_COLOR() macro,

v3:
  - Add Acked-by,

v2:
  - Factor out smpte color LUT.
---
 tests/modetest/modetest.c |   2 +-
 tests/util/pattern.c  | 122 +-
 tests/util/pattern.h  |   2 +-
 3 files changed, 82 insertions(+), 44 deletions(-)

diff --git a/tests/modetest/modetest.c b/tests/modetest/modetest.c
index 861a06ebb27bd170..9504fbd8af59ff21 100644
--- a/tests/modetest/modetest.c
+++ b/tests/modetest/modetest.c
@@ -1155,7 +1155,7 @@ static void set_gamma(struct device *dev, unsigned 
crtc_id, unsigned fourcc)
 
if (fourcc == DRM_FORMAT_C8) {
/* TODO: Add C8 support for more patterns */
-   util_smpte_c8_gamma(256, gamma_lut);
+   util_smpte_fill_lut(256, gamma_lut);
drmModeCreatePropertyBlob(dev->fd, gamma_lut, 
sizeof(gamma_lut), _id);
} else {
/*
diff --git a/tests/util/pattern.c b/tests/util/pattern.c
index 7d4f6610015e7464..fc457c4ac61e404a 100644
--- a/tests/util/pattern.c
+++ b/tests/util/pattern.c
@@ -605,6 +605,73 @@ static void fill_smpte_rgb16fp(const struct util_rgb_info 
*rgb, void *mem,
}
 }
 
+enum smpte_colors {
+   SMPTE_COLOR_GREY,
+   SMPTE_COLOR_YELLOW,
+   SMPTE_COLOR_CYAN,
+   SMPTE_COLOR_GREEN,
+   SMPTE_COLOR_MAGENTA,
+   SMPTE_COLOR_RED,
+   SMPTE_COLOR_BLUE,
+   SMPTE_COLOR_BLACK,
+   SMPTE_COLOR_IN_PHASE,
+   SMPTE_COLOR_SUPER_WHITE,
+   SMPTE_COLOR_QUADRATURE,
+   SMPTE_COLOR_3PC5,
+   SMPTE_COLOR_11PC5,
+};
+
+static unsigned int smpte_top[7] = {
+   SMPTE_COLOR_GREY,
+   SMPTE_COLOR_YELLOW,
+   SMPTE_COLOR_CYAN,
+   SMPTE_COLOR_GREEN,
+   SMPTE_COLOR_MAGENTA,
+   SMPTE_COLOR_RED,
+   SMPTE_COLOR_BLUE,
+};
+
+static unsigned int smpte_middle[7] = {
+   SMPTE_COLOR_BLUE,
+   SMPTE_COLOR_BLACK,
+   SMPTE_COLOR_MAGENTA,
+   SMPTE_COLOR_BLACK,
+   SMPTE_COLOR_CYAN,
+   SMPTE_COLOR_BLACK,
+   SMPTE_COLOR_GREY,
+};
+
+static unsigned int smpte_bottom[8] = {
+   SMPTE_COLOR_IN_PHASE,
+   SMPTE_COLOR_SUPER_WHITE,
+   SMPTE_COLOR_QUADRATURE,
+   SMPTE_COLOR_BLACK,
+   SMPTE_COLOR_3PC5,
+   SMPTE_COLOR_BLACK,
+   SMPTE_COLOR_11PC5,
+   SMPTE_COLOR_BLACK,
+};
+
+#define EXPAND_COLOR(r, g, b)  { (r) * 0x101, (g) * 0x101, (b) * 0x101 }
+
+static const struct drm_color_lut smpte_color_lut[] = {
+   [SMPTE_COLOR_GREY] =EXPAND_COLOR(192, 192, 192),
+   [SMPTE_COLOR_YELLOW] =  EXPAND_COLOR(192, 192,   0),
+   [SMPTE_COLOR_CYAN] =EXPAND_COLOR(  0, 192, 192),
+   [SMPTE_COLOR_GREEN] =   EXPAND_COLOR(  0, 192,   0),
+   [SMPTE_COLOR_MAGENTA] = EXPAND_COLOR(192,   0, 192),
+   [SMPTE_COLOR_RED] = EXPAND_COLOR(192,   0,   0),
+   [SMPTE_COLOR_BLUE] =EXPAND_COLOR(  0,   0, 192),
+   [SMPTE_COLOR_BLACK] =   EXPAND_COLOR( 19,  19,  19),
+   [SMPTE_COLOR_IN_PHASE] =EXPAND_COLOR(  0,  33,  76),
+   [SMPTE_COLOR_SUPER_WHITE] = EXPAND_COLOR(255, 255, 255),
+   [SMPTE_COLOR_QUADRATURE] =  EXPAND_COLOR( 50,   0, 106),
+   [SMPTE_COLOR_3PC5] =EXPAND_COLOR(  9,   9,   9),
+   [SMPTE_COLOR_11PC5] =   EXPAND_COLOR( 29,  29,  29),
+};
+
+#undef EXPAND_COLOR
+
 static void fill_smpte_c8(void *mem, unsigned int width, unsigned int height,
  unsigned int stride)
 {
@@ -613,69 +680,40 @@ static void fill_smpte_c8(void *mem, unsigned int width, 
unsigned int height,
 
for (y = 0; y < height * 6 / 9; ++y) {
for (x = 0; x < width; ++x)
-   ((uint8_t *)mem)[x] = x * 7 / width;
+   ((uint8_t *)mem)[x] = smpte_top[x * 7 / width];
mem += stride;
}
 
for (; y < height * 7 / 9; ++y) {
for (x = 0; x < width; ++x)
-   ((uint8_t *)mem)[x] = 7 + (x * 7 / width);
+   ((uint8_t *)mem)[x] = smpte_middle[x * 7 / width];
mem += stride;
}
 
for (; y < height; ++y) {
for (x = 0; x < width * 5 / 7; ++x)
((uint8_t *)mem)[x] =
-   14 + (x * 4 / (width * 5 / 7));
+   smpte_bottom[x * 4 / (width * 5 / 7)];
for (; x < width

[PATCH libdrm v4 7/9] util: add SMPTE pattern support for C2 format

2023-10-13 Thread Geert Uytterhoeven

Add support for drawing the SMPTE pattern in a buffer using the C2
indexed format.

As only four colors are available, resolution is halved, and the pattern
is drawn in a PenTile RG-GB matrix, using Floyd-Steinberg dithering.
The magnitude of the green subpixels is reduced, as there are twice as
many green subpixels as red or blue subpixels.

Signed-off-by: Geert Uytterhoeven 
Acked-by: Sam Ravnborg 
---
Dithering example at 
https://drive.google.com/file/d/1g5O8XeacrjrC8rgaVENvR65YeI6QvmtO/view

v4:
  - Replace FILL_COLOR() use by pentile_color_lut[],

v3:
  - Add Acked-by,

v2:
  - New.
---
 tests/util/pattern.c | 100 ++-
 1 file changed, 99 insertions(+), 1 deletion(-)

diff --git a/tests/util/pattern.c b/tests/util/pattern.c
index 2362555356e0dd71..f69c5206d96eff02 100644
--- a/tests/util/pattern.c
+++ b/tests/util/pattern.c
@@ -659,6 +659,14 @@ static const struct drm_color_lut bw_color_lut[] = {
EXPAND_COLOR(255, 255, 255),/* white */
 };
 
+static const struct drm_color_lut pentile_color_lut[] = {
+   /* PenTile RG-GB */
+   EXPAND_COLOR(  0,   0,   0),/* black */
+   EXPAND_COLOR(255,   0,   0),/* red */
+   EXPAND_COLOR(  0, 207,   0),/* green */
+   EXPAND_COLOR(  0,   0, 255),/* blue */
+};
+
 static const struct drm_color_lut smpte_color_lut[] = {
[SMPTE_COLOR_GREY] =EXPAND_COLOR(192, 192, 192),
[SMPTE_COLOR_YELLOW] =  EXPAND_COLOR(192, 192,   0),
@@ -835,6 +843,92 @@ static void fill_smpte_c1(void *mem, unsigned int width, 
unsigned int height,
free(fsd);
 }
 
+static void write_pixel_2(uint8_t *mem, unsigned int x, unsigned int pixel)
+{
+   unsigned int shift = 6 - 2 * (x & 3);
+   unsigned int mask = 3U << shift;
+
+   mem[x / 4] = (mem[x / 4] & ~mask) | ((pixel << shift) & mask);
+}
+
+static void write_color_2(struct fsd *fsd, uint8_t *mem, unsigned int stride,
+ unsigned int x, unsigned int index)
+{
+   struct drm_color_lut color = smpte_color_lut[index];
+   unsigned int r, g, b;
+
+   fsd_dither(fsd, );
+
+   if (color.red >= 32768) {
+   r = 1;
+   color.red = 65535;
+   } else {
+   r = 0;
+   color.red = 0;
+   }
+   if (color.green >= 32768) {
+   g = 2;
+   color.green = 65535;
+   } else {
+   g = 0;
+   color.green = 0;
+   }
+   if (color.blue >= 32768) {
+   b = 3;
+   color.blue = 65535;
+   } else {
+   b = 0;
+   color.blue = 0;
+   }
+
+   fsd_update(fsd, );
+
+   /* Use PenTile RG-GB */
+   write_pixel_2(mem, 2 * x, r);
+   write_pixel_2(mem, 2 * x + 1, g);
+   write_pixel_2(mem + stride, 2 * x, g);
+   write_pixel_2(mem + stride, 2 * x + 1, b);
+}
+
+static void fill_smpte_c2(void *mem, unsigned int width, unsigned int height,
+ unsigned int stride)
+{
+   struct fsd *fsd = fsd_alloc(width);
+   unsigned int x;
+   unsigned int y;
+
+   /* Half resolution for PenTile RG-GB */
+   width /= 2;
+   height /= 2;
+
+   for (y = 0; y < height * 6 / 9; ++y) {
+   for (x = 0; x < width; ++x)
+   write_color_2(fsd, mem, stride, x, smpte_top[x * 7 / 
width]);
+   mem += 2 * stride;
+   }
+
+   for (; y < height * 7 / 9; ++y) {
+   for (x = 0; x < width; ++x)
+   write_color_2(fsd, mem, stride, x, smpte_middle[x * 7 / 
width]);
+   mem += 2 * stride;
+   }
+
+   for (; y < height; ++y) {
+   for (x = 0; x < width * 5 / 7; ++x)
+   write_color_2(fsd, mem, stride, x,
+ smpte_bottom[x * 4 / (width * 5 / 7)]);
+   for (; x < width * 6 / 7; ++x)
+   write_color_2(fsd, mem, stride, x,
+ smpte_bottom[(x - width * 5 / 7) * 3 /
+  (width / 7) + 4]);
+   for (; x < width; ++x)
+   write_color_2(fsd, mem, stride, x, smpte_bottom[7]);
+   mem += 2 * stride;
+   }
+
+   free(fsd);
+}
+
 static void write_pixel_4(uint8_t *mem, unsigned int x, unsigned int pixel)
 {
if (x & 1)
@@ -916,8 +1010,10 @@ void util_smpte_fill_lut(unsigned int ncolors, struct 
drm_color_lut *lut)
}
memset(lut, 0, ncolors * sizeof(struct drm_color_lut));
 
-   if (ncolors < ARRAY_SIZE(smpte_color_lut))
+   if (ncolors < ARRAY_SIZE(pentile_color_lut))
memcpy(lut, bw_color_lut, sizeof(bw_color_lut));
+   else if (ncolors < ARRAY_SIZE(smpte_color_lut))
+   memcpy(lut, pentile_color_lut, sizeof(pentile_color_lut));
else
memcpy(lut, smpte_color_lut, sizeof(smpte_color_lut));
 }
@@

[PATCH libdrm v4 6/9] util: add SMPTE pattern support for C1 format

2023-10-13 Thread Geert Uytterhoeven

Add support for drawing the SMPTE pattern in a buffer using the C1
indexed format.

As only two colors are available, the pattern is drawn in black and
white, using Floyd-Steinberg dithering[1].

[1] https://en.wikipedia.org/wiki/Floyd%E2%80%93Steinberg_dithering

Signed-off-by: Geert Uytterhoeven 
Acked-by: Sam Ravnborg 
---
Dithering example at 
https://drive.google.com/file/d/1waJczErrIaEKRhBCCU1ynxRG8agpo0Xx/view

v4:
  - Replace FILL_COLOR() use by bw_color_lut[],

v3:
  - Add Acked-by,
  - Add Wikipedia link,

v2:
  New.
---
 tests/util/pattern.c | 174 ++-
 1 file changed, 171 insertions(+), 3 deletions(-)

diff --git a/tests/util/pattern.c b/tests/util/pattern.c
index 17074265dc60033c..2362555356e0dd71 100644
--- a/tests/util/pattern.c
+++ b/tests/util/pattern.c
@@ -654,6 +654,11 @@ static unsigned int smpte_bottom[8] = {
 
 #define EXPAND_COLOR(r, g, b)  { (r) * 0x101, (g) * 0x101, (b) * 0x101 }
 
+static const struct drm_color_lut bw_color_lut[] = {
+   EXPAND_COLOR(  0,   0,   0),/* black */
+   EXPAND_COLOR(255, 255, 255),/* white */
+};
+
 static const struct drm_color_lut smpte_color_lut[] = {
[SMPTE_COLOR_GREY] =EXPAND_COLOR(192, 192, 192),
[SMPTE_COLOR_YELLOW] =  EXPAND_COLOR(192, 192,   0),
@@ -672,6 +677,164 @@ static const struct drm_color_lut smpte_color_lut[] = {
 
 #undef EXPAND_COLOR
 
+/*
+ * Floyd-Steinberg dithering
+ */
+
+struct fsd {
+   unsigned int width;
+   unsigned int x;
+   unsigned int i;
+   int red;
+   int green;
+   int blue;
+   int error[];
+};
+
+static struct fsd *fsd_alloc(unsigned int width)
+{
+   unsigned int n = 3 * (width + 1);
+   struct fsd *fsd = malloc(sizeof(*fsd) + n * sizeof(fsd->error[0]));
+
+   fsd->width = width;
+   fsd->x = 0;
+   fsd->i = 0;
+   memset(fsd->error, 0, n * sizeof(fsd->error[0]));
+
+   return fsd;
+}
+
+static inline int clamp(int val, int min, int max)
+{
+   if (val < min)
+   return min;
+   if (val > max)
+   return max;
+   return val;
+}
+
+static void fsd_dither(struct fsd *fsd, struct drm_color_lut *color)
+{
+   unsigned int i = fsd->i;
+
+   fsd->red = (int)color->red + (fsd->error[3 * i] + 8) / 16;
+   fsd->green = (int)color->green + (fsd->error[3 * i + 1] + 8) / 16;
+   fsd->blue = (int)color->blue + (fsd->error[3 * i + 2] + 8) / 16;
+
+   color->red = clamp(fsd->red, 0, 65535);
+   color->green = clamp(fsd->green, 0, 65535);
+   color->blue = clamp(fsd->blue, 0, 65535);
+}
+
+static void fsd_update(struct fsd *fsd, const struct drm_color_lut *actual)
+{
+   int error_red = fsd->red - (int)actual->red;
+   int error_green = fsd->green - (int)actual->green;
+   int error_blue = fsd->blue - (int)actual->blue;
+   unsigned int width = fsd->width;
+   unsigned int i = fsd->i, j;
+   unsigned int n = width + 1;
+
+   /* Distribute errors over neighboring pixels */
+   if (fsd->x == width - 1) {
+   /* Last pixel on this scanline */
+   /* South East: initialize to zero */
+   fsd->error[3 * i] = 0;
+   fsd->error[3 * i + 1] = 0;
+   fsd->error[3 * i + 2] = 0;
+   } else {
+   /* East: accumulate error */
+   j = (i + 1) % n;
+   fsd->error[3 * j] += 7 * error_red;
+   fsd->error[3 * j + 1] += 7 * error_green;
+   fsd->error[3 * j + 2] += 7 * error_blue;
+
+   /* South East: initial error */
+   fsd->error[3 * i] = error_red;
+   fsd->error[3 * i + 1] = error_green;
+   fsd->error[3 * i + 2] = error_blue;
+   }
+   /* South West: accumulate error */
+   j = (i + width - 1) % n;
+   fsd->error[3 * j] += 3 * error_red;
+   fsd->error[3 * j + 1] += 3 * error_green;
+   fsd->error[3 * j + 2] += 3 * error_blue;
+
+   /* South: accumulate error */
+   j = (i + width) % n;
+   fsd->error[3 * j] += 5 * error_red;
+   fsd->error[3 * j + 1] += 5 * error_green;
+   fsd->error[3 * j + 2] += 5 * error_blue;
+
+   fsd->x = (fsd->x + 1) % width;
+   fsd->i = (fsd->i + 1) % n;
+}
+
+static void write_pixel_1(uint8_t *mem, unsigned int x, unsigned int pixel)
+{
+   unsigned int shift = 7 - (x & 7);
+   unsigned int mask = 1U << shift;
+
+   mem[x / 8] = (mem[x / 8] & ~mask) | ((pixel << shift) & mask);
+}
+
+static void write_color_1(struct fsd *fsd, uint8_t *mem, unsigned int x,
+ unsigned int index)
+{
+   struct drm_color_lut color = smpte_color_lut[index];
+   unsigned int pixel;
+
+   fsd_dither(fsd, );
+
+   /* ITU BT.601: Y = 0.299 R + 0.587 G + 0.114 B */
+   if (3 * color.red + 6 * color.green + color.blue >= 10 * 32768) {
+   pixel = 1;
+   color.red = color.green = color.blue = 65535;
+   }

[PATCH libdrm v4 5/9] util: add SMPTE pattern support for C4 format

2023-10-13 Thread Geert Uytterhoeven

Add support for drawing the SMPTE pattern in a buffer using the C4
indexed format.

Signed-off-by: Geert Uytterhoeven 
Acked-by: Sam Ravnborg 
---
v4:
  - No changes,

v3:
  - Add Acked-by,

v2:
  - Use new smpte_top[],
  - Split off changes to tests/util/pattern.c.
---
 tests/util/pattern.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/tests/util/pattern.c b/tests/util/pattern.c
index fc457c4ac61e404a..17074265dc60033c 100644
--- a/tests/util/pattern.c
+++ b/tests/util/pattern.c
@@ -672,6 +672,46 @@ static const struct drm_color_lut smpte_color_lut[] = {
 
 #undef EXPAND_COLOR
 
+static void write_pixel_4(uint8_t *mem, unsigned int x, unsigned int pixel)
+{
+   if (x & 1)
+   mem[x / 2] = (mem[x / 2] & 0xf0) | (pixel & 0x0f);
+   else
+   mem[x / 2] = (mem[x / 2] & 0x0f) | (pixel << 4);
+}
+
+static void fill_smpte_c4(void *mem, unsigned int width, unsigned int height,
+ unsigned int stride)
+{
+   unsigned int x;
+   unsigned int y;
+
+   for (y = 0; y < height * 6 / 9; ++y) {
+   for (x = 0; x < width; ++x)
+   write_pixel_4(mem, x, smpte_top[x * 7 / width]);
+   mem += stride;
+   }
+
+   for (; y < height * 7 / 9; ++y) {
+   for (x = 0; x < width; ++x)
+   write_pixel_4(mem, x, smpte_middle[x * 7 / width]);
+   mem += stride;
+   }
+
+   for (; y < height; ++y) {
+   for (x = 0; x < width * 5 / 7; ++x)
+   write_pixel_4(mem, x,
+ smpte_bottom[x * 4 / (width * 5 / 7)]);
+   for (; x < width * 6 / 7; ++x)
+   write_pixel_4(mem, x,
+ smpte_bottom[(x - width * 5 / 7) * 3 /
+  (width / 7) + 4]);
+   for (; x < width; ++x)
+   write_pixel_4(mem, x, smpte_bottom[7]);
+   mem += stride;
+   }
+}
+
 static void fill_smpte_c8(void *mem, unsigned int width, unsigned int height,
  unsigned int stride)
 {
@@ -723,6 +763,8 @@ static void fill_smpte(const struct util_format_info *info, 
void *planes[3],
unsigned char *u, *v;
 
switch (info->format) {
+   case DRM_FORMAT_C4:
+   return fill_smpte_c4(planes[0], width, height, stride);
case DRM_FORMAT_C8:
return fill_smpte_c8(planes[0], width, height, stride);
case DRM_FORMAT_UYVY:
-- 
2.34.1

[PATCH libdrm v4 8/9] modetest: add support for DRM_FORMAT_C[124]

2023-10-13 Thread Geert Uytterhoeven

Add support for creating buffers using the new color-indexed frame
buffer formats with two, four, and sixteen colors.

Signed-off-by: Geert Uytterhoeven 
Acked-by: Sam Ravnborg 
---
v4:
  - No changes,

v3:
  - Add Acked-by,

v2:
  - Split off changes to tests/modetest/buffers.c.
---
 tests/modetest/buffers.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/tests/modetest/buffers.c b/tests/modetest/buffers.c
index c122fb3fe9429190..65f1cfb32ab9eeae 100644
--- a/tests/modetest/buffers.c
+++ b/tests/modetest/buffers.c
@@ -124,6 +124,18 @@ bo_create(int fd, unsigned int format,
int ret;
 
switch (format) {
+   case DRM_FORMAT_C1:
+   bpp = 1;
+   break;
+
+   case DRM_FORMAT_C2:
+   bpp = 2;
+   break;
+
+   case DRM_FORMAT_C4:
+   bpp = 4;
+   break;
+
case DRM_FORMAT_C8:
case DRM_FORMAT_NV12:
case DRM_FORMAT_NV21:
@@ -292,6 +304,9 @@ bo_create(int fd, unsigned int format,
planes[2] = virtual + offsets[2];
break;
 
+   case DRM_FORMAT_C1:
+   case DRM_FORMAT_C2:
+   case DRM_FORMAT_C4:
case DRM_FORMAT_C8:
case DRM_FORMAT_ARGB:
case DRM_FORMAT_XRGB:
-- 
2.34.1

[PATCH libdrm v4 3/9] util: add support for DRM_FORMAT_C[124]

2023-10-13 Thread Geert Uytterhoeven

Add support for creating buffers using the new color-indexed frame
buffer formats with two, four, and sixteen colors.

Signed-off-by: Geert Uytterhoeven 
Acked-by: Sam Ravnborg 
---
v4:
  - No changes,

v3:
  - Add Acked-by,

v2:
  - Split off changes to tests/util/format.c.
---
 tests/util/format.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/util/format.c b/tests/util/format.c
index f825027227ddba24..b3d2abdc8e67eed0 100644
--- a/tests/util/format.c
+++ b/tests/util/format.c
@@ -40,6 +40,9 @@
 
 static const struct util_format_info format_info[] = {
/* Indexed */
+   { DRM_FORMAT_C1, "C1" },
+   { DRM_FORMAT_C2, "C2" },
+   { DRM_FORMAT_C4, "C4" },
{ DRM_FORMAT_C8, "C8" },
/* YUV packed */
{ DRM_FORMAT_UYVY, "UYVY", MAKE_YUV_INFO(YUV_YCbCr | YUV_CY, 2, 2, 2) },
-- 
2.34.1

[PATCH libdrm v4 0/9] util, modetest: add support for low-color frame buffer formats

2023-10-13 Thread Geert Uytterhoeven

Hi all,

A long outstanding issue with the DRM subsystem has been the lack of
support for low-color displays, as used typically on older desktop
systems, and on small embedded displays.

This patch series adds support for color-indexed frame buffer formats
with 2, 4, and 16 colors.  It has been tested on ARAnyM using a
work-in-progress Atari DRM driver.

Changes compared to v3[1]:
  - Rename util_smpte_index_gamma() to util_smpte_fill_lut(), and its
first parameter from size to ncolors,
  - Move smpte_color_lut[] down,
  - Kill FILL_COLOR() macro,
  - Add and use EXPAND_COLOR() macro,
  - Replace FILL_COLOR() use by bw_color_lut[],
  - Replace FILL_COLOR() use by pentile_color_lut[],
  - Add missing C[12] to oneline-summary,
  - Do not remove memset() of full lut, else some entries may stay
uninitialized.

Changes compared to v2[2]:
  - Add Acked-by,
  - Add Wikipedia link,
  - Dropped "[RFC] drm_fourcc: Add DRM_FORMAT_C[124]", as these were
added in commit 329eebcf32793361 ("drm_fourcc: sync drm_fourcc with
latest drm-next kernel") in libdrm-2.4.115.

Changes compared to v1[3]:
  - SMPTE color LUT accuracy,
  - Factor out smpte color LUT,
  - Restructure patches,
  - Improve descriptions.
  - Store number of colors for indexed formats,
  - Add SMPTE pattern support for the C1 and C2 formats.

I have also updated the merge request at [4].

Thanks for your comments!

[1] https://lore.kernel.org/r/cover.1690537375.git.ge...@linux-m68k.org
[2] https://lore.kernel.org/r/cover.1657302034.git.ge...@linux-m68k.org/
[3] https://lore.kernel.org/r/cover.1646683737.git.ge...@linux-m68k.org/
[4] https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/314

Geert Uytterhoeven (9):
  util: improve SMPTE color LUT accuracy
  util: factor out and optimize C8 SMPTE color LUT
  util: add support for DRM_FORMAT_C[124]
  util: store number of colors for indexed formats
  util: add SMPTE pattern support for C4 format
  util: add SMPTE pattern support for C1 format
  util: add SMPTE pattern support for C2 format
  modetest: add support for DRM_FORMAT_C[124]
  modetest: add SMPTE pattern support for C[124] formats

 tests/modetest/buffers.c  |  15 ++
 tests/modetest/modetest.c |   9 +-
 tests/util/format.c   |   5 +-
 tests/util/format.h   |   1 +
 tests/util/pattern.c  | 430 ++
 tests/util/pattern.h  |   2 +-
 6 files changed, 415 insertions(+), 47 deletions(-)

-- 
2.34.1

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

[PATCH] drm/ttm: Drain workqueue before sys manager release

2023-10-13 Thread Karolina Stolarek

In rare cases, a delayed destruction of a BO with a system resource
could stay in the workqueue until drain_workqueue() is called
in ttm_device_fini(). An attempt to free a resource from an already
released manager results in NULL pointer dereference. Move the step
of draining and destroying the workqueue so it happens before
the ttm_sys_manager cleanup.

Fixes: 9bff18d13473 ("drm/ttm: use per BO cleanup workers")
Signed-off-by: Karolina Stolarek 
---
Some background: I stumbled upon this issue when testing
ttm_bo_pipeline_gutting() with BO with an active dma_resv fence. In ~2% of
the runs, the delayed destruction of the ghost wouldn't happen until the
drain_queue() step. man->func->free(man, *res) got called via
ttm_bo_cleanup_memtype_use(), the manager and its functions were nowhere to
be seen, resulting in a nulptr deref.

 drivers/gpu/drm/ttm/ttm_device.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 7726a72befc5..753126581620 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -232,6 +232,9 @@ void ttm_device_fini(struct ttm_device *bdev)
struct ttm_resource_manager *man;
unsigned i;
 
+   drain_workqueue(bdev->wq);
+   destroy_workqueue(bdev->wq);
+
man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
ttm_resource_manager_set_used(man, false);
ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
@@ -240,9 +243,6 @@ void ttm_device_fini(struct ttm_device *bdev)
list_del(>device_list);
mutex_unlock(_global_mutex);
 
-   drain_workqueue(bdev->wq);
-   destroy_workqueue(bdev->wq);
-
spin_lock(>lru_lock);
for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
if (list_empty(>lru[0]))
-- 
2.25.1

Re: DRM FB emulation initialisation leaving the display disabled

2023-10-13 Thread Dave Stevenson

Hi Thomas

Thanks for the response.

On Thu, 12 Oct 2023 at 08:03, Thomas Zimmermann  wrote:
>
> Hi Dave
>
> Am 11.10.23 um 17:52 schrieb Dave Stevenson:
> > Hi Thomas (and everyone else)
> >
> > Maxime has suggested you're the person for DRM framebuffer emulation.
> >
> > I'm getting some unexpected behaviour when there are multiple DRM
> > drivers in play. In this case it happens to be vc4 and the tiny
> > hx8357d SPI display driver, but this affects most of the tiny DRM
> > drivers and also the DSI and DPI outputs on the Pi5.
> > We get different behaviour depending on whether vc4 or hx8357d
> > initialises first.
> >
> > If hx8357d loads first and registers as /dev/fb0 through the fb
> > emulation, then we get fbcon enabling the display and putting the
> > emulated framebuffer on it. vc4 then loads, registers /dev/fb1, and
> > through the hotplug handler it enables the display
> > (drm_fb_helper_hotplug_event calls, drm_fb_helper_set_par, which calls
> > __drm_fb_helper_restore_fbdev_mode_unlocked).
> >
> > If vc4 loads first and claims /dev/fb0, fbcon initalises and enables
> > the display. hx8357d then loads and registers as /dev/fb1. fbcon is
> > not configured for that fb, and there is no subsequent hotplug event
> > (SPI displays aren't hotpluggable), so we have a fully configured
> > framebuffer exposed to userspace but the display itself isn't enabled
> > so we don't see anything :-(
> > Open and close /dev/dri/card1 and the lastclose hook calls
> > drm_fb_helper_restore_fbdev_mode_unlocked and we get the display
> > enabled.
>
> What you're describing sounds like the recent bug report at
>
>https://gitlab.freedesktop.org/drm/amd/-/issues/2649
>
> which had similar symptoms with amdgpu. IIRC the console didn't
> initialize if the DRM hotplugging event happened before fbdev emulation
> was ready. DRM's fbdev code would then not see the hotplugged connector.
>
> Do you have commit 27655b9bb9f0 ("drm/client: Send hotplug event after
> registering a client") in your tree? (It's been ported into various
> stable branches as well.)

I was about to switch from my 6.1.55 build to the latest just to
ensure it hadn't been fixed, but that commit hit the 6.1 stable branch
in 6.1.43, so I do have it.

Other priorities have overtaken my investigation of this, but at least
knowing that it should enable the outputs (and nominally through the
hotplug hook) I can have another look at what is actually going on.

Thanks again
  Dave

> Best regard
> Thomas
>
> >
> > Is it intentional that we're left in this limbo state with the display
> > not enabled if fbcon isn't enabled on a framebuffer?
> >
> > We're trying to get people to transition from the fbdev drivers to DRM
> > equivalents, but this seems to be a backwards step if there is an
> > extra step required to get the display enabled. Many users are still
> > just using the framebuffer.
> >
> > Any input is much appreciated.
> >
> > Thanks,
> >Dave
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Frankenstrasse 146, 90461 Nuernberg, Germany
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
> HRB 36809 (AG Nuernberg)

Re: [PATCH v3 1/2] drm/bridge: megachips-stdpxxxx-ge-b850v3-fw: switch to drm_do_get_edid()

2023-10-13 Thread Robert Foss

On Thu, 21 Sep 2023 13:47:50 +0300, Ian Ray wrote:
> Migrate away from custom EDID parsing and validity checks.
> 
> Note:  This is a follow-up to the original RFC by Jani [1].  The first
> submission in this series should have been marked v2.
> 
> [1] 
> https://patchwork.freedesktop.org/patch/msgid/20230901102400.552254-1-jani.nik...@intel.com
> 
> [...]

Applied, thanks!

[1/2] drm/bridge: megachips-stdp-ge-b850v3-fw: switch to drm_do_get_edid()
  https://cgit.freedesktop.org/drm/drm-misc/commit/?id=e0eb7db49764
[2/2] MAINTAINERS: Update entry for megachips-stdp-ge-b850v3-fw
  https://cgit.freedesktop.org/drm/drm-misc/commit/?id=e755d439c1b7



Rob

Re: [PATCH v3 2/2] MAINTAINERS: Update entry for megachips-stdpxxxx-ge-b850v3-fw

2023-10-13 Thread Robert Foss

On Thu, Sep 21, 2023 at 12:48 PM Ian Ray  wrote:
>
> Replace Martin, who has left GE.
>
> Signed-off-by: Ian Ray 
> ---
>  MAINTAINERS | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bf0f54c..31bb835 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -13524,7 +13524,7 @@ F:  drivers/usb/mtu3/
>
>  MEGACHIPS STDP-GE-B850V3-FW LVDS/DP++ BRIDGES
>  M: Peter Senna Tschudin 
> -M: Martin Donnelly 
> +M: Ian Ray 
>  M: Martyn Welch 
>  S: Maintained
>  F: 
> Documentation/devicetree/bindings/display/bridge/megachips-stdp-ge-b850v3-fw.txt
> --
> 2.10.1
>

Reviewed-by: Robert Foss

Re: [PATCH v3 1/2] drm/bridge: megachips-stdpxxxx-ge-b850v3-fw: switch to drm_do_get_edid()

2023-10-13 Thread Robert Foss

On Thu, Sep 21, 2023 at 12:48 PM Ian Ray  wrote:
>
> Migrate away from custom EDID parsing and validity checks.
>
> Note:  This is a follow-up to the original RFC by Jani [1].  The first
> submission in this series should have been marked v2.
>
> [1] 
> https://patchwork.freedesktop.org/patch/msgid/20230901102400.552254-1-jani.nik...@intel.com
>
> Co-developed-by: Jani Nikula 
> Signed-off-by: Jani Nikula 
> Signed-off-by: Ian Ray 
> ---
>  .../drm/bridge/megachips-stdp-ge-b850v3-fw.c   | 57 
> --
>  1 file changed, 9 insertions(+), 48 deletions(-)
>
> diff --git a/drivers/gpu/drm/bridge/megachips-stdp-ge-b850v3-fw.c 
> b/drivers/gpu/drm/bridge/megachips-stdp-ge-b850v3-fw.c
> index 460db3c..e93083b 100644
> --- a/drivers/gpu/drm/bridge/megachips-stdp-ge-b850v3-fw.c
> +++ b/drivers/gpu/drm/bridge/megachips-stdp-ge-b850v3-fw.c
> @@ -65,12 +65,11 @@ struct ge_b850v3_lvds {
>
>  static struct ge_b850v3_lvds *ge_b850v3_lvds_ptr;
>
> -static u8 *stdp2690_get_edid(struct i2c_client *client)
> +static int stdp2690_read_block(void *context, u8 *buf, unsigned int block, 
> size_t len)
>  {
> +   struct i2c_client *client = context;
> struct i2c_adapter *adapter = client->adapter;
> -   unsigned char start = 0x00;
> -   unsigned int total_size;
> -   u8 *block = kmalloc(EDID_LENGTH, GFP_KERNEL);
> +   unsigned char start = block * EDID_LENGTH;
>
> struct i2c_msg msgs[] = {
> {
> @@ -81,53 +80,15 @@ static u8 *stdp2690_get_edid(struct i2c_client *client)
> }, {
> .addr   = client->addr,
> .flags  = I2C_M_RD,
> -   .len= EDID_LENGTH,
> -   .buf= block,
> +   .len= len,
> +   .buf= buf,
> }
> };
>
> -   if (!block)
> -   return NULL;
> +   if (i2c_transfer(adapter, msgs, 2) != 2)
> +   return -1;
>
> -   if (i2c_transfer(adapter, msgs, 2) != 2) {
> -   DRM_ERROR("Unable to read EDID.\n");
> -   goto err;
> -   }
> -
> -   if (!drm_edid_block_valid(block, 0, false, NULL)) {
> -   DRM_ERROR("Invalid EDID data\n");
> -   goto err;
> -   }
> -
> -   total_size = (block[EDID_EXT_BLOCK_CNT] + 1) * EDID_LENGTH;
> -   if (total_size > EDID_LENGTH) {
> -   kfree(block);
> -   block = kmalloc(total_size, GFP_KERNEL);
> -   if (!block)
> -   return NULL;
> -
> -   /* Yes, read the entire buffer, and do not skip the first
> -* EDID_LENGTH bytes.
> -*/
> -   start = 0x00;
> -   msgs[1].len = total_size;
> -   msgs[1].buf = block;
> -
> -   if (i2c_transfer(adapter, msgs, 2) != 2) {
> -   DRM_ERROR("Unable to read EDID extension blocks.\n");
> -   goto err;
> -   }
> -   if (!drm_edid_block_valid(block, 1, false, NULL)) {
> -   DRM_ERROR("Invalid EDID data\n");
> -   goto err;
> -   }
> -   }
> -
> -   return block;
> -
> -err:
> -   kfree(block);
> -   return NULL;
> +   return 0;
>  }
>
>  static struct edid *ge_b850v3_lvds_get_edid(struct drm_bridge *bridge,
> @@ -137,7 +98,7 @@ static struct edid *ge_b850v3_lvds_get_edid(struct 
> drm_bridge *bridge,
>
> client = ge_b850v3_lvds_ptr->stdp2690_i2c;
>
> -   return (struct edid *)stdp2690_get_edid(client);
> +   return drm_do_get_edid(connector, stdp2690_read_block, client);
>  }
>
>  static int ge_b850v3_lvds_get_modes(struct drm_connector *connector)

Reviewed-by: Robert Foss

Re: [PATCH] drm/atomic: Perform blocking commits on workqueue

2023-10-13 Thread Ray Strode

Hi

On Fri, Oct 13, 2023 at 5:41 AM Daniel Vetter  wrote:
> > I mean we're not talking about scientific computing, or code
> > compilation, or seti@home. We're talking about nearly the equivalent
> > of `while (1) __asm__ ("nop");`
>
> I don't think anyone said this shouldn't be fixed or improved.

Yea but we apparently disagree that it would be an improvement to stop
discrediting user space for driver problems.
You see, to me, there are two problems 1) The behavior itself is not
nice and should be fixed 2) The blame/accounting/attribution for a
driver problem is getting directed to user space. I think both issues
should be fixed. One brought the other issue to light, but both
problems, in my mind, are legitimate and should be addressed. You
think fixing the second problem is some tactic to paper over the first
problem. Christian thinks the second problem isn't a problem at all
but the correct design.  So none of us are completely aligned and I
don't see anyone changing their mind anytime soon.

> What I'm saying is that the atomic ioctl is not going to make guarantees
> that it will not take up to much cpu time (for some extremely vague value
> of "too much") to the point that userspace can configure it's compositor
> in a way that it _will_ get killed if we _ever_ violate this rule.
yea I find that strange, all kernel tasks have a certain implied
baseline responsibility to be good citizens to the system.
And how user space is configured seems nearly immaterial.

But we're talking in circles.

> Fixing the worst offenders I don't think will see any pushback at all.
Yea we all agree fixing this one busy loop is a problem but we don't
agree on where the cpu time blame should go.

> > But *this* feels like duct tape: You've already said there's no
> > guarantee the problem won't also happen during preliminary computation
> > during non-blocking commits or via other drm entry points. So it
> > really does seem like a fix that won't age well. I won't be surprised
> > if in ~3 years (or whatever) in some RHEL release there's a customer
> > bug leading to the real-time thread getting blocklisted for obscure
> > server display hardware because it's causing the session to tank on a
> > production machine.
>
> Yes the atomic ioctl makes no guarantees across drivers and hw platforms
> of guaranteed "we will never violate, you can kill your compositor if you
> do" cpu bound limits. We'll try to not suck too badly, and we'll try to
> focus on fixing the suck where it upsets the people the most.
>
> But there is fundamentally no hard real time guarantee in atomic. At least
> not with the current uapi.

So in your mind mutter should get out of the real-time thread business entirely?

> The problem isn't about wasting cpu time. It's about you having set up the
> system in a way so that mutter gets killed if we ever waste too much cpu
> time, ever.

mutter is not set up in a way to kill itself if the driver wastes too
much cpu time,
ever. mutter is set up in a way to kill itself if it, itself, wastes
too much cpu time, ever.
The fact that the kernel is killing it when a kernel driver is wasting
cpu time is the
bug that led to the patch in the first place!

> The latter is flat out not a guarantee the kernel currently makes, and I
> see no practical feasible way to make that happen. And pretending we do
> make this guarantee will just result in frustrated users filling bugs that
> they desktop session died and developers closing them as "too hard to
> fix".

I think all three of us agree busy loops are not nice (though maybe we
aren't completely aligned on severity). And I'll certainly concede
that fixing all the busy loops can be hard. Some of the cpu-bound code
paths may even be doing legitimate computation.  I still assert that
if the uapi calls into driver code that might potentially be doing
something slow where it can't sleep, it should be doing it on a
workqueue or thread. That seems orthogonal to fixing all the places
where the drivers aren't acting nicely.

> Maybe as a bit more useful outcome of this entire discussion: Could you
> sign up to write a documentation patch for the atomic ioctl that adds a
> paragraph stating extremely clearly that this ioctl does not make hard
> real time guarantees, but only best effort soft realtime, and even then
> priority of the effort is focused entirely on the !ALLOW_MODESET case,
> because that is the one users care about the most.

I don't think I'm the best positioned to write such documentation. I
still think what the kernel is doing is wrong here and I don't even
fully grok what you mean by best effort.

--Ray

[PATCH v2] drm/i915: Flush WC GGTT only on required platforms

2023-10-13 Thread Nirmoy Das

gen8_ggtt_invalidate() is only needed for limited set of platforms
where GGTT is mapped as WC otherwise this can cause unwanted
side-effects on XE_HP platforms where GFX_FLSH_CNTL_GEN6 is not
valid.

v2: Add a func to detect wc ggtt detection (Ville)

Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
Cc: Rodrigo Vivi 
Cc: Tvrtko Ursulin 
Cc: Joonas Lahtinen 
Cc: Jani Nikula 
Cc: Jonathan Cavitt 
Cc: John Harrison 
Cc: Andi Shyti 
Cc: Ville Syrjälä 
Cc:  # v6.2+
Suggested-by: Matt Roper 
Signed-off-by: Nirmoy Das 
Acked-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 35 +++-
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 4d7d88b92632..401667f83f96 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -195,6 +195,21 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt)
spin_unlock_irq(>lock);
 }
 
+static bool needs_wc_ggtt_mapping(struct drm_i915_private *i915)
+{
+   /*
+* On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range
+* will be dropped. For WC mappings in general we have 64 byte burst
+* writes when the WC buffer is flushed, so we can't use it, but have to
+* resort to an uncached mapping. The WC issue is easily caught by the
+* readback check when writing GTT PTE entries.
+*/
+   if (!IS_GEN9_LP(i915) && GRAPHICS_VER(i915) < 11)
+   return true;
+
+   return false;
+}
+
 static void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)
 {
struct intel_uncore *uncore = ggtt->vm.gt->uncore;
@@ -202,8 +217,12 @@ static void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)
/*
 * Note that as an uncached mmio write, this will flush the
 * WCB of the writes into the GGTT before it triggers the invalidate.
+*
+* Only perform this when GGTT is mapped as WC, see ggtt_probe_common().
 */
-   intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
+   if (needs_wc_ggtt_mapping(ggtt->vm.i915))
+   intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6,
+ GFX_FLSH_CNTL_EN);
 }
 
 static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)
@@ -1126,17 +1145,11 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, 
u64 size)
GEM_WARN_ON(pci_resource_len(pdev, GEN4_GTTMMADR_BAR) != 
gen6_gttmmadr_size(i915));
phys_addr = pci_resource_start(pdev, GEN4_GTTMMADR_BAR) + 
gen6_gttadr_offset(i915);
 
-   /*
-* On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range
-* will be dropped. For WC mappings in general we have 64 byte burst
-* writes when the WC buffer is flushed, so we can't use it, but have to
-* resort to an uncached mapping. The WC issue is easily caught by the
-* readback check when writing GTT PTE entries.
-*/
-   if (IS_GEN9_LP(i915) || GRAPHICS_VER(i915) >= 11)
-   ggtt->gsm = ioremap(phys_addr, size);
-   else
+   if (needs_wc_ggtt_mapping(i915))
ggtt->gsm = ioremap_wc(phys_addr, size);
+   else
+   ggtt->gsm = ioremap(phys_addr, size);
+
if (!ggtt->gsm) {
drm_err(>drm, "Failed to map the ggtt page table\n");
return -ENOMEM;
-- 
2.41.0

Re: [PATCH] drm/amdgpu/vkms: fix a possible null pointer dereference

2023-10-13 Thread Alex Deucher

On Thu, Oct 12, 2023 at 10:32 PM Ma Ke  wrote:
>
> In amdgpu_vkms_conn_get_modes(), the return value of drm_cvt_mode()
> is assigned to mode, which will lead to a NULL pointer dereference
> on failure of drm_cvt_mode(). Add a check to avoid null pointer
> dereference.
>
> Signed-off-by: Ma Ke 

Applied.  Thanks!

Alex

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> index 7148a216ae2f..db6fc0cb18eb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> @@ -239,6 +239,8 @@ static int amdgpu_vkms_conn_get_modes(struct 
> drm_connector *connector)
>
> for (i = 0; i < ARRAY_SIZE(common_modes); i++) {
> mode = drm_cvt_mode(dev, common_modes[i].w, 
> common_modes[i].h, 60, false, false, false);
> +   if (!mode)
> +   continue;
> drm_mode_probed_add(connector, mode);
> }
>
> --
> 2.37.2
>

Re: [PATCH drm-misc-next v6 4/6] drm/gpuvm: track/lock/validate external/evicted objects

2023-10-13 Thread Thomas Hellström

Hi,

On Mon, 2023-10-09 at 01:32 +0200, Danilo Krummrich wrote:
> Currently the DRM GPUVM offers common infrastructure to track GPU VA
> allocations and mappings, generically connect GPU VA mappings to
> their
> backing buffers and perform more complex mapping operations on the
> GPU VA
> space.
> 
> However, there are more design patterns commonly used by drivers,
> which
> can potentially be generalized in order to make the DRM GPUVM
> represent
> a basis for GPU-VM implementations. In this context, this patch aims
> at generalizing the following elements.
> 
> 1) Provide a common dma-resv for GEM objects not being used outside
> of
>    this GPU-VM.
> 
> 2) Provide tracking of external GEM objects (GEM objects which are
>    shared with other GPU-VMs).
> 
> 3) Provide functions to efficiently lock all GEM objects dma-resv the
>    GPU-VM contains mappings of.
> 
> 4) Provide tracking of evicted GEM objects the GPU-VM contains
> mappings
>    of, such that validation of evicted GEM objects is accelerated.
> 
> 5) Provide some convinience functions for common patterns.
> 
> Big thanks to Boris Brezillon for his help to figure out locking for
> drivers updating the GPU VA space within the fence signalling path.
> 
> Suggested-by: Matthew Brost 
> Signed-off-by: Danilo Krummrich 
> ---
>  drivers/gpu/drm/drm_gpuvm.c | 646
> 
>  include/drm/drm_gpuvm.h | 246 ++
>  2 files changed, 892 insertions(+)
> 

There's a checkpatch.pl warning and a number of random macro CHECKs if
using --strict.

Also the overall s/Returns:/Return/ (and possibly function line break).


> diff --git a/drivers/gpu/drm/drm_gpuvm.c
> b/drivers/gpu/drm/drm_gpuvm.c
> index 28282283ddaf..6977bd30eca5 100644
> --- a/drivers/gpu/drm/drm_gpuvm.c
> +++ b/drivers/gpu/drm/drm_gpuvm.c
> @@ -82,6 +82,21 @@
>   * _gem_object list of _gpuvm_bos for an existing instance
> of this
>   * particular combination. If not existent a new instance is created
> and linked
>   * to the _gem_object.
> + *
> + * _gpuvm_bo structures, since unique for a given _gpuvm,
> are also used
> + * as entry for the _gpuvm's lists of external and evicted
> objects. Those
> + * list are maintained in order to accelerate locking of dma-resv
> locks and
s/list/lists/
> + * validation of evicted objects bound in a _gpuvm. For
> instance, all
> + * _gem_object's _resv of a given _gpuvm can be locked
> by calling
> + * drm_gpuvm_exec_lock(). Once locked drivers can call
> drm_gpuvm_validate() in
> + * order to validate all evicted _gem_objects. It is also
> possible to lock
> + * additional _gem_objects by providing the corresponding
> parameters to
> + * drm_gpuvm_exec_lock() as well as open code the _exec loop
> while making
> + * use of helper functions such as drm_gpuvm_prepare_range() or
> + * drm_gpuvm_prepare_objects().
> + *
> + * Every bound _gem_object is treated as external object when
> its _resv
> + * structure is different than the _gpuvm's common _resv
> structure.
>   */
>  
>  /**
> @@ -429,6 +444,20 @@
>   * Subsequent calls to drm_gpuvm_bo_obtain() for the same _gpuvm
> and
>   * _gem_object must be able to observe previous creations and
> destructions
>   * of _gpuvm_bos in order to keep instances unique.
> + *
> + * The _gpuvm's lists for keeping track of external and evicted
> objects are
> + * protected against concurrent insertion / removal and iteration
> internally.

> + *
> + * However, drivers still need ensure to protect concurrent calls to
> functions
> + * iterating those lists, namely drm_gpuvm_prepare_objects() and
> + * drm_gpuvm_validate().


> + *
> + * Alternatively, drivers can set the _GPUVM_RESV_PROTECTED flag
> to indicate
> + * that the corresponding _resv locks are held in order to
> protect the
> + * lists. If _GPUVM_RESV_PROTECTED is set, internal locking is
> disabled and
> + * the corresponding lockdep checks are enabled. This is an
> optimization for
> + * drivers which are capable of taking the corresponding _resv
> locks and
> + * hence do not require internal locking.
>   */
>  
>  /**
> @@ -641,6 +670,195 @@
>   * }
>   */
>  
> +/**
> + * get_next_vm_bo_from_list() - get the next vm_bo element

macros use a different kerneldoc syntax:
https://return42.github.io/linuxdoc/linuxdoc-howto/kernel-doc-syntax.html#macro

> + * @__gpuvm: The GPU VM
> + * @__list_name: The name of the list we're iterating on
> + * @__local_list: A pointer to the local list used to store already
> iterated items
> + * @__prev_vm_bo: The previous element we got from
> drm_gpuvm_get_next_cached_vm_bo()
> + *
> + * This helper is here to provide lockless list iteration. Lockless
> as in, the
> + * iterator releases the lock immediately after picking the first
> element from
> + * the list, so list insertion deletion can happen concurrently.
> + *
> + * Elements popped from the original list are kept in a local list,
> so removal
> + * and is_empty checks can still happen while we're iterating the
>

[PATCH 4/4] drm/i915: Fix glk+ degamma LUT conversions

2023-10-13 Thread Ville Syrjala

From: Ville Syrjälä 

The current implementation of change_lut_val_precision() is just
a convoluted way of shifting by 8. Implement the proper rounding
by just using drm_color_lut_extract() and intel_color_lut_pack()
like everyone else does.

And as the uapi can't handle >=1.0 values but the hardware
can we need to clamp the results appropriately in the readout
path.

Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/i915/display/intel_color.c | 54 +++---
 1 file changed, 28 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_color.c 
b/drivers/gpu/drm/i915/display/intel_color.c
index a4b30614bd63..1cfbb3650304 100644
--- a/drivers/gpu/drm/i915/display/intel_color.c
+++ b/drivers/gpu/drm/i915/display/intel_color.c
@@ -1526,14 +1526,27 @@ static int glk_degamma_lut_size(struct drm_i915_private 
*i915)
return 35;
 }
 
-/*
- * change_lut_val_precision: helper function to upscale or downscale lut 
values.
- * Parameters 'to' and 'from' needs to be less than 32. This should be 
sufficient
- * as currently there are no lut values exceeding 32 bit.
- */
-static u32 change_lut_val_precision(u32 lut_val, int to, int from)
+static u32 glk_degamma_lut(const struct drm_color_lut *color)
+{
+   return color->green;
+}
+
+static void glk_degamma_lut_pack(struct drm_color_lut *entry, u32 val)
+{
+   /* PRE_CSC_GAMC_DATA is 3.16, clamp to 0.16 */
+   entry->red = entry->green = entry->blue = min(val, 0xu);
+}
+
+static u32 mtl_degamma_lut(const struct drm_color_lut *color)
+{
+   return drm_color_lut_extract(color->green, 24);
+}
+
+static void mtl_degamma_lut_pack(struct drm_color_lut *entry, u32 val)
 {
-   return mul_u32_u32(lut_val, (1 << to)) / (1 << from);
+   /* PRE_CSC_GAMC_DATA is 3.24, clamp to 0.16 */
+   entry->red = entry->green = entry->blue =
+   intel_color_lut_pack(min(val, 0xffu), 24);
 }
 
 static void glk_load_degamma_lut(const struct intel_crtc_state *crtc_state,
@@ -1570,20 +1583,16 @@ static void glk_load_degamma_lut(const struct 
intel_crtc_state *crtc_state,
 * ToDo: Extend to max 7.0. Enable 32 bit input value
 * as compared to just 16 to achieve this.
 */
-   u32 lut_val;
-
-   if (DISPLAY_VER(i915) >= 14)
-   lut_val = change_lut_val_precision(lut[i].green, 24, 
16);
-   else
-   lut_val = lut[i].green;
-
ilk_lut_write(crtc_state, PRE_CSC_GAMC_DATA(pipe),
- lut_val);
+ DISPLAY_VER(i915) >= 14 ?
+ mtl_degamma_lut([i]) : 
glk_degamma_lut([i]));
}
 
/* Clamp values > 1.0. */
while (i++ < glk_degamma_lut_size(i915))
-   ilk_lut_write(crtc_state, PRE_CSC_GAMC_DATA(pipe), 1 << 16);
+   ilk_lut_write(crtc_state, PRE_CSC_GAMC_DATA(pipe),
+ DISPLAY_VER(i915) >= 14 ?
+ 1 << 24 : 1 << 16);
 
ilk_lut_write(crtc_state, PRE_CSC_GAMC_INDEX(pipe), 0);
 }
@@ -3573,17 +3582,10 @@ static struct drm_property_blob 
*glk_read_degamma_lut(struct intel_crtc *crtc)
for (i = 0; i < lut_size; i++) {
u32 val = intel_de_read_fw(dev_priv, PRE_CSC_GAMC_DATA(pipe));
 
-   /*
-* For MTL and beyond, convert back the 24 bit lut values
-* read from HW to 16 bit values to maintain parity with
-* userspace values
-*/
if (DISPLAY_VER(dev_priv) >= 14)
-   val = change_lut_val_precision(val, 16, 24);
-
-   lut[i].red = val;
-   lut[i].green = val;
-   lut[i].blue = val;
+   mtl_degamma_lut_pack([i], val);
+   else
+   glk_degamma_lut_pack([i], val);
}
 
intel_de_write_fw(dev_priv, PRE_CSC_GAMC_INDEX(pipe),
-- 
2.41.0

[PATCH 3/4] drm/i915: s/clamp()/min()/ in i965_lut_11p6_max_pack()

2023-10-13 Thread Ville Syrjala

From: Ville Syrjälä 

Use min() instead of clamp() since the color values
involved are unsigned. No functional changes.

Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/i915/display/intel_color.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_color.c 
b/drivers/gpu/drm/i915/display/intel_color.c
index b01f463af861..a4b30614bd63 100644
--- a/drivers/gpu/drm/i915/display/intel_color.c
+++ b/drivers/gpu/drm/i915/display/intel_color.c
@@ -909,7 +909,7 @@ static void i965_lut_10p6_pack(struct drm_color_lut *entry, 
u32 ldw, u32 udw)
 static u16 i965_lut_11p6_max_pack(u32 val)
 {
/* PIPEGCMAX is 11.6, clamp to 10.6 */
-   return clamp_val(val, 0, 0x);
+   return min(val, 0xu);
 }
 
 static u32 ilk_lut_10(const struct drm_color_lut *color)
-- 
2.41.0

[PATCH 2/4] drm/i915: Adjust LUT rounding rules

2023-10-13 Thread Ville Syrjala

From: Ville Syrjälä 

drm_color_lut_extract() rounding was changed to follow the
OpenGL int<->float conversion rules. Adjust intel_color_lut_pack()
to match.

Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/i915/display/intel_color.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_color.c 
b/drivers/gpu/drm/i915/display/intel_color.c
index 2a2a163ea652..b01f463af861 100644
--- a/drivers/gpu/drm/i915/display/intel_color.c
+++ b/drivers/gpu/drm/i915/display/intel_color.c
@@ -785,14 +785,12 @@ static void chv_assign_csc(struct intel_crtc_state 
*crtc_state)
 /* convert hw value with given bit_precision to lut property val */
 static u32 intel_color_lut_pack(u32 val, int bit_precision)
 {
-   u32 max = 0x >> (16 - bit_precision);
-
-   val = clamp_val(val, 0, max);
-
-   if (bit_precision < 16)
-   val <<= 16 - bit_precision;
-
-   return val;
+   if (bit_precision > 16)
+   return DIV_ROUND_CLOSEST_ULL(mul_u32_u32(val, (1 << 16) - 1),
+(1 << bit_precision) - 1);
+   else
+   return DIV_ROUND_CLOSEST(val * ((1 << 16) - 1),
+(1 << bit_precision) - 1);
 }
 
 static u32 i9xx_lut_8(const struct drm_color_lut *color)
-- 
2.41.0

[PATCH 1/4] drm: Fix color LUT rounding

2023-10-13 Thread Ville Syrjala

From: Ville Syrjälä 

The current implementation of drm_color_lut_extract()
generates weird results. Eg. if we go through all the
values for 16->8bpc conversion we see the following pattern:

inout (count)
   0 -   7f ->  0 (128)
  80 -  17f ->  1 (256)
 180 -  27f ->  2 (256)
 280 -  37f ->  3 (256)
...
fb80 - fc7f -> fc (256)
fc80 - fd7f -> fd (256)
fd80 - fe7f -> fe (256)
fe80 -  -> ff (384)

So less values map to 0 and more values map 0xff, which
doesn't seem particularly great.

To get just the same number of input values to map to
the same output values we'd just need to drop the rounding
entrirely. But perhaps a better idea would be to follow the
OpenGL int<->float conversion rules, in which case we get
the following results:

inout (count)
   0 -   80 ->  0 (129)
  81 -  181 ->  1 (257)
 182 -  282 ->  2 (257)
 283 -  383 ->  3 (257)
...
fc7c - fd7c -> fc (257)
fd7d - fe7d -> fd (257)
fe7e - ff7e -> fe (257)
ff7f -  -> ff (129)

Note that since the divisor is constant the compiler
is able to optimize away the integer division in most
cases. The only exception is the _ULL() case on 32bit
architectures since that gets emitted as inline asm
via do_div() and thus the compiler doesn't get to
optimize it.

Signed-off-by: Ville Syrjälä 
---
 include/drm/drm_color_mgmt.h | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/include/drm/drm_color_mgmt.h b/include/drm/drm_color_mgmt.h
index 81c298488b0c..6be3cbe18944 100644
--- a/include/drm/drm_color_mgmt.h
+++ b/include/drm/drm_color_mgmt.h
@@ -36,20 +36,16 @@ struct drm_plane;
  *
  * Extract a degamma/gamma LUT value provided by user (in the form of
  * _color_lut entries) and round it to the precision supported by the
- * hardware.
+ * hardware, following OpenGL int<->float conversion rules.
  */
 static inline u32 drm_color_lut_extract(u32 user_input, int bit_precision)
 {
-   u32 val = user_input;
-   u32 max = 0x >> (16 - bit_precision);
-
-   /* Round only if we're not using full precision. */
-   if (bit_precision < 16) {
-   val += 1UL << (16 - bit_precision - 1);
-   val >>= 16 - bit_precision;
-   }
-
-   return clamp_val(val, 0, max);
+   if (bit_precision > 16)
+   return DIV_ROUND_CLOSEST_ULL(mul_u32_u32(user_input, (1 << 
bit_precision) - 1),
+(1 << 16) - 1);
+   else
+   return DIV_ROUND_CLOSEST(user_input * ((1 << bit_precision) - 
1),
+(1 << 16) - 1);
 }
 
 u64 drm_color_ctm_s31_32_to_qm_n(u64 user_input, u32 m, u32 n);
-- 
2.41.0

[PATCH 0/4] drm/i915: Fix LUT rounding

2023-10-13 Thread Ville Syrjala

From: Ville Syrjälä 

The current LUT rounding is generating weird results. Adjust
it to follow the OpenGL int<->float conversion rules.

Ville Syrjälä (4):
  drm: Fix color LUT rounding
  drm/i915: Adjust LUT rounding rules
  drm/i915: s/clamp()/min()/ in i965_lut_11p6_max_pack()
  drm/i915: Fix glk+ degamma LUT conversions

 drivers/gpu/drm/i915/display/intel_color.c | 70 +++---
 include/drm/drm_color_mgmt.h   | 18 +++---
 2 files changed, 42 insertions(+), 46 deletions(-)

-- 
2.41.0

Re: [PATCH] drm/i915: Flush WC GGTT only on required platforms

2023-10-13 Thread Nirmoy Das




On 10/13/2023 2:55 PM, Ville Syrjälä wrote:

On Fri, Oct 13, 2023 at 02:28:21PM +0200, Nirmoy Das wrote:

Hi Ville,

On 10/13/2023 12:50 PM, Ville Syrjälä wrote:

On Fri, Oct 13, 2023 at 12:31:40PM +0200, Nirmoy Das wrote:

gen8_ggtt_invalidate() is only needed for limitted set of platforms
where GGTT is mapped as WC

I know there is supposed to be some kind hw snooping of the ggtt
pte writes to invalidate the tlb, but are we sure GFX_FLSH_CNTL
has no other side effects we depend on?

I spent some time searching through the gfxspec. This GFX_FLSH_CNTL
register only seems to be for

invalidating TLB for GUnit  and (from git log ) we started to do that to
enable WC based GGTT updates.


So if I am not missing anything obvious then this should be safe.

OK.

The only code related complaint I have is that you are now duplicating
that same platform check in two different places. It's always better to
have a single point of truth instead of two or more, so that there is
no risk of introducing bugs due to mismatches.


I agree. I will resend with a static helper function to detect that.


Thanks,

Nirmoy





Regards,

Nirmoy


otherwise this can cause unwanted
side-effects on XE_HP platforms where GFX_FLSH_CNTL_GEN6 is not
valid.

Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
Cc: Rodrigo Vivi 
Cc: Tvrtko Ursulin 
Cc: Joonas Lahtinen 
Cc: Jani Nikula 
Cc: Jonathan Cavitt 
Cc: John Harrison 
Cc: Andi Shyti 
Cc:  # v6.2+
Suggested-by: Matt Roper 
Signed-off-by: Nirmoy Das 
---
   drivers/gpu/drm/i915/gt/intel_ggtt.c | 6 +-
   1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 4d7d88b92632..c2858d434bce 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -197,13 +197,17 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt)
   
   static void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)

   {
+   struct drm_i915_private *i915 = ggtt->vm.i915;
struct intel_uncore *uncore = ggtt->vm.gt->uncore;
   
   	/*

 * Note that as an uncached mmio write, this will flush the
 * WCB of the writes into the GGTT before it triggers the invalidate.
+*
+* Only perform this when GGTT is mapped as WC, see ggtt_probe_common().
 */
-   intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
+   if (!IS_GEN9_LP(i915) && GRAPHICS_VER(i915) < 11)
+   intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6, 
GFX_FLSH_CNTL_EN);
   }
   
   static void guc_ggtt_invalidate(struct i915_ggtt *ggtt)

--
2.41.0

1 2 >

1 - 100 of 167 matches

Mail list logo