date:20230423

Re: [PATCH] modules/firmware: add a new option to denote a firmware group to choose one.

2023-04-23 Thread Dave Airlie

On Fri, 21 Apr 2023 at 05:09, Lucas De Marchi  wrote:
>
> On Wed, Apr 19, 2023 at 02:36:52PM +1000, Dave Airlie wrote:
> >From: Dave Airlie 
> >
> >This adds a tag that will go into the module info, only one firmware from
> >the group given needs to be available for this driver to work. This allows
> >dracut to avoid adding in firmware that isn't needed.
> >
> >This just brackets a module list in the modinfo, the modules in the list
> >will get entries in reversed order so the last module in the list is the
> >preferred one.
> >
> >The corresponding dracut code is at:
> >https://github.com/dracutdevs/dracut/pull/2309
>
> it would be good to have the example usage in the commit message here so
> it can be easily checked as reference for other drivers.

Good point.

>
> I don't think we ever had any ordering in modinfo being relevant for
> other things. Considering the use case and that we could also use a
> similar thing for i915 / xe modules wrt the major version,
> couldn't we do something like below?
>
> MODULE_FIRMWARE_GROUP("nvidia/ga106/gsp/gsp");
> MODULE_FIRMWARE("nvidia/ga106/gsp/gsp-5258902.bin");
> MODULE_FIRMWARE("nvidia/ga106/gsp/gsp-5303002.bin");
>
> so the group is created by startswith() rather than by the order the
> modinfo appears in the elf section. In i915 we'd have:

Userspace parses these in reverse order, so it doesn't see the GROUP
until after the FIRMWARE entries. That can't work: by then it will
already have included all the ones below, which is why I bracketed
the list top and bottom with a group marker.

>
> MODULE_FIRMWARE_GROUP("i915/tgl_guc")
>
> There is still an order the kernel would probably like: latest version.
> But then it's an order only among things with the same key.

Dave.
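
To illustrate the bracketing argument above, here is a minimal sketch of how
a userspace consumer could pick one firmware per bracketed group. It assumes
the entries are handed over in the order the parser reads them, so the
preferred firmware of a group comes first. The names fw_entry, entry_kind,
the present flag and select_firmware are hypothetical and are not taken from
the patch or from dracut.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry kinds; the real modinfo tag names are not shown here. */
enum entry_kind { FW_GROUP_BEGIN, FW_GROUP_END, FW_FILE };

struct fw_entry {
	enum entry_kind kind;
	const char *name;	/* firmware path, NULL for group markers */
	bool present;		/* stand-in for "file exists on this system" */
};

/*
 * Walk the entries in parse order and copy the names to include into out[].
 * Outside a group every firmware is kept. Inside a group only the first
 * present firmware is kept, because the opening marker is seen before any
 * member of the group. Returns the number of names written.
 */
size_t select_firmware(const struct fw_entry *e, size_t n,
		       const char **out, size_t out_cap)
{
	bool in_group = false, group_done = false;
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		switch (e[i].kind) {
		case FW_GROUP_BEGIN:
			in_group = true;
			group_done = false;
			break;
		case FW_GROUP_END:
			in_group = false;
			break;
		case FW_FILE:
			/* Skip if the group is already satisfied or the file is missing. */
			if (in_group && (group_done || !e[i].present))
				break;
			if (count < out_cap)
				out[count++] = e[i].name;
			if (in_group)
				group_done = true;
			break;
		}
	}
	return count;
}
```

With a single marker placed after the members, the parser would have copied
every member before learning that they form a group, which is the failure
described above.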

Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-23 Thread kernel test robot

Hi André,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on drm/drm-next drm-exynos/exynos-drm-next 
drm-intel/for-linux-next drm-intel/for-linux-next-fixes drm-tip/drm-tip 
linus/master v6.3 next-20230421]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/drm-amdgpu-Mark-contexts-guilty-for-any-reset-type/20230424-094534
base:   git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link:
https://lore.kernel.org/r/20230424014324.218531-1-andrealmeid%40igalia.com
patch subject: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type
config: s390-allyesconfig 
(https://download.01.org/0day-ci/archive/20230424/202304241259.qq0dmlud-...@intel.com/config)
compiler: s390-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/intel-lab-lkp/linux/commit/ea7b1d78b677fdcf5f4776e63de611a2681cd5fb
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review 
Andr-Almeida/drm-amdgpu-Mark-contexts-guilty-for-any-reset-type/20230424-094534
git checkout ea7b1d78b677fdcf5f4776e63de611a2681cd5fb
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 
O=build_dir ARCH=s390 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 
O=build_dir ARCH=s390 SHELL=/bin/bash drivers/gpu/drm/amd/amdgpu/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot 
| Link: 
https://lore.kernel.org/oe-kbuild-all/202304241259.qq0dmlud-...@intel.com/

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c: In function 'amdgpu_device_pre_asic_reset':
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4738:28: warning: variable 'job' set but not used [-Wunused-but-set-variable]
    4738 |         struct amdgpu_job *job = NULL;
         |                            ^~~


vim +/job +4738 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

5740682e66cef5 Monk Liu          2017-10-25  4733
e3c1b0712fdb03 shaoyunl          2021-02-16  4734  int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
04442bf70debb1 Lijo Lazar        2021-03-16  4735                                   struct amdgpu_reset_context *reset_context)
26bc534094ed45 Andrey Grodzovsky 2018-11-22  4736  {
5c1e6fa49e8d8d Huang Rui         2021-12-16  4737          int i, r = 0;
04442bf70debb1 Lijo Lazar        2021-03-16 @4738          struct amdgpu_job *job = NULL;
04442bf70debb1 Lijo Lazar        2021-03-16  4739          bool need_full_reset =
04442bf70debb1 Lijo Lazar        2021-03-16  4740                  test_bit(AMDGPU_NEED_FULL_RESET, &reset_context->flags);
04442bf70debb1 Lijo Lazar        2021-03-16  4741
04442bf70debb1 Lijo Lazar        2021-03-16  4742          if (reset_context->reset_req_dev == adev)
04442bf70debb1 Lijo Lazar        2021-03-16  4743                  job = reset_context->job;
711826656bebb0 Monk Liu          2017-12-25  4744
b602ca5f31fe69 Tiecheng Zhou     2020-08-19  4745          if (amdgpu_sriov_vf(adev)) {
b602ca5f31fe69 Tiecheng Zhou     2020-08-19  4746                  /* stop the data exchange thread */
b602ca5f31fe69 Tiecheng Zhou     2020-08-19  4747                  amdgpu_virt_fini_data_exchange(adev);
b602ca5f31fe69 Tiecheng Zhou     2020-08-19  4748          }
b602ca5f31fe69 Tiecheng Zhou     2020-08-19  4749
9e225fb9e636b3 Andrey Grodzovsky 2022-06-18  4750          amdgpu_fence_driver_isr_toggle(adev, true);
9e225fb9e636b3 Andrey Grodzovsky 2022-06-18  4751
711826656bebb0 Monk Liu          2017-12-25  4752          /* block all schedulers and reset given job's ring */
0875dc9e80eb3b Chunming Zhou     2016-06-12  4753          for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
0875dc9e80eb3b Chunming Zhou     2016-06-12  4754                  struct amdgpu_ring *ring = adev->rings[i];
0875dc9e80eb3b Chunming Zhou     2016-06-12  4755
51687759be93fb Chunming Zhou     2017-04-24  4756                  if (!ring || !ring->sched.thread)
0875dc9e80eb3b Chunming Zhou     2016-06-12  4757                          continue;
5740682e66cef5 Monk Liu          2017-10-25  4758
c530b02f39850a Jack Zhang        2021-05-12  4759                  /*clear job fence from fence drv to avoid force_completion
c530b02f39850a Jack Zhang        2021-05-12  4760                   *leave NULL and vm flush fence in fence drv */
5c1e6fa49e8d8d Huang Rui         2021-12-16  4761                  amdgpu_fence_driver_clear_job_fences(ring);

[PATCH v1 6/6] drm/i915: Allow user to set cache at BO creation

2023-04-23 Thread fei . yang

From: Fei Yang 

To comply with the design that buffer objects shall have an immutable
cache setting throughout their life cycle, the {set, get}_caching ioctls
are no longer supported from MTL onward. With that change, the caching
policy can only be set at object creation time. The current code
applies a default (platform dependent) cache setting for all objects.
However, this is not optimal for performance tuning. This patch extends
the existing gem_create uAPI to let the user set the PAT index for the
object at creation time.
The new extension is platform independent, so UMDs can switch to using
this extension for older platforms as well, while {set, get}_caching are
still supported on these legacy platforms for compatibility reasons.

Cc: Chris Wilson 
Cc: Matt Roper 
Cc: Andi Shyti 
Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gem/i915_gem_create.c | 36 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c |  6 
 include/uapi/drm/i915_drm.h| 36 ++
 tools/include/uapi/drm/i915_drm.h  | 36 ++
 4 files changed, 114 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
b/drivers/gpu/drm/i915/gem/i915_gem_create.c
index bfe1dbda4cb7..723c3ddd6c74 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
@@ -245,6 +245,7 @@ struct create_ext {
unsigned int n_placements;
unsigned int placement_mask;
unsigned long flags;
+   unsigned int pat_index;
 };
 
 static void repr_placements(char *buf, size_t size,
@@ -394,11 +395,39 @@ static int ext_set_protected(struct i915_user_extension 
__user *base, void *data
return 0;
 }
 
+static int ext_set_pat(struct i915_user_extension __user *base, void *data)
+{
+   struct create_ext *ext_data = data;
+   struct drm_i915_private *i915 = ext_data->i915;
+   struct drm_i915_gem_create_ext_set_pat ext;
+   unsigned int max_pat_index;
+
+   BUILD_BUG_ON(sizeof(struct drm_i915_gem_create_ext_set_pat) !=
+offsetofend(struct drm_i915_gem_create_ext_set_pat, rsvd));
+
+   if (copy_from_user(&ext, base, sizeof(ext)))
+   return -EFAULT;
+
+   max_pat_index = INTEL_INFO(i915)->max_pat_index;
+
+   if (ext.pat_index > max_pat_index) {
+   drm_dbg(&i915->drm, "PAT index is invalid: %u\n",
+   ext.pat_index);
+   return -EINVAL;
+   }
+
+   ext_data->pat_index = ext.pat_index;
+
+   return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
+   [I915_GEM_CREATE_EXT_SET_PAT] = ext_set_pat,
 };
 
+#define PAT_INDEX_NOT_SET  0x
 /**
  * i915_gem_create_ext_ioctl - Creates a new mm object and returns a handle to 
it.
  * @dev: drm device pointer
@@ -418,6 +447,7 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
if (args->flags & ~I915_GEM_CREATE_EXT_FLAG_NEEDS_CPU_ACCESS)
return -EINVAL;
 
+   ext_data.pat_index = PAT_INDEX_NOT_SET;
ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
   create_extensions,
   ARRAY_SIZE(create_extensions),
@@ -454,5 +484,11 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void 
*data,
if (IS_ERR(obj))
return PTR_ERR(obj);
 
+   if (ext_data.pat_index != PAT_INDEX_NOT_SET) {
+   i915_gem_object_set_pat_index(obj, ext_data.pat_index);
+   /* Mark pat_index is set by UMD */
+   obj->cache_level = I915_CACHE_INVAL;
+   }
+
return i915_gem_publish(obj, file, &args->size, &args->handle);
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 27c948350b5b..61651f7e5806 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -209,6 +209,12 @@ bool i915_gem_object_can_bypass_llc(struct 
drm_i915_gem_object *obj)
if (!(obj->flags & I915_BO_ALLOC_USER))
return false;
 
+   /*
+* Always flush cache for UMD objects at creation time.
+*/
+   if (obj->cache_level == I915_CACHE_INVAL)
+   return true;
+
/*
 * EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
 * possible for userspace to bypass the GTT caching bits set by the
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index dba7c5a5b25e..03c5c314846e 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -3630,9 +3630,13 @@ struct drm_i915_gem_create_ext {
 *
 * For I915_GEM_CREATE_EXT_PROTECTED_CONTENT usage see
 * struct drm_i915_gem_create_ext_protected_content.
+*
+* For

[PATCH v1 5/6] drm/i915/mtl: end support for set caching ioctl

2023-04-23 Thread fei . yang

From: Fei Yang 

The design is to keep a buffer object's caching policy immutable
throughout its life cycle. This patch ends support for the set caching
ioctl from MTL onward. While doing that we also set BOs to be 1-way
coherent at creation time because the GPU no longer automatically snoops
the CPU cache. For userspace components needing to fine tune the caching
policy for BOs, a follow-up patch will extend the GEM_CREATE uAPI to
allow them to specify the caching mode at BO creation time.

Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
Reviewed-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c | 3 +++
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c  | 9 -
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 89938084af97..d5fd4c9cd9f8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -328,6 +328,9 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void 
*data,
if (IS_DGFX(i915))
return -ENODEV;
 
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+   return -EOPNOTSUPP;
+
switch (args->caching) {
case I915_CACHING_NONE:
level = I915_CACHE_NONE;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 37d1efcd3ca6..cad4a6017f4b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -601,7 +601,14 @@ static int shmem_object_init(struct intel_memory_region 
*mem,
obj->write_domain = I915_GEM_DOMAIN_CPU;
obj->read_domains = I915_GEM_DOMAIN_CPU;
 
-   if (HAS_LLC(i915))
+   /*
+* MTL doesn't snoop CPU cache by default for GPU access (namely
+* 1-way coherency). However some UMD's are currently depending on
+* that. Make 1-way coherent the default setting for MTL. A follow
+* up patch will extend the GEM_CREATE uAPI to allow UMD's specify
+* caching mode at BO creation time
+*/
+   if (HAS_LLC(i915) || (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)))
/* On some devices, we can have the GPU use the LLC (the CPU
 * cache) for about a 10% performance improvement
 * compared to uncached.  Graphics requests other than
-- 
2.25.1
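
The two version checks added by this patch can be exercised in isolation
with a small mock. The sketch below assumes a packed version/release integer
in the spirit of IP_VER(); the local macro definition, struct mock_device
and both helper functions are hypothetical stand-ins, not the driver's code.

```c
#include <errno.h>
#include <stdbool.h>

/* Local stand-in that packs version and release into one comparable value. */
#define IP_VER(ver, rel) (((ver) << 8) | (rel))

struct mock_device {
	unsigned int graphics_ver_full;	/* e.g. IP_VER(12, 70) */
	bool is_dgfx;
	bool has_llc;
};

/* Mirrors the order of the checks in the hunk: discrete first, then version. */
int mock_set_caching(const struct mock_device *dev)
{
	if (dev->is_dgfx)
		return -ENODEV;
	if (dev->graphics_ver_full >= IP_VER(12, 70))
		return -EOPNOTSUPP;
	return 0;
}

/* Objects default to the coherent setting with an LLC or on 12.70 and later. */
bool mock_default_coherent(const struct mock_device *dev)
{
	return dev->has_llc || dev->graphics_ver_full >= IP_VER(12, 70);
}
```

Because the release number sits in the low bits, a plain integer comparison
orders 12.55 before 12.70 as expected.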

[PATCH v1 4/6] drm/i915: make sure correct pte encode is used

2023-04-23 Thread fei . yang

From: Fei Yang 

PTE encode is platform dependent. After replacing cache_level with
pat_index, the newly introduced mtl_pte_encode is actually generic
for all gen12 platforms, thus rename it to gen12_pte_encode and
apply it to all gen12 platforms.

Cc: Chris Wilson 
Cc: Matt Roper 
Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index c046813514f4..a738a25dd857 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -55,9 +55,9 @@ static u64 gen8_pte_encode(dma_addr_t addr,
return pte;
 }
 
-static u64 mtl_pte_encode(dma_addr_t addr,
- unsigned int pat_index,
- u32 flags)
+static u64 gen12_pte_encode(dma_addr_t addr,
+   unsigned int pat_index,
+   u32 flags)
 {
gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
 
@@ -994,8 +994,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 */
ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
-   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
-   ppgtt->vm.pte_encode = mtl_pte_encode;
+   if (GRAPHICS_VER(gt->i915) >= 12)
+   ppgtt->vm.pte_encode = gen12_pte_encode;
else
ppgtt->vm.pte_encode = gen8_pte_encode;
 
-- 
2.25.1
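
As a rough illustration of selecting the encoder through a function pointer,
the sketch below picks an encode routine by graphics version. The bit
positions and all mock_ names are placeholders chosen for this example, and
the one-PTE-bit-per-PAT-index-bit mapping is an assumption, since the body of
gen12_pte_encode is not shown in this patch.

```c
#include <stdint.h>

typedef uint64_t mock_pte_t;
typedef mock_pte_t (*mock_pte_encode_fn)(uint64_t addr, unsigned int pat_index);

/* Placeholder bit positions, for illustration only. */
#define MOCK_PAGE_PRESENT (1ULL << 0)
#define MOCK_PAGE_RW      (1ULL << 1)
#define MOCK_PTE_PAT0     (1ULL << 3)
#define MOCK_PTE_PAT1     (1ULL << 4)

static mock_pte_t mock_gen8_pte_encode(uint64_t addr, unsigned int pat_index)
{
	(void)pat_index;	/* the older format is not modelled here */
	return addr | MOCK_PAGE_PRESENT | MOCK_PAGE_RW;
}

static mock_pte_t mock_gen12_pte_encode(uint64_t addr, unsigned int pat_index)
{
	mock_pte_t pte = addr | MOCK_PAGE_PRESENT | MOCK_PAGE_RW;

	/* Assumed mapping: each PAT index bit sets one PTE bit. */
	if (pat_index & 1u)
		pte |= MOCK_PTE_PAT0;
	if (pat_index & 2u)
		pte |= MOCK_PTE_PAT1;
	return pte;
}

/* Choose the encoder once, at address-space creation time. */
mock_pte_encode_fn mock_select_pte_encode(unsigned int graphics_ver)
{
	return graphics_ver >= 12 ? mock_gen12_pte_encode : mock_gen8_pte_encode;
}
```

Callers then go through the stored pointer and never name a specific
encoder, which is what makes the rename in this patch a two-line change at
the selection site.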

[PATCH v1 3/6] drm/i915: use pat_index instead of cache_level

2023-04-23 Thread fei . yang

From: Fei Yang 

Currently the KMD is using enum i915_cache_level to set caching policy for
buffer objects. This is flaky because the PAT index which really controls
the caching behavior in PTE has far more levels than what's defined in the
enum. In addition, the PAT index is platform dependent, having to translate
between i915_cache_level and PAT index is not reliable, and makes the code
more complicated.

From UMD's perspective there is also a necessity to set caching policy for
performance fine tuning. It's much easier for the UMD to directly use PAT
index because the behavior of each PAT index is clearly defined in Bspec.
Having the abstracted i915_cache_level sitting in between would only cause
more ambiguity.

For these reasons this patch replaces i915_cache_level with PAT index. Also
note, the cache_level is not completely removed yet, because the KMD still
has the need of creating buffer objects with simple cache settings such as
cached, uncached, or writethrough. For such simple cases, using cache_level
would help simplify the code.

Cc: Chris Wilson 
Cc: Matt Roper 
Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
---
 drivers/gpu/drm/i915/display/intel_dpt.c  | 12 +--
 drivers/gpu/drm/i915/gem/i915_gem_domain.c| 27 ++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 52 +++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  4 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 25 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 71 
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |  3 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 82 +--
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 20 ++---
 drivers/gpu/drm/i915/gt/intel_migrate.c   | 47 ++-
 drivers/gpu/drm/i915/gt/intel_migrate.h   | 13 ++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c| 47 ++-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c|  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c   | 55 ++---
 drivers/gpu/drm/i915/i915_gem.c   | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c |  8 +-
 drivers/gpu/drm/i915/i915_vma.c   | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h   |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  2 -
 drivers/gpu/drm/i915/selftests/i915_gem.c |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 drivers/gpu/drm/i915/selftests/mock_gtt.c |  8 +-
 36 files changed, 378 insertions(+), 239 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
b/drivers/gpu/drm/i915/display/intel_dpt.c
index c5eacfdba1a5..7c5fddb203ba 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -43,24 +43,24 @@ static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
 static void dpt_insert_page(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
-   enum i915_cache_level level,
+   unsigned int pat_index,
u32 flags)
 {
struct i915_dpt *dpt = i915_vm_to_dpt(vm);
gen8_pte_t __iomem *base = dpt->iomem;
 
gen8_set_pte(base + offset / I915_GTT_PAGE_SIZE,
-vm->pte_encode(addr, level, flags));
+vm->pte_encode(addr, pat_index, flags));
 }
 
 static void dpt_insert_entries(struct i915_address_space *vm,
   struct i915_vma_resource *vma_res,
-  enum i915_cache_level level,
+  unsigned int pat_index,
   u32 flags)
 {
struct i915_dpt *dpt = i915_vm_to_dpt(vm);
gen8_pte_t __iomem *base = dpt->iomem;
-   const gen8_pte_t pte_encode = vm->pte_encode(0, level, flags);
+   const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
struct sgt_iter sgt_iter;
dma_addr_t addr;
int i;
@@ -83,7 +83,7 @@ static void dpt_clear_range(struct i915_address_space *vm,
 static void dpt_bind_vma(struct i915_address_space *vm,
 struct i915_vm_pt_stash *stash,
 struct

[PATCH v1 2/6] drm/i915: preparation for using PAT index

2023-04-23 Thread fei . yang

From: Fei Yang 

This patch is a preparation for replacing enum i915_cache_level with the
PAT index. The caching policy for buffer objects is set through the PAT
index in the PTE; the old i915_cache_level is not sufficient to represent
all caching modes supported by the hardware.

Prepare the transition by adding some platform dependent data structures
and helper functions to translate the cache_level to pat_index.

cachelevel_to_pat: a platform dependent array mapping cache_level to
   pat_index.

max_pat_index: the maximum PAT index recommended in the hardware
   specification. Needed for validating the PAT index passed in from
   user space.

i915_gem_get_pat_index: function to convert cache_level to PAT index.

obj_to_i915(obj): macro moved to header file for wider usage.

I915_MAX_CACHE_LEVEL: upper bound of i915_cache_level for the
  convenience of coding.

Cc: Chris Wilson 
Cc: Matt Roper 
Cc: Andi Shyti 
Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
Reviewed-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  9 +++
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  4 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  |  6 ++
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |  6 ++
 drivers/gpu/drm/i915/i915_pci.c   | 79 ---
 drivers/gpu/drm/i915/intel_device_info.h  |  5 ++
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  9 +++
 9 files changed, 110 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 4666bb82f312..8c70a0ec7d2f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -45,6 +45,15 @@ static struct kmem_cache *slab_objects;
 
 static const struct drm_gem_object_funcs i915_gem_object_funcs;
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+   enum i915_cache_level level)
+{
+   if (drm_WARN_ON(&i915->drm, level >= I915_MAX_CACHE_LEVEL))
+   return 0;
+
+   return INTEL_INFO(i915)->cachelevel_to_pat[level];
+}
+
 struct drm_i915_gem_object *i915_gem_object_alloc(void)
 {
struct drm_i915_gem_object *obj;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 885ccde9dc3c..4c92e17b4337 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -20,6 +20,8 @@
 
 enum intel_region_id;
 
+#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
+
 static inline bool i915_gem_object_size_2big(u64 size)
 {
struct drm_i915_gem_object *obj;
@@ -30,6 +32,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
return false;
 }
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+   enum i915_cache_level level);
 void i915_gem_init__objects(struct drm_i915_private *i915);
 
 void i915_objects_module_exit(void);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 830c11431ee8..41b35abccf88 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -194,6 +194,7 @@ enum i915_cache_level {
 * engine.
 */
I915_CACHE_WT,
+   I915_MAX_CACHE_LEVEL,
 };
 
 enum i915_map_type {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index b1672e054b21..214763942aa2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -460,8 +460,6 @@ void i915_gem_shrinker_taints_mutex(struct drm_i915_private 
*i915,
fs_reclaim_release(GFP_KERNEL);
 }
 
-#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
-
 /**
  * i915_gem_object_make_unshrinkable - Hide the object from the shrinker. By
  * default all object types that support shrinking(see IS_SHRINKABLE), will 
also
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 11b91e0453c8..7a4b1d1afce9 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -78,6 +78,12 @@ static u64 mtl_pte_encode(dma_addr_t addr,
case I915_CACHE_WT:
pte |= GEN12_PPGTT_PTE_PAT0;
break;
+   default:
+   /* This should never happen. Added to deal with the compile
+* error due to the addition of I915_MAX_CACHE_LEVEL. Will
+* be removed by the pat_index patch.
+*/
+   break;
}
 
return pte;
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 20915edc8bd9..c8390d03fce2 100644
---
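
The lookup helper introduced by this patch amounts to a bounds-checked table
read. A standalone sketch follows; the mock_ types are hypothetical, and any
table values a caller fills in are examples, not values from a hardware
specification.

```c
/* Mock cache levels; the last enumerator doubles as the table size. */
enum mock_cache_level {
	MOCK_CACHE_NONE,
	MOCK_CACHE_LLC,
	MOCK_CACHE_L3_LLC,
	MOCK_CACHE_WT,
	MOCK_MAX_CACHE_LEVEL,
};

struct mock_device_info {
	unsigned int cachelevel_to_pat[MOCK_MAX_CACHE_LEVEL];
	unsigned int max_pat_index;
};

/*
 * Translate a cache level to a PAT index through the per-platform table.
 * Out-of-range levels fall back to index 0, mirroring the early return in
 * the patch (the kernel version also emits a warning).
 */
unsigned int mock_get_pat_index(const struct mock_device_info *info,
				enum mock_cache_level level)
{
	if ((unsigned int)level >= MOCK_MAX_CACHE_LEVEL)
		return 0;
	return info->cachelevel_to_pat[level];
}
```

Adding the MAX enumerator is what allows both the array size and the bounds
check to follow the enum automatically when a level is added.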

[PATCH v1 0/6] drm/i915: Allow user to set cache at BO creation

2023-04-23 Thread fei . yang

From: Fei Yang 

The first four patches in this series are taken from
https://patchwork.freedesktop.org/series/116868/
These patches are included here because the last patch
depends on the pat_index refactor.

This series focuses on uAPI changes:
1. end support for set caching ioctl [PATCH 5/6]
2. add set_pat extension for gem_create [PATCH 6/6]

Fei Yang (6):
  drm/i915/mtl: Add PTE encode function
  drm/i915: preparation for using PAT index
  drm/i915: use pat_index instead of cache_level
  drm/i915: make sure correct pte encode is used
  drm/i915/mtl: end support for set caching ioctl
  drm/i915: Allow user to set cache at BO creation

 drivers/gpu/drm/i915/display/intel_dpt.c  | 14 ++--
 drivers/gpu/drm/i915/gem/i915_gem_create.c| 36 
 drivers/gpu/drm/i915/gem/i915_gem_domain.c| 30 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 67 ++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  8 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 26 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c |  9 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 76 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |  3 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 84 +--
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 31 ---
 drivers/gpu/drm/i915/gt/intel_migrate.c   | 47 ++-
 drivers/gpu/drm/i915/gt/intel_migrate.h   | 13 ++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c| 47 ++-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c|  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c   | 55 +---
 drivers/gpu/drm/i915/i915_gem.c   | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c |  8 +-
 drivers/gpu/drm/i915/i915_pci.c   | 79 +++--
 drivers/gpu/drm/i915/i915_vma.c   | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h   |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  2 -
 drivers/gpu/drm/i915/intel_device_info.h  |  5 ++
 drivers/gpu/drm/i915/selftests/i915_gem.c |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  9 ++
 drivers/gpu/drm/i915/selftests/mock_gtt.c |  8 +-
 include/uapi/drm/i915_drm.h   | 36 
 tools/include/uapi/drm/i915_drm.h | 36 
 44 files changed, 652 insertions(+), 220 deletions(-)

-- 
2.25.1

[PATCH v1 1/6] drm/i915/mtl: Add PTE encode function

2023-04-23 Thread fei . yang

From: Fei Yang 

PTE encode functions are platform dependent. This patch implements
PTE functions for MTL, and ensures the correct PTE encode function
is used by calling pte_encode function pointer instead of the
hardcoded gen8 version of PTE encode.

Signed-off-by: Fei Yang 
Reviewed-by: Andrzej Hajda 
Reviewed-by: Andi Shyti 
Acked-by: Nirmoy Das 
---
 drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 45 
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 36 +--
 drivers/gpu/drm/i915/gt/intel_gtt.h  | 13 +--
 4 files changed, 83 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
b/drivers/gpu/drm/i915/display/intel_dpt.c
index b8027392144d..c5eacfdba1a5 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
vm->vma_ops.bind_vma= dpt_bind_vma;
vm->vma_ops.unbind_vma  = dpt_unbind_vma;
 
-   vm->pte_encode = gen8_ggtt_pte_encode;
+   vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
 
dpt->obj = dpt_obj;
dpt->obj->is_dpt = true;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 4daaa6f55668..11b91e0453c8 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
return pte;
 }
 
+static u64 mtl_pte_encode(dma_addr_t addr,
+ enum i915_cache_level level,
+ u32 flags)
+{
+   gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
+
+   if (unlikely(flags & PTE_READ_ONLY))
+   pte &= ~GEN8_PAGE_RW;
+
+   if (flags & PTE_LM)
+   pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
+
+   switch (level) {
+   case I915_CACHE_NONE:
+   pte |= GEN12_PPGTT_PTE_PAT1;
+   break;
+   case I915_CACHE_LLC:
+   case I915_CACHE_L3_LLC:
+   pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
+   break;
+   case I915_CACHE_WT:
+   pte |= GEN12_PPGTT_PTE_PAT0;
+   break;
+   }
+
+   return pte;
+}
+
 static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
 {
struct drm_i915_private *i915 = ppgtt->vm.i915;
@@ -427,7 +455,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
  u32 flags)
 {
struct i915_page_directory *pd;
-   const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+   const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, 
flags);
gen8_pte_t *vaddr;
 
pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
@@ -580,7 +608,7 @@ static void gen8_ppgtt_insert_huge(struct 
i915_address_space *vm,
   enum i915_cache_level cache_level,
   u32 flags)
 {
-   const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+   const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
unsigned int rem = sg_dma_len(iter->sg);
u64 start = vma_res->start;
 
@@ -743,7 +771,7 @@ static void gen8_ppgtt_insert_entry(struct 
i915_address_space *vm,
GEM_BUG_ON(pt->is_compact);
 
vaddr = px_vaddr(pt);
-   vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
+   vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
 }
 
@@ -773,7 +801,7 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct 
i915_address_space *vm,
}
 
vaddr = px_vaddr(pt);
-   vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
+   vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
 }
 
 static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
@@ -820,8 +848,8 @@ static int gen8_init_scratch(struct i915_address_space *vm)
pte_flags |= PTE_LM;
 
vm->scratch[0]->encode =
-   gen8_pte_encode(px_dma(vm->scratch[0]),
-   I915_CACHE_NONE, pte_flags);
+   vm->pte_encode(px_dma(vm->scratch[0]),
+  I915_CACHE_NONE, pte_flags);
 
for (i = 1; i <= vm->top; i++) {
struct drm_i915_gem_object *obj;
@@ -963,7 +991,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 */
ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
-   ppgtt->vm.pte_encode = gen8_pte_encode;
+   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
+   ppgtt->vm.pte_encode = mtl_pte_encode;
+   else
+   ppgtt->vm.pte_encode = gen8_pte_encode;
 
ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
ppgtt->vm.insert_entries =

[PATCH v1 4/4] drm/i915: make sure correct pte encode is used

2023-04-23 Thread fei . yang

From: Fei Yang 

PTE encode is platform dependent. After replacing cache_level with
pat_index, the newly introduced mtl_pte_encode is actually generic
for all gen12 platforms, thus rename it to gen12_pte_encode and
apply it to all gen12 platforms.

Cc: Chris Wilson 
Cc: Matt Roper 
Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index c046813514f4..a738a25dd857 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -55,9 +55,9 @@ static u64 gen8_pte_encode(dma_addr_t addr,
return pte;
 }
 
-static u64 mtl_pte_encode(dma_addr_t addr,
- unsigned int pat_index,
- u32 flags)
+static u64 gen12_pte_encode(dma_addr_t addr,
+   unsigned int pat_index,
+   u32 flags)
 {
gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
 
@@ -994,8 +994,8 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 */
ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
-   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
-   ppgtt->vm.pte_encode = mtl_pte_encode;
+   if (GRAPHICS_VER(gt->i915) >= 12)
+   ppgtt->vm.pte_encode = gen12_pte_encode;
else
ppgtt->vm.pte_encode = gen8_pte_encode;
 
-- 
2.25.1
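
As a reading aid for the selection logic in the patch above, here is a minimal standalone sketch of choosing a PTE encoder through a function pointer. It is not the driver code: every demo_* name, the plain integer version argument, and the placeholder bit layouts are assumptions made for illustration. The driver itself tests GRAPHICS_VER(gt->i915) and stores the result in ppgtt->vm.pte_encode.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's pte_encode callback type. */
typedef uint64_t (*demo_pte_encode_fn)(uint64_t addr,
				       unsigned int pat_index,
				       uint32_t flags);

/* Placeholder encoders: the bit layouts are illustrative, not the real ones. */
static uint64_t demo_gen8_pte_encode(uint64_t addr, unsigned int pat_index,
				     uint32_t flags)
{
	(void)pat_index;
	(void)flags;
	return addr | 0x1;	/* pretend bit 0 means "present" */
}

static uint64_t demo_gen12_pte_encode(uint64_t addr, unsigned int pat_index,
				      uint32_t flags)
{
	(void)flags;
	/* pretend the PAT index is stored starting at bit 3 */
	return addr | 0x1 | ((uint64_t)pat_index << 3);
}

/* Pick the encoder once, when the address space is created. */
static demo_pte_encode_fn demo_select_pte_encode(int graphics_ver)
{
	if (graphics_ver >= 12)
		return demo_gen12_pte_encode;
	return demo_gen8_pte_encode;
}
```

Callers then go through the stored pointer and never name a platform-specific encoder directly, which is what lets the rename in this patch stay local to gen8_ppgtt_create().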

[PATCH v1 2/4] drm/i915: preparation for using PAT index

2023-04-23 Thread fei . yang

From: Fei Yang 

This patch is a preparation for replacing enum i915_cache_level with PAT
index. Caching policy for buffer objects is set through the PAT index in
PTE, the old i915_cache_level is not sufficient to represent all caching
modes supported by the hardware.

Prepare for the transition by adding platform-dependent data structures
and helper functions that translate cache_level to pat_index.

cachelevel_to_pat: a platform dependent array mapping cache_level to
   pat_index.

max_pat_index: the maximum PAT index recommended in hardware specification
   Needed for validating the PAT index passed in from user
   space.

i915_gem_get_pat_index: function to convert cache_level to PAT index.

obj_to_i915(obj): macro moved to header file for wider usage.

I915_MAX_CACHE_LEVEL: upper bound of i915_cache_level for the
  convenience of coding.

Cc: Chris Wilson 
Cc: Matt Roper 
Cc: Andi Shyti 
Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
Reviewed-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c|  9 +++
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  4 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  1 +
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  |  6 ++
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |  6 ++
 drivers/gpu/drm/i915/i915_pci.c   | 79 ---
 drivers/gpu/drm/i915/intel_device_info.h  |  5 ++
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  9 +++
 9 files changed, 110 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 4666bb82f312..8c70a0ec7d2f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -45,6 +45,15 @@ static struct kmem_cache *slab_objects;
 
 static const struct drm_gem_object_funcs i915_gem_object_funcs;
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+   enum i915_cache_level level)
+{
+   if (drm_WARN_ON(&i915->drm, level >= I915_MAX_CACHE_LEVEL))
+   return 0;
+
+   return INTEL_INFO(i915)->cachelevel_to_pat[level];
+}
+
 struct drm_i915_gem_object *i915_gem_object_alloc(void)
 {
struct drm_i915_gem_object *obj;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 885ccde9dc3c..4c92e17b4337 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -20,6 +20,8 @@
 
 enum intel_region_id;
 
+#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
+
 static inline bool i915_gem_object_size_2big(u64 size)
 {
struct drm_i915_gem_object *obj;
@@ -30,6 +32,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
return false;
 }
 
+unsigned int i915_gem_get_pat_index(struct drm_i915_private *i915,
+   enum i915_cache_level level);
 void i915_gem_init__objects(struct drm_i915_private *i915);
 
 void i915_objects_module_exit(void);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 830c11431ee8..41b35abccf88 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -194,6 +194,7 @@ enum i915_cache_level {
 * engine.
 */
I915_CACHE_WT,
+   I915_MAX_CACHE_LEVEL,
 };
 
 enum i915_map_type {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index b1672e054b21..214763942aa2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -460,8 +460,6 @@ void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915,
fs_reclaim_release(GFP_KERNEL);
 }
 
-#define obj_to_i915(obj__) to_i915((obj__)->base.dev)
-
 /**
  * i915_gem_object_make_unshrinkable - Hide the object from the shrinker. By
  * default all object types that support shrinking(see IS_SHRINKABLE), will also
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 11b91e0453c8..7a4b1d1afce9 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -78,6 +78,12 @@ static u64 mtl_pte_encode(dma_addr_t addr,
case I915_CACHE_WT:
pte |= GEN12_PPGTT_PTE_PAT0;
break;
+   default:
+   /* This should never happen. Added to deal with the compile
+* error due to the addition of I915_MAX_CACHE_LEVEL. Will
+* be removed by the pat_index patch.
+*/
+   break;
}
 
return pte;
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 20915edc8bd9..c8390d03fce2 100644
---
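
The helpers described in this commit message (a per-platform cache_level to pat_index table, a bounds-checked lookup, and a maximum PAT index for validating values from user space) can be sketched in isolation as below. This is only an illustration under stated assumptions: the demo_* names are hypothetical, the table values are invented for the example and do not come from any hardware specification, and the real lookup uses INTEL_INFO(i915)->cachelevel_to_pat.

```c
#include <stdbool.h>

enum demo_cache_level {
	DEMO_CACHE_NONE,
	DEMO_CACHE_LLC,
	DEMO_CACHE_L3_LLC,
	DEMO_CACHE_WT,
	DEMO_MAX_CACHE_LEVEL,	/* upper bound, mirrors I915_MAX_CACHE_LEVEL */
};

struct demo_device_info {
	unsigned int cachelevel_to_pat[DEMO_MAX_CACHE_LEVEL];
	unsigned int max_pat_index;
};

/* Invented values, for illustration only. */
static const struct demo_device_info demo_info = {
	.cachelevel_to_pat = {
		[DEMO_CACHE_NONE]   = 3,
		[DEMO_CACHE_LLC]    = 0,
		[DEMO_CACHE_L3_LLC] = 0,
		[DEMO_CACHE_WT]     = 2,
	},
	.max_pat_index = 4,
};

/* Translate a cache level, falling back to index 0 when out of range. */
static unsigned int demo_get_pat_index(const struct demo_device_info *info,
				       enum demo_cache_level level)
{
	if ((unsigned int)level >= DEMO_MAX_CACHE_LEVEL)
		return 0;

	return info->cachelevel_to_pat[level];
}

/* Validate a PAT index supplied by user space. */
static bool demo_pat_index_valid(const struct demo_device_info *info,
				 unsigned int pat_index)
{
	return pat_index <= info->max_pat_index;
}
```

Keeping the sentinel as the last enumerator means the table size and the range check stay in step whenever a cache level is added.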

[PATCH v1 3/4] drm/i915: use pat_index instead of cache_level

2023-04-23 Thread fei . yang

From: Fei Yang 

Currently the KMD is using enum i915_cache_level to set caching policy for
buffer objects. This is flaky because the PAT index which really controls
the caching behavior in PTE has far more levels than what's defined in the
enum. In addition, the PAT index is platform dependent, having to translate
between i915_cache_level and PAT index is not reliable, and makes the code
more complicated.

From UMD's perspective there is also a necessity to set caching policy for
performance fine tuning. It's much easier for the UMD to directly use PAT
index because the behavior of each PAT index is clearly defined in Bspec.
Having the abstracted i915_cache_level sitting in between would only cause
more ambiguity.

For these reasons this patch replaces i915_cache_level with PAT index. Also
note, the cache_level is not completely removed yet, because the KMD still
has the need of creating buffer objects with simple cache settings such as
cached, uncached, or writethrough. For such simple cases, using cache_level
would help simplify the code.

Cc: Chris Wilson 
Cc: Matt Roper 
Signed-off-by: Fei Yang 
Reviewed-by: Andi Shyti 
---
 drivers/gpu/drm/i915/display/intel_dpt.c  | 12 +--
 drivers/gpu/drm/i915/gem/i915_gem_domain.c| 27 ++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 52 +++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  4 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 25 +-
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 71 
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |  3 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 82 +--
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 20 ++---
 drivers/gpu/drm/i915/gt/intel_migrate.c   | 47 ++-
 drivers/gpu/drm/i915/gt/intel_migrate.h   | 13 ++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c| 47 ++-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c|  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c   | 55 ++---
 drivers/gpu/drm/i915/i915_gem.c   | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c |  8 +-
 drivers/gpu/drm/i915/i915_vma.c   | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h   |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  2 -
 drivers/gpu/drm/i915/selftests/i915_gem.c |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 drivers/gpu/drm/i915/selftests/mock_gtt.c |  8 +-
 36 files changed, 378 insertions(+), 239 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
index c5eacfdba1a5..7c5fddb203ba 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -43,24 +43,24 @@ static void gen8_set_pte(void __iomem *addr, gen8_pte_t pte)
 static void dpt_insert_page(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
-   enum i915_cache_level level,
+   unsigned int pat_index,
u32 flags)
 {
struct i915_dpt *dpt = i915_vm_to_dpt(vm);
gen8_pte_t __iomem *base = dpt->iomem;
 
gen8_set_pte(base + offset / I915_GTT_PAGE_SIZE,
-vm->pte_encode(addr, level, flags));
+vm->pte_encode(addr, pat_index, flags));
 }
 
 static void dpt_insert_entries(struct i915_address_space *vm,
   struct i915_vma_resource *vma_res,
-  enum i915_cache_level level,
+  unsigned int pat_index,
   u32 flags)
 {
struct i915_dpt *dpt = i915_vm_to_dpt(vm);
gen8_pte_t __iomem *base = dpt->iomem;
-   const gen8_pte_t pte_encode = vm->pte_encode(0, level, flags);
+   const gen8_pte_t pte_encode = vm->pte_encode(0, pat_index, flags);
struct sgt_iter sgt_iter;
dma_addr_t addr;
int i;
@@ -83,7 +83,7 @@ static void dpt_clear_range(struct i915_address_space *vm,
 static void dpt_bind_vma(struct i915_address_space *vm,
 struct i915_vm_pt_stash *stash,
 struct

[PATCH v1 0/4] drm/i915/mtl: add PTE encode function

2023-04-23 Thread fei . yang

From: Fei Yang 

These patches are extracted from series
https://patchwork.freedesktop.org/series/115980/

This series starts by adding PTE encode functions for MTL, as it can no
longer reuse the PTE encode functions for GEN8 due to PAT index changes.
Then there are patches refactoring the cache policy programming so that
the PTE encode functions can be unified across all GEN12 platforms.
This refactoring is also important for implementing the design that allows
userspace to directly set the cache policy for each buffer object.

Fei Yang (4):
  drm/i915/mtl: Add PTE encode function
  drm/i915: preparation for using PAT index
  drm/i915: use pat_index instead of cache_level
  drm/i915: make sure correct pte encode is used

 drivers/gpu/drm/i915/display/intel_dpt.c  | 14 ++--
 drivers/gpu/drm/i915/gem/i915_gem_domain.c| 27 ++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c| 61 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  8 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 26 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 76 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h  |  3 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 84 +--
 drivers/gpu/drm/i915/gt/intel_gtt.h   | 31 ---
 drivers/gpu/drm/i915/gt/intel_migrate.c   | 47 ++-
 drivers/gpu/drm/i915/gt/intel_migrate.h   | 13 ++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c| 47 ++-
 drivers/gpu/drm/i915/gt/selftest_reset.c  |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c|  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c   | 55 +---
 drivers/gpu/drm/i915/i915_gem.c   | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c |  8 +-
 drivers/gpu/drm/i915/i915_pci.c   | 79 +++--
 drivers/gpu/drm/i915/i915_vma.c   | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h   |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h |  2 -
 drivers/gpu/drm/i915/intel_device_info.h  |  5 ++
 drivers/gpu/drm/i915/selftests/i915_gem.c |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  9 ++
 drivers/gpu/drm/i915/selftests/mock_gtt.c |  8 +-
 40 files changed, 527 insertions(+), 219 deletions(-)

-- 
2.25.1

[PATCH v1 1/4] drm/i915/mtl: Add PTE encode function

2023-04-23 Thread fei . yang

From: Fei Yang 

PTE encode functions are platform dependent. This patch implements
PTE functions for MTL, and ensures the correct PTE encode function
is used by calling pte_encode function pointer instead of the
hardcoded gen8 version of PTE encode.

Signed-off-by: Fei Yang 
Reviewed-by: Andrzej Hajda 
Reviewed-by: Andi Shyti 
Acked-by: Nirmoy Das 
---
 drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 45 
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 36 +--
 drivers/gpu/drm/i915/gt/intel_gtt.h  | 13 +--
 4 files changed, 83 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
index b8027392144d..c5eacfdba1a5 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
vm->vma_ops.bind_vma= dpt_bind_vma;
vm->vma_ops.unbind_vma  = dpt_unbind_vma;
 
-   vm->pte_encode = gen8_ggtt_pte_encode;
+   vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
 
dpt->obj = dpt_obj;
dpt->obj->is_dpt = true;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 4daaa6f55668..11b91e0453c8 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
return pte;
 }
 
+static u64 mtl_pte_encode(dma_addr_t addr,
+ enum i915_cache_level level,
+ u32 flags)
+{
+   gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
+
+   if (unlikely(flags & PTE_READ_ONLY))
+   pte &= ~GEN8_PAGE_RW;
+
+   if (flags & PTE_LM)
+   pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
+
+   switch (level) {
+   case I915_CACHE_NONE:
+   pte |= GEN12_PPGTT_PTE_PAT1;
+   break;
+   case I915_CACHE_LLC:
+   case I915_CACHE_L3_LLC:
+   pte |= GEN12_PPGTT_PTE_PAT0 | GEN12_PPGTT_PTE_PAT1;
+   break;
+   case I915_CACHE_WT:
+   pte |= GEN12_PPGTT_PTE_PAT0;
+   break;
+   }
+
+   return pte;
+}
+
 static void gen8_ppgtt_notify_vgt(struct i915_ppgtt *ppgtt, bool create)
 {
struct drm_i915_private *i915 = ppgtt->vm.i915;
@@ -427,7 +455,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
  u32 flags)
 {
struct i915_page_directory *pd;
-   const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+   const gen8_pte_t pte_encode = ppgtt->vm.pte_encode(0, cache_level, flags);
gen8_pte_t *vaddr;
 
pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
@@ -580,7 +608,7 @@ static void gen8_ppgtt_insert_huge(struct i915_address_space *vm,
   enum i915_cache_level cache_level,
   u32 flags)
 {
-   const gen8_pte_t pte_encode = gen8_pte_encode(0, cache_level, flags);
+   const gen8_pte_t pte_encode = vm->pte_encode(0, cache_level, flags);
unsigned int rem = sg_dma_len(iter->sg);
u64 start = vma_res->start;
 
@@ -743,7 +771,7 @@ static void gen8_ppgtt_insert_entry(struct i915_address_space *vm,
GEM_BUG_ON(pt->is_compact);
 
vaddr = px_vaddr(pt);
-   vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags);
+   vaddr[gen8_pd_index(idx, 0)] = vm->pte_encode(addr, level, flags);
drm_clflush_virt_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr));
 }
 
@@ -773,7 +801,7 @@ static void __xehpsdv_ppgtt_insert_entry_lm(struct i915_address_space *vm,
}
 
vaddr = px_vaddr(pt);
-   vaddr[gen8_pd_index(idx, 0) / 16] = gen8_pte_encode(addr, level, flags);
+   vaddr[gen8_pd_index(idx, 0) / 16] = vm->pte_encode(addr, level, flags);
 }
 
 static void xehpsdv_ppgtt_insert_entry(struct i915_address_space *vm,
@@ -820,8 +848,8 @@ static int gen8_init_scratch(struct i915_address_space *vm)
pte_flags |= PTE_LM;
 
vm->scratch[0]->encode =
-   gen8_pte_encode(px_dma(vm->scratch[0]),
-   I915_CACHE_NONE, pte_flags);
+   vm->pte_encode(px_dma(vm->scratch[0]),
+  I915_CACHE_NONE, pte_flags);
 
for (i = 1; i <= vm->top; i++) {
struct drm_i915_gem_object *obj;
@@ -963,7 +991,10 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 */
ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
-   ppgtt->vm.pte_encode = gen8_pte_encode;
+   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 70))
+   ppgtt->vm.pte_encode = mtl_pte_encode;
+   else
+   ppgtt->vm.pte_encode = gen8_pte_encode;
 
ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND;
ppgtt->vm.insert_entries =
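
For readers who want to check the bit arithmetic of the new encoder by hand, the following standalone sketch mirrors the switch statement added in this patch. It is a simplified illustration, not the driver function: the demo_* names are hypothetical, the bit positions are assumptions chosen for the example (consult the driver headers for the real definitions), and the PTE_LM handling is left out.

```c
#include <stdint.h>

/* Bit positions below are assumptions for this sketch. */
#define DEMO_PAGE_PRESENT	(1ull << 0)
#define DEMO_PAGE_RW		(1ull << 1)
#define DEMO_PTE_PAT0		(1ull << 3)
#define DEMO_PTE_PAT1		(1ull << 4)

#define DEMO_PTE_READ_ONLY	(1u << 0)

enum demo_cache_level {
	DEMO_CACHE_NONE,
	DEMO_CACHE_LLC,
	DEMO_CACHE_L3_LLC,
	DEMO_CACHE_WT,
};

static uint64_t demo_mtl_pte_encode(uint64_t addr,
				    enum demo_cache_level level,
				    uint32_t flags)
{
	uint64_t pte = addr | DEMO_PAGE_PRESENT | DEMO_PAGE_RW;

	/* Read-only mappings drop the write bit. */
	if (flags & DEMO_PTE_READ_ONLY)
		pte &= ~DEMO_PAGE_RW;

	/* Each cache level selects a 2-bit PAT value, as in the patch. */
	switch (level) {
	case DEMO_CACHE_NONE:
		pte |= DEMO_PTE_PAT1;
		break;
	case DEMO_CACHE_LLC:
	case DEMO_CACHE_L3_LLC:
		pte |= DEMO_PTE_PAT0 | DEMO_PTE_PAT1;
		break;
	case DEMO_CACHE_WT:
		pte |= DEMO_PTE_PAT0;
		break;
	}

	return pte;
}
```

With these assumed positions, an uncached read-write page at 0x1000 encodes to 0x1013, and a write-through read-only page at the same address encodes to 0x1009.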

[PATCH] drm/amdgpu: Mark contexts guilty for any reset type

2023-04-23 Thread André Almeida

When a DRM job times out, the GPU has probably hung, and amdgpu has several
ways to deal with that, ranging from soft recoveries to a full device
reset. In any case, when userspace asks the kernel for the state of the
context (via AMDGPU_CTX_OP_QUERY_STATE), the kernel reports that the device
was reset, regardless of whether a full reset happened or not.

However, amdgpu only marks a context guilty in the ASIC reset path. This
makes the userspace report incomplete, given that on the soft recovery path
the guilty context is not told that it is the guilty one.

Fix this by marking the context guilty for every type of reset when a
job times out.

Signed-off-by: André Almeida 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 8 +++-
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ac78caa7cba8..ea169d1689e2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4771,9 +4771,6 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
 
amdgpu_fence_driver_isr_toggle(adev, false);
 
-   if (job && job->vm)
-   drm_sched_increase_karma(&job->base);
-
r = amdgpu_reset_prepare_hwcontext(adev, reset_context);
/* If reset handler not implemented, continue; otherwise return */
if (r == -ENOSYS)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index c3d9d75143f4..097ed8f06865 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -51,6 +51,13 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
memset(&ti, 0, sizeof(struct amdgpu_task_info));
adev->job_hang = true;
 
+   amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
+
+   if (job && job->vm) {
+   DRM_INFO("marking %s context as guilty", ti.process_name);
+   drm_sched_increase_karma(&job->base);
+   }
+
if (amdgpu_gpu_recovery &&
amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) {
DRM_ERROR("ring %s timeout, but soft recovered\n",
@@ -58,7 +65,6 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
goto exit;
}
 
-   amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti);
DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n",
  job->base.sched->name, atomic_read(&ring->fence_drv.last_seq),
  ring->fence_drv.sync_seq);
-- 
2.40.0
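
The ordering change in this patch can be modelled with a small standalone sketch: the guilty mark is applied at the top of the timeout handler, so both the soft recovery exit and the full reset exit leave the context marked. This is a toy model under stated assumptions, not amdgpu code. The demo_* types, the karma counter, and the boolean that stands in for amdgpu_ring_soft_recovery() succeeding are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a scheduler job; karma > 0 means "marked guilty". */
struct demo_job {
	int karma;
	bool has_vm;
};

enum demo_recovery {
	DEMO_SOFT_RECOVERY,
	DEMO_FULL_RESET,
};

static enum demo_recovery demo_job_timedout(struct demo_job *job,
					    bool soft_recovery_works)
{
	/* Mark the job guilty before deciding how to recover. */
	if (job != NULL && job->has_vm)
		job->karma++;

	/* Early exit: previously this path skipped the guilty mark. */
	if (soft_recovery_works)
		return DEMO_SOFT_RECOVERY;

	return DEMO_FULL_RESET;
}
```

Because the mark is applied once in the timeout handler, the reset path no longer needs its own call, which matches the three lines removed from amdgpu_device_pre_asic_reset() in the diff.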

Re: [PATCH] drm/meson: set variables meson_hdmi_* storage-class-specifier to static

2023-04-23 Thread Martin Blumenstingl

On Sun, Apr 23, 2023 at 4:53 PM Tom Rix  wrote:
>
> smatch has several simailar warnings to
s/simailar/similar/

> drivers/gpu/drm/meson/meson_venc.c:189:28: warning: symbol
>   'meson_hdmi_enci_mode_480i' was not declared. Should it be static?
>
> These variables are only used in their defining file so should be static
>
> Signed-off-by: Tom Rix 
With above typo fixed (or with a comment from the maintainers that
they can fix it while applying):
Acked-by: Martin Blumenstingl

Re: [PATCH] drm/probe_helper: fix the warning reported when calling drm_kms_helper_poll_disable during suspend

2023-04-23 Thread Janne Grunau

On 2023-04-20 23:07:01 +0300, Dmitry Baryshkov wrote:
> On Thu, 20 Apr 2023 at 23:01, Janne Grunau  wrote:
> >
> > On 2023-03-28 10:31:29 +0800, Zongmin Zhou wrote:
> > > When drivers call drm_kms_helper_poll_disable from
> > > their device suspend implementation without having enabled output polling before,
> > > the following warning is reported, because work->func has not been initialized:
> >
> > we see the same warning with the work-in-progress KMS driver for Apple
> > silicon SoCs. The connectors do not need to be polled, so the driver never
> > calls drm_kms_helper_poll_init().
> >
> > > [   55.141361] WARNING: CPU: 3 PID: 372 at kernel/workqueue.c:3066 
> > > __flush_work+0x22f/0x240
> > > [   55.141382] Modules linked in: nls_iso8859_1 snd_hda_codec_generic 
> > > ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi 
> > > snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi 
> > > snd_seq_midi_event snd_rawmidi snd_seq intel_rapl_msr intel_rapl_common 
> > > bochs drm_vram_helper drm_ttm_helper snd_seq_device nfit ttm 
> > > crct10dif_pclmul snd_timer ghash_clmulni_intel binfmt_misc sha512_ssse3 
> > > aesni_intel drm_kms_helper joydev input_leds syscopyarea crypto_simd snd 
> > > cryptd sysfillrect sysimgblt mac_hid serio_raw soundcore qemu_fw_cfg 
> > > sch_fq_codel msr parport_pc ppdev lp parport drm ramoops reed_solomon 
> > > pstore_blk pstore_zone efi_pstore virtio_rng ip_tables x_tables autofs4 
> > > hid_generic usbhid hid ahci virtio_net i2c_i801 crc32_pclmul psmouse 
> > > virtio_scsi libahci i2c_smbus lpc_ich xhci_pci net_failover virtio_blk 
> > > xhci_pci_renesas failover
> > > [   55.141430] CPU: 3 PID: 372 Comm: kworker/u16:9 Not tainted 6.2.0-rc6+ 
> > > #16
> > > [   55.141433] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
> > > rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> > > [   55.141435] Workqueue: events_unbound async_run_entry_fn
> > > [   55.141441] RIP: 0010:__flush_work+0x22f/0x240
> > > [   55.141444] Code: 8b 43 28 48 8b 53 30 89 c1 e9 f9 fe ff ff 4c 89 f7 
> > > e8 b5 95 d9 00 e8 00 53 08 00 45 31 ff e9 11 ff ff ff 0f 0b e9 0a ff ff 
> > > ff <0f> 0b 45 31 ff e9 00 ff ff ff e8 e2 54 d8 00 66 90 90 90 90 90 90
> > > [   55.141446] RSP: 0018:ff59221940833c18 EFLAGS: 00010246
> > > [   55.141449] RAX:  RBX:  RCX: 
> > > 9b72bcbe
> > > [   55.141450] RDX: 0001 RSI: 0001 RDI: 
> > > ff3ea01e4265e330
> > > [   55.141451] RBP: ff59221940833c90 R08:  R09: 
> > > 8080808080808080
> > > [   55.141453] R10: ff3ea01e42b3caf4 R11: 000f R12: 
> > > ff3ea01e4265e330
> > > [   55.141454] R13: 0001 R14: ff3ea01e505e5e80 R15: 
> > > 0001
> > > [   55.141455] FS:  () GS:ff3ea01fb7cc() 
> > > knlGS:
> > > [   55.141456] CS:  0010 DS:  ES:  CR0: 80050033
> > > [   55.141458] CR2: 563543ad1546 CR3: 00010ee82005 CR4: 
> > > 00771ee0
> > > [   55.141464] DR0:  DR1:  DR2: 
> > > 
> > > [   55.141465] DR3:  DR6: fffe0ff0 DR7: 
> > > 0400
> > > [   55.141466] PKRU: 5554
> > > [   55.141467] Call Trace:
> > > [   55.141469]  
> > > [   55.141472]  ? pcie_wait_cmd+0xdf/0x220
> > > [   55.141478]  ? mptcp_seq_show+0xe0/0x180
> > > [   55.141484]  __cancel_work_timer+0x124/0x1b0
> > > [   55.141487]  cancel_delayed_work_sync+0x17/0x20
> > > [   55.141490]  drm_kms_helper_poll_disable+0x26/0x40 [drm_kms_helper]
> > > [   55.141516]  drm_mode_config_helper_suspend+0x25/0x90 [drm_kms_helper]
> > > [   55.141531]  ? __pm_runtime_resume+0x64/0x90
> > > [   55.141536]  bochs_pm_suspend+0x16/0x20 [bochs]
> > > [   55.141540]  pci_pm_suspend+0x8b/0x1b0
> > > [   55.141545]  ? __pfx_pci_pm_suspend+0x10/0x10
> > > [   55.141547]  dpm_run_callback+0x4c/0x160
> > > [   55.141550]  __device_suspend+0x14c/0x4c0
> > > [   55.141553]  async_suspend+0x24/0xa0
> > > [   55.141555]  async_run_entry_fn+0x34/0x120
> > > [   55.141557]  process_one_work+0x21a/0x3f0
> > > [   55.141560]  worker_thread+0x4e/0x3c0
> > > [   55.141563]  ? __pfx_worker_thread+0x10/0x10
> > > [   55.141565]  kthread+0xf2/0x120
> > > [   55.141568]  ? __pfx_kthread+0x10/0x10
> > > [   55.141570]  ret_from_fork+0x29/0x50
> > > [   55.141575]  
> > > [   55.141575] ---[ end trace  ]---
> > >
> > > Fixes: a4e771729a51 ("drm/probe_helper: sort out poll_running vs 
> > > poll_enabled")
> > > Signed-off-by: Zongmin Zhou
> > > ---
> > >  drivers/gpu/drm/drm_probe_helper.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
> > > index 8127be134c39..ac72b18e2257 100644
> > > --- a/drivers/gpu/drm/drm_probe_helper.c
> > > +++ b/drivers/gpu/drm/drm_probe_helper.c
> > > @@ -855,7 +855,8 @@ void

Re: Disabling -Warray-bounds for gcc-13 too

2023-04-23 Thread Kees Cook

On April 23, 2023 10:36:24 AM PDT, Linus Torvalds wrote:
>Kees,
>  I made the mistake of upgrading my M2 Macbook Air to Fedora-38, and
>in the process I got gcc-13 which is not WERROR-clean because we only
>limited the 'array-bounds' warning to gcc-11 and gcc-12. But gcc-13
>has all the same issues.
>
>And I want to be able to do my arm64 builds with WERROR on still...
>
>I guess it never made much sense to hope it was going to go away
>without having a confirmation, so I just changed it to be gcc-11+.

Yeah, that's fine. GCC 13 released without having a fix for at least one 
(hopefully last) known array-bounds vs jump threading bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109071

>And one of them is from you.
>
>In particular, commit 4076ea2419cf ("drm/nouveau/disp: Fix
>nvif_outp_acquire_dp() argument size") cannot possibly be right. It
>changes
>
> nvif_outp_acquire_dp(struct nvif_outp *outp, u8 dpcd[16],
>
>to
>
> nvif_outp_acquire_dp(struct nvif_outp *outp, u8 dpcd[DP_RECEIVER_CAP_SIZE],
>
>and then does
>
>memcpy(args.dp.dpcd, dpcd, sizeof(args.dp.dpcd));
>
>where that 'args.dp.dpcd' is a 16-byte array, and DP_RECEIVER_CAP_SIZE is 15.

Yeah, it was an incomplete fix. I sent the other half here, but it fell through 
the cracks:
https://lore.kernel.org/lkml/20230204184307.never.825-k...@kernel.org/

>

>I think it's all entirely harmless from a code generation standpoint,
>because the 15-byte field will be padded out to 16 bytes in the
>structure that contains it, but it's most definitely buggy.

Right; between this, that GCC 13 wasn't released yet, and I had no feedback 
from NV folks, I didn't chase down landing that fix.

>
>So that warning does find real cases of wrong code. But when those
>real cases are hidden by hundreds of lines of unfixable false
>positives, we don't have much choice.

Yup, totally agreed. The false positives I've looked at all seem to be similar 
to the outstanding jump threading bug, so I'm hoping once that gets fixed we'll 
finally have a good signal with that warning enabled. :)

-Kees

-- 
Kees Cook

Disabling -Warray-bounds for gcc-13 too

2023-04-23 Thread Linus Torvalds

Kees,
  I made the mistake of upgrading my M2 Macbook Air to Fedora-38, and
in the process I got gcc-13 which is not WERROR-clean because we only
limited the 'array-bounds' warning to gcc-11 and gcc-12. But gcc-13
has all the same issues.

And I want to be able to do my arm64 builds with WERROR on still...

I guess it never made much sense to hope it was going to go away
without having a confirmation, so I just changed it to be gcc-11+.

A lot of the warnings seem just crazy, with gcc just not getting the
bounds right, and then being upset about us going backwards with
'container_of()' etc. Ok, so the kernel is special. We do odd things.
I get it, gcc ends up being confused.

But before I disabled it, I did take a look at a couple of warnings
that didn't look like the sea of crazy.

And one of them is from you.

In particular, commit 4076ea2419cf ("drm/nouveau/disp: Fix
nvif_outp_acquire_dp() argument size") cannot possibly be right. It
changes

 nvif_outp_acquire_dp(struct nvif_outp *outp, u8 dpcd[16],

to

 nvif_outp_acquire_dp(struct nvif_outp *outp, u8 dpcd[DP_RECEIVER_CAP_SIZE],

and then does

memcpy(args.dp.dpcd, dpcd, sizeof(args.dp.dpcd));

where that 'args.dp.dpcd' is a 16-byte array, and DP_RECEIVER_CAP_SIZE is 15.

So yeah, it's copying 16 bytes from an argument that claims to be 15
bytes in size.

I think that commit was wrong, and the problem is that the 'dpcd'
array is sometimes 15 and sometimes 16 bytes. For example, we have

  struct nouveau_encoder {
...
union {
struct {
...
u8 dpcd[DP_RECEIVER_CAP_SIZE];
} dp;
};

so there it's indeed 15 bytes, but then we have

union nvif_outp_acquire_args {
struct nvif_outp_acquire_v0 {
...
union {
...
struct {
...
__u8 dpcd[16];
} dp;

where it's 16.

I think it's all entirely harmless from a code generation standpoint,
because the 15-byte field will be padded out to 16 bytes in the
structure that contains it, but it's most definitely buggy.

So that warning does find real cases of wrong code. But when those
real cases are hidden by hundreds of lines of unfixable false
positives, we don't have much choice.

But could the Nouveau driver *please* pick a size for the dpcd[] array
and stick with it?

The other driver where the warnings didn't look entirely crazy was the
ath/carl9170 wireless driver, but I didn't look closer at that one.

 Linus
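
The 15-versus-16 mismatch described in this message can be reproduced in a few lines of standalone C. The sketch below is only one possible way to avoid the over-read, namely passing the source length explicitly and copying the smaller of the two sizes; it is not the fix that was proposed or applied to nouveau. All demo_* names are hypothetical, and the value 15 is taken from the message above.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_RECEIVER_CAP_SIZE 15	/* value quoted in the message above */

/* Source side: the 15-byte array. */
struct demo_encoder {
	unsigned char dpcd[DEMO_RECEIVER_CAP_SIZE];
};

/* Destination side: the 16-byte array. */
struct demo_args {
	unsigned char dpcd[16];
};

/*
 * An array size in a parameter declaration is not enforced by the
 * language, so this sketch takes an explicit length instead and never
 * reads more than the caller says is there.
 */
static void demo_acquire(struct demo_args *args,
			 const unsigned char *dpcd, size_t len)
{
	size_t n = len < sizeof(args->dpcd) ? len : sizeof(args->dpcd);

	memset(args->dpcd, 0, sizeof(args->dpcd));	/* zero the tail */
	memcpy(args->dpcd, dpcd, n);
}
```

A plain memcpy(args->dpcd, dpcd, sizeof(args->dpcd)) would instead read 16 bytes from a 15-byte source, which is the pattern the array-bounds warning flagged here.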

Re: [PATCH v4 1/6] mm/gup: remove unused vmas parameter from get_user_pages()

2023-04-23 Thread Jarkko Sakkinen

On Tue Apr 18, 2023 at 6:49 PM EEST, Lorenzo Stoakes wrote:
> No invocation of get_user_pages() uses the vmas parameter, so remove
> it.
>
> The GUP API is confusing and caveated. Recent changes have done much to
> improve that, however there is more we can do. Exporting vmas is a prime
> target as the caller has to be extremely careful to preclude their use
> after the mmap_lock has expired or otherwise be left with dangling
> pointers.
>
> Removing the vmas parameter focuses the GUP functions upon their primary
> purpose - pinning (and outputting) pages as well as performing the actions
> implied by the input flags.
>
> This is part of a patch series aiming to remove the vmas parameter
> altogether.
>
> Suggested-by: Matthew Wilcox (Oracle) 
> Acked-by: Greg Kroah-Hartman 
> Acked-by: David Hildenbrand 
> Reviewed-by: Jason Gunthorpe 
> Signed-off-by: Lorenzo Stoakes 
> ---
>  arch/x86/kernel/cpu/sgx/ioctl.c | 2 +-
>  drivers/gpu/drm/radeon/radeon_ttm.c | 2 +-
>  drivers/misc/sgi-gru/grufault.c | 2 +-
>  include/linux/mm.h  | 3 +--
>  mm/gup.c| 9 +++--
>  mm/gup_test.c   | 5 ++---
>  virt/kvm/kvm_main.c | 2 +-
>  7 files changed, 10 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c
> index 21ca0a831b70..5d390df21440 100644
> --- a/arch/x86/kernel/cpu/sgx/ioctl.c
> +++ b/arch/x86/kernel/cpu/sgx/ioctl.c
> @@ -214,7 +214,7 @@ static int __sgx_encl_add_page(struct sgx_encl *encl,
>   if (!(vma->vm_flags & VM_MAYEXEC))
>   return -EACCES;
>  
> - ret = get_user_pages(src, 1, 0, &src_page, NULL);
> + ret = get_user_pages(src, 1, 0, &src_page);
>   if (ret < 1)
>   return -EFAULT;
>  
> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
> index 1e8e287e113c..0597540f0dde 100644
> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> @@ -362,7 +362,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_device *bdev, struct ttm_tt *ttm
>   struct page **pages = ttm->pages + pinned;
>  
>   r = get_user_pages(userptr, num_pages, write ? FOLL_WRITE : 0,
> -pages, NULL);
> +pages);
>   if (r < 0)
>   goto release_pages;
>  
> diff --git a/drivers/misc/sgi-gru/grufault.c b/drivers/misc/sgi-gru/grufault.c
> index b836936e9747..378cf02a2aa1 100644
> --- a/drivers/misc/sgi-gru/grufault.c
> +++ b/drivers/misc/sgi-gru/grufault.c
> @@ -185,7 +185,7 @@ static int non_atomic_pte_lookup(struct vm_area_struct 
> *vma,
>  #else
>   *pageshift = PAGE_SHIFT;
>  #endif
> - if (get_user_pages(vaddr, 1, write ? FOLL_WRITE : 0, &page, NULL) <= 0)
> + if (get_user_pages(vaddr, 1, write ? FOLL_WRITE : 0, &page) <= 0)
>   return -EFAULT;
>   *paddr = page_to_phys(page);
>   put_page(page);
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 37554b08bb28..b14cc4972d0b 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2380,8 +2380,7 @@ long pin_user_pages_remote(struct mm_struct *mm,
>  unsigned int gup_flags, struct page **pages,
>  struct vm_area_struct **vmas, int *locked);
>  long get_user_pages(unsigned long start, unsigned long nr_pages,
> - unsigned int gup_flags, struct page **pages,
> - struct vm_area_struct **vmas);
> + unsigned int gup_flags, struct page **pages);
>  long pin_user_pages(unsigned long start, unsigned long nr_pages,
>   unsigned int gup_flags, struct page **pages,
>   struct vm_area_struct **vmas);
> diff --git a/mm/gup.c b/mm/gup.c
> index 1f72a717232b..7e454d6b157e 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2251,8 +2251,6 @@ long get_user_pages_remote(struct mm_struct *mm,
>   * @pages:  array that receives pointers to the pages pinned.
>   *  Should be at least nr_pages long. Or NULL, if caller
>   *  only intends to ensure the pages are faulted in.
> - * @vmas:   array of pointers to vmas corresponding to each page.
> - *  Or NULL if the caller does not require them.
>   *
>   * This is the same as get_user_pages_remote(), just with a less-flexible
>   * calling convention where we assume that the mm being operated on belongs 
> to
> @@ -2260,16 +2258,15 @@ long get_user_pages_remote(struct mm_struct *mm,
>   * obviously don't pass FOLL_REMOTE in here.
>   */
>  long get_user_pages(unsigned long start, unsigned long nr_pages,
> - unsigned int gup_flags, struct page **pages,
> - struct vm_area_struct **vmas)
> + unsigned int gup_flags, struct page **pages)
>  {
>   int locked = 1;
>  
> - if (!is_valid_gup_args(pages, vmas, NULL, &gup_flags, FOLL_TOUCH))

[PATCH] drm/meson: set variables meson_hdmi_* storage-class-specifier to static

2023-04-23 Thread Tom Rix

smatch reports several warnings similar to
drivers/gpu/drm/meson/meson_venc.c:189:28: warning: symbol
  'meson_hdmi_enci_mode_480i' was not declared. Should it be static?

These variables are only used in their defining file, so they should be static.

Signed-off-by: Tom Rix 
---
 drivers/gpu/drm/meson/meson_venc.c | 32 +++---
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/meson/meson_venc.c 
b/drivers/gpu/drm/meson/meson_venc.c
index fcd532db19c1..27ef9f88e4ff 100644
--- a/drivers/gpu/drm/meson/meson_venc.c
+++ b/drivers/gpu/drm/meson/meson_venc.c
@@ -186,7 +186,7 @@ union meson_hdmi_venc_mode {
} encp;
 };
 
-union meson_hdmi_venc_mode meson_hdmi_enci_mode_480i = {
+static union meson_hdmi_venc_mode meson_hdmi_enci_mode_480i = {
.enci = {
.hso_begin = 5,
.hso_end = 129,
@@ -206,7 +206,7 @@ union meson_hdmi_venc_mode meson_hdmi_enci_mode_480i = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_enci_mode_576i = {
+static union meson_hdmi_venc_mode meson_hdmi_enci_mode_576i = {
.enci = {
.hso_begin = 3,
.hso_end = 129,
@@ -226,7 +226,7 @@ union meson_hdmi_venc_mode meson_hdmi_enci_mode_576i = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_encp_mode_480p = {
+static union meson_hdmi_venc_mode meson_hdmi_encp_mode_480p = {
.encp = {
.dvi_settings = 0x21,
.video_mode = 0x4000,
@@ -272,7 +272,7 @@ union meson_hdmi_venc_mode meson_hdmi_encp_mode_480p = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_encp_mode_576p = {
+static union meson_hdmi_venc_mode meson_hdmi_encp_mode_576p = {
.encp = {
.dvi_settings = 0x21,
.video_mode = 0x4000,
@@ -318,7 +318,7 @@ union meson_hdmi_venc_mode meson_hdmi_encp_mode_576p = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_encp_mode_720p60 = {
+static union meson_hdmi_venc_mode meson_hdmi_encp_mode_720p60 = {
.encp = {
.dvi_settings = 0x2029,
.video_mode = 0x4040,
@@ -360,7 +360,7 @@ union meson_hdmi_venc_mode meson_hdmi_encp_mode_720p60 = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_encp_mode_720p50 = {
+static union meson_hdmi_venc_mode meson_hdmi_encp_mode_720p50 = {
.encp = {
.dvi_settings = 0x202d,
.video_mode = 0x4040,
@@ -405,7 +405,7 @@ union meson_hdmi_venc_mode meson_hdmi_encp_mode_720p50 = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080i60 = {
+static union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080i60 = {
.encp = {
.dvi_settings = 0x2029,
.video_mode = 0x5ffc,
@@ -454,7 +454,7 @@ union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080i60 = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080i50 = {
+static union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080i50 = {
.encp = {
.dvi_settings = 0x202d,
.video_mode = 0x5ffc,
@@ -503,7 +503,7 @@ union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080i50 = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080p24 = {
+static union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080p24 = {
.encp = {
.dvi_settings = 0xd,
.video_mode = 0x4040,
@@ -552,7 +552,7 @@ union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080p24 = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080p30 = {
+static union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080p30 = {
.encp = {
.dvi_settings = 0x1,
.video_mode = 0x4040,
@@ -596,7 +596,7 @@ union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080p30 = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080p50 = {
+static union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080p50 = {
.encp = {
.dvi_settings = 0xd,
.video_mode = 0x4040,
@@ -644,7 +644,7 @@ union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080p50 = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080p60 = {
+static union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080p60 = {
.encp = {
.dvi_settings = 0x1,
.video_mode = 0x4040,
@@ -688,7 +688,7 @@ union meson_hdmi_venc_mode meson_hdmi_encp_mode_1080p60 = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_encp_mode_2160p24 = {
+static union meson_hdmi_venc_mode meson_hdmi_encp_mode_2160p24 = {
.encp = {
.dvi_settings = 0x1,
.video_mode = 0x4040,
@@ -730,7 +730,7 @@ union meson_hdmi_venc_mode meson_hdmi_encp_mode_2160p24 = {
},
 };
 
-union meson_hdmi_venc_mode meson_hdmi_encp_mode_2160p25 = {
+static union meson_hdmi_venc_mode meson_hdmi_encp_mode_2160p25 = {
.encp = {
.dvi_settings = 0x1,
.video_mode = 0x4040,
@@

[RFC PATCH 40/40] drm/amd/display: allow newer DC hardware to use degamma ROM for PQ/HLG

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

We need to funnel the color caps through to these functions so they can
check that the hardware is capable.

Signed-off-by: Joshua Ashton 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 34 ---
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index a034c0c0d383..f0b5f09b9146 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -336,6 +336,7 @@ static int amdgpu_dm_set_atomic_regamma(struct 
dc_stream_state *stream,
 /**
  * __set_input_tf - calculates the input transfer function based on expected
  * input space.
+ * @caps: dc color capabilities
  * @func: transfer function
  * @lut: lookup table that defines the color space
  * @lut_size: size of respective lut.
@@ -343,7 +344,7 @@ static int amdgpu_dm_set_atomic_regamma(struct 
dc_stream_state *stream,
  * Returns:
  * 0 in case of success. -ENOMEM if fails.
  */
-static int __set_input_tf(struct dc_transfer_func *func,
+static int __set_input_tf(struct dc_color_caps *caps, struct dc_transfer_func 
*func,
  const struct drm_color_lut *lut, uint32_t lut_size)
 {
struct dc_gamma *gamma = NULL;
@@ -360,7 +361,7 @@ static int __set_input_tf(struct dc_transfer_func *func,
__drm_lut_to_dc_gamma(lut, gamma, false);
}
 
-   res = mod_color_calculate_degamma_params(NULL, func, gamma, gamma != 
NULL);
+   res = mod_color_calculate_degamma_params(caps, func, gamma, gamma != 
NULL);
 
if (gamma)
dc_gamma_release(&gamma);
@@ -512,7 +513,7 @@ static int amdgpu_dm_atomic_blend_lut(const struct 
drm_color_lut *blend_lut,
func_blend->tf = tf;
func_blend->sdr_ref_white_level = 80; /* hardcoded for now */
 
-   ret = __set_input_tf(func_blend, blend_lut, blend_size);
+   ret = __set_input_tf(NULL, func_blend, blend_lut, blend_size);
} else {
func_blend->type = TF_TYPE_BYPASS;
func_blend->tf = TRANSFER_FUNCTION_LINEAR;
@@ -819,7 +820,8 @@ int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state 
*crtc,
 }
 
 static int map_crtc_degamma_to_dc_plane(struct dm_crtc_state *crtc,
-   struct dc_plane_state *dc_plane_state)
+   struct dc_plane_state *dc_plane_state,
+   struct dc_color_caps *caps)
 {
const struct drm_color_lut *degamma_lut;
enum dc_transfer_func_predefined tf = TRANSFER_FUNCTION_SRGB;
@@ -874,7 +876,7 @@ static int map_crtc_degamma_to_dc_plane(struct 
dm_crtc_state *crtc,
dc_plane_state->in_transfer_func->tf =
TRANSFER_FUNCTION_LINEAR;
 
-   r = __set_input_tf(dc_plane_state->in_transfer_func,
+   r = __set_input_tf(caps, dc_plane_state->in_transfer_func,
   degamma_lut, degamma_size);
if (r)
return r;
@@ -887,7 +889,7 @@ static int map_crtc_degamma_to_dc_plane(struct 
dm_crtc_state *crtc,
dc_plane_state->in_transfer_func->tf = tf;
 
if (tf != TRANSFER_FUNCTION_SRGB &&
-   !mod_color_calculate_degamma_params(NULL,
+   !mod_color_calculate_degamma_params(caps,
dc_plane_state->in_transfer_func, NULL, false))
return -ENOMEM;
}
@@ -898,7 +900,8 @@ static int map_crtc_degamma_to_dc_plane(struct 
dm_crtc_state *crtc,
 #ifdef CONFIG_STEAM_DECK
 static int
 __set_dm_plane_degamma(struct drm_plane_state *plane_state,
-  struct dc_plane_state *dc_plane_state)
+  struct dc_plane_state *dc_plane_state,
+  struct dc_color_caps *color_caps)
 {
struct dm_plane_state *dm_plane_state = to_dm_plane_state(plane_state);
const struct drm_color_lut *degamma_lut;
@@ -907,6 +910,9 @@ __set_dm_plane_degamma(struct drm_plane_state *plane_state,
bool has_degamma_lut;
int ret;
 
+   if (dc_plane_state->ctx && dc_plane_state->ctx->dc)
+   color_caps = &dc_plane_state->ctx->dc->caps.color;
+
degamma_lut = __extract_blob_lut(dm_plane_state->degamma_lut, 
&degamma_size);
 
has_degamma_lut = degamma_lut &&
@@ -928,8 +934,8 @@ __set_dm_plane_degamma(struct drm_plane_state *plane_state,
dc_plane_state->in_transfer_func->type =
TF_TYPE_DISTRIBUTED_POINTS;
 
-   ret = __set_input_tf(dc_plane_state->in_transfer_func,
-  degamma_lut, degamma_size);
+   ret = __set_input_tf(color_caps, 
dc_plane_state->in_transfer_func,
+degamma_lut, degamma_size);

[RFC PATCH 38/40] drm/amd/display: add DRM plane blend LUT and TF support

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

Map DRM plane blend properties to DPP blend gamma. Plane blend is a
post-3D LUT curve that linearizes the color space for blending. It may be
defined by a user-blob LUT and/or a predefined transfer function. As
hardcoded curves (ROM) are not supported on blend gamma, we use the AMD
color module to fill parameters when setting a non-linear TF with an
empty LUT.

Signed-off-by: Joshua Ashton 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 60 +--
 1 file changed, 56 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index baa7fea9ebae..a034c0c0d383 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -493,6 +493,34 @@ static int amdgpu_dm_atomic_shaper_lut(const struct 
drm_color_lut *shaper_lut,
return ret;
 }
 
+static int amdgpu_dm_atomic_blend_lut(const struct drm_color_lut *blend_lut,
+  bool has_rom,
+  enum dc_transfer_func_predefined tf,
+  uint32_t blend_size,
+  struct dc_transfer_func *func_blend)
+{
+   int ret = 0;
+
+   if (blend_size || tf != TRANSFER_FUNCTION_LINEAR) {
+   /* DRM plane gamma LUT or TF means we are linearizing color
+* space before blending (similar to degamma programming). As
+* we don't have hardcoded curve support, or we use AMD color
+* module to fill the parameters that will be translated to HW
+* points.
+*/
+   func_blend->type = TF_TYPE_DISTRIBUTED_POINTS;
+   func_blend->tf = tf;
+   func_blend->sdr_ref_white_level = 80; /* hardcoded for now */
+
+   ret = __set_input_tf(func_blend, blend_lut, blend_size);
+   } else {
+   func_blend->type = TF_TYPE_BYPASS;
+   func_blend->tf = TRANSFER_FUNCTION_LINEAR;
+   }
+
+   return ret;
+}
+
 /* amdgpu_dm_atomic_shaper_lut3d - set DRM CRTC shaper LUT and 3D LUT to DC
  * interface
  * @dc: Display Core control structure
@@ -921,9 +949,11 @@ amdgpu_dm_plane_set_color_properties(struct 
drm_plane_state *plane_state,
 {
struct dm_plane_state *dm_plane_state = to_dm_plane_state(plane_state);
enum drm_transfer_function shaper_tf = DRM_TRANSFER_FUNCTION_DEFAULT;
-   const struct drm_color_lut *shaper_lut, *lut3d;
-   uint32_t lut3d_size, shaper_size;
-
+   enum drm_transfer_function blend_tf = DRM_TRANSFER_FUNCTION_DEFAULT;
+   const struct drm_color_lut *shaper_lut, *lut3d, *blend_lut;
+   uint32_t lut3d_size, shaper_size, blend_size;
+   int ret;
+   
/* We have nothing to do here, return */
if (!plane_state->color_mgmt_changed)
return 0;
@@ -940,8 +970,30 @@ amdgpu_dm_plane_set_color_properties(struct 
drm_plane_state *plane_state,
ret = amdgpu_dm_atomic_shaper_lut(shaper_lut, false,
  drm_tf_to_dc_tf(shaper_tf),
  shaper_size, 
dc_plane_state->in_shaper_func);
+   if (ret) {
+   drm_dbg_kms(plane_state->plane->dev,
+   "setting plane %d shaper/3d lut failed.\n",
+   plane_state->plane->index);
 
-   return ret;
+   return ret;
+   }
+
+   blend_tf = dm_plane_state->blend_tf;
+   blend_lut = __extract_blob_lut(dm_plane_state->blend_lut, &blend_size);
+   blend_size = blend_lut != NULL ? blend_size : 0;
+
+   ret = amdgpu_dm_atomic_blend_lut(blend_lut, false,
+drm_tf_to_dc_tf(blend_tf),
+blend_size, dc_plane_state->blend_tf);
+   if (ret) {
+   drm_dbg_kms(plane_state->plane->dev,
+   "setting plane %d gamma lut failed.\n",
+   plane_state->plane->index);
+
+   return ret;
+   }
+
+   return 0;
 }
 #endif
 
-- 
2.39.2

[RFC PATCH 36/40] drm/amd/display: add plane shaper/3D LUT and shaper TF support

2023-04-23 Thread Melissa Wen

The AMD display driver already has the steps to program the post-blending
shaper/3D LUT, so we can reuse them and map plane properties to the DC
plane for pre-blending (plane) shaper/3D LUT setup.

Signed-off-by: Melissa Wen 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 34 +--
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c|  5 +--
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index 854510b05194..e17141fc8d12 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -909,6 +909,35 @@ __set_dm_plane_degamma(struct drm_plane_state *plane_state,
}
return 0;
 }
+
+static int
+amdgpu_dm_plane_set_color_properties(struct drm_plane_state *plane_state,
+struct dc_plane_state *dc_plane_state)
+{
+   struct dm_plane_state *dm_plane_state = to_dm_plane_state(plane_state);
+   enum drm_transfer_function shaper_tf = DRM_TRANSFER_FUNCTION_DEFAULT;
+   const struct drm_color_lut *shaper_lut, *lut3d;
+   uint32_t lut3d_size, shaper_size;
+
+   /* We have nothing to do here, return */
+   if (!plane_state->color_mgmt_changed)
+   return 0;
+
+   dc_plane_state->hdr_mult = 
dc_fixpt_from_s3132(dm_plane_state->hdr_mult);
+
+   shaper_tf = dm_plane_state->shaper_tf;
+   shaper_lut = __extract_blob_lut(dm_plane_state->shaper_lut, 
&shaper_size);
+   lut3d = __extract_blob_lut(dm_plane_state->lut3d, &lut3d_size);
+   lut3d_size = lut3d != NULL ? lut3d_size : 0;
+   shaper_size = shaper_lut != NULL ? shaper_size : 0;
+
+   amdgpu_dm_atomic_lut3d(lut3d, lut3d_size, dc_plane_state->lut3d_func);
+   ret = amdgpu_dm_atomic_shaper_lut(shaper_lut, false,
+ drm_tf_to_dc_tf(shaper_tf),
+ shaper_size, 
dc_plane_state->in_shaper_func);
+
+   return ret;
+}
 #endif
 
 /**
@@ -939,7 +968,9 @@ int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state 
*crtc,
has_crtc_cm_degamma = (crtc->cm_has_degamma || 
crtc->cm_is_degamma_srgb);
 
 #ifdef CONFIG_STEAM_DECK
-   dc_plane_state->hdr_mult = 
dc_fixpt_from_s3132(dm_plane_state->hdr_mult);
+   ret = amdgpu_dm_plane_set_color_properties(plane_state, dc_plane_state);
+   if(ret)
+   return ret;
 
ret = __set_dm_plane_degamma(plane_state, dc_plane_state);
if (ret != -EINVAL)
@@ -971,6 +1002,5 @@ int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state 
*crtc,
return ret;
}
 
-
return 0;
 }
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index 5800acf6aae1..91fee60410f4 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -1759,8 +1759,9 @@ static void dcn20_program_pipe(
hws->funcs.set_hdr_multiplier(pipe_ctx);
 
if (pipe_ctx->update_flags.bits.enable ||
-   
pipe_ctx->plane_state->update_flags.bits.in_transfer_func_change ||
-   pipe_ctx->plane_state->update_flags.bits.gamma_change)
+   pipe_ctx->plane_state->update_flags.bits.in_transfer_func_change ||
+   pipe_ctx->plane_state->update_flags.bits.gamma_change ||
+   pipe_ctx->plane_state->update_flags.bits.lut_3d)
hws->funcs.set_input_transfer_func(dc, pipe_ctx, 
pipe_ctx->plane_state);
 
/* dcn10_translate_regamma_to_hw_format takes 750us to finish
-- 
2.39.2

[RFC PATCH 39/40] drm/amd/display: copy dc_plane color settings to surface_updates

2023-04-23 Thread Melissa Wen

As per previous code, copy shaper, 3d and blend settings from dc_plane
to surface_updates before commit.

Signed-off-by: Melissa Wen 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 0e3b6d414ec4..cdaaec1b2a3a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7987,11 +7987,13 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
continue;
 
bundle->surface_updates[planes_count].surface = dc_plane;
-   if (new_pcrtc_state->color_mgmt_changed) {
+   if (new_pcrtc_state->color_mgmt_changed || 
new_plane_state->color_mgmt_changed) {
bundle->surface_updates[planes_count].gamma = 
dc_plane->gamma_correction;
bundle->surface_updates[planes_count].in_transfer_func 
= dc_plane->in_transfer_func;

bundle->surface_updates[planes_count].gamut_remap_matrix = 
&dc_plane->gamut_remap_matrix;
bundle->surface_updates[planes_count].hdr_mult = 
dc_plane->hdr_mult;
+   bundle->surface_updates[planes_count].func_shaper = 
dc_plane->in_shaper_func;
+   bundle->surface_updates[planes_count].lut3d_func = 
dc_plane->lut3d_func;
}
 
amdgpu_dm_plane_fill_dc_scaling_info(dm->adev, new_plane_state,
-- 
2.39.2

[RFC PATCH 35/40] drm/amd/display: add HDR multiplier support

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

With the `dc_fixpt_from_s3132()` translation helper in place, we can use
it directly to set hdr_mult.

Signed-off-by: Joshua Ashton 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c   | 1 +
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index c0321272c129..0e3b6d414ec4 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7991,6 +7991,7 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
bundle->surface_updates[planes_count].gamma = 
dc_plane->gamma_correction;
bundle->surface_updates[planes_count].in_transfer_func 
= dc_plane->in_transfer_func;

bundle->surface_updates[planes_count].gamut_remap_matrix = 
&dc_plane->gamut_remap_matrix;
+   bundle->surface_updates[planes_count].hdr_mult = 
dc_plane->hdr_mult;
}
 
amdgpu_dm_plane_fill_dc_scaling_info(dm->adev, new_plane_state,
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index d714728ca143..854510b05194 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -939,6 +939,8 @@ int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state 
*crtc,
has_crtc_cm_degamma = (crtc->cm_has_degamma || 
crtc->cm_is_degamma_srgb);
 
 #ifdef CONFIG_STEAM_DECK
+   dc_plane_state->hdr_mult = 
dc_fixpt_from_s3132(dm_plane_state->hdr_mult);
+
ret = __set_dm_plane_degamma(plane_state, dc_plane_state);
if (ret != -EINVAL)
return ret;
@@ -969,5 +971,6 @@ int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state 
*crtc,
return ret;
}
 
+
return 0;
 }
-- 
2.39.2

[RFC PATCH 37/40] drm/amd/display: handle empty LUTs in __set_input_tf

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

Unlike degamma, blend gamma doesn't support hardcoded curve
(predefined/ROM), but we can use AMD color module to fill blend gamma
parameters when we have non-linear plane gamma TF without plane gamma
LUT. The regular degamma path doesn't hit this.

Signed-off-by: Joshua Ashton 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 21 ---
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index e17141fc8d12..baa7fea9ebae 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -349,21 +349,26 @@ static int __set_input_tf(struct dc_transfer_func *func,
struct dc_gamma *gamma = NULL;
bool res;
 
-   gamma = dc_create_gamma();
-   if (!gamma)
-   return -ENOMEM;
+   if (lut_size) {
+   gamma = dc_create_gamma();
+   if (!gamma)
+   return -ENOMEM;
 
-   gamma->type = GAMMA_CUSTOM;
-   gamma->num_entries = lut_size;
+   gamma->type = GAMMA_CUSTOM;
+   gamma->num_entries = lut_size;
 
-   __drm_lut_to_dc_gamma(lut, gamma, false);
+   __drm_lut_to_dc_gamma(lut, gamma, false);
+   }
 
-   res = mod_color_calculate_degamma_params(NULL, func, gamma, true);
-   dc_gamma_release(&gamma);
+   res = mod_color_calculate_degamma_params(NULL, func, gamma, gamma != 
NULL);
+
+   if (gamma)
+   dc_gamma_release(&gamma);
 
return res ? 0 : -ENOMEM;
 }
 
+
 #ifdef CONFIG_STEAM_DECK
 static enum dc_transfer_func_predefined drm_tf_to_dc_tf(enum 
drm_transfer_function drm_tf)
 {
-- 
2.39.2

[RFC PATCH 34/40] drm/amd/display: add dc_fixpt_from_s3132 helper

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

Detach value translation from CTM to reuse it for programming HDR
multiplier property.

Signed-off-by: Joshua Ashton 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c  |  8 +---
 drivers/gpu/drm/amd/display/include/fixed31_32.h | 12 
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index 07303c9f3618..d714728ca143 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -182,7 +182,6 @@ static void __drm_lut_to_dc_gamma(const struct 
drm_color_lut *lut,
 static void __drm_ctm_to_dc_matrix(const struct drm_color_ctm *ctm,
   struct fixed31_32 *matrix)
 {
-   int64_t val;
int i;
 
/*
@@ -201,12 +200,7 @@ static void __drm_ctm_to_dc_matrix(const struct 
drm_color_ctm *ctm,
}
 
/* gamut_remap_matrix[i] = ctm[i - floor(i/4)] */
-   val = ctm->matrix[i - (i / 4)];
-   /* If negative, convert to 2's complement. */
-   if (val & (1ULL << 63))
-   val = -(val & ~(1ULL << 63));
-
-   matrix[i].value = val;
+   matrix[i] = dc_fixpt_from_s3132(ctm->matrix[i - (i / 4)]);
}
 }
 
diff --git a/drivers/gpu/drm/amd/display/include/fixed31_32.h 
b/drivers/gpu/drm/amd/display/include/fixed31_32.h
index ece97ae0e826..f4cc7f97329f 100644
--- a/drivers/gpu/drm/amd/display/include/fixed31_32.h
+++ b/drivers/gpu/drm/amd/display/include/fixed31_32.h
@@ -69,6 +69,18 @@ static const struct fixed31_32 dc_fixpt_epsilon = { 1LL };
 static const struct fixed31_32 dc_fixpt_half = { 0x80000000LL };
 static const struct fixed31_32 dc_fixpt_one = { 0x100000000LL };
 
+static inline struct fixed31_32 dc_fixpt_from_s3132(__u64 x)
+{
+   struct fixed31_32 val;
+
+   /* If negative, convert to 2's complement. */
+   if (x & (1ULL << 63))
+   x = -(x & ~(1ULL << 63));
+
+   val.value = x;
+   return val;
+}
+
 /*
  * @brief
  * Initialization routines
-- 
2.39.2
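
Editor's sketch: DRM CTM entries use a sign-magnitude S31.32 encoding (bit 63 is the sign, the low 63 bits are the magnitude), while the DC fixed-point type stores a two's-complement value. The conversion in the helper above can be checked with a few hand-computed values. This is a minimal user-space sketch; `struct fixed31_32_sketch` and the `sketch_*` functions are hypothetical stand-ins for the kernel types, and the final cast relies on the usual two's-complement conversion behaviour of common compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's struct fixed31_32. */
struct fixed31_32_sketch {
	int64_t value;
};

/* Convert a sign-magnitude S31.32 value into two's-complement 31.32. */
static struct fixed31_32_sketch sketch_from_s3132(uint64_t x)
{
	struct fixed31_32_sketch val;

	/* Bit 63 is the sign; clear it and negate the magnitude.
	 * Unsigned negation wraps, which yields the two's-complement bits. */
	if (x & (1ULL << 63))
		x = -(x & ~(1ULL << 63));

	val.value = (int64_t)x;
	return val;
}

/* Self-check with hand-computed encodings. */
static void sketch_self_check(void)
{
	/* +1.0: magnitude 1 << 32, sign clear. */
	assert(sketch_from_s3132(1ULL << 32).value == ((int64_t)1 << 32));
	/* -0.5: magnitude 1 << 31, sign set. */
	assert(sketch_from_s3132((1ULL << 63) | (1ULL << 31)).value ==
	       -((int64_t)1 << 31));
	/* Negative zero collapses to plain zero. */
	assert(sketch_from_s3132(1ULL << 63).value == 0);
}
```

Note that sign-magnitude has two encodings of zero, and both map to the single two's-complement zero.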

[RFC PATCH 32/40] drm/amd/display: add support for plane degamma TF and LUT properties

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

We only set CRTC degamma if we don't have plane degamma LUT or TF to
configure. We return -EINVAL if we don't have plane degamma settings, so
we can continue and check CRTC degamma.

Signed-off-by: Joshua Ashton 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  4 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  1 +
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 71 +--
 3 files changed, 70 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1dac311cab67..c0321272c129 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -5043,7 +5043,9 @@ static int fill_dc_plane_attributes(struct amdgpu_device 
*adev,
 * Always set input transfer function, since plane state is refreshed
 * every time.
 */
-   ret = amdgpu_dm_update_plane_color_mgmt(dm_crtc_state, dc_plane_state);
+   ret = amdgpu_dm_update_plane_color_mgmt(dm_crtc_state,
+   plane_state,
+   dc_plane_state);
if (ret)
return ret;
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index b9840c1f3cdf..bcf74b7391c9 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -912,6 +912,7 @@ int amdgpu_dm_verify_lut_sizes(const struct drm_crtc_state 
*crtc_state);
 int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state *crtc,
 struct dc_state *ctx);
 int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state *crtc,
+ struct drm_plane_state *plane_state,
  struct dc_plane_state *dc_plane_state);
 
 void amdgpu_dm_update_connector_after_detect(
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index 51808fff..693168cc1d9c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -868,9 +868,59 @@ static int map_crtc_degamma_to_dc_plane(struct 
dm_crtc_state *crtc,
return 0;
 }
 
+#ifdef CONFIG_STEAM_DECK
+static int
+__set_dm_plane_degamma(struct drm_plane_state *plane_state,
+  struct dc_plane_state *dc_plane_state)
+{
+   struct dm_plane_state *dm_plane_state = to_dm_plane_state(plane_state);
+   const struct drm_color_lut *degamma_lut;
+   enum drm_transfer_function drm_tf = DRM_TRANSFER_FUNCTION_DEFAULT;
+   uint32_t degamma_size;
+   bool has_degamma_lut;
+   int ret;
+
+   degamma_lut = __extract_blob_lut(dm_plane_state->degamma_lut, 
&degamma_size);
+
+   has_degamma_lut = degamma_lut &&
+ !__is_lut_linear(degamma_lut, degamma_size);
+
+   drm_tf = dm_plane_state->degamma_tf;
+
+   /* If we don't have plane degamma LUT nor TF to set on DC, we have
+* nothing to do here, return.
+*/
+   if (!has_degamma_lut && drm_tf == DRM_TRANSFER_FUNCTION_DEFAULT)
+   return -EINVAL;
+
+   dc_plane_state->in_transfer_func->tf = drm_tf_to_dc_tf(drm_tf);
+
+   if (has_degamma_lut) {
+   ASSERT(degamma_size == MAX_COLOR_LUT_ENTRIES);
+
+   dc_plane_state->in_transfer_func->type =
+   TF_TYPE_DISTRIBUTED_POINTS;
+
+   ret = __set_input_tf(dc_plane_state->in_transfer_func,
+  degamma_lut, degamma_size);
+   if (ret)
+   return ret;
+   } else {
+   dc_plane_state->in_transfer_func->type =
+   TF_TYPE_PREDEFINED;
+
+   if (!mod_color_calculate_degamma_params(NULL,
+   dc_plane_state->in_transfer_func, NULL, false))
+   return -ENOMEM;
+   }
+   return 0;
+}
+#endif
+
 /**
  * amdgpu_dm_update_plane_color_mgmt: Maps DRM color management to DC plane.
  * @crtc: amdgpu_dm crtc state
+ * @plane_state: DRM plane state
  * @dc_plane_state: target DC surface
  *
  * Update the underlying dc_stream_state's input transfer function (ITF) in
@@ -881,13 +931,28 @@ static int map_crtc_degamma_to_dc_plane(struct 
dm_crtc_state *crtc,
  * 0 on success. -ENOMEM if mem allocation fails.
  */
 int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state *crtc,
+ struct drm_plane_state *plane_state,
  struct dc_plane_state *dc_plane_state)
 {
bool has_crtc_cm_degamma;
int ret;
 
+   /* Initially, we can just bypass the DGM block. */
+   dc_plane_state->in_transfer_func->type = TF_TYPE_BYPASS;
+
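
Editor's sketch: the precedence described in the commit message (plane degamma wins; -EINVAL from the plane path means "nothing configured, go on and check CRTC degamma") can be modelled as below. This is a minimal user-space sketch under stated assumptions: the `sketch_*` names and the boolean inputs are hypothetical, and the CRTC branch only records which source would be programmed rather than reproducing `map_crtc_degamma_to_dc_plane()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical record of which degamma source ends up programmed. */
enum sketch_degamma_source {
	SKETCH_DEGAMMA_BYPASS,
	SKETCH_DEGAMMA_PLANE,
	SKETCH_DEGAMMA_CRTC,
};

/* Returns -EINVAL when the plane has neither a degamma LUT nor a TF,
 * signalling the caller to fall back to the CRTC settings. */
static int sketch_set_plane_degamma(bool has_plane_lut, bool has_plane_tf,
				    enum sketch_degamma_source *out)
{
	if (!has_plane_lut && !has_plane_tf)
		return -EINVAL;

	*out = SKETCH_DEGAMMA_PLANE;
	return 0;
}

static int sketch_update_plane_color(bool has_plane_lut, bool has_plane_tf,
				     bool has_crtc_degamma,
				     enum sketch_degamma_source *out)
{
	int ret;

	/* Start from bypass, as the patched function does. */
	*out = SKETCH_DEGAMMA_BYPASS;

	ret = sketch_set_plane_degamma(has_plane_lut, has_plane_tf, out);
	if (ret != -EINVAL)
		return ret; /* plane degamma was handled (or failed hard) */

	if (has_crtc_degamma)
		*out = SKETCH_DEGAMMA_CRTC;

	return 0;
}
```

The -EINVAL return is used here purely as an internal "not configured" signal and never reaches the caller of the outer function.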

[RFC PATCH 31/40] drm/amd/display: decouple steps for mapping CRTC degamma to DC plane

2023-04-23 Thread Melissa Wen

The next patch adds pre-blending degamma to the AMD color mgmt pipeline,
but the pre-blending degamma caps (DPP) are currently in use to provide
DRM CRTC atomic degamma or implicit degamma on legacy gamma. Detach the
degamma usage regarding CRTC color properties so that plane and CRTC
color correction combinations can be managed.

Signed-off-by: Melissa Wen 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 55 +--
 1 file changed, 38 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index 6a233380f284..51808fff 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -791,20 +791,8 @@ int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state 
*crtc,
return 0;
 }
 
-/**
- * amdgpu_dm_update_plane_color_mgmt: Maps DRM color management to DC plane.
- * @crtc: amdgpu_dm crtc state
- * @dc_plane_state: target DC surface
- *
- * Update the underlying dc_stream_state's input transfer function (ITF) in
- * preparation for hardware commit. The transfer function used depends on
- * the preparation done on the stream for color management.
- *
- * Returns:
- * 0 on success. -ENOMEM if mem allocation fails.
- */
-int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state *crtc,
- struct dc_plane_state *dc_plane_state)
+static int map_crtc_degamma_to_dc_plane(struct dm_crtc_state *crtc,
+   struct dc_plane_state *dc_plane_state)
 {
const struct drm_color_lut *degamma_lut;
enum dc_transfer_func_predefined tf = TRANSFER_FUNCTION_SRGB;
@@ -827,8 +815,7 @@ int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state 
*crtc,
 &degamma_size);
ASSERT(degamma_size == MAX_COLOR_LUT_ENTRIES);
 
-   dc_plane_state->in_transfer_func->type =
-   TF_TYPE_DISTRIBUTED_POINTS;
+   dc_plane_state->in_transfer_func->type = 
TF_TYPE_DISTRIBUTED_POINTS;
 
/*
 * This case isn't fully correct, but also fairly
@@ -864,7 +851,7 @@ int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state 
*crtc,
   degamma_lut, degamma_size);
if (r)
return r;
-   } else if (crtc->cm_is_degamma_srgb) {
+   } else {
/*
 * For legacy gamma support we need the regamma input
 * in linear space. Assume that the input is sRGB.
@@ -876,6 +863,40 @@ int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state 
*crtc,
!mod_color_calculate_degamma_params(NULL,
dc_plane_state->in_transfer_func, NULL, false))
return -ENOMEM;
+   }
+
+   return 0;
+}
+
+/**
+ * amdgpu_dm_update_plane_color_mgmt: Maps DRM color management to DC plane.
+ * @crtc: amdgpu_dm crtc state
+ * @dc_plane_state: target DC surface
+ *
+ * Update the underlying dc_stream_state's input transfer function (ITF) in
+ * preparation for hardware commit. The transfer function used depends on
+ * the preparation done on the stream for color management.
+ *
+ * Returns:
+ * 0 on success. -ENOMEM if mem allocation fails.
+ */
+int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state *crtc,
+ struct dc_plane_state *dc_plane_state)
+{
+   bool has_crtc_cm_degamma;
+   int ret;
+
+   has_crtc_cm_degamma = (crtc->cm_has_degamma || crtc->cm_is_degamma_srgb);
+   if (has_crtc_cm_degamma){
+   /* AMD HW doesn't have post-blending degamma caps. When DRM
+* CRTC atomic degamma is set, we map it to DPP degamma block
+* (pre-blending) or, on legacy gamma, we use DPP degamma to
+* linearize (implicit degamma) from sRGB/BT709 according to
+* the input space.
+*/
+   ret = map_crtc_degamma_to_dc_plane(crtc, dc_plane_state);
+   if (ret)
+   return ret;
} else {
/* ...Otherwise we can just bypass the DGM block. */
dc_plane_state->in_transfer_func->type = TF_TYPE_BYPASS;
-- 
2.39.2

[RFC PATCH 33/40] drm/amd/display: reject atomic commit if setting both plane and CRTC degamma

2023-04-23 Thread Melissa Wen

DC only has a pre-blending degamma block (DPP), which is currently used
for CRTC/post-blending degamma, so the HW cannot perform plane and CRTC
degamma at the same time. Reject atomic updates when userspace sets both
plane and CRTC degamma properties.

Signed-off-by: Melissa Wen 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c   | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index 693168cc1d9c..07303c9f3618 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -949,6 +949,17 @@ int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state *crtc,
if (ret != -EINVAL)
return ret;
 
+   /* We only have one degamma block available (pre-blending) for the
+* whole color correction pipeline, so that we can't actually perform
+* plane and CRTC degamma at the same time. Reject atomic updates when
+* userspace sets both plane and CRTC degamma properties.
+*/
+   if (has_crtc_cm_degamma && ret == -EINVAL){
+   drm_dbg_kms(crtc->base.crtc->dev,
+   "doesn't support plane and CRTC degamma at the same time\n");
+   return -EINVAL;
+   }
+
/* As we don't have plane degamma, check if we have CRTC degamma
 * waiting for mapping to pre-blending degamma block */
 #endif
-- 
2.39.2

[RFC PATCH 30/40] drm/amd/display: mark plane as needing reset if plane color mgmt changes

2023-04-23 Thread Melissa Wen

We took a similar path for CRTC color mgmt changes, since we remap CRTC
degamma to plane/DPP block. Here we can use the status of
`plane->color_mgmt_changed` to detect when a plane color property
changed and recreate the plane accordingly.

Co-developed-by: Joshua Ashton 
Signed-off-by: Joshua Ashton 
Signed-off-by: Melissa Wen 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 760080e4a4da..1dac311cab67 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -9441,6 +9441,9 @@ static bool should_reset_plane(struct drm_atomic_state *state,
if (drm_atomic_crtc_needs_modeset(new_crtc_state))
return true;
 
+   if (new_plane_state->color_mgmt_changed)
+   return true;
+
/*
 * If there are any new primary or overlay planes being added or
 * removed then the z-order can potentially change. To ensure
-- 
2.39.2

[RFC PATCH 27/40] drm/amd/display: add CRTC regamma TF support

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

Add predefined transfer function programming. There is no out gamma ROM,
but we can use AMD color modules to program LUT parameters from a
predefined TF and an empty regamma LUT (or power LUT parameters with
predefined TF setup).

Signed-off-by: Joshua Ashton 
---
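Not part of the patch: a minimal sketch of the decision this change
introduces in amdgpu_dm_set_atomic_regamma(), namely that the regamma block
is programmed when either a LUT is supplied or a non-linear predefined TF
is requested. The enum and struct names below are hypothetical stand-ins,
and the bypass fallback is an assumption about the branch not shown in the
hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of predefined transfer functions. */
enum sketch_tf { SKETCH_TF_LINEAR, SKETCH_TF_SRGB, SKETCH_TF_PQ };
enum sketch_tf_type { SKETCH_TYPE_BYPASS, SKETCH_TYPE_DISTRIBUTED_POINTS };

struct sketch_out_tf {
	enum sketch_tf_type type;
	enum sketch_tf tf;
	bool uses_lut; /* true when a user LUT is combined with the TF */
};

static struct sketch_out_tf select_regamma(uint32_t regamma_size,
					   enum sketch_tf tf)
{
	/* Assumed default: nothing requested, so the block is bypassed. */
	struct sketch_out_tf out = { SKETCH_TYPE_BYPASS, SKETCH_TF_LINEAR, false };

	if (regamma_size || tf != SKETCH_TF_LINEAR) {
		out.type = SKETCH_TYPE_DISTRIBUTED_POINTS;
		out.tf = tf;
		out.uses_lut = regamma_size != 0;
	}

	return out;
}
```

A PQ request without a LUT therefore still programs distributed points,
which is the "predefined TF and an empty regamma LUT" case from the commit
message.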
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 60 ++-
 1 file changed, 44 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index ff29be3929af..55aa876a5008 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -268,16 +268,18 @@ static int __set_output_tf(struct dc_transfer_func *func,
struct calculate_buffer cal_buffer = {0};
bool res;
 
-   ASSERT(lut && lut_size == MAX_COLOR_LUT_ENTRIES);
-
cal_buffer.buffer_index = -1;
 
-   gamma = dc_create_gamma();
-   if (!gamma)
-   return -ENOMEM;
+   if (lut_size) {
+   ASSERT(lut && lut_size == MAX_COLOR_LUT_ENTRIES);
 
-   gamma->num_entries = lut_size;
-   __drm_lut_to_dc_gamma(lut, gamma, false);
+   gamma = dc_create_gamma();
+   if (!gamma)
+   return -ENOMEM;
+
+   gamma->num_entries = lut_size;
+   __drm_lut_to_dc_gamma(lut, gamma, false);
+   }
 
if (func->tf == TRANSFER_FUNCTION_LINEAR) {
/*
@@ -285,30 +287,34 @@ static int __set_output_tf(struct dc_transfer_func *func,
 * on top of a linear input. But degamma params can be used
 * instead to simulate this.
 */
-   gamma->type = GAMMA_CUSTOM;
+   if (gamma)
+   gamma->type = GAMMA_CUSTOM;
res = mod_color_calculate_degamma_params(NULL, func,
-   gamma, true);
+gamma, gamma != NULL);
} else {
/*
 * Assume sRGB. The actual mapping will depend on whether the
 * input was legacy or not.
 */
-   gamma->type = GAMMA_CS_TFM_1D;
-   res = mod_color_calculate_regamma_params(func, gamma, false,
+   if (gamma)
+   gamma->type = GAMMA_CS_TFM_1D;
+   res = mod_color_calculate_regamma_params(func, gamma, gamma != NULL,
 has_rom, NULL, &cal_buffer);
}
 
-   dc_gamma_release(&gamma);
+   if (gamma)
+   dc_gamma_release(&gamma);
 
return res ? 0 : -ENOMEM;
 }
 
 static int amdgpu_dm_set_atomic_regamma(struct dc_stream_state *stream,
const struct drm_color_lut *regamma_lut,
-   uint32_t regamma_size, bool has_rom)
+   uint32_t regamma_size, bool has_rom,
+   enum dc_transfer_func_predefined tf)
 {
int ret = 0;
-   if (regamma_size) {
+   if (regamma_size || tf != TRANSFER_FUNCTION_LINEAR) {
/* CRTC RGM goes into RGM LUT.
 *
 * Note: there is no implicit sRGB regamma here. We are using
@@ -316,7 +322,7 @@ static int amdgpu_dm_set_atomic_regamma(struct dc_stream_state *stream,
 * from a linear base.
 */
stream->out_transfer_func->type = TF_TYPE_DISTRIBUTED_POINTS;
-   stream->out_transfer_func->tf = TRANSFER_FUNCTION_LINEAR;
+   stream->out_transfer_func->tf = tf;
 
ret = __set_output_tf(stream->out_transfer_func,
  regamma_lut, regamma_size, has_rom);
@@ -364,6 +370,25 @@ static int __set_input_tf(struct dc_transfer_func *func,
 }
 
 #ifdef CONFIG_STEAM_DECK
+static enum dc_transfer_func_predefined drm_tf_to_dc_tf(enum drm_transfer_function drm_tf)
+{
+   switch (drm_tf)
+   {
+   default:
+   case DRM_TRANSFER_FUNCTION_DEFAULT: return TRANSFER_FUNCTION_LINEAR;
+   case DRM_TRANSFER_FUNCTION_SRGB:return TRANSFER_FUNCTION_SRGB;
+
+   case DRM_TRANSFER_FUNCTION_BT709:   return TRANSFER_FUNCTION_BT709;
+   case DRM_TRANSFER_FUNCTION_PQ:  return TRANSFER_FUNCTION_PQ;
+   case DRM_TRANSFER_FUNCTION_LINEAR:  return TRANSFER_FUNCTION_LINEAR;
+   case DRM_TRANSFER_FUNCTION_UNITY:   return TRANSFER_FUNCTION_UNITY;
+   case DRM_TRANSFER_FUNCTION_HLG: return TRANSFER_FUNCTION_HLG;
+   case DRM_TRANSFER_FUNCTION_GAMMA22: return TRANSFER_FUNCTION_GAMMA22;
+   case DRM_TRANSFER_FUNCTION_GAMMA24: return TRANSFER_FUNCTION_GAMMA24;
+   case DRM_TRANSFER_FUNCTION_GAMMA26: return TRANSFER_FUNCTION_GAMMA26;
+   }
+}
+

[RFC PATCH 28/40] drm/amd/display: set sdr_ref_white_level to 80 for out_transfer_func

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

Otherwise this is just initialized to 0.

This needs to actually have a value so that compute_curve can work
for PQ EOTF.

Signed-off-by: Joshua Ashton 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index 55aa876a5008..6e7271065a56 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -323,6 +323,7 @@ static int amdgpu_dm_set_atomic_regamma(struct dc_stream_state *stream,
 */
stream->out_transfer_func->type = TF_TYPE_DISTRIBUTED_POINTS;
stream->out_transfer_func->tf = tf;
+   stream->out_transfer_func->sdr_ref_white_level = 80;
 
ret = __set_output_tf(stream->out_transfer_func,
  regamma_lut, regamma_size, has_rom);
-- 
2.39.2

[RFC PATCH 26/40] drm/amd/display: add CRTC shaper LUT support to amd color pipeline

2023-04-23 Thread Melissa Wen

Now, we can use DRM CRTC shaper LUT to delinearize and/or normalize the
color space for a more efficient 3D LUT support (so far, only for DRM
atomic color mgmt). If a degamma 1D LUT is passed to linearize the color
space, a custom shaper 1D LUT can be used before applying 3D LUT.

NOTE: although DRM CRTC shaper and 3D LUTs are optional properties, from
our tests, AMD HW doesn't allow 3D LUT when shaper LUT is set to BYPASS
(without user shaper LUT).

Signed-off-by: Melissa Wen 
---
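Not part of the patch: a minimal sketch of the size validation that
amdgpu_dm_verify_lut3d_size() performs after this change, assuming the
hypothetical helper names below. The two entry counts are the
MAX_COLOR_LUT_ENTRIES and MAX_COLOR_3DLUT_ENTRIES values defined in
amdgpu_dm.h in this series; num_3dluts stands in for
caps.color.mpc.num_3dluts.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_LUT_ENTRIES   4096 /* 1D shaper LUT */
#define SKETCH_3DLUT_ENTRIES 4913 /* 17x17x17 3D LUT */

/* lut_size if the HW exposes 3D LUTs, zero otherwise. */
static uint32_t expected_size(uint32_t num_3dluts, uint32_t lut_size)
{
	return num_3dluts ? lut_size : 0;
}

/* has_shaper/has_lut3d say whether userspace supplied the blob at all. */
static int verify_shaper_and_lut3d(uint32_t num_3dluts,
				   bool has_shaper, uint32_t shaper_size,
				   bool has_lut3d, uint32_t lut3d_size)
{
	/* The shaper LUT is only usable when the 3D LUT caps exist. */
	if (has_shaper &&
	    shaper_size != expected_size(num_3dluts, SKETCH_LUT_ENTRIES))
		return -EINVAL;

	if (has_lut3d &&
	    lut3d_size != expected_size(num_3dluts, SKETCH_3DLUT_ENTRIES))
		return -EINVAL;

	return 0;
}
```

On hardware without 3D LUT caps the expected size is zero, so any supplied
shaper or 3D LUT blob is rejected, while leaving both properties unset
still passes.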
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 81 +--
 1 file changed, 38 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index 672ca5e9e59c..ff29be3929af 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -443,46 +443,26 @@ static void amdgpu_dm_atomic_lut3d(const struct drm_color_lut *drm_lut,
}
 }
 
-/**
- * __set_input_tf - calculates the input transfer function based on expected
- * input space.
- * @func: transfer function
- * @lut: lookup table that defines the color space
- * @lut_size: size of respective lut.
- *
- * Returns:
- * 0 in case of success. -ENOMEM if fails.
- */
-static int __set_input_tf(struct dc_transfer_func *func,
- const struct drm_color_lut *lut, uint32_t lut_size)
+static int amdgpu_dm_atomic_shaper_lut(const struct drm_color_lut *shaper_lut,
+  uint32_t shaper_size,
+  struct dc_transfer_func *func_shaper)
 {
-   struct dc_gamma *gamma = NULL;
-   bool res;
-
-   gamma = dc_create_gamma();
-   if (!gamma)
-   return -ENOMEM;
-
-   gamma->type = GAMMA_CUSTOM;
-   gamma->num_entries = lut_size;
-
-   __drm_lut_to_dc_gamma(lut, gamma, false);
-
-   res = mod_color_calculate_degamma_params(NULL, func, gamma, true);
-   dc_gamma_release(&gamma);
+   int ret = 0;
 
-   return res ? 0 : -ENOMEM;
-}
+   if (shaper_size) {
+   /* If DRM shaper LUT is set, we assume a linear color space
+* (linearized by DRM degamma 1D LUT or not)
+*/
+   func_shaper->type = TF_TYPE_DISTRIBUTED_POINTS;
+   func_shaper->tf = TRANSFER_FUNCTION_LINEAR;
 
-static int amdgpu_dm_atomic_shaper_lut(struct dc_transfer_func *func_shaper)
-{
-   /* We don't get DRM shaper LUT yet. We assume the input color space is already
-* delinearized, so we don't need a shaper LUT and we can just BYPASS
-*/
-   func_shaper->type = TF_TYPE_BYPASS;
-   func_shaper->tf = TRANSFER_FUNCTION_LINEAR;
+   ret = __set_output_tf(func_shaper, shaper_lut, shaper_size, false);
+   } else {
+   func_shaper->type = TF_TYPE_BYPASS;
+   func_shaper->tf = TRANSFER_FUNCTION_LINEAR;
+   }
 
-   return 0;
+   return ret;
 }
 
 /* amdgpu_dm_atomic_shaper_lut3d - set DRM CRTC shaper LUT and 3D LUT to DC
@@ -530,7 +510,8 @@ static int amdgpu_dm_atomic_shaper_lut3d(struct dc *dc,
 
amdgpu_dm_atomic_lut3d(drm_lut3d, drm_lut3d_size, lut3d_func);
 
-   return amdgpu_dm_atomic_shaper_lut(func_shaper);
+   return amdgpu_dm_atomic_shaper_lut(drm_shaper_lut,
+  drm_shaper_size, func_shaper);
 }
 
 /**
@@ -562,12 +543,22 @@ static uint32_t amdgpu_dm_get_lut3d_size(struct amdgpu_device *adev,
 int amdgpu_dm_verify_lut3d_size(struct amdgpu_device *adev,
const struct drm_crtc_state *crtc_state)
 {
-   const struct drm_color_lut *lut3d = NULL;
struct dm_crtc_state *acrtc_state = to_dm_crtc_state(crtc_state);
+   const struct drm_color_lut *shaper = NULL, *lut3d = NULL;
uint32_t exp_size, size;
 
-   exp_size = amdgpu_dm_get_lut3d_size(adev, MAX_COLOR_3DLUT_ENTRIES);
+   /* shaper LUT is only available if the HW has 3D LUT color caps */
+   exp_size = amdgpu_dm_get_lut3d_size(adev, MAX_COLOR_LUT_ENTRIES);
+   shaper = __extract_blob_lut(acrtc_state->shaper_lut, &size);
 
+   if (shaper && size != exp_size) {
+   DRM_DEBUG_DRIVER(
+   "Invalid Shaper LUT size. Should be %u but got %u.\n",
+   exp_size, size);
+   return -EINVAL;
+   }
+
+   exp_size = amdgpu_dm_get_lut3d_size(adev, MAX_COLOR_3DLUT_ENTRIES);
lut3d = __extract_blob_lut(acrtc_state->lut3d, &size);
 
if (lut3d && size != exp_size) {
@@ -652,14 +643,15 @@ int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state *crtc,
bool is_legacy;
int r;
 #ifdef CONFIG_STEAM_DECK
-   const struct drm_color_lut *lut3d;
-   uint32_t lut3d_size;
+   const struct drm_color_lut *shaper_lut, *lut3d;
+   uint32_t shaper_size, lut3d_size;
 
+   r = amdgpu_dm_verify_lut3d_size(adev, &crtc->base);
if

[RFC PATCH 29/40] drm/amd/display: add CRTC shaper TF support

2023-04-23 Thread Melissa Wen

Inspired by regamma TF, follow similar steps to add TF + 1D LUT for
shaper func. Reuse gamma_tf property, since the driver doesn't support
shaper and out gamma at the same time. We could rename gamma_tf, if
necessary to avoid misunderstandings, or add a specific property for
shaper lut.

Signed-off-by: Melissa Wen 
---
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_color.c  | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index 6e7271065a56..6a233380f284 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -470,19 +470,22 @@ static void amdgpu_dm_atomic_lut3d(const struct drm_color_lut *drm_lut,
 }
 
 static int amdgpu_dm_atomic_shaper_lut(const struct drm_color_lut *shaper_lut,
+  bool has_rom,
+  enum dc_transfer_func_predefined tf,
   uint32_t shaper_size,
   struct dc_transfer_func *func_shaper)
 {
int ret = 0;
 
-   if (shaper_size) {
+   if (shaper_size || tf != TRANSFER_FUNCTION_LINEAR) {
/* If DRM shaper LUT is set, we assume a linear color space
 * (linearized by DRM degamma 1D LUT or not)
 */
func_shaper->type = TF_TYPE_DISTRIBUTED_POINTS;
-   func_shaper->tf = TRANSFER_FUNCTION_LINEAR;
+   func_shaper->tf = tf;
+   func_shaper->sdr_ref_white_level = 80; /* hardcoded for now */
 
-   ret = __set_output_tf(func_shaper, shaper_lut, shaper_size, false);
+   ret = __set_output_tf(func_shaper, shaper_lut, shaper_size, has_rom);
} else {
func_shaper->type = TF_TYPE_BYPASS;
func_shaper->tf = TRANSFER_FUNCTION_LINEAR;
@@ -509,12 +512,14 @@ static int amdgpu_dm_atomic_shaper_lut3d(struct dc *dc,
 struct dc_stream_state *stream,
 const struct drm_color_lut *drm_shaper_lut,
 uint32_t drm_shaper_size,
+bool has_rom,
+enum dc_transfer_func_predefined tf,
 const struct drm_color_lut *drm_lut3d,
 uint32_t drm_lut3d_size)
 {
struct dc_3dlut *lut3d_func;
struct dc_transfer_func *func_shaper;
-   bool acquire = drm_shaper_size || drm_lut3d_size;
+   bool acquire = drm_shaper_size || drm_lut3d_size || tf != TRANSFER_FUNCTION_LINEAR;
 
lut3d_func = (struct dc_3dlut *)stream->lut3d_func;
func_shaper = (struct dc_transfer_func *)stream->func_shaper;
@@ -536,7 +541,7 @@ static int amdgpu_dm_atomic_shaper_lut3d(struct dc *dc,
 
amdgpu_dm_atomic_lut3d(drm_lut3d, drm_lut3d_size, lut3d_func);
 
-   return amdgpu_dm_atomic_shaper_lut(drm_shaper_lut,
+   return amdgpu_dm_atomic_shaper_lut(drm_shaper_lut, has_rom, tf,
   drm_shaper_size, func_shaper);
 }
 
@@ -735,6 +740,7 @@ int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state *crtc,
shaper_size = shaper_lut != NULL ? shaper_size : 0;
r = amdgpu_dm_atomic_shaper_lut3d(adev->dm.dc, ctx, stream,
  shaper_lut, shaper_size,
+ has_rom, tf,
  lut3d, lut3d_size);
if (r) {
DRM_DEBUG_DRIVER("Failed on shaper/3D LUTs setup\n");
-- 
2.39.2

[RFC PATCH 25/40] drm/amd/display: decouple steps to reuse in CRTC shaper LUT support

2023-04-23 Thread Melissa Wen

Decouple steps of post-blending shaper LUT setup and LUT size validation
according to HW caps as a preparation for DRM CRTC shaper LUT support.

Signed-off-by: Melissa Wen 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 67 ---
 1 file changed, 58 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index 25010fa19bc8..672ca5e9e59c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -443,6 +443,48 @@ static void amdgpu_dm_atomic_lut3d(const struct drm_color_lut *drm_lut,
}
 }
 
+/**
+ * __set_input_tf - calculates the input transfer function based on expected
+ * input space.
+ * @func: transfer function
+ * @lut: lookup table that defines the color space
+ * @lut_size: size of respective lut.
+ *
+ * Returns:
+ * 0 in case of success. -ENOMEM if fails.
+ */
+static int __set_input_tf(struct dc_transfer_func *func,
+ const struct drm_color_lut *lut, uint32_t lut_size)
+{
+   struct dc_gamma *gamma = NULL;
+   bool res;
+
+   gamma = dc_create_gamma();
+   if (!gamma)
+   return -ENOMEM;
+
+   gamma->type = GAMMA_CUSTOM;
+   gamma->num_entries = lut_size;
+
+   __drm_lut_to_dc_gamma(lut, gamma, false);
+
+   res = mod_color_calculate_degamma_params(NULL, func, gamma, true);
+   dc_gamma_release(&gamma);
+
+   return res ? 0 : -ENOMEM;
+}
+
+static int amdgpu_dm_atomic_shaper_lut(struct dc_transfer_func *func_shaper)
+{
+   /* We don't get DRM shaper LUT yet. We assume the input color space is already
+* delinearized, so we don't need a shaper LUT and we can just BYPASS
+*/
+   func_shaper->type = TF_TYPE_BYPASS;
+   func_shaper->tf = TRANSFER_FUNCTION_LINEAR;
+
+   return 0;
+}
+
 /* amdgpu_dm_atomic_shaper_lut3d - set DRM CRTC shaper LUT and 3D LUT to DC
  * interface
  * @dc: Display Core control structure
@@ -486,15 +528,23 @@ static int amdgpu_dm_atomic_shaper_lut3d(struct dc *dc,
if (!acquire)
return 0;
 
-   /* We don't get DRM shaper LUT yet. We assume the input color
-* space is already delinearized, so we don't need a shaper LUT
-* and we can just BYPASS.
-*/
-   func_shaper->type = TF_TYPE_BYPASS;
-   func_shaper->tf = TRANSFER_FUNCTION_LINEAR;
amdgpu_dm_atomic_lut3d(drm_lut3d, drm_lut3d_size, lut3d_func);
 
-   return 0;
+   return amdgpu_dm_atomic_shaper_lut(func_shaper);
+}
+
+/**
+ * amdgpu_dm_get_lut3d_size - get expected size according to hw color caps
+ * @adev: amdgpu device
+ * @lut_size: default size
+ *
+ * Return:
+ * lut_size if DC 3D LUT is supported, zero otherwise.
+ */
+static uint32_t amdgpu_dm_get_lut3d_size(struct amdgpu_device *adev,
+uint32_t lut_size)
+{
+   return adev->dm.dc->caps.color.mpc.num_3dluts ? lut_size : 0;
 }
 
 /**
@@ -516,8 +566,7 @@ int amdgpu_dm_verify_lut3d_size(struct amdgpu_device *adev,
struct dm_crtc_state *acrtc_state = to_dm_crtc_state(crtc_state);
uint32_t exp_size, size;
 
-   exp_size = adev->dm.dc->caps.color.mpc.num_3dluts ?
-  MAX_COLOR_3DLUT_ENTRIES : 0;
+   exp_size = amdgpu_dm_get_lut3d_size(adev, MAX_COLOR_3DLUT_ENTRIES);
 
lut3d = __extract_blob_lut(acrtc_state->lut3d, &size);
 
-- 
2.39.2

[RFC PATCH 24/40] drm/amd/display: add CRTC 3D LUT support to amd color pipeline

2023-04-23 Thread Melissa Wen

Map DRM CRTC 3D LUT in the atomic color mgmt pipeline to DC
(post-blending). 3D LUT works better in a non-linear color space,
therefore using a degamma to linearize the input space may produce
unexpected results. The next patch introduces shaper LUT support that
can be used to delinearize the color space before applying 3D LUT
conversion.

Signed-off-by: Melissa Wen 
---
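Not part of the patch: a minimal sketch of the two numeric facts the 3D LUT
conversion relies on, namely that a 17x17x17 LUT has 4913 entries and that
each 16-bit DRM color value is reduced to the 12-bit precision given by
MAX_COLOR_3DLUT_BITDEPTH. lut_extract() is a userspace stand-in written
from my understanding of DRM's drm_color_lut_extract() (round to nearest,
then clamp); treat the exact rounding as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Number of entries in a cubic 3D LUT with `dim` points per axis. */
static uint32_t lut3d_entries(uint32_t dim)
{
	return dim * dim * dim;
}

/*
 * Reduce a 16-bit color component to bit_precision bits, rounding to
 * nearest and clamping to the largest representable value.
 */
static uint32_t lut_extract(uint32_t user_input, int bit_precision)
{
	uint32_t val = user_input;
	uint32_t max;

	assert(bit_precision > 0 && bit_precision <= 16);
	max = 0xffffu >> (16 - bit_precision);

	if (bit_precision < 16) {
		val += 1u << (16 - bit_precision - 1);
		val >>= 16 - bit_precision;
	}

	return val > max ? max : val;
}
```

The clamp matters at the top of the range: full-scale 0xffff rounds up past
the 12-bit maximum and has to be pulled back to 0xfff.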
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |   8 +-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |   5 +
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 185 +++---
 3 files changed, 174 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 6b40e17892e5..760080e4a4da 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -9945,7 +9945,13 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
DRM_DEBUG_DRIVER("amdgpu_dm_verify_lut_sizes() failed\n");
goto fail;
}
-
+#ifdef CONFIG_STEAM_DECK
+   ret = amdgpu_dm_verify_lut3d_size(adev, new_crtc_state);
+   if (ret) {
+   DRM_DEBUG_DRIVER("amdgpu_dm_verify_lut3d_size() failed\n");
+   goto fail;
+   }
+#endif
if (!new_crtc_state->enable)
continue;
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index 5faf4fc87701..b9840c1f3cdf 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -894,9 +894,14 @@ void amdgpu_dm_update_freesync_caps(struct drm_connector *connector,
 
 void amdgpu_dm_trigger_timing_sync(struct drm_device *dev);
 
+#ifdef CONFIG_STEAM_DECK
 /* 3D LUT max size is 17x17x17 */
 #define MAX_COLOR_3DLUT_ENTRIES 4913
 #define MAX_COLOR_3DLUT_BITDEPTH 12
+int amdgpu_dm_verify_lut3d_size(struct amdgpu_device *adev,
+   const struct drm_crtc_state *crtc_state);
+#endif
+
 /* 1D LUT degamma, regamma and shaper*/
 #define MAX_COLOR_LUT_ENTRIES 4096
 /* Legacy gamm LUT users such as X doesn't like large LUT sizes */
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index 99b1738c98d3..25010fa19bc8 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -332,6 +332,117 @@ static int amdgpu_dm_set_atomic_regamma(struct dc_stream_state *stream,
return ret;
 }
 
+/**
+ * __set_input_tf - calculates the input transfer function based on expected
+ * input space.
+ * @func: transfer function
+ * @lut: lookup table that defines the color space
+ * @lut_size: size of respective lut.
+ *
+ * Returns:
+ * 0 in case of success. -ENOMEM if fails.
+ */
+static int __set_input_tf(struct dc_transfer_func *func,
+ const struct drm_color_lut *lut, uint32_t lut_size)
+{
+   struct dc_gamma *gamma = NULL;
+   bool res;
+
+   gamma = dc_create_gamma();
+   if (!gamma)
+   return -ENOMEM;
+
+   gamma->type = GAMMA_CUSTOM;
+   gamma->num_entries = lut_size;
+
+   __drm_lut_to_dc_gamma(lut, gamma, false);
+
+   res = mod_color_calculate_degamma_params(NULL, func, gamma, true);
+   dc_gamma_release(&gamma);
+
+   return res ? 0 : -ENOMEM;
+}
+
+#ifdef CONFIG_STEAM_DECK
+static void __to_dc_lut3d_color(struct dc_rgb *rgb,
+   const struct drm_color_lut lut,
+   int bit_precision)
+{
+   rgb->red = drm_color_lut_extract(lut.red, bit_precision);
+   rgb->green = drm_color_lut_extract(lut.green, bit_precision);
+   rgb->blue  = drm_color_lut_extract(lut.blue, bit_precision);
+}
+
+static void __drm_3dlut_to_dc_3dlut(const struct drm_color_lut *lut,
+   uint32_t lut3d_size,
+   struct tetrahedral_params *params,
+   bool use_tetrahedral_9,
+   int bit_depth)
+{
+   struct dc_rgb *lut0;
+   struct dc_rgb *lut1;
+   struct dc_rgb *lut2;
+   struct dc_rgb *lut3;
+   int lut_i, i;
+
+
+   if (use_tetrahedral_9) {
+   lut0 = params->tetrahedral_9.lut0;
+   lut1 = params->tetrahedral_9.lut1;
+   lut2 = params->tetrahedral_9.lut2;
+   lut3 = params->tetrahedral_9.lut3;
+   } else {
+   lut0 = params->tetrahedral_17.lut0;
+   lut1 = params->tetrahedral_17.lut1;
+   lut2 = params->tetrahedral_17.lut2;
+   lut3 = params->tetrahedral_17.lut3;
+   }
+
+   for (lut_i = 0, i = 0; i < lut3d_size - 4; lut_i++, i += 4) {
+   /* We should

[RFC PATCH 21/40] drm/amd/display: allow BYPASS 3D LUT but keep shaper LUT settings

2023-04-23 Thread Melissa Wen

HW allows us to program the shaper LUT without 3D LUT settings, which is
also useful for testing shaper LUT behavior. Therefore, the DC driver
should allow acquiring both the 3D and shaper LUTs while programming only
the shaper LUT when the 3D LUT is not initialized.

Signed-off-by: Melissa Wen 
---
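Not part of the patch: a minimal sketch of the programming decision after
this change, assuming hypothetical struct names. It models only the branch
touched by the hunk: once an RMU is validly acquired, the shaper is always
programmed, and the 3D LUT is programmed with NULL parameters when it was
never initialized. That NULL parameters put the 3D LUT into bypass is my
reading of the patch subject, not something the hunk itself shows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of stream->lut3d_func->state.bits. */
struct sketch_lut3d_state {
	bool initialized;
	bool rmu_idx_valid;
};

struct sketch_plan {
	bool use_rmu;        /* the acquired RMU is programmed at all */
	bool lut3d_bypass;   /* program_3dlut() gets NULL parameters */
	bool program_shaper; /* program_shaper() is called */
};

static struct sketch_plan plan_mpc_programming(const struct sketch_lut3d_state *s)
{
	struct sketch_plan plan = { false, false, false };

	/* Without a valid RMU this branch is not taken. */
	if (s == NULL || !s->rmu_idx_valid)
		return plan;

	plan.use_rmu = true;
	plan.lut3d_bypass = !s->initialized;
	plan.program_shaper = true;

	return plan;
}
```

Before the patch an uninitialized 3D LUT skipped the whole branch, so the
shaper could not be programmed on its own.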
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
index 3303c9aae068..bacb0a001d68 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
@@ -113,7 +113,6 @@ static bool dcn30_set_mpc_shaper_3dlut(struct pipe_ctx *pipe_ctx,
}
 
if (stream->lut3d_func &&
-   stream->lut3d_func->state.bits.initialized == 1 &&
stream->lut3d_func->state.bits.rmu_idx_valid == 1) {
if (stream->lut3d_func->state.bits.rmu_mux_num == 0)
mpcc_id_projected = stream->lut3d_func->state.bits.mpc_rmu0_mux;
@@ -131,8 +130,12 @@ static bool dcn30_set_mpc_shaper_3dlut(struct pipe_ctx *pipe_ctx,
if (acquired_rmu != stream->lut3d_func->state.bits.rmu_mux_num)
BREAK_TO_DEBUGGER();
 
-   result = mpc->funcs->program_3dlut(mpc, &stream->lut3d_func->lut_3d,
-  stream->lut3d_func->state.bits.rmu_mux_num);
+   if (stream->lut3d_func->state.bits.initialized == 1)
+   result = mpc->funcs->program_3dlut(mpc, &stream->lut3d_func->lut_3d,
+  stream->lut3d_func->state.bits.rmu_mux_num);
+   else
+   result = mpc->funcs->program_3dlut(mpc, NULL,
+  stream->lut3d_func->state.bits.rmu_mux_num);
result = mpc->funcs->program_shaper(mpc, shaper_lut,
stream->lut3d_func->state.bits.rmu_mux_num);
} else {
-- 
2.39.2

[RFC PATCH 23/40] drm/amd/display: dynamically acquire 3DLUT resources for color changes

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

dc_acquire_release_mpc_3dlut_for_ctx initializes the bits required to
program 3DLUT in DC MPC hw block, applied in set_output_transfer_func().
Since acquire/release can fail, we should check resource availability
during atomic check, against the newly created context. We dynamically
acquire 3D LUT resources when we actually use them, so we are not limited
by the stream count.

Co-developed-by: Melissa Wen 
Signed-off-by: Melissa Wen 
Signed-off-by: Joshua Ashton 
---
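Not part of the patch: a minimal sketch of the acquire/release decision in
amdgpu_dm_atomic_shaper_lut3d(), assuming hypothetical names. "needed"
follows this patch's condition (both a shaper LUT and a 3D LUT are set);
"held" means the stream already owns lut3d_func and func_shaper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sketch_action { SKETCH_KEEP, SKETCH_ACQUIRE, SKETCH_RELEASE };

/* Mirrors `bool acquire = drm_shaper_size && drm_lut3d_size;` */
static bool lut3d_needed(uint32_t shaper_size, uint32_t lut3d_size)
{
	return shaper_size && lut3d_size;
}

/* Only a change in need triggers a call into the resource pool. */
static enum sketch_action lut3d_next_action(bool needed, bool held)
{
	if (needed && !held)
		return SKETCH_ACQUIRE;
	if (!needed && held)
		return SKETCH_RELEASE;
	return SKETCH_KEEP;
}
```

Because the resource pool is only touched on a transition, a stream that
never sets these properties never consumes a 3D LUT slot.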
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  7 ++-
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  3 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 55 ++-
 .../amd/display/dc/dcn301/dcn301_resource.c   | 26 -
 4 files changed, 87 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 729e37fa1873..6b40e17892e5 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -9380,7 +9380,12 @@ static int dm_update_crtc_state(struct amdgpu_display_manager *dm,
 */
if (dm_new_crtc_state->base.color_mgmt_changed ||
drm_atomic_crtc_needs_modeset(new_crtc_state)) {
-   ret = amdgpu_dm_update_crtc_color_mgmt(dm_new_crtc_state);
+   if (!dm_state) {
+   ret = dm_atomic_get_state(state, &dm_state);
+   if (ret)
+   goto fail;
+   }
+   ret = amdgpu_dm_update_crtc_color_mgmt(dm_new_crtc_state, dm_state->context);
if (ret)
goto fail;
}
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index 54121c3fa040..5faf4fc87701 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -904,7 +904,8 @@ void amdgpu_dm_trigger_timing_sync(struct drm_device *dev);
 
 void amdgpu_dm_init_color_mod(void);
 int amdgpu_dm_verify_lut_sizes(const struct drm_crtc_state *crtc_state);
-int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state *crtc);
+int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state *crtc,
+struct dc_state *ctx);
 int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state *crtc,
  struct dc_plane_state *dc_plane_state);
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index f1885e9c614d..99b1738c98d3 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -332,6 +332,49 @@ static int amdgpu_dm_set_atomic_regamma(struct dc_stream_state *stream,
return ret;
 }
 
+/* amdgpu_dm_atomic_shaper_lut3d - set DRM CRTC shaper LUT and 3D LUT to DC
+ * interface
+ * @dc: Display Core control structure
+ * @ctx: 
+ * @stream: DC stream state to set shaper LUT and 3D LUT
+ * @drm_shaper_lut: DRM CRTC (user) shaper LUT
+ * @drm_shaper_size: size of shaper LUT
+ * @drm_lut3d: DRM CRTC (user) 3D LUT
+ * @drm_lut3d_size: size of 3D LUT
+ *
+ * Returns:
+ * 0 on success.
+ */
+static int amdgpu_dm_atomic_shaper_lut3d(struct dc *dc,
+struct dc_state *ctx,
+struct dc_stream_state *stream,
+const struct drm_color_lut *drm_shaper_lut,
+uint32_t drm_shaper_size,
+const struct drm_color_lut *drm_lut3d,
+uint32_t drm_lut3d_size)
+{
+   struct dc_3dlut *lut3d_func;
+   struct dc_transfer_func *func_shaper;
+   bool acquire = drm_shaper_size && drm_lut3d_size;
+
+   lut3d_func = (struct dc_3dlut *)stream->lut3d_func;
+   func_shaper = (struct dc_transfer_func *)stream->func_shaper;
+
+   ASSERT((lut3d_func && func_shaper) || (!lut3d_func && !func_shaper));
+   if ((acquire && !lut3d_func && !func_shaper) ||
+   (!acquire && lut3d_func && func_shaper))
+   {
+   if (!dc_acquire_release_mpc_3dlut_for_ctx(dc, acquire, ctx, stream,
+ &lut3d_func, &func_shaper))
+   return DC_ERROR_UNEXPECTED;
+   }
+
+   stream->lut3d_func = lut3d_func;
+   stream->func_shaper = func_shaper;
+
+   return 0;
+}
+
 /**
  * __set_input_tf - calculates the input transfer function based on expected
  * input space.
@@ -402,6 +445,7 @@ int amdgpu_dm_verify_lut_sizes(const struct drm_crtc_state *crtc_state)
 /**
  * amdgpu_dm_update_crtc_color_mgmt: Maps DRM color management to DC stream.
  * @crtc: amdgpu_dm crtc state
+ * @ctx: 
  *
  * With no plane level color

[RFC PATCH 20/40] drm/amd/display: copy 3D LUT settings from crtc state to stream_update

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

When committing planes, we copy color mgmt resources to the stream state.
Do the same for shaper and 3D LUTs.

Co-developed-by: Melissa Wen 
Signed-off-by: Melissa Wen 
Signed-off-by: Joshua Ashton 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 76a776fd8437..729e37fa1873 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -8190,6 +8190,10 @@ static void amdgpu_dm_commit_planes(struct drm_atomic_state *state,
&acrtc_state->stream->csc_color_matrix;
bundle->stream_update.out_transfer_func =
acrtc_state->stream->out_transfer_func;
+   bundle->stream_update.lut3d_func =
+   (struct dc_3dlut *) acrtc_state->stream->lut3d_func;
+   bundle->stream_update.func_shaper =
+   (struct dc_transfer_func *) acrtc_state->stream->func_shaper;
}
 
acrtc_state->stream->abm_level = acrtc_state->abm_level;
-- 
2.39.2

[RFC PATCH 22/40] drm/amd/display: handle MPC 3D LUT resources for a given context

2023-04-23 Thread Melissa Wen

In the original dc_acquire_release_mpc_3dlut(), only the current ctx is
considered, which doesn't fit the steps for atomic checking a new ctx.
Therefore, create a function to handle 3D LUT resources for a given
context, so that we can check resource availability at atomic_check
time and handle failures properly.
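
The idea of acquiring against a caller-supplied context can be sketched
in a standalone form as below. This is only an illustration under stated
assumptions: struct sketch_state, SKETCH_MAX_PIPES and
sketch_acquire_release_3dlut_for_state() are hypothetical names that do
not exist in DC, and the release path is simplified compared to the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PIPES 4

/* Hypothetical, simplified stand-in for a DC state: which stream each
 * pipe drives and whether that pipe's 3D LUT is already taken. */
struct sketch_state {
	const void *pipe_stream[SKETCH_MAX_PIPES];
	bool lut3d_taken[SKETCH_MAX_PIPES];
};

/* Acquire (or release) the 3D LUT of the pipe that drives @stream,
 * touching only @state and never a global "current" state. */
static bool sketch_acquire_release_3dlut_for_state(struct sketch_state *state,
						   bool acquire,
						   const void *stream)
{
	size_t i;

	if (!state || !stream)
		return false;

	for (i = 0; i < SKETCH_MAX_PIPES; i++) {
		if (state->pipe_stream[i] != stream)
			continue;
		if (acquire) {
			if (state->lut3d_taken[i])
				return false;	/* in use: the check can fail cleanly */
			state->lut3d_taken[i] = true;
			return true;
		}
		state->lut3d_taken[i] = false;
		return true;
	}
	return false;	/* no pipe drives this stream in this state */
}
```

Because only the passed-in state is modified, a failed acquire during a
check leaves the currently committed state untouched.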

Signed-off-by: Melissa Wen 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 39 
 drivers/gpu/drm/amd/display/dc/dc.h  |  8 +
 2 files changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 9230c122d77e..ee3fe4eae22e 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -2101,6 +2101,45 @@ bool dc_acquire_release_mpc_3dlut(
return ret;
 }
 
+bool
+dc_acquire_release_mpc_3dlut_for_ctx(struct dc *dc,
+bool acquire,
+struct dc_state *state,
+struct dc_stream_state *stream,
+struct dc_3dlut **lut,
+struct dc_transfer_func **shaper)
+{
+   int pipe_idx;
+   bool ret = false;
+   bool found_pipe_idx = false;
+   const struct resource_pool *pool = dc->res_pool;
+   struct resource_context *res_ctx = &state->res_ctx;
+   int mpcc_id = 0;
+
+   if (pool && res_ctx) {
+   if (acquire) {
+   /*find pipe idx for the given stream*/
+   for (pipe_idx = 0; pipe_idx < pool->pipe_count; 
pipe_idx++) {
+   if (res_ctx->pipe_ctx[pipe_idx].stream == 
stream) {
+   found_pipe_idx = true;
+   mpcc_id = 
res_ctx->pipe_ctx[pipe_idx].plane_res.hubp->inst;
+   break;
+   }
+   }
+   } else
+   found_pipe_idx = true;/*for release pipe_idx is not 
required*/
+
+   if (found_pipe_idx) {
+   if (acquire && pool->funcs->acquire_post_bldn_3dlut)
+   ret = 
pool->funcs->acquire_post_bldn_3dlut(res_ctx, pool, mpcc_id, lut, shaper);
+   else if (!acquire && 
pool->funcs->release_post_bldn_3dlut)
+   ret = 
pool->funcs->release_post_bldn_3dlut(res_ctx, pool, lut, shaper);
+   }
+   }
+   return ret;
+}
+
+
 static bool is_flip_pending_in_pipes(struct dc *dc, struct dc_state *context)
 {
int i;
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index b45974a2dec3..7fdb0bbb2df9 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -1350,6 +1350,14 @@ bool dc_acquire_release_mpc_3dlut(
struct dc_3dlut **lut,
struct dc_transfer_func **shaper);
 
+bool
+dc_acquire_release_mpc_3dlut_for_ctx(struct dc *dc,
+bool acquire,
+struct dc_state *state,
+struct dc_stream_state *stream,
+struct dc_3dlut **lut,
+struct dc_transfer_func **shaper);
+
 void dc_resource_state_copy_construct(
const struct dc_state *src_ctx,
struct dc_state *dst_ctx);
-- 
2.39.2

[RFC PATCH 18/40] drm/amd/display: encapsulate atomic regamma operation

2023-04-23 Thread Melissa Wen

We are introducing the DRM 3D LUT property to the DM color pipeline in the
next patch, but so far only for the atomic interface. By checking
set_output_transfer_func in DC drivers with MPC 3D LUT support, we can
verify that regamma is only programmed when 3D LUT programming fails. As
groundwork to introduce 3D LUT programming and better understand each
step, detach atomic regamma programming from the crtc color updating
code.

Signed-off-by: Melissa Wen 
---
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 52 ---
 1 file changed, 33 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index fe779d10834e..f1885e9c614d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -303,6 +303,35 @@ static int __set_output_tf(struct dc_transfer_func *func,
return res ? 0 : -ENOMEM;
 }
 
+static int amdgpu_dm_set_atomic_regamma(struct dc_stream_state *stream,
+   const struct drm_color_lut *regamma_lut,
+   uint32_t regamma_size, bool has_rom)
+{
+   int ret = 0;
+   if (regamma_size) {
+   /* CRTC RGM goes into RGM LUT.
+*
+* Note: there is no implicit sRGB regamma here. We are using
+* degamma calculation from color module to calculate the curve
+* from a linear base.
+*/
+   stream->out_transfer_func->type = TF_TYPE_DISTRIBUTED_POINTS;
+   stream->out_transfer_func->tf = TRANSFER_FUNCTION_LINEAR;
+
+   ret = __set_output_tf(stream->out_transfer_func,
+ regamma_lut, regamma_size, has_rom);
+   } else {
+   /*
+* No CRTC RGM means we can just put the block into bypass
+* since we don't have any plane level adjustments using it.
+*/
+   stream->out_transfer_func->type = TF_TYPE_BYPASS;
+   stream->out_transfer_func->tf = TRANSFER_FUNCTION_LINEAR;
+   }
+
+   return ret;
+}
+
 /**
  * __set_input_tf - calculates the input transfer function based on expected
  * input space.
@@ -450,27 +479,12 @@ int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state 
*crtc)
regamma_size, has_rom);
if (r)
return r;
-   } else if (has_regamma) {
-   /* CRTC RGM goes into RGM LUT.
-*
-* Note: there is no implicit sRGB regamma here. We are using
-* degamma calculation from color module to calculate the curve
-* from a linear base.
-*/
-   stream->out_transfer_func->type = TF_TYPE_DISTRIBUTED_POINTS;
-   stream->out_transfer_func->tf = TRANSFER_FUNCTION_LINEAR;
-
-   r = __set_output_tf(stream->out_transfer_func, regamma_lut,
-   regamma_size, has_rom);
+   } else {
+   regamma_size = has_regamma ? regamma_size : 0;
+   r = amdgpu_dm_set_atomic_regamma(stream, regamma_lut,
+regamma_size, has_rom);
if (r)
return r;
-   } else {
-   /*
-* No CRTC RGM means we can just put the block into bypass
-* since we don't have any plane level adjustments using it.
-*/
-   stream->out_transfer_func->type = TF_TYPE_BYPASS;
-   stream->out_transfer_func->tf = TRANSFER_FUNCTION_LINEAR;
}
 
/*
-- 
2.39.2

[RFC PATCH 19/40] drm/amd/display: update lut3d and shaper lut to stream

2023-04-23 Thread Melissa Wen

It follows the same path as out_transfer_func for stream updates, since
the shaper LUT and 3D LUT are programmed in funcs.set_output_transfer_func()
and this function is called in the atomic commit_tail when
update_flags.bits.out_tf is set.

Signed-off-by: Melissa Wen 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index e65ba87ee2c5..9230c122d77e 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -2581,7 +2581,7 @@ static enum surface_update_type 
check_update_surfaces_for_stream(
stream_update->integer_scaling_update)
su_flags->bits.scaling = 1;
 
-   if (stream_update->out_transfer_func)
+   if (stream_update->out_transfer_func || 
stream_update->lut3d_func)
su_flags->bits.out_tf = 1;
 
if (stream_update->abm_level)
@@ -2936,6 +2936,14 @@ static void copy_stream_update_to_stream(struct dc *dc,
   sizeof(struct dc_transfer_func_distributed_points));
}
 
+   if (update->func_shaper &&
+   stream->func_shaper != update->func_shaper)
+   stream->func_shaper = update->func_shaper;
+
+   if (update->lut3d_func &&
+   stream->lut3d_func != update->lut3d_func)
+   stream->lut3d_func = update->lut3d_func;
+
if (update->hdr_static_metadata)
stream->hdr_static_metadata = *update->hdr_static_metadata;
 
-- 
2.39.2

[RFC PATCH 17/40] drm/amd/display: add comments to describe DM crtc color mgmt behavior

2023-04-23 Thread Melissa Wen

Describe some expected behavior of the AMD DM color mgmt programming.

Signed-off-by: Melissa Wen 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index a4cb23d059bd..fe779d10834e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -440,12 +440,23 @@ int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state 
*crtc)
stream->out_transfer_func->type = TF_TYPE_DISTRIBUTED_POINTS;
stream->out_transfer_func->tf = TRANSFER_FUNCTION_SRGB;
 
+   /* Note: although we pass has_rom as parameter here, we never
+* actually use ROM because the color module only takes the ROM
+* path if transfer_func->type == PREDEFINED.
+*
+* See more in mod_color_calculate_regamma_params()
+*/
r = __set_legacy_tf(stream->out_transfer_func, regamma_lut,
regamma_size, has_rom);
if (r)
return r;
} else if (has_regamma) {
-   /* If atomic regamma, CRTC RGM goes into RGM LUT. */
+   /* CRTC RGM goes into RGM LUT.
+*
+* Note: there is no implicit sRGB regamma here. We are using
+* degamma calculation from color module to calculate the curve
+* from a linear base.
+*/
stream->out_transfer_func->type = TF_TYPE_DISTRIBUTED_POINTS;
stream->out_transfer_func->tf = TRANSFER_FUNCTION_LINEAR;
 
-- 
2.39.2

[RFC PATCH 16/40] drm/amd/display: add plane blend LUT and TF driver-private properties

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

A blend 1D LUT or a predefined transfer function can be set to linearize
content before blending, so it is positioned just before blending
planes and after the 3D LUT (non-linear space). Shaper and blend LUTs are
1D LUTs that sandwich the 3D LUT. Drivers should advertise blend properties
according to HW caps.

Signed-off-by: Joshua Ashton 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 23 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  | 18 ++
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 12 +++
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 34 +++
 4 files changed, 87 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 2bf8b19feae4..0bcf0bc6baff 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1369,6 +1369,29 @@ amdgpu_display_create_color_properties(struct 
amdgpu_device *adev)
return -ENOMEM;
adev->mode_info.plane_lut3d_size_property = prop;
 
+   prop = drm_property_create(adev_to_drm(adev),
+  DRM_MODE_PROP_BLOB,
+  "AMD_PLANE_BLEND_LUT", 0);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.plane_blend_lut_property = prop;
+
+   prop = drm_property_create_range(adev_to_drm(adev),
+DRM_MODE_PROP_IMMUTABLE,
+"AMD_PLANE_BLEND_LUT_SIZE", 0, 
UINT_MAX);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.plane_blend_lut_size_property = prop;
+
+   prop = drm_property_create_enum(adev_to_drm(adev),
+   DRM_MODE_PROP_ENUM,
+   "AMD_PLANE_BLEND_TF",
+   drm_transfer_function_enum_list,
+   
ARRAY_SIZE(drm_transfer_function_enum_list));
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.plane_blend_tf_property = prop;
+
return 0;
 }
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 17c7669ad9ab..f640dbd53b8c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -417,6 +417,24 @@ struct amdgpu_mode_info {
 * size of 3D LUT as supported by the driver (read-only).
 */
struct drm_property *plane_lut3d_size_property;
+   /**
+* @plane_blend_lut_property: Plane property for output gamma before
+* blending. Userspace sets a blend LUT to convert colors after 3D LUT
+* conversion. It works as a post-3D LUT 1D LUT; together with the shaper
+* LUT, it sandwiches the 3D LUT between two 1D LUTs.
+*/
+   struct drm_property *plane_blend_lut_property;
+   /**
+* @plane_blend_lut_size_property: Plane property to define the max
+* size of blend LUT as supported by the driver (read-only).
+*/
+   struct drm_property *plane_blend_lut_size_property;
+   /**
+* @plane_blend_tf_property: Plane property to set a predefined
+* transfer function for pre-blending blend (after applying 3D LUT)
+* with or without LUT.
+*/
+   struct drm_property *plane_blend_tf_property;
 #endif
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index 8a425e7a7e89..54121c3fa040 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -764,6 +764,18 @@ struct dm_plane_state {
 *  drm_color_lut.
 */
struct drm_property_blob *lut3d;
+   /**
+* @blend_lut: blend lut lookup table blob. The blob (if not NULL) is an
+* array of  drm_color_lut.
+*/
+   struct drm_property_blob *blend_lut;
+   /**
+* @blend_tf:
+*
+* Pre-defined transfer function for converting plane pixel data before
+* applying blend LUT.
+*/
+   enum drm_transfer_function blend_tf;
 #endif
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index e4f28fbf6613..cdbd11f3be20 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -1325,6 +1325,7 @@ static void dm_drm_plane_reset(struct drm_plane *plane)
amdgpu_state->degamma_tf = DRM_TRANSFER_FUNCTION_DEFAULT;
amdgpu_state->hdr_mult = AMDGPU_HDR_MULT_DEFAULT;
amdgpu_state->shaper_tf = DRM_TRANSFER_FUNCTION_DEFAULT;
+   amdgpu_state->blend_tf = DRM_TRANSFER_FUNCTION_DEFAULT;
}
 #endif
 }
@@ -1352,6 +1353,8 @@ dm_drm_plane_duplicate_state(struct

[RFC PATCH 15/40] drm/amd/display: add plane shaper TF driver-private property

2023-04-23 Thread Melissa Wen

Add a property to set a predefined transfer function to enable delinearizing
content with or without a shaper LUT. Drivers should advertise this
property according to HW caps.

Signed-off-by: Melissa Wen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  9 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  6 ++
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  6 ++
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 11 +++
 4 files changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index f41406ee96ad..2bf8b19feae4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1346,6 +1346,15 @@ amdgpu_display_create_color_properties(struct 
amdgpu_device *adev)
return -ENOMEM;
adev->mode_info.plane_shaper_lut_size_property = prop;
 
+   prop = drm_property_create_enum(adev_to_drm(adev),
+   DRM_MODE_PROP_ENUM,
+   "AMD_PLANE_SHAPER_TF",
+   drm_transfer_function_enum_list,
+   
ARRAY_SIZE(drm_transfer_function_enum_list));
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.plane_shaper_tf_property = prop;
+
prop = drm_property_create(adev_to_drm(adev),
   DRM_MODE_PROP_BLOB,
   "AMD_PLANE_LUT3D", 0);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 756d5f70be0a..17c7669ad9ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -401,6 +401,12 @@ struct amdgpu_mode_info {
 * pre-blending shaper LUT as supported by the driver (read-only).
 */
struct drm_property *plane_shaper_lut_size_property;
+   /**
+* @plane_shaper_tf_property: Plane property to set a predefined
+* transfer function for pre-blending shaper (before applying 3D LUT)
+* with or without LUT.
+*/
+   struct drm_property *plane_shaper_tf_property;
/**
 * @plane_lut3d_property: Plane property for gamma correction using a
 * 3D LUT (pre-blending).
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index d3ecc73129ff..8a425e7a7e89 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -753,6 +753,12 @@ struct dm_plane_state {
 * array of  drm_color_lut.
 */
struct drm_property_blob *shaper_lut;
+   /**
+* @shaper_tf:
+*
+* Predefined transfer function to delinearize color space.
+*/
+   enum drm_transfer_function shaper_tf;
/**
 * @lut3d: 3D lookup table blob. The blob (if not NULL) is an array of
 *  drm_color_lut.
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index 69e2f1f86cce..e4f28fbf6613 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -1324,6 +1324,7 @@ static void dm_drm_plane_reset(struct drm_plane *plane)
if (amdgpu_state) {
amdgpu_state->degamma_tf = DRM_TRANSFER_FUNCTION_DEFAULT;
amdgpu_state->hdr_mult = AMDGPU_HDR_MULT_DEFAULT;
+   amdgpu_state->shaper_tf = DRM_TRANSFER_FUNCTION_DEFAULT;
}
 #endif
 }
@@ -1503,6 +1504,9 @@ dm_plane_attach_color_mgmt_properties(struct 
amdgpu_display_manager *dm,
drm_object_attach_property(>base,
   
dm->adev->mode_info.plane_shaper_lut_size_property,
   MAX_COLOR_LUT_ENTRIES);
+   drm_object_attach_property(&plane->base,
+  
dm->adev->mode_info.plane_shaper_tf_property,
+  DRM_TRANSFER_FUNCTION_DEFAULT);
drm_object_attach_property(>base,
   
dm->adev->mode_info.plane_lut3d_property, 0);
drm_object_attach_property(>base,
@@ -1547,6 +1551,11 @@ dm_atomic_plane_set_property(struct drm_plane *plane,
);
dm_plane_state->base.color_mgmt_changed |= replaced;
return ret;
+   } else if (property == adev->mode_info.plane_shaper_tf_property) {
+   if (dm_plane_state->shaper_tf != val) {
+   dm_plane_state->shaper_tf = val;
+   dm_plane_state->base.color_mgmt_changed = 1;
+   }
} else if (property == adev->mode_info.plane_lut3d_property) {

[RFC PATCH 14/40] drm/amd/display: add plane shaper LUT driver-private properties

2023-04-23 Thread Melissa Wen

The shaper 1D LUT delinearizes content before the 3D LUT is applied, so it
comes before the 3D LUT. It's an optional property and drivers should attach
it according to HW caps.

Signed-off-by: Melissa Wen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 14 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  | 10 ++
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  5 +
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 19 +++
 4 files changed, 48 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 8d4726978c6e..f41406ee96ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1332,6 +1332,20 @@ amdgpu_display_create_color_properties(struct 
amdgpu_device *adev)
return -ENOMEM;
adev->mode_info.plane_hdr_mult_property = prop;
 
+   prop = drm_property_create(adev_to_drm(adev),
+  DRM_MODE_PROP_BLOB,
+  "AMD_PLANE_SHAPER_LUT", 0);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.plane_shaper_lut_property = prop;
+
+   prop = drm_property_create_range(adev_to_drm(adev),
+DRM_MODE_PROP_IMMUTABLE,
+"AMD_PLANE_SHAPER_LUT_SIZE", 0, 
UINT_MAX);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.plane_shaper_lut_size_property = prop;
+
prop = drm_property_create(adev_to_drm(adev),
   DRM_MODE_PROP_BLOB,
   "AMD_PLANE_LUT3D", 0);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 9d9dac26edfc..756d5f70be0a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -391,6 +391,16 @@ struct amdgpu_mode_info {
 * @plane_hdr_mult_property:
 */
struct drm_property *plane_hdr_mult_property;
+   /**
+* @plane_shaper_lut_property: Plane property to set pre-blending shaper LUT
+* that converts color content before 3D LUT.
+*/
+   struct drm_property *plane_shaper_lut_property;
+   /**
+* @plane_shaper_lut_size_property: Plane property for the size of
+* pre-blending shaper LUT as supported by the driver (read-only).
+*/
+   struct drm_property *plane_shaper_lut_size_property;
/**
 * @plane_lut3d_property: Plane property for gamma correction using a
 * 3D LUT (pre-blending).
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index b0ba0279dc25..d3ecc73129ff 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -748,6 +748,11 @@ struct dm_plane_state {
 * S31.32 sign-magnitude.
 */
__u64 hdr_mult;
+   /**
+* @shaper_lut: shaper lookup table blob. The blob (if not NULL) is an
+* array of  drm_color_lut.
+*/
+   struct drm_property_blob *shaper_lut;
/**
 * @lut3d: 3D lookup table blob. The blob (if not NULL) is an array of
 *  drm_color_lut.
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index 0e418e161b0b..69e2f1f86cce 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -1347,6 +1347,8 @@ dm_drm_plane_duplicate_state(struct drm_plane *plane)
 #ifdef CONFIG_STEAM_DECK
if (dm_plane_state->degamma_lut)
drm_property_blob_get(dm_plane_state->degamma_lut);
+   if (dm_plane_state->shaper_lut)
+   drm_property_blob_get(dm_plane_state->shaper_lut);
if (dm_plane_state->lut3d)
drm_property_blob_get(dm_plane_state->lut3d);
 #endif
@@ -1419,6 +1421,7 @@ static void dm_drm_plane_destroy_state(struct drm_plane 
*plane,
struct dm_plane_state *dm_plane_state = to_dm_plane_state(state);
 #ifdef CONFIG_STEAM_DECK
drm_property_blob_put(dm_plane_state->degamma_lut);
+   drm_property_blob_put(dm_plane_state->shaper_lut);
drm_property_blob_put(dm_plane_state->lut3d);
 #endif
 
@@ -1495,6 +1498,11 @@ dm_plane_attach_color_mgmt_properties(struct 
amdgpu_display_manager *dm,
   AMDGPU_HDR_MULT_DEFAULT);
 
if (dm->dc->caps.color.dpp.hw_3d_lut) {
+   drm_object_attach_property(&plane->base,
+  
dm->adev->mode_info.plane_shaper_lut_property, 0);
+   drm_object_attach_property(&plane->base,
+  
dm->adev->mode_info.plane_shaper_lut_size_property,
+  MAX_COLOR_LUT_ENTRIES);

[RFC PATCH 10/40] drm/amd/display: add plane degamma LUT driver-private props

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

Create driver-private properties (not DRM KMS generic) for plane degamma
LUT (user-blob and its size).

Co-developed-by: Melissa Wen 
Signed-off-by: Melissa Wen 
Signed-off-by: Joshua Ashton 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 14 
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  | 10 +++
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 11 +++
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 78 ++-
 4 files changed, 111 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 1913903cab88..996c9c3fd471 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1303,6 +1303,20 @@ amdgpu_display_create_color_properties(struct 
amdgpu_device *adev)
return -ENOMEM;
adev->mode_info.gamma_tf_property = prop;
 
+   prop = drm_property_create(adev_to_drm(adev),
+  DRM_MODE_PROP_BLOB,
+  "AMD_PLANE_DEGAMMA_LUT", 0);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.plane_degamma_lut_property = prop;
+
+   prop = drm_property_create_range(adev_to_drm(adev),
+DRM_MODE_PROP_IMMUTABLE,
+"AMD_PLANE_DEGAMMA_LUT_SIZE", 0, 
UINT_MAX);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.plane_degamma_lut_size_property = prop;
+
return 0;
 }
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 76337e18c728..d4e609a8b67e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -372,6 +372,16 @@ struct amdgpu_mode_info {
 * @gamma_tf_property: Transfer function for CRTC regamma.
 */
struct drm_property *gamma_tf_property;
+   /**
+* @plane_degamma_lut_property: Plane property to set a degamma LUT to
+* convert color space before blending.
+*/
+   struct drm_property *plane_degamma_lut_property;
+   /**
+* @plane_degamma_lut_size_property: Plane property to define the max
+* size of degamma LUT as supported by the driver (read-only).
+*/
+   struct drm_property *plane_degamma_lut_size_property;
 #endif
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index 1e90a2dd445e..b1d0c65d821d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -719,6 +719,17 @@ enum drm_transfer_function {
 struct dm_plane_state {
struct drm_plane_state base;
struct dc_plane_state *dc_state;
+
+#ifdef CONFIG_STEAM_DECK
+   /* Plane color mgmt */
+   /**
+* @degamma_lut:
+*
+* LUT for converting plane pixel data before going into plane merger.
+* The blob (if not NULL) is an array of  drm_color_lut.
+*/
+   struct drm_property_blob *degamma_lut;
+#endif
 };
 
 struct dm_crtc_state {
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index 4e5498153be2..7b9d62c70b30 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -1337,7 +1337,10 @@ dm_drm_plane_duplicate_state(struct drm_plane *plane)
dm_plane_state->dc_state = old_dm_plane_state->dc_state;
dc_plane_state_retain(dm_plane_state->dc_state);
}
-
+#ifdef CONFIG_STEAM_DECK
+   if (dm_plane_state->degamma_lut)
+   drm_property_blob_get(dm_plane_state->degamma_lut);
+#endif
return _plane_state->base;
 }
 
@@ -1404,7 +1407,9 @@ static void dm_drm_plane_destroy_state(struct drm_plane 
*plane,
struct drm_plane_state *state)
 {
struct dm_plane_state *dm_plane_state = to_dm_plane_state(state);
-
+#ifdef CONFIG_STEAM_DECK
+   drm_property_blob_put(dm_plane_state->degamma_lut);
+#endif
if (dm_plane_state->dc_state)
dc_plane_state_release(dm_plane_state->dc_state);
 
@@ -1444,6 +1449,68 @@ amdgpu_dm_replace_property_blob_from_id(struct 
drm_device *dev,
 
return 0;
 }
+
+static void
+dm_plane_attach_color_mgmt_properties(struct amdgpu_display_manager *dm,
+ struct drm_plane *plane)
+{
+   if (dm->dc->caps.color.dpp.dgam_ram || 
dm->dc->caps.color.dpp.gamma_corr ) {
+   drm_object_attach_property(&plane->base,
+  
dm->adev->mode_info.plane_degamma_lut_property, 0);
+   drm_object_attach_property(&plane->base,
+  
dm->adev->mode_info.plane_degamma_lut_size_property,
+

[RFC PATCH 13/40] drm/amd/display: add plane 3D LUT driver-private properties

2023-04-23 Thread Melissa Wen

Add a 3D LUT property for plane gamma correction using a 3D lookup table.
A 3D LUT is more effective when applied in non-linear space; therefore,
userspace may need one 1D LUT (shaper) before it to delinearize content
and another 1D LUT after the 3D LUT (blend) to linearize content again for
blending. The next patches add these 1D LUTs to the plane color mgmt
pipeline.
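
As a rough illustration of the shaper -> 3D LUT -> blend ordering
described above, here is a minimal userspace-style sketch. It is not
driver code: struct lut1d, lut1d_apply(), lut3d_fn and
plane_color_pipeline() are hypothetical names, the 1D LUTs are assumed to
be evenly sampled over [0, 1] with linear interpolation, and the 3D LUT
stage is left as an optional callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for illustration only; the driver takes
 * struct drm_color_lut blobs instead. */
struct lut1d {
	const float *entries;	/* samples at evenly spaced inputs in [0, 1] */
	size_t size;		/* must be >= 2 */
};

/* Look up x in a 1D LUT, interpolating linearly between samples. */
static float lut1d_apply(const struct lut1d *lut, float x)
{
	float pos, frac;
	size_t i;

	if (x <= 0.0f)
		return lut->entries[0];
	if (x >= 1.0f)
		return lut->entries[lut->size - 1];

	pos = x * (float)(lut->size - 1);
	i = (size_t)pos;
	if (i >= lut->size - 1)	/* guard against float rounding near 1.0 */
		return lut->entries[lut->size - 1];
	frac = pos - (float)i;
	return lut->entries[i] + frac * (lut->entries[i + 1] - lut->entries[i]);
}

/* Optional 3D LUT stage, applied in place on an RGB triple. */
typedef void (*lut3d_fn)(float rgb[3]);

/* Shaper (delinearize) -> 3D LUT (non-linear space) -> blend (linearize). */
static void plane_color_pipeline(float rgb[3],
				 const struct lut1d *shaper,
				 lut3d_fn lut3d,
				 const struct lut1d *blend)
{
	int c;

	for (c = 0; c < 3; c++)
		rgb[c] = lut1d_apply(shaper, rgb[c]);
	if (lut3d)
		lut3d(rgb);
	for (c = 0; c < 3; c++)
		rgb[c] = lut1d_apply(blend, rgb[c]);
}
```

The two 1D stages operate per channel, while the 3D stage sees the whole
RGB triple, which is why the 1D LUTs are placed around it.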

Signed-off-by: Melissa Wen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 14 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  | 10 
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  5 
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 24 +++
 4 files changed, 53 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index dd658f162f6f..8d4726978c6e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1332,6 +1332,20 @@ amdgpu_display_create_color_properties(struct 
amdgpu_device *adev)
return -ENOMEM;
adev->mode_info.plane_hdr_mult_property = prop;
 
+   prop = drm_property_create(adev_to_drm(adev),
+  DRM_MODE_PROP_BLOB,
+  "AMD_PLANE_LUT3D", 0);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.plane_lut3d_property = prop;
+
+   prop = drm_property_create_range(adev_to_drm(adev),
+DRM_MODE_PROP_IMMUTABLE,
+"AMD_PLANE_LUT3D_SIZE", 0, UINT_MAX);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.plane_lut3d_size_property = prop;
+
return 0;
 }
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 65a9d62ffbe4..9d9dac26edfc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -391,6 +391,16 @@ struct amdgpu_mode_info {
 * @plane_hdr_mult_property:
 */
struct drm_property *plane_hdr_mult_property;
+   /**
+* @plane_lut3d_property: Plane property for gamma correction using a
+* 3D LUT (pre-blending).
+*/
+   struct drm_property *plane_lut3d_property;
+   /**
+* @plane_lut3d_size_property: Plane property to define the max
+* size of 3D LUT as supported by the driver (read-only).
+*/
+   struct drm_property *plane_lut3d_size_property;
 #endif
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index bb7307b9cfd5..b0ba0279dc25 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -748,6 +748,11 @@ struct dm_plane_state {
 * S31.32 sign-magnitude.
 */
__u64 hdr_mult;
+   /**
+* @lut3d: 3D lookup table blob. The blob (if not NULL) is an array of
+*  drm_color_lut.
+*/
+   struct drm_property_blob *lut3d;
 #endif
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index 57169dae8b3d..0e418e161b0b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -1347,7 +1347,10 @@ dm_drm_plane_duplicate_state(struct drm_plane *plane)
 #ifdef CONFIG_STEAM_DECK
if (dm_plane_state->degamma_lut)
drm_property_blob_get(dm_plane_state->degamma_lut);
+   if (dm_plane_state->lut3d)
+   drm_property_blob_get(dm_plane_state->lut3d);
 #endif
+
return _plane_state->base;
 }
 
@@ -1416,7 +1419,9 @@ static void dm_drm_plane_destroy_state(struct drm_plane 
*plane,
struct dm_plane_state *dm_plane_state = to_dm_plane_state(state);
 #ifdef CONFIG_STEAM_DECK
drm_property_blob_put(dm_plane_state->degamma_lut);
+   drm_property_blob_put(dm_plane_state->lut3d);
 #endif
+
if (dm_plane_state->dc_state)
dc_plane_state_release(dm_plane_state->dc_state);
 
@@ -1488,6 +1493,14 @@ dm_plane_attach_color_mgmt_properties(struct 
amdgpu_display_manager *dm,
drm_object_attach_property(>base,
   dm->adev->mode_info.plane_hdr_mult_property,
   AMDGPU_HDR_MULT_DEFAULT);
+
+   if (dm->dc->caps.color.dpp.hw_3d_lut) {
+   drm_object_attach_property(&plane->base,
+  
dm->adev->mode_info.plane_lut3d_property, 0);
+   drm_object_attach_property(&plane->base,
+  
dm->adev->mode_info.plane_lut3d_size_property,
+  MAX_COLOR_3DLUT_ENTRIES);
+   }
 }
 
 static int
@@ -1518,6 +1531,14 @@ dm_atomic_plane_set_property(struct drm_plane *plane,

[RFC PATCH 12/40] drm/amd/display: add plane HDR multiplier driver-private property

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

Multiplier to 'gain' the plane. When PQ is decoded using the fixed func
transfer function to the internal FP16 fb, 1.0 -> 80 nits (on AMD at
least). When sRGB is decoded, 1.0 -> 1.0. Therefore, 1.0 multiplier = 80
nits for SDR content. So if you want 203 nits for SDR content, pass in
(203.0 / 80.0).
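
The kernel-doc added by this patch states that the value format is S31.32
sign-magnitude. A minimal sketch of how userspace might encode a gain
such as (203.0 / 80.0) is shown below. The helper names are hypothetical
and not part of the patch, and the layout (bit 63 as sign, 31 integer
bits, 32 fractional bits) is my reading of "S31.32 sign-magnitude".

```c
#include <assert.h>
#include <stdint.h>

#define S31_32_SIGN_BIT	(1ULL << 63)
#define S31_32_ONE	4294967296.0	/* 2^32 */

/* Hypothetical helper: encode a gain as S31.32 sign-magnitude.
 * The caller must keep the magnitude below 2^31. */
static uint64_t hdr_mult_to_s31_32(double gain)
{
	uint64_t sign = gain < 0.0 ? S31_32_SIGN_BIT : 0;
	double mag = gain < 0.0 ? -gain : gain;
	/* Scale by 2^32 and round to the nearest representable step. */
	uint64_t fixed = (uint64_t)(mag * S31_32_ONE + 0.5);

	return sign | (fixed & ~S31_32_SIGN_BIT);
}

/* Hypothetical helper: decode S31.32 sign-magnitude back to a double. */
static double s31_32_to_double(uint64_t v)
{
	double mag = (double)(v & ~S31_32_SIGN_BIT) / S31_32_ONE;

	return (v & S31_32_SIGN_BIT) ? -mag : mag;
}
```

Under this reading, a gain of 1.0 encodes as 0x100000000, and
hdr_mult_to_s31_32(203.0 / 80.0) gives the value to pass for 203-nit SDR
content.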

Co-developed-by: Melissa Wen 
Signed-off-by: Melissa Wen 
Signed-off-by: Joshua Ashton 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  6 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  4 +++
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 12 +
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 25 ++-
 4 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 24595906dab1..dd658f162f6f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1326,6 +1326,12 @@ amdgpu_display_create_color_properties(struct 
amdgpu_device *adev)
return -ENOMEM;
adev->mode_info.plane_degamma_tf_property = prop;
 
+   prop = drm_property_create_range(adev_to_drm(adev),
+0, "AMD_PLANE_HDR_MULT", 0, UINT_MAX);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.plane_hdr_mult_property = prop;
+
return 0;
 }
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index ab9ce6f26c90..65a9d62ffbe4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -387,6 +387,10 @@ struct amdgpu_mode_info {
 * linearize content with or without LUT.
 */
struct drm_property *plane_degamma_tf_property;
+   /**
+* @plane_hdr_mult_property:
+*/
+   struct drm_property *plane_hdr_mult_property;
 #endif
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index 005632c1c9ec..bb7307b9cfd5 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -51,6 +51,7 @@
 
 #define AMDGPU_DMUB_NOTIFICATION_MAX 5
 
+#define AMDGPU_HDR_MULT_DEFAULT (0x1LL)
 /*
 #include "include/amdgpu_dal_power_if.h"
 #include "amdgpu_dm_irq.h"
@@ -736,6 +737,17 @@ struct dm_plane_state {
 * linearize.
 */
enum drm_transfer_function degamma_tf;
+   /**
+* @hdr_mult:
+*
+* Multiplier to 'gain' the plane.  When PQ is decoded using the fixed
+* func transfer function to the internal FP16 fb, 1.0 -> 80 nits (on
+* AMD at least). When sRGB is decoded, 1.0 -> 1.0, obviously.
+* Therefore, a 1.0 multiplier = 80 nits for SDR content.  So if you
+* want 203 nits for SDR content, pass in (203.0 / 80.0).  Format is
+* S31.32 sign-magnitude.
+*/
+   __u64 hdr_mult;
 #endif
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index 5b458cc0781c..57169dae8b3d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -1321,8 +1321,10 @@ static void dm_drm_plane_reset(struct drm_plane *plane)
__drm_atomic_helper_plane_reset(plane, &amdgpu_state->base);
 
 #ifdef CONFIG_STEAM_DECK
-   if (amdgpu_state)
+   if (amdgpu_state) {
amdgpu_state->degamma_tf = DRM_TRANSFER_FUNCTION_DEFAULT;
+   amdgpu_state->hdr_mult = AMDGPU_HDR_MULT_DEFAULT;
+   }
 #endif
 }
 
@@ -1424,11 +1426,11 @@ static void dm_drm_plane_destroy_state(struct drm_plane 
*plane,
 #ifdef CONFIG_STEAM_DECK
 int
 amdgpu_dm_replace_property_blob_from_id(struct drm_device *dev,
-  struct drm_property_blob **blob,
-  uint64_t blob_id,
-  ssize_t expected_size,
-  ssize_t expected_elem_size,
-  bool *replaced)
+   struct drm_property_blob **blob,
+   uint64_t blob_id,
+   ssize_t expected_size,
+   ssize_t expected_elem_size,
+   bool *replaced)
 {
struct drm_property_blob *new_blob = NULL;
 
@@ -1482,6 +1484,10 @@ dm_plane_attach_color_mgmt_properties(struct 
amdgpu_display_manager *dm,
   
dm->adev->mode_info.plane_degamma_tf_property,
   DRM_TRANSFER_FUNCTION_DEFAULT);
}
+   /* HDR MULT is always available */
+   drm_object_attach_property(&plane->base,
+

[RFC PATCH 08/40] drm/drm_plane: track color mgmt changes per plane

2023-04-23 Thread Melissa Wen

We will add color mgmt properties to DRM planes in the next patches and
we want to track when one of these properties changes to define atomic
commit behaviors. Using an approach similar to CRTC color props, we set
a color_mgmt_changed boolean whenever a plane color prop changes.

Signed-off-by: Melissa Wen 
---
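A minimal sketch of the intended flow, using stand-in types rather than
the real DRM structs: the flag is cleared when the state is duplicated and
set by a property setter when a color value actually changes. The
sketch_* names and the hdr_mult field are illustrative assumptions, not
code from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the relevant part of a plane state; names are illustrative. */
struct sketch_plane_state {
	uint64_t hdr_mult;
	bool color_mgmt_changed;
};

/* Mirrors the duplicate_state hunk: the copy starts with the flag cleared. */
static struct sketch_plane_state
sketch_duplicate_state(const struct sketch_plane_state *cur)
{
	struct sketch_plane_state dup;

	assert(cur != NULL);
	dup = *cur;
	dup.color_mgmt_changed = false;
	return dup;
}

/* A property setter marks the state dirty only when the value changes. */
static void sketch_set_hdr_mult(struct sketch_plane_state *state, uint64_t val)
{
	assert(state != NULL);
	if (state->hdr_mult != val) {
		state->hdr_mult = val;
		state->color_mgmt_changed = true;
	}
}
```

A commit path could then test color_mgmt_changed on the new state to
decide whether the color pipeline needs reprogramming.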
 drivers/gpu/drm/drm_atomic.c  | 1 +
 drivers/gpu/drm/drm_atomic_state_helper.c | 1 +
 include/drm/drm_plane.h   | 7 +++
 3 files changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index c0dc5858a723..da2429470c4f 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -724,6 +724,7 @@ static void drm_atomic_plane_print_state(struct drm_printer 
*p,
   drm_get_color_encoding_name(state->color_encoding));
drm_printf(p, "\tcolor-range=%s\n",
   drm_get_color_range_name(state->color_range));
+   drm_printf(p, "\tcolor_mgmt_changed=%d\n", state->color_mgmt_changed);
 
if (plane->funcs->atomic_print_state)
plane->funcs->atomic_print_state(p, state);
diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c b/drivers/gpu/drm/drm_atomic_state_helper.c
index dfb57217253b..3df4c96a902e 100644
--- a/drivers/gpu/drm/drm_atomic_state_helper.c
+++ b/drivers/gpu/drm/drm_atomic_state_helper.c
@@ -338,6 +338,7 @@ void __drm_atomic_helper_plane_duplicate_state(struct 
drm_plane *plane,
state->fence = NULL;
state->commit = NULL;
state->fb_damage_clips = NULL;
+   state->color_mgmt_changed = false;
 }
 EXPORT_SYMBOL(__drm_atomic_helper_plane_duplicate_state);
 
diff --git a/include/drm/drm_plane.h b/include/drm/drm_plane.h
index 447e664e49d5..6c97380b8c76 100644
--- a/include/drm/drm_plane.h
+++ b/include/drm/drm_plane.h
@@ -237,6 +237,13 @@ struct drm_plane_state {
 
/** @state: backpointer to global drm_atomic_state */
struct drm_atomic_state *state;
+
+   /**
+* @color_mgmt_changed: Color management properties have changed. Used
+* by the atomic helpers and drivers to steer the atomic commit control
+* flow.
+*/
+   bool color_mgmt_changed : 1;
 };
 
 static inline struct drm_rect
-- 
2.39.2

[RFC PATCH 11/40] drm/amd/display: add plane degamma TF driver-private property

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

Allow userspace to tell the kernel driver the input space so that the
driver uses the correct predefined transfer function (TF) to linearize
content with or without LUT (using hardcoded curve caps).

Signed-off-by: Joshua Ashton 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  9 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  5 
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  7 +
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 28 +++
 4 files changed, 49 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 996c9c3fd471..24595906dab1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1317,6 +1317,15 @@ amdgpu_display_create_color_properties(struct 
amdgpu_device *adev)
return -ENOMEM;
adev->mode_info.plane_degamma_lut_size_property = prop;
 
+   prop = drm_property_create_enum(adev_to_drm(adev),
+   DRM_MODE_PROP_ENUM,
+   "AMD_PLANE_DEGAMMA_TF",
+   drm_transfer_function_enum_list,
+   
ARRAY_SIZE(drm_transfer_function_enum_list));
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.plane_degamma_tf_property = prop;
+
return 0;
 }
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index d4e609a8b67e..ab9ce6f26c90 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -382,6 +382,11 @@ struct amdgpu_mode_info {
 * size of degamma LUT as supported by the driver (read-only).
 */
struct drm_property *plane_degamma_lut_size_property;
+   /**
+* @plane_degamma_tf_property: Predefined transfer function to
+* linearize content with or without LUT.
+*/
+   struct drm_property *plane_degamma_tf_property;
 #endif
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index b1d0c65d821d..005632c1c9ec 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -729,6 +729,13 @@ struct dm_plane_state {
 * The blob (if not NULL) is an array of &struct drm_color_lut.
 */
struct drm_property_blob *degamma_lut;
+   /**
+* @degamma_tf:
+*
+* Predefined transfer function to tell DC driver the input space to
+* linearize.
+*/
+   enum drm_transfer_function degamma_tf;
 #endif
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index 7b9d62c70b30..5b458cc0781c 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -1319,6 +1319,11 @@ static void dm_drm_plane_reset(struct drm_plane *plane)
 
if (amdgpu_state)
__drm_atomic_helper_plane_reset(plane, &amdgpu_state->base);
+
+#ifdef CONFIG_STEAM_DECK
+   if (amdgpu_state)
+   amdgpu_state->degamma_tf = DRM_TRANSFER_FUNCTION_DEFAULT;
+#endif
 }
 
 static struct drm_plane_state *
@@ -1450,6 +1455,19 @@ amdgpu_dm_replace_property_blob_from_id(struct 
drm_device *dev,
return 0;
 }
 
+static const struct drm_prop_enum_list drm_transfer_function_enum_list[] = {
+   { DRM_TRANSFER_FUNCTION_DEFAULT, "Default" },
+   { DRM_TRANSFER_FUNCTION_SRGB, "sRGB" },
+   { DRM_TRANSFER_FUNCTION_BT709, "BT.709" },
+   { DRM_TRANSFER_FUNCTION_PQ, "PQ (Perceptual Quantizer)" },
+   { DRM_TRANSFER_FUNCTION_LINEAR, "Linear" },
+   { DRM_TRANSFER_FUNCTION_UNITY, "Unity" },
+   { DRM_TRANSFER_FUNCTION_HLG, "HLG (Hybrid Log Gamma)" },
+   { DRM_TRANSFER_FUNCTION_GAMMA22, "Gamma 2.2" },
+   { DRM_TRANSFER_FUNCTION_GAMMA24, "Gamma 2.4" },
+   { DRM_TRANSFER_FUNCTION_GAMMA26, "Gamma 2.6" },
+};
+
 static void
 dm_plane_attach_color_mgmt_properties(struct amdgpu_display_manager *dm,
  struct drm_plane *plane)
@@ -1460,6 +1478,9 @@ dm_plane_attach_color_mgmt_properties(struct 
amdgpu_display_manager *dm,
drm_object_attach_property(&plane->base,
   
dm->adev->mode_info.plane_degamma_lut_size_property,
   MAX_COLOR_LUT_ENTRIES);
+   drm_object_attach_property(&plane->base,
+  
dm->adev->mode_info.plane_degamma_tf_property,
+  DRM_TRANSFER_FUNCTION_DEFAULT);
}
 }
 
@@ -1481,6 +1502,11 @@ dm_atomic_plane_set_property(struct drm_plane *plane,
);

[RFC PATCH 09/40] drm/amd/display: move replace blob func to dm plane

2023-04-23 Thread Melissa Wen

Move the replace-blob helper to amdgpu_dm_plane, so that it can be used
for both CRTC and plane color properties. We are adding new plane
properties for AMD driver-private color mgmt.

Signed-off-by: Melissa Wen 
---
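The size validation done by the helper being moved can be restated as a
small standalone check. This is only a sketch: sketch_blob_length_ok() and
sketch_blob_entries() are illustrative names, and the 8-byte element size
in the usage note assumes struct drm_color_lut is four 16-bit fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Illustrative restatement of the size checks in
 * amdgpu_dm_replace_property_blob_from_id(); a non-positive expectation
 * disables the corresponding check.
 */
static bool sketch_blob_length_ok(size_t length, long expected_size,
				  long expected_elem_size)
{
	if (expected_size > 0 && length != (size_t)expected_size)
		return false;
	if (expected_elem_size > 0 && length % (size_t)expected_elem_size != 0)
		return false;
	return true;
}

/* Number of whole elements in a blob that passed the check above. */
static size_t sketch_blob_entries(size_t length, size_t elem_size)
{
	assert(elem_size > 0 && length % elem_size == 0);
	return length / elem_size;
}
```

The CRTC callers pass -1 and sizeof(struct drm_color_lut), so any blob
whose length is a whole number of LUT entries is accepted; with 8-byte
entries, a 32768-byte blob holds 4096 of them.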
 .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c| 37 +--
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   | 35 ++
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.h   |  7 
 3 files changed, 44 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
index 79324fbab1f1..27d7a8b18013 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
@@ -344,39 +344,6 @@ dm_crtc_additional_color_mgmt(struct drm_crtc *crtc)
   DRM_TRANSFER_FUNCTION_DEFAULT);
 }
 
-static int
-atomic_replace_property_blob_from_id(struct drm_device *dev,
-struct drm_property_blob **blob,
-uint64_t blob_id,
-ssize_t expected_size,
-ssize_t expected_elem_size,
-bool *replaced)
-{
-   struct drm_property_blob *new_blob = NULL;
-
-   if (blob_id != 0) {
-   new_blob = drm_property_lookup_blob(dev, blob_id);
-   if (new_blob == NULL)
-   return -EINVAL;
-
-   if (expected_size > 0 &&
-   new_blob->length != expected_size) {
-   drm_property_blob_put(new_blob);
-   return -EINVAL;
-   }
-   if (expected_elem_size > 0 &&
-   new_blob->length % expected_elem_size != 0) {
-   drm_property_blob_put(new_blob);
-   return -EINVAL;
-   }
-   }
-
-   *replaced |= drm_property_replace_blob(blob, new_blob);
-   drm_property_blob_put(new_blob);
-
-   return 0;
-}
-
 static int
 amdgpu_dm_atomic_crtc_set_property(struct drm_crtc *crtc,
   struct drm_crtc_state *state,
@@ -389,7 +356,7 @@ amdgpu_dm_atomic_crtc_set_property(struct drm_crtc *crtc,
int ret;
 
if (property == adev->mode_info.shaper_lut_property) {
-   ret = atomic_replace_property_blob_from_id(crtc->dev,
+   ret = amdgpu_dm_replace_property_blob_from_id(crtc->dev,
&acrtc_state->shaper_lut,
val,
-1, sizeof(struct drm_color_lut),
@@ -397,7 +364,7 @@ amdgpu_dm_atomic_crtc_set_property(struct drm_crtc *crtc,
acrtc_state->base.color_mgmt_changed |= replaced;
return ret;
} else if (property == adev->mode_info.lut3d_property) {
-   ret = atomic_replace_property_blob_from_id(crtc->dev,
+   ret = amdgpu_dm_replace_property_blob_from_id(crtc->dev,
&acrtc_state->lut3d,
val,
-1, sizeof(struct drm_color_lut),
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
index 322668973747..4e5498153be2 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_plane.c
@@ -1411,6 +1411,41 @@ static void dm_drm_plane_destroy_state(struct drm_plane 
*plane,
drm_atomic_helper_plane_destroy_state(plane, state);
 }
 
+#ifdef CONFIG_STEAM_DECK
+int
+amdgpu_dm_replace_property_blob_from_id(struct drm_device *dev,
+  struct drm_property_blob **blob,
+  uint64_t blob_id,
+  ssize_t expected_size,
+  ssize_t expected_elem_size,
+  bool *replaced)
+{
+   struct drm_property_blob *new_blob = NULL;
+
+   if (blob_id != 0) {
+   new_blob = drm_property_lookup_blob(dev, blob_id);
+   if (new_blob == NULL)
+   return -EINVAL;
+
+   if (expected_size > 0 &&
+   new_blob->length != expected_size) {
+   drm_property_blob_put(new_blob);
+   return -EINVAL;
+   }
+   if (expected_elem_size > 0 &&
+   new_blob->length % expected_elem_size != 0) {
+   drm_property_blob_put(new_blob);
+   return -EINVAL;
+   }
+   }
+
+   *replaced |= drm_property_replace_blob(blob, new_blob);
+   drm_property_blob_put(new_blob);
+
+   return 0;
+}
+#endif
+
 static const struct

[RFC PATCH 07/40] drm/amd/display: add CRTC gamma TF to driver-private props

2023-04-23 Thread Melissa Wen

From: Joshua Ashton 

Add predefined transfer function property to DRM CRTC gamma to convert
to wire encoding with or without gamma LUT.

Co-developed-by: Melissa Wen 
Signed-off-by: Melissa Wen 
Signed-off-by: Joshua Ashton 
---
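Userspace usually resolves an enum property value by matching the names
exposed in the property's enum list. A minimal sketch of that lookup
follows; the sketch_* types and the shortened table are stand-ins for
illustration, not the definitions added by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shortened stand-in for the transfer function enum in this patch. */
enum sketch_tf {
	SKETCH_TF_DEFAULT,
	SKETCH_TF_SRGB,
	SKETCH_TF_PQ,
};

struct sketch_enum_entry {
	int type;
	const char *name;
};

/* Names copied from the enum list in the patch; the table is truncated. */
static const struct sketch_enum_entry sketch_tf_list[] = {
	{ SKETCH_TF_DEFAULT, "Default" },
	{ SKETCH_TF_SRGB, "sRGB" },
	{ SKETCH_TF_PQ, "PQ (Perceptual Quantizer)" },
};

/* Return the enum value for a given name, or -1 if it is not listed. */
static int sketch_tf_from_name(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(sketch_tf_list) / sizeof(sketch_tf_list[0]); i++) {
		if (strcmp(sketch_tf_list[i].name, name) == 0)
			return sketch_tf_list[i].type;
	}
	return -1;
}
```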
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 22 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  4 
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 23 +++
 .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c| 13 +++
 4 files changed, 62 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 2abe5fe87c10..1913903cab88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1248,6 +1248,19 @@ amdgpu_display_user_framebuffer_create(struct drm_device 
*dev,
 }
 
 #ifdef CONFIG_STEAM_DECK
+static const struct drm_prop_enum_list drm_transfer_function_enum_list[] = {
+   { DRM_TRANSFER_FUNCTION_DEFAULT, "Default" },
+   { DRM_TRANSFER_FUNCTION_SRGB, "sRGB" },
+   { DRM_TRANSFER_FUNCTION_BT709, "BT.709" },
+   { DRM_TRANSFER_FUNCTION_PQ, "PQ (Perceptual Quantizer)" },
+   { DRM_TRANSFER_FUNCTION_LINEAR, "Linear" },
+   { DRM_TRANSFER_FUNCTION_UNITY, "Unity" },
+   { DRM_TRANSFER_FUNCTION_HLG, "HLG (Hybrid Log Gamma)" },
+   { DRM_TRANSFER_FUNCTION_GAMMA22, "Gamma 2.2" },
+   { DRM_TRANSFER_FUNCTION_GAMMA24, "Gamma 2.4" },
+   { DRM_TRANSFER_FUNCTION_GAMMA26, "Gamma 2.6" },
+};
+
 static int
 amdgpu_display_create_color_properties(struct amdgpu_device *adev)
 {
@@ -1281,6 +1294,15 @@ amdgpu_display_create_color_properties(struct 
amdgpu_device *adev)
return -ENOMEM;
adev->mode_info.lut3d_size_property = prop;
 
+   prop = drm_property_create_enum(adev_to_drm(adev),
+   DRM_MODE_PROP_ENUM,
+   "GAMMA_TF",
+   drm_transfer_function_enum_list,
+   
ARRAY_SIZE(drm_transfer_function_enum_list));
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.gamma_tf_property = prop;
+
return 0;
 }
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 205fa4f5bea7..76337e18c728 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -368,6 +368,10 @@ struct amdgpu_mode_info {
 * LUT as supported by the driver (read-only).
 */
struct drm_property *lut3d_size_property;
+   /**
+* @gamma_tf_property: Transfer function for CRTC regamma.
+*/
+   struct drm_property *gamma_tf_property;
 #endif
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index 09c3e1858b56..1e90a2dd445e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -699,6 +699,23 @@ static inline void amdgpu_dm_set_mst_status(uint8_t 
*status,
 
 extern const struct amdgpu_ip_block_version dm_ip_block;
 
+#ifdef CONFIG_STEAM_DECK
+enum drm_transfer_function {
+   DRM_TRANSFER_FUNCTION_DEFAULT,
+
+   DRM_TRANSFER_FUNCTION_SRGB,
+   DRM_TRANSFER_FUNCTION_BT709,
+   DRM_TRANSFER_FUNCTION_PQ,
+   DRM_TRANSFER_FUNCTION_LINEAR,
+   DRM_TRANSFER_FUNCTION_UNITY,
+   DRM_TRANSFER_FUNCTION_HLG,
+   DRM_TRANSFER_FUNCTION_GAMMA22,
+   DRM_TRANSFER_FUNCTION_GAMMA24,
+   DRM_TRANSFER_FUNCTION_GAMMA26,
+   DRM_TRANSFER_FUNCTION_MAX,
+};
+#endif
+
 struct dm_plane_state {
struct drm_plane_state base;
struct dc_plane_state *dc_state;
@@ -751,6 +768,12 @@ struct dm_crtc_state {
 * &struct drm_color_lut.
 */
struct drm_property_blob *lut3d;
+/**
+* @gamma_tf:
+*
+* Pre-defined transfer function for converting internal FB -> wire 
encoding.
+*/
+   enum drm_transfer_function gamma_tf;
 #endif
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
index 0e1280228e6e..79324fbab1f1 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
@@ -272,6 +272,7 @@ static struct drm_crtc_state 
*dm_crtc_duplicate_state(struct drm_crtc *crtc)
 #ifdef CONFIG_STEAM_DECK
state->shaper_lut = cur->shaper_lut;
state->lut3d = cur->lut3d;
+   state->gamma_tf = cur->gamma_tf;
 
if (state->shaper_lut)
drm_property_blob_get(state->shaper_lut);
@@ -336,6 +337,11 @@ dm_crtc_additional_color_mgmt(struct drm_crtc *crtc)
   adev->mode_info.lut3d_size_property,
   MAX_COLOR_3DLUT_ENTRIES);
}
+
+

[RFC PATCH 05/40] drm/amd/display: add shaper LUT driver-private props

2023-04-23 Thread Melissa Wen

CRTC shaper LUT shapes the content after blending, i.e., de-linearizes
or normalizes space before applying a 3D LUT color correction. In the
next patch, we add CRTC 3D LUT property to DRM color management after
this shaper LUT and before the current CRTC gamma LUT.

Signed-off-by: Melissa Wen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  28 
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  |  14 ++
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  17 +++
 .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c| 122 +-
 4 files changed, 179 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 8632ab695a6c..44c22cb87dde 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1247,6 +1247,30 @@ amdgpu_display_user_framebuffer_create(struct drm_device 
*dev,
return &amdgpu_fb->base;
 }
 
+#ifdef CONFIG_STEAM_DECK
+static int
+amdgpu_display_create_color_properties(struct amdgpu_device *adev)
+{
+   struct drm_property *prop;
+
+   prop = drm_property_create(adev_to_drm(adev),
+  DRM_MODE_PROP_BLOB,
+  "AMD_SHAPER_LUT", 0);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.shaper_lut_property = prop;
+
+   prop = drm_property_create_range(adev_to_drm(adev),
+DRM_MODE_PROP_IMMUTABLE,
+"AMD_SHAPER_LUT_SIZE", 0, UINT_MAX);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.shaper_lut_size_property = prop;
+
+   return 0;
+}
+#endif
+
 const struct drm_mode_config_funcs amdgpu_mode_funcs = {
.fb_create = amdgpu_display_user_framebuffer_create,
 };
@@ -1323,6 +1347,10 @@ int amdgpu_display_modeset_create_props(struct 
amdgpu_device *adev)
return -ENOMEM;
}
 
+#ifdef CONFIG_STEAM_DECK
+   if (amdgpu_display_create_color_properties(adev))
+   return -ENOMEM;
+#endif
return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index b8633df418d4..1fd3497af3b5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -344,6 +344,20 @@ struct amdgpu_mode_info {
int disp_priority;
const struct amdgpu_display_funcs *funcs;
const enum drm_plane_type *plane_type;
+
+   /* Driver-private color mgmt props */
+#ifdef CONFIG_STEAM_DECK
+   /**
+* @shaper_lut_property: CRTC property to set post-blending shaper LUT
+* that converts content before 3D LUT gamma correction.
+*/
+   struct drm_property *shaper_lut_property;
+   /**
+* @shaper_lut_size_property: CRTC property for the size of
+* post-blending shaper LUT as supported by the driver (read-only).
+*/
+   struct drm_property *shaper_lut_size_property;
+#endif
 };
 
 #define AMDGPU_MAX_BL_LEVEL 0xFF
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index 2e2413fd73a4..de63455896cc 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -726,6 +726,23 @@ struct dm_crtc_state {
struct dc_info_packet vrr_infopacket;
 
int abm_level;
+
+#ifdef CONFIG_STEAM_DECK
+   /* AMD driver-private color mgmt pipeline
+*
+* DRM provides CRTC degamma/ctm/gamma color mgmt features, but AMD HW
+* has a larger set of post-blending color calibration features, as
+* below:
+*/
+   /**
+* @shaper_lut:
+*
+* Lookup table used to de-linearize pixel data for gamma correction.
+* See drm_crtc_enable_color_mgmt(). The blob (if not NULL) is an array
+* of &struct drm_color_lut.
+*/
+   struct drm_property_blob *shaper_lut;
+#endif
 };
 
 #define to_dm_crtc_state(x) container_of(x, struct dm_crtc_state, base)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
index e3762e806617..503433e5cb38 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
@@ -229,7 +229,9 @@ static void dm_crtc_destroy_state(struct drm_crtc *crtc,
if (cur->stream)
dc_stream_release(cur->stream);
 
-
+#ifdef CONFIG_STEAM_DECK
+   drm_property_blob_put(cur->shaper_lut);
+#endif
__drm_atomic_helper_crtc_destroy_state(state);
 
 
@@ -266,7 +268,12 @@ static struct drm_crtc_state 
*dm_crtc_duplicate_state(struct drm_crtc *crtc)
state->crc_skip_count = cur->crc_skip_count;
state->mpo_requested = cur->mpo_requested;
/* TODO Duplicate dc_stream after objects are

[RFC PATCH 03/40] drm/amd/display: introduce Steam Deck color features to AMD display driver

2023-04-23 Thread Melissa Wen

We are enabling a large set of color calibration features to enhance KMS
color mgmt, but these properties are specific to AMD display HW and
cannot be provided by other vendors. Therefore, add a config option to
enable the AMD driver-private properties used in the Steam Deck color
mgmt pipeline.

Co-developed-by: Joshua Ashton 
Signed-off-by: Joshua Ashton 
Signed-off-by: Melissa Wen 
---
 drivers/gpu/drm/amd/display/Kconfig | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/Kconfig b/drivers/gpu/drm/amd/display/Kconfig
index 06b438217c61..c45a8deb1098 100644
--- a/drivers/gpu/drm/amd/display/Kconfig
+++ b/drivers/gpu/drm/amd/display/Kconfig
@@ -53,5 +53,11 @@ config DRM_AMD_SECURE_DISPLAY
 of crc of specific region via debugfs.
 Cooperate with specific DMCU FW.
 
+config STEAM_DECK
+   bool "Enable color calibration features for Steam Deck"
+   depends on DRM_AMD_DC
+   help
+ Choose this option if you want to use AMDGPU features for broader
+ color management support on Steam Deck.
 
 endmenu
-- 
2.39.2

[RFC PATCH 06/40] drm/amd/display: add 3D LUT driver-private props

2023-04-23 Thread Melissa Wen

Add CRTC 3D LUT for gamma correction using a 3D lookup table. A shaper
LUT must be set to shape the content for a non-linear space. Those
details should be handled by the driver according to HW color
capabilities.

Signed-off-by: Melissa Wen 
---
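The MAX_COLOR_3DLUT_ENTRIES value added below follows from the 17x17x17
grid noted in the patch. A small sketch of that arithmetic, with
illustrative helper names; the byte-size helper assumes one 8-byte
struct drm_color_lut per entry, which is how the blob is described here.

```c
#include <assert.h>
#include <stddef.h>

/* Entries in a cubic 3D LUT with `dim` samples per color axis. */
static size_t sketch_lut3d_entries(size_t dim)
{
	assert(dim > 0);
	return dim * dim * dim;
}

/* Blob size in bytes for a given per-entry size (assumed, see above). */
static size_t sketch_lut3d_bytes(size_t dim, size_t entry_size)
{
	return sketch_lut3d_entries(dim) * entry_size;
}
```

For dim = 17 this gives 4913 entries, matching the define, and 39304
bytes at 8 bytes per entry.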
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 14 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h  | 11 ++
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 13 
 .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c| 20 +++
 4 files changed, 58 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 44c22cb87dde..2abe5fe87c10 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1267,6 +1267,20 @@ amdgpu_display_create_color_properties(struct 
amdgpu_device *adev)
return -ENOMEM;
adev->mode_info.shaper_lut_size_property = prop;
 
+   prop = drm_property_create(adev_to_drm(adev),
+  DRM_MODE_PROP_BLOB,
+  "AMD_LUT3D", 0);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.lut3d_property = prop;
+
+   prop = drm_property_create_range(adev_to_drm(adev),
+DRM_MODE_PROP_IMMUTABLE,
+"AMD_LUT3D_SIZE", 0, UINT_MAX);
+   if (!prop)
+   return -ENOMEM;
+   adev->mode_info.lut3d_size_property = prop;
+
return 0;
 }
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 1fd3497af3b5..205fa4f5bea7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -357,6 +357,17 @@ struct amdgpu_mode_info {
 * post-blending shaper LUT as supported by the driver (read-only).
 */
struct drm_property *shaper_lut_size_property;
+   /**
+* @lut3d_property: CRTC property to set post-blending 3D LUT gamma
+* correction; a shaper LUT can be used before applying 3D LUT to
+* delinearize content.
+*/
+   struct drm_property *lut3d_property;
+   /**
+* @lut3d_size_property: CRTC property for the size of post-blending 3D
+* LUT as supported by the driver (read-only).
+*/
+   struct drm_property *lut3d_size_property;
 #endif
 };
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index de63455896cc..09c3e1858b56 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -742,6 +742,15 @@ struct dm_crtc_state {
 * of &struct drm_color_lut.
 */
struct drm_property_blob *shaper_lut;
+   /**
+* @lut3d:
+*
+* 3D Lookup table for converting pixel data. Position where it takes
+* place depends on hw design, after @ctm or @gamma_lut. See
+* drm_crtc_enable_color_mgmt(). The blob (if not NULL) is an array of
+* &struct drm_color_lut.
+*/
+   struct drm_property_blob *lut3d;
 #endif
 };
 
@@ -804,6 +813,10 @@ void amdgpu_dm_update_freesync_caps(struct drm_connector 
*connector,
 
 void amdgpu_dm_trigger_timing_sync(struct drm_device *dev);
 
+/* 3D LUT max size is 17x17x17 */
+#define MAX_COLOR_3DLUT_ENTRIES 4913
+#define MAX_COLOR_3DLUT_BITDEPTH 12
+/* 1D LUT degamma, regamma and shaper*/
 #define MAX_COLOR_LUT_ENTRIES 4096
 /* Legacy gamm LUT users such as X doesn't like large LUT sizes */
 #define MAX_COLOR_LEGACY_LUT_ENTRIES 256
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
index 503433e5cb38..0e1280228e6e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c
@@ -231,6 +231,7 @@ static void dm_crtc_destroy_state(struct drm_crtc *crtc,
 
 #ifdef CONFIG_STEAM_DECK
drm_property_blob_put(cur->shaper_lut);
+   drm_property_blob_put(cur->lut3d);
 #endif
__drm_atomic_helper_crtc_destroy_state(state);
 
@@ -270,9 +271,12 @@ static struct drm_crtc_state 
*dm_crtc_duplicate_state(struct drm_crtc *crtc)
/* TODO Duplicate dc_stream after objects are stream object is 
flattened */
 #ifdef CONFIG_STEAM_DECK
state->shaper_lut = cur->shaper_lut;
+   state->lut3d = cur->lut3d;
 
if (state->shaper_lut)
drm_property_blob_get(state->shaper_lut);
+   if (state->lut3d)
+   drm_property_blob_get(state->lut3d);
 #endif
return &state->base;
 }
@@ -326,6 +330,11 @@ dm_crtc_additional_color_mgmt(struct drm_crtc *crtc)
drm_object_attach_property(&crtc->base,
   
adev->mode_info.shaper_lut_size_property,
   MAX_COLOR_LUT_ENTRIES);
+

[RFC PATCH 04/40] drm/drm_mode_object: increase max objects to accommodate new color props

2023-04-23 Thread Melissa Wen

In the next patches we are adding 17 new properties for color
correction:
- CRTC: 3D LUT+size, shaper LUT+size, regamma TF (5)
- Plane: degamma LUT+size+TF, HDR multiplier, shaper LUT+size+TF,
  3D LUT+size, blend LUT+size+TF (12)
We still need to detach the driver-private counter from the
DRM/KMS-generic one; for now, increase max objs to 41.

Signed-off-by: Melissa Wen 
---
 include/drm/drm_mode_object.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/drm/drm_mode_object.h b/include/drm/drm_mode_object.h
index 912f1e415685..7e4fb7536c6a 100644
--- a/include/drm/drm_mode_object.h
+++ b/include/drm/drm_mode_object.h
@@ -60,7 +60,7 @@ struct drm_mode_object {
void (*free_cb)(struct kref *kref);
 };
 
-#define DRM_OBJECT_MAX_PROPERTY 24
+#define DRM_OBJECT_MAX_PROPERTY 47
 /**
 * struct drm_object_properties - property tracking for &drm_mode_object
  */
-- 
2.39.2

[RFC PATCH 01/40] drm/amd/display: fix segment distribution for linear LUTs

2023-04-23 Thread Melissa Wen

From: Harry Wentland 

The region and segment calculation was incapable of dealing
with regions of more than 16 segments. We first fix this.

Now that we can support regions up to 256 elements we can
define a better segment distribution for near-linear LUTs
for our maximum of 256 HW-supported points.

With these changes an "identity" LUT looks visually
indistinguishable from bypass and allows us to use
our 3DLUT.

Signed-off-by: Harry Wentland 
---
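A rough sketch of the sub-index interpolation used in the loop below,
written with doubles instead of struct fixed31_32. The hunk is cut off in
this archive, so adding the scaled delta back onto the lower point is an
assumption about the part that is not shown; the sketch_* names are
illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Sub-steps per table entry, matching the (1 << 4) scaling in the patch. */
#define SKETCH_SUBSTEPS 16

/*
 * Linearly interpolate between pts[i >> 4] and pts[(i >> 4) + 1], using the
 * low four bits of the scaled index i as the blend weight.
 */
static double sketch_interp(const double *pts, size_t n, size_t i)
{
	size_t idx = i >> 4;
	size_t t = i & 0xf;

	assert(idx + 1 < n);
	return pts[idx] + (pts[idx + 1] - pts[idx]) * (double)t / SKETCH_SUBSTEPS;
}

/* Scaled loop increment for a region with (1 << seg_distr) hardware points. */
static size_t sketch_increment(size_t sw_segments, unsigned int seg_distr)
{
	return (sw_segments << 4) / ((size_t)1 << seg_distr);
}
```

With 16 software segments, a region with seg_distr = 7 steps the scaled
index by 2, so most hardware points fall between two tf_pts entries and
are interpolated rather than read directly.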
 .../amd/display/dc/dcn10/dcn10_cm_common.c| 95 +++
 1 file changed, 76 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c
index 7a00fe525dfb..f27413e94280 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c
@@ -346,20 +346,37 @@ bool cm_helper_translate_curve_to_hw_format(
 * segment is from 2^-10 to 2^1
 * There are less than 256 points, for optimization
 */
-   seg_distr[0] = 3;
-   seg_distr[1] = 4;
-   seg_distr[2] = 4;
-   seg_distr[3] = 4;
-   seg_distr[4] = 4;
-   seg_distr[5] = 4;
-   seg_distr[6] = 4;
-   seg_distr[7] = 4;
-   seg_distr[8] = 4;
-   seg_distr[9] = 4;
-   seg_distr[10] = 1;
-
-   region_start = -10;
-   region_end = 1;
+   if (output_tf->tf == TRANSFER_FUNCTION_LINEAR) {
+   seg_distr[0] = 0; /* 2 */
+   seg_distr[1] = 1; /* 4 */
+   seg_distr[2] = 2; /* 4 */
+   seg_distr[3] = 3; /* 8 */
+   seg_distr[4] = 4; /* 16 */
+   seg_distr[5] = 5; /* 32 */
+   seg_distr[6] = 6; /* 64 */
+   seg_distr[7] = 7; /* 128 */
+
+   region_start = -8;
+   region_end = 1;
+   } else {
+   seg_distr[0] = 3; /* 8 */
+   seg_distr[1] = 4; /* 16 */
+   seg_distr[2] = 4;
+   seg_distr[3] = 4;
+   seg_distr[4] = 4;
+   seg_distr[5] = 4;
+   seg_distr[6] = 4;
+   seg_distr[7] = 4;
+   seg_distr[8] = 4;
+   seg_distr[9] = 4;
+   seg_distr[10] = 1; /* 2 */
+   /* total = 8*16 + 8 + 64 + 2 = */
+
+   region_start = -10;
+   region_end = 1;
+   }
+
+
}
 
for (i = region_end - region_start; i < MAX_REGIONS_NUMBER ; i++)
@@ -372,16 +389,56 @@ bool cm_helper_translate_curve_to_hw_format(
 
j = 0;
for (k = 0; k < (region_end - region_start); k++) {
-   increment = NUMBER_SW_SEGMENTS / (1 << seg_distr[k]);
+   /*
+* We're using an ugly-ish hack here. Our HW allows for
+* 256 segments per region but SW_SEGMENTS is 16.
+* SW_SEGMENTS has some undocumented relationship to
+* the number of points in the tf_pts struct, which
+* is 512, unlike what's suggested by TRANSFER_FUNC_POINTS.
+*
+* In order to work past this dilemma we'll scale our
+* increment by (1 << 4) and then do the inverse (>> 4)
+* when accessing the elements in tf_pts.
+*
+* TODO: find a better way using SW_SEGMENTS and
+*   TRANSFER_FUNC_POINTS definitions
+*/
+   increment = (NUMBER_SW_SEGMENTS << 4) / (1 << seg_distr[k]);
start_index = (region_start + k + MAX_LOW_POINT) *
NUMBER_SW_SEGMENTS;
-   for (i = start_index; i < start_index + NUMBER_SW_SEGMENTS;
+   for (i = (start_index << 4); i < (start_index << 4) + 
(NUMBER_SW_SEGMENTS << 4);
i += increment) {
+   struct fixed31_32 in_plus_one, in;
+   struct fixed31_32 value, red_value, green_value, 
blue_value;
+   uint32_t t = i & 0xf;
+
if (j == hw_points - 1)
break;
-   rgb_resulted[j].red = output_tf->tf_pts.red[i];
-   rgb_resulted[j].green = output_tf->tf_pts.green[i];
-   rgb_resulted[j].blue = output_tf->tf_pts.blue[i];
+
+   in_plus_one = output_tf->tf_pts.red[(i >> 4) + 1];
+   in = output_tf->tf_pts.red[i >> 4];
+   value = dc_fixpt_sub(in_plus_one, in);
+   value = dc_fixpt_shr(dc_fixpt_mul_int(value, t),  4);
+

[RFC PATCH 02/40] drm/amd/display: fix the delta clamping for shaper LUT

2023-04-23 Thread Melissa Wen

From: Harry Wentland 

The shaper LUT requires a 10-bit value of the delta between
segments. We were using dc_fixpt_clamp_u0d10() to do that
but it doesn't do what we want it to do. It will preserve
10-bit precision after the decimal point, but that's not
quite what we want. We want 14-bit precision and to discard
the 4 most-significant bits.

To do that we'll do dc_fixpt_clamp_u0d14() & 0x3ff instead.
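
As a rough illustration of the masking described above, the sketch below
mimics the conversion in plain C. The helpers clamp_u0d14() and
delta_to_reg() are hypothetical stand-ins rather than the driver's dc_fixpt
API, and a double takes the place of struct fixed31_32.

```c
#include <stdint.h>

/* Hypothetical stand-in for dc_fixpt_clamp_u0d14(): convert a fraction in
 * [0, 1) to an unsigned value with 14 fractional bits, saturating at both
 * ends. The real helper operates on struct fixed31_32, not on double. */
static uint32_t clamp_u0d14(double v)
{
	if (v <= 0.0)
		return 0;
	if (v >= 16383.0 / 16384.0)
		return 0x3fff;
	return (uint32_t)(v * 16384.0);
}

/* Keep the 10 least-significant bits of the 14-bit value. *lost is set
 * when any of the 4 most-significant bits were non-zero. */
static uint32_t delta_to_reg(double delta, int *lost)
{
	uint32_t clamped = clamp_u0d14(delta);

	*lost = (clamped >> 10) != 0;
	return clamped & 0x3ff;
}
```

With this sketch, a delta of 1/16384 maps to register value 1, while a
delta of 1/16 (1024 in u0d14) no longer fits in 10 bits and would be the
case that triggers the precision warning.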

Signed-off-by: Harry Wentland 
---
 .../gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c
index f27413e94280..efa6cee649d0 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_cm_common.c
@@ -539,10 +539,18 @@ bool cm_helper_translate_curve_to_hw_format(
rgb->delta_green = dc_fixpt_sub(rgb_plus_1->green, rgb->green);
rgb->delta_blue  = dc_fixpt_sub(rgb_plus_1->blue,  rgb->blue);
 
+
if (fixpoint == true) {
-   rgb->delta_red_reg   = 
dc_fixpt_clamp_u0d10(rgb->delta_red);
-   rgb->delta_green_reg = 
dc_fixpt_clamp_u0d10(rgb->delta_green);
-   rgb->delta_blue_reg  = 
dc_fixpt_clamp_u0d10(rgb->delta_blue);
+   uint32_t red_clamp = 
dc_fixpt_clamp_u0d14(rgb->delta_red);
+   uint32_t green_clamp = 
dc_fixpt_clamp_u0d14(rgb->delta_green);
+   uint32_t blue_clamp = 
dc_fixpt_clamp_u0d14(rgb->delta_blue);
+
+   if (red_clamp >> 10 || green_clamp >> 10 || blue_clamp 
>> 10)
+   DC_LOG_WARNING("Losing delta precision while 
programming shaper LUT.");
+
+   rgb->delta_red_reg   = red_clamp & 0x3ff;
+   rgb->delta_green_reg = green_clamp & 0x3ff;
+   rgb->delta_blue_reg  = blue_clamp & 0x3ff;
rgb->red_reg = dc_fixpt_clamp_u0d14(rgb->red);
rgb->green_reg   = dc_fixpt_clamp_u0d14(rgb->green);
rgb->blue_reg= dc_fixpt_clamp_u0d14(rgb->blue);
-- 
2.39.2

[RFC PATCH 00/40] drm/amd/display: add AMD driver-specific properties for color mgmt

2023-04-23 Thread Melissa Wen

Hi all,

Joshua Ashton and I (with the great collaboration of Harry Wentland -
thanks) have been working on KMS color pipeline enhancement for Steam
Deck/SteamOS by exposing the large set of color caps available in AMD
display HW.

This patchset results from this full-stack work, including pre-blending
and post-blending new color properties. The first two patches fix
quantization issues on shaper LUT programming. Just after, we have one
patch that adds a config option to restrict AMD color feature usage. The
following 13 patches implement AMD driver-private color properties
(pending detachment of property counter and plane color_mgmt_changed
from DRM). Finally, the last 24 patches rework the AMD display manager
and color management to support the properties exposed.

In short, for pre-blending, we added the following:
- plane degamma LUT and predefined transfer function;
- plane HDR multiplier;
- plane shaper LUT/transfer function;
- plane 3D LUT; and finally,
- plane blend LUT/transfer function, just before blending.

After blending, we already have DRM CRTC degamma/gamma LUTs and CTM;
therefore, we extend the CRTC color pipeline with the following:
- CRTC shaper LUT/transfer function;
- CRTC 3D LUT; and
- CRTC gamma transfer function.

You can already find the AMD color capabilities and color management
pipeline documented here:
https://dri.freedesktop.org/docs/drm/gpu/amdgpu/display/display-manager.html#color-management-properties

In previous iterations, we tried to provide a generic solution for
post-blending shaper and 3D LUT [1][2][3], and also Alex Hung worked on
a pre-blending 3D LUT solution[4] extending plane color mgmt proposal
from Uma Shankar [5]. However, we identified during our work [6] that
AMD provides many other valuable capabilities that we don't find in
other vendors, so we started to work on AMD driver-private color
properties that better describe its color pipeline, enabling us to
expose full AMD color capabilities on Deck HW.

Our primary purpose is to avoid usage limitations of color calibration
features provided by HW just because we don't have an interface for
that. At the same time, a generic solution doesn't fit well since some
of these capabilities seem AMD HW specific, such as hardcoded
curve/predefined transfer function and shaper 1D LUTs sandwiching 3D
LUT.

So far, we keep these properties' usage under an AMD display config
option (STEAM_DECK). However, we are fine with having them fully
available to other DCN HW generations. In the current proposal, we are
already checking ASICs before exposing a color feature. We can work on
3D LUT resource acquisition details to fit them to DCN 3+ families that
support them. Indeed, before moving to these config boundaries, we
started working on an open solution for any AMD HW [7].

The userspace case here is Gamescope which is the compositor for
SteamOS. It's already using all of this functionality (although with a
VALVE1_ prefix instead of AMD) to implement its color management
pipeline right now:
https://github.com/ValveSoftware/gamescope

We are planning on shipping our color management support with gamut
mapping, HDR, SDR on HDR, HDR on SDR, and much more in SteamOS 3.5. A
brief overview of our color pipeline can be found here:
https://github.com/ValveSoftware/gamescope/blob/master/src/docs/Steam%20Deck%20Display%20Pipeline.png

We have also had interest from other userspace, namely Xaver Hugl (KDE),
in experimenting with these properties for their HDR/color bring-up
before a generic interface is settled on.

It still needs AMD-specific IGT tests; we are working on documentation
and adding plane CTM support too. 

We decided first to share our work to collect thoughts and open for
discussion, even with missing refinements, since driver-private
properties are not the usual DRM/KMS color management approach.

Please, let us know your thoughts.

Best Regards,

Signed-off-by: Joshua Ashton 
Signed-off-by: Melissa Wen

[1] https://lore.kernel.org/dri-devel/20220619223104.667413-1-m...@igalia.com/
[2] https://lore.kernel.org/amd-gfx/20220906164628.2361811-1-m...@igalia.com/
[3] https://lore.kernel.org/dri-devel/20230109143846.1966301-1-m...@igalia.com/
[4] 
https://lore.kernel.org/dri-devel/20221004211451.1475215-1-alex.h...@amd.com/
[5] 
https://lore.kernel.org/dri-devel/20210906213904.27918-1-uma.shan...@intel.com/
[6] https://gitlab.freedesktop.org/mwen/linux-amd/-/commits/amd-color-mgmt
[7] 
https://gitlab.freedesktop.org/mwen/linux-amd/-/commits/amd-private-color-mgmt

Harry Wentland (2):
  drm/amd/display: fix segment distribution for linear LUTs
  drm/amd/display: fix the delta clamping for shaper LUT

Joshua Ashton (15):
  drm/amd/display: add CRTC gamma TF to driver-private props
  drm/amd/display: add plane degamma LUT driver-private props
  drm/amd/display: add plane degamma TF driver-private property
  drm/amd/display: add plane HDR multiplier driver-private property
  drm/amd/display: add plane blend LUT and TF driver-private

Re: [PATCH v2 0/7] drm: sun4i: set proper TCON0 DCLK rate in DSI mode

2023-04-23 Thread Frank Oltmanns

Hi Roman,

On 2023-04-18 at 09:40:01 +0200, Roman Beranek  wrote:
> According to Allwinner's BSP code, in DSI mode, TCON0 clock needs to be
> running at what's effectively the per-lane datarate of the DSI link.
> Given that the TCON DCLK divider is fixed to 4 (SUN6I_DSI_TCON_DIV),
> DCLK can't be set equal to the dotclock. Therefore labeling TCON DCLK
> as sun4i_dotclock or tcon-pixel-clock shall be avoided.
>
> With bpp bits per pixel transmitted over n DSI lanes, the target DCLK
> rate for a given pixel clock is obtained as follows:
>
> DCLK rate = 1/4 * bpp / n * pixel clock
>
> Effect of this change can be observed through the rate of Vblank IRQs
> which should now match refresh rate implied by set display mode. It
> was verified to do so on a A64 board with a 2-lane and a 4-lane panel.
>
> v2:
> 1. prevent reparent of tcon0 to pll-video0-2x
> 2. include pll-video0 in setting TCON0 DCLK rate
> 3. tested the whole thing also on a PinePhone
>
> Roman Beranek (7):
>   clk: sunxi-ng: a64: propagate rate change from pll-mipi
>   clk: sunxi-ng: a64: export PLL_MIPI
>   clk: sunxi-ng: a64: prevent CLK_TCON0 being reparented
>   arm64: dts: allwinner: a64: assign PLL_MIPI to CLK_TCON0
>   ARM: dts: sunxi: rename tcon's clock output
>   drm: sun4i: rename sun4i_dotclock to sun4i_tcon_dclk
>   drm: sun4i: calculate proper DCLK rate for DSI
>
>  arch/arm/boot/dts/sun5i.dtsi  |  2 +-
>  arch/arm/boot/dts/sun8i-a23-a33.dtsi  |  2 +-
>  arch/arm/boot/dts/sun8i-a83t.dtsi |  2 +-
>  arch/arm/boot/dts/sun8i-v3s.dtsi  |  2 +-
>  arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi |  4 +-
>  drivers/clk/sunxi-ng/ccu-sun50i-a64.c |  6 ++-
>  drivers/clk/sunxi-ng/ccu-sun50i-a64.h |  4 +-
>  drivers/gpu/drm/sun4i/Makefile|  2 +-
>  drivers/gpu/drm/sun4i/sun4i_tcon.c| 46 +++
>  .../{sun4i_dotclock.c => sun4i_tcon_dclk.c}   |  2 +-
>  .../{sun4i_dotclock.h => sun4i_tcon_dclk.h}   |  0
>  include/dt-bindings/clock/sun50i-a64-ccu.h|  1 +
>  12 files changed, 43 insertions(+), 30 deletions(-)
>  rename drivers/gpu/drm/sun4i/{sun4i_dotclock.c => sun4i_tcon_dclk.c} (99%)
>  rename drivers/gpu/drm/sun4i/{sun4i_dotclock.h => sun4i_tcon_dclk.h} (100%)
>
>
> base-commit: 4aa35a0130d6b8afbefc9ef530a521fb0fb9b8e1


I've tried your patches on my pinephone. I also set the panel's clock to
72 MHz, so at 24 bpp and 4 lanes that should result in a data clock of
108 MHz. This should be possible when pll-video0 is at 297 MHz.
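
To make that arithmetic explicit, here is a minimal sketch of the formula
from the cover letter (DCLK rate = 1/4 * bpp / n * pixel clock). The
function name dsi_dclk_rate() and the TCON_DIV macro are made up for this
illustration and are not taken from the patches.

```c
#include <stdint.h>

#define TCON_DIV 4 /* fixed TCON DCLK divider, cf. SUN6I_DSI_TCON_DIV */

/* DCLK rate = 1/4 * bpp / lanes * pixel clock, with all rates in Hz. */
static uint64_t dsi_dclk_rate(uint64_t pixel_clock_hz, unsigned int bpp,
			      unsigned int lanes)
{
	return pixel_clock_hz * bpp / lanes / TCON_DIV;
}
```

For 72 MHz, 24 bpp and 4 lanes this gives 108000000. If the data clock is
TCON0 divided by 4, TCON0 itself would then have to run at 432 MHz.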

Unfortunately, pll-video0 is not set and therefore the relevant part of
the clk_summary looks like this:

                        enable  prepare  protect          hardware
 clock                   count    count    count    rate    enable

 pll-video0                  1        1        1   29400         Y
    hdmi                     0        0        0   29400         N
    tcon1                    0        0        0   29400         N
    pll-mipi                 1        1        1   43120         Y
       tcon0                 2        2        1   43120         Y
          tcon-data-clock    1        1        1   10780         Y
    pll-video0-2x            0        0        0   58800         Y

Note, I've cut the columns accuracy, phase, and duty cycle, because they
show the same values for all clocks (0, 0, 5).

My understanding was that with this patchset setting the parent clock
should be possible. Do you have any idea why it doesn't work on the
pinephone? Or maybe it does work on yours and I'm making some kind of
mistake?

On a brighter note, when I initialize pll-video0 to 297 MHz in
sunxi-ng/ccu-sun50i-a64.c:sun50i_a64_ccu_probe() I get an even 108 MHz
for the data clock. The patch is:

writel(0x515, reg + SUN50I_A64_PLL_MIPI_REG);

+   /*
+* Initialize PLL VIDEO0 to default values (297 MHz)
+* to clean up any changes made by bootloader
+*/
+   writel(0x03006207, reg + 0x10);
+
ret = devm_sunxi_ccu_probe(&pdev->dev, reg, &sun50i_a64_ccu_desc);
if (ret)
return ret;

Best,
  Frank

[PATCH V2 5/6] drm: bridge: samsung-dsim: Support non-burst mode

2023-04-23 Thread Adam Ford

The high-speed clock is hard-coded to the burst-clock
frequency specified in the device tree.  However, when
using devices like certain bridge chips without burst mode
and varying resolutions and refresh rates, it may be
necessary to set the high-speed clock dynamically based
on the desired pixel clock for the connected device.

This also removes the need to set a clock speed from
the device tree for non-burst mode operation, since the
pixel clock rate is the rate requested from the attached
device like an HDMI bridge chip.  This should have no
impact for people using burst mode; setting the burst
clock rate is still required for those users.
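
The selection rule can be sketched outside the kernel as below. The
function hs_clk_target() is a hypothetical helper that returns only the
rate that would be requested from the PLL. The driver itself passes that
rate to samsung_dsim_set_pll(), which returns the frequency actually
achieved.

```c
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Pick the PLL target: the burst clock when burst mode is requested and a
 * rate was provided, otherwise pixel clock * bpp / lanes, rounded up. */
static uint64_t hs_clk_target(int burst_mode, uint64_t burst_clk_rate,
			      unsigned int pixel_clock_khz,
			      unsigned int bpp, unsigned int lanes)
{
	uint64_t pix_clk = (uint64_t)pixel_clock_khz * 1000;

	if (burst_mode && burst_clk_rate)
		return burst_clk_rate;
	return DIV_ROUND_UP(pix_clk * bpp, lanes);
}
```

As an illustrative case, a 148.5 MHz pixel clock at 24 bpp over 4 lanes
asks for 891 MHz when no burst clock is configured.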

Signed-off-by: Adam Ford 
---
 drivers/gpu/drm/bridge/samsung-dsim.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c 
b/drivers/gpu/drm/bridge/samsung-dsim.c
index f165483d5044..cea847b8e23c 100644
--- a/drivers/gpu/drm/bridge/samsung-dsim.c
+++ b/drivers/gpu/drm/bridge/samsung-dsim.c
@@ -657,11 +657,21 @@ static unsigned long samsung_dsim_set_pll(struct 
samsung_dsim *dsi,
 
 static int samsung_dsim_enable_clock(struct samsung_dsim *dsi)
 {
-   unsigned long hs_clk, byte_clk, esc_clk;
+   unsigned long hs_clk, byte_clk, esc_clk, pix_clk;
unsigned long esc_div;
u32 reg;
+   struct drm_display_mode *m = &dsi->mode;
+   int bpp = mipi_dsi_pixel_format_to_bpp(dsi->format);
+
+   /* m->clock is in KHz */
+   pix_clk = m->clock * 1000;
+
+   /* Use burst_clk_rate for burst mode, otherwise use the pix_clk */
+   if ((dsi->mode_flags & MIPI_DSI_MODE_VIDEO_BURST) && 
dsi->burst_clk_rate)
+   hs_clk = samsung_dsim_set_pll(dsi, dsi->burst_clk_rate);
+   else
+   hs_clk = samsung_dsim_set_pll(dsi, DIV_ROUND_UP(pix_clk * bpp, 
dsi->lanes));
 
-   hs_clk = samsung_dsim_set_pll(dsi, dsi->burst_clk_rate);
if (!hs_clk) {
dev_err(dsi->dev, "failed to configure DSI PLL\n");
return -EFAULT;
@@ -1800,10 +1810,11 @@ static int samsung_dsim_parse_dt(struct samsung_dsim 
*dsi)
return PTR_ERR(pll_clk);
}
 
+   /* If it doesn't exist, use pixel clock instead of failing */
ret = samsung_dsim_of_read_u32(node, "samsung,burst-clock-frequency",
   &dsi->burst_clk_rate);
if (ret < 0)
-   return ret;
+   dsi->burst_clk_rate = 0;
 
ret = samsung_dsim_of_read_u32(node, "samsung,esc-clock-frequency",
   &dsi->esc_clk_rate);
-- 
2.39.2

[PATCH V2 6/6] drm: bridge: samsung-dsim: Let blanking calcuation work in non-burst mode

2023-04-23 Thread Adam Ford

The blanking calculation currently uses burst_clk_rate for calculating
the settings. Since it's possible to use this in non-burst mode, it's
possible that there won't be a burst_clk_rate.  Cache the clock rate
configured by samsung_dsim_set_pll and use that instead.

Signed-off-by: Adam Ford 
---
 drivers/gpu/drm/bridge/samsung-dsim.c | 4 +++-
 include/drm/bridge/samsung-dsim.h | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c 
b/drivers/gpu/drm/bridge/samsung-dsim.c
index cea847b8e23c..8c69d22a57b7 100644
--- a/drivers/gpu/drm/bridge/samsung-dsim.c
+++ b/drivers/gpu/drm/bridge/samsung-dsim.c
@@ -652,6 +652,8 @@ static unsigned long samsung_dsim_set_pll(struct 
samsung_dsim *dsi,
reg = samsung_dsim_read(dsi, DSIM_STATUS_REG);
} while ((reg & DSIM_PLL_STABLE) == 0);
 
+   dsi->hs_clock = fout;
+
return fout;
 }
 
@@ -960,7 +962,7 @@ static void samsung_dsim_set_display_mode(struct 
samsung_dsim *dsi)
u32 reg;
 
if (dsi->mode_flags & MIPI_DSI_MODE_VIDEO) {
-   int byte_clk_khz = dsi->burst_clk_rate / 1000 / 8;
+   int byte_clk_khz = dsi->hs_clock / 1000 / 8;
int hfp = (m->hsync_start - m->hdisplay) * byte_clk_khz / 
m->clock;
int hbp = (m->htotal - m->hsync_end) * byte_clk_khz / m->clock;
int hsa = (m->hsync_end - m->hsync_start) * byte_clk_khz / 
m->clock;
diff --git a/include/drm/bridge/samsung-dsim.h 
b/include/drm/bridge/samsung-dsim.h
index 25475d78adb3..41cbae00cd7e 100644
--- a/include/drm/bridge/samsung-dsim.h
+++ b/include/drm/bridge/samsung-dsim.h
@@ -94,6 +94,7 @@ struct samsung_dsim {
 
u32 pll_clk_rate;
u32 burst_clk_rate;
+   u32 hs_clock;
u32 esc_clk_rate;
u32 lanes;
u32 mode_flags;
-- 
2.39.2

[PATCH V2 2/6] drm: bridge: samsung-dsim: Fix PMS Calculator on imx8m[mnp]

2023-04-23 Thread Adam Ford

According to Table 13-45 of the i.MX8M Mini Reference Manual, the min
and max values for M and the frequency range for VCO_out used by the
PMS calculator were incorrect.  The manuals are not consistent either:
this information is contradicted in other parts of the Mini, Nano and
Plus manuals.  When I asked my NXP rep about the discrepancies in the
Nano manual, he responded with:
 "Yes it is definitely wrong, the one that is part
  of the NOTE in MIPI_DPHY_M_PLLPMS register table against PMS_P,
  PMS_M and PMS_S is not correct. I will report this to Doc team,
  the one customer should be take into account is the Table 13-40
  DPHY PLL Parameters and the Note above."

These updated values also match what is used in the NXP downstream
kernel.

To fix this, add new driver-data fields to hold the min and max values
of m and the minimum value of VCO_out, and update the PMS calculator to
use these fields instead of hard-coded values.  The other parts using
this driver keep the previous limits to preserve backwards compatibility.
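
The two range checks can be sketched in plain C as follows. struct
pms_limits and pms_candidate_ok() are hypothetical names for this
illustration, and the max_freq value of 2100 MHz used in the example is an
assumption rather than something stated in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MHZ 1000000ULL

/* Hypothetical subset of the per-SoC driver data used by the PMS search. */
struct pms_limits {
	uint16_t m_min;
	uint16_t m_max;
	uint64_t vco_min;  /* in MHz */
	uint64_t max_freq; /* in MHz */
};

/* Return true when M derived from (fin, fout, p, s) and the resulting VCO
 * frequency both fall inside the limits, mirroring the patch's checks. */
static bool pms_candidate_ok(const struct pms_limits *lim, uint64_t fin,
			     uint64_t fout, unsigned int p, unsigned int s)
{
	uint64_t m = fout * ((uint64_t)p << s) / fin;
	uint64_t vco;

	if (m < lim->m_min || m > lim->m_max)
		return false;

	vco = m * fin / p;
	return vco >= lim->vco_min * MHZ && vco <= lim->max_freq * MHZ;
}
```

With illustrative inputs fin = 24 MHz, fout = 891 MHz, p = 4 and s = 1,
M comes out as 297 and the VCO as 1782 MHz. That candidate passes the new
i.MX8M limits (64..1023, VCO >= 1050 MHz) but would have been rejected by
the old hard-coded 41..125 range for M.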

Fixes: 4d562c70c4dc ("drm: bridge: samsung-dsim: Add i.MX8M Mini/Nano support")
Signed-off-by: Adam Ford 
---
 drivers/gpu/drm/bridge/samsung-dsim.c | 22 --
 include/drm/bridge/samsung-dsim.h |  3 +++
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c 
b/drivers/gpu/drm/bridge/samsung-dsim.c
index 2be3b58624c3..adb9c13c5f7f 100644
--- a/drivers/gpu/drm/bridge/samsung-dsim.c
+++ b/drivers/gpu/drm/bridge/samsung-dsim.c
@@ -405,6 +405,9 @@ static const struct samsung_dsim_driver_data 
exynos3_dsi_driver_data = {
.num_bits_resol = 11,
.pll_p_offset = 13,
.reg_values = reg_values,
+   .m_min = 41,
+   .m_max = 125,
+   .vco_min = 500,
 };
 
 static const struct samsung_dsim_driver_data exynos4_dsi_driver_data = {
@@ -418,6 +421,9 @@ static const struct samsung_dsim_driver_data 
exynos4_dsi_driver_data = {
.num_bits_resol = 11,
.pll_p_offset = 13,
.reg_values = reg_values,
+   .m_min = 41,
+   .m_max = 125,
+   .vco_min = 500,
 };
 
 static const struct samsung_dsim_driver_data exynos5_dsi_driver_data = {
@@ -429,6 +435,9 @@ static const struct samsung_dsim_driver_data 
exynos5_dsi_driver_data = {
.num_bits_resol = 11,
.pll_p_offset = 13,
.reg_values = reg_values,
+   .m_min = 41,
+   .m_max = 125,
+   .vco_min = 500,
 };
 
 static const struct samsung_dsim_driver_data exynos5433_dsi_driver_data = {
@@ -441,6 +450,9 @@ static const struct samsung_dsim_driver_data 
exynos5433_dsi_driver_data = {
.num_bits_resol = 12,
.pll_p_offset = 13,
.reg_values = exynos5433_reg_values,
+   .m_min = 41,
+   .m_max = 125,
+   .vco_min = 500,
 };
 
 static const struct samsung_dsim_driver_data exynos5422_dsi_driver_data = {
@@ -453,6 +465,9 @@ static const struct samsung_dsim_driver_data 
exynos5422_dsi_driver_data = {
.num_bits_resol = 12,
.pll_p_offset = 13,
.reg_values = exynos5422_reg_values,
+   .m_min = 41,
+   .m_max = 125,
+   .vco_min = 500,
 };
 
 static const struct samsung_dsim_driver_data imx8mm_dsi_driver_data = {
@@ -469,6 +484,9 @@ static const struct samsung_dsim_driver_data 
imx8mm_dsi_driver_data = {
 */
.pll_p_offset = 14,
.reg_values = imx8mm_dsim_reg_values,
+   .m_min = 64,
+   .m_max = 1023,
+   .vco_min = 1050,
 };
 
 static const struct samsung_dsim_driver_data *
@@ -547,12 +565,12 @@ static unsigned long samsung_dsim_pll_find_pms(struct 
samsung_dsim *dsi,
tmp = (u64)fout * (_p << _s);
do_div(tmp, fin);
_m = tmp;
-   if (_m < 41 || _m > 125)
+   if (_m < driver_data->m_min || _m > driver_data->m_max)
continue;
 
tmp = (u64)_m * fin;
do_div(tmp, _p);
-   if (tmp < 500 * MHZ ||
+   if (tmp < driver_data->vco_min  * MHZ ||
tmp > driver_data->max_freq * MHZ)
continue;
 
diff --git a/include/drm/bridge/samsung-dsim.h 
b/include/drm/bridge/samsung-dsim.h
index ba5484de2b30..a088d84579bc 100644
--- a/include/drm/bridge/samsung-dsim.h
+++ b/include/drm/bridge/samsung-dsim.h
@@ -59,6 +59,9 @@ struct samsung_dsim_driver_data {
unsigned int num_bits_resol;
unsigned int pll_p_offset;
const unsigned int *reg_values;
+   u16 m_min;
+   u16 m_max;
+   u64 vco_min;
 };
 
 struct samsung_dsim_host_ops {
-- 
2.39.2

[PATCH V2 4/6] drm: bridge: samsung-dsim: Dynamically configure DPHY timing

2023-04-23 Thread Adam Ford

The DPHY timings are currently hard coded. Since the input
clock can be variable, the phy timings need to be variable
too.  Add an additional variable to the driver data to enable
this feature to prevent breaking boards that don't support it.

The phy_mipi_dphy_get_default_config function provides the
DPHY timings in picoseconds, and a small macro converts those
timings into clock cycles based on the pixel clock rate.
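
A standalone sketch of that conversion, assuming
cycles = ps * f / 10^12 with f in Hz and rounding to the nearest cycle.
ps_to_cycles() and PS_PER_SEC are illustrative names, not the macro from
the patch.

```c
#include <stdint.h>

#define PS_PER_SEC 1000000000000ULL /* 10^12 picoseconds in one second */

/* Convert a duration in picoseconds to clock cycles at clk_hz, rounding
 * to the nearest cycle. */
static uint64_t ps_to_cycles(uint64_t ps, uint64_t clk_hz)
{
	return (ps * clk_hz + PS_PER_SEC / 2) / PS_PER_SEC;
}
```

For example, a 60000 ps interval at a 148.5 MHz clock is 8.91 cycles,
which rounds to 9.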

Signed-off-by: Adam Ford 
---
 drivers/gpu/drm/bridge/samsung-dsim.c | 79 +++
 include/drm/bridge/samsung-dsim.h |  1 +
 2 files changed, 70 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c 
b/drivers/gpu/drm/bridge/samsung-dsim.c
index 5b6e7825b92f..f165483d5044 100644
--- a/drivers/gpu/drm/bridge/samsung-dsim.c
+++ b/drivers/gpu/drm/bridge/samsung-dsim.c
@@ -18,9 +18,7 @@
 #include 
 #include 
 #include 
-
 #include 
-
 #include 
 #include 
 #include 
@@ -218,6 +216,8 @@
 
 #define OLD_SCLK_MIPI_CLK_NAME "pll_clk"
 
+#define PS_TO_CYCLE(PS, MHz) DIV64_U64_ROUND_CLOSEST(((PS) * (MHz)), 
1000000000000ULL)
+
 static const char *const clk_names[5] = {
"bus_clk",
"sclk_mipi",
@@ -487,6 +487,7 @@ static const struct samsung_dsim_driver_data 
imx8mm_dsi_driver_data = {
.m_min = 64,
.m_max = 1023,
.vco_min = 1050,
+   .dynamic_dphy = 1,
 };
 
 static const struct samsung_dsim_driver_data *
@@ -698,13 +699,50 @@ static void samsung_dsim_set_phy_ctrl(struct samsung_dsim 
*dsi)
const struct samsung_dsim_driver_data *driver_data = dsi->driver_data;
const unsigned int *reg_values = driver_data->reg_values;
u32 reg;
+   struct drm_display_mode *m = &dsi->mode;
+   int bpp = mipi_dsi_pixel_format_to_bpp(dsi->format);
+   struct phy_configure_opts_mipi_dphy cfg;
+   int clk_prepare, lpx, clk_zero, clk_post, clk_trail;
+   int hs_exit, hs_prepare, hs_zero, hs_trail;
+   unsigned long long clock_in_hz = m->clock * 1000;
 
if (driver_data->has_freqband)
return;
 
+   /* The dynamic_phy has the ability to adjust PHY Timing settings */
+   if (driver_data->dynamic_dphy) {
+   phy_mipi_dphy_get_default_config(clock_in_hz, bpp, dsi->lanes, 
&cfg);
+
+   /*
+* TODO:
+* The tech reference manual for i.MX8M Mini/Nano/Plus
+* doesn't state what the definition of the PHYTIMING
+* bits are beyond their address and bit position.
+* After reviewing NXP's downstream code, it appears
+* that the various PHYTIMING registers take the number
+* of cycles and use various dividers on them.  This
+* calculation does not result in an exact match to the
+* downstream code, but it is very close, and it appears
+* to sync at a variety of resolutions. If someone
+* can get a more accurate mathematical equation needed
+* for these registers, this should be updated.
+*/
+
+   lpx = PS_TO_CYCLE(cfg.lpx, clock_in_hz);
+   hs_exit = PS_TO_CYCLE(cfg.hs_exit, clock_in_hz);
+   clk_prepare = PS_TO_CYCLE(cfg.clk_prepare, clock_in_hz);
+   clk_zero = PS_TO_CYCLE(cfg.clk_zero, clock_in_hz);
+   clk_post = PS_TO_CYCLE(cfg.clk_post, clock_in_hz);
+   clk_trail = PS_TO_CYCLE(cfg.clk_trail, clock_in_hz);
+   hs_prepare = PS_TO_CYCLE(cfg.hs_prepare, clock_in_hz);
+   hs_zero = PS_TO_CYCLE(cfg.hs_zero, clock_in_hz);
+   hs_trail = PS_TO_CYCLE(cfg.hs_trail, clock_in_hz);
+   }
+
/* B D-PHY: D-PHY Master & Slave Analog Block control */
reg = reg_values[PHYCTRL_ULPS_EXIT] | reg_values[PHYCTRL_VREG_LP] |
reg_values[PHYCTRL_SLEW_UP];
+
samsung_dsim_write(dsi, DSIM_PHYCTRL_REG, reg);
 
/*
@@ -712,7 +750,11 @@ static void samsung_dsim_set_phy_ctrl(struct samsung_dsim 
*dsi)
 * T HS-EXIT: Time that the transmitter drives LP-11 following a HS
 *  burst
 */
-   reg = reg_values[PHYTIMING_LPX] | reg_values[PHYTIMING_HS_EXIT];
+   if (driver_data->dynamic_dphy)
+   reg  = DSIM_PHYTIMING_LPX(lpx) | 
DSIM_PHYTIMING_HS_EXIT(hs_exit);
+   else
+   reg = reg_values[PHYTIMING_LPX] | reg_values[PHYTIMING_HS_EXIT];
+
samsung_dsim_write(dsi, DSIM_PHYTIMING_REG, reg);
 
/*
@@ -728,10 +770,17 @@ static void samsung_dsim_set_phy_ctrl(struct samsung_dsim 
*dsi)
 * T CLK-TRAIL: Time that the transmitter drives the HS-0 state after
 *  the last payload clock bit of a HS transmission burst
 */
-   reg = reg_values[PHYTIMING_CLK_PREPARE] |
-   reg_values[PHYTIMING_CLK_ZERO] |
-   reg_values[PHYTIMING_CLK_POST] |
-   reg_values[PHYTIMING_CLK_TRAIL];
+   if

[PATCH V2 3/6] drm: bridge: samsung-dsim: Fetch pll-clock-frequency automatically

2023-04-23 Thread Adam Ford

Make the pll-clock-frequency optional.  If it's present, use it
to maintain backwards compatibility with existing hardware.  If it
is absent, read the clock rate of "sclk_mipi" to determine the rate.

Signed-off-by: Adam Ford 
---
 drivers/gpu/drm/bridge/samsung-dsim.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c 
b/drivers/gpu/drm/bridge/samsung-dsim.c
index adb9c13c5f7f..5b6e7825b92f 100644
--- a/drivers/gpu/drm/bridge/samsung-dsim.c
+++ b/drivers/gpu/drm/bridge/samsung-dsim.c
@@ -1726,12 +1726,20 @@ static int samsung_dsim_parse_dt(struct samsung_dsim 
*dsi)
 {
struct device *dev = dsi->dev;
struct device_node *node = dev->of_node;
+   struct clk *pll_clk;
int ret;
 
ret = samsung_dsim_of_read_u32(node, "samsung,pll-clock-frequency",
   &dsi->pll_clk_rate);
-   if (ret < 0)
-   return ret;
+
+   /* If it doesn't exist, read it from the clock instead of failing */
+   if (ret < 0) {
+   pll_clk = devm_clk_get(dev, "sclk_mipi");
+   if (!IS_ERR(pll_clk))
+   dsi->pll_clk_rate = clk_get_rate(pll_clk);
+   else
+   return PTR_ERR(pll_clk);
+   }
 
ret = samsung_dsim_of_read_u32(node, "samsung,burst-clock-frequency",
   &dsi->burst_clk_rate);
-- 
2.39.2

[PATCH V2 1/6] drm: bridge: samsung-dsim: fix blanking packet size calculation

2023-04-23 Thread Adam Ford

From: Lucas Stach 

Scale the blanking packet sizes to match the ratio between HS clock
and DPI interface clock. The controller seems to do internal scaling
to the number of active lanes, so we don't take those into account.
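
The scaling can be tried out with a small standalone function. The name
scale_blanking() is hypothetical, and the subtraction of 6 follows the
packet-overhead adjustment made in the patch.

```c
/* Scale a horizontal blanking interval from pixel clocks to byte clocks
 * and subtract the packet overhead of 6, never going below zero. */
static int scale_blanking(int pixels, unsigned int hs_clk_hz,
			  int pixel_clock_khz)
{
	int byte_clk_khz = hs_clk_hz / 1000 / 8;
	int bytes = pixels * byte_clk_khz / pixel_clock_khz;

	return bytes > 6 ? bytes - 6 : 0;
}
```

Taking the common 1920x1080@60 timing (front porch 88, sync 44, back
porch 148, pixel clock 148500 kHz) and an illustrative HS clock of
891 MHz, the byte clock is 111375 kHz, the ratio is 0.75, and the
resulting register values are hfp = 60, hsa = 27 and hbp = 105.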

Signed-off-by: Lucas Stach 
Signed-off-by: Adam Ford 
---
 drivers/gpu/drm/bridge/samsung-dsim.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c 
b/drivers/gpu/drm/bridge/samsung-dsim.c
index e0a402a85787..2be3b58624c3 100644
--- a/drivers/gpu/drm/bridge/samsung-dsim.c
+++ b/drivers/gpu/drm/bridge/samsung-dsim.c
@@ -874,17 +874,29 @@ static void samsung_dsim_set_display_mode(struct 
samsung_dsim *dsi)
u32 reg;
 
if (dsi->mode_flags & MIPI_DSI_MODE_VIDEO) {
+   int byte_clk_khz = dsi->burst_clk_rate / 1000 / 8;
+   int hfp = (m->hsync_start - m->hdisplay) * byte_clk_khz / 
m->clock;
+   int hbp = (m->htotal - m->hsync_end) * byte_clk_khz / m->clock;
+   int hsa = (m->hsync_end - m->hsync_start) * byte_clk_khz / 
m->clock;
+
+   /* remove packet overhead when possible */
+   hfp = max(hfp - 6, 0);
+   hbp = max(hbp - 6, 0);
+   hsa = max(hsa - 6, 0);
+
+   dev_dbg(dsi->dev, "calculated hfp: %u, hbp: %u, hsa: %u",
+   hfp, hbp, hsa);
+
reg = DSIM_CMD_ALLOW(0xf)
| DSIM_STABLE_VFP(m->vsync_start - m->vdisplay)
| DSIM_MAIN_VBP(m->vtotal - m->vsync_end);
samsung_dsim_write(dsi, DSIM_MVPORCH_REG, reg);
 
-   reg = DSIM_MAIN_HFP(m->hsync_start - m->hdisplay)
-   | DSIM_MAIN_HBP(m->htotal - m->hsync_end);
+   reg = DSIM_MAIN_HFP(hfp) | DSIM_MAIN_HBP(hbp);
samsung_dsim_write(dsi, DSIM_MHPORCH_REG, reg);
 
reg = DSIM_MAIN_VSA(m->vsync_end - m->vsync_start)
-   | DSIM_MAIN_HSA(m->hsync_end - m->hsync_start);
+   | DSIM_MAIN_HSA(hsa);
samsung_dsim_write(dsi, DSIM_MSYNC_REG, reg);
}
reg =  DSIM_MAIN_HRESOL(m->hdisplay, num_bits_resol) |
-- 
2.39.2

[PATCH V2 0/6] drm: bridge: samsung-dsim: Support variable clocking

2023-04-23 Thread Adam Ford

This series fixes the blanking packet size and the PMS calculation.  It then
adds support to allow the DSIM to dynamically configure the DPHY timings and
to support non-burst mode, while allowing the removal of the hard-coded clock
values for the PLL for imx8m mini/nano/plus.  It also allows the removal of
the burst-clock device tree entry when burst mode isn't supported by connected
devices like an HDMI bridge.  In that event, the HS clock is set to the value
requested by the bridge chip.

This has been tested on both an i.MX8M Nano and i.MX8M Plus, and should work
on i.MX8M Mini as well.


Adam Ford (5):
  drm: bridge: samsung-dsim: Fix PMS Calculator on imx8m[mnp]
  drm: bridge: samsung-dsim: Fetch pll-clock-frequency automatically
  drm: bridge: samsung-dsim: Dynamically configure DPHY timing
  drm: bridge: samsung-dsim: Support non-burst mode
  drm: bridge: samsung-dsim: Let blanking calcuation work in non-burst
mode

Lucas Stach (1):
  drm: bridge: samsung-dsim: fix blanking packet size calculation

 drivers/gpu/drm/bridge/samsung-dsim.c | 150 ++
 include/drm/bridge/samsung-dsim.h |   5 +
 2 files changed, 135 insertions(+), 20 deletions(-)

---
V2:  Instead of using my packet blanking calculation, this integrates
 one from Lucas Stach which gets modified later in the series to
 cache the value of the HS-clock instead of having to do the
 calculations again.

 Instead of completely eliminating the PLL clock frequency from
 the device tree, this makes it optional to avoid breaking some
 Samsung devices.  When the samsung,pll-clock-frequency is not
 found, it reads the value of the clock named "sclk_mipi".
 This also maintains backwards compatibility with older device
 trees.

 This also changes the DPHY calculation from a look-up table to
 a reverse-engineered algorithm which uses
 phy_mipi_dphy_get_default_config to determine the standard
 nominal values and calculates the cycles necessary to update
 the DPHY timings accordingly.
 
-- 
2.39.2

Re: [PATCH] drm/bridge:Fix the panic problem caused by bridge->funcs->attach

2023-04-23 Thread Jagan Teki

+ Bridge Maintainers

On Wed, Apr 19, 2023 at 8:35 AM 余治国  wrote:
>
> The log looks like this:
> [ 31.723823] Internal error: Oops: 9604 [#1] SMP
> [ 31.729030] Modules linked in:
> [ 31.733395] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.59+2.0.0 #250
> [ 31.745167] pstate: 60c5 (nZCv daif +PAN +UAO -TCO BTYPE=--)
> [ 31.752785] pc : drm_bridge_attach+0xdc/0x1a4
> [ 31.757518] lr : drm_bridge_attach+0x40/0x1a4
> [ 31.762104] sp : 000678f0
> [ 31.765726] x29: 000678f0 x28: 0031c880
> [ 31.771599] x27: 0060 x26: 800010539e90
> [ 31.777340] x25: 000a6000 x24: 00129c10
> [ 31.783060] x23:  x22: 000131c0
> [ 31.788770] x21: 80001052a5b8 x20: 003e2c80
> [ 31.794506] x19: 00017080 x18: 0010
> [ 31.800210] x17: 5fcfcd6c x16: d27b105e
> [ 31.805907] x15: 00c7 x14: 000675d0
> [ 31.811599] x13: ffea x12: 8000114158c0
> [ 31.817282] x11: 0003 x10: 800011295880
> [ 31.823000] x9 : 8000112958d8 x8 : 0017ffe8
> [ 31.828731] x7 : c000fffe x6 : 0001
> [ 31.834425] x5 : 0057ffa8 x4 :
> [ 31.840101] x3 :  x2 : 00013218
> [ 31.845787] x1 : 003e2cf8 x0 :
> [ 31.851594] Call trace:
> [ 31.854603] drm_bridge_attach+0xdc/0x1a4
> [ 31.859079] dw_mipi_dsi_bridge_attach+0x98/0xa8
> [ 31.864035] drm_bridge_attach+0x104/0x1a4
> [ 31.869211] dw_mipi_dsi_bind+0x24/0x58
> drm_bridge_attach will call bridge->funcs in many cases, so I think it is
> necessary to check.

I don't think the bridge pointer or bridge->funcs pointer is NULL
here. It looks like the controller is attaching the bridge prior to
registering it. If I remember correctly, I got a similar issue with
rockchip-drm pipeline. Would you please share your pipeline as well?

Thanks,
Jagan.

RE: [Intel-gfx] [PATCH 3/8] drm/i915/mtl: Add PTE encode function

2023-04-23 Thread Yang, Fei

> On Fri, Apr 21, 2023 at 10:27:22AM -0700, Yang, Fei wrote:
>>> On Wed, Apr 19, 2023 at 04:00:53PM -0700, fei.y...@intel.com wrote:
>>>> From: Fei Yang
>>>>
>>>> PTE encode functions are platform dependent. This patch implements
>>>> PTE functions for MTL, and ensures the correct PTE encode function
>>>> is used by calling pte_encode function pointer instead of the
>>>> hardcoded gen8 version of PTE encode.
>>>>
>>>> Signed-off-by: Fei Yang
>>>> Reviewed-by: Andrzej Hajda
>>>> Reviewed-by: Andi Shyti
>>>> Acked-by: Nirmoy Das
>>>
>>> Bspec: 45015, 45040
>>>
>>>> ---
>>>>  drivers/gpu/drm/i915/display/intel_dpt.c |  2 +-
>>>>  drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 45 
>>>>  drivers/gpu/drm/i915/gt/intel_ggtt.c | 36 +--
>>>>  3 files changed, 72 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
>>>> index b8027392144d..c5eacfdba1a5 100644
>>>> --- a/drivers/gpu/drm/i915/display/intel_dpt.c
>>>> +++ b/drivers/gpu/drm/i915/display/intel_dpt.c
>>>> @@ -300,7 +300,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
>>>>    vm->vma_ops.bind_vma= dpt_bind_vma;
>>>>    vm->vma_ops.unbind_vma  = dpt_unbind_vma;
>>>>
>>>> - vm->pte_encode = gen8_ggtt_pte_encode;
>>>> + vm->pte_encode = vm->gt->ggtt->vm.pte_encode;
>>>>
>>>>    dpt->obj = dpt_obj;
>>>>    dpt->obj->is_dpt = true;
>>>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>>>> index 4daaa6f55668..11b91e0453c8 100644
>>>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>>>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
>>>> @@ -55,6 +55,34 @@ static u64 gen8_pte_encode(dma_addr_t addr,
>>>>    return pte;
>>>>  }
>>>>
>>>> +static u64 mtl_pte_encode(dma_addr_t addr,
>>>> +   enum i915_cache_level level,
>>>> +   u32 flags)
>>>> +{
>>>> + gen8_pte_t pte = addr | GEN8_PAGE_PRESENT | GEN8_PAGE_RW;
>>>> +
>>>> + if (unlikely(flags & PTE_READ_ONLY))
>>>> + pte &= ~GEN8_PAGE_RW;
>>>> +
>>>> + if (flags & PTE_LM)
>>>> + pte |= GEN12_PPGTT_PTE_LM | GEN12_PPGTT_PTE_NC;
>>>
>>> GEN12_PPGTT_PTE_NC got defined in the previous patch as BIT(5).  But
>>> according to bspec 45040, bit 5 is ignored in the PTE encoding.  What is
>>> this trying to do?
>>
>> This takes effect only for PTE_LM, doesn't affect MTL.
>> PTE_NC is needed for PVC (use of access counter).
>> I believe this function was written based on the one for PVC. And this
>> function did get extended to cover all gen12 in a later patch.
>
> Even though MTL doesn't have local memory, PTE_LM is supposed to be
> used on MTL for access to BAR2 stolen memory.

You were right, but I still think this code is fine because this bit is
ignored for MTL anyway and it is needed for other platforms with LMEM.
Otherwise this code would have some sort of platform checking which is
hard to do because we don't have platform info here.
Or we would have to define another PTE encode function for platforms
needing PTE_NC just for this one difference, then manage the function
pointer correctly.
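
The trade-off above can be sketched in plain C. This is a minimal
illustration, not the i915 code: every sk_* name, the flag values and the
bit positions are hypothetical stand-ins chosen for the example. Only the
idea comes from the discussion: either one encoder sets NC together with LM
and relies on hardware ignoring the bit, or a second encoder is defined and
selected through the function pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout for the sketch; the real definitions may differ. */
#define SK_PAGE_PRESENT   (1ull << 0)
#define SK_PAGE_RW        (1ull << 1)
#define SK_PTE_NC         (1ull << 5)   /* bit 5, as mentioned in the thread */
#define SK_PTE_LM         (1ull << 11)  /* assumed position */

#define SK_FLAG_READ_ONLY (1u << 0)
#define SK_FLAG_LM        (1u << 1)

typedef uint64_t (*sk_pte_encode_fn)(uint64_t addr, uint32_t flags);

struct sk_vm {
	sk_pte_encode_fn pte_encode;
};

/* Bits shared by both encoders: present, plus RW unless read-only. */
static uint64_t sk_pte_common(uint64_t addr, uint32_t flags)
{
	uint64_t pte = addr | SK_PAGE_PRESENT | SK_PAGE_RW;

	if (flags & SK_FLAG_READ_ONLY)
		pte &= ~SK_PAGE_RW;
	return pte;
}

/* Single shared encoder: NC is always set together with LM. */
uint64_t sk_pte_encode_lm_nc(uint64_t addr, uint32_t flags)
{
	uint64_t pte = sk_pte_common(addr, flags);

	if (flags & SK_FLAG_LM)
		pte |= SK_PTE_LM | SK_PTE_NC;
	return pte;
}

/* Alternative encoder for platforms that should not see NC at all. */
uint64_t sk_pte_encode_lm_only(uint64_t addr, uint32_t flags)
{
	uint64_t pte = sk_pte_common(addr, flags);

	if (flags & SK_FLAG_LM)
		pte |= SK_PTE_LM;
	return pte;
}

/* The extra bookkeeping: someone has to pick the right pointer per platform. */
void sk_vm_init_split(struct sk_vm *vm, int platform_needs_nc)
{
	assert(vm);
	vm->pte_encode = platform_needs_nc ? sk_pte_encode_lm_nc
					   : sk_pte_encode_lm_only;
}
```

With the shared encoder no selection step is needed, which is the
simplification argued for above, at the cost of writing a bit that some
platforms ignore.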

-Fei

> Matt
>
>> -Fei
>>> Matt
>>>
 +
 + switch (level) {
 + case I915_CACHE_NONE:
 + pte |= GEN12_PPGTT_PTE_PAT1;
 + break;

RE: [Intel-gfx] [PATCH 7/8] drm/i915: use pat_index instead of cache_level

2023-04-23 Thread Yang, Fei

> On 20/04/2023 00:00, fei.y...@intel.com wrote:
>> From: Fei Yang 
>>
>> Currently the KMD is using enum i915_cache_level to set caching policy for
>> buffer objects. This is flaky because the PAT index which really controls
>> the caching behavior in PTE has far more levels than what's defined in the
>> enum. In addition, the PAT index is platform dependent, having to translate
>> between i915_cache_level and PAT index is not reliable, and makes the code
>> more complicated.
>>
>> From UMD's perspective there is also a necessity to set caching policy for
>> performance fine tuning. It's much easier for the UMD to directly use PAT
>> index because the behavior of each PAT index is clearly defined in Bspec.
>> Having the abstracted i915_cache_level sitting in between would only cause
>> more ambiguity.
>>
>> For these reasons this patch replaces i915_cache_level with PAT index. Also
>> note, the cache_level is not completely removed yet, because the KMD still
>> has the need of creating buffer objects with simple cache settings such as
>> cached, uncached, or writethrough. For such simple cases, using cache_level
>> would help simplify the code.
>>
>> Cc: Chris Wilson 
>> Cc: Matt Roper 
>> Signed-off-by: Fei Yang 
>> Reviewed-by: Andi Shyti 
>
> I think I have some ideas on how to perhaps make this simpler, please bear
> with me.
>
> In my mind get/set caching ioctls need to be failing once explicit pat
> index has been set by userspace. Or at least not return false information.

By design we are ending the support for set caching ioctl. The patch is included
in this series, "drm/i915/mtl: end support for set caching ioctl"

+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70))
+   return -EOPNOTSUPP;
+

> And I don't like i915_gem_object_has_cache_level and
> i915_gem_get_pat_index as a refactoring step.
>
> It also seems that the driver has a need to query the caching mode set
> regardless of the route (of setting).

Only for the objects created by the KMD. For UMD created objects with PAT
index set KMD should never touch the setting.

> So how about this.
>
> Three callers which query the caching mode: use_cpu_reloc, vm_fault_gtt,
> gpu_write_needs_clflush.
>
> We convert them to be like:
>
> i915_gem_object_has_caching_mode(obj, PAT_UC / PAT_WT / ...);

PAT_UC/WT/WB are platform dependent
(https://gfxspecs.intel.com/Predator/Home/Index/45101), so to perform this
check you would have to do something like,

if (MTL)
...
else if (PVC)
...
else if (GEN12)
...
else
...

> Then apart from the per platform tables for mapping between cache level
> to pat index, you add tables which map pat index to caching modes
> (PAT_UC, etc, naming TBD, just enums or bitmasks also TBD, I haven't
> looked at the bspec to see how exactly it works).
>
> You would use that table in the i915_gem_object_has_caching_mode helper,
> called from the above three functions instead of obj->cache_level direct
> comparison.
>
> I am assuming at least for instance cache_level != I915_CACHE_NONE would
> be equivalent to i915_gem_object_has_caching_mode(obj, PAT_UC), etc.

So far the kernel only needs the 4 cache levels defined in enum i915_cache_level;
it doesn't need to understand all PAT indices. By design, if userspace is
setting the PAT index directly, the kernel only needs to pass the setting to the PTE.

For objects created by the kernel (including objects created by userspace without
specifying a PAT index), there are only 4 options (defined in cachelevel_to_pat).

For objects created by userspace with the PAT index set (GEM_CREATE + set_pat
extension), the kernel should not touch the setting, just pass it to the PAT
index bits in the PTE.

That's why I was only checking cache_level. Handling PAT index is much more
complicated because of its platform dependent nature and even the number of
PAT indices varies from platform to platform. Fortunately kernel doesn't need
to understand that.
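
A minimal sketch of that split, in plain C. It is not the driver code: the
sk_* types, the pat_set_by_user flag and the table contents are hypothetical,
and the PAT values used with it are placeholders rather than Bspec values.
It only illustrates the two paths: kernel-created objects go through a small
per-platform cache-level table, while a userspace-supplied PAT index is
stored and passed through untouched.

```c
#include <assert.h>
#include <stdbool.h>

/* The four levels the kernel itself needs, mirroring the enum in the thread. */
enum sk_cache_level {
	SK_CACHE_NONE,
	SK_CACHE_LLC,
	SK_CACHE_L3_LLC,
	SK_CACHE_WT,
	SK_CACHE_COUNT
};

/* Per-platform translation table; the contents are platform specific. */
struct sk_platform {
	unsigned int cachelevel_to_pat[SK_CACHE_COUNT];
};

struct sk_obj {
	unsigned int pat_index;
	bool pat_set_by_user;	/* hypothetical marker for the userspace path */
};

/* Kernel path: translate one of the four levels through the table. */
void sk_obj_set_cache_level(struct sk_obj *obj, const struct sk_platform *p,
			    enum sk_cache_level level)
{
	assert(obj && p && level < SK_CACHE_COUNT);

	/* Userspace owns the setting once it has picked a PAT index. */
	if (obj->pat_set_by_user)
		return;
	obj->pat_index = p->cachelevel_to_pat[level];
}

/* Userspace path: store the index as given, no interpretation. */
void sk_obj_set_user_pat(struct sk_obj *obj, unsigned int pat_index)
{
	assert(obj);
	obj->pat_index = pat_index;
	obj->pat_set_by_user = true;
}

/* Value that would be written into the PAT bits of the PTE. */
unsigned int sk_obj_pte_pat(const struct sk_obj *obj)
{
	assert(obj);
	return obj->pat_index;
}
```

Note that nothing here needs to know what a given PAT index means, which is
the point being made: the table covers the kernel's own four cases and
everything else is passed through.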

-Fei

> Same mapping table could also be used in debugfs (i915_cache_level_str)
> to universally describe any obj->pat_index, with no need to have
> anything platform dependent there.
>
> In set caching set you always set obj->pat_index and so low level code
> can always just use that.
>
> Unless I am missing something (possible) I think like that we end up
> with no i915_gem_get_pat_index sprinkled around and also no confusing
> i915_gem_object_has_cache_level.
>
> Obj->pat_index would be a single point of truth, while obj->cache_level
> is just a legacy field for get/set_caching ioctl - not used in the
> internal driver flows.
>
> We would need an additional field for storing the boolean of whether
> userspace had overridden the PAT.
>
> Regards,
>
> Tvrtko

Re: [PATCH 4/5] drm/sti: Drop of_gpio header

2023-04-23 Thread Alain Volmat

Hi Maira,

thanks for the patch.

On Fri, Aug 12, 2022 at 05:57:45PM -0300, Maíra Canal wrote:
> This driver includes the deprecated OF GPIO header 
> yet fails to use symbols from it, so drop this include.
> 
> Cc: Alain Volmat 
> Signed-off-by: Maíra Canal 
> ---
>  drivers/gpu/drm/sti/sti_dvo.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/sti/sti_dvo.c b/drivers/gpu/drm/sti/sti_dvo.c
> index b6ee8a82e656..0fc7710b054a 100644
> --- a/drivers/gpu/drm/sti/sti_dvo.c
> +++ b/drivers/gpu/drm/sti/sti_dvo.c
> @@ -8,7 +8,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  
>  #include 
> -- 

Acked-by: Alain Volmat 

> 2.37.1
>

Re: [PATCH v7 6/7] drm/i915: Replace custom intel runtime_pm tracker with ref_tracker library

2023-04-23 Thread Zhou Furong






+
+static inline void
+intel_wakeref_tracker_show(struct ref_tracker_dir *dir,
+  struct drm_printer *p)
+{
+   const size_t buf_size = PAGE_SIZE;
+   char *buf, *sb, *se;
+   size_t count;
+
+   buf = kmalloc(buf_size, GFP_NOWAIT);
+   if (!buf)
+   return;
+
+   count = ref_tracker_dir_snprint(dir, buf, buf_size);
+   if (!count)
+   goto free;
+   /* printk does not like big buffers, so we split it */
+   for (sb = buf; *sb; sb = se + 1) {
+   se = strchrnul(sb, '\n');
+   drm_printf(p, "%.*s", (int)(se - sb + 1), sb);
+   if (!*se)
+   break;
+   }
+   if (count >= buf_size)
+   drm_printf(p, "\n...dropped %zd extra bytes of leak report.\n",
+  count + 1 - buf_size);
+free:
+   kfree(buf);
+}
+


Move this to a C source file?
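
For reference, the line-splitting loop in the quoted helper can be tried out
in userspace with a sketch like the one below. It is an illustration under
assumptions, not the patch itself: the sk_* names and the callback interface
are invented for the example, and sk_strchrnul() is a local stand-in for
strchrnul(). Unlike the quoted loop, which passes se - sb + 1 as the
precision for every chunk and relies on "%.*s" stopping at the terminating
NUL, this version computes the exact chunk length.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for strchrnul(): first 'c' in 's', or the terminating NUL. */
static const char *sk_strchrnul(const char *s, int c)
{
	while (*s && *s != (char)c)
		s++;
	return s;
}

/* Callback receiving one chunk; stands in for a printf-style sink. */
typedef void (*sk_emit_fn)(const char *chunk, size_t len, void *ctx);

/*
 * Walk 'buf' and hand it to 'emit' one line at a time, keeping the '\n'
 * with the line it terminates. Returns the number of chunks emitted.
 */
size_t sk_emit_lines(const char *buf, sk_emit_fn emit, void *ctx)
{
	const char *sb, *se;
	size_t chunks = 0;

	assert(buf && emit);

	for (sb = buf; *sb; sb = se + 1) {
		se = sk_strchrnul(sb, '\n');
		emit(sb, (size_t)(se - sb) + (*se ? 1 : 0), ctx);
		chunks++;
		if (!*se)	/* last line had no trailing newline */
			break;
	}
	return chunks;
}

/* Example sink that only counts what it is given. */
struct sk_stats {
	size_t chunks;
	size_t bytes;
};

void sk_count_emit(const char *chunk, size_t len, void *ctx)
{
	struct sk_stats *st = ctx;

	(void)chunk;
	st->chunks++;
	st->bytes += len;
}
```

Having the loop in one non-inline function like this is also roughly what
moving it out of the header would look like.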

Re: [PATCH v3 2/3] drm/panel: Add Samsung S6D7AA0 panel controller driver

2023-04-23 Thread Artur Weber

Hi,

thank you for the review.

On 20/04/2023 09:35, Linus Walleij wrote:
>> +static int s6d7aa0_on(struct s6d7aa0 *ctx)
>> +{
>> +   struct mipi_dsi_device *dsi = ctx->dsi;
>> +   struct device *dev = &dsi->dev;
>> +   int ret;
>> +
>> +   dsi->mode_flags |= MIPI_DSI_MODE_LPM;
> 
> (...)
> 
>> +static int s6d7aa0_off(struct s6d7aa0 *ctx)
>> +{
>> +   struct mipi_dsi_device *dsi = ctx->dsi;
>> +   struct device *dev = &dsi->dev;
>> +   int ret;
>> +
>> +   dsi->mode_flags &= ~MIPI_DSI_MODE_LPM;
> 
> I haven't seen this mode flag MIPI_DSI_MODE_LPM set and
> masked in other DSI panel drivers! Is this something we should
> fix everywhere then? Or even something the core should be
> doing?

These bits were included in a driver for a similar panel with the same
controller in an MSM8916 close-to-mainline kernel fork[1]; that driver
was generated with lmdpdg[2], which adds the LPM mode flag automatically
based on some downstream DTS property. In this case, I left it in, since
it didn't seem to break anything... but I just re-tested without it and
it seems that it might've fixed some odd issues I'd get sometimes when
going out of sleep mode. I'll get rid of it in the next version.

(I based my panel driver off that driver; now that I think about it, it
might be worth mentioning somewhere in the copyright notice...?)

Best regards
Artur Weber

[1]
https://github.com/msm8916-mainline/linux/blob/msm8916/6.3-rc7/drivers/gpu/drm/panel/msm8916-generated/panel-samsung-s6d7aa0-lsl080al03.c
[2]
https://github.com/msm8916-mainline/linux-mdss-dsi-panel-driver-generator

Re: [PATCH v7 7/7] drm/i915: track gt pm wakerefs

2023-04-23 Thread Zhou Furong


If in doubt, say "N"

+
+config DRM_I915_DEBUG_WAKEREF
+   bool "Enable extra tracking for wakerefs"
+   depends on DRM_I915
+   default n

'default n' is not needed.

Re: [PATCH 7/9] drm/sti/sti_hdmi: convert to using is_hdmi from display info

2023-04-23 Thread Alain Volmat

Hi Jani,

On Thu, Sep 01, 2022 at 03:47:09PM +0300, Jani Nikula wrote:
> Prefer the parsed results for is_hdmi in display info over calling
> drm_detect_hdmi_monitor(). Remove the now redundant hdmi_monitor member
> from struct sti_hdmi.
> 
> Cc: Alain Volmat 
> Signed-off-by: Jani Nikula 
> ---
>  drivers/gpu/drm/sti/sti_hdmi.c | 11 ++-
>  drivers/gpu/drm/sti/sti_hdmi.h |  2 --
>  2 files changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/sti/sti_hdmi.c b/drivers/gpu/drm/sti/sti_hdmi.c
> index cb82622877d2..7e6d47c1c43b 100644
> --- a/drivers/gpu/drm/sti/sti_hdmi.c
> +++ b/drivers/gpu/drm/sti/sti_hdmi.c
> @@ -266,6 +266,7 @@ static void hdmi_active_area(struct sti_hdmi *hdmi)
>   */
>  static void hdmi_config(struct sti_hdmi *hdmi)
>  {
> + struct drm_connector *connector = hdmi->drm_connector;
>   u32 conf;
>  
>   DRM_DEBUG_DRIVER("\n");
> @@ -275,7 +276,7 @@ static void hdmi_config(struct sti_hdmi *hdmi)
>  
>   /* Select encryption type and the framing mode */
>   conf |= HDMI_CFG_ESS_NOT_OESS;
> - if (hdmi->hdmi_monitor)
> + if (connector->display_info.is_hdmi)
>   conf |= HDMI_CFG_HDMI_NOT_DVI;
>  
>   /* Set Hsync polarity */
> @@ -985,15 +986,15 @@ static int sti_hdmi_connector_get_modes(struct drm_connector *connector)
>   if (!edid)
>   goto fail;
>  
> - hdmi->hdmi_monitor = drm_detect_hdmi_monitor(edid);
> - DRM_DEBUG_KMS("%s : %dx%d cm\n",
> -   (hdmi->hdmi_monitor ? "hdmi monitor" : "dvi monitor"),
> -   edid->width_cm, edid->height_cm);
>   cec_notifier_set_phys_addr_from_edid(hdmi->notifier, edid);
>  
>   count = drm_add_edid_modes(connector, edid);
>   drm_connector_update_edid_property(connector, edid);
>  
> + DRM_DEBUG_KMS("%s : %dx%d cm\n",
> +   (connector->display_info.is_hdmi ? "hdmi monitor" : "dvi monitor"),
> +   edid->width_cm, edid->height_cm);
> +
>   kfree(edid);
>   return count;
>  
> diff --git a/drivers/gpu/drm/sti/sti_hdmi.h b/drivers/gpu/drm/sti/sti_hdmi.h
> index 05b2f3d0d48d..6d4c3f57bc46 100644
> --- a/drivers/gpu/drm/sti/sti_hdmi.h
> +++ b/drivers/gpu/drm/sti/sti_hdmi.h
> @@ -57,7 +57,6 @@ struct hdmi_audio_params {
>   * @reset: reset control of the hdmi phy
>   * @ddc_adapt: i2c ddc adapter
>   * @colorspace: current colorspace selected
> - * @hdmi_monitor: true if HDMI monitor detected else DVI monitor assumed
>   * @audio_pdev: ASoC hdmi-codec platform device
>   * @audio: hdmi audio parameters.
>   * @drm_connector: hdmi connector
> @@ -83,7 +82,6 @@ struct sti_hdmi {
>   struct reset_control *reset;
>   struct i2c_adapter *ddc_adapt;
>   enum hdmi_colorspace colorspace;
> - bool hdmi_monitor;
>   struct platform_device *audio_pdev;
>   struct hdmi_audio_params audio;
>   struct drm_connector *drm_connector;
> -- 

Acked-by: Alain Volmat 

Thanks.

> 2.34.1
>

RE: [Intel-gfx] [PATCH 7/8] drm/i915: use pat_index instead of cache_level

2023-04-23 Thread Yang, Fei

> On 20/04/2023 00:00, fei.y...@intel.com wrote:
>> From: Fei Yang 
>>
>> Currently the KMD is using enum i915_cache_level to set caching policy
>> for buffer objects. This is flaky because the PAT index which really
>> controls the caching behavior in PTE has far more levels than what's
>> defined in the enum. In addition, the PAT index is platform dependent,
>> having to translate between i915_cache_level and PAT index is not
>> reliable, and makes the code more complicated.
>>
>> From UMD's perspective there is also a necessity to set caching policy for
>> performance fine tuning. It's much easier for the UMD to directly use
>> PAT index because the behavior of each PAT index is clearly defined in Bspec.
>> Having the abstracted i915_cache_level sitting in between would only
>> cause more ambiguity.
>>
>> For these reasons this patch replaces i915_cache_level with PAT index.
>> Also note, the cache_level is not completely removed yet, because the
>> KMD still has the need of creating buffer objects with simple cache
>> settings such as cached, uncached, or writethrough. For such simple
>> cases, using cache_level would help simplify the code.
>>
>> Cc: Chris Wilson 
>> Cc: Matt Roper 
>> Signed-off-by: Fei Yang 
>> Reviewed-by: Andi Shyti 
>
> [snip]
>
>>
>>   bool i915_gem_cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
>> @@ -267,7 +267,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>>   {
>>  int ret;
>>
>> -if (obj->cache_level == cache_level)
>> +if (i915_gem_object_has_cache_level(obj, cache_level))
>>  return 0;
>
> When userspace calls i915_gem_set_caching_ioctl

We are ending the support for set_caching_ioctl.

> after having set the PAT index explicitly this will make it silently succeed
> regardless of the cache level passed in, no? Because of:

Yes, that's the point. For objects created by userspace with PAT index set,
KMD is not supposed to touch the setting.

> +bool i915_gem_object_has_cache_level(const struct drm_i915_gem_object *obj,
> +  enum i915_cache_level lvl)
> +{
> + /*
> +  * cache_level == I915_CACHE_INVAL indicates the UMD's have set the
> +  * caching policy through pat_index, in which case the KMD should
> +  * leave the coherency to be managed by user space, simply return
> +  * true here.
> +  */
> + if (obj->cache_level == I915_CACHE_INVAL)
> + return true;
>
> I think we need to let it know it is doing it wrong with an error.

This is not an error, by design userspace should know exactly what it's doing.

-Fei

> Regards,
>
> Tvrtko
